1
Mikuls TR, Baker JF, Cannon GW, England BR, Kerr G, Reimold A. The Veterans Affairs Rheumatoid Arthritis Registry: A unique population in rheumatoid arthritis research. Semin Arthritis Rheum 2024:152580. [PMID: 39580339 DOI: 10.1016/j.semarthrit.2024.152580] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2024] [Accepted: 10/28/2024] [Indexed: 11/25/2024]
Abstract
BACKGROUND: As the largest integrated healthcare system in the U.S., the Veterans Affairs (VA) provides a unique context for the conduct of clinical and clinical-translational research in rheumatoid arthritis (RA). OBJECTIVES: To review attributes of the VA Rheumatoid Arthritis Registry (VARA) and highlight its research contributions. FINDINGS: With >3,600 participants enrolled from 19 VA medical centers across the U.S., VARA includes longitudinally collected clinical data and a central biorepository that includes serum, plasma, and DNA collected at enrollment. VARA research capacity is enhanced via active linkages with internal data including the VA's Corporate Data Warehouse and elements captured during oncology care. This capacity is further enabled via active linkages with the National Death Index and Centers for Medicare & Medicaid Services (CMS) data. CONCLUSION: As a highly unique study population with comprehensive data annotation available to researchers, VARA is poised to continue to address impactful questions in RA for years to come.
Affiliation(s)
- Ted R Mikuls
  - Division of Rheumatology, VA Nebraska Western Iowa Health Care System & University of Nebraska Medical Center, Omaha, NE, USA
- Joshua F Baker
  - Corporal Michael J. Crescenz VA Medical Center and University of Pennsylvania, Philadelphia, PA, USA
- Grant W Cannon
  - VA Salt Lake City Health Care System and University of Utah, Salt Lake City, UT, USA
- Bryant R England
  - Division of Rheumatology, VA Nebraska Western Iowa Health Care System & University of Nebraska Medical Center, Omaha, NE, USA
- Gail Kerr
  - Washington D.C. VA, Howard University, & Georgetown University, Washington DC, USA
- Andreas Reimold
  - Dallas VA & University of Texas Southwestern, Dallas, TX, USA
2
Graça M, Nobre R, Sousa L, Ilic A. Distributed transformer for high order epistasis detection in large-scale datasets. Sci Rep 2024; 14:14579. [PMID: 38918413 PMCID: PMC11199512 DOI: 10.1038/s41598-024-65317-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2024] [Accepted: 06/19/2024] [Indexed: 06/27/2024] [Open Access]
Abstract
Understanding the genetic basis of complex diseases is one of the most important challenges in current precision medicine. To this end, Genome-Wide Association Studies aim to correlate Single Nucleotide Polymorphisms (SNPs) to the presence or absence of certain traits. However, these studies do not consider interactions between several SNPs, known as epistasis, which explain most genetic diseases. Analyzing SNP combinations to detect epistasis is a major computational task, due to the enormous search space. A possible solution is to employ deep learning strategies for genomic prediction, but the lack of explainability derived from the black-box nature of neural networks is a challenge yet to be addressed. Herein, a novel, flexible, portable, and scalable framework for network interpretation based on transformers is proposed to tackle any-order epistasis. The results on various epistasis scenarios show that the proposed framework outperforms state-of-the-art methods for explainability, while being scalable to large datasets and portable to various deep learning accelerators. The proposed framework is validated on three WTCCC datasets, identifying SNPs related to genes known in the literature that have direct relationships with the studied diseases.
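To give a sense of the "enormous search space" mentioned in this abstract, the number of candidate SNP combinations of a given order grows combinatorially with the number of SNPs. The minimal sketch below counts them with the standard-library `math.comb`. The function name `count_snp_combinations` and the SNP counts are illustrative assumptions, not figures taken from the paper.

```python
import math


def count_snp_combinations(n_snps: int, order: int) -> int:
    """Number of distinct SNP sets of size `order` drawn from `n_snps` SNPs."""
    return math.comb(n_snps, order)


# Hypothetical dataset size, chosen only to show the growth rate.
pairs = count_snp_combinations(1000, 2)    # 499500
triples = count_snp_combinations(1000, 3)  # 166167000
```

Going from second-order to third-order interactions on the same 1,000 SNPs multiplies the number of candidates by several hundred, which is why exhaustive testing quickly becomes impractical at genome scale.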
Affiliation(s)
- Miguel Graça
  - INESC-ID, Instituto Superior Técnico, 1000-029, Lisbon, Portugal
- Ricardo Nobre
  - INESC-ID, Instituto Superior Técnico, 1000-029, Lisbon, Portugal
- Leonel Sousa
  - INESC-ID, Instituto Superior Técnico, 1000-029, Lisbon, Portugal
- Aleksandar Ilic
  - INESC-ID, Instituto Superior Técnico, 1000-029, Lisbon, Portugal
3
Arueyingho OV, Al-Taie A, McCallum C. Scoping review: Machine learning interventions in the management of healthcare systems. Digit Health 2024; 10:20552076221144095. [PMID: 39444734 PMCID: PMC11497546 DOI: 10.1177/20552076221144095] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2021] [Accepted: 11/18/2022] [Indexed: 10/25/2024] [Open Access]
Abstract
Background: Healthcare institutions focus on improving the quality of life for end-users, with key performance indicators like access to essential medicines reflecting the effectiveness of management. Effective healthcare management involves planning, organizing, and controlling institutions built on human resources, data systems, service delivery, access to medicines, finance, and leadership. According to the World Health Organization, these elements must be balanced for an optimal healthcare system. Big data generated from healthcare institutions, including health records and genomic data, is crucial for smart staffing, decision-making, risk management, and patient engagement. Properly organizing and analysing this data is essential, and machine learning, a sub-field of artificial intelligence, can optimize these processes, leading to better overall healthcare management. Objectives: This review examines the major applications of machine learning in healthcare management, the algorithms frequently used in data analysis, their limitations, and the evidence-based benefits of machine learning in healthcare. Methods: Following PRISMA guidelines, databases such as IEEE Xplore, ScienceDirect, ACM Digital Library, and SCOPUS were searched for eligible articles published between 2011 and 2021. Articles had to be in English, peer-reviewed, and include relevant keywords like healthcare, management, and machine learning. Results: Out of 51 relevant articles, 6 met the inclusion criteria. Identified algorithms include topic modelling, dynamic clustering, neural networks, decision trees, and ensemble classifiers, applied in areas such as electronic health records, chatbots, and multi-disease prediction. Conclusion: Machine learning supports healthcare management by aiding decision-making, processing big data, and providing insights for system improvements.
Affiliation(s)
- Oritsetimeyin V Arueyingho
  - School of Computer Science, Electrical and Electronic Engineering, and Engineering Maths (SCEEM), Centre for Doctoral Training in Digital Health and Care, University of Bristol, UK
- Anmar Al-Taie
  - School of Computer Science, Electrical and Electronic Engineering, and Engineering Maths (SCEEM), Centre for Doctoral Training in Digital Health and Care, University of Bristol, UK
- Claire McCallum
  - Department of Clinical Pharmacy, Faculty of Pharmacy, Istinye University, Istanbul, Turkey
4
Galozzi P, Basso D, Plebani M, Padoan A. Artificial Intelligence and laboratory data in rheumatic diseases. Clin Chim Acta 2023; 546:117388. [PMID: 37187221 DOI: 10.1016/j.cca.2023.117388] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Revised: 05/09/2023] [Accepted: 05/09/2023] [Indexed: 05/17/2023]
Abstract
Artificial intelligence (AI)-based medical technologies are rapidly evolving into actionable solutions for clinical practice. Machine learning (ML) algorithms can process increasing amounts of laboratory data such as gene expression, immunophenotyping data, and biomarkers. In recent years, ML analysis has become particularly useful for the study of complex chronic diseases, such as rheumatic diseases, which are heterogeneous conditions with multiple triggers. Numerous studies have used ML to classify patients and improve diagnosis, to stratify risk and determine disease subtypes, and to discover biomarkers and gene signatures. This review aims to provide examples of ML models for specific rheumatic diseases using laboratory data and some insights into relevant strengths and limitations. A better understanding and future application of these analytical strategies could facilitate the development of precision medicine for rheumatic patients.
Affiliation(s)
- Paola Galozzi
  - Department of Medicine-DIMED, University of Padova, Padova, Italy
- Daniela Basso
  - Department of Medicine-DIMED, University of Padova, Padova, Italy; Laboratory Medicine Unit, University Hospital of Padova, Padova, Italy
- Mario Plebani
  - Department of Medicine-DIMED, University of Padova, Padova, Italy; Laboratory Medicine Unit, University Hospital of Padova, Padova, Italy
- Andrea Padoan
  - Department of Medicine-DIMED, University of Padova, Padova, Italy; Laboratory Medicine Unit, University Hospital of Padova, Padova, Italy
5
Momtazmanesh S, Nowroozi A, Rezaei N. Artificial Intelligence in Rheumatoid Arthritis: Current Status and Future Perspectives: A State-of-the-Art Review. Rheumatol Ther 2022; 9:1249-1304. [PMID: 35849321 PMCID: PMC9510088 DOI: 10.1007/s40744-022-00475-4] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 06/01/2022] [Accepted: 06/24/2022] [Indexed: 11/23/2022] [Open Access]
Abstract
Investigation of the potential applications of artificial intelligence (AI), including machine learning (ML) and deep learning (DL) techniques, is an exponentially growing field in medicine and healthcare. These methods can be critical in providing high-quality care to patients with chronic rheumatological diseases lacking an optimal treatment, like rheumatoid arthritis (RA), which is the second most prevalent autoimmune disease. Herein, following reviewing the basic concepts of AI, we summarize the advances in its applications in RA clinical practice and research. We provide directions for future investigations in this field after reviewing the current knowledge gaps and technical and ethical challenges in applying AI. Automated models have been largely used to improve RA diagnosis since the early 2000s, and they have used a wide variety of techniques, e.g., support vector machine, random forest, and artificial neural networks. AI algorithms can facilitate screening and identification of susceptible groups, diagnosis using omics, imaging, clinical, and sensor data, patient detection within electronic health record (EHR), i.e., phenotyping, treatment response assessment, monitoring disease course, determining prognosis, novel drug discovery, and enhancing basic science research. They can also aid in risk assessment for incidence of comorbidities, e.g., cardiovascular diseases, in patients with RA. However, the proposed models may vary significantly in their performance and reliability. Despite the promising results achieved by AI models in enhancing early diagnosis and management of patients with RA, they are not fully ready to be incorporated into clinical practice. Future investigations are required to ensure development of reliable and generalizable algorithms while they carefully look for any potential source of bias or misconduct. 
We showed that a growing body of evidence supports the potential role of AI in revolutionizing screening, diagnosis, and management of patients with RA. However, multiple obstacles hinder clinical applications of AI models. Incorporating the machine and/or deep learning algorithms into real-world settings would be a key step in the progress of AI in medicine.
Affiliation(s)
- Sara Momtazmanesh
  - School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
  - Network of Immunity in Infection, Malignancy and Autoimmunity (NIIMA), Universal Scientific Education and Research Network (USERN), Tehran, Iran
  - Research Center for Immunodeficiencies, Pediatrics Center of Excellence, Children's Medical Center, Tehran University of Medical Sciences, Dr. Gharib St, Keshavarz Blvd, Tehran, Iran
- Ali Nowroozi
  - School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
  - Network of Immunity in Infection, Malignancy and Autoimmunity (NIIMA), Universal Scientific Education and Research Network (USERN), Tehran, Iran
- Nima Rezaei
  - Network of Immunity in Infection, Malignancy and Autoimmunity (NIIMA), Universal Scientific Education and Research Network (USERN), Tehran, Iran
  - Research Center for Immunodeficiencies, Pediatrics Center of Excellence, Children's Medical Center, Tehran University of Medical Sciences, Dr. Gharib St, Keshavarz Blvd, Tehran, Iran
  - Department of Immunology, School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
6
De Cock D, Myasoedova E, Aletaha D, Studenic P. Big data analyses and individual health profiling in the arena of rheumatic and musculoskeletal diseases (RMDs). Ther Adv Musculoskelet Dis 2022; 14:1759720X221105978. [PMID: 35794905 PMCID: PMC9251966 DOI: 10.1177/1759720x221105978] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/12/2022] [Accepted: 05/22/2022] [Indexed: 11/17/2022] [Open Access]
Abstract
Health care processes are under constant development and will need to embrace advances in technology and health science aiming to provide optimal care. Given the increasing treatment options for people with rheumatic and musculoskeletal diseases, which in many cases still do not reach all treatment targets that matter to patients, care systems bear the potential to improve on a holistic level. This review provides an overview of systems and technologies under evaluation over the past years that show potential to impact diagnosis and treatment of rheumatic diseases in about 10 years from now. We summarize initiatives and studies from the field of electronic health records, biobanking, remote monitoring, and artificial intelligence. The combination and implementation of these opportunities in daily clinical care will be key for a new era in care of our patients. This review aims to inform rheumatologists and healthcare providers concerned with chronic inflammatory musculoskeletal conditions about current important and promising developments in science that might substantially impact the management processes of rheumatic diseases in the 2030s.
Affiliation(s)
- Diederik De Cock
  - Clinical and Experimental Endocrinology, Department of Chronic Diseases and Metabolism, KU Leuven, Leuven, Belgium
- Elena Myasoedova
  - Division of Rheumatology, Department of Internal Medicine and Division of Epidemiology, Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, USA
- Daniel Aletaha
  - Division of Rheumatology, Department of Internal Medicine 3, Medical University Vienna, Vienna, Austria
- Paul Studenic
  - Division of Rheumatology, Department of Internal Medicine 3, Medical University Vienna, Waehringer Guertel 18-20, 1090 Vienna, Austria
7
Bai L, Zhang Y, Wang P, Zhu X, Xiong JW, Cui L. Improved diagnosis of rheumatoid arthritis using an artificial neural network. Sci Rep 2022; 12:9810. [PMID: 35697754 PMCID: PMC9192742 DOI: 10.1038/s41598-022-13750-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2021] [Accepted: 05/27/2022] [Indexed: 11/29/2022] [Open Access]
Abstract
Rheumatoid arthritis (RA) is a chronic systemic disease that can cause joint damage, disability and destructive polyarthritis. Current diagnosis of RA is based on a combination of clinical and laboratory features. However, RA can be difficult to diagnose at disease onset because its symptoms overlap with those of other forms of arthritis, so early recognition and diagnosis of RA permit better management of patients. In order to improve the medical diagnosis of RA and evaluate the effects of different clinical features on RA diagnosis, we applied an artificial neural network (ANN) as the training algorithm, and used fivefold cross-validation to evaluate its performance. From each sample, we obtained data on 6 features: age, sex, rheumatoid factor, anti-citrullinated peptide antibody (CCP), 14-3-3η, and anti-carbamylated protein (CarP) antibodies. After training, this ANN model assigned each sample a probability for being either an RA patient or a non-RA patient. On the validation dataset, the F1 for all samples by this ANN model was 0.916, which was higher than the 0.906 we previously reported using an optimal threshold algorithm. Therefore, this ANN algorithm not only improved the accuracy of RA diagnosis, but also revealed that anti-CCP had the greatest effect while age and anti-CarP had a weaker effect on RA diagnosis.
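For readers unfamiliar with the two evaluation ideas named in this abstract, the following minimal sketch shows how an F1 score is computed from binary predictions and how sample indices can be split into five folds. It is a generic illustration in plain Python, not the authors' code; the function names `f1_score` and `k_fold_indices` and the interleaved fold assignment are assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n_samples: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, validation) index pairs; each sample is validated once."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for i in range(k):
        validation = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        splits.append((train, validation))
    return splits
```

In a fivefold setup, a model would be trained on each `train` index list and scored on the matching `validation` list, with the five scores then summarized.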
Affiliation(s)
- Linlu Bai
  - Beijing Key Laboratory of Cardiometabolic Molecular Medicine, Institute of Molecular Medicine, College of Future Technology, Academy for Advanced Interdisciplinary Studies, and State Key Laboratory of Natural and Biomimetic Drugs, Peking University, No. 5 Yiheyuan Road, Haidian District, Beijing, 100871, China
- Yuan Zhang
  - Department of Laboratory Medicine, Peking University Third Hospital, No. 49 North Garden Road, Haidian District, Beijing, 100191, China
- Pan Wang
  - Department of Laboratory Medicine, Peking University Third Hospital, No. 49 North Garden Road, Haidian District, Beijing, 100191, China
- Xiaojun Zhu
  - Beijing Key Laboratory of Cardiometabolic Molecular Medicine, Institute of Molecular Medicine, College of Future Technology, Academy for Advanced Interdisciplinary Studies, and State Key Laboratory of Natural and Biomimetic Drugs, Peking University, No. 5 Yiheyuan Road, Haidian District, Beijing, 100871, China
- Jing-Wei Xiong
  - Beijing Key Laboratory of Cardiometabolic Molecular Medicine, Institute of Molecular Medicine, College of Future Technology, Academy for Advanced Interdisciplinary Studies, and State Key Laboratory of Natural and Biomimetic Drugs, Peking University, No. 5 Yiheyuan Road, Haidian District, Beijing, 100871, China
- Liyan Cui
  - Department of Laboratory Medicine, Peking University Third Hospital, No. 49 North Garden Road, Haidian District, Beijing, 100191, China
8
Wang S, Hou Y, Li X, Meng X, Zhang Y, Wang X. Practical Implementation of Artificial Intelligence-Based Deep Learning and Cloud Computing on the Application of Traditional Medicine and Western Medicine in the Diagnosis and Treatment of Rheumatoid Arthritis. Front Pharmacol 2022; 12:765435. [PMID: 35002704 PMCID: PMC8733656 DOI: 10.3389/fphar.2021.765435] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 08/27/2021] [Accepted: 12/09/2021] [Indexed: 12/23/2022] [Open Access]
Abstract
Rheumatoid arthritis (RA), an autoimmune disease of unknown etiology, is a serious threat to the health of middle-aged and elderly people. Although western medicine and traditional medicine, such as traditional Chinese medicine, Tibetan medicine and other ethnic medicine, have shown certain advantages in the diagnosis and treatment of RA, there are still some practical shortcomings, such as delayed diagnosis, improper treatment schemes and unclear drug mechanisms. At present, the application of artificial intelligence (AI)-based deep learning and cloud computing has attracted wide attention in the medical and health field, especially in screening potential active ingredients, targets and action pathways of single drugs or prescriptions in traditional medicine and optimizing disease diagnosis and treatment models. Integrated information and analysis of RA patients based on AI and medical big data will unquestionably benefit more RA patients worldwide. In this review, we mainly elaborate on the application status and prospects of AI-assisted deep learning and cloud computation-oriented western medicine and traditional medicine in the diagnosis and treatment of RA at different stages. It can be predicted that with the help of AI, more pharmacological mechanisms of effective ethnic drugs against RA will be elucidated and more accurate solutions will be provided for the treatment and diagnosis of RA in the future.
Affiliation(s)
- Shaohui Wang
  - School of Ethnic Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Ya Hou
  - School of Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Xuanhao Li
  - Chengdu Second People's Hospital, Chengdu, China
- Xianli Meng
  - State Key Laboratory of Southwestern Chinese Medicine Resources, Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Yi Zhang
  - School of Ethnic Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Xiaobo Wang
  - State Key Laboratory of Southwestern Chinese Medicine Resources, Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu, China
9
Kingsmore KM, Puglisi CE, Grammer AC, Lipsky PE. An introduction to machine learning and analysis of its use in rheumatic diseases. Nat Rev Rheumatol 2021; 17:710-730. [PMID: 34728818 DOI: 10.1038/s41584-021-00708-w] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Accepted: 10/04/2021] [Indexed: 02/07/2023]
Abstract
Machine learning (ML) is a computerized analytical technique that is being increasingly employed in biomedicine. ML often provides an advantage over explicitly programmed strategies in the analysis of multidimensional information by recognizing relationships in the data that were not previously appreciated. As such, the use of ML in rheumatology is increasing, and numerous studies have employed ML to classify patients with rheumatic autoimmune inflammatory diseases (RAIDs) from medical records and imaging, biometric or gene expression data. However, these studies are limited by sample size, the accuracy of sample labelling, and absence of datasets for external validation. In addition, there is potential for ML models to overfit or underfit the data and, thereby, these models might produce results that cannot be replicated in an unrelated dataset. In this Review, we introduce the basic principles of ML and discuss its current strengths and weaknesses in the classification of patients with RAIDs. Moreover, we highlight the successful analysis of the same type of input data (for example, medical records) with different algorithms, illustrating the potential plasticity of this analytical approach. Altogether, a better understanding of ML and the future application of advanced analytical techniques based on this approach, coupled with the increasing availability of biomedical data, may facilitate the development of meaningful precision medicine for patients with RAIDs.
Affiliation(s)
- Amrie C Grammer
  - AMPEL BioSolutions and RILITE Research Institute, Charlottesville, VA, USA
- Peter E Lipsky
  - AMPEL BioSolutions and RILITE Research Institute, Charlottesville, VA, USA
10
Chicco D, Faultless T. Brief Survey on Machine Learning in Epistasis. Methods Mol Biol 2021; 2212:169-179. [PMID: 33733356 DOI: 10.1007/978-1-0716-0947-7_11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/09/2023]
Abstract
In biology, the term "epistasis" indicates the effect of the interaction of a gene with another gene. A gene can interact with an independently sorted gene, located far away on the chromosome or on an entirely different chromosome, and this interaction can have a strong effect on the function of the two genes. These changes then can alter the consequences of the biological processes, influencing the organism's phenotype. Machine learning is an area of computer science that develops statistical methods able to recognize patterns from data. A typical machine learning algorithm consists of a training phase, where the model learns to recognize specific trends in the data, and a test phase, where the trained model applies its learned intelligence to recognize trends in external data. Scientists have applied machine learning to epistasis problems multiple times, especially to identify gene-gene interactions from genome-wide association study (GWAS) data. In this brief survey, we report and describe the main scientific articles published in data mining and epistasis. Our article confirms the effectiveness of machine learning in this genetics subfield.
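A toy example can make the gene-gene interaction idea concrete. The sketch below assumes two binary loci and a phenotype defined as their exclusive OR, a deliberately artificial model that is not taken from the survey. Under this assumption, neither locus alone changes the case rate, yet the pair determines the phenotype completely. All names (`SAMPLES`, `case_rate`) are hypothetical.

```python
from itertools import product
from typing import Callable, List, Tuple

Genotype = Tuple[int, int]

# Toy population: every combination of two binary loci (A, B),
# with phenotype = A XOR B. Purely illustrative, not real data.
SAMPLES: List[Tuple[Genotype, int]] = [
    ((a, b), a ^ b) for a, b in product((0, 1), repeat=2)
]


def case_rate(samples: List[Tuple[Genotype, int]],
              keep: Callable[[Genotype], bool]) -> float:
    """Fraction of affected individuals among the genotypes selected by `keep`."""
    selected = [phenotype for genotype, phenotype in samples if keep(genotype)]
    return sum(selected) / len(selected)


# Marginal view: locus A on its own carries no signal.
rate_a0 = case_rate(SAMPLES, lambda g: g[0] == 0)  # 0.5
rate_a1 = case_rate(SAMPLES, lambda g: g[0] == 1)  # 0.5

# Joint view: the combination of A and B fully determines the phenotype.
rate_a0_b1 = case_rate(SAMPLES, lambda g: g == (0, 1))  # 1.0
rate_a1_b1 = case_rate(SAMPLES, lambda g: g == (1, 1))  # 0.0
```

Single-variant association tests would miss both loci in this toy setting, which is the motivation for methods that evaluate variant combinations jointly.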
Affiliation(s)
- Davide Chicco
  - Krembil Research Institute, Toronto, Ontario, Canada
11
Protocol for Epistasis Detection with Machine Learning Using GenEpi Package. Methods Mol Biol 2021. [PMID: 33733363 DOI: 10.1007/978-1-0716-0947-7_18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
To develop medical treatments and prevention, the association between disease and genetic variants needs to be identified. The main goal of genome-wide association study (GWAS) is to discover the underlying reason for vulnerability to disease and utilize this knowledge for the development of prevention and treatment against these diseases. Given the methods available to address the scientific problems involved in the search for epistasis, there is not any standard for detecting epistasis, and this remains a problem due to limited statistical power. The GenEpi package is a Python package that uses a two-level workflow machine learning model to detect within-gene and cross-gene epistasis. This protocol chapter shows the usage of GenEpi with example data. The package uses a three-step procedure to reduce dimensionality, select the within-gene epistasis, and select the cross-gene epistasis. The package also provides a medium to build prediction models with the combination of genetic features and environmental influences.
12
Zhu S, Ye L, Bennett S, Xu H, He D, Xu J. Molecular structure, gene expression and functional role of WFDC1 in angiogenesis and cancer. Cell Biochem Funct 2021; 39:588-595. [PMID: 33615507 DOI: 10.1002/cbf.3624] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 11/20/2020] [Revised: 12/29/2020] [Accepted: 01/17/2021] [Indexed: 02/04/2023]
Abstract
Whey acidic proteins (WAP) perform a diverse range of important biological functions, including proteinase activity, calcium transport and bacterial growth. The WAP four-disulphide core domain protein 1 (WFDC1) gene (also called PS20), encodes the 20 kDa prostate stromal protein (ps20), which is a member of the WAP-type four-disulphide core domain family of proteins, and exhibits characteristics of serine protease inhibitors, such as elafin and secretory leukocyte protease inhibitor. Molecular structural analysis reveals that ps20 consists of four-disulphide bonds formed by eight cysteine residues located at the carboxyl terminus of the protein. Wfdc1-null mice were found to display no overt developmental phenotype, suggesting a dispensable role in organ growth and development. However, WFDC1 was able to mediate endothelial cell migration and pericyte stabilization, which are vital for the formation of functional vascular structures. WFDC1 was also found to be downregulated in cancers and exhibited a regulatory effect on cell proliferation. In addition, it was involved in the modulation of memory T cells during human immunodeficiency virus infection. Gaining a solid understanding of the mechanisms by which WFDC1 regulates tissue homeostasis and disease processes, in a tissue specific manner, will be an important move towards the development of WFDC1/ps20 as potential therapeutic targets.
Affiliation(s)
- Sipin Zhu
  - Department of Orthopaedics, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, China
  - Division of Regenerative Biology, School of Biomedical Sciences, University of Western Australia, Perth, Western Australia, Australia
- Lin Ye
  - Department of Orthopaedic Surgery, The Fifth Affiliated Hospital of Wenzhou Medical University, Lishui Municipal Central Hospital, Lishui, China
- Samuel Bennett
  - Division of Regenerative Biology, School of Biomedical Sciences, University of Western Australia, Perth, Western Australia, Australia
- Huazi Xu
  - Department of Orthopaedics, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, China
- Dengwei He
  - Department of Orthopaedic Surgery, The Fifth Affiliated Hospital of Wenzhou Medical University, Lishui Municipal Central Hospital, Lishui, China
- Jiake Xu
  - Department of Orthopaedics, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, China
  - Division of Regenerative Biology, School of Biomedical Sciences, University of Western Australia, Perth, Western Australia, Australia
13
Sundaramurthy S, Jayavel P. A hybrid Grey Wolf Optimization and Particle Swarm Optimization with C4.5 approach for prediction of Rheumatoid Arthritis. Appl Soft Comput 2020. [DOI: 10.1016/j.asoc.2020.106500] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 12/20/2022]
14
The basics of data, big data, and machine learning in clinical practice. Clin Rheumatol 2020; 40:11-23. [PMID: 32504192 DOI: 10.1007/s10067-020-05196-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 12/16/2019] [Revised: 05/05/2020] [Accepted: 05/20/2020] [Indexed: 12/29/2022]
Abstract
Health informatics and biomedical computing have introduced the use of computer methods to analyze clinical information and provide tools to assist clinicians during the diagnosis and treatment of diverse clinical conditions. With the amount of information that can be obtained in the healthcare setting, new methods to acquire, organize, and analyze the data are being developed each day, including new applications in the world of big data and machine learning. In this review, we first present the most basic concepts in data science, including the structural hierarchy of information and how it is managed. A section is dedicated to discussing topics relevant to the acquisition of data, in particular the availability and use of online resources such as survey software and cloud computing services. Along with digital datasets, these tools make it possible to create more diverse models and facilitate collaboration. We then describe concepts and techniques in machine learning used to process and analyze health data, especially those most widely applied in rheumatology. Overall, the objective of this review is to aid in the comprehension of how data science is used in health, with a special emphasis on the relevance to the field of rheumatology. It provides clinicians with basic tools on how to approach and understand new trends in health informatics analysis currently being used in rheumatology practice. If clinicians understand the potential use and limitations of health informatics, this will facilitate interdisciplinary conversations and continued projects relating to data, big data, and machine learning.
|
15
|
Tian H, Cao S, Hu M, Wang Y, Fu Q, Pan Y, Qin T. Identification of predictive factors in hepatocellular carcinoma outcome: A longitudinal study. Oncol Lett 2020; 20:765-773. [PMID: 32566003 PMCID: PMC7285798 DOI: 10.3892/ol.2020.11581] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 08/19/2019] [Accepted: 02/19/2020] [Indexed: 12/24/2022] Open
Abstract
Various surgical methods impact the prognosis of patients with hepatocellular carcinoma (HCC) differently. However, clinical guidelines remain inconsistent and the relative importance of predictors of survival outcomes requires further evaluation. The present study aimed to rank the importance of predictive factors that impact the survival outcomes of patients with HCC and to compare the prognosis associated with different surgical methods based on data obtained from the Surveillance, Epidemiology and End Results database. To achieve these aims, the present study used a random forest (RF) model to detect important predictive factors associated with survival outcomes in patients with HCC. Cox regression analysis was used to compare different surgery methods. The variables included in the Cox regression model were selected based on the Gini index calculated by the RF model. Using the RF model, the present study demonstrated that surgery method, tumor size and age were the first, second and third most important factors associated with HCC prognosis, respectively. Overall, patients who underwent local tumor destruction (hazard ratio (HR), 0.48; 95% confidence interval (CI), 0.45–0.51; P<0.001), wedge or segmental resection (HR, 0.31; 95% CI, 0.29–0.33; P<0.001), lobectomy (HR, 0.29; 95% CI, 0.27–0.31; P<0.001) or liver transplantation (HR, 0.16; 95% CI, 0.14–0.17; P<0.001) demonstrated improved overall survival time compared with those treated without surgery, with a gradually decreasing trend observed in HRs. The present study demonstrated that the surgical method used is the most important predictor of the survival outcomes of patients with HCC. Liver transplantation resulted in the best prognosis for patients with HCC, except for those with undifferentiated tumors or distant metastasis.
Affiliation(s)
- Huiyuan Tian
- Department of Research and Discipline Development, Henan Provincial People's Hospital, Zhengzhou University People's Hospital, Henan University People's Hospital, Zhengzhou, Henan 450003, P.R. China
| | - Shaofeng Cao
- Department of Gastroenterology, The Fifth Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450000, P.R. China
| | - Mingxing Hu
- Department of Hepatobiliary and Pancreatic Surgery, Henan Provincial People's Hospital, Zhengzhou University People's Hospital, Henan University People's Hospital, Zhengzhou, Henan 450003, P.R. China
| | - Yuzhu Wang
- Department of Hepatobiliary and Pancreatic Surgery, Henan Provincial People's Hospital, Zhengzhou University People's Hospital, Henan University People's Hospital, Zhengzhou, Henan 450003, P.R. China
| | - Qiang Fu
- Department of Hepatobiliary and Pancreatic Surgery, Henan Provincial People's Hospital, Zhengzhou University People's Hospital, Henan University People's Hospital, Zhengzhou, Henan 450003, P.R. China
| | - Yanfeng Pan
- Department of Infectious Diseases, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450000, P.R. China
| | - Tao Qin
- Department of Hepatobiliary and Pancreatic Surgery, Henan Provincial People's Hospital, Zhengzhou University People's Hospital, Henan University People's Hospital, Zhengzhou, Henan 450003, P.R. China
|
16
|
Stafford IS, Kellermann M, Mossotto E, Beattie RM, MacArthur BD, Ennis S. A systematic review of the applications of artificial intelligence and machine learning in autoimmune diseases. NPJ Digit Med 2020; 3:30. [PMID: 32195365 PMCID: PMC7062883 DOI: 10.1038/s41746-020-0229-3] [Citation(s) in RCA: 110] [Impact Index Per Article: 22.0] [Received: 08/12/2019] [Accepted: 01/17/2020] [Indexed: 02/07/2023] Open
Abstract
Autoimmune diseases are chronic, multifactorial conditions. Through machine learning (ML), a branch of the wider field of artificial intelligence, it is possible to extract patterns within patient data, and exploit these patterns to predict patient outcomes for improved clinical management. Here, we surveyed the use of ML methods to address clinical problems in autoimmune disease. A systematic review was conducted using the MEDLINE, Embase, and Computers and Applied Sciences Complete databases. Relevant papers included "machine learning" or "artificial intelligence" and the autoimmune diseases search term(s) in their title, abstract or key words. Exclusion criteria were: studies not written in English, no real human patient data included, publication prior to 2001, studies that were not peer reviewed, non-autoimmune disease comorbidity research and review papers. 169 (of 702) studies met the criteria for inclusion. Support vector machines and random forests were the most popular ML methods used. ML models using data on multiple sclerosis, rheumatoid arthritis and inflammatory bowel disease were most common. A small proportion of studies (7.7% or 13/169) combined different data types in the modelling process. Cross-validation combined with a separate testing set for more robust model evaluation occurred in 8.3% of papers (14/169). The field may benefit from adopting a best practice of validation, cross-validation and independent testing of ML models. Many models achieved good predictive results in simple scenarios (e.g. classification of cases and controls). Progression to more complex predictive models may be achievable in future through integration of multiple data types.
Affiliation(s)
- I. S. Stafford
- Department of Human Genetics and Genomic Medicine, University of Southampton, Southampton, UK
- Institute for Life Sciences, University of Southampton, Southampton, UK
| | - M. Kellermann
- Department of Human Genetics and Genomic Medicine, University of Southampton, Southampton, UK
| | - E. Mossotto
- Department of Human Genetics and Genomic Medicine, University of Southampton, Southampton, UK
- Institute for Life Sciences, University of Southampton, Southampton, UK
| | - R. M. Beattie
- Department of Paediatric Gastroenterology, Southampton Children’s Hospital, Southampton, UK
| | - B. D. MacArthur
- Institute for Life Sciences, University of Southampton, Southampton, UK
| | - S. Ennis
- Department of Human Genetics and Genomic Medicine, University of Southampton, Southampton, UK
|
17
|
Hong Y, Hou B, Jiang H, Zhang J. Machine learning and artificial neural network accelerated computational discoveries in materials science. WIREs Computational Molecular Science 2019. [DOI: 10.1002/wcms.1450] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Indexed: 12/14/2022]
Affiliation(s)
- Yang Hong
- Department of Chemistry, University of Nebraska‐Lincoln, Lincoln, Nebraska
| | - Bo Hou
- Department of Engineering, University of Cambridge, Cambridge, UK
| | - Hengle Jiang
- Holland Computing Center, University of Nebraska‐Lincoln, Lincoln, Nebraska
| | - Jingchao Zhang
- Holland Computing Center, University of Nebraska‐Lincoln, Lincoln, Nebraska
|
18
|
Galarza-Muñoz G, Briggs FBS, Evsyukova I, Schott-Lerner G, Kennedy EM, Nyanhete T, Wang L, Bergamaschi L, Widen SG, Tomaras GD, Ko DC, Bradrick SS, Barcellos LF, Gregory SG, Garcia-Blanco MA. Human Epistatic Interaction Controls IL7R Splicing and Increases Multiple Sclerosis Risk. Cell 2017; 169:72-84.e13. [PMID: 28340352 DOI: 10.1016/j.cell.2017.03.007] [Citation(s) in RCA: 74] [Impact Index Per Article: 9.3] [Received: 03/18/2016] [Revised: 09/18/2016] [Accepted: 03/02/2017] [Indexed: 12/18/2022]
Abstract
Multiple sclerosis (MS) is an autoimmune disorder where T cells attack neurons in the central nervous system (CNS) leading to demyelination and neurological deficits. A driver of increased MS risk is the soluble form of the interleukin-7 receptor alpha chain gene (sIL7R) produced by alternative splicing of IL7R exon 6. Here, we identified the RNA helicase DDX39B as a potent activator of this exon and consequently a repressor of sIL7R, and we found strong genetic association of DDX39B with MS risk. Indeed, we showed that a genetic variant in the 5' UTR of DDX39B reduces translation of DDX39B mRNAs and increases MS risk. Importantly, this DDX39B variant showed strong genetic and functional epistasis with allelic variants in IL7R exon 6. This study establishes the occurrence of biological epistasis in humans and provides mechanistic insight into the regulation of IL7R exon 6 splicing and its impact on MS risk.
Affiliation(s)
- Gaddiel Galarza-Muñoz
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA; Center for RNA Biology, Duke University, Durham, NC 27710, USA; Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, TX 77555, USA
| | - Farren B S Briggs
- Department of Epidemiology and Biostatistics, School of Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
| | - Irina Evsyukova
- Center for RNA Biology, Duke University, Durham, NC 27710, USA
| | - Geraldine Schott-Lerner
- Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, TX 77555, USA
| | - Edward M Kennedy
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA
| | - Tinashe Nyanhete
- Department of Immunology, Duke University Durham, NC 27710, USA; Department of Surgery, Duke University Durham, NC 27710, USA
| | - Liuyang Wang
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA
| | - Laura Bergamaschi
- Duke Molecular Physiology Institute, Duke University, Durham, NC 27701, USA
| | - Steven G Widen
- Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, TX 77555, USA
| | - Georgia D Tomaras
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA; Department of Immunology, Duke University Durham, NC 27710, USA; Department of Surgery, Duke University Durham, NC 27710, USA
| | - Dennis C Ko
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA; Department of Medicine, Duke University Medical Center; Durham, NC 27710, USA
| | - Shelton S Bradrick
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA; Center for RNA Biology, Duke University, Durham, NC 27710, USA; Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, TX 77555, USA
| | - Lisa F Barcellos
- Division of Epidemiology, School of Public Health, University of California Berkeley, Berkeley, CA 94720, USA
| | - Simon G Gregory
- Duke Molecular Physiology Institute, Duke University, Durham, NC 27701, USA; Department of Neurology, Duke University Medical Center, Durham, NC 27710, USA.
| | - Mariano A Garcia-Blanco
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA; Center for RNA Biology, Duke University, Durham, NC 27710, USA; Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, TX 77555, USA.
|
19
|
Zhang X, Yuan Z, Ji J, Li H, Xue F. Network or regression-based methods for disease discrimination: a comparison study. BMC Med Res Methodol 2016; 16:100. [PMID: 27538955 PMCID: PMC4991108 DOI: 10.1186/s12874-016-0207-2] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 12/06/2015] [Accepted: 08/09/2016] [Indexed: 11/23/2022] Open
Abstract
Background: In stark contrast to the network-centric view of complex disease, regression-based methods are preferred in disease prediction, especially by epidemiologists and clinical professionals. It remains controversial whether network-based methods outperform regression-based methods and, if so, to what extent. Methods: Simulations under different scenarios (input variables that are independent or in a network relationship) as well as an application were conducted to assess the prediction performance of four typical methods: Bayesian network, neural network, logistic regression and regression splines. Results: The simulation results reveal that the Bayesian network performed better when the variables were in a network relationship or in a chain structure. For the special wheel network structure, logistic regression performed considerably well compared with the other methods. A further application to a GWAS of leprosy shows that the Bayesian network still outperforms the other methods. Conclusion: Although regression-based methods are still popular and widely used, network-based approaches deserve more attention, since they capture the complex relationships between variables. Electronic supplementary material: The online version of this article (doi:10.1186/s12874-016-0207-2) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Xiaoshuai Zhang
- Department of Epidemiology and Biostatistics, School of Public Health, Shandong University, PO Box 100, Jinan, 250012, China
| | - Zhongshang Yuan
- Department of Epidemiology and Biostatistics, School of Public Health, Shandong University, PO Box 100, Jinan, 250012, China
| | - Jiadong Ji
- Department of Epidemiology and Biostatistics, School of Public Health, Shandong University, PO Box 100, Jinan, 250012, China
| | - Hongkai Li
- Department of Epidemiology and Biostatistics, School of Public Health, Shandong University, PO Box 100, Jinan, 250012, China
| | - Fuzhong Xue
- Department of Epidemiology and Biostatistics, School of Public Health, Shandong University, PO Box 100, Jinan, 250012, China.
|
20
|
Genetic data: The new challenge of personalized medicine, insights for rheumatoid arthritis patients. Gene 2016; 583:90-101. [PMID: 26869316 DOI: 10.1016/j.gene.2016.02.004] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Received: 10/31/2015] [Revised: 01/18/2016] [Accepted: 02/05/2016] [Indexed: 01/15/2023]
Abstract
Rapid advances in genotyping technology, analytical methods, and the establishment of large cohorts for population genetic studies have resulted in a large new body of information about the genetic basis of human rheumatoid arthritis (RA). Improved understanding of the root pathogenesis of the disease holds the promise of improved diagnostic and prognostic tools based upon this information. In this review, we summarize the nature of new genetic findings in human RA, including susceptibility loci and gene-gene and gene-environment interactions, as well as genetic loci associated with sub-groups of patients and those associated with response to therapy. Possible uses of these data are discussed, such as prediction of disease risk as well as personalized therapy and prediction of therapeutic response and risk of adverse events. While these applications are largely not refined to the point of clinical utility in RA, it seems likely that multi-parameter datasets including genetic, clinical, and biomarker data will be employed in the future care of RA patients.
|
21
|
Mikuls TR, Reimold A, Kerr GS, Cannon GW. Insights and Implications of the VA Rheumatoid Arthritis Registry. Fed Pract 2015; 32:24-29. [PMID: 30766061 PMCID: PMC6363303] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
The VA Rheumatoid Arthritis Registry addresses the underrepresentation of veterans in rheumatoid arthritis research, serving as a repository that links banked serum, plasma, and DNA samples with an array of patient-level information.
Affiliation(s)
- Ted R Mikuls
- Staff rheumatologist and research scientist at the VA Nebraska Western-Iowa Health Care System and the Umbach Professor of Rheumatology at the University of Nebraska Medical Center, both in Omaha
| | - Andreas Reimold
- Chief of the Rheumatology Section at the Dallas VAMC and associate professor at the University of Texas Southwestern Medical Center, both in Dallas
| | - Gail S Kerr
- Chief of the Division of Rheumatology at the Washington DC VAMC and professor of medicine at Georgetown University, both in Washington, DC
| | - Grant W Cannon
- Associate chief of staff at the George E. Wahlen VAMC and professor of medicine in the Division of Rheumatology at the University of Utah, both in Salt Lake City
|
22
|
Elshazli R, Settin A. Association of PTPN22 rs2476601 and STAT4 rs7574865 polymorphisms with rheumatoid arthritis: A meta-analysis update. Immunobiology 2015; 220:1012-24. [PMID: 25963842 DOI: 10.1016/j.imbio.2015.04.003] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 03/14/2015] [Accepted: 04/20/2015] [Indexed: 12/14/2022]
Abstract
BACKGROUND Rheumatoid arthritis (RA) is a common autoimmune disease with a complex genetic background. The genes encoding protein tyrosine phosphatase non-receptor type 22 (PTPN22) and signal transducer and activator of transcription 4 (STAT4) have been reported to be associated with RA in several ethnic populations. OBJECTIVES This work aims to assess the association of PTPN22 rs2476601 and STAT4 rs7574865 polymorphisms with RA susceptibility through an updated meta-analysis of available case-control studies. METHODS A literature search of all relevant studies published from January 2007 up to December 2014 was conducted using the PubMed and ScienceDirect databases. Studies examining the association of PTPN22 rs2476601 and STAT4 rs7574865 polymorphisms with RA susceptibility were identified. Meta-analysis of the pooled and stratified data was done and assessed using varied genetic models. RESULTS Thirty-seven case-control studies with a total of 47 comparisons (29 for PTPN22 rs2476601 polymorphism and 18 for STAT4 rs7574865 polymorphism) met our inclusion criteria. The meta-analysis showed an association of the PTPN22 T allele and the CT+TT and TT genotypes with RA susceptibility. Furthermore, the meta-analysis showed an association of the STAT4 T allele and the GT+TT and TT genotypes with RA susceptibility. Stratification of RA patients according to ethnic groups showed that the PTPN22 T allele, PTPN22 CT+TT genotypes, STAT4 T allele and STAT4 GT+TT genotypes were significantly associated with RA in European, Asian and African subjects, while the PTPN22 TT genotype was significantly associated with RA in European but not in Asian and African subjects, and the STAT4 TT genotype was significantly associated with RA in European and Asian but not in African subjects.
A subgroup analysis according to the presence or absence of rheumatoid factor (RF) and anti-cyclic citrullinated peptide (anti-CCP) antibodies revealed that the association of PTPN22 rs2476601 and STAT4 rs7574865 polymorphisms with RA susceptibility may not be dependent on RF and anti-CCP antibodies. CONCLUSIONS Our meta-analysis demonstrated that PTPN22 rs2476601 and STAT4 rs7574865 polymorphisms confer susceptibility to RA in total subjects and in major ethnic groups. The association may not be dependent on RF and anti-CCP antibodies.
Affiliation(s)
- Rami Elshazli
- Department of Biochemistry, Faculty of Science, Tanta University, Tanta, Egypt.
| | - Ahmad Settin
- Genetics Unit, Children Hospital, Mansoura University, Mansoura, Egypt
|
23
|
Jiang X, Neapolitan RE. LEAP: biomarker inference through learning and evaluating association patterns. Genet Epidemiol 2015; 39:173-84. [PMID: 25677188 PMCID: PMC4366363 DOI: 10.1002/gepi.21889] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 10/16/2014] [Revised: 12/16/2014] [Accepted: 01/06/2015] [Indexed: 01/22/2023]
Abstract
Single nucleotide polymorphism (SNP) high-dimensional datasets are available from Genome Wide Association Studies (GWAS). Such data provide researchers opportunities to investigate the complex genetic basis of diseases. Much of genetic risk might be due to undiscovered epistatic interactions, which are interactions in which combinations of several genes affect disease. Research aimed at discovering interacting SNPs from GWAS datasets has proceeded in two directions. First, tools were developed to evaluate candidate interactions. Second, algorithms were developed to search over the space of candidate interactions. Another problem when learning interacting SNPs, which has not received much attention, is evaluating how likely it is that the learned SNPs are associated with the disease. A complete system should provide this information as well. We develop such a system. Our system, called LEAP, includes a new heuristic search algorithm for learning interacting SNPs, and a Bayesian network based algorithm for computing the probability of their association. We evaluated the performance of LEAP using 100 1,000-SNP simulated datasets, each of which contains 15 SNPs involved in interactions. When learning interacting SNPs from these datasets, LEAP outperformed seven other methods. Furthermore, only SNPs involved in interactions were found to be probable. We also used LEAP to analyze real Alzheimer's disease and breast cancer GWAS datasets. We obtained interesting and new results from the Alzheimer's dataset, but limited results from the breast cancer dataset. We conclude that our results support that LEAP is a useful tool for extracting candidate interacting SNPs from high-dimensional datasets and determining their probability.
Affiliation(s)
- Xia Jiang
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
|
24
|
Ferreiro-Iglesias A, Calaza M, Perez-Pampin E, Lopez Longo FJ, Marenco JL, Blanco FJ, Narvaez J, Navarro F, Cañete JD, de la Serna AR, Gonzalez-Alvaro I, Herrero-Beaumont G, Pablos JL, Balsa A, Fernandez-Gutierrez B, Caliz R, Gomez-Reino JJ, Gonzalez A. Lack of replication of interactions between polymorphisms in rheumatoid arthritis susceptibility: case-control study. Arthritis Res Ther 2014; 16:436. [PMID: 25260880 PMCID: PMC4207328 DOI: 10.1186/s13075-014-0436-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 03/24/2014] [Accepted: 08/21/2014] [Indexed: 01/18/2023] Open
Abstract
Introduction Approximately 100 loci have been definitively associated with rheumatoid arthritis (RA) susceptibility. However, they explain only a fraction of RA heritability. Interactions between polymorphisms could explain part of the remaining heritability. Multiple interactions have been reported, but only the shared epitope (SE) × protein tyrosine phosphatase nonreceptor type 22 (PTPN22) interaction has been replicated convincingly. Two recent studies deserve attention because of their quality, including their replication in a second sample collection. In one of them, researchers identified interactions between PTPN22 and seven single-nucleotide polymorphisms (SNPs). The other showed interactions between the SE and the null genotype of glutathione S-transferase Mu 1 (GSTM1) in the anti–cyclic citrullinated peptide–positive (anti-CCP+) patients. In the present study, we aimed to replicate association with RA susceptibility of interactions described in these two high-quality studies. Methods A total of 1,744 patients with RA and 1,650 healthy controls of Spanish ancestry were studied. Polymorphisms were genotyped by single-base extension. SE genotypes of 736 patients were available from previous studies. Interaction analysis was done using multiple methods, including those originally reported and the most powerful methods described. Results Genotypes of one of the SNPs (rs4695888) failed quality control tests. The call rate for the other eight polymorphisms was 99.9%. The frequencies of the polymorphisms were similar in RA patients and controls, except for PTPN22 SNP. None of the interactions between PTPN22 SNPs and the six SNPs that met quality control tests was replicated as a significant interaction term—the originally reported finding—or with any of the other methods. Nor was the interaction between GSTM1 and the SE replicated as a departure from additivity in anti-CCP+ patients or with any of the other methods. 
Conclusions None of the interactions tested were replicated in spite of sufficient power and assessment with different assays. These negative results indicate that whether interactions are significant contributors to RA susceptibility remains unknown and that strict standards need to be applied to claim that an interaction exists.
|
25
|
Jiang X, Neapolitan RE. Mining pure, strict epistatic interactions from high-dimensional datasets: ameliorating the curse of dimensionality. PLoS One 2012; 7:e46771. [PMID: 23071633 PMCID: PMC3470561 DOI: 10.1371/journal.pone.0046771] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 07/01/2012] [Accepted: 09/07/2012] [Indexed: 01/13/2023] Open
Abstract
BACKGROUND The interaction between loci to affect phenotype is called epistasis. It is strict epistasis if no proper subset of the interacting loci exhibits a marginal effect. For many diseases, it is likely that unknown epistatic interactions affect disease susceptibility. A difficulty when mining epistatic interactions from high-dimensional datasets concerns the curse of dimensionality. There are too many combinations of SNPs to perform an exhaustive search. A method that could locate strict epistasis without an exhaustive search can be considered the brass ring of methods for analyzing high-dimensional datasets. METHODOLOGY/FINDINGS A SNP pattern is a Bayesian network representing SNP-disease relationships. The Bayesian score for a SNP pattern is the probability of the data given the pattern, and has been used to learn SNP patterns. We identified a bound for the score of a SNP pattern. The bound provides an upper limit on the Bayesian score of any pattern that could be obtained by expanding a given pattern. We felt that the bound might enable the data to say something about the promise of expanding a 1-SNP pattern even when there are no marginal effects. We tested the bound using simulated datasets and semi-synthetic high-dimensional datasets obtained from GWAS datasets. We found that the bound was able to dramatically reduce the search time for strict epistasis. Using an Alzheimer's dataset, we showed that it is possible to discover an interaction involving the APOE gene based on its score because of its large marginal effect, but that the bound is most effective at discovering interactions without marginal effects. CONCLUSIONS/SIGNIFICANCE We conclude that the bound appears to ameliorate the curse of dimensionality in high-dimensional datasets. This is a very consequential result and could be pivotal in our efforts to reveal the dark matter of genetic disease risk from high-dimensional datasets.
Affiliation(s)
- Xia Jiang
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America.
|
26
|
Abstract
The present review evaluates the key features of the WFDC1 [WAP (whey acidic protein) four disulfide core 1] gene that encodes ps20 (20 kDa prostate stromal protein), a member of the WAP family. ps20 was first characterized as a growth inhibitory activity that was secreted by fetal urogenital sinus mesenchymal cells. Purified ps20 exhibited several activities that centre on cell adhesion, migration and proliferation. The WFDC1 gene was cloned, contained seven exons, and was mapped to chromosome 16q24, suggesting that it may function as a tumour suppressor; however, direct evidence of this has not emerged. In vivo, ps20 stimulated angiogenesis, although expression of WFDC1/ps20 was down-regulated in the reactive stroma tumour microenvironment in prostate cancer. WFDC1 expression is differential in other cancers and inflammatory conditions. Recent studies point to a role in viral infectivity. Although mechanisms of action are not fully understood, WFDC1/ps20 is emerging as a secreted matricellular protein that probably affects response to micro-organisms and tissue repair homoeostasis.
|
27
|
Boulesteix AL, Bender A, Lorenzo Bermejo J, Strobl C. Random forest Gini importance favours SNPs with large minor allele frequency: impact, sources and recommendations. Brief Bioinform 2011; 13:292-304. [DOI: 10.1093/bib/bbr053] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.7] [Indexed: 02/05/2023] Open
|
28
|
Abstract
PURPOSE OF REVIEW To review recent progress in the genetics of rheumatoid arthritis (RA) and discuss the implications for understanding the pathogenesis of the disease as well as clinical application. RECENT FINDINGS Protection against anticitrullinated protein antibody (ACPA) positive RA was shown to be associated with DRB1*1301. Genome-wide association studies (GWASs) added about 10 new loci to the list of already more than 20 loci associated with RA, so the list is now over 30. Typing for the known risk loci is not helpful for prediction of the risk for RA. It is remarkable how few functional studies have been published. SUMMARY Known genetic factors explain 50-60% of the genetic variance for susceptibility to ACPA-positive and 30-50% for ACPA-negative RA. Searching for the remaining missing or hidden heritability is in all probability not going to yield much for prediction and/or targeted intervention. Therefore, I conclude that if you want to find more genes you should have a lot of patience, time and money, stop with conventional GWAS and invest in large-scale sequencing of selected patients and controls. I have a better suggestion, however: use the information that is already available to perform functional studies in order to understand the mechanism of the known associations!
|
29
|
Liu C, Ackerman HH, Carulli JP. A genome-wide screen of gene-gene interactions for rheumatoid arthritis susceptibility. Hum Genet 2011; 129:473-85. [PMID: 21210282 DOI: 10.1007/s00439-010-0943-z] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.3] [Received: 10/16/2010] [Accepted: 12/22/2010] [Indexed: 12/16/2022]
Abstract
The objective of the study was to identify interacting genes contributing to rheumatoid arthritis (RA) susceptibility and identify SNPs that discriminate between RA patients who were anti-cyclic citrullinated protein positive and healthy controls. We analyzed two independent cohorts from the North American Rheumatoid Arthritis Consortium. A cohort of 908 RA cases and 1,260 controls was used to discover pairwise interactions among SNPs and to identify a set of single nucleotide polymorphisms (SNPs) that predict RA status, and a second cohort of 952 cases and 1,760 controls was used to validate the findings. After adjusting for HLA-shared epitope alleles, we identified and replicated seven SNP pairs within the HLA class II locus with significant interaction effects. We failed to replicate significant pairwise interactions among non-HLA SNPs. The machine learning approach "random forest" applied to a set of SNPs selected from single-SNP and pairwise interaction tests identified 93 SNPs that distinguish RA cases from controls with 70% accuracy. HLA SNPs provide the most classification information, and inclusion of non-HLA SNPs improved classification. While specific gene-gene interactions are difficult to validate using genome-wide SNP data, a stepwise approach combining association and classification methods identifies candidate interacting SNPs that distinguish RA cases from healthy controls.
Affiliation(s)
- Chunyu Liu
- Center for Population Studies and the Framingham Heart Study, National Heart, Lung, and Blood Institute/NIH, 73 Mt. Wayte Avenue, Framingham, MA 01702, USA.
|
30
|
Machine learning techniques for single nucleotide polymorphism--disease classification models in schizophrenia. Molecules 2010; 15:4875-89. [PMID: 20657396 PMCID: PMC6257637 DOI: 10.3390/molecules15074875] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 06/04/2010] [Revised: 07/08/2010] [Accepted: 07/09/2010] [Indexed: 11/16/2022] Open
Abstract
Single nucleotide polymorphisms (SNPs) can be used as inputs in computational disease studies such as pattern searching and classification models. Schizophrenia is an example of a complex disease with an important social impact. The multiple causes of this disease create the need for new genetic or proteomic patterns that can diagnose patients using biological information. This work presents a computational study of machine learning disease classification models using only single nucleotide polymorphisms at the HTR2A and DRD3 genes from Galician (Northwest Spain) schizophrenic patients. These classification models establish for the first time, to the best knowledge of the authors, a relationship between the sequence of the nucleic acid molecule and schizophrenia (Quantitative Genotype – Disease Relationships) that can automatically recognize schizophrenia DNA sequences and correctly classify between 78.3% and 93.8% of schizophrenia subjects when using datasets which include simulated negative subjects and a linear artificial neural network.
|