1
Multiple Criteria Optimization (MCO): A gene selection deterministic tool in RStudio. PLoS One 2022; 17:e0262890. [PMID: 35085348 PMCID: PMC8794188 DOI: 10.1371/journal.pone.0262890] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/10/2020] [Accepted: 01/09/2022] [Indexed: 11/19/2022] Open
Abstract
Identifying genes with the largest expression changes (gene selection) to characterize a given condition is a popular first step to drive exploration into molecular mechanisms and is, therefore, paramount for therapeutic development. Reproducibility in the sciences makes it necessary to emphasize objectivity and systematic repeatability in biological and informatics analyses, including gene selection. With these two characteristics in mind, in previous works our research team has proposed using multiple criteria optimization (MCO) in gene selection to analyze microarray datasets. The result of this effort is the MCO algorithm, which selects genes with the largest expression changes without user manipulation of either informatics or statistical parameters. Furthermore, the user is not required to choose a priori either a preference structure among multiple measures or a predetermined quantity of genes to be deemed significant. This implies that, using the same datasets and performance measures (PMs), the method will converge to the same set of selected differentially expressed genes (repeatability) regardless of who carries out the analysis (objectivity). The present work describes the development of an open-source tool in RStudio to enable both (1) individual analysis of single datasets with two or three PMs and (2) meta-analysis with up to five microarray datasets, using one PM from each dataset. The capabilities afforded by the code include license-free portability and the possibility of carrying out analyses on modest computer hardware, such as personal laptops. The code provides affordable, repeatable, and objective detection of differentially expressed genes from microarrays. It can be used to analyze other experiments with similar comparative experimental layouts, such as microRNA arrays and protein arrays, among others.
As a demonstration of the capabilities of the code, the analysis of four publicly available microarray datasets related to Parkinson's Disease (PD) is presented here, treating each dataset individually or as a four-way meta-analysis. These MCO-supported analyses made it possible to identify MMP9 and TUBB2A as potential PD genetic biomarkers based on their persistent appearance across each of the case studies. A literature search confirmed the importance of these genes in PD and indeed as PD biomarkers, which evidences the code's potential.
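The parameter-free selection described in this abstract can be illustrated with a Pareto (non-dominated) filter. What follows is only a minimal sketch in Python, not the RStudio tool itself: the abstract does not say which PMs are used or how the optimization is solved, so the two measures (absolute difference of means and of medians), the gene names, and the function `pareto_front` are all hypothetical. The sketch assumes every PM is to be maximized.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b on every measure and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: Dict[str, Sequence[float]]) -> List[str]:
    """Return the genes that no other gene dominates (all measures maximized)."""
    return sorted(
        gene
        for gene, pm in scores.items()
        if not any(dominates(other, pm) for name, other in scores.items() if name != gene)
    )


# Hypothetical PM values per gene: (|difference of means|, |difference of medians|)
example_scores = {
    "GENE_A": (2.0, 0.5),
    "GENE_B": (1.5, 1.5),
    "GENE_C": (0.5, 2.0),
    "GENE_D": (1.0, 1.0),  # dominated by GENE_B
    "GENE_E": (0.2, 0.1),  # dominated by every other gene
}

selected = pareto_front(example_scores)
print(selected)  # ['GENE_A', 'GENE_B', 'GENE_C']
```

No threshold, weight, or target gene count appears anywhere in the filter, which is why two analysts running it on the same PM values obtain the same gene set. Dominance also compares each measure only with itself, so rescaling one PM by a positive constant leaves the selection unchanged.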
2
Cruz-Rivera YE, Perez-Morales J, Santiago YM, Gonzalez VM, Morales L, Cabrera-Rios M, Isaza CE. A Selection of Important Genes and Their Correlated Behavior in Alzheimer's Disease. J Alzheimers Dis 2019; 65:193-205. [PMID: 30040709 PMCID: PMC6087431 DOI: 10.3233/jad-170799] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 01/04/2023]
Abstract
In 2017, approximately 5 million Americans were living with Alzheimer’s disease (AD), and it is estimated that by 2050 this number could increase to 16 million. In this study, we apply mathematical optimization to microarray analysis to detect differentially expressed genes and determine the most correlated structure among their expression changes. The analysis of the GSE4757 microarray dataset, which compares expression between AD neurons without neurofibrillary tangles (controls) and with neurofibrillary tangles (cases), was cast as a multiple criteria optimization (MCO) problem. Through the analysis it was possible to determine a series of Pareto-efficient frontiers to find the most differentially expressed genes, which are here proposed as potential AD biomarkers. The Traveling Salesman Problem (TSP) model was used to find the cyclical path of maximal correlation between the expression changes among the genes deemed important from the previous stage. This leads to a structure capable of guiding biological exploration with enhanced precision and repeatability. Ten genes were selected (FTL, GFAP, HNRNPA3, COX1, ND2, ND3, ND4, NUCKS1, RPL41, and RPS10) and their most correlated cyclic structure was found in our analyses. The biological functions of their products were found to be linked to inflammation and neurodegenerative diseases, and some of them had not been reported for AD before. The TSP path connects genes coding for mitochondrial electron transfer proteins. Some of these proteins are closely related to other electron transport proteins already reported as important for AD.
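The TSP step in this abstract can be sketched as an exhaustive search over cyclic orderings of the selected genes. This is an illustrative Python sketch under stated assumptions, not the authors' formulation: the abstract gives neither the solver nor the exact edge weight, so the sketch assumes the weight between two genes is the absolute value of their correlation, and the gene labels, the correlation values, and the function `max_correlation_cycle` are hypothetical.

```python
from itertools import permutations
from typing import List, Sequence, Tuple


def max_correlation_cycle(
    names: Sequence[str], corr: Sequence[Sequence[float]]
) -> Tuple[List[str], float]:
    """Brute-force the cycle visiting every gene once with maximal total |correlation|."""
    n = len(names)
    best_order: Tuple[int, ...] = tuple(range(n))
    best_total = float("-inf")
    # Fix gene 0 as the start so rotations of the same cycle are not re-evaluated.
    for rest in permutations(range(1, n)):
        order = (0,) + rest
        total = sum(abs(corr[order[i]][order[(i + 1) % n]]) for i in range(n))
        if total > best_total:
            best_order, best_total = order, total
    return [names[i] for i in best_order], best_total


# Hypothetical symmetric correlation matrix for four placeholder genes.
genes = ["G1", "G2", "G3", "G4"]
correlation = [
    [1.000, 0.875, 0.250, 0.750],
    [0.875, 1.000, 0.625, 0.125],
    [0.250, 0.625, 1.000, 0.500],
    [0.750, 0.125, 0.500, 1.000],
]

cycle, total = max_correlation_cycle(genes, correlation)
print(cycle, total)  # ['G1', 'G2', 'G3', 'G4'] 2.75
```

Exhaustive search is used here only because it is easy to verify. With the start fixed, ten genes give 9! = 362,880 orderings, which is still tractable, but larger gene sets would call for a dedicated TSP solver.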
Affiliation(s)
- Yazeli E Cruz-Rivera
- The Applied Optimization Group/Department of Industrial Engineering, University of Puerto Rico, Mayagüez Campus, Mayagüez, Puerto Rico
- Jaileene Perez-Morales
- Department of Basic Science-Biochemistry Division, Ponce Health Sciences University, Ponce, Puerto Rico
- Yaritza M Santiago
- The Applied Optimization Group/Department of Industrial Engineering, University of Puerto Rico, Mayagüez Campus, Mayagüez, Puerto Rico
- Valerie M Gonzalez
- The Applied Optimization Group/Department of Industrial Engineering, University of Puerto Rico, Mayagüez Campus, Mayagüez, Puerto Rico
- Luisa Morales
- Public Health Program, Ponce Health Sciences University, Ponce, Puerto Rico
- Mauricio Cabrera-Rios
- The Applied Optimization Group/Department of Industrial Engineering, University of Puerto Rico, Mayagüez Campus, Mayagüez, Puerto Rico
- Clara E Isaza
- The Applied Optimization Group/Department of Industrial Engineering, University of Puerto Rico, Mayagüez Campus, Mayagüez, Puerto Rico; Public Health Program, Ponce Health Sciences University, Ponce, Puerto Rico
3
Azimzadeh Jamalkandi S, Mozhgani SH, Gholami Pourbadie H, Mirzaie M, Noorbakhsh F, Vaziri B, Gholami A, Ansari-Pour N, Jafari M. Systems Biomedicine of Rabies Delineates the Affected Signaling Pathways. Front Microbiol 2016; 7:1688. [PMID: 27872612 PMCID: PMC5098112 DOI: 10.3389/fmicb.2016.01688] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 08/10/2016] [Accepted: 10/07/2016] [Indexed: 12/16/2022] Open
Abstract
The prototypical neurotropic virus, rabies, is a member of the Rhabdoviridae family that causes lethal encephalomyelitis. Although there have been a plethora of studies investigating the etiological mechanism of the rabies virus, and many precautionary methods have been implemented to avert disease outbreaks over the last century, the disease surprisingly has no definite remedy at its late stages. The psychological symptoms and the underlying etiology, as well as the rare survival rate from rabies encephalitis, have remained a mystery. We, therefore, undertook a systems biomedicine approach to identify the network of gene products implicated in rabies. This was done by meta-analyzing whole-transcriptome microarray datasets of the CNS infected by strain CVS-11, and integrating them with interactome data using computational and statistical methods. We first determined the differentially expressed genes (DEGs) in each study and horizontally integrated the results at the mRNA and microRNA levels separately. A total of 61 seed genes involved in the signal propagation system were obtained by unifying the integrated DEGs detected at the mRNA and microRNA levels. We then reconstructed a refined protein–protein interaction network (PPIN) of infected cells to elucidate the rabies-implicated signal transduction network (RISN). To validate our findings, we confirmed differential expression of randomly selected genes in the network using real-time PCR. In conclusion, the identification of seed genes and their network neighborhood within the refined PPIN can be useful for demonstrating signaling pathways including interferon circumvent, toward proliferation and survival, and neuropathological clue, explaining the intricate underlying molecular neuropathology of rabies infection and thus rendering a molecular framework for predicting potential drug targets.
Affiliation(s)
- Sayed-Hamidreza Mozhgani
- Department of Virology, School of Public Health, Tehran University of Medical Sciences, Tehran, Iran
- Mehdi Mirzaie
- Department of Applied Mathematics, Faculty of Mathematical Sciences, Tarbiat Modares University, Tehran, Iran
- Farshid Noorbakhsh
- Department of Immunology, School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- Behrouz Vaziri
- Protein Chemistry and Proteomics Unit, Medical Biotechnology Department, Biotechnology Research Center, Pasteur Institute of Iran, Tehran, Iran
- Alireza Gholami
- WHO Collaborating Center for Reference and Research on Rabies, Pasteur Institute of Iran, Tehran, Iran
- Naser Ansari-Pour
- Faculty of New Sciences and Technology, University of Tehran, Tehran, Iran; Department of Genetics, Evolution and Environment, UCL Genetics Institute, University College London, London, UK
- Mohieddin Jafari
- Drug Design and Bioinformatics Unit, Medical Biotechnology Department, Biotechnology Research Center, Pasteur Institute of Iran, Tehran, Iran
4
Camacho-Cáceres KI, Acevedo-Díaz JC, Pérez-Marty LM, Ortiz M, Irizarry J, Cabrera-Ríos M, Isaza CE. Multiple criteria optimization joint analyses of microarray experiments in lung cancer: from existing microarray data to new knowledge. Cancer Med 2015; 4:1884-900. [PMID: 26471143 PMCID: PMC4940807 DOI: 10.1002/cam4.540] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 02/08/2015] [Revised: 07/30/2015] [Accepted: 07/14/2015] [Indexed: 12/14/2022] Open
Abstract
Microarrays can provide large amounts of data on relative gene expression in illnesses of interest, such as cancer, in a short time. These data, however, are stored and oftentimes abandoned when new experimental technologies arrive. This work reexamines lung cancer microarray data with a novel multiple criteria optimization-based strategy aiming to detect highly differentially expressed genes. This strategy does not require any adjustment of parameters by the user and is capable of handling multiple and incommensurate units across microarrays. In the analysis, groups of samples from patients with distinct smoking habits (never smoker, current smoker) and different genders are contrasted to elicit sets of highly differentially expressed genes, several of which are already associated with lung cancer and other types of cancer. The list of genes is provided with a discussion of their role in cancer, as well as the possible research directions for each of them.
Affiliation(s)
- Katia I Camacho-Cáceres
- Bio IE Lab, The Applied Optimization Group, Industrial Engineering Department, University of Puerto Rico, Mayaguez, Puerto Rico
- Juan C Acevedo-Díaz
- Bio IE Lab, The Applied Optimization Group, Industrial Engineering Department, University of Puerto Rico, Mayaguez, Puerto Rico
- Lynn M Pérez-Marty
- Bio IE Lab, The Applied Optimization Group, Industrial Engineering Department, University of Puerto Rico, Mayaguez, Puerto Rico
- Michael Ortiz
- Bio IE Lab, The Applied Optimization Group, Industrial Engineering Department, University of Puerto Rico, Mayaguez, Puerto Rico
- Juan Irizarry
- Bio IE Lab, The Applied Optimization Group, Industrial Engineering Department, University of Puerto Rico, Mayaguez, Puerto Rico
- Mauricio Cabrera-Ríos
- Bio IE Lab, The Applied Optimization Group, Industrial Engineering Department, University of Puerto Rico, Mayaguez, Puerto Rico
- Clara E Isaza
- Bio IE Lab, The Applied Optimization Group, Industrial Engineering Department, University of Puerto Rico, Mayaguez, Puerto Rico; Public Health Program, Ponce Health Sciences University, Ponce, Puerto Rico