1. Morais-Rodrigues F, Silvério-Machado R, Kato RB, Rodrigues DLN, Valdez-Baez J, Fonseca V, San EJ, Gomes LGR, Dos Santos RG, Vinicius Canário Viana M, da Cruz Ferraz Dutra J, Teixeira Dornelles Parise M, Parise D, Campos FF, de Souza SJ, Ortega JM, Barh D, Ghosh P, Azevedo VAC, Dos Santos MA. Analysis of the microarray gene expression for breast cancer progression after the application modified logistic regression. Gene 2019; 726:144168. [PMID: 31759986 DOI: 10.1016/j.gene.2019.144168] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 07/05/2019] [Revised: 09/21/2019] [Accepted: 10/11/2019] [Indexed: 01/02/2023]
Abstract
Methods based on statistics and linear algebra are increasingly used to address emerging questions in the microarray literature. Microarray technology is a long-established tool for the global analysis of gene expression, allowing the simultaneous investigation of hundreds or thousands of genes in a sample. Microarray data are characterized by a small sample size and a large number of features, which produces a non-square, rank-deficient data matrix that can admit countless solutions in classifiers. To avoid the 'curse of dimensionality', many authors have performed feature selection or reduced the size of the data matrix. In this work, we introduce a new logistic regression-based model to classify breast cancer tumor samples from microarray expression data, using all gene expression features and without reducing the microarray data matrix. If the user still deems feature reduction necessary, it can be performed after applying the methodology while still maintaining good classification. This methodology allowed the correct classification of breast cancer samples from the Gene Expression Omnibus (GEO) data series GSE65194, GSE20711, and GSE25055, which contain the microarray data for those samples. Classification had a minimum performance of 80% (sensitivity and specificity) and explored all possible data combinations, including breast cancer subtypes. The methodology highlighted genes not yet studied in breast cancer, some of which have been observed in Gene Regulatory Networks (GRNs). In this work we examine the patterns and features of a GRN composed of transcription factors (TFs) in MCF-7 breast cancer cell lines, providing valuable information regarding breast cancer. In particular, some genes have associated α_i* parameter values that are extremely positive or extremely negative and, as such, can be identified as breast cancer prediction genes.
We indicate that the PKN2, MKL1, MED23, CUL5 and GLI genes show a tumor suppressor profile, and that the MTR, ITGA2B, TELO2, MRPL9, MTTL1, WIPI1, KLHL20, PI4KB, FOLR1 and SHC1 genes show an oncogenic profile. We propose that these may serve as potential breast cancer prediction genes and should be prioritized for further clinical studies on breast cancer. This new model allows values to be assigned to the α_i* parameters associated with gene expression. Some α_i* parameters are associated with genes previously described as breast cancer biomarkers, and others with genes not yet studied in relation to this disease.
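The 80% minimum reported above refers to sensitivity and specificity. As a minimal sketch of how those two figures are computed for a binary classifier, the following uses hypothetical labels (not data from the study) and a hypothetical helper name, `sensitivity_specificity`; it does not reproduce the authors' model.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    sensitivity = tp / (tp + fn)  # fraction of positives correctly detected
    specificity = tn / (tn + fp)  # fraction of negatives correctly rejected
    return sensitivity, specificity


# Hypothetical example: 5 tumor samples (1) and 5 control samples (0).
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
# Check both measures against the 80% minimum quoted in the abstract.
meets_threshold = min(sens, spec) >= 0.80
print(sens, spec, meets_threshold)
```

Reporting both measures, rather than accuracy alone, matters when the classes are unbalanced, which is common in small microarray cohorts.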
Affiliation(s)
- Francielly Morais-Rodrigues
  - Institute of Biological Sciences, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil
- Rita Silvério-Machado
  - Institute of Biological Sciences, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil
- Rodrigo Bentes Kato
  - Institute of Biological Sciences, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil
- Diego Lucas Neres Rodrigues
  - Institute of Biological Sciences, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil
- Juan Valdez-Baez
  - Institute of Biological Sciences, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil
- Vagner Fonseca
  - Institute of Biological Sciences, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil; KwaZulu-Natal Research Innovation and Sequencing Platform (KRISP), College of Health Sciences, University of KwaZulu-Natal, Durban 4001, South Africa
- Emmanuel James San
  - KwaZulu-Natal Research Innovation and Sequencing Platform (KRISP), College of Health Sciences, University of KwaZulu-Natal, Durban 4001, South Africa
- Lucas Gabriel Rodrigues Gomes
  - Institute of Biological Sciences, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil
- Roselane Gonçalves Dos Santos
  - Institute of Biological Sciences, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil
- Marcus Vinicius Canário Viana
  - Institute of Biological Sciences, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil; Federal University of Pará, UFPA, Brazil
- Joyce da Cruz Ferraz Dutra
  - Institute of Biological Sciences, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil
- Mariana Teixeira Dornelles Parise
  - Institute of Biological Sciences, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil
- Doglas Parise
  - Institute of Biological Sciences, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil
- Frederico F Campos
  - Department of Computer Science, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil
- José Miguel Ortega
  - Institute of Biological Sciences, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil
- Debmalya Barh
  - Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, West Bengal 721172, India
- Preetam Ghosh
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Vasco A C Azevedo
  - Institute of Biological Sciences, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil
- Marcos A Dos Santos
  - Department of Computer Science, Federal University of Minas Gerais, Brazil. Av. Antônio Carlos, 6627, Belo Horizonte, MG 31270-901, Brazil
2. da Silva VL, Fonseca AF, Fonseca M, da Silva TE, Coelho AC, Kroll JE, de Souza JES, Stransky B, de Souza GA, de Souza SJ. Genome-wide identification of cancer/testis genes and their association with prognosis in a pan-cancer analysis. Oncotarget 2017; 8:92966-92977. [PMID: 29190970 PMCID: PMC5696236 DOI: 10.18632/oncotarget.21715] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 06/03/2017] [Accepted: 08/17/2017] [Indexed: 11/29/2022] Open
Abstract
Cancer/testis (CT) genes are excellent candidates for cancer immunotherapies because of their restricted expression in normal tissues and their capacity to elicit an immune response when expressed in tumor cells. In this study, we provide a genome-wide screen for CT genes that identifies 745 putative CT genes. Comparison with a set of known CT genes shows that 201 new CT genes were identified. Integration of gene expression and clinical data led us to identify dozens of CT genes associated with either good or poor prognosis. For the CT genes related to good prognosis, we show that there is a direct relationship between CT gene expression and a signal for CD8+ cell infiltration in some tumor types, especially melanoma.
Affiliation(s)
- Vandeclecio Lira da Silva
  - Instituto do Cérebro, UFRN, Natal, Brazil; Ph.D. Program in Bioinformatics, UFRN, Natal, Brazil; Bioinformatics Multidisciplinary Environment (BioME), Digital Metropolis Institute, UFRN, Natal, Brazil
- André Faustino Fonseca
  - Instituto do Cérebro, UFRN, Natal, Brazil; Ph.D. Program in Bioinformatics, UFRN, Natal, Brazil; Bioinformatics Multidisciplinary Environment (BioME), Digital Metropolis Institute, UFRN, Natal, Brazil
- Ana Carolina Coelho
  - Instituto do Cérebro, UFRN, Natal, Brazil; Bioinformatics Multidisciplinary Environment (BioME), Digital Metropolis Institute, UFRN, Natal, Brazil
- José Eduardo Kroll
  - Instituto do Cérebro, UFRN, Natal, Brazil; Bioinformatics Multidisciplinary Environment (BioME), Digital Metropolis Institute, UFRN, Natal, Brazil; Instituto de Bioinformática e Biotecnologia, Natal, Brazil
- Jorge Estefano Santana de Souza
  - Bioinformatics Multidisciplinary Environment (BioME), Digital Metropolis Institute, UFRN, Natal, Brazil; Instituto Metrópole Digital, UFRN, Natal, Brazil
- Beatriz Stransky
  - Bioinformatics Multidisciplinary Environment (BioME), Digital Metropolis Institute, UFRN, Natal, Brazil; Departamento de Engenharia Biomédica, UFRN, Natal, Brazil
- Gustavo Antonio de Souza
  - Instituto do Cérebro, UFRN, Natal, Brazil; Bioinformatics Multidisciplinary Environment (BioME), Digital Metropolis Institute, UFRN, Natal, Brazil
- Sandro José de Souza
  - Instituto do Cérebro, UFRN, Natal, Brazil; Bioinformatics Multidisciplinary Environment (BioME), Digital Metropolis Institute, UFRN, Natal, Brazil
3. Bioinformatics Analysis of the Human Surfaceome Reveals New Targets for a Variety of Tumor Types. Int J Genomics 2016; 2016:8346198. [PMID: 28097125 PMCID: PMC5206789 DOI: 10.1155/2016/8346198] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 05/31/2016] [Revised: 09/07/2016] [Accepted: 10/18/2016] [Indexed: 12/27/2022] Open
Abstract
It is estimated that 10 to 20% of all genes in the human genome encode cell surface proteins, and because of their subcellular localization these proteins are excellent targets for cancer diagnosis and therapeutics. A precise characterization of the surfaceome in different tumor types is therefore needed. Using TCGA data from 15 different tumor types and a new method to identify cancer genes, the S-score, we identified several potential therapeutic targets within the surfaceome. This allowed us to expand our previous analysis and provided a clear characterization of the human surfaceome in the tumor landscape. Moreover, we present evidence that a three-gene set (WNT5A, CNGA2, and IGSF9B) can be used as a signature associated with shorter survival in breast cancer patients. The data made available here will help the community develop more efficient diagnostic and therapeutic tools for a variety of tumor types.
4. Fonseca A, Gubitoso MD, Reis MS, de Souza SJ, Barrera J. A New Approach for Identification of Cancer-related Pathways using Protein Networks and Genomic Data. Cancer Inform 2016; 14:139-49. [PMID: 27158220 PMCID: PMC4854218 DOI: 10.4137/cin.s30800] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2015] [Revised: 02/11/2016] [Accepted: 02/17/2016] [Indexed: 11/11/2022] Open
Abstract
Cancer cells have anomalous development and proliferation due to disturbances in their control systems. Studying the behavior of the cellular control system requires high-throughput dynamical data. Unfortunately, this type of data is not widely available. This fact motivates the main question of this article: how to use static omics data and available biological knowledge to obtain new information about the elements of the control system in cancer cells. Two important measures for assessing the state of the cellular control system are the gene expression profile and the signaling pathways. This article combines these two kinds of static omics data to gain insight into the states of a cancer cell. To extract information from this kind of data, a statistical computational model was formalized and implemented. To exemplify the application of some aspects of the conceptual framework developed, we tested the hypothesis that different types of cancer cells have different disturbed signaling pathways. To this end, we developed a method that recovers small protein networks, called motifs, which are differentially represented in some subtypes of breast cancer. These differentially represented motifs are enriched with specific gene ontologies as well as with new putative cancer genes.
Affiliation(s)
- Junior Barrera
  - Institute of Mathematics and Statistics, USP, São Paulo, Brazil; LETA, CeTICS, Butantan Institute, São Paulo, Brazil
5. A genetic network that suppresses genome rearrangements in Saccharomyces cerevisiae and contains defects in cancers. Nat Commun 2016; 7:11256. [PMID: 27071721 PMCID: PMC4833866 DOI: 10.1038/ncomms11256] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Received: 11/05/2015] [Accepted: 03/07/2016] [Indexed: 01/09/2023] Open
Abstract
Gross chromosomal rearrangements (GCRs) play an important role in human diseases, including cancer. The identity of all Genome Instability Suppressing (GIS) genes is not currently known. Here multiple Saccharomyces cerevisiae GCR assays and query mutations were crossed into arrays of mutants to identify progeny with increased GCR rates. One hundred eighty-two GIS genes were identified that suppressed GCR formation. Another 438 cooperatively acting GIS genes were identified that were not GIS genes, but suppressed the increased genome instability caused by individual query mutations. Analysis of TCGA data using the human genes predicted to act in GIS pathways revealed that a minimum of 93% of ovarian and 66% of colorectal cancer cases had defects affecting one or more predicted GIS genes. These defects included loss-of-function mutations, copy-number changes associated with reduced expression, and silencing. In contrast, acute myeloid leukaemia cases did not appear to have defects affecting the predicted GIS genes. Here, Richard Kolodner and colleagues use assays in Saccharomyces cerevisiae to identify 182 genetic modifiers of gross chromosomal rearrangements (GCRs). They also examine these Genome Instability Suppressing (GIS) genes and pathways in human cancer genomes and find that many ovarian and colorectal cancer cases have alterations in GIS pathways.
6. Winck FV, Prado Ribeiro AC, Ramos Domingues R, Ling LY, Riaño-Pachón DM, Rivera C, Brandão TB, Gouvea AF, Santos-Silva AR, Coletta RD, Paes Leme AF. Insights into immune responses in oral cancer through proteomic analysis of saliva and salivary extracellular vesicles. Sci Rep 2015; 5:16305. [PMID: 26538482 PMCID: PMC4633731 DOI: 10.1038/srep16305] [Citation(s) in RCA: 101] [Impact Index Per Article: 11.2] [Received: 03/04/2015] [Accepted: 10/09/2015] [Indexed: 01/08/2023] Open
Abstract
The development and progression of oral cavity squamous cell carcinoma (OSCC) involve complex cellular mechanisms that contribute to the low five-year survival rate of approximately 20% among diagnosed patients. However, the biological processes essential to tumor progression are not completely understood. Detecting alterations in the salivary proteome may therefore help elucidate the cellular mechanisms modulated in OSCC and improve the clinical prognosis of the disease. The proteomes of whole saliva and salivary extracellular vesicles (EVs) from patients with OSCC and healthy individuals were analyzed by LC-MS/MS and label-free protein quantification. Proteome data analysis was performed using statistical, machine learning and feature selection methods with additional functional annotation. Biological processes related to immune responses, peptidase inhibitor activity, iron coordination and protease binding were overrepresented in the group of differentially expressed proteins. Proteins related to the inflammatory system, transport of metals and cellular growth and proliferation were identified in the proteome of salivary EVs. The proteomics data were robust and could classify OSCC with 90% accuracy. The saliva proteome analysis revealed that immune processes are related to the presence of OSCC and indicates that proteomics data can contribute to determining OSCC prognosis.
Affiliation(s)
- Flavia V. Winck
  - Laboratório de Espectrometria de Massas, Laboratório Nacional de Biociências, LNBio, CNPEM, Campinas, SP, Brazil
- Romênia Ramos Domingues
  - Laboratório de Espectrometria de Massas, Laboratório Nacional de Biociências, LNBio, CNPEM, Campinas, SP, Brazil
- Liu Yi Ling
  - Laboratório Nacional de Ciência e Tecnologia do Bioetanol, CTBE, CNPEM, Campinas, SP, Brazil
- César Rivera
  - Laboratório de Espectrometria de Massas, Laboratório Nacional de Biociências, LNBio, CNPEM, Campinas, SP, Brazil
  - Departamento de Ciencias Básicas Biomédicas, Universidad de Talca (UTALCA), Talca, Chile
- Thaís Bianca Brandão
  - Instituto do Câncer do Estado de São Paulo, Octavio Frias de Oliveira, ICESP, São Paulo, SP, Brazil
- Adriele Ferreira Gouvea
  - Instituto do Câncer do Estado de São Paulo, Octavio Frias de Oliveira, ICESP, São Paulo, SP, Brazil
- Alan Roger Santos-Silva
  - Faculdade de Odontologia de Piracicaba, Universidade Estadual de Campinas, UNICAMP, Piracicaba, SP, Brazil
- Ricardo D. Coletta
  - Faculdade de Odontologia de Piracicaba, Universidade Estadual de Campinas, UNICAMP, Piracicaba, SP, Brazil
- Adriana F. Paes Leme
  - Laboratório de Espectrometria de Massas, Laboratório Nacional de Biociências, LNBio, CNPEM, Campinas, SP, Brazil