1
Novais AA, Tamarindo GH, Melo LMM, Balieiro BC, Nóbrega D, dos Santos G, Saldanha SF, de Souza FF, Chuffa LGDA, Bracha S, Zuccari DAPDC. Exploring Canine Mammary Cancer through Liquid Biopsy: Proteomic Profiling of Small Extracellular Vesicles. Cancers (Basel) 2024; 16:2562. [PMID: 39061201 PMCID: PMC11275101 DOI: 10.3390/cancers16142562] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2024] [Revised: 07/11/2024] [Accepted: 07/12/2024] [Indexed: 07/28/2024]
Abstract
(Background). Canine mammary tumors (CMTs) have emerged as an important model for understanding pathophysiological aspects of human disease. Liquid biopsy (LB), which relies on blood-borne biomarkers and offers minimal invasiveness, holds promise for reflecting the disease status of patients. Small extracellular vesicles (SEVs) and their protein cargo have recently gained attention as potential tools for disease screening and monitoring. (Objectives). This study aimed to isolate SEVs from canine patients and analyze their proteomic profile to assess their diagnostic and prognostic potential. (Methods). Plasma samples were collected from female dogs grouped into CMT (malignant and benign), healthy controls, relapse, and remission groups. SEVs were isolated and characterized using ultracentrifugation (UC), nanoparticle tracking analysis (NTA) and transmission electron microscopy (TEM). Proteomic analysis of circulating SEVs was conducted using liquid chromatography-mass spectrometry (LC-MS). (Results). While no significant differences were observed in the concentration and size of exosomes among the studied groups, proteomic profiling revealed important variations. Mass spectrometry identified exclusive proteins that could serve as potential biomarkers for mammary cancer. These included inter-alpha-trypsin inhibitor heavy chains (ITIH2 and ITIH4), phosphopyruvate hydratase or alpha enolase (ENO1), eukaryotic translation elongation factor 2 (eEF2), actin (ACTB), transthyretin (TTR), beta-2-glycoprotein 1 (APOH) and gelsolin (GSN), found in female dogs with malignant tumors. Additionally, vitamin D-binding protein (VDBP), also known as group-specific component (GC), was identified as a protein present during remission. (Conclusions). The results underscore the potential of proteins found in SEVs as valuable biomarkers in CMTs.
Despite the lack of differences in vesicle concentration and size between the groups, the analysis of protein content revealed promising markers with potential applications in CMT diagnosis and monitoring. These findings suggest a novel approach in the development of more precise and effective diagnostic tools for this challenging clinical condition.
Affiliation(s)
- Adriana Alonso Novais
  - Institute of Health Science (ICS), Universidade Federal de Mato Grosso (UFMT), Sinop 78550-728, MT, Brazil; (A.A.N.); (L.M.M.M.)
- Guilherme Henrique Tamarindo
  - Brazilian Biosciences National Laboratory, Brazilian Center for Research in Energy and Materials (CNPEM), Campinas 13083-100, SP, Brazil
- Luryan Mikaelly Minotti Melo
  - Institute of Health Science (ICS), Universidade Federal de Mato Grosso (UFMT), Sinop 78550-728, MT, Brazil; (A.A.N.); (L.M.M.M.)
- Beatriz Castilho Balieiro
  - Molecular Investigation of Cancer Laboratory (MICL), Department of Molecular Biology, Faculdade de Medicina de São José do Rio Preto (FAMERP), São José do Rio Preto 15090-000, SP, Brazil
- Daniela Nóbrega
  - Pat Animal Laboratory, São José do Rio Preto 15070-000, SP, Brazil
- Gislaine dos Santos
  - Laboratory of Molecular Morphophysiology and Development (LMMD/ZMV), University of São Paulo, Pirassununga 13635-900, SP, Brazil; (G.d.S.); (S.F.S.)
- Schaienni Fontoura Saldanha
  - Laboratory of Molecular Morphophysiology and Development (LMMD/ZMV), University of São Paulo, Pirassununga 13635-900, SP, Brazil; (G.d.S.); (S.F.S.)
- Fabiana Ferreira de Souza
  - Department of Veterinary Surgery and Animal Reproduction, School of Veterinary Medicine and Animal Science, FMVZ, São Paulo State University (UNESP), Botucatu 18618-681, SP, Brazil
- Luiz Gustavo de Almeida Chuffa
  - Department of Structural and Functional Biology, Institute of Biosciences, UNESP - São Paulo State University, Botucatu 18618-689, SP, Brazil
- Shay Bracha
  - Department of Veterinary Clinical Sciences, College of Veterinary Medicine, Ohio State University, Columbus, OH 43210, USA
- Debora Aparecida Pires de Campos Zuccari
  - Molecular Investigation of Cancer Laboratory (MICL), Department of Molecular Biology, Faculdade de Medicina de São José do Rio Preto (FAMERP), São José do Rio Preto 15090-000, SP, Brazil
2
Association of SNP rs5069 in APOA1 with Benign Breast Diseases in a Mexican Population. Genes (Basel) 2022; 13:738. [PMID: 35627123 PMCID: PMC9141650 DOI: 10.3390/genes13050738] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/23/2022] [Revised: 04/14/2022] [Accepted: 04/20/2022] [Indexed: 02/01/2023]
Abstract
Breast cancer (BCa) is the most common type of cancer affecting women worldwide. Some histological subtypes of benign breast disease (BBD) are considered risk factors for developing BCa. Single nucleotide polymorphisms (SNPs) in the genes encoding apolipoproteins A-I (APOA1) and B (APOB) have been associated with BCa in Tunisian, Chinese, and Taiwanese populations. The objective of this pilot study is to evaluate the possible contribution of APOA1 and APOB polymorphisms to BCa and BBD in the Mexican population. We analyzed the association of 4 SNPs in genes encoding apolipoproteins: rs670 and rs5069 in the APOA1 gene, and rs693 and rs1042031 in the APOB gene, by performing PCR-RFLP with DNA extracted from the biopsy tissue of Mexican women with BCa or BBD and whole blood samples obtained from the general population (GP). Our results showed an association between the CT + TT genotypes of the SNP rs5069 and BBD (p = 0.03201). In the A-T haplotype, the frequency of the SNPs rs670 and rs5069 differed significantly between the BBD group and the GP and BCa groups (p = 0.004111; p = 0.01303). In conclusion, the SNP rs5069 is associated with BBD but not with BCa in the Mexican population.
3
Chantada-Vázquez MDP, Castro López A, García-Vence M, Acea-Nebril B, Bravo SB, Núñez C. Protein Corona Gold Nanoparticles Fingerprinting Reveals a Profile of Blood Coagulation Proteins in the Serum of HER2-Overexpressing Breast Cancer Patients. Int J Mol Sci 2020; 21:8449. [PMID: 33182810 PMCID: PMC7696934 DOI: 10.3390/ijms21228449] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 10/15/2020] [Revised: 11/03/2020] [Accepted: 11/09/2020] [Indexed: 02/06/2023]
Abstract
Breast cancer (BC) is a molecularly heterogeneous disease that encompasses five major molecular subtypes (luminal A (LA), luminal B HER2 negative (LB-), luminal B HER2 positive (LB+), HER2 positive (HER2+) and triple negative breast cancer (TNBC)). BC treatment mainly depends on the identification of the specific subtype. Despite the correct identification, therapies could fail in some patients. Thus, further insights into the genetic and molecular status of the different BC subtypes could be very useful to improve the response of BC patients to the range of available therapies. In this way, we used gold nanoparticles (AuNPs, 12.96 ± 0.72 nm) as a scavenging tool in combination with Sequential Window Acquisition of All Theoretical Mass Spectra (SWATH-MS) to quantitatively analyze the serum proteome alterations in the different breast cancer intrinsic subtypes. The differentially regulated proteins specific of each subtype were further analyzed with the bioinformatic tools STRING and PANTHER to identify the major molecular function, biological processes, cellular origin, protein class and biological pathways altered due to the heterogeneity in proteome of the different BC subtypes. Importantly, a profile of blood coagulation proteins was identified in the serum of HER2-overexpressing BC patients.
Affiliation(s)
- María del Pilar Chantada-Vázquez
  - Research Unit, Lucus Augusti University Hospital (HULA), Servizo Galego de Saúde (SERGAS), 27002 Lugo, Spain
  - Proteomic Unit, Health Research Institute of Santiago de Compostela (IDIS), University Clinical Hospital of Santiago de Compostela (CHUS), 15706 Santiago de Compostela, Spain
- Antonio Castro López
  - Breast Unit, Hospital Universitario Lucus Augusti (HULA), Servizo Galego de Saúde (SERGAS), 27002 Lugo, Spain
- María García-Vence
  - Proteomic Unit, Health Research Institute of Santiago de Compostela (IDIS), University Clinical Hospital of Santiago de Compostela (CHUS), 15706 Santiago de Compostela, Spain
- Benigno Acea-Nebril
  - Department of Surgery, Breast Unit, Complexo Hospitalario Universitario A Coruña (CHUAC), Servizo Galego de Saúde (SERGAS), 15006 A Coruña, Spain
- Susana B. Bravo
  - Proteomic Unit, Health Research Institute of Santiago de Compostela (IDIS), University Clinical Hospital of Santiago de Compostela (CHUS), 15706 Santiago de Compostela, Spain
  - Correspondence: (S.B.B.); (C.N.)
- Cristina Núñez
  - Research Unit, Lucus Augusti University Hospital (HULA), Servizo Galego de Saúde (SERGAS), 27002 Lugo, Spain
  - Correspondence: (S.B.B.); (C.N.)
4
O'Rourke MB, Sahni S, Samra J, Mittal A, Molloy MP. Data independent acquisition of plasma biomarkers of response to neoadjuvant chemotherapy in pancreatic ductal adenocarcinoma. J Proteomics 2020; 231:103998. [PMID: 33027703 DOI: 10.1016/j.jprot.2020.103998] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/19/2020] [Revised: 08/18/2020] [Accepted: 09/29/2020] [Indexed: 02/07/2023]
Abstract
The detection of disease-related plasma biomarkers has challenged the proteomic community for years. Attractive features of plasma proteomics include the ease of collection and the small volume needed for analysis, but on the other hand, the presence of highly abundant proteins complicates sample preparation procedures and reduces dynamic range. Data independent acquisition label free quantitation (DIA-LFQ) by mass spectrometry partly overcomes the dynamic range issue; however, generating the peptide spectral reference libraries that allow extensive analysis of the plasma proteome can be a slow and expensive task which is unattainable for many laboratories. We investigated the re-purposing of publicly available plasma proteome datasets and the impact on peptide/protein detection for DIA-LFQ. We carried out these studies in the context of identifying putative biomarkers of response to neoadjuvant chemotherapy (NAC) for pancreatic ductal adenocarcinoma (PDAC), as no useful plasma biomarkers have been clinically adopted. We demonstrated the benefit of searching DIA data against multiple spectral libraries to show that complement proteins were linked to NAC response in PDAC patients, confirming previous observations of the prognostic utility of complement following adjuvant chemotherapy. Our workflow demonstrates that DIA-LFQ can be readily applied in the oncology setting for the putative assignment of clinically relevant plasma biomarkers. STATEMENT OF SIGNIFICANCE: The proteomic mass spectrometry analysis of undepleted, unfractionated human plasma has benefits for sample throughput, but obtaining deep coverage remains challenging. This work evaluated the re-purposing of open source peptide mass spectrometry data from human plasma to create spectral reference libraries for use in data independent acquisition (DIA).
We showed how seeding in locally acquired data to integrate iRT peptides into spectral libraries increased identification confidence by facilitating querying of multiple libraries. This workflow was applied to the discovery of putative plasma biomarkers for response to neoadjuvant chemotherapy (NAC) in pancreatic ductal adenocarcinoma patients. There is a paucity of prior information in the literature on this topic and we show that good responder patients have reduced levels of complement proteins.
Affiliation(s)
- Matthew B O'Rourke
  - Bowel Cancer and Biomarker Laboratory, Kolling Institute, Royal North Shore Hospital, The University of Sydney, Australia
- Sumit Sahni
  - Bill Walsh Translational Cancer Laboratory, Kolling Institute, Royal North Shore Hospital, The University of Sydney, Australia
- Jaswinder Samra
  - Upper GI Surgical Unit, Royal North Shore Hospital, Sydney, Australia
- Anubhav Mittal
  - Upper GI Surgical Unit, Royal North Shore Hospital, Sydney, Australia
- Mark P Molloy
  - Bowel Cancer and Biomarker Laboratory, Kolling Institute, Royal North Shore Hospital, The University of Sydney, Australia
5
Del Pilar Chantada-Vázquez M, López AC, Vence MG, Vázquez-Estévez S, Acea-Nebril B, Calatayud DG, Jardiel T, Bravo SB, Núñez C. Proteomic investigation on bio-corona of Au, Ag and Fe nanoparticles for the discovery of triple negative breast cancer serum protein biomarkers. J Proteomics 2019; 212:103581. [PMID: 31731051 DOI: 10.1016/j.jprot.2019.103581] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 06/24/2019] [Revised: 10/14/2019] [Accepted: 11/07/2019] [Indexed: 12/21/2022]
Abstract
Currently, there are no targeted therapeutic modalities for triple negative breast cancer (TNBC). This disease is associated with poor prognosis and worse clinical outcome because of the aggressive nature of the tumor, delayed diagnosis, and non-specific symptoms in the early stages. Therefore, identification of novel specific TNBC serum biomarkers for screening and therapeutic purposes remains an urgent clinical requirement. New user-friendly and cheap methods for biomarker identification are needed, and nanotechnology offers new opportunities. When dispersed in blood, nanoparticles (NPs) are covered by a protein shell termed "protein corona" (PC). While alterations in protein patterns are challenging to detect by conventional blood analyses, PC acts as a "nano-concentrator" of serum proteins with affinity for the NPs' surface. The characterization of PC could therefore allow the detection of otherwise undetectable changes in protein concentration at an early stage of the disease or after chemotherapy or surgery. To explore this research idea, serum samples from 8 TNBC patients and 8 patients without malignancy were allowed to interact with gold nanoparticles (AuNPs: 10.02 ± 0.91 nm), silver nanoparticles (AgNPs: 9.73 ± 1.70 nm) and magnetic nanoparticles (MNPs: 9.30 ± 0.67 nm). Here, in order to identify biomarker candidates in the serum of TNBC patients, these nanomaterials were combined with electrophoretic separation (SDS-PAGE) to perform qualitative and quantitative comparisons of the serum proteomes of TNBC patients (n = 8) and healthy controls (n = 8) by liquid chromatography tandem-mass spectrometry (LC-MS/MS) analysis. The results were validated through a sequential window acquisition of all theoretical mass spectra (SWATH) analysis, performed in total serum samples (patients and controls) using this approach as a multiple reaction monitoring (MRM) analysis.
SIGNIFICANCE: It is well known that several proteins present in human serum are important biomarkers for the diagnosis or prognosis of different diseases, such as triple negative breast cancer (TNBC). Determining how nanomaterials such as gold nanoparticles (AuNPs: 10.02 ± 0.91 nm), silver nanoparticles (AgNPs: 9.73 ± 1.70 nm) and magnetic nanoparticles (MNPs: 9.30 ± 0.67 nm) interact with human serum will assist not only in understanding their effects on the biological system (biocompatibility and toxicity), but also in obtaining information for developing novel nanomaterials with high specificity and selectivity towards proteins with an important biological function (prognostic and diagnostic protein biomarkers).
Affiliation(s)
- Antonio Castro López
  - Breast Unit, Hospital Universitario Lucus Augusti (HULA), Servizo Galego de Saúde (SERGAS), 27002 Lugo, Spain
- María García Vence
  - Proteomic Unit, Instituto de Investigaciones Sanitarias-IDIS, Complejo Hospitalario Universitario de Santiago de Compostela (CHUS), 15706 Santiago de Compostela, Spain
- Sergio Vázquez-Estévez
  - Oncology Division, Hospital Universitario Lucus Augusti (HULA), Servizo Galego de Saúde (SERGAS), 27002 Lugo, Spain
- Benigno Acea-Nebril
  - Department of Surgery, Breast Unit, Complexo Hospitalario Universitario A Coruña (CHUAC), SERGAS, A Coruña, Spain
- David G Calatayud
  - Department of Electroceramics, Instituto de Cerámica y Vidrio-CSIC, Kelsen 5, Campus de Cantoblanco, 28049 Madrid, Spain
- Teresa Jardiel
  - Department of Electroceramics, Instituto de Cerámica y Vidrio-CSIC, Kelsen 5, Campus de Cantoblanco, 28049 Madrid, Spain
- Susana B Bravo
  - Proteomic Unit, Instituto de Investigaciones Sanitarias-IDIS, Complejo Hospitalario Universitario de Santiago de Compostela (CHUS), 15706 Santiago de Compostela, Spain
- Cristina Núñez
  - Research Unit, Hospital Universitario Lucus Augusti (HULA), Servizo Galego de Saúde (SERGAS), 27002 Lugo, Spain
6
Abstract
Breast cancer is a global health issue, and as the tumor burden increases, we need newer, better technologies that are convenient, cheap, rapid, and sensitive, with high specificity. Technological advancements in the field of cancer biomarkers have led to the development of techniques such as mass spectrometric analysis and microarray analysis, in which genes, proteins and hundreds to thousands of metabolites can be identified with the emergence of genomics, proteomics and metabolomics. This research is focused on finding biomarkers for diagnosis, prognosis, staging, treatment response and targets for chemotherapy, generating a panel of markers which provides better clinical information than any single marker in the panel. This review briefly summarizes applications of genomics and proteomics, followed by key concepts and applications of metabolomics in breast cancer, with the conclusion that an integration of the three “OMIC” technologies may hold the key to future biomarker discovery.
Affiliation(s)
- Naila Irum Hadi
  - Dr. Naila Irum Hadi, MBBS, MPhil, PhD fellow. Professor of Pathology, Ziauddin University, Karachi, Pakistan
- Qamar Jamal
  - Dr. Qamar Jamal, MBBS, MPhil, PhD. Professor of Pathology, Ziauddin University, Karachi, Pakistan
7
Beretov J, Wasinger VC, Millar EKA, Schwartz P, Graham PH, Li Y. Proteomic Analysis of Urine to Identify Breast Cancer Biomarker Candidates Using a Label-Free LC-MS/MS Approach. PLoS One 2015; 10:e0141876. [PMID: 26544852 PMCID: PMC4636393 DOI: 10.1371/journal.pone.0141876] [Citation(s) in RCA: 75] [Impact Index Per Article: 8.3] [Received: 09/06/2014] [Accepted: 10/14/2015] [Indexed: 01/11/2023]
Abstract
INTRODUCTION Breast cancer is a complex heterogeneous disease and is a leading cause of death in women. Early diagnosis and monitoring progression of breast cancer are important for improving prognosis. The aim of this study was to identify protein biomarkers in urine for early screening detection and monitoring invasive breast cancer progression. METHOD We performed a comparative proteomic analysis using ion count relative quantification label free LC-MS/MS analysis of urine from breast cancer patients (n = 20) and healthy control women (n = 20). RESULTS Unbiased label free LC-MS/MS-based proteomics was used to provide a profile of abundant proteins in the biological system of breast cancer patients. Data analysis revealed 59 urinary proteins that were significantly different in breast cancer patients compared to the normal control subjects (p<0.05, fold change >3). Thirty-six urinary proteins were exclusively found in specific breast cancer stages, with 24 increasing and 12 decreasing in their abundance. Amongst the 59 significant urinary proteins identified, a list of 13 novel up-regulated proteins was revealed that may be used to detect breast cancer. These include stage specific markers associated with pre-invasive breast cancer in the ductal carcinoma in-situ (DCIS) samples (Leucine LRC36, MAST4 and Uncharacterized protein CI131), early invasive breast cancer (DYH8, HBA, PEPA, uncharacterized protein C4orf14 (CD014), filaggrin and MMRN2) and metastatic breast cancer (AGRIN, NEGR1, FIBA and Keratin KIC10). Preliminary validation of 3 potential markers (ECM1, MAST4 and filaggrin) was performed in breast cancer cell lines by Western blotting. One potential marker, MAST4, was further validated in human breast cancer tissues as well as individual human breast cancer urine samples with immunohistochemistry and Western blotting, respectively.
CONCLUSIONS Our results indicate that urine is a useful non-invasive source of biomarkers and that the profile patterns (biomarkers) identified have potential for clinical use in the detection of BC. Validation with a larger independent cohort of patients is required in a follow-up study.
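The selection rule quoted in the Results (p<0.05, fold change >3) can be illustrated with a short sketch. This is a hypothetical illustration and not the authors' pipeline: the record layout, the function names `fold_change` and `select_candidates`, and the convention of expressing fold change as the larger group mean divided by the smaller one are all assumptions, and the example numbers used with it are invented.

```python
def fold_change(case_mean, control_mean):
    """Return (magnitude, direction) for two group means.

    Assumption: magnitude is the larger mean divided by the smaller one,
    so an increase and a decrease of the same size get the same value.
    """
    if case_mean == control_mean:
        return 1.0, "none"
    if case_mean > control_mean:
        # A protein absent from controls is treated as an unbounded increase.
        magnitude = case_mean / control_mean if control_mean > 0 else float("inf")
        return magnitude, "up"
    magnitude = control_mean / case_mean if case_mean > 0 else float("inf")
    return magnitude, "down"


def select_candidates(records, p_cutoff=0.05, fc_cutoff=3.0):
    """Keep records with p < p_cutoff and fold change > fc_cutoff.

    Each record is a tuple: (name, case_mean, control_mean, p_value).
    Returns a list of (name, magnitude, direction) tuples.
    """
    selected = []
    for name, case_mean, control_mean, p_value in records:
        magnitude, direction = fold_change(case_mean, control_mean)
        if p_value < p_cutoff and magnitude > fc_cutoff:
            selected.append((name, magnitude, direction))
    return selected
```

Counting the `"up"` and `"down"` entries in the returned list would give the kind of increasing/decreasing breakdown the abstract reports.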
Affiliation(s)
- Julia Beretov
  - Cancer Care Centre, St George Hospital, Kogarah, Australia
  - St George and Sutherland Clinical School, Faculty of Medicine, University of New South Wales (UNSW), Kensington, Australia
  - SEALS, Anatomical Pathology, St George Hospital, Kogarah, Australia
- Valerie C. Wasinger
  - Bioanalytical Mass Spectrometry Facility, Mark Wainwright Analytical Centre, UNSW, Kensington, Australia
  - School of Medical Sciences, UNSW, Kensington, Australia
- Ewan K. A. Millar
  - SEALS, Anatomical Pathology, St George Hospital, Kogarah, Australia
  - School of Medical Sciences, UNSW, Kensington, Australia
  - Cancer Research Program, Kinghorn Cancer Centre and Garvan Institute of Medical Research, Darlinghurst, Australia
  - School of Medicine and Health Sciences, University of Western Sydney, Campbelltown, Australia
- Peter Schwartz
  - Breast Surgery, St George Private Hospital, Kogarah, Australia
- Peter H. Graham
  - Cancer Care Centre, St George Hospital, Kogarah, Australia
  - St George and Sutherland Clinical School, Faculty of Medicine, University of New South Wales (UNSW), Kensington, Australia
- Yong Li
  - Cancer Care Centre, St George Hospital, Kogarah, Australia
  - St George and Sutherland Clinical School, Faculty of Medicine, University of New South Wales (UNSW), Kensington, Australia
8
Majid A, Ali S, Iqbal M, Kausar N. Prediction of human breast and colon cancers from imbalanced data using nearest neighbor and support vector machines. Computer Methods and Programs in Biomedicine 2014; 113:792-808. [PMID: 24472367 DOI: 10.1016/j.cmpb.2014.01.001] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Received: 06/06/2013] [Revised: 12/29/2013] [Accepted: 01/03/2014] [Indexed: 06/03/2023]
Abstract
This study proposes a novel prediction approach for human breast and colon cancers using different feature spaces. The proposed scheme consists of two stages: the preprocessor and the predictor. In the preprocessor stage, the mega-trend diffusion (MTD) technique is employed to increase the samples of the minority class, thereby balancing the dataset. In the predictor stage, the machine-learning approaches of K-nearest neighbor (KNN) and support vector machines (SVM) are used to develop hybrid MTD-SVM and MTD-KNN prediction models. The MTD-SVM model provided the best values of accuracy, G-mean and Matthews correlation coefficient of 96.71%, 96.70% and 71.98% for the cancer/non-cancer dataset, breast/non-breast cancer dataset and colon/non-colon cancer dataset, respectively. We found that hybrid MTD-SVM is the best with respect to prediction performance and computational cost. The MTD-KNN model achieved moderately better prediction than hybrid MTD-NB (Naïve Bayes), but at the expense of higher computing cost. The MTD-KNN model is faster than MTD-RF (random forest), but its prediction is not better than that of MTD-RF. To the best of our knowledge, the reported results are the best so far for these datasets. The proposed scheme indicates that the developed models can be used as a tool for the prediction of cancer. This scheme may be useful for the study of any sequential information such as protein or nucleic acid sequences.
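The three performance measures named in this abstract (accuracy, G-mean and the Matthews correlation coefficient) have standard definitions for binary classification, and a minimal sketch of them follows. It assumes labels coded as 1 (positive) and 0 (negative); the function names are illustrative, and nothing here reproduces the paper's MTD oversampling or its classifiers.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def g_mean(tp, fp, tn, fn):
    """Geometric mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

G-mean and MCC are commonly preferred over plain accuracy on imbalanced data because a classifier that always predicts the majority class scores zero on both.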
Affiliation(s)
- Abdul Majid
  - Department of Computer & Information Sciences, Pakistan Institute of Engineering & Applied Sciences, Nilore, 45650 Islamabad, Pakistan.
- Safdar Ali
  - Department of Computer & Information Sciences, Pakistan Institute of Engineering & Applied Sciences, Nilore, 45650 Islamabad, Pakistan.
- Mubashar Iqbal
  - Department of Computer & Information Sciences, Pakistan Institute of Engineering & Applied Sciences, Nilore, 45650 Islamabad, Pakistan.
- Nabeela Kausar
  - Department of Computer & Information Sciences, Pakistan Institute of Engineering & Applied Sciences, Nilore, 45650 Islamabad, Pakistan.
9
Ali S, Majid A, Khan A. IDM-PhyChm-Ens: intelligent decision-making ensemble methodology for classification of human breast cancer using physicochemical properties of amino acids. Amino Acids 2014; 46:977-93. [PMID: 24390396 DOI: 10.1007/s00726-013-1659-x] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 10/14/2013] [Accepted: 12/20/2013] [Indexed: 12/21/2022]
Abstract
Development of an accurate and reliable intelligent decision-making method for the construction of cancer diagnosis systems is one of the fast growing research areas of health sciences. Such a decision-making system can provide adequate information for cancer diagnosis and drug discovery. Descriptors derived from physicochemical properties of protein sequences are very useful for classifying cancerous proteins. Recently, several interesting research studies have been reported on breast cancer classification. To this end, we propose the exploitation of the physicochemical properties of amino acids in protein primary sequences such as hydrophobicity (Hd) and hydrophilicity (Hb) for breast cancer classification. Hd and Hb properties of amino acids, in recent literature, are reported to be quite effective in characterizing the constituent amino acids and are used to study protein folding, interactions, structures, and sequence-order effects. In particular, using these physicochemical properties, we observed that proline, serine, tyrosine, cysteine, arginine, and asparagine amino acids offer high discrimination between cancerous and healthy proteins. In addition, unlike traditional ensemble classification approaches, the proposed 'IDM-PhyChm-Ens' method was developed by combining the decision spaces of a specific classifier trained on different feature spaces. The different feature spaces used were amino acid composition, split amino acid composition, and pseudo amino acid composition. Consequently, we have exploited different feature spaces using Hd and Hb properties of amino acids to develop an accurate method for classification of cancerous protein sequences. We developed ensemble classifiers using diverse learning algorithms such as random forest (RF), support vector machines (SVM), and K-nearest neighbor (KNN) trained on different feature spaces. We observed that ensemble-RF, in the case of cancer classification, performed better than ensemble-SVM and ensemble-KNN.
Our analysis demonstrates that ensemble-RF, ensemble-SVM and ensemble-KNN are more effective than their individual counterparts. The proposed 'IDM-PhyChm-Ens' method has shown improved performance compared to existing techniques.
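Two of the feature spaces named in this abstract, amino acid composition and split amino acid composition, can be sketched in a few lines. This is an illustrative sketch of the commonly used definitions, not the authors' implementation: the function names, the 20-letter residue alphabet ordering, and the terminal segment lengths (`n_terminal`, `c_terminal`) are assumptions, and the pseudo amino acid composition and the Hd/Hb weighting are not covered.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, fixed order


def amino_acid_composition(sequence):
    """Return a 20-element list of residue frequencies.

    Characters outside the standard alphabet are ignored; an empty or
    fully non-standard sequence yields a vector of zeros.
    """
    counts = {aa: 0 for aa in AMINO_ACIDS}
    total = 0
    for residue in sequence.upper():
        if residue in counts:
            counts[residue] += 1
            total += 1
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def split_amino_acid_composition(sequence, n_terminal=25, c_terminal=25):
    """Return a 60-element vector: composition of the N-terminal segment,
    the middle segment, and the C-terminal segment, concatenated.

    The segment lengths are assumed defaults and must be positive.
    """
    cleaned = "".join(r for r in sequence.upper() if r in AMINO_ACIDS)
    head = cleaned[:n_terminal]
    tail = cleaned[-c_terminal:]
    # The middle segment is empty for sequences shorter than both ends combined.
    middle = cleaned[n_terminal:len(cleaned) - c_terminal]
    return (amino_acid_composition(head)
            + amino_acid_composition(middle)
            + amino_acid_composition(tail))
```

Feature vectors of this kind could then be passed to any classifier; how the paper combines the per-feature-space decisions is not specified in the abstract and is not attempted here.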
Affiliation(s)
- Safdar Ali
  - Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences, Nilore, Islamabad, 45650, Pakistan