1
Buszewski B, Błońska D, Kłodzińska E, Konop M, Kubesová A, Šalplachta J. Determination of Pathogens by Electrophoretic and Spectrometric Techniques. Crit Rev Anal Chem 2023:1-24. [PMID: 37326587 DOI: 10.1080/10408347.2023.2219748] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/17/2023]
Abstract
In modern medical diagnostics, where analytical chemistry plays a key role, fast and accurate identification of pathogens is becoming increasingly important. Infectious diseases pose a growing threat to public health due to population growth, international air travel, bacterial resistance to antibiotics, and other factors. For instance, the detection of SARS-CoV-2 in patient samples is a key tool to monitor the spread of the disease. While there are several techniques for identifying pathogens by their genetic code, most of these methods are too expensive or slow to effectively analyze clinical and environmental samples that may contain hundreds or even thousands of different microbes. Standard approaches (e.g., culture media and biochemical assays) are known to be very time- and labor-intensive. The purpose of this review paper is to highlight the problems associated with the analysis and identification of pathogens that cause many serious infections. Special attention was paid to the description of mechanisms and the explanation of the phenomena and processes occurring on the surface of pathogens as biocolloids (charge distribution). This review also highlights the importance of electromigration techniques, demonstrating their potential for pathogen pre-separation and fractionation, and shows how spectrometric methods, such as MALDI-TOF MS, are used for pathogen detection and identification.
Affiliation(s)
- Bogusław Buszewski
  - Prof. Jan Czochralski Kuyavian-Pomeranian Research & Development Centre, Torun, Poland
  - Department of Environmental Chemistry and Bioanalytics, Nicolaus Copernicus University in Toruń, Torun, Poland
- Dominika Błońska
  - Department of Environmental Chemistry and Bioanalytics, Nicolaus Copernicus University in Toruń, Torun, Poland
  - Centre for Modern Interdisciplinary Technologies, Torun, Poland
- Ewa Kłodzińska
  - Department of Experimental Physiology and Pathophysiology, Laboratory of Centre for Preclinical Research, Medical University of Warsaw, Warsaw, Poland
- Marek Konop
  - Department of Experimental Physiology and Pathophysiology, Laboratory of Centre for Preclinical Research, Medical University of Warsaw, Warsaw, Poland
- Anna Kubesová
  - Institute of Analytical Chemistry of the CAS, Brno, Czech Republic
- Jiří Šalplachta
  - Institute of Analytical Chemistry of the CAS, Brno, Czech Republic
2
Xu Z, Chen J, Vougas K, Shah A, Shah H, Misra R, Mkrtchyan HV. Comparative Proteomic Profiling of Methicillin-Susceptible and Resistant Staphylococcus aureus. Proteomics 2020; 20:e1900221. [PMID: 31872541 DOI: 10.1002/pmic.201900221] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/03/2019] [Revised: 12/10/2019] [Indexed: 11/08/2022]
Abstract
Staphylococcus aureus is a highly successful human pathogen responsible for a wide range of infections. This study provides insights into the virulence, pathogenicity, and antimicrobial resistance determinants of methicillin-susceptible and methicillin-resistant S. aureus (MSSA; MRSA) recovered from non-healthcare environments. Three environmental MSSA and three environmental MRSA are selected for proteomic profiling using isobaric tag for relative and absolute quantitation tandem mass spectrometry (iTRAQ MS/MS). Gene Ontology annotation and Kyoto Encyclopedia of Genes and Genomes pathway annotation are applied to interpret the functions of the proteins detected. 792 proteins are identified in MSSA and MRSA. Comparative analysis of MRSA and MSSA reveals that 8 out of 792 proteins are upregulated and 156 are downregulated. Proteins that have differences in abundance are predominantly involved in catalytic and binding activity. Among 164 differentially abundant proteins, 29 are involved in pathogenesis, antimicrobial resistance, stress response, mismatch repair, and cell wall synthesis. Twenty-two proteins associated with pathogenicity including SPA, SBI, CLFA, and DLT are upregulated in MRSA. Moreover, the upregulated pathogenic protein ENTC2 in MSSA is determined to be a superantigen, potentially capable of triggering toxic shock syndrome in the host. Enhanced pathogenicity, antimicrobial resistance, and stress response are observed in MRSA compared to MSSA.
Affiliation(s)
- Zhen Xu
  - Department of Sanitary Toxicology and Chemistry, Tianjin Key Laboratory of Environment, Nutrition and Public Health, Center for International Collaborative Research on Environment, Nutrition and Public Health, Tianjin Medical University, Tianjin, 300070, China
  - School of Biological and Chemical Sciences, Queen Mary University of London, London, E1 4NS, UK
- Jiazhen Chen
  - Department of Infectious Diseases, Huashan Hospital, Fudan University, Shanghai, 200040, China
- Kostas Vougas
  - Biotechnology Division, Biomedical Research Foundation of the Academy of Athens, Athens, 115 27, Greece
- Ajit Shah
  - Department of Natural Sciences, Middlesex University, London, NW4 4BT, UK
- Haroun Shah
  - Department of Natural Sciences, Middlesex University, London, NW4 4BT, UK
- Raju Misra
  - Molecular Biology, Core Research Laboratories, Natural History Museum, Cromwell Rd, London, SW7 5BD, UK
- Hermine V Mkrtchyan
  - School of Biological and Chemical Sciences, Queen Mary University of London, London, E1 4NS, UK
  - School of Health, Sport and Biosciences, University of East London, London, E15 4LZ, UK
3
Tong DL, Kempsell KE, Szakmany T, Ball G. Development of a Bioinformatics Framework for Identification and Validation of Genomic Biomarkers and Key Immunopathology Processes and Controllers in Infectious and Non-infectious Severe Inflammatory Response Syndrome. Front Immunol 2020; 11:380. [PMID: 32318053 PMCID: PMC7147506 DOI: 10.3389/fimmu.2020.00380] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 09/18/2019] [Accepted: 02/17/2020] [Indexed: 12/12/2022] Open
Abstract
Sepsis is defined as dysregulated host response caused by systemic infection, leading to organ failure. It is a life-threatening condition, often requiring admission to an intensive care unit (ICU). The causative agents and processes involved are multifactorial but are characterized by an overarching inflammatory response, sharing elements in common with severe inflammatory response syndrome (SIRS) of non-infectious origin. Sepsis presents with a range of pathophysiological and genetic features which make clinical differentiation from SIRS very challenging. This may reflect a poor understanding of the key gene inter-activities and/or pathway associations underlying these disease processes. Improved understanding is critical for early differential recognition of sepsis and SIRS and to improve patient management and clinical outcomes. Judicious selection of gene biomarkers suitable for development of diagnostic tests/testing could make differentiation of sepsis and SIRS feasible. Here we describe a methodologic framework for the identification and validation of biomarkers in SIRS, sepsis and septic shock patients, using a 2-tier gene screening, artificial neural network (ANN) data mining technique, using previously published gene expression datasets. Eight key hub markers have been identified which may delineate distinct, core disease processes and which show potential for informing underlying immunological and pathological processes and thus patient stratification and treatment. These do not show sufficient fold change differences between the different disease states to be useful as primary diagnostic biomarkers, but are instrumental in identifying candidate pathways and other associated biomarkers for further exploration.
Affiliation(s)
- Dong Ling Tong
  - Artificial Intelligence Laboratory, Faculty of Engineering and Computing, First City University College, Petaling Jaya, Malaysia
  - School of Science and Technology, Nottingham Trent University, Nottingham, United Kingdom
- Karen E Kempsell
  - Public Health England, National Infection Service, Porton Down, Salisbury, United Kingdom
- Tamas Szakmany
  - Department of Anaesthesia Intensive Care and Pain Medicine, Division of Population Medicine, Cardiff University, Cardiff, United Kingdom
- Graham Ball
  - School of Science and Technology, Nottingham Trent University, Nottingham, United Kingdom
4
Kheirelseid EAH, Miller N, Chang KH, Curran C, Hennessey E, Sheehan M, Newell J, Lemetre C, Balls G, Kerin MJ. miRNA expressions in rectal cancer as predictors of response to neoadjuvant chemoradiation therapy. Int J Colorectal Dis 2013; 28:247-60. [PMID: 22903298 DOI: 10.1007/s00384-012-1549-9] [Citation(s) in RCA: 56] [Impact Index Per Article: 5.1] [Accepted: 07/25/2012] [Indexed: 02/04/2023]
Abstract
INTRODUCTION Neoadjuvant chemoradiation therapy has been shown to improve the outcome in patients with rectal cancer and is generally accepted as standard care; however, only selected patients would benefit from this treatment. We aimed to identify predictors of response to neoadjuvant chemoradiation therapy in colorectal cancer using formalin-fixed paraffin-embedded (FFPE) tissues as the source of genetic material and microarray analysis as the investigation tool. METHODS After optimization of RNA extraction methods from FFPE, microarray analysis was carried out on total RNA extracted from 12 pre-treatment FFPE rectal tissues using Megaplex pool A. Microarray data were analysed using an artificial neural network algorithm. Statistical analysis and correlation with clinicopathological data were performed using SPSS software. RESULTS A distinct miRNA expression signature predictive of response to neoadjuvant CRT in 12 FFPE pre-treatment rectal cancer tissue samples was identified. These signatures consisted of three miRNA transcripts (miR-16, miR-590-5p and miR-153) to predict complete vs. incomplete response and two miRNA transcripts (miR-519c-3p and miR-561) to predict good vs. poor response with a median accuracy of 100%. CONCLUSION Using microarray analysis of pretreatment FFPE rectal cancer tissues, we identified for the first time a group of miRNA predictors of response to neoadjuvant CRT. This, indeed, can lead to a significant improvement in patient selection criteria and personalized rectal cancer management.
Affiliation(s)
- Elrasheid A H Kheirelseid
  - Department of Surgery, National University of Ireland Galway, Clinical Science Institute, Costello Road, Galway, Ireland.
5
Ahmadi H, Ahmadi A, Azimzadeh-Jamalkandi S, Shoorehdeli MA, Salehzadeh-Yazdi A, Bidkhori G, Masoudi-Nejad A. HomoTarget: a new algorithm for prediction of microRNA targets in Homo sapiens. Genomics 2012; 101:94-100. [PMID: 23174671 DOI: 10.1016/j.ygeno.2012.11.005] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 07/05/2012] [Revised: 09/25/2012] [Accepted: 11/09/2012] [Indexed: 12/19/2022]
Abstract
MiRNAs play an essential role in the networks of gene regulation by inhibiting the translation of target mRNAs. Several computational approaches have been proposed for the prediction of miRNA target genes. Reports reveal a large fraction of under-predicted or falsely predicted target genes. Thus, there is an imperative need to develop a computational method by which the target mRNAs of existing miRNAs can be correctly identified. In this study, a combined pattern recognition neural network (PRNN) and principal component analysis (PCA) architecture is proposed to model the complicated relationship between miRNAs and their target mRNAs in humans. The results of several types of intelligent classifiers and our proposed model were compared, showing that our algorithm outperformed them with higher sensitivity and specificity. Using the recent release of the miRBase database to find potential targets of miRNAs, this model incorporated twelve structural, thermodynamic and positional features of miRNA:mRNA binding sites to select target candidates.
Affiliation(s)
- Hamed Ahmadi
  - Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
- Ali Ahmadi
  - Department of Electrical and Computer Engineering, Khajeh-Nasir Toosi University, Tehran, Iran
- Sadegh Azimzadeh-Jamalkandi
  - Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
- Ali Salehzadeh-Yazdi
  - Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
- Gholamreza Bidkhori
  - Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
- Ali Masoudi-Nejad
  - Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran.
6
Matharoo-Ball B, Ratcliffe L, Lancashire L, Ugurel S, Miles AK, Weston DJ, Rees R, Schadendorf D, Ball G, Creaser CS. Diagnostic biomarkers differentiating metastatic melanoma patients from healthy controls identified by an integrated MALDI-TOF mass spectrometry/bioinformatic approach. Proteomics Clin Appl 2012; 1:605-20. [PMID: 21136712 DOI: 10.1002/prca.200700022] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Indexed: 12/31/2022]
Abstract
The prognosis of advanced metastatic melanoma (American Joint Committee on Cancer (AJCC) stage IV) remains dismal with a 5-year survival rate of 6-18%. In the present study, an integrated MALDI mass spectrometric approach combined with artificial neural networks (ANNs) analysis and modeling has been used for the identification of biomarker ions in serum from stage IV melanoma patients allowing the discrimination of metastatic disease from healthy status with high specificities of 92% for protein ions and 100% for peptide biomarkers. Our ANNs model also correctly classified 98% of a blind validation set of AJCC stage I melanoma samples as nonstage IV samples, emphasizing the power of the newly defined biomarkers to identify patients with late-stage metastatic melanoma. Sequence analysis identified peptides derived from metastasis-associated proteins: alpha 1-acid glycoprotein precursor-1/2 (AAG-1/2) and complement C3 component precursor-1 (CCCP-1). Furthermore, quantitation of serum AAG by an immunoassay showed a significant (p<0.001) increase in AAG serum concentration in stage IV patients in comparison with healthy volunteers; moreover, the quantity of AAG plotted against MALDI-MS peak intensity classified the groups into two distinct clusters. Ongoing studies of other disease stages will provide evidence whether our strategy is sufficiently robust to give rise to stage-specific protein/peptide signatures in melanoma.
Affiliation(s)
- Balwir Matharoo-Ball
  - Interdisciplinary Biomedical Research Centre, School of Biomedical and Natural Sciences, Nottingham Trent University, Clifton Lane, Nottingham, UK
7
Kaur P, Schlatzer D, Cooke K, Chance MR. Pairwise protein expression classifier for candidate biomarker discovery for early detection of human disease prognosis. BMC Bioinformatics 2012; 13:191. [PMID: 22870920 PMCID: PMC3468399 DOI: 10.1186/1471-2105-13-191] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 04/16/2012] [Accepted: 07/30/2012] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND An approach to molecular classification based on the comparative expression of protein pairs is presented. The method overcomes some of the present limitations in using peptide intensity data for class prediction for problems such as the detection of a disease, disease prognosis, or for predicting treatment response. Data analysis is particularly challenging in these situations due to sample size (typically tens) being much smaller than the large number of peptides (typically thousands). Methods based upon high dimensional statistical models, machine learning or other complex classifiers generate decisions which may be very accurate but can be complex and difficult to interpret in simple or biologically meaningful terms. A classification scheme, called ProtPair, is presented that generates simple decision rules leading to accurate classification which is based on measurement of very few proteins and requires only relative expression values, providing specific targeted hypotheses suitable for straightforward validation. RESULTS ProtPair has been tested against clinical data from 21 patients following a bone marrow transplant, 13 of whom progressed to idiopathic pneumonia syndrome (IPS). The approach combines multiple peptide pairs originating from the same set of proteins, with each unique peptide pair providing an independent measure of discriminatory power. The prediction rate of ProtPair for the IPS study as measured by leave-one-out CV is 69.1%, which can be very beneficial for clinical diagnosis as it may flag patients in need of closer monitoring. The "top ranked" proteins provided by ProtPair are known to be associated with the biological processes and pathways intimately associated with known IPS biology based on mouse models. CONCLUSIONS An approach to biomarker discovery, called ProtPair, is presented. ProtPair is based on the differential expression of pairs of peptides and the associated proteins. Using mass spectrometry data from "bottom up" proteomics methods, functionally related proteins/peptide pairs exhibiting co-ordinated changes in expression profile are discovered, which represent a signature for patients progressing to various disease conditions. The method has been tested against clinical data from patients progressing to idiopathic pneumonia syndrome (IPS) following a bone marrow transplant. The data indicates that patients with improper regulation in the concentration of specific acute phase response proteins at the time of bone marrow transplant are highly likely to develop IPS within a few weeks. The results lead to a specific set of protein pairs that can be efficiently verified by investigating the pairwise abundance change in independent cohorts using ELISA or targeted mass spectrometry techniques. This generalized classifier can be extended to other clinical problems in a variety of contexts.
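The idea of a decision rule built only from relative expression values, scored by leave-one-out cross-validation, can be illustrated with a minimal sketch. This is not the ProtPair algorithm; it is a generic single-pair ordering rule (in the spirit of top-scoring-pair classifiers), and the function names and toy data are hypothetical.

```python
from itertools import combinations

def fit_pair_rule(X, y):
    """Pick the feature pair (i, j) whose ordering x[i] < x[j] best separates classes 0 and 1.

    Assumes both classes are present in the training data.
    """
    best = None
    for i, j in combinations(range(len(X[0])), 2):
        freq = {}
        for label in (0, 1):
            rows = [x for x, lab in zip(X, y) if lab == label]
            # Fraction of samples in this class where feature i is below feature j.
            freq[label] = sum(x[i] < x[j] for x in rows) / len(rows)
        score = abs(freq[1] - freq[0])
        if best is None or score > best[0]:
            # Last element: does "x[i] < x[j]" vote for class 1?
            best = (score, i, j, freq[1] > freq[0])
    return best[1:]

def predict(rule, x):
    i, j, less_means_one = rule
    return int((x[i] < x[j]) == less_means_one)

def leave_one_out_accuracy(X, y):
    """Refit the rule with each sample held out, then test on that sample."""
    hits = 0
    for k in range(len(X)):
        rule = fit_pair_rule(X[:k] + X[k + 1:], y[:k] + y[k + 1:])
        hits += predict(rule, X[k]) == y[k]
    return hits / len(X)

# Hypothetical toy intensities: 3 features per sample, 3 samples per class.
X = [[1.0, 2.0, 5.0], [1.2, 2.5, 4.0], [0.8, 1.9, 6.0],
     [2.0, 1.0, 5.5], [2.2, 0.9, 4.5], [1.9, 1.1, 5.0]]
y = [0, 0, 0, 1, 1, 1]
rule = fit_pair_rule(X, y)
accuracy = leave_one_out_accuracy(X, y)
```

Because the rule depends only on which of two values is larger, it is unaffected by any monotone rescaling applied to a whole sample, which is one reason relative-expression rules are attractive for intensity data.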
Affiliation(s)
- Parminder Kaur
  - Case Center for Proteomics and Bioinformatics, Case Western Reserve University, Cleveland, OH 44106, USA
- Daniela Schlatzer
  - Case Center for Proteomics and Bioinformatics, Case Western Reserve University, Cleveland, OH 44106, USA
- Kenneth Cooke
  - Pediatric Hematology and Oncology, University Hospitals, Cleveland, OH 44106, USA
- Mark R Chance
  - Case Center for Proteomics and Bioinformatics, Case Western Reserve University, Cleveland, OH 44106, USA
8
Linear normalised hash function for clustering gene sequences and identifying reference sequences from multiple sequence alignments. Microbial Informatics and Experimentation 2012; 2:2. [PMID: 22587938 PMCID: PMC3351711 DOI: 10.1186/2042-5783-2-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/05/2011] [Accepted: 01/26/2012] [Indexed: 11/10/2022]
Abstract
BACKGROUND Comparative genomics has put additional demands on the assessment of similarity between sequences and their clustering as means for classification. However, defining the optimal number of clusters, cluster density and boundaries for sets of potentially related sequences of genes with variable degrees of polymorphism remains a significant challenge. The aim of this study was to develop a method that would identify the cluster centroids and the optimal number of clusters for a given sensitivity level and could work equally well for the different sequence datasets. RESULTS A novel method that combines the linear mapping hash function and multiple sequence alignment (MSA) was developed. This method takes advantage of the sequences already sorted by similarity in the MSA output, and identifies the optimal number of clusters, cluster cut-offs, and cluster centroids that can represent reference gene vouchers for the different species. The linear mapping hash function can map a distance matrix already ordered by similarity to indices to reveal gaps in the values around which the optimal cut-offs of the different clusters can be identified. The method was evaluated using sets of closely related (16S rRNA gene sequences of Nocardia species) and highly variable (VP1 genomic region of Enterovirus 71) sequences and outperformed existing unsupervised machine learning clustering methods and dimensionality reduction methods. This method does not require prior knowledge of the number of clusters or the distance between clusters, handles clusters of different sizes and shapes, and scales linearly with the dataset. CONCLUSIONS The combination of MSA with the linear mapping hash function is a computationally efficient way of gene sequence clustering and can be a valuable tool for the assessment of similarity, clustering of different microbial genomes, identifying reference sequences, and for the study of evolution of bacteria and viruses.
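The gap-finding idea described above can be sketched in a few lines. This is an illustrative simplification under stated assumptions, not the published method: it works on a one-dimensional list of already sorted distances, and the names `linear_hash`, `cluster_by_gaps`, `n_buckets` and `min_gap` are hypothetical.

```python
def linear_hash(value, lo, hi, n_buckets):
    """Map a value in [lo, hi] linearly onto a bucket index in [0, n_buckets - 1]."""
    if hi == lo:
        return 0
    return int((value - lo) / (hi - lo) * (n_buckets - 1))

def cluster_by_gaps(sorted_values, n_buckets, min_gap=1):
    """Start a new cluster wherever at least `min_gap` empty buckets separate neighbours.

    `sorted_values` must be in ascending order; a single pass is made over it.
    """
    if not sorted_values:
        return []
    lo, hi = sorted_values[0], sorted_values[-1]
    clusters = [[sorted_values[0]]]
    prev_idx = linear_hash(sorted_values[0], lo, hi, n_buckets)
    for v in sorted_values[1:]:
        idx = linear_hash(v, lo, hi, n_buckets)
        if idx - prev_idx - 1 >= min_gap:
            clusters.append([v])    # run of empty buckets: treat as a cut-off
        else:
            clusters[-1].append(v)
        prev_idx = idx
    return clusters

# Hypothetical sorted distances to a reference sequence.
distances = [0.00, 0.01, 0.02, 0.30, 0.31, 0.33, 0.90, 0.92]
clusters = cluster_by_gaps(distances, n_buckets=20, min_gap=2)
```

In this sketch the number of clusters is not supplied in advance; it falls out of where the empty buckets occur, and `n_buckets` and `min_gap` together play the role of a sensitivity setting.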
9
Gump BB, MacKenzie JA, Dumas AK, Palmer CD, Parsons PJ, Segu ZM, Mechref YS, Bendinskas KG. Fish consumption, low-level mercury, lipids, and inflammatory markers in children. Environmental Research 2012; 112:204-211. [PMID: 22030286 PMCID: PMC3267839 DOI: 10.1016/j.envres.2011.10.002] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 06/18/2011] [Revised: 10/01/2011] [Accepted: 10/05/2011] [Indexed: 05/29/2023]
Abstract
There is considerable evidence that consuming fish has numerous health benefits, including a reduced risk of cardiovascular disease. However, fish is also the primary source of human exposure to mercury (Hg). In a cross-sectional study of 9-11 year old children (N=100), we measured fish consumption, blood lipids, total blood Hg, diurnal salivary cortisol (4 samples collected throughout the day), and performed a proteomic analysis of serum proteins using spectral count shotgun proteomics. Children who consumed fish had a significantly more atheroprotective lipid profile but higher levels of blood Hg relative to children that did not consume fish. Although the levels of blood Hg were very low in these children (M=0.77 μg/L; all but 1 participant had levels below 3.27 μg/L), increasing blood Hg was significantly associated with blunted diurnal cortisol levels. Blood Hg was also significantly associated with acute-phase proteins suggesting systemic inflammation, and several of these proteins were found to significantly reduce the association between Hg and diminished cortisol when included in the model. This study of a pediatric population is the first to document an association between blood Hg, systemic inflammation, and endocrine disruption in humans. Without a better understanding of the long-term consequences of an atheroprotective lipid profile relative to blunted diurnal cortisol and systemic inflammation, a determination of the risk-benefit ratio for fish consumption by children is not possible.
Affiliation(s)
- Brooks B Gump
  - Department of Public Health, Food Studies, and Nutrition, Syracuse University, Syracuse, NY 13244, USA.
10
Oh JH, Gao J. Fast kernel discriminant analysis for classification of liver cancer mass spectra. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2011; 8:1522-1534. [PMID: 20479503 DOI: 10.1109/tcbb.2010.42] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 05/29/2023]
Abstract
The classification of serum samples based on mass spectrometry (MS) has been increasingly used for monitoring disease progression and for diagnosing early disease. However, the classification task in mass spectrometry data is extremely challenging due to the very large number of peaks (features) in mass spectra. Linear discriminant analysis (LDA) has been widely used for dimension reduction and feature extraction in many applications. However, conventional LDA suffers from the singularity problem when dealing with high-dimensional features. Another critical limitation is its linearity, which causes it to fail in classification problems over nonlinearly clustered data sets. To overcome such problems, we develop a new fast kernel discriminant analysis (FKDA) that is fast in the calculation of optimal discriminant vectors. FKDA is applied to the classification of liver cancer mass spectrometry data consisting of three categories (hepatocellular carcinoma, cirrhosis, and healthy), originally analyzed by Ressom et al. We demonstrate the superiority and effectiveness of FKDA when compared to other classification techniques.
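The singularity problem mentioned above can be made concrete with the standard Fisher criterion. The notation is introduced here for illustration and is not taken from the paper: $S_B$ and $S_W$ are the between-class and within-class scatter matrices, $n$ is the number of samples, $c$ the number of classes, and $d$ the number of features.

```latex
J(\mathbf{w}) = \frac{\mathbf{w}^{\top} S_B \mathbf{w}}{\mathbf{w}^{\top} S_W \mathbf{w}},
\qquad
S_W^{-1} S_B \mathbf{w} = \lambda \mathbf{w},
\qquad
\operatorname{rank}(S_W) \le n - c .
```

Since $S_W$ is a $d \times d$ matrix, it cannot be inverted whenever $d > n - c$, which is the usual situation for mass spectra with thousands of peaks and only tens of samples.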
Affiliation(s)
- Jung Hun Oh
  - Division of Bioinformatics and Outcomes Research, Department of Radiation Oncology, Washington University School of Medicine, St. Louis, MO 63110, USA
11
Chang KH, Miller N, Kheirelseid EAH, Lemetre C, Ball GR, Smith MJ, Regan M, McAnena OJ, Kerin MJ. MicroRNA signature analysis in colorectal cancer: identification of expression profiles in stage II tumors associated with aggressive disease. Int J Colorectal Dis 2011; 26:1415-22. [PMID: 21739196 DOI: 10.1007/s00384-011-1279-4] [Citation(s) in RCA: 87] [Impact Index Per Article: 6.7] [Accepted: 06/28/2011] [Indexed: 02/07/2023]
Abstract
PURPOSE Colorectal cancer (CRC) is a clinically diverse disease whose molecular etiology remains poorly understood. The purpose of this study was to identify miRNA expression patterns predictive of CRC tumor status and to investigate associations between microRNA (miRNA) expression and clinicopathological parameters. METHODS Expression profiling of 380 miRNAs was performed on 20 paired stage II tumor and normal tissues. Artificial neural network (ANN) analysis was applied to identify miRNAs predictive of tumor status. The validation of specific miRNAs was performed on 102 tissue specimens of varying stages. RESULTS Thirty-three miRNAs were identified as differentially expressed in tumor versus normal tissues. ANN analysis identified three miRNAs (miR-139-5p, miR-31, and miR-17-92 cluster) predictive of tumor status in stage II disease. Elevated expression of miR-31 (p = 0.004) and miR-139-5p (p < 0.001) and reduced expression of miR-143 (p = 0.016) were associated with aggressive mucinous phenotype. Increased expression of miR-10b was also associated with mucinous tumors (p = 0.004). Furthermore, progressively increasing levels of miR-10b expression were observed from T1 to T4 lesions and from stage I to IV disease. CONCLUSION Association of specific miRNAs with clinicopathological features indicates their biological relevance and highlights the power of ANN to reliably predict clinically relevant miRNA biomarkers, which it is hoped will better stratify patients to guide adjuvant therapy.
Affiliation(s)
- Kah Hoong Chang
  - Department of Surgery, National University of Ireland, Galway, Ireland
12
Ho YP, Reddy PM. Advances in mass spectrometry for the identification of pathogens. Mass Spectrometry Reviews 2011; 30:1203-24. [PMID: 21557290 PMCID: PMC7168406 DOI: 10.1002/mas.20320] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.2] [Received: 03/30/2010] [Revised: 08/06/2010] [Accepted: 08/06/2010] [Indexed: 05/25/2023]
Abstract
Mass spectrometry (MS) has become an important technique to identify microbial biomarkers. The rapid and accurate MS identification of microorganisms without any extensive pretreatment of samples is now possible. This review summarizes MS methods that are currently utilized in microbial analyses. Affinity methods are effective to clean, enrich, and investigate microorganisms from complex matrices. Functionalized magnetic nanoparticles might concentrate traces of target microorganisms from sample solutions. Therefore, nanoparticle-based techniques have a favorable detection limit. MS coupled with various chromatographic techniques, such as liquid chromatography and capillary electrophoresis, reduces the complexity of microbial biomarkers and yields reliable results. The direct analysis of whole pathogenic microbial cells with matrix-assisted laser desorption/ionization MS without sample separation reveals specific biomarkers for taxonomy, and has the advantages of simplicity, rapidity, and high-throughput measurements. The MS detection of polymerase chain reaction (PCR)-amplified microbial nucleic acids provides an alternative to biomarker analysis. This review will conclude with some current applications of MS in the identification of pathogens.
Affiliation(s)
- Yen-Peng Ho
  - Department of Chemistry, National Dong Hwa University, Hualien 97401, Taiwan.
13
Tong DL, Boocock DJ, Coveney C, Saif J, Gomez SG, Querol S, Rees R, Ball GR. A simpler method of preprocessing MALDI-TOF MS data for differential biomarker analysis: stem cell and melanoma cancer studies. Clin Proteomics 2011; 8:14. [PMID: 21929822 PMCID: PMC3224566 DOI: 10.1186/1559-0275-8-14] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 08/26/2011] [Accepted: 09/19/2011] [Indexed: 01/11/2023] Open
Abstract
INTRODUCTION Raw spectral data from matrix-assisted laser desorption/ionisation time-of-flight (MALDI-TOF) with MS profiling techniques usually contains complex information not readily providing biological insight into disease. The association of identified features within raw data to a known peptide is extremely difficult. Data preprocessing to remove uncertainty characteristics in the data is normally required before performing any further analysis. This study proposes an alternative yet simple solution to preprocess raw MALDI-TOF-MS data for identification of candidate marker ions. Two in-house MALDI-TOF-MS data sets from two different sample sources (melanoma serum and cord blood plasma) are used in our study. METHOD Raw MS spectral profiles were preprocessed using the proposed approach to identify peak regions in the spectra. The preprocessed data was then analysed using bespoke machine learning algorithms for data reduction and ion selection. Using the selected ions, an ANN-based predictive model was constructed to examine the predictive power of these ions for classification. RESULTS Our model identified 10 candidate marker ions for both data sets. These ion panels achieved over 90% classification accuracy on blind validation data. Receiver operating characteristics analysis was performed and the area under the curve for melanoma and cord blood classifiers was 0.991 and 0.986, respectively. CONCLUSION The results suggest that our data preprocessing technique removes unwanted characteristics of the raw data, while preserving the predictive components of the data. Ion identification analysis can be carried out using MALDI-TOF-MS data with the proposed data preprocessing technique coupled with bespoke algorithms for data reduction and ion selection.
Affiliation(s)
- Dong L Tong
- The John van Geest Cancer Research Centre, School of Science and Technology, Nottingham Trent University, Clifton Lane, Nottingham, NG11 8NS, UK.
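The area under the ROC curve quoted in the abstract above can be read as the probability that a randomly chosen case receives a higher classifier score than a randomly chosen control. A minimal Python sketch of that rank-based calculation is given below; the function name `auc_from_scores` and the scores are hypothetical illustrations, not code or data from the study.

```python
def auc_from_scores(pos_scores, neg_scores):
    """Fraction of (case, control) pairs in which the case scores higher.

    Ties count as half a win, which matches the Mann-Whitney view of AUC.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier outputs (not taken from the paper).
cases = [0.9, 0.8, 0.7, 0.4]
controls = [0.6, 0.3, 0.2, 0.1]
print(auc_from_scores(cases, controls))
```

An AUC of 1.0 would mean every case outranks every control, while 0.5 corresponds to chance-level ranking.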
|
14
|
Helal M, Kong F, Chen SCA, Bain M, Christen R, Sintchenko V. Defining reference sequences for Nocardia species by similarity and clustering analyses of 16S rRNA gene sequence data. PLoS One 2011; 6:e19517. [PMID: 21687706 PMCID: PMC3110597 DOI: 10.1371/journal.pone.0019517] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 10/29/2010] [Accepted: 04/08/2011] [Indexed: 01/08/2023]
Abstract
Background The intra- and inter-species genetic diversity of bacteria and the absence of ‘reference’, or the most representative, sequences of individual species present a significant challenge for sequence-based identification. The aims of this study were to determine the utility, and compare the performance of several clustering and classification algorithms to identify the species of 364 sequences of 16S rRNA gene with a defined species in GenBank, and 110 sequences of 16S rRNA gene with no defined species, all within the genus Nocardia. Methods A total of 364 16S rRNA gene sequences of Nocardia species were studied. In addition, 110 16S rRNA gene sequences assigned only to the Nocardia genus level at the time of submission to GenBank were used for machine learning classification experiments. Different clustering algorithms were compared with a novel algorithm or the linear mapping (LM) of the distance matrix. Principal Components Analysis was used for the dimensionality reduction and visualization. Results The LM algorithm achieved the highest performance and classified the set of 364 16S rRNA sequences into 80 clusters, the majority of which (83.52%) corresponded with the original species. The most representative 16S rRNA sequences for individual Nocardia species have been identified as ‘centroids’ in respective clusters from which the distances to all other sequences were minimized; 110 16S rRNA gene sequences with identifications recorded only at the genus level were classified using machine learning methods. Simple kNN machine learning demonstrated the highest performance and classified Nocardia species sequences with an accuracy of 92.7% and a mean frequency of 0.578. Conclusion The identification of centroids of 16S rRNA gene sequence clusters using novel distance matrix clustering enables the identification of the most representative sequences for each individual species of Nocardia and allows the quantitation of inter- and intra-species variability.
Affiliation(s)
- Manal Helal
- Sydney Medical School, The University of Sydney, Sydney, New South Wales, Australia
- Centre for Infectious Diseases and Microbiology, Westmead Hospital, Sydney West Area Health Service, Sydney, New South Wales, Australia
- Fanrong Kong
- Centre for Infectious Diseases and Microbiology, Westmead Hospital, Sydney West Area Health Service, Sydney, New South Wales, Australia
- Sharon C. A. Chen
- Sydney Medical School, The University of Sydney, Sydney, New South Wales, Australia
- Centre for Infectious Diseases and Microbiology, Westmead Hospital, Sydney West Area Health Service, Sydney, New South Wales, Australia
- Michael Bain
- School of Computer Science and Engineering, University of New South Wales, Sydney, New South Wales, Australia
- Richard Christen
- University of Nice Sophia-Antipolis, and CNRS UMR6543, Parc Valrose, Centre de Biochimie, Nice, France
- Vitali Sintchenko
- Sydney Medical School, The University of Sydney, Sydney, New South Wales, Australia
- Centre for Infectious Diseases and Microbiology, Westmead Hospital, Sydney West Area Health Service, Sydney, New South Wales, Australia
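The abstract above describes the most representative sequence of each cluster as the member from which distances to all other members are minimized. A minimal Python sketch of that selection step is shown below, assuming a precomputed pairwise distance matrix and known cluster labels. The function name `cluster_medoids` and the toy distances are hypothetical, and the sketch does not reproduce the paper's linear mapping (LM) clustering itself.

```python
def cluster_medoids(dist, labels):
    """Map each cluster label to the index of its most central member.

    dist   : square list-of-lists of pairwise distances
    labels : cluster label for each row/column of dist
    The most central member is the one with the smallest summed distance
    to the other members of its cluster (often called a medoid).
    """
    members = {}
    for idx, lab in enumerate(labels):
        members.setdefault(lab, []).append(idx)

    medoids = {}
    for lab, idxs in members.items():
        # min() keeps the first index when two members tie.
        medoids[lab] = min(idxs, key=lambda i: sum(dist[i][j] for j in idxs))
    return medoids


# Toy distance matrix for five hypothetical sequences in two clusters.
dist = [
    [0.0, 0.1, 0.2, 0.9, 0.8],
    [0.1, 0.0, 0.1, 0.8, 0.9],
    [0.2, 0.1, 0.0, 0.9, 0.9],
    [0.9, 0.8, 0.9, 0.0, 0.3],
    [0.8, 0.9, 0.9, 0.3, 0.0],
]
labels = ["A", "A", "A", "B", "B"]
print(cluster_medoids(dist, labels))
```

Because the chosen representative is always an actual sequence in the data set, it can be reported directly as a reference sequence for its cluster.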
|
15
|
Lancashire LJ, Roberts DL, Dive C, Renehan AG. The development of composite circulating biomarker models for use in anticancer drug clinical development. Int J Cancer 2011; 128:1843-51. [PMID: 20549702 DOI: 10.1002/ijc.25513] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/10/2022]
Abstract
The development of informative composite circulating biomarkers predicting cancer presence or therapy response is clinically attractive but optimal approaches to modeling are as yet unclear. This study investigated multidimensional relationships within an example panel of serum insulin-like growth factor (IGF) peptides using logistic regression (LR), fractional polynomial (FP) regression, artificial neural networks (ANNs) and support vector machines (SVMs) to derive predictive models for colorectal cancer (CRC). Two phase 2 biomarker validation analyses were performed: controls were ambulant adults (n = 722); cases were: (i) CRC patients (n = 100) and (ii) patients with acromegaly (n = 52), the latter as "positive" discriminators. Serum IGF-I, IGF-II, IGF binding protein (IGFBP)-2 and -3 were measured. Discriminatory characteristics were compared within and between models. For the LR, FP and ANN models, and to a lesser extent SVMs, the addition of covariates at several steps improved discrimination characteristics. The optimum biomarker combination discriminating CRC vs. controls was achieved using ANN models [sensitivity, 94%; specificity, 90%; accuracy, 0.975 (95% CIs: 0.948, 1.000)]. ANN modeling significantly outperformed LR, FP and SVM in terms of discrimination (p < 0.0001) and calibration. The acromegaly analysis demonstrated expected high performance characteristics in the ANN model [accuracy, 0.993 (95% CIs: 0.977, 1.000)]. Curved decision surfaces generated from the ANNs revealed the potential clinical utility. This example demonstrated improved discriminatory characteristics within the composite biomarker ANN model and a final model that outperformed the three other models. This modeling approach forms the basis to evaluate composite biomarkers as pharmacological and predictive biomarkers in future clinical trials.
Affiliation(s)
- Lee J Lancashire
- Clinical and Experimental Pharmacology Group, Paterson Institute for Cancer Research, Manchester, UK
|
16
|
Chandra V, Ramakrishnan R, Ramanathan S. An ANN model for the identification of deleterious nsSNPs in tumor suppressor genes. Bioinformation 2011; 6:41-4. [PMID: 21464845 PMCID: PMC3064852 DOI: 10.6026/97320630006041] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/07/2011] [Accepted: 02/17/2011] [Indexed: 11/23/2022]
Abstract
Human genetic variations primarily result from single nucleotide polymorphisms (SNPs), which occur approximately every 1000 bases in the overall human population. Non-synonymous SNPs (nsSNPs), which lead to amino acid changes in the protein product, may account for nearly half of the known genetic variations linked to inherited human diseases and cancer. One of the main problems of medical genetics today is to identify nsSNPs that underlie disease-related phenotypes in humans. An attempt was made to develop a new approach to predict such nsSNPs. This would enhance our understanding of genetic diseases and help to predict disease. We detect nsSNPs and all possible and reliable alleles by ANN, a soft computing model using potential SNP information. Reliable nsSNPs are identified based on the reconstructed alleles and on sequence redundancy. The model gives good results, with mean specificity (95.85%), sensitivity (97.40%) and accuracy (96.25%). Our results indicate that ANNs can serve as a useful method to analyze the quantitative effect of nsSNPs on protein function and would be useful for large-scale analysis of genomic nsSNP data. AVAILABILITY The database is available for free at http://www.snp.mirworks.in.
Affiliation(s)
- Vinod Chandra
- Department of Computer Applications, College of Engineering Trivandrum, Kerala, India
- Rejimoan Ramakrishnan
- Department of Computer Science, P.S.G. College of Technology, Coimbatore, Tamil Nadu, India
- Shalini Ramanathan
- Department of Computer Science, P.S.G. College of Technology, Coimbatore, Tamil Nadu, India
|
17
|
Tracing the transition of methicillin resistance in sub-populations of Staphylococcus aureus, using SELDI-TOF Mass Spectrometry and Artificial Neural Network Analysis. Syst Appl Microbiol 2011; 34:81-6. [DOI: 10.1016/j.syapm.2010.11.002] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Received: 10/26/2010] [Revised: 11/17/2010] [Accepted: 11/19/2010] [Indexed: 11/18/2022]
|
18
|
Abstract
BACKGROUND Mass spectrometry (MS) is a suitable technology for microorganism identification and characterization. CONTENT This review summarizes the MS-based methods currently used for the analyses of pathogens. Direct analysis of whole pathogenic microbial cells using MS without sample fractionation reveals specific biomarkers for taxonomy and provides rapid and high-throughput capabilities. MS coupled with various chromatography- and affinity-based techniques simplifies the complexity of the signals of the microbial biomarkers and provides more accurate results. Affinity-based methods, including those employing nanotechnology, can be used to concentrate traces of target microorganisms from sample solutions and, thereby, improve detection limits. Approaches combining amplification of nucleic acid targets from pathogens with MS-based detection are alternatives to biomarker analyses. Many data analysis methods, including multivariate analysis and bioinformatics approaches, have been developed for microbial identification. The review concludes with some current clinical applications of MS in the identification and typing of infectious microorganisms, as well as some perspectives. SUMMARY Advances in instrumentation (separation and mass analysis), ionization techniques, and biological methodologies will all enhance the capabilities of MS for the analysis of pathogens.
Affiliation(s)
- Yen-Peng Ho
- Department of Chemistry, National Dong Hwa University, Hualien, Taiwan.
|
19
|
Chandra V, Girijadevi R, Nair AS, Pillai SS, Pillai RM. MTar: a computational microRNA target prediction architecture for human transcriptome. BMC Bioinformatics 2010; 11 Suppl 1:S2. [PMID: 20122191 PMCID: PMC3009490 DOI: 10.1186/1471-2105-11-s1-s2] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.9] [Indexed: 12/13/2022]
Abstract
BACKGROUND MicroRNAs (miRNAs) play an essential role in gene regulatory networks by inhibiting the expression of target mRNAs. As their mRNA targets are genes involved in important cell functions, there is a growing interest in identifying the relationship between miRNAs and their target mRNAs. There is therefore an imperative need to develop a computational method by which we can identify the target mRNAs of existing miRNAs. Here, we propose an efficient machine learning model to unravel the relationship between miRNAs and their target mRNAs. RESULTS We present a novel computational architecture, MTar, for miRNA target prediction, which reports 94.5% sensitivity and 90.5% specificity. We identified 16 positional, thermodynamic and structural parameters from wet lab proven miRNA:mRNA pairs, and MTar makes use of these parameters for miRNA target identification. It incorporates an Artificial Neural Network (ANN) verifier which is trained on wet lab proven microRNA targets. A number of hitherto unknown targets of many miRNA families were located using MTar. The method identifies all three potential miRNA targets (5' seed-only, 5' dominant, and 3' canonical), whereas the existing solutions focus on 5' complementarities alone. CONCLUSION MTar is an ANN-based architecture for identifying functional regulatory miRNA-mRNA interactions using predicted miRNA targets. The area of target prediction has received new momentum from thermodynamic models that incorporate target accessibility. This model employs sixteen structural, thermodynamic and positional features of residues in miRNA:mRNA pairs to select target candidates. Our machine learning architecture, MTar, is thus found to be more comprehensive than the existing methods in predicting miRNA targets, especially in the human transcriptome.
Affiliation(s)
- Vinod Chandra
- Centre for Bioinformatics, University of Kerala, Thiruvananthapuram, India.
|
20
|
Xiao D, Yang Y, Liu H, Yu H, Yan Y, Huang W, Jiang W, Liao W, Hu Q, Huang B. Development of a method based on surface enhanced laser desorption and ionization time of flight mass spectrometry for rapid identification of Klebsiella pneumoniae. J Microbiol 2009; 47:646-50. [PMID: 19851739 DOI: 10.1007/s12275-009-0092-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 03/24/2009] [Accepted: 06/14/2009] [Indexed: 11/29/2022]
Abstract
A method based on surface enhanced laser desorption and ionization time of flight mass spectrometry (SELDI-TOF MS) was developed for the rapid identification of Klebsiella pneumoniae by directly applying bacterial colonies without further protein extraction. A total of 40 K. pneumoniae and 114 other related microorganisms isolated clinically were analyzed by SELDI-TOF MS. An identification model for K. pneumoniae was established by artificial neural networks (ANNs) with a classification accuracy of 100%. The model was then blindly tested with 43 K. pneumoniae and 53 control bacteria. The results showed that the model was successful, with an accuracy of 96.9%, sensitivity of 100% and specificity of 94.3%. This strategy has potential for rapid identification of K. pneumoniae.
Affiliation(s)
- Daiwen Xiao
- Clinical Laboratory Department, Sichuan Academy of Medical Sciences and Sichuan Provincial People's Hospital, Chengdu, PR China
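The blind-test figures in the abstract above can be checked with a short calculation. The sketch below is a minimal Python illustration; the per-cell counts (43 of 43 K. pneumoniae and 50 of 53 controls classified correctly) are an assumption inferred from the reported sample sizes and percentages, not counts stated in the abstract, and the function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # true positives among all positives
    specificity = tn / (tn + fp)                 # true negatives among all negatives
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # all correct calls among all samples
    return sensitivity, specificity, accuracy


# Assumed counts: 43 K. pneumoniae (all detected), 53 controls (3 misclassified).
sens, spec, acc = diagnostic_metrics(tp=43, fn=0, tn=50, fp=3)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```

Under these assumed counts the three values come out at 100%, about 94.3% and about 96.9%, which agrees with a specificity in the mid-90s rather than a value above 100%.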
|
21
|
Lowery AJ, Miller N, Devaney A, McNeill RE, Davoren PA, Lemetre C, Benes V, Schmidt S, Blake J, Ball G, Kerin MJ. MicroRNA signatures predict oestrogen receptor, progesterone receptor and HER2/neu receptor status in breast cancer. Breast Cancer Res 2009; 11:R27. [PMID: 19432961 PMCID: PMC2716495 DOI: 10.1186/bcr2257] [Citation(s) in RCA: 331] [Impact Index Per Article: 22.1] [Received: 11/13/2008] [Revised: 03/25/2009] [Accepted: 05/11/2009] [Indexed: 12/12/2022]
Abstract
INTRODUCTION Breast cancer is a heterogeneous disease encompassing a number of phenotypically diverse tumours. Expression levels of the oestrogen, progesterone and HER2/neu receptors which characterize clinically distinct breast tumours have been shown to change during disease progression and in response to systemic therapies. Mi(cro)RNAs play critical roles in diverse biological processes and are aberrantly expressed in several human neoplasms including breast cancer, where they function as regulators of tumour behaviour and progression. The aims of this study were to identify miRNA signatures that accurately predict the oestrogen receptor (ER), progesterone receptor (PR) and HER2/neu receptor status of breast cancer patients to provide insight into the regulation of breast cancer phenotypes and progression. METHODS Expression profiling of 453 miRNAs was performed in 29 early-stage breast cancer specimens. miRNA signatures associated with ER, PR and HER2/neu status were generated using artificial neural networks (ANN), and expression of specific miRNAs was validated using RQ-PCR. RESULTS Stepwise ANN analysis identified predictive miRNA signatures corresponding with oestrogen (miR-342, miR-299, miR-217, miR-190, miR-135b, miR-218), progesterone (miR-520g, miR-377, miR-527-518a, miR-520f-520c) and HER2/neu (miR-520d, miR-181c, miR-302c, miR-376b, miR-30e) receptor status. MiR-342 and miR-520g expression was further analysed in 95 breast tumours. MiR-342 expression was highest in ER and HER2/neu-positive luminal B tumours and lowest in triple-negative tumours. MiR-520g expression was elevated in ER and PR-negative tumours. CONCLUSIONS This study demonstrates that ANN analysis reliably identifies biologically relevant miRNAs associated with specific breast cancer phenotypes. The association of specific miRNAs with ER, PR and HER2/neu status indicates a role for these miRNAs in disease classification of breast cancer. 
Decreased expression of miR-342 in the therapeutically challenging triple-negative breast tumours, increased miR-342 expression in the luminal B tumours, and downregulated miR-520g in ER and PR-positive tumours indicates that not only is dysregulated miRNA expression a marker for poorer prognosis breast cancer, but that it could also present an attractive target for therapeutic intervention.
Affiliation(s)
- Aoife J Lowery
- Department of Surgery, Clinical Science Institute, University Hospital/National University of Ireland Galway, Galway, Ireland.
|
22
|
A validated gene expression profile for detecting clinical outcome in breast cancer using artificial neural networks. Breast Cancer Res Treat 2009; 120:83-93. [DOI: 10.1007/s10549-009-0378-1] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.7] [Received: 02/11/2009] [Accepted: 03/13/2009] [Indexed: 10/20/2022]
|
23
|
Yang YC, Yu H, Xiao DW, Liu H, Hu Q, Huang B, Liao WJ, Huang WF. Rapid identification of Staphylococcus aureus by surface enhanced laser desorption and ionization time of flight mass spectrometry. J Microbiol Methods 2009; 77:202-6. [PMID: 19230841 DOI: 10.1016/j.mimet.2009.02.004] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 07/24/2008] [Revised: 02/03/2009] [Accepted: 02/03/2009] [Indexed: 11/25/2022]
Abstract
Staphylococcus aureus (S. aureus), a vital nosocomial pathogen, is responsible for several diseases. With the increasing isolation rate in clinical specimens, rapid identification of this bacterial species is required. But present identification via conventional methods is time-consuming and lacks accuracy. The purpose of the current study was to evaluate the use of surface enhanced laser desorption ionization time of flight mass spectrometry (SELDI-TOF MS) for rapid identification of S. aureus. A total of 120 clinical isolates of S. aureus and 153 non-S. aureus species were identified by conventional methods, and the species nature of all staphylococci was further confirmed by 16S rDNA sequencing. All strains observed were analyzed by SELDI-TOF MS. An identification model for S. aureus was developed and validated by an artificial neural network. The model based on 6 protein peaks exhibited a sensitivity of 98.4% and specificity of 98.6%. This strategy has the potential for rapid identification of S. aureus.
Affiliation(s)
- Yong-Chang Yang
- Clinical Laboratory Department, Sichuan Academy of Medical Sciences & Sichuan Provincial People's Hospital, Chengdu 610072, China
|
24
|
Ma J, Nguyen MN, Rajapakse JC. Gene classification using codon usage and support vector machines. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2009; 6:134-143. [PMID: 19179707 DOI: 10.1109/tcbb.2007.70240] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 05/27/2023]
Abstract
A novel approach for gene classification, which adopts codon usage bias as the input feature vector for classification by support vector machines (SVM), is proposed. The DNA sequence is first converted to a 59-dimensional feature vector where each element corresponds to the relative synonymous usage frequency of a codon. As the input to the classifier is independent of sequence length and variance, our approach is useful when the sequences to be classified are of different lengths, a condition under which homology-based methods tend to fail. The method is demonstrated by using 1,841 Human Leukocyte Antigen (HLA) sequences which are classified into two major classes: HLA-I and HLA-II; each major class is further subdivided into sub-groups of HLA-I and HLA-II molecules. Using codon usage frequencies, binary SVM achieved an accuracy rate of 99.3% for HLA major class classification and multi-class SVM achieved accuracy rates of 99.73% and 98.38% for sub-class classification of HLA-I and HLA-II molecules, respectively. The results show that gene classification based on codon usage bias is consistent with the molecular structures and biological functions of HLA molecules.
Affiliation(s)
- Jianmin Ma
- BioInformatics Research Center, Nanyang Technological University, Singapore 637553.
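The 59-dimensional codon usage vector mentioned in the abstract above can be illustrated with a short sketch. The Python below assumes the standard genetic code and the usual relative synonymous codon usage (RSCU) definition, in which each codon's count is divided by the mean count of its synonymous family; whether the authors used exactly this normalization is an assumption. The 59 features arise from the 64 codons minus 3 stop codons and the single-codon amino acids Met and Trp. The names `rscu_vector` and `FEATURE_CODONS` are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (first base slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group sense codons by amino acid, then keep only families with synonyms.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        SYNONYMS.setdefault(aa, []).append(codon)
FEATURE_CODONS = sorted(c for fam in SYNONYMS.values() if len(fam) > 1 for c in fam)


def rscu_vector(seq):
    """Return the RSCU value of each of the 59 feature codons for an in-frame sequence."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    vec = []
    for codon in FEATURE_CODONS:
        family = SYNONYMS[CODON_TABLE[codon]]
        total = sum(counts[c] for c in family)
        # Assumption: families absent from the sequence contribute 0.0.
        vec.append(counts[codon] * len(family) / total if total else 0.0)
    return vec


vec = rscu_vector("GCTGCTGCC")  # three alanine codons: GCT, GCT, GCC
print(len(vec), vec[FEATURE_CODONS.index("GCT")])
```

Because each value is normalized within its synonymous family, the vector has the same length for any coding sequence, which is what makes it usable as a fixed-size classifier input.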
|
25
|
Lancashire LJ, Rees RC, Ball GR. Identification of gene transcript signatures predictive for estrogen receptor and lymph node status using a stepwise forward selection artificial neural network modelling approach. Artif Intell Med 2008; 43:99-111. [PMID: 18420392 DOI: 10.1016/j.artmed.2008.03.001] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.6] [Received: 01/09/2007] [Revised: 02/29/2008] [Accepted: 03/10/2008] [Indexed: 10/22/2022]
Abstract
OBJECTIVE The advent of microarrays has attracted considerable interest from biologists due to the potential for high throughput analysis of hundreds of thousands of gene transcripts. Subsequent analysis of the data may identify specific features which correspond to characteristics of interest within the population, for example, analysis of gene expression profiles in cancer patients to identify molecular signatures corresponding with prognostic outcome. These high throughput technologies have resulted in an unprecedented rate of data generation, often of high complexity, highlighting the need for novel data analysis methodologies that will cope with data of this nature. METHODS Stepwise methods using artificial neural networks (ANNs) have been developed to identify an optimal subset of predictive gene transcripts from highly dimensional microarray data. Here these methods have been applied to a gene microarray dataset to identify and validate gene signatures corresponding with estrogen receptor and lymph node status in breast cancer. RESULTS Many gene transcripts were identified whose expression could differentiate patients to very high accuracies based upon firstly whether they were positive or negative for estrogen receptor, and secondly whether metastasis to the axillary lymph node had occurred. A number of these genes had been previously reported to have a role in cancer. Significantly fewer genes were used compared to other previous studies. The models using the optimal gene subsets were internally validated using an extensive random sample cross-validation procedure and externally validated using a follow up dataset from a different cohort of patients on a newer array chip containing the same and additional probe sets. Here, the models retained high accuracies, emphasising the potential power of this approach in analysing complex systems. 
CONCLUSIONS These findings show how the proposed method allows for the rapid analysis and subsequent detailed interrogation of gene expression signatures to provide a further understanding of the underlying molecular mechanisms that could be important in determining novel prognostic markers associated with cancer.
Affiliation(s)
- Lee J Lancashire
- Clinical and Experimental Pharmacology, Paterson Institute for Cancer Research, University of Manchester, Manchester M20 4BX, United Kingdom.
|
26
|
Genus-wide Bacillus species identification through proper artificial neural network experiments on fatty acid profiles. Antonie van Leeuwenhoek 2008; 94:187-98. [DOI: 10.1007/s10482-008-9229-z] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 10/16/2007] [Accepted: 02/12/2008] [Indexed: 10/22/2022]
|
27
|
Matharoo-Ball B, Ball G, Rees R. Clinical proteomics: discovery of cancer biomarkers using mass spectrometry and bioinformatics approaches--a prostate cancer perspective. Vaccine 2008; 25 Suppl 2:B110-21. [PMID: 17916461 DOI: 10.1016/j.vaccine.2007.06.040] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Received: 05/09/2007] [Revised: 06/01/2007] [Accepted: 06/15/2007] [Indexed: 10/24/2022]
Abstract
Prostate cancer (PCa) is an intractable disease, where diagnosis and clinical prediction of the disease course and response to treatment is compromised by the lack of objective and robust biomarker assays. In late stage metastatic disease, treatment options are limited, although it is recognized that some patients may benefit from immunotherapy and in particular vaccine therapy. However, research into biomarkers that correlate with the clinical outcome of immunotherapy has lagged behind vaccine development. Thus, proteomic tools are increasingly being utilized for the discovery of biomarkers which will allow us to make clinical decisions about patient treatment at an earlier stage and should aid in shortening the development time for vaccines. In this review we will summarize the various proteomic platforms used to investigate new biomarkers in PCa for better patient diagnosis, prognosis, patient stratification, treatment monitoring and clinical surrogate endpoints. We will discuss method limitations and highlight the key areas of research required for understanding the etiology of PCa.
Affiliation(s)
- Balwir Matharoo-Ball
- Interdisciplinary Biomedical Research Centre, School of Biomedical and Natural Sciences, Nottingham Trent University, Clifton Lane, Nottingham NG11 8NS, UK
|
28
|
Flikka K, Meukens J, Helsens K, Vandekerckhove J, Eidhammer I, Gevaert K, Martens L. Implementation and application of a versatile clustering tool for tandem mass spectrometry data. Proteomics 2007; 7:3245-58. [PMID: 17708593 DOI: 10.1002/pmic.200700160] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.4] [Indexed: 11/08/2022]
Abstract
High-throughput proteomics experiments typically generate large amounts of peptide fragmentation mass spectra during a single experiment. There is often a substantial amount of redundant fragmentation of the same precursors among these spectra, which is usually considered a nuisance. We here discuss the potential of clustering and merging redundant spectra to turn this redundancy into a useful property of the dataset. To this end, we have created the first general-purpose, freely available open-source software application for clustering and merging MS/MS spectra. The application also introduces a novel approach to calculating the similarity of fragmentation mass spectra that takes into account the increased precision of modern mass spectrometers, and we suggest a simple but effective improvement to single-linkage clustering. The application and the novel algorithms are applied to several real-life proteomic datasets and the results are discussed. An analysis of the influence of the different algorithms available and their parameters is given, as well as a number of important applications of the overall approach.
Affiliation(s)
- Kristian Flikka
- Computational Biology Unit, Bergen Center for Computational Science, University of Bergen, Bergen, Norway.
|
29
|
Matharoo-Ball B, Hughes C, Lancashire L, Tooth D, Ball G, Creaser C, Elgasim M, Rees R, Layfield R, Atiomo W. Characterization of biomarkers in polycystic ovary syndrome (PCOS) using multiple distinct proteomic platforms. J Proteome Res 2007; 6:3321-8. [PMID: 17602513 DOI: 10.1021/pr070124b] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.5] [Indexed: 12/13/2022]
Abstract
A variety of prefractionation methods (including a novel reversed-phase solid-phase extraction (RP-SPE) combined with SDS-PAGE) and proteomic-based approaches (e.g., 2-dimensional gel electrophoresis (2DE) and MALDI-TOF mass spectrometry combined with Artificial Neural Network (ANN) bioinformatic tools) were used to investigate the protein/peptide signatures in patients with Polycystic Ovary Syndrome (PCOS). Four potential PCOS biomarkers were identified (complement C4alpha3c and C4gamma and haptoglobin alpha and beta chains).
Affiliation(s)
- B Matharoo-Ball
- School of Biomedical and Natural Sciences, Nottingham Trent University, Clifton Lane, Nottingham, NG11 8NS, United Kingdom
|
30
|
Abstract
Correlated variables have been shown to confound statistical analyses in microarray experiments. The same effect applies to an even greater degree in proteomics, especially with the use of MS for parallel measurements. Biological effects such as PTM, fragmentation, and multimer formation can produce strongly correlated variables. The problem is compounded in some types of MS by technical effects such as incomplete chromatographic separation, binding to multiple surfaces, or multiple ionizations. Existing methods for dimension reduction, notably principal components analysis and related techniques, are not always satisfactory because they produce data that often lack clear biological interpretation. We propose a preprocessing algorithm that clusters highly correlated features, using the Bayes information criterion to select an optimal number of clusters. Statistical analysis of clusters, instead of individual features, benefits from lower noise, and reduces the difficulties associated with strongly correlated data. This preprocessing increases the statistical power of analyses using false discovery rate on simulated data. Strong correlations are often present in real data, and we find that clustering improves biomarker discovery in clinical SELDI-TOF-MS datasets of plasma from patients with Kawasaki disease, and bone-marrow cell extracts from patients with acute myeloid or acute lymphoblastic leukemia.
Affiliation(s)
- Scott M Carlson
- Biological Engineering Division, Massachusetts Institute of Technology, Cambridge, MA, USA
|
31
|
Fung ET, Weinberger SR, Gavin E, Zhang F. Bioinformatics approaches in clinical proteomics. Expert Rev Proteomics 2007; 2:847-62. [PMID: 16307515 DOI: 10.1586/14789450.2.6.847] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Indexed: 12/27/2022]
Abstract
Protein expression profiling is increasingly being used to discover, validate and characterize biomarkers that can potentially be used for diagnostic purposes and to aid in pharmaceutical development. Correct analysis of data obtained from these experiments requires an understanding of the underlying analytic procedures used to obtain the data, statistical principles underlying high-dimensional data and clinical statistical tools used to determine the utility of the interpreted data. This review summarizes each of these steps, with the goal of providing the nonstatistician proteomics researcher with a working understanding of the various approaches that may be used by statisticians. Emphasis is placed on the process of mining high-dimensional data to identify a specific set of biomarkers that may be used in a diagnostic or other assay setting.
Affiliation(s)
- Eric T Fung
- Ciphergen Biosystems, Inc., 6611 Dumbarton Circle, Fremont, CA 94555, USA.
|
32
|
Tabi Z, Man S. Challenges for cancer vaccine development. Adv Drug Deliv Rev 2006; 58:902-15. [PMID: 16979786 DOI: 10.1016/j.addr.2006.05.004] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Received: 05/01/2006] [Revised: 05/01/2006] [Accepted: 07/10/2006] [Indexed: 11/19/2022]
Abstract
The first generation of human cancer vaccines has been tested in phase III clinical trials, but only a few of these have demonstrated sufficient efficacy to be licensed for clinical use. This article reviews some of the mechanisms that could contribute to these limited clinical responses, and highlights the challenges faced for development of future vaccines.
Affiliation(s)
- Z Tabi
- Department of Oncology and Palliative Medicine, Velindre Hospital, Whitchurch, Cardiff CF14 2TL, UK.
|
33
|
Collins CD, Purohit S, Podolsky RH, Zhao HS, Schatz D, Eckenrode SE, Yang P, Hopkins D, Muir A, Hoffman M, McIndoe RA, Rewers M, She JX. The application of genomic and proteomic technologies in predictive, preventive and personalized medicine. Vascul Pharmacol 2006; 45:258-67. [PMID: 17030152 DOI: 10.1016/j.vph.2006.08.003] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.7] [Received: 08/05/2006] [Revised: 08/05/2006] [Accepted: 08/05/2006] [Indexed: 11/17/2022]
Abstract
The long asymptomatic period before the onset of chronic diseases offers good opportunities for disease prevention. Indeed, many chronic diseases may be preventable by avoiding those factors that trigger the disease process (primary prevention) or by use of therapy that modulates the disease process before the onset of clinical symptoms (secondary prevention). Accurate prediction is vital for disease prevention so that therapy can be given to those individuals who are most likely to develop the disease. The utility of predictive markers is dependent on three parameters, which must be carefully assessed: sensitivity, specificity and positive predictive value. Specificity is important if a biomarker is to be used to identify individuals either for counseling or for preventive therapy. However, a reciprocal relationship exists between sensitivity and specificity. Thus, successful biomarkers will be highly specific without sacrificing sensitivity. Unfortunately, biomarkers with ideal specificity and sensitivity are difficult to find for many diseases. One potential solution is to use the combinatorial power of a large number of biomarkers, each of which alone may not offer satisfactory specificity and sensitivity. Recent technological advances in genetics, genomics, proteomics, and bioinformatics offer a great opportunity for biomarker discovery. The newly identified biomarkers have the potential to bring increased accuracy in disease diagnosis and classification, as well as therapeutic monitoring. In this review, we will use type 1 diabetes (T1D) as an example, when appropriate, to discuss pertinent issues related to high throughput biomarker discovery.
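The three parameters named in this abstract (sensitivity, specificity and positive predictive value) follow from a confusion table, and the positive predictive value also depends on disease prevalence. The sketch below is illustrative only: the function names and all numbers are hypothetical and are not taken from the cited article.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, PPV) from confusion-table counts."""
    sensitivity = tp / (tp + fn)  # fraction of diseased subjects detected
    specificity = tn / (tn + fp)  # fraction of healthy subjects cleared
    ppv = tp / (tp + fp)          # fraction of positive calls that are correct
    return sensitivity, specificity, ppv


def ppv_from_prevalence(sensitivity, specificity, prevalence):
    """Positive predictive value in a population with the given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical balanced validation set: 100 cases and 100 controls.
sens, spec, ppv_balanced = diagnostic_metrics(tp=90, fp=10, tn=90, fn=10)

# The same marker applied where only 1 % of people will develop the disease
# (an assumed prevalence): most positive calls are now false positives.
ppv_screening = ppv_from_prevalence(sens, spec, prevalence=0.01)
```

With these assumed numbers the marker looks strong in the balanced set, yet only about one in twelve positive results would be correct at 1 % prevalence, which is why specificity matters so much for markers intended for counseling or preventive therapy.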
Affiliation(s)
- C D Collins
- Center for Biotechnology and Genomic Medicine, Medical College of Georgia, 1120 15th Street, CA4124, Augusta, GA 30912-2400, United States
|
34
|
Schmid O, Ball G, Lancashire L, Culak R, Shah H. New approaches to identification of bacterial pathogens by surface enhanced laser desorption/ionization time of flight mass spectrometry in concert with artificial neural networks, with special reference to Neisseria gonorrhoeae. J Med Microbiol 2006; 54:1205-1211. [PMID: 16278435 DOI: 10.1099/jmm.0.46223-0] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Indexed: 11/18/2022] Open
Abstract
Surface enhanced laser desorption/ionization-time of flight mass spectrometry (SELDI-TOF MS) has been applied in large numbers of oncological studies but the microbiological field has not been extensively explored to date. This paper describes the application of SELDI-TOF MS in concert with a multi-layer perceptron artificial neural network (ANN) with a back propagation algorithm for the identification of Neisseria gonorrhoeae. N. gonorrhoeae is the aetiological agent of gonorrhoea, the second most common sexually transmitted disease in the UK and USA. Analysis of over 350 strains of N. gonorrhoeae and closely related species by SELDI-TOF MS facilitated the design of an ANN model and revealed 20 ion peak descriptors of positive, negative and secondary nature that were paramount for the identification of the pathogen. The model performed with over 96 % efficiency when based on these 20 ion peak descriptors and exhibited a sensitivity of 95.7 % and a specificity of 97.1 %, with an area under the curve value of 0.996. The technology has the potential to link several ANN models for a comprehensive rapid identification platform for clinically important pathogens.
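The area under the ROC curve quoted in this abstract can be read as the probability that a randomly chosen positive sample receives a higher classifier score than a randomly chosen negative one. A minimal sketch of that calculation follows. The scores are invented for illustration and do not come from the cited study, and `roc_auc` is a hypothetical helper name.

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical network outputs for target and non-target strains.
target_scores = [0.9, 0.8, 0.4]
other_scores = [0.5, 0.3, 0.1]
auc = roc_auc(target_scores, other_scores)
```

The pairwise form is quadratic in the number of samples, which is acceptable for a few hundred strains. A rank-based formula gives the same value more efficiently for larger sets.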
Affiliation(s)
- Oliver Schmid
- Molecular Identification Services Unit, Centre for Infections, Health Protection Agency, London, UK
- The Nottingham Trent University, School of Biomedical and Natural Sciences, Nottingham, UK
- Graham Ball
- Molecular Identification Services Unit, Centre for Infections, Health Protection Agency, London, UK
- The Nottingham Trent University, School of Biomedical and Natural Sciences, Nottingham, UK
- Lee Lancashire
- Molecular Identification Services Unit, Centre for Infections, Health Protection Agency, London, UK
- The Nottingham Trent University, School of Biomedical and Natural Sciences, Nottingham, UK
- Renata Culak
- Molecular Identification Services Unit, Centre for Infections, Health Protection Agency, London, UK
- The Nottingham Trent University, School of Biomedical and Natural Sciences, Nottingham, UK
- Haroun Shah
- Molecular Identification Services Unit, Centre for Infections, Health Protection Agency, London, UK
- The Nottingham Trent University, School of Biomedical and Natural Sciences, Nottingham, UK
|
35
|
Iversen C, Lancashire L, Waddington M, Forsythe S, Ball G. Identification of Enterobacter sakazakii from closely related species: the use of artificial neural networks in the analysis of biochemical and 16S rDNA data. BMC Microbiol 2006; 6:28. [PMID: 16533390 PMCID: PMC1421405 DOI: 10.1186/1471-2180-6-28] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Received: 01/19/2006] [Accepted: 03/13/2006] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Enterobacter sakazakii is an emergent pathogen associated with ingestion of infant formula, and accurate identification is important in both industrial and clinical settings. Bacterial species can be difficult to characterise accurately from complex biochemical datasets, and computer algorithms can potentially simplify the process. RESULTS: Artificial Neural Networks were applied to biochemical and 16S rDNA data derived from 282 strains of Enterobacteriaceae, including 189 E. sakazakii isolates, in order to identify key characteristics which could improve the identification of E. sakazakii. The models developed resulted in a predictive performance for blind (validation) data of 99.3 % correct discrimination between E. sakazakii and closely related species for both phenotypic and genotypic data. Three main regions of the partial rDNA sequence were found to be key in discriminating the species. Comparison between E. sakazakii and other strains also constitutively positive for expression of the enzyme alpha-glucosidase resulted in a predictive performance of 98.7 % for 16S rDNA sequence data and 100 % for phenotypic data. CONCLUSION: The computationally based methods developed here show a remarkable ability to reduce data dimensionality and complexity and to eliminate noise from the system, which should improve the speed and reliability of a potential strain identification system. Furthermore, the approaches described are also able to provide valuable information regarding the population structure and distribution of individual species, thus providing the foundations for novel assays and diagnostic tests for rapid identification of pathogens.
Affiliation(s)
- Carol Iversen
- The Nottingham Trent University, School of Biomedical and Natural Sciences, Clifton Campus, Clifton Lane, Nottingham, NG11 8NS, UK
- Lee Lancashire
- The Nottingham Trent University, School of Biomedical and Natural Sciences, Clifton Campus, Clifton Lane, Nottingham, NG11 8NS, UK
- Loreus Ltd., Erasmus Darwin Building, College of Science and Technology, Nottingham Trent University, Clifton Lane, Nottingham, NG11 8NS, UK
- Stephen Forsythe
- The Nottingham Trent University, School of Biomedical and Natural Sciences, Clifton Campus, Clifton Lane, Nottingham, NG11 8NS, UK
- Graham Ball
- The Nottingham Trent University, School of Biomedical and Natural Sciences, Clifton Campus, Clifton Lane, Nottingham, NG11 8NS, UK
- Loreus Ltd., Erasmus Darwin Building, College of Science and Technology, Nottingham Trent University, Clifton Lane, Nottingham, NG11 8NS, UK
|
36
|
Meija J. Mathematical tools in analytical mass spectrometry. Anal Bioanal Chem 2006; 385:486-99. [PMID: 16514517 DOI: 10.1007/s00216-006-0298-4] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.2] [Received: 08/03/2005] [Revised: 12/14/2005] [Accepted: 01/05/2006] [Indexed: 10/25/2022]
Abstract
Over the last few decades, mass spectrometry has become a powerful tool for exploring various aspects of molecular processes occurring in biological systems. Such exploration is leading to a greater understanding of various complex life processes; unraveling these processes poses the greatest challenge to contemporary bioscience. With due respect to sample preparation, data analysis is rapidly becoming a major obstacle to the conversion of experimental knowledge into valid conclusions. It is interesting to note that many problems related to mass spectrometry can be solved using techniques from computer science, graph theory and discrete mathematics. The aim of this manuscript is to recollect several essays that demonstrate the power and the need to apply such skills to mass spectrometry data interpretation. Special attention is paid to situations where traditional chemical analysis reaches its limits but mathematical reasoning can still allow us to reach valid conclusions.
Affiliation(s)
- Juris Meija
- Department of Chemistry, University of Cincinnati, Cincinnati, OH 45221-0172, USA.
|