1
Ilyas S, Hussain W, Ashraf A, Khan YD, Khan SA, Chou KC. iMethylK_pseAAC: Improving Accuracy of Lysine Methylation Sites Identification by Incorporating Statistical Moments and Position Relative Features into General PseAAC via Chou's 5-steps Rule. Curr Genomics 2019; 20:275-292. [PMID: 32030087 PMCID: PMC6983956 DOI: 10.2174/1389202920666190809095206] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 05/08/2019] [Revised: 07/02/2019] [Accepted: 07/26/2019] [Indexed: 02/04/2023] Open
Abstract
BACKGROUND Methylation is one of the most important post-translational modifications in the human body which usually arises on lysine among the most intensely modified residues. It performs a dynamic role in numerous biological procedures, such as regulation of gene expression, regulation of protein function and RNA processing. Therefore, to identify lysine methylation sites is an important challenge as some experimental procedures are time-consuming. OBJECTIVE Herein, we propose a computational predictor named iMethylK_pseAAC to identify lysine methylation sites. METHODS Firstly, we constructed feature vectors based on PseAAC using position and composition relative features and statistical moments. A neural network is trained based on the extracted features. The performance of the proposed method is then validated using cross-validation and jackknife testing. RESULTS The objective evaluation of the predictor showed accuracy of 96.7% for self-consistency, 91.61% for 10-fold cross-validation and 93.42% for jackknife testing. CONCLUSION It is concluded that iMethylK_pseAAC outperforms the counterparts to identify lysine methylation sites such as iMethyl_pseACC, BPB_pPMS and PMeS.
Affiliation(s)
- Yaser Daanial Khan
- Address correspondence to this author at the Department of Computer Science, School of Systems and Technology, University of Management and Technology, P.O. Box 10033, C-II, Johar Town, Lahore, Pakistan; Tel: +923054440271; E-mail:
2
Georgiadis P, Liampa I, Hebels DG, Krauskopf J, Chatziioannou A, Valavanis I, de Kok TM, Kleinjans JC, Bergdahl IA, Melin B, Spaeth F, Palli D, Vermeulen R, Vlaanderen J, Chadeau-Hyam M, Vineis P, Kyrtopoulos SA. Evolving DNA methylation and gene expression markers of B-cell chronic lymphocytic leukemia are present in pre-diagnostic blood samples more than 10 years prior to diagnosis. BMC Genomics 2017; 18:728. [PMID: 28903739 PMCID: PMC5598006 DOI: 10.1186/s12864-017-4117-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 06/19/2017] [Accepted: 09/05/2017] [Indexed: 12/16/2022] Open
Abstract
BACKGROUND B-cell chronic lymphocytic leukemia (CLL) is a common type of adult leukemia. It often follows an indolent course and is preceded by monoclonal B-cell lymphocytosis, an asymptomatic condition; however, it is not known what causes subjects with this condition to progress to CLL. Hence the discovery of prediagnostic markers has the potential to improve the identification of subjects likely to develop CLL and may also provide insights into the pathogenesis of the disease of potential clinical relevance. RESULTS We employed peripheral blood buffy coats of 347 apparently healthy subjects, of whom 28 were diagnosed with CLL 2.0-15.7 years after enrollment, to derive for the first time genome-wide DNA methylation, as well as gene and miRNA expression, profiles associated with the risk of future disease. After adjustment for white blood cell composition, we identified 722 differentially methylated CpG sites and 15 differentially expressed genes (Bonferroni-corrected p < 0.05) as well as 2 miRNAs (FDR < 0.05) which were associated with the risk of future CLL. The majority of these signals have also been observed in clinical CLL, suggesting the presence in prediagnostic blood of CLL-like cells. Future CLL cases who, at enrollment, had a relatively low B-cell fraction (<10%), and were therefore less likely to have been suffering from undiagnosed CLL or a precursor condition, showed profiles involving smaller numbers of the same differential signals with intensities, after adjusting for B-cell content, generally smaller than those observed in the full set of cases. A similar picture was obtained when the differential profiles of cases with time-to-diagnosis above the overall median period of 7.4 years were compared with those with shorter time-to-disease.
Differentially methylated genes of major functional significance include numerous genes that encode for transcription factors, especially members of the homeobox family, while differentially expressed genes include, among others, multiple genes related to WNT signaling as well as the miRNAs miR-150-5p and miR-155-5p. CONCLUSIONS Our findings demonstrate the presence in prediagnostic blood of future CLL patients, more than 10 years before diagnosis, of CLL-like cells which evolve as preclinical disease progresses, and point to early molecular alterations with a pathogenetic potential.
MESH Headings
- Biomarkers, Tumor/genetics
- DNA Methylation
- Gene Expression Profiling
- Gene Expression Regulation, Neoplastic
- Leukemia, Lymphocytic, Chronic, B-Cell/blood
- Leukemia, Lymphocytic, Chronic, B-Cell/diagnosis
- Leukemia, Lymphocytic, Chronic, B-Cell/genetics
- MicroRNAs/genetics
- Prognosis
- Time Factors
- Humans
Affiliation(s)
- Panagiotis Georgiadis
- Institute of Biology, Medicinal Chemistry and Biotechnology, National Hellenic Research Foundation, 48, Vassileos Constantinou Avenue, 11635 Athens, Greece
- Irene Liampa
- Institute of Biology, Medicinal Chemistry and Biotechnology, National Hellenic Research Foundation, 48, Vassileos Constantinou Avenue, 11635 Athens, Greece
- Dennie G. Hebels
- Department of Toxicogenomics, Maastricht University, 6229 ER Maastricht, Netherlands
- Julian Krauskopf
- Department of Toxicogenomics, Maastricht University, 6229 ER Maastricht, Netherlands
- Aristotelis Chatziioannou
- Institute of Biology, Medicinal Chemistry and Biotechnology, National Hellenic Research Foundation, 48, Vassileos Constantinou Avenue, 11635 Athens, Greece
- Ioannis Valavanis
- Institute of Biology, Medicinal Chemistry and Biotechnology, National Hellenic Research Foundation, 48, Vassileos Constantinou Avenue, 11635 Athens, Greece
- Theo M.C.M. de Kok
- Department of Toxicogenomics, Maastricht University, 6229 ER Maastricht, Netherlands
- Jos C.S. Kleinjans
- Department of Toxicogenomics, Maastricht University, 6229 ER Maastricht, Netherlands
- Ingvar A. Bergdahl
- Department of Biobank Research, and Occupational and Environmental Medicine, Department of Public Health and Clinical Medicine, Umeå University, 901 87 Umeå, Sweden
- Beatrice Melin
- Department of Radiation Sciences, Oncology, Umeå University, 901 87 Umeå, Sweden
- Florentin Spaeth
- Department of Radiation Sciences, Oncology, Umeå University, 901 87 Umeå, Sweden
- Domenico Palli
- The Institute for Cancer Research and Prevention, 50141 Florence, Italy
- R.C.H. Vermeulen
- Institute for Risk Assessment Sciences, Utrecht University, Utrecht, Netherlands
- J. Vlaanderen
- Institute for Risk Assessment Sciences, Utrecht University, Utrecht, Netherlands
- Marc Chadeau-Hyam
- Department of Epidemiology and Biostatistics, MRC-HPA Centre for Environment and Health, School of Public Health, Faculty of Medicine, Imperial College, London W2 1PG, UK
- Paolo Vineis
- Department of Epidemiology and Biostatistics, MRC-HPA Centre for Environment and Health, School of Public Health, Faculty of Medicine, Imperial College, London W2 1PG, UK
- Soterios A. Kyrtopoulos
- Institute of Biology, Medicinal Chemistry and Biotechnology, National Hellenic Research Foundation, 48, Vassileos Constantinou Avenue, 11635 Athens, Greece
4
Omics for prediction of environmental health effects: Blood leukocyte-based cross-omic profiling reliably predicts diseases associated with tobacco smoking. Sci Rep 2016; 6:20544. [PMID: 26837704 PMCID: PMC4738297 DOI: 10.1038/srep20544] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 06/18/2015] [Accepted: 01/06/2016] [Indexed: 01/13/2023] Open
Abstract
The utility of blood-based omic profiles for linking environmental exposures to their potential health effects was evaluated in 649 individuals, drawn from the general population, in relation to tobacco smoking, an exposure with well-characterised health effects. Using disease connectivity analysis, we found that the combination of smoking-modified, genome-wide gene (including miRNA) expression and DNA methylation profiles predicts with remarkable reliability most diseases and conditions independently known to be causally associated with smoking (indicative estimates of sensitivity and positive predictive value 94% and 84%, respectively). Bioinformatics analysis reveals the importance of a small number of smoking-modified, master-regulatory genes and suggests a central role for altered ubiquitination. The smoking-induced gene expression profiles overlap significantly with profiles present in blood cells of patients with lung cancer or coronary heart disease, diseases strongly associated with tobacco smoking. These results provide proof-of-principle support to the suggestion that omic profiling in peripheral blood has the potential of identifying early, disease-related perturbations caused by toxic exposures and may be a useful tool in hazard and risk assessment.
5
Cancer Biomarkers from Genome-Scale DNA Methylation: Comparison of Evolutionary and Semantic Analysis Methods. MICROARRAYS 2015; 4:647-70. [PMID: 27600245 PMCID: PMC4996413 DOI: 10.3390/microarrays4040647] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 08/27/2015] [Revised: 11/09/2015] [Accepted: 11/18/2015] [Indexed: 11/16/2022]
Abstract
DNA methylation profiling exploits microarray technologies, thus yielding a wealth of high-volume data. Here, an intelligent framework is applied, encompassing epidemiological genome-scale DNA methylation data produced from Illumina’s Infinium Human Methylation 450K Bead Chip platform, in an effort to correlate interesting methylation patterns with cancer predisposition and, in particular, breast cancer and B-cell lymphoma. Feature selection and classification are employed in order to select, from an initial set of ~480,000 methylation measurements at CpG sites, predictive cancer epigenetic biomarkers and assess their classification power for discriminating healthy versus cancer-related classes. Feature selection exploits evolutionary algorithms or a graph-theoretic methodology which makes use of the semantic information included in the Gene Ontology (GO) tree. The selected features, corresponding to methylation of CpG sites, attained moderate-to-high classification accuracies when imported to a series of classifiers evaluated by resampling or blindfold validation. The semantics-driven selection revealed sets of CpG sites performing similarly to evolutionary selection in the classification tasks. However, gene enrichment and pathway analysis showed that it additionally provides more descriptive sets of GO terms and KEGG pathways regarding the cancer phenotypes studied here. Results support the expediency of this methodology regarding its application in epidemiological studies.