1
Wei HH, Yang W, Tang H, Lin H. The Development of Machine Learning Methods in Cell-Penetrating Peptides Identification: A Brief Review. Curr Drug Metab 2019; 20:217-223. [DOI: 10.2174/1389200219666181010114750] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 01/19/2018] [Revised: 05/21/2018] [Accepted: 08/02/2018] [Indexed: 11/22/2022]
Abstract
Background: Cell-penetrating peptides (CPPs) are important short peptides that facilitate cellular uptake of various molecules. CPPs can transport drug molecules through the plasma membrane and deliver these molecules to different cellular organelles. Thus, CPP identification and related mechanisms have been extensively explored. In order to reveal the penetration mechanisms of a large number of CPPs, it is necessary to develop convenient and fast methods for CPP identification. Methods: Biochemical experiments can provide precise details for accurately identifying CPPs, but these methods are expensive and laborious. To overcome these disadvantages, several computational methods have been developed to identify CPPs. We performed a review of the development of machine learning methods in CPP identification. This review provides an insight into CPP identification. Results: We summarized the machine learning-based CPP identification methods and compared the construction strategies of 11 different computational methods. Furthermore, we pointed out the limitations and difficulties in predicting CPPs. Conclusion: In this review, the latest studies on CPP identification using machine learning methods were reported. We also discussed the future development direction of CPP recognition with computational methods.
Affiliation(s)
- Huan-Huan Wei
  - Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Wuritu Yang
  - Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Hua Tang
  - Department of Pathophysiology, Southwest Medical University, Luzhou, China
- Hao Lin
  - Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
2
Chen W, Song X, Lin H. Combinatorial Pattern of Histone Modifications in Exon Skipping Event. Front Genet 2019; 10:122. [PMID: 30833963 PMCID: PMC6387913 DOI: 10.3389/fgene.2019.00122] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 12/19/2018] [Accepted: 02/04/2019] [Indexed: 11/18/2022] Open
Abstract
Histone modifications are associated with alternative splicing. It has been suggested that histone modifications act in combinatorial patterns in gene expression regulation. However, how they interact with each other and what their causal relationships are in the process of RNA splicing remain unclear. In this study, the combinatorial patterns of 38 kinds of histone modifications in the exon skipping event of the CD4+ T cell were analyzed by constructing Bayesian networks. Distinct combinatorial patterns of histone modifications that illustrate their causal relationships were observed in excluded/included exons and the surrounding intronic regions. The Bayesian networks also indicate that some histone modifications directly correlate with RNA splicing. We anticipate that this work could provide novel insights into the effects of histone modifications on RNA splicing regulation.
Affiliation(s)
- Wei Chen
  - Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu, China
  - Center for Genomics and Computational Biology, School of Life Sciences, North China University of Science and Technology, Tangshan, China
  - Key Laboratory for Neuro-Information of Ministry of Education, Center for Informational Biology, School of Life Sciences and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Xiaoming Song
  - Center for Genomics and Computational Biology, School of Life Sciences, North China University of Science and Technology, Tangshan, China
- Hao Lin
  - Key Laboratory for Neuro-Information of Ministry of Education, Center for Informational Biology, School of Life Sciences and Technology, University of Electronic Science and Technology of China, Chengdu, China
3
Chen W, Feng P, Ding H, Lin H. Classifying Included and Excluded Exons in Exon Skipping Event Using Histone Modifications. Front Genet 2018; 9:433. [PMID: 30327665 PMCID: PMC6174203 DOI: 10.3389/fgene.2018.00433] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 07/30/2018] [Accepted: 09/12/2018] [Indexed: 12/15/2022] Open
Abstract
Alternative splicing (AS) not only ensures the diversity of gene expression products, but is also closely correlated with genetic diseases. Therefore, knowledge about the regulatory mechanisms of AS will provide useful clues for understanding its biological functions. In the current study, a random forest-based method was developed to classify included and excluded exons in the exon skipping event. In this method, the samples in the dataset were encoded using optimal histone modification features, which were selected with the Maximum Relevance Maximum Distance (MRMD) feature selection technique. The proposed method obtained an accuracy of 72.91% in a 10-fold cross-validation test and outperformed existing methods. Meanwhile, we also systematically analyzed the distribution of histone modifications between included and excluded exons and discovered their preference in both kinds of exons, which might provide insights into research on the regulatory mechanisms of alternative splicing.
Affiliation(s)
- Wei Chen
  - Center for Genomics and Computational Biology, School of Life Science, North China University of Science and Technology, Tangshan, China
  - Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Pengmian Feng
  - School of Public Health, North China University of Science and Technology, Tangshan, China
- Hui Ding
  - Key Laboratory for Neuro-Information of Ministry of Education, Center of Bioinformatics and Center for Information in Biomedicine, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Hao Lin
  - Key Laboratory for Neuro-Information of Ministry of Education, Center of Bioinformatics and Center for Information in Biomedicine, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
4
Du X, Hu C, Yao Y, Sun S, Zhang Y. Analysis and Prediction of Exon Skipping Events from RNA-Seq with Sequence Information Using Rotation Forest. Int J Mol Sci 2017; 18:2691. [PMID: 29231888 PMCID: PMC5751293 DOI: 10.3390/ijms18122691] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 09/20/2017] [Revised: 11/21/2017] [Accepted: 12/08/2017] [Indexed: 12/14/2022] Open
Abstract
In bioinformatics, exon skipping (ES) event prediction is an essential part of alternative splicing (AS) event analysis. Although many methods have been developed to predict ES events, a solution has yet to be found. In this study, given the limitations of machine learning algorithms that use either RNA-Seq data or genome sequences alone, a new feature set, called RS (RNA-Seq and sequence) features, was constructed. These features include RNA-Seq features derived from the RNA-Seq data and sequence features derived from genome sequences. We propose a novel Rotation Forest classifier to predict ES events with the RS features (RotaF-RSES). To validate the efficacy of RotaF-RSES, a dataset from two human tissues was used, and RotaF-RSES achieved an accuracy of 98.4%, a specificity of 99.2%, a sensitivity of 94.1%, and an area under the curve (AUC) of 98.6%. When compared to the other available methods, the results indicate that RotaF-RSES is efficient and can predict ES events with RS features.
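As a reading aid for the metrics quoted in this abstract, the following minimal Python sketch shows how accuracy, sensitivity, and specificity are derived from the four cells of a binary confusion matrix. The class name `ConfusionCounts`, the function `binary_metrics`, and the counts used are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical container)."""
    tp: int  # positives predicted as positive
    fn: int  # positives predicted as negative
    tn: int  # negatives predicted as negative
    fp: int  # negatives predicted as positive


def binary_metrics(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity, and accuracy as fractions in [0, 1]."""
    positives = c.tp + c.fn
    negatives = c.tn + c.fp
    return {
        "sensitivity": c.tp / positives,  # true positive rate
        "specificity": c.tn / negatives,  # true negative rate
        "accuracy": (c.tp + c.tn) / (positives + negatives),
    }


# Illustrative counts only (not from the paper): 100 positives, 500 negatives.
example = ConfusionCounts(tp=80, fn=20, tn=450, fp=50)
metrics = binary_metrics(example)
```

Because accuracy is the prevalence-weighted average of sensitivity and specificity, it always lies between the two, which is consistent with the figures reported above (94.1% ≤ 98.4% ≤ 99.2%).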
Affiliation(s)
- Xiuquan Du
  - Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, Hefei 230601, China.
  - Center of Information Support & Assurance Technology, Anhui University, Hefei 230601, China.
  - School of Computer Science and Technology, Anhui University, Hefei 230601, China.
- Changlin Hu
  - School of Computer Science and Technology, Anhui University, Hefei 230601, China.
- Yu Yao
  - School of Computer Science and Technology, Anhui University, Hefei 230601, China.
- Shiwei Sun
  - School of Computer Science and Technology, Anhui University, Hefei 230601, China.
- Yanping Zhang
  - Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, Hefei 230601, China.
  - Center of Information Support & Assurance Technology, Anhui University, Hefei 230601, China.
  - School of Computer Science and Technology, Anhui University, Hefei 230601, China.
5
Hou Y, Huang H, Hu W, Liu H, Sun X. Histone modifications influence skipped exons inclusion. J Bioinform Comput Biol 2017; 15:1750003. [PMID: 28178888 DOI: 10.1142/s0219720017500032] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 01/01/2023]
Abstract
Alternative splicing (AS), by which individual genes can produce multiple mRNAs, is associated with genomic complexity, disease, and development. Histone modifications play important roles in both transcription initiation and mRNA splicing. Here, we sought the link between AS and histone modifications in flanking regions by analyzing publicly available data in two human cell lines, GM12878 and K562. According to exon inclusion levels, exons were classified into three types: included skipped exons, excluded skipped exons, and expressed constitutive exons. We revealed that the inclusion levels of skipped exons (SEs) were negatively correlated with the enrichment of active histone marks in SEs, indicating a role of histone modifications in AS. We also found that active histone modifications were enriched in the upstream exons of SEs, especially around 5′ splice sites. We inferred that the histone modifications around the 5′ splice sites in the upstream exon of the SEs could help the RNA Polymerase II complex recruit the effector proteins and facilitate AS. The results indicated that nucleosome occupancy had little influence on the inclusion levels of SEs. Finally, we proposed an integrated model that describes how histone modifications affect pre-mRNA splicing.
Affiliation(s)
- Yue Hou
  - State Key Laboratory of Bioelectronics, School of Biology Science and Medical Engineering, Southeast University, Nanjing, P. R. China
- Huan Huang
  - State Key Laboratory of Bioelectronics, School of Biology Science and Medical Engineering, Southeast University, Nanjing, P. R. China
- Wenqiao Hu
  - State Key Laboratory of Bioelectronics, School of Biology Science and Medical Engineering, Southeast University, Nanjing, P. R. China
- Hongde Liu
  - State Key Laboratory of Bioelectronics, School of Biology Science and Medical Engineering, Southeast University, Nanjing, P. R. China
- Xiao Sun
  - State Key Laboratory of Bioelectronics, School of Biology Science and Medical Engineering, Southeast University, Nanjing, P. R. China
6
Chen W, Feng P, Tang H, Ding H, Lin H. Identifying 2'-O-methylationation sites by integrating nucleotide chemical properties and nucleotide compositions. Genomics 2016; 107:255-8. [PMID: 27191866 DOI: 10.1016/j.ygeno.2016.05.003] [Citation(s) in RCA: 43] [Impact Index Per Article: 5.4] [Received: 02/18/2016] [Revised: 05/04/2016] [Accepted: 05/13/2016] [Indexed: 10/21/2022]
Abstract
2'-O-methylation is an important post-transcriptional modification and plays important roles in many biological processes. Although experimental technologies have been proposed to detect 2'-O-methylation sites, they are not cost-effective. As complements to experimental techniques, computational methods will facilitate the identification of 2'-O-methylation sites. In the present study, we proposed a support vector machine-based method to identify 2'-O-methylation sites. In this method, RNA sequences were formulated by nucleotide chemical properties and nucleotide compositions. In the jackknife cross-validation test, the proposed method obtained an accuracy of 95.58% for identifying 2'-O-methylation sites in the human genome. Moreover, the model was also validated by identifying 2'-O-methylation sites in the Mus musculus and Saccharomyces cerevisiae genomes, and the obtained accuracies are also satisfactory. These results indicate that the proposed method will become a useful tool for research on 2'-O-methylation.
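The abstract does not spell out how sequences are "formulated by nucleotide chemical properties and nucleotide compositions". The Python sketch below illustrates one commonly used scheme and should be read as an assumption, not as the authors' exact method: each nucleotide is mapped to three binary flags (ring structure, functional group, hydrogen-bond strength), and a cumulative frequency of that nucleotide up to the current position stands in for the composition term. The table `CHEMICAL_PROPERTIES` and the function `encode_rna` are hypothetical names.

```python
# Assumed property table: (purine?, amino group?, weak hydrogen bonding?).
# This mapping is an illustrative assumption; the abstract does not specify it.
CHEMICAL_PROPERTIES = {
    "A": (1, 1, 1),
    "C": (0, 1, 0),
    "G": (1, 0, 0),
    "U": (0, 0, 1),
}


def encode_rna(sequence: str) -> list:
    """Encode an RNA sequence as four numbers per position.

    Each position contributes its three chemical-property flags followed by
    the cumulative frequency of that nucleotide in the prefix ending there.
    """
    counts = {}
    features = []
    for position, base in enumerate(sequence.upper(), start=1):
        if base not in CHEMICAL_PROPERTIES:
            raise ValueError(f"unsupported nucleotide {base!r} at position {position}")
        counts[base] = counts.get(base, 0) + 1
        density = counts[base] / position  # share of this base in the prefix
        features.extend([*CHEMICAL_PROPERTIES[base], density])
    return features


# A 4-nucleotide toy sequence yields a 16-dimensional feature vector.
vector = encode_rna("ACGA")
```

A vector of this kind could then be passed to any standard classifier, such as the support vector machine mentioned in the abstract; the window length and classifier settings are not given here and are left open.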
Affiliation(s)
- Wei Chen
  - Department of Physics, School of Sciences, and Center for Genomics and Computational Biology, North China University of Science and Technology, Tangshan 063000, China
- Pengmian Feng
  - School of Public Health, North China University of Science and Technology, Tangshan 063000, China
- Hua Tang
  - Department of Pathophysiology, Sichuan Medical University, Luzhou 646000, China
- Hui Ding
  - Key Laboratory for Neuro-Information of Ministry of Education, Center of Bioinformatics and Center for Information in Biomedicine, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Hao Lin
  - Key Laboratory for Neuro-Information of Ministry of Education, Center of Bioinformatics and Center for Information in Biomedicine, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
7
Chen W, Tang H, Lin H. MethyRNA: a web server for identification of N6-methyladenosine sites. J Biomol Struct Dyn 2016; 35:683-687. [DOI: 10.1080/07391102.2016.1157761] [Citation(s) in RCA: 74] [Impact Index Per Article: 9.3] [Indexed: 10/22/2022]
Affiliation(s)
- Wei Chen
  - Department of Physics, School of Sciences, Center for Genomics and Computational Biology, North China University of Science and Technology, Tangshan 063009, China
- Hua Tang
  - Department of Pathophysiology, Sichuan Medical University, Luzhou 646000, China
- Hao Lin
  - Key Laboratory for Neuro-Information of Ministry of Education, Center of Bioinformatics, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
8
Survey of Programs Used to Detect Alternative Splicing Isoforms from Deep Sequencing Data In Silico. Biomed Res Int 2015; 2015:831352. [PMID: 26421304 PMCID: PMC4573434 DOI: 10.1155/2015/831352] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 11/26/2014] [Revised: 02/17/2015] [Accepted: 03/02/2015] [Indexed: 11/29/2022]
Abstract
Next-generation sequencing techniques have been rapidly emerging. However, the massive sequencing reads hide a great deal of unknown important information. Advances have enabled researchers to discover alternative splicing (AS) sites and isoforms using computational approaches instead of molecular experiments. Given the importance of AS for gene expression and protein diversity in eukaryotes, detecting alternative splicing and isoforms represents a hot topic in systems biology and epigenetics research. The computational methods applied to AS prediction have improved since the emergence of next-generation sequencing. In this study, we introduce state-of-the-art research on AS and then compare the research methods and software tools available for AS based on next-generation sequencing reads. Finally, we discuss the prospects of computational methods related to AS.