1
|
MOZART, a QSAR Multi-Target Web-Based Tool to Predict Multiple Drug-Enzyme Interactions. Molecules 2023; 28:molecules28031182. [PMID: 36770857 PMCID: PMC9921108 DOI: 10.3390/molecules28031182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/21/2022] [Revised: 01/09/2023] [Accepted: 01/16/2023] [Indexed: 01/27/2023] Open
Abstract
Developing models able to predict interactions between drugs and enzymes is a primary goal in computational biology since these models may be used for predicting both new active drugs and the interactions between known drugs on untested targets. With the compilation of a large dataset of drug-enzyme pairs (62,524), we recognized a unique opportunity to attempt to build a novel multi-target machine learning (MTML) quantitative structure-activity relationship (QSAR) model for probing interactions among different drugs and enzyme targets. To this end, this paper presents an MTML-QSAR model based on using the features of topological drugs together with the artificial neural network (ANN) multi-layer perceptron (MLP). Validation of the final best model found was carried out by internal cross-validation statistics and other relevant diagnostic statistical parameters. The overall accuracy of the derived model was found to be higher than 96%. Finally, to maximize the diffusion of this model, a public and accessible tool has been developed to allow users to perform their own predictions. The developed web-based tool is public accessible and can be downloaded as free open-source software.
Collapse
|
2
|
Zhang S, Qiao H. KD-KLNMF: Identification of lncRNAs subcellular localization with multiple features and nonnegative matrix factorization. Anal Biochem 2020; 610:113995. [PMID: 33080214 DOI: 10.1016/j.ab.2020.113995] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2020] [Revised: 09/07/2020] [Accepted: 10/12/2020] [Indexed: 12/18/2022]
Abstract
Long non-coding RNAs (lncRNAs) refer to functional RNA molecules with a length more than 200 nucleotides and have minimal or no function to encode proteins. In recent years, more studies show that lncRNAs subcellular localization has valuable clues for their biological functions. So it is count for much to identify lncRNAs subcellular localization. In this paper, a novel statistical model named KD-KLNMF is constructed to predict lncRNAs subcellular localization. Firstly, k-mer and dinucleotide-based spatial autocorrelation are incorporated as the feature vector. Then, Synthetic Minority Over-sampling Technique is used to deal with the imbalance dataset. Next, Kullback-Leibler divergence-based nonnegative matrix factorization is applied to select optimal features. And then we utilize support vector machine as the classifier after comparing with other classifiers. Finally, the jackknife test is performed to evaluate the model. The overall accuracies reach 97.24% and 92.86% on training dataset and independent dataset, respectively. The results are better than the previous methods, which indicate that our model will be a useful and feasible tool to identify lncRNAs subcellular localization. The datasets and source code are freely available at https://github.com/HuijuanQiao/KD-KLNMF.
Collapse
Affiliation(s)
- Shengli Zhang
- School of Mathematics and Statistics, Xidian University, Xi'an, 710071, PR China.
| | - Huijuan Qiao
- School of Mathematics and Statistics, Xidian University, Xi'an, 710071, PR China
| |
Collapse
|
3
|
|
4
|
Zhou GP, Liao SM, Chen D, Huang RB. The Cooperative Effect between Polybasic Region (PBR) and Polysialyltransferase Domain (PSTD) within Tumor-Target Polysialyltranseferase ST8Sia II. Curr Top Med Chem 2020; 19:2831-2841. [PMID: 31755393 DOI: 10.2174/1568026619666191121145924] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/24/2019] [Revised: 10/16/2019] [Accepted: 10/20/2019] [Indexed: 12/29/2022]
Abstract
ST8Sia II (STX) is a highly homologous mammalian polysialyltransferase (polyST), which is a validated tumor-target in the treatment of cancer metastasis reliant on tumor cell polysialylation. PolyST catalyzes the synthesis of α2,8-polysialic acid (polySia) glycans by carrying out the activated CMP-Neu5Ac (Sia) to N- and O-linked oligosaccharide chains on acceptor glycoproteins. In this review article, we summarized the recent studies about intrinsic correlation of two polybasic domains, Polysialyltransferase domain (PSTD) and Polybasic region (PBR) within ST8Sia II molecule, and suggested that the critical amino acid residues within the PSTD and PBR motifs of ST8Sia II for polysialylation of Neural cell adhesion molecules (NCAM) are related to ST8Sia II activity. In addition, the conformational changes of the PSTD domain due to point mutations in the PBR or PSTD domain verified an intramolecular interaction between the PBR and the PSTD. These findings have been incorporated into Zhou's NCAM polysialylation/cell migration model, which will provide new perspectives on drug research and development related to the tumor-target ST8Sia II.
Collapse
Affiliation(s)
- Guo-Ping Zhou
- National Engineering Research Center for Non-Food Biorefinery, State Key Laboratory of Non-Food Biomass and Enzyme Technology, Guangxi Key Laboratory of Bio-refinery, Guangxi Academy of Sciences, 98 Daling Road, Nanning, 530007, China.,Gordon Life Science Institute, NC 27804, United States
| | - Si-Ming Liao
- National Engineering Research Center for Non-Food Biorefinery, State Key Laboratory of Non-Food Biomass and Enzyme Technology, Guangxi Key Laboratory of Bio-refinery, Guangxi Academy of Sciences, 98 Daling Road, Nanning, 530007, China
| | - Dong Chen
- National Engineering Research Center for Non-Food Biorefinery, State Key Laboratory of Non-Food Biomass and Enzyme Technology, Guangxi Key Laboratory of Bio-refinery, Guangxi Academy of Sciences, 98 Daling Road, Nanning, 530007, China
| | - Ri-Bo Huang
- National Engineering Research Center for Non-Food Biorefinery, State Key Laboratory of Non-Food Biomass and Enzyme Technology, Guangxi Key Laboratory of Bio-refinery, Guangxi Academy of Sciences, 98 Daling Road, Nanning, 530007, China
| |
Collapse
|
5
|
Lu B, Liu XH, Liao SM, Lu ZL, Chen D, Troy Ii FA, Huang RB, Zhou GP. A Possible Modulation Mechanism of Intramolecular and Intermolecular Interactions for NCAM Polysialylation and Cell Migration. Curr Top Med Chem 2019; 19:2271-2282. [PMID: 31648641 DOI: 10.2174/1568026619666191018094805] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2019] [Revised: 08/01/2019] [Accepted: 08/06/2019] [Indexed: 12/31/2022]
Abstract
Polysialic acid (polySia) is a novel glycan that posttranslationally modifies neural cell adhesion molecules (NCAMs) in mammalian cells. Up-regulation of polySia-NCAM expression or NCAM polysialylation is associated with tumor cell migration and progression in many metastatic cancers and neurocognition. It has been known that two highly homologous mammalian polysialyltransferases (polySTs), ST8Sia II (STX) and ST8Sia IV (PST), can catalyze polysialylation of NCAM, and two polybasic domains, polybasic region (PBR) and polysialyltransferase domain (PSTD) in polySTs play key roles in affecting polyST activity or NCAM polysialylation. However, the molecular mechanisms of NCAM polysialylation and cell migration are still not entirely clear. In this minireview, the recent research results about the intermolecular interactions between the PBR and NCAM, the PSTD and cytidine monophosphate-sialic acid (CMP-Sia), the PSTD and polySia, and as well as the intramolecular interaction between the PBR and the PSTD within the polyST, are summarized. Based on these cooperative interactions, we have built a novel model of NCAM polysialylation and cell migration mechanisms, which may be helpful to design and develop new polysialyltransferase inhibitors.
Collapse
Affiliation(s)
- Bo Lu
- The National Engineering Research Center for Non-Food Biorefinery, Guangxi Academy of Sciences, Nanning, Guangxi 530007, China
| | - Xue-Hui Liu
- Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
| | - Si-Ming Liao
- The National Engineering Research Center for Non-Food Biorefinery, Guangxi Academy of Sciences, Nanning, Guangxi 530007, China
| | - Zhi-Long Lu
- The National Engineering Research Center for Non-Food Biorefinery, Guangxi Academy of Sciences, Nanning, Guangxi 530007, China
| | - Dong Chen
- The National Engineering Research Center for Non-Food Biorefinery, Guangxi Academy of Sciences, Nanning, Guangxi 530007, China
| | - Frederic A Troy Ii
- Department of Biochemistry and Molecular Medicine, University of California School of Medicine, Davis, CA, 95817, United States
| | - Ri-Bo Huang
- The National Engineering Research Center for Non-Food Biorefinery, Guangxi Academy of Sciences, Nanning, Guangxi 530007, China.,Life Science and Biotechnology College, Guangxi University, Nanning, Guangxi 530004, China
| | - Guo-Ping Zhou
- The National Engineering Research Center for Non-Food Biorefinery, Guangxi Academy of Sciences, Nanning, Guangxi 530007, China
| |
Collapse
|
6
|
Guo YH, Kuruganti R, Gao Y. Recent Advances in Ginsenosides as Potential Therapeutics Against Breast Cancer. Curr Top Med Chem 2019; 19:2334-2347. [DOI: 10.2174/1568026619666191018100848] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/11/2018] [Revised: 05/10/2019] [Accepted: 08/16/2019] [Indexed: 12/14/2022]
Abstract
The dried root of ginseng (Panax ginseng C. A. Meyer or Panax quinquefolius L.) is a traditional
Chinese medicine widely used to manage cancer symptoms and chemotherapy side effects in
Asia. The anti-cancer efficacy of ginseng is attributed mainly to the presence of saponins, which are
commonly known as ginsenosides. Ginsenosides were first identified as key active ingredients in Panax
ginseng and subsequently found in Panax quinquefolius, both of the same genus. To review the recent
advances on anti-cancer effects of ginsenosides against breast cancer, we conducted a literature study of
scientific articles published from 2010 through 2018 to date by searching the major databases including
Pubmed, SciFinder, Science Direct, Springer, Google Scholar, and CNKI. A total of 50 articles authored
in either English or Chinese related to the anti-breast cancer activity of ginsenosides have been
reviewed, and the in vitro, in vivo, and clinical studies on ginsenosides are summarized. This review focuses
on how ginsenosides exert their anti-breast cancer activities through various mechanisms of action
such as modulation of cell growth, modulation of the cell cycle, modulation of cell death, inhibition of
angiogenesis, inhibition of metastasis, inhibition of multidrug resistance, and cancer immunemodulation.
In summary, recent advances in the evaluation of ginsenosides as therapeutic agents against
breast cancer support further pre-clinical and clinical studies to treat primary and metastatic breast tumors.
Collapse
Affiliation(s)
- Yu-hang Guo
- International Ginseng Institute, School of Agriculture, Middle Tennessee State University, Murfreesboro, TN 37132, United States
| | - Revathimadhubala Kuruganti
- International Ginseng Institute, School of Agriculture, Middle Tennessee State University, Murfreesboro, TN 37132, United States
| | - Ying Gao
- International Ginseng Institute, School of Agriculture, Middle Tennessee State University, Murfreesboro, TN 37132, United States
| |
Collapse
|
7
|
Chou KC. Impacts of Pseudo Amino Acid Components and 5-steps Rule to Proteomics and Proteome Analysis. Curr Top Med Chem 2019; 19:2283-2300. [DOI: 10.2174/1568026619666191018100141] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/07/2019] [Revised: 08/18/2019] [Accepted: 08/26/2019] [Indexed: 01/27/2023]
Abstract
Stimulated by the 5-steps rule during the last decade or so, computational proteomics has achieved remarkable progresses in the following three areas: (1) protein structural class prediction; (2) protein subcellular location prediction; (3) post-translational modification (PTM) site prediction. The results obtained by these predictions are very useful not only for an in-depth study of the functions of proteins and their biological processes in a cell, but also for developing novel drugs against major diseases such as cancers, Alzheimer’s, and Parkinson’s. Moreover, since the targets to be predicted may have the multi-label feature, two sets of metrics are introduced: one is for inspecting the global prediction quality, while the other for the local prediction quality. All the predictors covered in this review have a userfriendly web-server, through which the majority of experimental scientists can easily obtain their desired data without the need to go through the complicated mathematics.
Collapse
Affiliation(s)
- Kuo-Chen Chou
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 610054, China
| |
Collapse
|
8
|
Chou KC. Advances in Predicting Subcellular Localization of Multi-label Proteins and its Implication for Developing Multi-target Drugs. Curr Med Chem 2019; 26:4918-4943. [PMID: 31060481 DOI: 10.2174/0929867326666190507082559] [Citation(s) in RCA: 78] [Impact Index Per Article: 13.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/28/2018] [Revised: 01/29/2019] [Accepted: 01/31/2019] [Indexed: 12/16/2022]
Abstract
The smallest unit of life is a cell, which contains numerous protein molecules. Most
of the functions critical to the cell’s survival are performed by these proteins located in its different
organelles, usually called ‘‘subcellular locations”. Information of subcellular localization
for a protein can provide useful clues about its function. To reveal the intricate pathways at the
cellular level, knowledge of the subcellular localization of proteins in a cell is prerequisite.
Therefore, one of the fundamental goals in molecular cell biology and proteomics is to determine
the subcellular locations of proteins in an entire cell. It is also indispensable for prioritizing
and selecting the right targets for drug development. Unfortunately, it is both timeconsuming
and costly to determine the subcellular locations of proteins purely based on experiments.
With the avalanche of protein sequences generated in the post-genomic age, it is highly
desired to develop computational methods for rapidly and effectively identifying the subcellular
locations of uncharacterized proteins based on their sequences information alone. Actually,
considerable progresses have been achieved in this regard. This review is focused on those
methods, which have the capacity to deal with multi-label proteins that may simultaneously
exist in two or more subcellular location sites. Protein molecules with this kind of characteristic
are vitally important for finding multi-target drugs, a current hot trend in drug development.
Focused in this review are also those methods that have use-friendly web-servers established so
that the majority of experimental scientists can use them to get the desired results without the
need to go through the detailed mathematics involved.
Collapse
Affiliation(s)
- Kuo-Chen Chou
- Gordon Life Science Institute, Boston, MA 02478, United States
| |
Collapse
|
9
|
Abstract
The smallest unit of life is a cell, which contains numerous protein molecules. Most
of the functions critical to the cell’s survival are performed by these proteins located in its different
organelles, usually called ‘‘subcellular locations”. Information of subcellular localization
for a protein can provide useful clues about its function. To reveal the intricate pathways at the
cellular level, knowledge of the subcellular localization of proteins in a cell is prerequisite.
Therefore, one of the fundamental goals in molecular cell biology and proteomics is to determine
the subcellular locations of proteins in an entire cell. It is also indispensable for prioritizing
and selecting the right targets for drug development. Unfortunately, it is both timeconsuming
and costly to determine the subcellular locations of proteins purely based on experiments.
With the avalanche of protein sequences generated in the post-genomic age, it is highly
desired to develop computational methods for rapidly and effectively identifying the subcellular
locations of uncharacterized proteins based on their sequences information alone. Actually,
considerable progresses have been achieved in this regard. This review is focused on those
methods, which have the capacity to deal with multi-label proteins that may simultaneously
exist in two or more subcellular location sites. Protein molecules with this kind of characteristic
are vitally important for finding multi-target drugs, a current hot trend in drug development.
Focused in this review are also those methods that have use-friendly web-servers established so
that the majority of experimental scientists can use them to get the desired results without the
need to go through the detailed mathematics involved.
Collapse
Affiliation(s)
- Kuo-Chen Chou
- Gordon Life Science Institute, Boston, MA 02478, United States
| |
Collapse
|
10
|
|
11
|
Liao SM, Shen NK, Liang G, Lu B, Lu ZL, Peng LX, Zhou F, Du LQ, Wei YT, Zhou GP, Huang RB. Inhibition of α-amylase Activity by Zn2+: Insights from Spectroscopy and Molecular Dynamics Simulations. Med Chem 2019; 15:510-520. [DOI: 10.2174/1573406415666181217114101] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/20/2018] [Revised: 10/23/2018] [Accepted: 12/12/2018] [Indexed: 02/08/2023]
Abstract
Background:Inhibition of α-amylase activity is an important strategy in the treatment of diabetes mellitus. An important treatment for diabetes mellitus is to reduce the digestion of carbohydrates and blood glucose concentrations. Inhibiting the activity of carbohydrate-degrading enzymes such as α-amylase and glucosidase significantly decreases the blood glucose level. Most inhibitors of α-amylase have serious adverse effects, and the α-amylase inactivation mechanisms for the design of safer inhibitors are yet to be revealed.Objective:In this study, we focused on the inhibitory effect of Zn2+ on the structure and dynamic characteristics of α-amylase from Anoxybacillus sp. GXS-BL (AGXA), which shares the same catalytic residues and similar structures as human pancreatic and salivary α-amylase (HPA and HSA, respectively).Methods:Circular dichroism (CD) spectra of the protein (AGXA) in the absence and presence of Zn2+ were recorded on a Chirascan instrument. The content of different secondary structures of AGXA in the absence and presence of Zn2+ was analyzed using the online SELCON3 program. An AGXA amino acid sequence similarity search was performed on the BLAST online server to find the most similar protein sequence to use as a template for homology modeling. The pocket volume measurer (POVME) program 3.0 was applied to calculate the active site pocket shape and volume, and molecular dynamics simulations were performed with the Amber14 software package.Results:According to circular dichroism experiments, upon Zn2+ binding, the protein secondary structure changed obviously, with the α-helix content decreasing and β-sheet, β-turn and randomcoil content increasing. The structural model of AGXA showed that His217 was near the active site pocket and that Phe178 was at the outer rim of the pocket. Based on the molecular dynamics trajectories, in the free AGXA model, the dihedral angle of C-CA-CB-CG displayed both acute and planar orientations, which corresponded to the open and closed states of the active site pocket, respectively. In the AGXA-Zn model, the dihedral angle of C-CA-CB-CG only showed the planar orientation. As Zn2+ was introduced, the metal center formed a coordination interaction with H217, a cation-π interaction with W244, a coordination interaction with E242 and a cation-π interaction with F178, which prevented F178 from easily rotating to the open state and inhibited the activity of the enzyme.Conclusion:This research may have uncovered a subtle mechanism for inhibiting the activity of α-amylase with transition metal ions, and this finding will help to design more potent and specific inhibitors of α-amylases.
Collapse
Affiliation(s)
- Si-Ming Liao
- Department of Bioengineering, College of Life Science and Technology, Guangxi University, Nanning, Guangxi, 530004, China
| | - Nai-Kun Shen
- School of Marine Sciences and Biotechnology, Guangxi University for Nationalities, Nanning, Guangxi, 530008, China
| | - Ge Liang
- State Key Laboratory of Non-Food Biomass and Enzyme Technology, Guangxi Academy of Sciences, Nanning, Guangxi, 530007, China
| | - Bo Lu
- State Key Laboratory of Non-Food Biomass and Enzyme Technology, Guangxi Academy of Sciences, Nanning, Guangxi, 530007, China
| | - Zhi-Long Lu
- Department of Bioengineering, College of Life Science and Technology, Guangxi University, Nanning, Guangxi, 530004, China
| | - Li-Xin Peng
- Department of Bioengineering, College of Life Science and Technology, Guangxi University, Nanning, Guangxi, 530004, China
| | - Feng Zhou
- State Key Laboratory of Non-Food Biomass and Enzyme Technology, Guangxi Academy of Sciences, Nanning, Guangxi, 530007, China
| | - Li-Qin Du
- Department of Bioengineering, College of Life Science and Technology, Guangxi University, Nanning, Guangxi, 530004, China
| | - Yu-Tuo Wei
- Department of Bioengineering, College of Life Science and Technology, Guangxi University, Nanning, Guangxi, 530004, China
| | - Guo-Ping Zhou
- State Key Laboratory of Non-Food Biomass and Enzyme Technology, Guangxi Academy of Sciences, Nanning, Guangxi, 530007, China
| | - Ri-Bo Huang
- Department of Bioengineering, College of Life Science and Technology, Guangxi University, Nanning, Guangxi, 530004, China
| |
Collapse
|
12
|
Han Q, Yang C, Lu J, Zhang Y, Li J. Metabolism of Oxalate in Humans: A Potential Role Kynurenine Aminotransferase/Glutamine Transaminase/Cysteine Conjugate Beta-lyase Plays in Hyperoxaluria. Curr Med Chem 2019; 26:4944-4963. [PMID: 30907303 DOI: 10.2174/0929867326666190325095223] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/28/2018] [Revised: 02/17/2019] [Accepted: 02/22/2019] [Indexed: 11/22/2022]
Abstract
Hyperoxaluria, excessive urinary oxalate excretion, is a significant health problem worldwide. Disrupted oxalate metabolism has been implicated in hyperoxaluria and accordingly, an enzymatic disturbance in oxalate biosynthesis can result in the primary hyperoxaluria. Alanine glyoxylate aminotransferase-1 and glyoxylate reductase, the enzymes involving glyoxylate (precursor for oxalate) metabolism, have been related to primary hyperoxalurias. Some studies suggest that other enzymes such as glycolate oxidase and alanine glyoxylate aminotransferase-2 might be associated with primary hyperoxaluria as well, but evidence of a definitive link is not strong between the clinical cases and gene mutations. There are still some idiopathic hyperoxalurias, which require a further study for the etiologies. Some aminotransferases, particularly kynurenine aminotransferases, can convert glyoxylate to glycine. Based on biochemical and structural characteristics, expression level, subcellular localization of some aminotransferases, a number of them appear able to catalyze the transamination of glyoxylate to glycine more efficiently than alanine glyoxylate aminotransferase-1. The aim of this minireview is to explore other undermining causes of primary hyperoxaluria and stimulate research toward achieving a comprehensive understanding of underlying mechanisms leading to the disease. Herein, we reviewed all aminotransferases in the liver for their functions in glyoxylate metabolism. Particularly, kynurenine aminotransferase-I and III were carefully discussed regarding their biochemical and structural characteristics, cellular localization, and enzyme inhibition. Kynurenine aminotransferase-III is, so far, the most efficient putative mitochondrial enzyme to transaminate glyoxylate to glycine in mammalian livers, might be an interesting enzyme to look over in hyperoxaluria etiology of primary hyperoxaluria and should be carefully investigated for its involvement in oxalate metabolism.
Collapse
Affiliation(s)
- Qian Han
- Key Laboratory of Tropical Biological Resources of Ministry of Education, Hainan University, Haikou, Hainan 570228. China
| | - Cihan Yang
- Key Laboratory of Tropical Biological Resources of Ministry of Education, Hainan University, Haikou, Hainan 570228. China
| | - Jun Lu
- Central South University Xiangya School of Medicine Affiliated Haikou People's Hospital, Haikou, Hainan 570208. China
| | - Yinai Zhang
- Central South University Xiangya School of Medicine Affiliated Haikou People's Hospital, Haikou, Hainan 570208. China
| | - Jianyong Li
- Department of Biochemistry, Virginia Tech, Blacksburg, VA 24061. United States
| |
Collapse
|
13
|
Shi JY, Li JX, Mao KT, Cao JB, Lei P, Lu HM, Yiu SM. Predicting combinative drug pairs via multiple classifier system with positive samples only. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2019; 168:1-10. [PMID: 30527128 DOI: 10.1016/j.cmpb.2018.11.002] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/01/2018] [Revised: 10/24/2018] [Accepted: 11/12/2018] [Indexed: 06/09/2023]
Abstract
BACKGROUND AND OBJECTIVE Due to the synergistic effects of drugs, drug combination is one of the effective approaches for treating complex diseases. However, the identification of drug combinations by dose-response methods is still costly. It is promising to develop supervised learning-based approaches to predict potential drug combinations on a large scale. Nevertheless, these approaches have the inadequate utilization of heterogeneous features, which causes the loss of information useful to classification. Moreover, they have an intrinsic bias, because they assume unknown drug pairs as non-combinations, of which some could be real drug combinations in practice. METHODS To address above issues, this work first designs a two-layer multiple classifier system (TLMCS) to effectively integrate heterogeneous features involving anatomical therapeutic chemical codes of drugs, drug-drug interactions, drug-target interactions, gene ontology of drug targets, and side effects. To avoid the bias caused by labelling unknown samples as negative, it then utilizes the one-class support vector machines, (which requires no negative instance and only labels approved drug combinations as positive instances), as the member classifiers in TLMCS. Last, both a 10-fold cross validation (10-CV) and a novel prediction are performed to validate the performance of TLMCS. RESULTS The comparison with three state-of-the-art approaches under 10-CV exhibits the superiority of TLMCS, which achieves the area under the receiver operating characteristic curve = 0.824 and the area under the precision-recall curve = 0.372. Moreover, the experiment under the novel prediction demonstrates its ability, where 9 out of the top-20 predicted combinative drug pairs are validated by checking the published literature. Furthermore, for each of the newly-validated drug combinations, this work analyses the combining mode of the member drugs and investigates their relationship in terms of drug targeting pathways. CONCLUSIONS The proposed TLMCS provides an effective framework to integrate those heterogeneous features and is trained by only positive samples such that the bias of taking unknown drug pairs as negative samples can be avoided. Furthermore, its results in the novel prediction reveal five types of drug combinations and three types of drug relationships in terms of pathways.
Collapse
Affiliation(s)
- Jian-Yu Shi
- School of Life Science, Northwestern Polytechnical University, China.
| | - Jia-Xin Li
- School of Life Science, Northwestern Polytechnical University, China.
| | - Kui-Tao Mao
- School of Computer Science, Northwestern Polytechnical University, China.
| | - Jiang-Bo Cao
- School of Life Science, Northwestern Polytechnical University, China.
| | - Peng Lei
- Department of Chinese Medicine, Shaanxi Provincial People's Hospital, China.
| | - Hui-Meng Lu
- School of Life Science, Northwestern Polytechnical University, China.
| | - Siu-Ming Yiu
- Department of Computer Science, The University of Hong Kong, Hong Kong, China.
| |
Collapse
|
14
|
Jia J, Li X, Qiu W, Xiao X, Chou KC. iPPI-PseAAC(CGR): Identify protein-protein interactions by incorporating chaos game representation into PseAAC. J Theor Biol 2019; 460:195-203. [DOI: 10.1016/j.jtbi.2018.10.021] [Citation(s) in RCA: 78] [Impact Index Per Article: 13.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/05/2018] [Revised: 09/16/2018] [Accepted: 10/08/2018] [Indexed: 01/11/2023]
|
15
|
Chen W, Liang X, Nong Z, Li Y, Pan X, Chen C, Huang L. The Multiple Applications and Possible Mechanisms of the Hyperbaric Oxygenation Therapy. Med Chem 2018; 15:459-471. [PMID: 30569869 DOI: 10.2174/1573406415666181219101328] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/21/2018] [Revised: 10/23/2018] [Accepted: 12/12/2018] [Indexed: 12/18/2022]
Abstract
Hyperbaric Oxygenation Therapy (HBOT) is used as an adjunctive method for multiple diseases. The method meets the routine treating and is non-invasive, as well as provides 100% pure oxygen (O2), which is at above-normal atmospheric pressure in a specialized chamber. It is well known that in the condition of O2 deficiency, it will induce a series of adverse events. In order to prevent the injury induced by anoxia, the capability of offering pressurized O2 by HBOT seems involuntary and significant. In recent years, HBOT displays particular therapeutic efficacy in some degree, and it is thought to be beneficial to the conditions of angiogenesis, tissue ischemia and hypoxia, nerve system disease, diabetic complications, malignancies, Carbon monoxide (CO) poisoning and chronic radiation-induced injury. Single and combination HBOT are both applied in previous studies, and the manuscript is to review the current applications and possible mechanisms of HBOT. The applicability and validity of HBOT for clinical treatment remain controversial, even though it is regarded as an adjunct to conventional medical treatment with many other clinical benefits. There also exists a negative side effect of accepting pressurized O2, such as oxidative stress injury, DNA damage, cellular metabolic, activating of coagulation, endothelial dysfunction, acute neurotoxicity and pulmonary toxicity. Then it is imperative to comprehensively consider the advantages and disadvantages of HBOT in order to obtain a satisfying therapeutic outcome.
Collapse
Affiliation(s)
- Wan Chen
- Department of Emergency, the People's Hospital of Guangxi Zhuang Autonomous Region, Nanning, Guangxi 530021, China
| | - Xingmei Liang
- Department of Pharmacy, Guangxi Medical College, Nanning, Guangxi 530021, China
| | - Zhihuan Nong
- Department of Pharmacology, Guangxi Institute of Chinese Medicine and Pharmaceutical Science, Nanning 530022, China
| | - Yaoxuan Li
- Department of Neurology, the People's Hospital of Guangxi Zhuang Autonomous Region, Nanning 530022, China
| | - Xiaorong Pan
- Department of Hyperbaric oxygen, the People's Hospital of Guangxi Zhuang Autonomous Region, Nanning, Guangxi 530021, China
| | - Chunxia Chen
- Department of Hyperbaric oxygen, the People's Hospital of Guangxi Zhuang Autonomous Region, Nanning, Guangxi 530021, China
| | - Luying Huang
- Department of Respiratory Medicine, the People's Hospital of Guangxi Zhuang Autonomous Region, Nanning, Guangxi 530021, China
| |
Collapse
|
16
|
Xiao X, Xu ZC, Qiu WR, Wang P, Ge HT, Chou KC. iPSW(2L)-PseKNC: A two-layer predictor for identifying promoters and their strength by hybrid features via pseudo K-tuple nucleotide composition. Genomics 2018; 111:1785-1793. [PMID: 30529532 DOI: 10.1016/j.ygeno.2018.12.001] [Citation(s) in RCA: 44] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/31/2018] [Revised: 11/20/2018] [Accepted: 12/04/2018] [Indexed: 12/20/2022]
Abstract
The promoter is a regulatory DNA region about 81-1000 base pairs long, usually located near the transcription start site (TSS) along upstream of a given gene. By combining a certain protein called transcription factor, the promoter provides the starting point for regulated gene transcription, and hence plays a vitally important role in gene transcriptional regulation. With explosive growth of DNA sequences in the post-genomic age, it has become an urgent challenge to develop computational method for effectively identifying promoters because the information thus obtained is very useful for both basic research and drug development. Although some prediction methods were developed in this regard, most of them were limited at merely identifying whether a query DNA sequence being of a promoter or not. However, based on their strength-distinct levels for transcriptional activation and expression, promoter should be divided into two categories: strong and weak types. Here a new two-layer predictor, called "iPSW(2L)-PseKNC", was developed by fusing the physicochemical properties of nucleotides and their nucleotide density into PseKNC (pseudo K-tuple nucleotide composition). Its 1st-layer serves to predict whether a query DNA sequence sample is of promoter or not, while its 2nd-layer is able to predict the strength of promoters. It has been observed through rigorous cross-validations that the 1st-layer sub-predictor is remarkably superior to the existing state-of-the-art predictors in identifying the promoters and non-promoters, and that the 2nd-layer sub-predictor can do what is beyond the reach of the existing predictors. Moreover, the web-server for iPSW(2L)-PseKNC has been established at http://www.jci-bioinfo.cn/iPSW(2L)-PseKNC, by which the majority of experimental scientists can easily get the results they need.
Collapse
Affiliation(s)
- Xuan Xiao
- Computer Department, Jingdezhen Ceramic Institute, Jingdezhen, China; The Gordon Life Science Institute, Boston, MA 02478, USA.
| | - Zhao-Chun Xu
- Computer Department, Jingdezhen Ceramic Institute, Jingdezhen, China.
| | - Wang-Ren Qiu
- Computer Department, Jingdezhen Ceramic Institute, Jingdezhen, China; The Gordon Life Science Institute, Boston, MA 02478, USA
| | - Peng Wang
- Computer Department, Jingdezhen Ceramic Institute, Jingdezhen, China
| | - Hui-Ting Ge
- Computer Department, Jingdezhen Ceramic Institute, Jingdezhen, China
| | - Kuo-Chen Chou
- The Gordon Life Science Institute, Boston, MA 02478, USA; Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
| |
Collapse
|
17
|
Chen W, Ding H, Zhou X, Lin H, Chou KC. iRNA(m6A)-PseDNC: Identifying N 6-methyladenosine sites using pseudo dinucleotide composition. Anal Biochem 2018; 561-562:59-65. [PMID: 30201554 DOI: 10.1016/j.ab.2018.09.002] [Citation(s) in RCA: 130] [Impact Index Per Article: 18.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/26/2018] [Revised: 08/31/2018] [Accepted: 09/03/2018] [Indexed: 01/28/2023]
Abstract
As a prevalent post-transcriptional modification, N6-methyladenosine (m6A) plays key roles in a series of biological processes. Although experimental technologies have been developed and applied to identify m6A sites, they are still cost-ineffective for transcriptome-wide detections of m6A. As good complements to the experimental techniques, some computational methods have been proposed to identify m6A sites. However, their performance remains unsatisfactory. In this study, we firstly proposed an Euclidean distance based method to construct a high quality benchmark dataset. By encoding the RNA sequences using pseudo nucleotide composition, a new predictor called iRNA(m6A)-PseDNC was developed to identify m6A sites in the Saccharomyces cerevisiae genome. It has been demonstrated by the 10-fold cross validation test that the performance of iRNA(m6A)-PseDNC is superior to the existing methods. Meanwhile, for the convenience of most experimental scientists, established at the site http://lin-group.cn/server/iRNA(m6A)-PseDNC.php is its web-server, by which users can easily get their desired results without need to go through the detailed mathematics. It is anticipated that iRNA(m6A)-PseDNC will become a useful high throughput tool for identifying m6A sites in the S. cerevisiae genome.
Collapse
Affiliation(s)
- Wei Chen
- School of Sciences, Center for Genomics and Computational Biology, North China University of Science and Technology, Tangshan, 063000, China; Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu, 611730, China; Gordon Life Science Institute, Boston, MA, 02478, USA.
| | - Hui Ding
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 610054, China.
| | - Xu Zhou
- School of Sciences, Center for Genomics and Computational Biology, North China University of Science and Technology, Tangshan, 063000, China.
| | - Hao Lin
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 610054, China; Gordon Life Science Institute, Boston, MA, 02478, USA.
| | - Kuo-Chen Chou
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 610054, China; Gordon Life Science Institute, Boston, MA, 02478, USA.
| |
Collapse
|
18
|
Giri S, Manivannan J, Srinivasan B, Sundaresan L, Gajalakshmi P, Chatterjee S. A proteome-wide systems toxicological approach deciphers the interaction network of chemotherapeutic drugs in the cardiovascular milieu. RSC Adv 2018; 8:20211-20221. [PMID: 35541641 PMCID: PMC9080753 DOI: 10.1039/c8ra02877j] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/03/2018] [Accepted: 05/21/2018] [Indexed: 12/30/2022] Open
Abstract
Onco-cardiology is critical for the management of cancer therapeutics since many of the anti-cancer agents are associated with cardiotoxicity. Therefore, the major aim of the current study is to employ a novel in silico method combined with experimental validation to explore off-targets and prioritize the enriched molecular pathways related to the specific cardiovascular events other than their intended targets by deriving relationship between drug-target-pathways and cardiovascular complications in order to help onco-cardiologists for the management of strategies to minimize cardiotoxicity. A systems biological understanding of the multi-target effects of a drug requires prior knowledge of proteome-wide binding profiles. In order to achieve the above, we have utilized PharmMapper, a web-based tool that uses a reverse pharmacophore mapping approach (spatial arrangement of features essential for a molecule to interact with a specific target receptor), along with KEGG for exploring the pathway relationship. In the validation part of the study, predicted protein targets and signalling pathways were strengthened with existing datasets of DrugBank and antibody arrays specific to vascular endothelial growth factor (VEGF) signalling in the case of 5-fluorouracil as direct experimental evidence. The current systems toxicological method illustrates the potential of the above big-data in supporting the knowledge of onco-cardiological indications which may lead to the generation of a decision making catalogue in future therapeutic prescription.
Collapse
Affiliation(s)
- Suvendu Giri
- Department of Biotechnology, Anna University Chennai Tamil Nadu India
| | - Jeganathan Manivannan
- Vascular Biology Lab, AU-KBC Research Centre, MIT Campus of Anna University Chennai Tamil Nadu India
- Environmental Health and Toxicology Lab, Department of Environmental Sciences, Bharathiar University Coimbatore Tamil Nadu India
| | | | | | - Palanivel Gajalakshmi
- Vascular Biology Lab, AU-KBC Research Centre, MIT Campus of Anna University Chennai Tamil Nadu India
| | - Suvro Chatterjee
- Department of Biotechnology, Anna University Chennai Tamil Nadu India
- Vascular Biology Lab, AU-KBC Research Centre, MIT Campus of Anna University Chennai Tamil Nadu India
| |
Collapse
|
19
|
Wang S, Yue Y. Protein subnuclear localization based on a new effective representation and intelligent kernel linear discriminant analysis by dichotomous greedy genetic algorithm. PLoS One 2018; 13:e0195636. [PMID: 29649330 PMCID: PMC5896989 DOI: 10.1371/journal.pone.0195636] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/15/2017] [Accepted: 03/26/2018] [Indexed: 01/03/2023] Open
Abstract
A wide variety of methods have been proposed in protein subnuclear localization to improve the prediction accuracy. However, one important trend of these means is to treat fusion representation by fusing multiple feature representations, of which, the fusion process takes a lot of time. In view of this, this paper novelly proposed a method by combining a new single feature representation and a new algorithm to obtain good recognition rate. Specifically, based on the position-specific scoring matrix (PSSM), we proposed a new expression, correlation position-specific scoring matrix (CoPSSM) as the protein feature representation. Based on the classic nonlinear dimension reduction algorithm, kernel linear discriminant analysis (KLDA), we added a new discriminant criterion and proposed a dichotomous greedy genetic algorithm (DGGA) to intelligently select its kernel bandwidth parameter. Two public datasets with Jackknife test and KNN classifier were used for the numerical experiments. The results showed that the overall success rate (OSR) with single representation CoPSSM is larger than that with many relevant representations. The OSR of the proposed method can reach as high as 87.444% and 90.3361% for these two datasets, respectively, outperforming many current methods. To show the generalization of the proposed algorithm, two extra standard datasets of protein subcellular were chosen to conduct the expending experiment, and the prediction accuracy by Jackknife test and Independent test is still considerable.
Collapse
Affiliation(s)
- Shunfang Wang
- School of Information Science and Engineering, Yunnan University, Kunming, PR China
- * E-mail:
| | - Yaoting Yue
- School of Information Science and Engineering, Yunnan University, Kunming, PR China
| |
Collapse
|
20
|
Heterodimer Binding Scaffolds Recognition via the Analysis of Kinetically Hot Residues. Pharmaceuticals (Basel) 2018; 11:ph11010029. [PMID: 29547506 PMCID: PMC5874725 DOI: 10.3390/ph11010029] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/27/2017] [Revised: 03/06/2018] [Accepted: 03/08/2018] [Indexed: 12/13/2022] Open
Abstract
Physical interactions between proteins are often difficult to decipher. The aim of this paper is to present an algorithm that is designed to recognize binding patches and supporting structural scaffolds of interacting heterodimer proteins using the Gaussian Network Model (GNM). The recognition is based on the (self) adjustable identification of kinetically hot residues and their connection to possible binding scaffolds. The kinetically hot residues are residues with the lowest entropy, i.e., the highest contribution to the weighted sum of the fastest modes per chain extracted via GNM. The algorithm adjusts the number of fast modes in the GNM's weighted sum calculation using the ratio of predicted and expected numbers of target residues (contact and the neighboring first-layer residues). This approach produces very good results when applied to dimers with high protein sequence length ratios. The protocol's ability to recognize near native decoys was compared to the ability of the residue-level statistical potential of Lu and Skolnick using the Sternberg and Vakser decoy dimers sets. The statistical potential produced better overall results, but in a number of cases its predicting ability was comparable, or even inferior, to the prediction ability of the adjustable GNM approach. The results presented in this paper suggest that in heterodimers at least one protein has interacting scaffold determined by the immovable, kinetically hot residues. In many cases, interacting proteins (especially if being of noticeably different sizes) either behave as a rigid lock and key or, presumably, exhibit the opposite dynamic behavior. While the binding surface of one protein is rigid and stable, its partner's interacting scaffold is more flexible and adaptable.
Collapse
|
21
|
Feng P, Yang H, Ding H, Lin H, Chen W, Chou KC. iDNA6mA-PseKNC: Identifying DNA N 6-methyladenosine sites by incorporating nucleotide physicochemical properties into PseKNC. Genomics 2018; 111:96-102. [PMID: 29360500 DOI: 10.1016/j.ygeno.2018.01.005] [Citation(s) in RCA: 188] [Impact Index Per Article: 26.9] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/27/2017] [Revised: 12/24/2017] [Accepted: 01/07/2018] [Indexed: 11/29/2022]
Abstract
N6-methyladenine (6mA) is one kind of post-replication modification (PTM or PTRM) occurring in a wide range of DNA sequences. Accurate identification of its sites will be very helpful for revealing the biological functions of 6mA, but it is time-consuming and expensive to determine them by experiments alone. Unfortunately, so far, no bioinformatics tool is available to do so. To fill in such an empty area, we have proposed a novel predictor called iDNA6mA-PseKNC that is established by incorporating nucleotide physicochemical properties into Pseudo K-tuple Nucleotide Composition (PseKNC). It has been observed via rigorous cross-validations that the predictor's sensitivity (Sn), specificity (Sp), accuracy (Acc), and stability (MCC) are 93%, 100%, 96%, and 0.93, respectively. For the convenience of most experimental scientists, a user-friendly web server for iDNA6mA-PseKNC has been established at http://lin-group.cn/server/iDNA6mA-PseKNC, by which users can easily obtain their desired results without the need to go through the complicated mathematical equations involved.
Collapse
Affiliation(s)
- Pengmian Feng
- Hebei Province Key Laboratory of Occupational Health and Safety for Coal Industry, School of Public Health, North China University of Science and Technology, Tangshan 063000, China
| | - Hui Yang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
| | - Hui Ding
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
| | - Hao Lin
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; Gordon Life Science Institute, Boston, MA 02478, USA.
| | - Wei Chen
- Department of Physics, School of Sciences, and Center for Genomics and Computational Biology, North China University of Science and Technology, Tangshan, Tangshan 063000, China; Gordon Life Science Institute, Boston, MA 02478, USA.
| | - Kuo-Chen Chou
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; Gordon Life Science Institute, Boston, MA 02478, USA.
| |
Collapse
|
22
|
Xiao X, Ye HX, Liu Z, Jia JH, Chou KC. iROS-gPseKNC: Predicting replication origin sites in DNA by incorporating dinucleotide position-specific propensity into general pseudo nucleotide composition. Oncotarget 2018; 7:34180-9. [PMID: 27147572 PMCID: PMC5085147 DOI: 10.18632/oncotarget.9057] [Citation(s) in RCA: 109] [Impact Index Per Article: 15.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/02/2016] [Accepted: 04/09/2016] [Indexed: 11/25/2022] Open
Abstract
DNA replication, occurring in all living organisms and being the basis for biological inheritance, is the process of producing two identical replicas from one original DNA molecule. To in-depth understand such an important biological process and use it for developing new strategy against genetics diseases, the knowledge of duplication origin sites in DNA is indispensible. With the explosive growth of DNA sequences emerging in the postgenomic age, it is highly desired to develop high throughput tools to identify these regions purely based on the sequence information alone. In this paper, by incorporating the dinucleotide position-specific propensity information into the general pseudo nucleotide composition and using the random forest classifier, a new predictor called iROS-gPseKNC was proposed. Rigorously cross-validations have indicated that the proposed predictor is significantly better than the best existing method in sensitivity, specificity, overall accuracy, and stability. Furthermore, a user-friendly web-server for iROS-gPseKNC has been established at http://www.jci-bioinfo.cn/iROS-gPseKNC, by which users can easily get their desired results without the need to bother the complicated mathematics, which were presented just for the integrity of the methodology itself.
Collapse
Affiliation(s)
- Xuan Xiao
- Computer Department, Jing-De-Zhen Ceramic Institute, Jing-De-Zhen, 333403, China.,Information School, ZheJiang Textile and Fashion College, NingBo, 315211, China.,Gordon Life Science Institute, Boston, Massachusetts, 02478, USA
| | - Han-Xiao Ye
- Computer Department, Jing-De-Zhen Ceramic Institute, Jing-De-Zhen, 333403, China
| | - Zi Liu
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China
| | - Jian-Hua Jia
- Computer Department, Jing-De-Zhen Ceramic Institute, Jing-De-Zhen, 333403, China
| | - Kuo-Chen Chou
- Center of Excellence in Genomic Medicine Research (CEGMR), King Abdulaziz University, Jeddah, 21589, Saudi Arabia.,Gordon Life Science Institute, Boston, Massachusetts, 02478, USA
| |
Collapse
|
23
|
Liu B, Wu H, Zhang D, Wang X, Chou KC. Pse-Analysis: a python package for DNA/RNA and protein/ peptide sequence analysis based on pseudo components and kernel methods. Oncotarget 2017; 8:13338-13343. [PMID: 28076851 PMCID: PMC5355101 DOI: 10.18632/oncotarget.14524] [Citation(s) in RCA: 114] [Impact Index Per Article: 14.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/02/2016] [Accepted: 12/27/2016] [Indexed: 12/20/2022] Open
Abstract
To expedite the pace in conducting genome/proteome analysis, we have developed a Python package called Pse-Analysis. The powerful package can automatically complete the following five procedures: (1) sample feature extraction, (2) optimal parameter selection, (3) model training, (4) cross validation, and (5) evaluating prediction quality. All the work a user needs to do is to input a benchmark dataset along with the query biological sequences concerned. Based on the benchmark dataset, Pse-Analysis will automatically construct an ideal predictor, followed by yielding the predicted results for the submitted query samples. All the aforementioned tedious jobs can be automatically done by the computer. Moreover, the multiprocessing technique was adopted to enhance computational speed by about 6 folds. The Pse-Analysis Python package is freely accessible to the public at http://bioinformatics.hitsz.edu.cn/Pse-Analysis/, and can be directly run on Windows, Linux, and Unix.
Collapse
Affiliation(s)
- Bin Liu
- School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong, China.,Key Laboratory of Network Oriented Intelligent Computation, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong, China.,Gordon Life Science Institute, Boston, Massachusetts, USA
| | - Hao Wu
- School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong, China
| | - Deyuan Zhang
- School of Computer, Shenyang Aerospace University, Shenyang, Liaoning, China
| | - Xiaolong Wang
- School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong, China.,Key Laboratory of Network Oriented Intelligent Computation, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong, China
| | - Kuo-Chen Chou
- Gordon Life Science Institute, Boston, Massachusetts, USA.,Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
| |
Collapse
|
24
|
DrugECs: An Ensemble System with Feature Subspaces for Accurate Drug-Target Interaction Prediction. BIOMED RESEARCH INTERNATIONAL 2017; 2017:6340316. [PMID: 28744468 PMCID: PMC5514335 DOI: 10.1155/2017/6340316] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 11/04/2016] [Revised: 04/10/2017] [Accepted: 05/03/2017] [Indexed: 12/21/2022]
Abstract
Background Drug-target interaction is key in drug discovery, especially in the design of new lead compound. However, the work to find a new lead compound for a specific target is complicated and hard, and it always leads to many mistakes. Therefore computational techniques are commonly adopted in drug design, which can save time and costs to a significant extent. Results To address the issue, a new prediction system is proposed in this work to identify drug-target interaction. First, drug-target pairs are encoded with a fragment technique and the software “PaDEL-Descriptor.” The fragment technique is for encoding target proteins, which divides each protein sequence into several fragments in order and encodes each fragment with several physiochemical properties of amino acids. The software “PaDEL-Descriptor” creates encoding vectors for drug molecules. Second, the dataset of drug-target pairs is resampled and several overlapped subsets are obtained, which are then input into kNN (k-Nearest Neighbor) classifier to build an ensemble system. Conclusion Experimental results on the drug-target dataset showed that our method performs better and runs faster than the state-of-the-art predictors.
Collapse
|
25
|
Feng P, Ding H, Yang H, Chen W, Lin H, Chou KC. iRNA-PseColl: Identifying the Occurrence Sites of Different RNA Modifications by Incorporating Collective Effects of Nucleotides into PseKNC. MOLECULAR THERAPY. NUCLEIC ACIDS 2017; 7:155-163. [PMID: 28624191 PMCID: PMC5415964 DOI: 10.1016/j.omtn.2017.03.006] [Citation(s) in RCA: 221] [Impact Index Per Article: 27.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 02/12/2017] [Revised: 03/16/2017] [Accepted: 03/17/2017] [Indexed: 11/23/2022]
Abstract
There are many different types of RNA modifications, which are essential for numerous biological processes. Knowledge about the occurrence sites of RNA modifications in its sequence is a key for in-depth understanding of their biological functions and mechanism. Unfortunately, it is both time-consuming and laborious to determine these sites purely by experiments alone. Although some computational methods were developed in this regard, each one could only be used to deal with some type of modification individually. To our knowledge, no method has thus far been developed that can identify the occurrence sites for several different types of RNA modifications with one seamless package or platform. To address such a challenge, a novel platform called "iRNA-PseColl" has been developed. It was formed by incorporating both the individual and collective features of the sequence elements into the general pseudo K-tuple nucleotide composition (PseKNC) of RNA via the chemicophysical properties and density distribution of its constituent nucleotides. Rigorous cross-validations have indicated that the anticipated success rates achieved by the proposed platform are quite high. To maximize the convenience for most experimental biologists, the platform's web-server has been provided at http://lin.uestc.edu.cn/server/iRNA-PseColl along with a step-by-step user guide that will allow users to easily achieve their desired results without the need to go through the mathematical details involved in this paper.
Collapse
Affiliation(s)
- Pengmian Feng
- Hebei Province Key Laboratory of Occupational Health and Safety for Coal Industry, School of Public Health, North China University of Science and Technology, Tangshan, 063000, China
| | - Hui Ding
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 610054, China
| | - Hui Yang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 610054, China
| | - Wei Chen
- Department of Physics, School of Sciences, and Center for Genomics and Computational Biology, North China University of Science and Technology, Tangshan 063000, China; Gordon Life Science Institute, Boston, MA 02478, USA.
| | - Hao Lin
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 610054, China; Gordon Life Science Institute, Boston, MA 02478, USA.
| | - Kuo-Chen Chou
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 610054, China; Gordon Life Science Institute, Boston, MA 02478, USA.
| |
Collapse
|
26
|
PDC-SGB: Prediction of effective drug combinations using a stochastic gradient boosting algorithm. J Theor Biol 2017; 417:1-7. [DOI: 10.1016/j.jtbi.2017.01.019] [Citation(s) in RCA: 69] [Impact Index Per Article: 8.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/26/2016] [Revised: 01/06/2017] [Accepted: 01/14/2017] [Indexed: 12/12/2022]
|
27
|
Zhang J, Zhu M, Chen P, Wang B. DrugRPE: Random projection ensemble approach to drug-target interaction prediction. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2016.10.039] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
|
28
|
Cheng X, Zhao SG, Xiao X, Chou KC. iATC-mISF: a multi-label classifier for predicting the classes of anatomical therapeutic chemicals. Bioinformatics 2016; 33:341-346. [DOI: 10.1093/bioinformatics/btw644] [Citation(s) in RCA: 56] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/08/2016] [Revised: 10/04/2016] [Accepted: 10/05/2016] [Indexed: 01/06/2023] Open
|
29
|
Oshikoya KA, Oreagba IA, Godman B, Oguntayo FS, Fadare J, Orubu S, Massele A, Senbanjo IO. Potential drug-drug interactions in paediatric outpatient prescriptions in Nigeria and implications for the future. Expert Rev Clin Pharmacol 2016; 9:1505-1515. [PMID: 27592636 DOI: 10.1080/17512433.2016.1232619] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023]
Abstract
BACKGROUND Information regarding the incidence of drug-drug interactions (DDIs) and adverse drug events (ADEs) among paediatric patients in Nigeria is limited. METHODS Prospective clinical audit among paediatric outpatients in four general hospitals in Nigeria over a 3-month period. Details of ADEs documented in case files was extracted. RESULTS Among 1233 eligible patients, 208 (16.9%) received prescriptions with at least one potential DDI. Seven drug classes were implicated with antimalarial combination therapies predominating. Exposure mostly to a single potential DDI, commonly involved promethazine, artemether/lumefantrine, ciprofloxacin and artemether/lumefantrine. Exposure mostly to major and serious, and moderate and clinically significant, potential DDIs. Overall exposure similar across all age groups and across genders. A significant association was seen between severity of potential DDIs and age. Only 48 (23.1%) of these patients presented at follow-up clinics with only 15 reporting ADEs. CONCLUSION There was exposure to potential DDIs in this population. However, potential DDIs were associated with only a few reported ADEs.
Collapse
Affiliation(s)
- Kazeem Adeola Oshikoya
- a Pharmacology Department , Lagos State University College of Medicine , Ikeja , Nigeria
| | - Ibrahim Adekunle Oreagba
- b Pharmacology, Therapeutic and Toxicology Department , College of Medicine, University of Lagos , Idiaraba , Nigeria
| | - Brian Godman
- c Division of Clinical Pharmacology , Karolinska Institute , Stockholm , Sweden.,d Strathclyde Institute of Pharmacy and Biomedical Sciences , University of Strathclyde , Glasgow , United Kingdom
| | - Fisayo Solomon Oguntayo
- b Pharmacology, Therapeutic and Toxicology Department , College of Medicine, University of Lagos , Idiaraba , Nigeria
| | - Joseph Fadare
- e Department of Pharmacology , Ekiti State University , Ado-Ekiti , Nigeria
| | - Samuel Orubu
- f Faculty of Pharmacy , Niger Delta University , Wilberforce Island , Nigeria
| | - Amos Massele
- g Department of Clinical Pharmacology , School of Medicine, University of Botswana , Gaborone , Botswana
| | | |
Collapse
|
30
|
Liu Z, Xiao X, Yu DJ, Jia J, Qiu WR, Chou KC. pRNAm-PC: Predicting N6-methyladenosine sites in RNA sequences via physical–chemical properties. Anal Biochem 2016; 497:60-7. [DOI: 10.1016/j.ab.2015.12.017] [Citation(s) in RCA: 225] [Impact Index Per Article: 25.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/30/2015] [Revised: 12/02/2015] [Accepted: 12/23/2015] [Indexed: 11/28/2022]
|
31
|
Chen W, Feng P, Ding H, Lin H, Chou KC. Using deformation energy to analyze nucleosome positioning in genomes. Genomics 2016; 107:69-75. [DOI: 10.1016/j.ygeno.2015.12.005] [Citation(s) in RCA: 87] [Impact Index Per Article: 9.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2015] [Revised: 12/06/2015] [Accepted: 12/22/2015] [Indexed: 12/28/2022]
|
32
|
Predicting DNA Methylation State of CpG Dinucleotide Using Genome Topological Features and Deep Networks. Sci Rep 2016; 6:19598. [PMID: 26797014 PMCID: PMC4726425 DOI: 10.1038/srep19598] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.1] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2015] [Accepted: 12/14/2015] [Indexed: 11/09/2022] Open
Abstract
The hypo- or hyper-methylation of the human genome is one of the epigenetic features of leukemia. However, experimental approaches have only determined the methylation state of a small portion of the human genome. We developed deep learning based (stacked denoising autoencoders, or SdAs) software named “DeepMethyl” to predict the methylation state of DNA CpG dinucleotides using features inferred from three-dimensional genome topology (based on Hi-C) and DNA sequence patterns. We used the experimental data from immortalised myelogenous leukemia (K562) and healthy lymphoblastoid (GM12878) cell lines to train the learning models and assess prediction performance. We have tested various SdA architectures with different configurations of hidden layer(s) and amount of pre-training data and compared the performance of deep networks relative to support vector machines (SVMs). Using the methylation states of sequentially neighboring regions as one of the learning features, an SdA achieved a blind test accuracy of 89.7% for GM12878 and 88.6% for K562. When the methylation states of sequentially neighboring regions are unknown, the accuracies are 84.82% for GM12878 and 72.01% for K562. We also analyzed the contribution of genome topological features inferred from Hi-C. DeepMethyl can be accessed at http://dna.cs.usm.edu/deepmethyl/.
Collapse
|
33
|
Protein Sub-Nuclear Localization Based on Effective Fusion Representations and Dimension Reduction Algorithm LDA. Int J Mol Sci 2015; 16:30343-61. [PMID: 26703574 PMCID: PMC4691178 DOI: 10.3390/ijms161226237] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2015] [Revised: 12/07/2015] [Accepted: 12/11/2015] [Indexed: 01/01/2023] Open
Abstract
An effective representation of a protein sequence plays a crucial role in protein sub-nuclear localization. The existing representations, such as dipeptide composition (DipC), pseudo-amino acid composition (PseAAC) and position specific scoring matrix (PSSM), are insufficient to represent protein sequence due to their single perspectives. Thus, this paper proposes two fusion feature representations of DipPSSM and PseAAPSSM to integrate PSSM with DipC and PseAAC, respectively. When constructing each fusion representation, we introduce the balance factors to value the importance of its components. The optimal values of the balance factors are sought by genetic algorithm. Due to the high dimensionality of the proposed representations, linear discriminant analysis (LDA) is used to find its important low dimensional structure, which is essential for classification and location prediction. The numerical experiments on two public datasets with KNN classifier and cross-validation tests showed that in terms of the common indexes of sensitivity, specificity, accuracy and MCC, the proposed fusing representations outperform the traditional representations in protein sub-nuclear localization, and the representation treated by LDA outperforms the untreated one.
Collapse
|
34
|
Liu B, Fang L, Wang S, Wang X, Li H, Chou KC. Identification of microRNA precursor with the degenerate K-tuple or Kmer strategy. J Theor Biol 2015; 385:153-9. [DOI: 10.1016/j.jtbi.2015.08.025] [Citation(s) in RCA: 131] [Impact Index Per Article: 13.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/16/2015] [Revised: 08/21/2015] [Accepted: 08/24/2015] [Indexed: 10/23/2022]
|
35
|
Jia J, Liu Z, Xiao X, Liu B, Chou KC. Identification of protein-protein binding sites by incorporating the physicochemical properties and stationary wavelet transforms into pseudo amino acid composition. J Biomol Struct Dyn 2015; 34:1946-61. [PMID: 26375780 DOI: 10.1080/07391102.2015.1095116] [Citation(s) in RCA: 88] [Impact Index Per Article: 8.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
Abstract
With the explosive growth of protein sequences entering into protein data banks in the post-genomic era, it is highly demanded to develop automated methods for rapidly and effectively identifying the protein-protein binding sites (PPBSs) based on the sequence information alone. To address this problem, we proposed a predictor called iPPBS-PseAAC, in which each amino acid residue site of the proteins concerned was treated as a 15-tuple peptide segment generated by sliding a window along the protein chains with its center aligned with the target residue. The working peptide segment is further formulated by a general form of pseudo amino acid composition via the following procedures: (1) it is converted into a numerical series via the physicochemical properties of amino acids; (2) the numerical series is subsequently converted into a 20-D feature vector by means of the stationary wavelet transform technique. Formed by many individual "Random Forest" classifiers, the operation engine to run prediction is a two-layer ensemble classifier, with the 1st-layer voting out the best training data-set from many bootstrap systems and the 2nd-layer voting out the most relevant one from seven physicochemical properties. Cross-validation tests indicate that the new predictor is very promising, meaning that many important key features, which are deeply hidden in complicated protein sequences, can be extracted via the wavelets transform approach, quite consistent with the facts that many important biological functions of proteins can be elucidated with their low-frequency internal motions. The web server of iPPBS-PseAAC is accessible at http://www.jci-bioinfo.cn/iPPBS-PseAAC , by which users can easily acquire their desired results without the need to follow the complicated mathematical equations involved.
Collapse
Affiliation(s)
- Jianhua Jia
- a Computer Department , Jing-De-Zhen Ceramic Institute , Jing-De-Zhen 333403 , China
| | - Zi Liu
- a Computer Department , Jing-De-Zhen Ceramic Institute , Jing-De-Zhen 333403 , China
| | - Xuan Xiao
- a Computer Department , Jing-De-Zhen Ceramic Institute , Jing-De-Zhen 333403 , China.,c Gordon Life Science Institute , Boston , MA 02478 , USA
| | - Bingxiang Liu
- a Computer Department , Jing-De-Zhen Ceramic Institute , Jing-De-Zhen 333403 , China
| | - Kuo-Chen Chou
- b Center of Excellence in Genomic Medicine Research (CEGMR) , King Abdulaziz University , Jeddah 21589 , Saudi Arabia.,c Gordon Life Science Institute , Boston , MA 02478 , USA
| |
Collapse
|
36
|
Zhang J, Wang G, Feng J, Zhang L, Li J. Identifying ion channel genes related to cardiomyopathy using a novel decision forest strategy. MOLECULAR BIOSYSTEMS 2015; 10:2407-14. [PMID: 24977958 DOI: 10.1039/c4mb00193a] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/31/2022]
Abstract
Ion channels play many crucial functions in life. Their dysfunction may lead to a number of diseases, such as arrhythmia and beta cell dysfunction. In this study, we firstly selected the ion channel gene expression profiles using a dimensionality reduction method. After that, we applied a novel decision forest strategy to mine cardiomyopathy related ion channel genes. The novel proposed Zi integrated the information of the decision trees' height and the frequency at which a gene was located in the tree. It achieved a much higher ability of feature selection. In the result, 26 cardiomyopathy related ion channel genes were identified. Their Zi were higher than the threshold Z*. Furthermore, most of these genes had been reported to have relationships with cardiomyopathies. In conclusion, our proposed decision forest strategy had a better classification performance. Our result can provide a theoretical basis for cardiovascular researchers.
Collapse
Affiliation(s)
- Jie Zhang
- Department of Prevention, Tongji University School of Medicine, Shanghai, China.
| | | | | | | | | |
Collapse
|
37
|
Large-scale Direct Targeting for Drug Repositioning and Discovery. Sci Rep 2015; 5:11970. [PMID: 26155766 PMCID: PMC4496667 DOI: 10.1038/srep11970] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/13/2014] [Accepted: 06/12/2015] [Indexed: 01/08/2023] Open
Abstract
A system-level identification of drug-target direct interactions is vital to drug repositioning and discovery. However, the biological means on a large scale remains challenging and expensive even nowadays. The available computational models mainly focus on predicting indirect interactions or direct interactions on a small scale. To address these problems, in this work, a novel algorithm termed weighted ensemble similarity (WES) has been developed to identify drug direct targets based on a large-scale of 98,327 drug-target relationships. WES includes: (1) identifying the key ligand structural features that are highly-related to the pharmacological properties in a framework of ensemble; (2) determining a drug’s affiliation of a target by evaluation of the overall similarity (ensemble) rather than a single ligand judgment; and (3) integrating the standardized ensemble similarities (Z score) by Bayesian network and multi-variate kernel approach to make predictions. All these lead WES to predict drug direct targets with external and experimental test accuracies of 70% and 71%, respectively. This shows that the WES method provides a potential in silico model for drug repositioning and discovery.
Collapse
|
38
|
Liu B, Fang L, Liu F, Wang X, Chen J, Chou KC. Identification of real microRNA precursors with a pseudo structure status composition approach. PLoS One 2015; 10:e0121501. [PMID: 25821974 PMCID: PMC4378912 DOI: 10.1371/journal.pone.0121501] [Citation(s) in RCA: 165] [Impact Index Per Article: 16.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2014] [Accepted: 01/31/2015] [Indexed: 01/08/2023] Open
Abstract
Containing about 22 nucleotides, a micro RNA (abbreviated miRNA) is a small non-coding RNA molecule, functioning in transcriptional and post-transcriptional regulation of gene expression. The human genome may encode over 1000 miRNAs. Albeit poorly characterized, miRNAs are widely deemed as important regulators of biological processes. Aberrant expression of miRNAs has been observed in many cancers and other disease states, indicating they are deeply implicated with these diseases, particularly in carcinogenesis. Therefore, it is important for both basic research and miRNA-based therapy to discriminate the real pre-miRNAs from the false ones (such as hairpin sequences with similar stem-loops). Particularly, with the avalanche of RNA sequences generated in the postgenomic age, it is highly desired to develop computational sequence-based methods in this regard. Here two new predictors, called “iMcRNA-PseSSC” and “iMcRNA-ExPseSSC”, were proposed for identifying the human pre-microRNAs by incorporating the global or long-range structure-order information using a way quite similar to the pseudo amino acid composition approach. Rigorous cross-validations on a much larger and more stringent newly constructed benchmark dataset showed that the two new predictors (accessible at http://bioinformatics.hitsz.edu.cn/iMcRNA/) outperformed or were highly comparable with the best existing predictors in this area.
Collapse
Affiliation(s)
- Bin Liu
- School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong, China
- Key Laboratory of Network Oriented Intelligent Computation, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong, China
- Gordon Life Science Institute, Belmont, Massachusetts, United States of America
- * E-mail:
| | - Longyun Fang
- School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong, China
| | - Fule Liu
- School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong, China
| | - Xiaolong Wang
- School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong, China
- Key Laboratory of Network Oriented Intelligent Computation, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong, China
| | - Junjie Chen
- School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong, China
| | - Kuo-Chen Chou
- Gordon Life Science Institute, Belmont, Massachusetts, United States of America
- Center of Excellence in Genomic Medicine Research (CEGMR), King Abdulaziz University, Jeddah, Saudi Arabia
| |
Collapse
|
39
|
Tanchotsrinon W, Lursinsap C, Poovorawan Y. A high performance prediction of HPV genotypes by Chaos game representation and singular value decomposition. BMC Bioinformatics 2015; 16:71. [PMID: 25880169 PMCID: PMC4375884 DOI: 10.1186/s12859-015-0493-4] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/23/2014] [Accepted: 02/06/2015] [Indexed: 02/02/2023] Open
Abstract
BACKGROUND Human Papillomavirus (HPV) genotyping is an important approach to fight cervical cancer due to the relevant information regarding risk stratification for diagnosis and the better understanding of the relationship of HPV with carcinogenesis. This paper proposed two new feature extraction techniques, i.e. ChaosCentroid and ChaosFrequency, for predicting HPV genotypes associated with the cancer. The additional diversified 12 HPV genotypes, i.e. types 6, 11, 16, 18, 31, 33, 35, 45, 52, 53, 58, and 66, were studied in this paper. In our proposed techniques, a partitioned Chaos Game Representation (CGR) is deployed to represent HPV genomes. ChaosCentroid captures the structure of sequences in terms of centroid of each sub-region with Euclidean distances among the centroids and the center of CGR as the relations of all sub-regions. ChaosFrequency extracts the statistical distribution of mono-, di-, or higher order nucleotides along HPV genomes and forms a matrix of frequency of dots in each sub-region. For performance evaluation, four different types of classifiers, i.e. Multi-layer Perceptron, Radial Basis Function, K-Nearest Neighbor, and Fuzzy K-Nearest Neighbor Techniques were deployed, and our best results from each classifier were compared with the NCBI genotyping tool. RESULTS The experimental results obtained by four different classifiers are in the same trend. ChaosCentroid gave considerably higher performance than ChaosFrequency when the input length is one but it was moderately lower than ChaosFrequency when the input length is two. Both proposed techniques yielded almost or exactly the best performance when the input length is more than three. But there is no significance between our proposed techniques and the comparative alignment method. CONCLUSIONS Our proposed alignment-free and scale-independent method can successfully transform HPV genomes with 7,000 - 10,000 base pairs into features of 1 - 11 dimensions. This signifies that our ChaosCentroid and ChaosFrequency can be served as the effective feature extraction techniques for predicting the HPV genotypes.
Collapse
Affiliation(s)
- Watcharaporn Tanchotsrinon
- Advanced Virtual and Intelligent Computing Research Center (AVIC), Department of Mathematics and Computer Science, Faculty of Science, Chulalongkorn University, Phayathai Road, Bangkok, Thailand.
| | - Chidchanok Lursinsap
- Advanced Virtual and Intelligent Computing Research Center (AVIC), Department of Mathematics and Computer Science, Faculty of Science, Chulalongkorn University, Phayathai Road, Bangkok, Thailand.
| | - Yong Poovorawan
- Center of Excellence in Clinical Virology, Department of Pediatrics, Faculty of Medicine, Chulalongkorn University, Phayathai Road, Bangkok, Thailand.
| |
Collapse
|
40
|
Liu B, Fang L, Liu F, Wang X, Chou KC. iMiRNA-PseDPC: microRNA precursor identification with a pseudo distance-pair composition approach. J Biomol Struct Dyn 2015; 34:223-35. [DOI: 10.1080/07391102.2015.1014422] [Citation(s) in RCA: 96] [Impact Index Per Article: 9.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022]
|
41
|
Xiao X, Min JL, Lin WZ, Liu Z, Cheng X, Chou KC. iDrug-Target: predicting the interactions between drug compounds and target proteins in cellular networking via benchmark dataset optimization approach. J Biomol Struct Dyn 2015; 33:2221-33. [DOI: 10.1080/07391102.2014.998710] [Citation(s) in RCA: 146] [Impact Index Per Article: 14.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]
Affiliation(s)
- Xuan Xiao
- Computer Department, Jing-De-Zhen Ceramic Institute , Jing-De-Zhen 333046, China
- Information School, ZheJiang Textile & Fashion College , NingBo 315211, China
- Gordon Life Science Institute , 53 South Cottage Road, Boston 02478, MA, USA
| | - Jian-Liang Min
- Computer Department, Jing-De-Zhen Ceramic Institute , Jing-De-Zhen 333046, China
| | - Wei-Zhong Lin
- Computer Department, Jing-De-Zhen Ceramic Institute , Jing-De-Zhen 333046, China
| | - Zi Liu
- Computer Department, Jing-De-Zhen Ceramic Institute , Jing-De-Zhen 333046, China
| | - Xiang Cheng
- Computer Department, Jing-De-Zhen Ceramic Institute , Jing-De-Zhen 333046, China
| | - Kuo-Chen Chou
- Center of Excellence in Genomic Medicine Research (CEGMR), King Abdulaziz University , JeddaH 21589, Saudi Arabia
- Gordon Life Science Institute , 53 South Cottage Road, Boston 02478, MA, USA
| |
Collapse
|
42
|
Ganjtabesh M, Montaseri S, Zare-Mirakabad F. Using temperature effects to predict the interactions between two RNAs. J Theor Biol 2015; 364:98-102. [PMID: 25218429 DOI: 10.1016/j.jtbi.2014.09.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2014] [Revised: 08/30/2014] [Accepted: 09/02/2014] [Indexed: 10/24/2022]
Abstract
MOTIVATION Interaction of two RNA molecules is considered as an important factor that regulates gene expression post-transcriptional process. Most of the ncRNAs prevent the translation of their target mRNA(s) by forming stable bindings with them. Although several computational methods have been proposed to predict the interactions between two RNAs, none of them can produce reliable and accurate results. RESULTS In this paper, a new approach entitled tempRNAs is presented to accurately predict interaction structure between two RNAs based on a gradual temperature decrease. For each specified temperature, our algorithm contains two main steps. First, the secondary structure of each RNA is determined with respect to the previous base pairs as constraints. Second, two RNAs are concatenated and then the interaction between them is calculated according to the previous base pairs. The secondary structures are determined based on minimum free energy model. The proposed algorithm is evaluated for a set of known interacting RNA pairs. The results show the higher accuracy of the proposed method in comparison to the other state-of-the-art algorithms, namely inRNAs and RactIP.
Collapse
Affiliation(s)
- Mohammad Ganjtabesh
- Department of Computer Science, School of Mathematics, Statistics, and Computer Science, University of Tehran, Tehran, Iran.
| | - Soheila Montaseri
- Department of Computer Science, School of Mathematics, Statistics, and Computer Science, University of Tehran, Tehran, Iran.
| | - Fatemeh Zare-Mirakabad
- Department of Computer Science, Faculty of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran.
| |
Collapse
|
43
|
Chen W, Lin H, Chou KC. Pseudo nucleotide composition or PseKNC: an effective formulation for analyzing genomic sequences. MOLECULAR BIOSYSTEMS 2015; 11:2620-34. [DOI: 10.1039/c5mb00155b] [Citation(s) in RCA: 262] [Impact Index Per Article: 26.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]
Abstract
With the avalanche of DNA/RNA sequences generated in the post-genomic age, it is urgent to develop automated methods for analyzing the relationship between the sequences and their functions.
Collapse
Affiliation(s)
- Wei Chen
- Department of Physics
- School of Sciences
- and Center for Genomics and Computational Biology
- Hebei United University
- Tangshan 063000
| | - Hao Lin
- Gordon Life Science Institute
- Boston
- USA
- Key Laboratory for Neuro-Information of Ministry of Education
- Center of Bioinformatics
| | - Kuo-Chen Chou
- Department of Physics
- School of Sciences
- and Center for Genomics and Computational Biology
- Hebei United University
- Tangshan 063000
| |
Collapse
|
44
|
Khan ZU, Hayat M, Khan MA. Discrimination of acidic and alkaline enzyme using Chou’s pseudo amino acid composition in conjunction with probabilistic neural network model. J Theor Biol 2015; 365:197-203. [DOI: 10.1016/j.jtbi.2014.10.014] [Citation(s) in RCA: 110] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/02/2014] [Revised: 09/09/2014] [Accepted: 10/11/2014] [Indexed: 12/11/2022]
|
45
|
Fang J, Yang R, Gao L, Yang S, Pang X, Li C, He Y, Liu AL, Du GH. Consensus models for CDK5 inhibitors in silico and their application to inhibitor discovery. Mol Divers 2014; 19:149-62. [DOI: 10.1007/s11030-014-9561-3] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2014] [Accepted: 11/25/2014] [Indexed: 02/07/2023]
|
46
|
Chen J, Tang YY, Chen CLP, Fang B, Lin Y, Shang Z. Multi-Label Learning With Fuzzy Hypergraph Regularization for Protein Subcellular Location Prediction. IEEE Trans Nanobioscience 2014; 13:438-47. [DOI: 10.1109/tnb.2014.2341111] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
|
47
|
3D QSAR studies, pharmacophore modeling and virtual screening on a series of steroidal aromatase inhibitors. Int J Mol Sci 2014; 15:20927-47. [PMID: 25405729 PMCID: PMC4264204 DOI: 10.3390/ijms151120927] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2014] [Revised: 09/28/2014] [Accepted: 10/22/2014] [Indexed: 12/12/2022] Open
Abstract
Aromatase inhibitors are the most important targets in treatment of estrogen-dependent cancers. In order to search for potent steroidal aromatase inhibitors (SAIs) with lower side effects and overcome cellular resistance, comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) were performed on a series of SAIs to build 3D QSAR models. The reliable and predictive CoMFA and CoMSIA models were obtained with statistical results (CoMFA: q2 = 0.636, r2ncv = 0.988, r2pred = 0.658; CoMSIA: q2 = 0.843, r2ncv = 0.989, r2pred = 0.601). This 3D QSAR approach provides significant insights that can be used to develop novel and potent SAIs. In addition, Genetic algorithm with linear assignment of hypermolecular alignment of database (GALAHAD) was used to derive 3D pharmacophore models. The selected pharmacophore model contains two acceptor atoms and four hydrophobic centers, which was used as a 3D query for virtual screening against NCI2000 database. Six hit compounds were obtained and their biological activities were further predicted by the CoMFA and CoMSIA models, which are expected to design potent and novel SAIs.
Collapse
|
48
|
Qiu WR, Xiao X, Lin WZ, Chou KC. iUbiq-Lys: prediction of lysine ubiquitination sites in proteins by extracting sequence evolution information via a gray system model. J Biomol Struct Dyn 2014; 33:1731-42. [PMID: 25248923 DOI: 10.1080/07391102.2014.968875] [Citation(s) in RCA: 126] [Impact Index Per Article: 11.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]
Abstract
As one of the most important posttranslational modifications (PTMs), ubiquitination plays an important role in regulating varieties of biological processes, such as signal transduction, cell division, apoptosis, and immune response. Ubiquitination is also named "lysine ubiquitination" because it occurs when an ubiquitin is covalently attached to lysine (K) residues of targeting proteins. Given an uncharacterized protein sequence that contains many lysine residues, which one of them is the ubiquitination site, and which one is of non-ubiquitination site? With the avalanche of protein sequences generated in the postgenomic age, it is highly desired for both basic research and drug development to develop an automated method for rapidly and accurately annotating the ubiquitination sites in proteins. In view of this, a new predictor called "iUbiq-Lys" was developed based on the evolutionary information, gray system model, as well as the general form of pseudo-amino acid composition. It was demonstrated via the rigorous cross-validations that the new predictor remarkably outperformed all its counterparts. As a web-server, iUbiq-Lys is accessible to the public at http://www.jci-bioinfo.cn/iUbiq-Lys . For the convenience of most experimental scientists, we have further provided a protocol of step-by-step guide, by which users can easily get their desired results without the need to follow the complicated mathematics that were presented in this paper just for the integrity of its development process.
Collapse
Affiliation(s)
- Wang-Ren Qiu
- a Computer Department, Jing-De-Zhen Ceramic Institute , Jing-De-Zhen 333403 , China
| | | | | | | |
Collapse
|
49
|
Zhong WZ, Zhou SF. Molecular science for drug development and biomedicine. Int J Mol Sci 2014; 15:20072-8. [PMID: 25375190 PMCID: PMC4264156 DOI: 10.3390/ijms151120072] [Citation(s) in RCA: 69] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [MESH Headings] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2014] [Accepted: 10/24/2014] [Indexed: 01/21/2023] Open
Affiliation(s)
- Wei-Zhu Zhong
- Gordon Life Science Institute, Belmont, MA 02478, USA.
| | - Shu-Feng Zhou
- Department of Pharmaceutical Sciences, College of Pharmacy, University of South Florida, Tampa, FL 33620, USA.
| |
Collapse
|
50
|
Potential non homologous protein targets of mycobacterium tuberculosis H37Rv identified from protein–protein interaction network. J Theor Biol 2014; 361:152-8. [DOI: 10.1016/j.jtbi.2014.07.031] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/19/2014] [Revised: 07/26/2014] [Accepted: 07/28/2014] [Indexed: 01/09/2023]
|