1. Zhu Y, He J, Wei R, Liu J. Construction and experimental validation of a novel ferroptosis-related gene signature for myelodysplastic syndromes. Immun Inflamm Dis 2024; 12:e1221. [PMID: 38578040; PMCID: PMC10996383; DOI: 10.1002/iid3.1221] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2023] [Revised: 01/26/2024] [Accepted: 03/03/2024] [Indexed: 04/06/2024] Open
Abstract
BACKGROUND Myelodysplastic syndromes (MDS) are clonal hematopoietic disorders characterized by morphological abnormalities and peripheral blood cytopenias, carrying a risk of progression to acute myeloid leukemia. Although ferroptosis is a promising target for MDS treatment, the specific roles of ferroptosis-related genes (FRGs) in MDS diagnosis have not been elucidated. METHODS MDS-related microarray data were obtained from the Gene Expression Omnibus database. A comprehensive analysis of FRG expression levels in patients with MDS and controls was conducted, followed by the use of multiple machine learning methods to establish prediction models. The predictive ability of the optimal model was evaluated using nomogram analysis and an external data set. Functional analysis was applied to explore the underlying mechanisms. The mRNA levels of the model genes were verified in MDS clinical samples by quantitative real-time polymerase chain reaction (qRT-PCR). RESULTS The extreme gradient boosting model demonstrated the best performance, leading to the identification of a panel of six signature genes: SREBF1, PTPN6, PARP9, MAP3K11, MDM4, and EZH2. Receiver operating characteristic curves indicated that the model exhibited high accuracy in predicting MDS diagnosis, with area under the curve values of 0.989 and 0.962 for the training and validation cohorts, respectively. Functional analysis revealed significant associations between these genes and the infiltrating immune cells. The expression levels of these genes were successfully verified in MDS clinical samples. CONCLUSION Our study is the first to identify a novel model using FRGs to predict the risk of developing MDS. FRGs may be implicated in MDS pathogenesis through immune-related pathways. These findings highlight the intricate correlation between ferroptosis and MDS, offering insights that may aid in identifying potential therapeutic targets for this debilitating disorder.
Affiliation(s)
- Yidong Zhu
  - Department of Traditional Chinese Medicine, Shanghai Tenth People's Hospital, Tongji University School of Medicine, Shanghai, China
- Jun He
  - Department of Hematology, Shanghai Tenth People's Hospital, Tongji University School of Medicine, Shanghai, China
- Rong Wei
  - Department of Hematology, Shanghai Tenth People's Hospital, Tongji University School of Medicine, Shanghai, China
- Jun Liu
  - Department of Traditional Chinese Medicine, Shanghai Tenth People's Hospital, Tongji University School of Medicine, Shanghai, China
2. Wang X, Li A, Li X, Cui H. Empowering Protein Engineering through Recombination of Beneficial Substitutions. Chemistry 2024; 30:e202303889. [PMID: 38288640; DOI: 10.1002/chem.202303889] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2024] [Indexed: 02/24/2024]
Abstract
Directed evolution stands as a seminal technology for generating novel protein functionalities and is a cornerstone of biocatalysis, metabolic engineering, and synthetic biology. Today, with the development of various mutagenesis methods and advanced analytical instruments, the challenges of diversity generation and high-throughput screening are largely solved. One of the remaining challenges is how to unlock the potential of single beneficial substitutions through recombination to achieve epistatic effects. This review surveys experimental and computer-assisted recombination methods in protein engineering campaigns. In addition, integrated and machine learning-guided strategies are highlighted to discuss how these recombination approaches contribute to generating screening libraries with better diversity, coverage, and size. Finally, a decision tree is presented to guide the selection of suitable recombination strategies in practice, which should help accelerate protein engineering.
Affiliation(s)
- Xinyue Wang
  - School of Food Science and Pharmaceutical Engineering, Nanjing Normal University, No. 2 Xuelin Road, Nanjing, 210097, China
- Anni Li
  - School of Food Science and Pharmaceutical Engineering, Nanjing Normal University, No. 2 Xuelin Road, Nanjing, 210097, China
- Xiujuan Li
  - School of Food Science and Pharmaceutical Engineering, Nanjing Normal University, No. 2 Xuelin Road, Nanjing, 210097, China
- Haiyang Cui
  - School of Life Sciences, Nanjing Normal University, No. 2 Xuelin Road, Nanjing, 210097, China
3. Wu Y, Liu C, Huang J, Wang F. Quantitative proteomics reveals pregnancy prognosis signature of polycystic ovary syndrome women based on machine learning. Gynecol Endocrinol 2024; 40:2328613. [PMID: 38497425; DOI: 10.1080/09513590.2024.2328613] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2023] [Accepted: 03/05/2024] [Indexed: 03/19/2024] Open
Abstract
OBJECTIVE We aimed to screen feature proteins and construct a predictive model for pregnancy loss in polycystic ovary syndrome (PCOS) patients using machine learning methods. METHODS We obtained endometrial samples from 33 PCOS patients and 7 healthy controls at the Reproductive Center of the Second Hospital of Lanzhou University from September 2019 to September 2020. Liquid chromatography-tandem mass spectrometry (LC-MS/MS) was conducted to identify the differentially expressed proteins (DEPs) between the two groups. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were performed to analyze the related pathways and functions of the DEPs. Then, we used machine learning methods to screen the feature proteins. Multivariate Cox regression analysis was also conducted to establish the prognostic model. The performance of the prognostic model was then evaluated by the receiver operating characteristic (ROC) curve, calibration curve, and decision curve analysis (DCA). In addition, the bootstrap method was used to verify the generalization ability of the model. Finally, linear correlation analysis was performed to assess the correlation between the feature proteins and clinical data. RESULTS Four hundred and fifty DEPs between PCOS patients and controls were identified, along with their associated pathways and functions. A prognostic model for pregnancy loss in PCOS, based on two feature proteins (TIA1 and COL5A1), was established and showed good discrimination and generalization ability. Strong correlations between clinical data and the feature proteins were identified, which may help predict reproductive outcome in PCOS. CONCLUSION The model based on the TIA1 and COL5A1 proteins could effectively predict the occurrence of pregnancy loss in PCOS patients and provide a good theoretical foundation for subsequent research.
Affiliation(s)
- Yuanyuan Wu
  - Traditional Chinese and Western Medicine, Gansu University of Chinese Medicine, Lanzhou, China
- Cai Liu
  - Department of Reproductive Medicine, Lanzhou University Second Hospital, Lanzhou, China
- Jinge Huang
  - Traditional Chinese and Western Medicine, Gansu University of Chinese Medicine, Lanzhou, China
- Fang Wang
  - Department of Reproductive Medicine, Lanzhou University Second Hospital, Lanzhou, China
4. Tao X, Jiang M, Liu Y, Hu Q, Zhu B, Hu J, Guo W, Wu X, Xiong Y, Shi X, Zhang X, Han X, Li W, Tong R, Long E. Predicting three-month fasting blood glucose and glycated hemoglobin changes in patients with type 2 diabetes mellitus based on multiple machine learning algorithms. Sci Rep 2023; 13:16437. [PMID: 37777593; PMCID: PMC10543442; DOI: 10.1038/s41598-023-43240-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Accepted: 09/21/2023] [Indexed: 10/02/2023] Open
Abstract
Fasting blood glucose (FBG) and glycosylated hemoglobin (HbA1c) are key indicators of blood glucose control in type 2 diabetes mellitus (T2DM) patients. The purpose of this study is to establish a predictive model for blood glucose changes in T2DM patients after 3 months of treatment, in support of personalized treatment. A retrospective study was conducted on real-world T2DM medical data from 4 cities in Sichuan Province, China, from January 2015 to December 2020. After data preprocessing, data inputting, data sampling, and feature screening, 16 machine learning methods were used to construct prediction models, and the 5 models with the best prediction performance were selected for each indicator. A total of 100,000 cases were included to establish the FBG model, and 2,169 cases were included to establish the HbA1c model. The best prediction models for both FBG and HbA1c were obtained with ensemble learning and modified random forest inputting, with AUC values of 0.819 and 0.970, respectively. The most important indicators in the FBG and HbA1c prediction models were FBG and HbA1c. Medication compliance, follow-up outcome, dietary habits, BMI, and waist circumference also had a substantial impact on FBG levels. The models for the two blood glucose control indicators show high prediction accuracy and have some clinical applicability. HbA1c and FBG are mutually important predictors, and there is a close relationship between them.
Affiliation(s)
- Xue Tao
  - Personalized Drug Therapy Key Laboratory of Sichuan Province, Department of Pharmacy, Sichuan Provincial People's Hospital, School of Medicine, University of Electronic Science and Technology of China, Chengdu, 610072, Sichuan, China
- Min Jiang
  - West China School of Public Health and West China Fourth Hospital, Sichuan University, Chengdu, 610044, Sichuan, China
- Yumeng Liu
  - Department of Pharmacy, Daping Hospital, Army Medical University, Chongqing, 400042, China
- Qi Hu
  - Hospital of Chengdu University of Traditional Chinese Medicine, Chengdu, 610072, Sichuan, China
- Baoqiang Zhu
  - School of Pharmacy, Southwest Medical University, Luzhou, 646000, Sichuan, China
- Jiaqiang Hu
  - Personalized Drug Therapy Key Laboratory of Sichuan Province, Department of Pharmacy, Sichuan Provincial People's Hospital, School of Medicine, University of Electronic Science and Technology of China, Chengdu, 610072, Sichuan, China
- Wenmei Guo
  - Personalized Drug Therapy Key Laboratory of Sichuan Province, Department of Pharmacy, Sichuan Provincial People's Hospital, School of Medicine, University of Electronic Science and Technology of China, Chengdu, 610072, Sichuan, China
- Xingwei Wu
  - Personalized Drug Therapy Key Laboratory of Sichuan Province, Department of Pharmacy, Sichuan Provincial People's Hospital, School of Medicine, University of Electronic Science and Technology of China, Chengdu, 610072, Sichuan, China
- Yu Xiong
  - Institute of Materia Medica, Chinese Academy of Medical Sciences/Peking Union Medical College, Beijing, 100050, China
- Xia Shi
  - Personalized Drug Therapy Key Laboratory of Sichuan Province, Department of Pharmacy, Sichuan Provincial People's Hospital, School of Medicine, University of Electronic Science and Technology of China, Chengdu, 610072, Sichuan, China
- Xueli Zhang
  - Sichuan Provincial Health Information Center, Chengdu, 610015, Sichuan, China
- Xu Han
  - Sichuan Provincial Health Information Center, Chengdu, 610015, Sichuan, China
- Wenyuan Li
  - Personalized Drug Therapy Key Laboratory of Sichuan Province, Department of Pharmacy, Sichuan Provincial People's Hospital, School of Medicine, University of Electronic Science and Technology of China, Chengdu, 610072, Sichuan, China
- Rongsheng Tong
  - Personalized Drug Therapy Key Laboratory of Sichuan Province, Department of Pharmacy, Sichuan Provincial People's Hospital, School of Medicine, University of Electronic Science and Technology of China, Chengdu, 610072, Sichuan, China
- Enwu Long
  - Personalized Drug Therapy Key Laboratory of Sichuan Province, Department of Pharmacy, Sichuan Provincial People's Hospital, School of Medicine, University of Electronic Science and Technology of China, Chengdu, 610072, Sichuan, China
5. Tyagi P, Sharma A, Semwal R, Tiwary US, Varadwaj PK. XGBoost odor prediction model: finding the structure-odor relationship of odorant molecules using the extreme gradient boosting algorithm. J Biomol Struct Dyn 2023:1-12. [PMID: 37723894; DOI: 10.1080/07391102.2023.2258415] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2023] [Accepted: 09/07/2023] [Indexed: 09/20/2023]
Abstract
Determining the structure-odor relationship has always been a very challenging task. The main challenge in investigating the correlation between a molecular structure and its associated odor is the ambiguous and obscure nature of verbally defined odor descriptors, particularly when the odorant molecules are from different sources. With recent developments in machine learning (ML) technology, ML and data analytic techniques are increasingly being used for quantitative structure-activity relationship (QSAR) studies in the chemistry domain, supporting knowledge discovery where traditional Edisonian methods have not been useful. The smell perception of odorant molecules is one such task, as olfaction is one of the least understood senses. In this study, the XGBoost odor prediction model was generated to classify smells of odorant molecules from their SMILES strings. We first collected a dataset of 1278 odorant molecules with seven basic odor descriptors, and then 1875 physicochemical properties of the odorant molecules were calculated. To obtain relevant physicochemical features, a feature reduction algorithm, PCA, was also employed. The ML model developed in this study was able to predict all seven basic smells with high precision (>99%) and high sensitivity (>99%) when tested on an independent test dataset. The results of the proposed study were also compared with three recently conducted studies. The results indicate that the XGBoost-PCA model performed better than the other models for predicting common odor descriptors. The methodology and ML model developed in this study may be helpful in understanding the structure-odor relationship. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Pankaj Tyagi
  - Department of Information Technology, Indian Institute of Information Technology Allahabad, Allahabad, India
- Anju Sharma
  - Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research (NIPER), Mohali, India
- Rahul Semwal
  - Department of Computer Sciences & Engineering, Indian Institute of Information Technology Nagpur, Nagpur, India
- Uma Shanker Tiwary
  - Department of Information Technology, Indian Institute of Information Technology Allahabad, Allahabad, India
- Pritish Kumar Varadwaj
  - Department of Bioinformatics and Applied Sciences, Indian Institute of Information Technology Allahabad, Allahabad, India
6. Wu L, Xiao F, Luo X, Yun K, Wen D, Lin J, Yang S, Li T, Xiang P, Shi Y. Predicting the retention time of Synthetic Cannabinoids using a combinatorial QSAR approach. Heliyon 2023; 9:e16671. [PMID: 37484220; PMCID: PMC10360586; DOI: 10.1016/j.heliyon.2023.e16671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2022] [Revised: 05/23/2023] [Accepted: 05/24/2023] [Indexed: 07/25/2023] Open
Abstract
Background Abuse of Synthetic Cannabinoids (SCs) has become a serious threat to public health. Because criminals continually modify their structures and chemical groups, detecting them is a major challenge in forensic toxicological identification. Therefore, rapid and efficient identification of SCs is important for forensic toxicology and drug bans. An analyte's predicted retention time in liquid chromatography is an important index for the qualitative analysis of compounds and can provide informatics solutions for the interpretation of chromatographic data. Methods In this study, experimental data from high-resolution mass spectrometry (HRMS) are used to construct a regression model for predicting the retention time of SCs using machine learning methods. The prediction ability of the model is improved by adopting a strategy that combines different descriptors in different independent machine-learning methods. Results The best model was obtained by combining Substructure Fingerprint Count and Fingerprinter features with the support vector regression (SVR) method; it exhibited an R² value of 0.81 for the validation set and 0.83 for the test set. In addition, the retention times of 4 new SCs were predicted by the optimized model, with a prediction error within 3%. Conclusions Our study provides a model that can predict the retention time of compounds and can be used as a filter to reduce false-positive candidates when combined with LC-HRMS, especially in the absence of reference standards. This can improve the confidence of identification in non-targeted analysis and the reliability of identifying unknown substances.
Affiliation(s)
- Lina Wu
  - Academy of Forensic Science, Shanghai Key Laboratory of Forensic Medicine, Shanghai 200063, PR China
  - Shanxi Medical University, Jinzhong 030600, PR China
- Fu Xiao
  - School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, PR China
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Science, 555 Zuchongzhi Road, Shanghai 201203, PR China
- Xiaomin Luo
  - School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, PR China
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Science, 555 Zuchongzhi Road, Shanghai 201203, PR China
- Keming Yun
  - Shanxi Medical University, Jinzhong 030600, PR China
- Di Wen
  - Hebei Medical University, Shijiazhuang 050017, PR China
- Jiaman Lin
  - Academy of Forensic Science, Shanghai Key Laboratory of Forensic Medicine, Shanghai 200063, PR China
  - Shanxi Medical University, Jinzhong 030600, PR China
- Shuo Yang
  - Academy of Forensic Science, Shanghai Key Laboratory of Forensic Medicine, Shanghai 200063, PR China
- Tianle Li
  - Shanxi Medical University, Jinzhong 030600, PR China
- Ping Xiang
  - Academy of Forensic Science, Shanghai Key Laboratory of Forensic Medicine, Shanghai 200063, PR China
- Yan Shi
  - Academy of Forensic Science, Shanghai Key Laboratory of Forensic Medicine, Shanghai 200063, PR China
7. Li H, He J, Li M, Li K, Pu X, Guo Y. Immune landscape-based machine-learning-assisted subclassification, prognosis, and immunotherapy prediction for glioblastoma. Front Immunol 2022; 13:1027631. [PMID: 36532035; PMCID: PMC9751405; DOI: 10.3389/fimmu.2022.1027631] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2022] [Accepted: 11/15/2022] [Indexed: 12/04/2022] Open
Abstract
Introduction As a malignant brain tumor, glioblastoma (GBM) is characterized by intratumor heterogeneity, a poor prognosis, and a highly invasive, lethal, and refractory nature. Immunotherapy has become a promising strategy for treating diverse cancers. It is known that immunosuppressive microenvironments are highly heterogeneous among the GBM molecular subtypes, which mainly include classical (CL), mesenchymal (MES), and proneural (PN). Therefore, an in-depth understanding of the immune landscapes among them is essential for identifying novel immune markers of GBM. Methods and results In the present study, based on the largest collected set of 109 immune signatures, we aim to achieve precise diagnosis, prognosis, and immunotherapy prediction for GBM by performing a comprehensive immunogenomic analysis. First, machine-learning (ML) methods were used to evaluate the diagnostic values of these immune signatures, and the optimal classifier was constructed for accurate recognition of the three GBM subtypes with robust and promising performance. The prognostic values of these signatures were then confirmed, and a risk score was established to divide all GBM patients into high-, medium-, and low-risk groups with high predictive accuracy for overall survival (OS). A complete differential analysis across GBM subtypes was then performed in terms of immune characteristics along with clinicopathological and molecular features, which indicated that MES shows much higher immune heterogeneity than CL and PN but has significantly better immunotherapy responses, although MES patients may have an immunosuppressive microenvironment and be more proinflammatory and invasive. Finally, drug sensitivity analysis showed the MES subtype to be more sensitive to 17-AAG, docetaxel, and erlotinib, and three compounds (AS-703026, PD-0325901, and MEK1-2-inhibitor) might be potential therapeutic agents.
Conclusion Overall, the findings of this research could help enhance our understanding of the tumor immune microenvironment and provide new insights for improving the prognosis and immunotherapy of GBM patients.
8. DBP-iDWT: Improving DNA-Binding Proteins Prediction Using Multi-Perspective Evolutionary Profile and Discrete Wavelet Transform. Computational Intelligence and Neuroscience 2022; 2022:2987407. [PMID: 36211019; PMCID: PMC9534628; DOI: 10.1155/2022/2987407] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2022] [Revised: 08/19/2022] [Accepted: 09/09/2022] [Indexed: 11/17/2022]
Abstract
DNA-binding proteins (DBPs) have crucial biotic activities including DNA replication, recombination, and transcription. DBPs are closely associated with chronic diseases and are used in the manufacturing of antibiotics and steroids. A series of predictors have been established to identify DBPs. However, researchers are still working to further enhance the identification of DBPs. This research designed a novel predictor to identify DBPs more accurately. The features from the sequences are transformed by F-PSSM (filtered position-specific scoring matrix), PSSM-DPC (position-specific scoring matrix-dipeptide composition), and R-PSSM (reduced position-specific scoring matrix). To eliminate the noisy attributes, we extended DWT (discrete wavelet transform) to F-PSSM, PSSM-DPC, and R-PSSM and introduced three novel descriptors, namely, F-PSSM-DWT, PSSM-DPC-DWT, and R-PSSM-DWT. Four models were then trained using LiXGB (Light eXtreme gradient boosting), XGB (eXtreme gradient boosting), ERT (extremely randomized trees), and AdaBoost. LiXGB with R-PSSM-DWT attained 6.55% higher accuracy on the training dataset and 5.93% higher accuracy on the testing dataset than the best existing predictors. The results reveal the excellent performance of our novel predictor over past studies. DBP-iDWT would be fruitful for establishing more operative therapeutic strategies for fatal disease treatment.
9. Comparative Analysis on Alignment-Based and Pretrained Feature Representations for the Identification of DNA-Binding Proteins. Computational and Mathematical Methods in Medicine 2022; 2022:5847242. [PMID: 35799660; PMCID: PMC9256349; DOI: 10.1155/2022/5847242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2022] [Accepted: 06/07/2022] [Indexed: 11/17/2022]
Abstract
The interaction between DNA and protein is vital for the development of a living body. Numerous previous studies on in silico identification of DNA-binding proteins (DBPs) use features extracted from the alignment-based (pseudo) position-specific scoring matrix (PSSM), whose time-consuming generation limits their application. Few researchers have paid attention to applying pretrained language models at the scale of evolution to the identification of DBPs. To this end, we present a comprehensive comparison of alignment-based PSSM and pretrained evolutionary scale modeling (ESM) representations in the field of DBP classification. The comparison is conducted by extracting information from PSSM and ESM representations using four unified averaging operations and by performing various feature selection (FS) methods. Experimental results demonstrate that the pretrained ESM representation outperforms the PSSM-derived features in a fair comparison. The pretrained feature representation deserves wide application in in silico DBP identification as well as other function annotation problems. Finally, it is also confirmed that an ensemble scheme aggregating various trained FS models can significantly improve the classification performance of DBPs.
10. Zhang C, Mou M, Zhou Y, Zhang W, Lian X, Shi S, Lu M, Sun H, Li F, Wang Y, Zeng Z, Li Z, Zhang B, Qiu Y, Zhu F, Gao J. Biological activities of drug inactive ingredients. Brief Bioinform 2022; 23:6582006. [PMID: 35524477; DOI: 10.1093/bib/bbac160] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 02/28/2022] [Revised: 04/01/2022] [Accepted: 04/09/2022] [Indexed: 02/06/2023] Open
Abstract
In a drug formulation (DFM), the major components by mass are not the Active Pharmaceutical Ingredient (API) but rather Drug Inactive Ingredients (DIGs). DIGs can reach much higher concentrations than those achieved by the API, which raises great concerns about their clinical toxicities. Therefore, data on the biological activities of DIGs on physiologically relevant targets are in wide demand in both clinical investigation and the pharmaceutical industry. However, such activity data are not available in any existing pharmaceutical knowledge base, and their potential for predicting DIG-target interactions has not yet been evaluated. In this study, a comprehensive assessment and analysis of the biological activities of DIGs was therefore conducted. First, the largest number of DIGs and DFMs were systematically curated and confirmed based on all drugs approved by the US Food and Drug Administration. Second, comprehensive activities for both DIGs and DFMs were provided to the pharmaceutical community for the first time. Third, the biological targets of each DIG and formulation were fully cross-referenced to available databases that describe their pharmaceutical/biological characteristics. Finally, a variety of popular artificial intelligence techniques were used to assess the predictive potential of DIGs' activity data, which was the first evaluation of the possibility of predicting DIG activity. As the activities of DIGs are critical for current pharmaceutical studies, this work is expected to have significant implications for the future practice of drug discovery and precision medicine.
Affiliation(s)
- Chenyang Zhang
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Minjie Mou
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Ying Zhou
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
  - State Key Laboratory for Diagnosis and Treatment of Infectious Disease, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, Zhejiang Provincial Key Laboratory for Drug Clinical Research and Evaluation, The First Affiliated Hospital, Zhejiang University, 79 QingChun Road, Hangzhou, Zhejiang 310000, China
- Wei Zhang
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Xichen Lian
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Shuiyang Shi
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Mingkun Lu
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Huaicheng Sun
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Fengcheng Li
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Yunxia Wang
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Zhenyu Zeng
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Zhaorong Li
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Bing Zhang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Yunqing Qiu
  - State Key Laboratory for Diagnosis and Treatment of Infectious Disease, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, Zhejiang Provincial Key Laboratory for Drug Clinical Research and Evaluation, The First Affiliated Hospital, Zhejiang University, 79 QingChun Road, Hangzhou, Zhejiang 310000, China
- Feng Zhu
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Jianqing Gao
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
  - Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang, China
11. Sokhansanj BA, Rosen GL. Mapping Data to Deep Understanding: Making the Most of the Deluge of SARS-CoV-2 Genome Sequences. mSystems 2022; 7:e0003522. [PMID: 35311562; PMCID: PMC9040592; DOI: 10.1128/msystems.00035-22] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Accepted: 02/27/2022] [Indexed: 12/22/2022] Open
Abstract
Next-generation sequencing has been essential to the global response to the COVID-19 pandemic. As of January 2022, nearly 7 million severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) sequences are available to researchers in public databases. Sequence databases are an abundant resource from which to extract biologically relevant and clinically actionable information. As the pandemic has gone on, SARS-CoV-2 has rapidly evolved, involving complex genomic changes that challenge current approaches to classifying SARS-CoV-2 variants. Deep sequence learning could be a potentially powerful way to build complex sequence-to-phenotype models. Unfortunately, while they can be predictive, deep learning typically produces "black box" models that cannot directly provide biological and clinical insight. Researchers should therefore consider implementing emerging methods for visualizing and interpreting deep sequence models. Finally, researchers should address important data limitations, including (i) global sequencing disparities, (ii) insufficient sequence metadata, and (iii) screening artifacts due to poor sequence quality control.
Affiliation(s)
- Bahrad A. Sokhansanj
  - Drexel University, Ecological and Evolutionary Signal-Processing and Informatics Laboratory, Department of Electrical & Computer Engineering, College of Engineering, Philadelphia, Pennsylvania, USA
- Gail L. Rosen
  - Drexel University, Ecological and Evolutionary Signal-Processing and Informatics Laboratory, Department of Electrical & Computer Engineering, College of Engineering, Philadelphia, Pennsylvania, USA