Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Total Articles

52
(from Reference Citation Analysis)

Article PDFs (6)

Cited by > 0 (38)

Searched Name

Qian-Nan Hu

Number	Citation Analysis
1	High-throughput prediction of enzyme promiscuity based on substrate-product pairs. Brief Bioinform 2024;25:bbae089. [PMID: 38487850 PMCID: PMC10940840 DOI: 10.1093/bib/bbae089] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Revised: 01/20/2024] [Accepted: 02/03/2024] [Indexed: 03/18/2024] Abstract: The screening of enzymes for catalyzing specific substrate-product pairs is often constrained in the realms of metabolic engineering and synthetic biology. Existing tools based on substrate and reaction similarity predominantly rely on prior knowledge, demonstrating limited extrapolative capabilities and an inability to incorporate custom candidate-enzyme libraries. Addressing these limitations, we have developed the Substrate-product Pair-based Enzyme Promiscuity Prediction (SPEPP) model. This innovative approach utilizes transfer learning and transformer architecture to predict enzyme promiscuity, thereby elucidating the intricate interplay between enzymes and substrate-product pairs. SPEPP exhibited robust predictive ability, eliminating the need for prior knowledge of reactions and allowing users to define their own candidate-enzyme libraries. It can be seamlessly integrated into various applications, including metabolic engineering, de novo pathway design, and hazardous material degradation. To better assist metabolic engineers in designing and refining biochemical pathways, particularly those without programming skills, we also designed EnzyPick, an easy-to-use web server for enzyme screening based on SPEPP. EnzyPick is accessible at http://www.biosynther.com/enzypick/.
Key Words: deep learning; enzyme promiscuity; enzyme screening; substrate-product pair; web server. Grants: 2018YFA0900704 (National Key Research and Development Program of China); 153D31KYSB20170121 (International Partnership Program of the Chinese Academy of Sciences of China).
2	Epidemiological investigation and proteomic profiling of typical TCM syndrome in HIV/AIDS immunological nonresponders. Anat Rec (Hoboken) 2023;306:3106-3119. [PMID: 35775967 DOI: 10.1002/ar.25018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2022] [Revised: 05/11/2022] [Accepted: 06/02/2022] [Indexed: 11/09/2022] Abstract: The HIV/AIDS pandemic remains the world's most severe public health challenge, especially for HIV/AIDS immunological nonresponders (HIV/AIDS-INRs), who tend to have higher mortality. Due to the advantages in promoting patients' immune reconstitution, Traditional Chinese medicine (TCM) has become one of the mainstays of complementary treatments for HIV/AIDS-INRs. Given that effective TCM treatments largely depend on precise syndrome differentiation, there is an increasing interest in exploring biological evidence for the classification of TCM syndromes in HIV/AIDS-INRs. In our study, to identify the typical HIV/AIDS-INRs syndrome, an epidemiological survey was first conducted in the Liangshan prefecture (China), a high HIV/AIDS prevalence region. The key TCM syndrome, Yang deficiency of spleen and kidney (YDSK), was evaluated by using a tandem mass tag combined with liquid chromatography-tandem mass spectrometry (TMT-LC-MS/MS). A total of 62 differentially expressed proteins (DEPs) of YDSK syndrome compared with healthy people were screened out. Comparative bioinformatics analyses showed that DEPs in YDSK syndrome were mainly associated with response to wounding and acute inflammatory response in the biological process. The pathway annotation is mainly enriched in complement and coagulation cascades. Finally, the YDSK syndrome-specific DEPs such as HP and S100A9 were verified by ELISA, and confirmed as potential biomarkers for YDSK syndrome.
Our study may lay the biological and scientific basis for the specificity of TCM syndromes in HIV/AIDS-INRs, and may provide more opportunities for a deeper understanding of TCM syndromes and the development of more effective and stable TCM treatments for HIV/AIDS-INRs. Key Words: HIV/AIDS; epidemiological investigation; immunological nonresponders; proteomics; syndrome; traditional Chinese medicine. MESH Headings: Humans; Acquired Immunodeficiency Syndrome/diagnosis; Acquired Immunodeficiency Syndrome/epidemiology; Medicine, Chinese Traditional/methods; Chromatography, Liquid; Proteomics; Tandem Mass Spectrometry.
3	MCF2Chem: A manually curated knowledge base of biosynthetic compound production. Biotechnology for Biofuels and Bioproducts 2023;16:167. [PMID: 37925500 PMCID: PMC10625697 DOI: 10.1186/s13068-023-02419-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Accepted: 10/23/2023] [Indexed: 11/06/2023] Abstract: BACKGROUND: Microbes have been used as cell factories to synthesize various chemical compounds. Recent advances in synthetic biological technologies have accelerated the increase in the number and capacity of microbial cell factories; the variety and number of synthetic compounds produced via these cell factories have also grown substantially. However, no database is available that provides detailed information on the microbial cell factories and the synthesized compounds. RESULTS: In this study, we established MCF2Chem, a manually curated knowledge base on the production of biosynthetic compounds using microbial cell factories. It contains 8888 production records related to 1231 compounds that were synthesizable by 590 microbial cell factories, including the production data of compounds (titer, yield, productivity, and content), strain culture information (culture medium, carbon source/precursor/substrate), fermentation information (mode, vessel, scale, and condition), and other information (e.g., strain modification method). The database contains statistical analysis data on compounds and microbial species. The data statistics of MCF2Chem showed that bacteria accounted for 60% of the species and that "fatty acids", "terpenoids", and "shikimates and phenylpropanoids" accounted for the top three chemical products. Escherichia coli, Saccharomyces cerevisiae, Yarrowia lipolytica, and Corynebacterium glutamicum synthesized 78% of these chemical compounds.
Furthermore, we constructed a system to recommend microbial cell factories suitable for synthesizing target compounds and vice versa by combining MCF2Chem data, additional strain- and compound-related data, the phylogenetic relationships between strains, and compound similarities. CONCLUSIONS: MCF2Chem provides a user-friendly interface for querying, browsing, and visualizing detailed statistical information on microbial cell factories and their synthesizable compounds. It is publicly available at https://mcf.lifesynther.com. This database may serve as a useful resource for synthetic biologists. Key Words: Biochemical product; Microbial cell factory; Production database; Recommendation system; Synthetic biology. Grants: 2019YFA0904300 (National Key Research and Development Program of China); 153D31KYSB20170121 (International Partnership Program of the Chinese Academy of Sciences of China).
4	RDBridge: a knowledge graph of rare diseases based on large-scale text mining. Bioinformatics 2023;39:btad440. [PMID: 37458501 PMCID: PMC10368801 DOI: 10.1093/bioinformatics/btad440] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2023] [Revised: 06/25/2023] [Accepted: 07/14/2023] [Indexed: 07/27/2023] Abstract: MOTIVATION: Despite low prevalence, rare diseases affect 300 million people worldwide. Research on pathogenesis and drug development lags due to limited commercial potential, insufficient epidemiological data, and a dearth of publications. The unique characteristics of rare diseases, including limited annotated data, intricate processes for extracting pertinent entity relationships, and difficulties in standardizing data, represent challenges for text mining. RESULTS: We developed a rare disease data acquisition framework using text mining and knowledge graphs and constructed the most comprehensive rare disease knowledge graph to date, Rare Disease Bridge (RDBridge). RDBridge offers search functions for genes, potential drugs, pathways, literature, and medical imaging data that will support mechanistic research, drug development, diagnosis, and treatment for rare diseases. AVAILABILITY AND IMPLEMENTATION: RDBridge is freely available at http://rdb.lifesynther.com/. Grants: 2018YFA0900700 (National Key Research and Development Program of China).
5	CCIBP: a comprehensive cosmetic ingredients bioinformatics platform. Bioinformatics 2023;39:btad416. [PMID: 37399096 PMCID: PMC10345691 DOI: 10.1093/bioinformatics/btad416] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Revised: 05/30/2023] [Accepted: 07/03/2023] [Indexed: 07/05/2023] Abstract: SUMMARY: Cosmetics form an important part of our daily lives, and it is therefore important to understand the basic physicochemical properties, metabolic pathways, and toxicological and safe concentrations of these cosmetics molecules. Therefore, the comprehensive cosmetic ingredients bioinformatics platform (CCIBP) was developed here; it is a unique comprehensive cosmetic database providing information on regulations, physicochemical properties, and human metabolic pathways for cosmetic molecules from major regions of the world, whilst also correlating plant information in natural products. CCIBP supports formulation analysis, efficacy component analysis, and also combines knowledge of synthetic biology to facilitate access to natural molecules and biosynthetic production. CCIBP, empowered with chemoinformatics, bioinformatics, and synthetic biology data and tools, presents a very helpful platform for cosmetic research and development of ingredients. AVAILABILITY AND IMPLEMENTATION: CCIBP is available at: http://design.rxnfinder.org/cosing/. MESH Headings: Humans; Metabolic Networks and Pathways; Databases, Factual; Cosmetics; Computational Biology; Biological Products. Grants: National Key Research and Development Program of China; Chinese Academy of Sciences of China.
6	Factors impacting the behavior of phytoremediation in pesticide-contaminated environment: A meta-analysis. The Science of the Total Environment 2023:164418. [PMID: 37257596 DOI: 10.1016/j.scitotenv.2023.164418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2023] [Revised: 05/17/2023] [Accepted: 05/21/2023] [Indexed: 06/02/2023] Abstract: Phytoremediation provides substantial advantages, including eco-friendliness, cost-effectiveness, efficiency, and visual appeal. However, the current knowledge of the factors influencing phytoremediation in pesticide-contaminated environments remains limited. It is critical to understand phytoremediation and the factors affecting the variation in removal efficiency. In this study, we compiled 72 previous research articles to quantify plant-induced improvements in removal efficiency and identify factors that influence variations in phytoremediation behavior through meta-analysis. We observed a significant increase in the removal efficiency of phytoremediation compared to the control group which did not involve phytoremediation. Pesticides significantly affect removal efficiency in terms of their modes of action, substance group, and properties. Plants demonstrated higher efficiency in remediating environments contaminated with pesticides possessing lower molecular masses and log K_ow values. Plant species emerged as a crucial determinant of variations in removal efficiency. Annual plants exhibited a 1.45-fold higher removal efficiency than perennial plants. The removal efficiencies of different plant types decreased in the following order: agri-food crops > aquatic macrophytes > turfgrasses > medicinal plants > forage crops > woody trees. The Gramineae family, which was the most prevalent, demonstrated a robust and consistent phytoremediation ability.
This study offers a more comprehensive triangular relationship between removal efficiency, pesticides, and plants, expanding the traditional linear model. Our findings offer valuable insights into the behavior of phytoremediation in pesticide-contaminated environments and the factors determining its success, ultimately guiding further research toward developing strategies for higher removal efficiency in phytoremediation. Key Words: Annual plants; Phytoremediation; Removal efficiency; Residual pesticides; meta-analysis.
7	Data-Driven Prediction of Molecular Biotransformations in Food Fermentation. Journal of Agricultural and Food Chemistry 2023. [PMID: 37218994 DOI: 10.1021/acs.jafc.3c01172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023] Abstract: Fermentation products, together with food components, determine the sense, nutrition, and safety of fermented foods. Traditional methods of fermentation product identification are time-consuming and cumbersome, and cannot meet the increasing need for the identification of the extensive bioactive metabolites produced during food fermentation. Hence, we propose a data-driven integrated platform (FFExplorer, http://www.rxnfinder.org/ffexplorer/) based on machine learning and data on 2,192,862 microbial sequence-encoded enzymes for computational prediction of fermentation products. Using FFExplorer, we explained the mechanism behind the disappearance of spicy taste during pepper fermentation and evaluated the detoxification effects of microbial fermentation for common food contaminants. FFExplorer will provide a valuable reference for inferring bioactive "dark matter" in fermented foods and exploring the application potential of microorganisms. Key Words: active ingredients; food microbiology; machine learning; metabolite; synthetic biology.
8	SynBioTools: a one-stop facility for searching and selecting synthetic biology tools. BMC Bioinformatics 2023;24:152. [PMID: 37069545 PMCID: PMC10111727 DOI: 10.1186/s12859-023-05281-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2023] [Accepted: 04/11/2023] [Indexed: 04/19/2023] Abstract: BACKGROUND: The rapid development of synthetic biology relies heavily on the use of databases and computational tools, which are also developing rapidly. While many tool registries have been created to facilitate tool retrieval, sharing, and reuse, no relatively comprehensive tool registry or catalog addresses all aspects of synthetic biology. RESULTS: We constructed SynBioTools, a comprehensive collection of synthetic biology databases, computational tools, and experimental methods, as a one-stop facility for searching and selecting synthetic biology tools. SynBioTools includes databases, computational tools, and methods extracted from reviews via SCIentific Table Extraction, a scientific table-extraction tool that we built. Approximately 57% of the resources that we located and included in SynBioTools are not mentioned in bio.tools, the dominant tool registry. To improve users' understanding of the tools and to enable them to make better choices, the tools are grouped into nine modules (each with subdivisions) based on their potential biosynthetic applications. Detailed comparisons of similar tools in every classification are included. The URLs, descriptions, source references, and the number of citations of the tools are also integrated into the system. CONCLUSIONS: SynBioTools is freely available at https://synbiotools.lifesynther.com/. It provides end-users and developers with a useful resource of categorized synthetic biology databases, tools, and methods to facilitate tool retrieval and selection.
Key Words: Computational tool; Database; Synthetic biology; Table extraction; Tool registry; Tool retrieval. Grants: 2021YFC2103001, 2019YFA0904300 (National Key Research and Development Program of China); 153D31KYSB20170121 (International Partnership Program of the Chinese Academy of Sciences of China).
9	Proteomic investigation and biomarker identification of lung and spleen deficiency syndrome in HIV/AIDS immunological nonresponders. Annals of Translational Medicine 2023. [DOI: 10.21037/atm-23-280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2023]
10	Proteomic investigation and biomarker identification of lung and spleen deficiency syndrome in HIV/AIDS immunological nonresponders. J Thorac Dis 2023;15:1460-1472. [PMID: 37065569 PMCID: PMC10089843 DOI: 10.21037/jtd-23-322] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2023] [Accepted: 02/27/2023] [Indexed: 03/31/2023] Abstract: Background: Human immunodeficiency virus (HIV) and acquired immune deficiency syndrome (AIDS) immunological nonresponders (HIV/AIDS-INRs) whose CD4⁺ cell counts do not rebound after highly active antiretroviral therapy (HAART) treatment usually experience severely impaired immune function and high mortality. Traditional Chinese medicine (TCM) has many advantages in the field of AIDS, especially its promotion of patients' immune reconstitution. Accurate differentiation of TCM syndromes is a prerequisite for guiding an effective TCM prescription. However, the objective and biological evidence for identification of the TCM syndromes in HIV/AIDS-INRs remains lacking. Lung and spleen deficiency (LSD) syndrome, a typical HIV/AIDS-INR syndrome, was examined in this study. Methods: We first performed a proteomic study of LSD syndrome in INRs (INRs-LSD) using tandem mass tag combined with liquid chromatography-tandem mass spectrometry (TMT-LC-MS/MS) and screened them against the healthy and undocumented identifiable groups. The TCM syndrome-specific proteins were subsequently validated based on bioinformatics analysis and enzyme-linked immunosorbent assay (ELISA). Results: A total of 22 differentially expressed proteins (DEPs) were screened in INRs-LSD compared to the healthy group. Based on bioinformatic analysis, these DEPs were found to be mainly associated with the immunoglobulin A (IgA)-generated intestinal immune network.
In addition, we examined the TCM syndrome-specific proteins alpha-2-macroglobulin (A2M) and human selectin L (SELL) with ELISA and found that they were both upregulated, which was consistent with the proteomic screening results. Conclusions: A2M and SELL were finally identified as potential biomarkers for INRs-LSD, providing a scientific and biological basis for identifying typical TCM syndromes in HIV/AIDS-INRs and an opportunity to build a more effective TCM treatment system for HIV/AIDS-INRs.
11	BioBulkFoundary: a customized webserver for exploring biosynthetic potentials of bulk chemicals. Bioinformatics 2022;38:5137-5138. [PMID: 36130260 DOI: 10.1093/bioinformatics/btac640] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2022] [Revised: 08/28/2022] [Accepted: 09/20/2022] [Indexed: 12/24/2022] Abstract: SUMMARY: Advances in metabolic engineering have boosted the production of bulk chemicals, resulting in tons of production volumes of some bulk chemicals with very low prices. A decrease in the production cost and overproduction of bulk chemicals makes it necessary and desirable to explore the potential to synthesize higher-value products from them. It is also useful and important for society to explore the use of design methods involving synthetic biology to increase the economic value of these bulk chemicals. Therefore, we developed 'BioBulkFoundary', which provides an elaborate analysis of the biosynthetic potential of bulk chemicals based on the state-of-the-art exploration of pathways to synthesize value-added chemicals, along with an associated comprehensive technological and economic database, in a user-friendly framework. AVAILABILITY AND IMPLEMENTATION: Freely available on the web at http://design.rxnfinder.org/biobulkfoundary/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
12	Species identity and combinations differ in their overall benefits to Astragalus adsurgens plants inoculated with single or multiple endophytic fungi under drought conditions. Frontiers in Plant Science 2022;13:933738. [PMID: 36160950 PMCID: PMC9490189 DOI: 10.3389/fpls.2022.933738] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/01/2022] [Accepted: 08/08/2022] [Indexed: 06/16/2023] Abstract: Although desert plants often establish multiple simultaneous symbiotic associations with various endophytic fungi in their roots, most studies focus on single fungus inoculation. Therefore, combined inoculation of multiple fungi should be applied to simulate natural habitats with the presence of a local microbiome. Here, a pot experiment was conducted to test the synergistic effects between three extremely arid habitat-adapted root endophytes (Alternaria chlamydospora, Sarocladium kiliense, and Monosporascus sp.). For that, we compared the effects of single fungus vs. combined fungi inoculation, on plant morphology and rhizospheric soil microhabitat of desert plant Astragalus adsurgens grown under drought and non-sterile soil conditions. The results indicated that fungal inoculation mainly influenced root biomass of A. adsurgens, but did not affect the shoot biomass. Both single fungus and combined inoculation decreased plant height (7-17%), but increased stem branching numbers (13-34%). However, fungal inoculation influenced the root length and surface area depending on their species and combinations, with the greatest benefits occurring on S. kiliense inoculation alone and its co-inoculation with Monosporascus sp. (109% and 61%; 54% and 42%). Although A. chlamydospora and co-inoculations with S. kiliense and Monosporascus sp. also appeared to promote root growth, these inoculations resulted in obvious soil acidification.
Despite no observed root growth promotion, Monosporascus sp. associated with its combined inoculations maximally facilitated soil organic carbon accumulation. However, noticeably, combined inoculation of the three species had no significant effects on root length, surface area, and biomass, but promoted rhizospheric fungal diversity and abundance most, with Sordariomycetes being the dominant fungal group. This indicates the response of plant growth to fungal inoculation may be different from that of the rhizospheric fungal community. Structural equation modeling also demonstrated that fungal inoculation significantly influenced the interactions among the growth of A. adsurgens, soil factors, and rhizospheric fungal groups. Our findings suggest that, based on species-specific and combinatorial effects, endophytic fungi enhanced the plant root growth, altered soil nutrients, and facilitated rhizospheric fungal community, possibly contributing to desert plant performance and ecological adaptability. These results will provide the basis for evaluating the potential application of fungal inoculants for developing sustainable management for desert ecosystems. Key Words: combination inoculation; drought stress; promoting effects; root endophytes; soil fungal community. Grants: National Natural Science Foundation of China.
13	Elimination of Fusarium mycotoxin deoxynivalenol (DON) via microbial and enzymatic strategies: Current status and future perspectives. Trends Food Sci Technol 2022. [DOI: 10.1016/j.tifs.2022.04.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 12/01/2022]
14	SynBioStrainFinder: A microbial strain database of manually curated CRISPR/Cas genetic manipulation system information for biomanufacturing. Microb Cell Fact 2022;21:87. [PMID: 35568950 PMCID: PMC9107733 DOI: 10.1186/s12934-022-01813-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/08/2022] [Accepted: 05/02/2022] [Indexed: 11/29/2022] Abstract: BACKGROUND: Microbial strain information databases provide valuable data for microbial basic research and applications. However, they rarely contain information on the genetic operating system of microbial strains. RESULTS: We established a comprehensive microbial strain database, SynBioStrainFinder, by integrating CRISPR/Cas gene-editing system information with cultivation methods, genome sequence data, and compound-related information. It is presented through three modules, Strain2Gms/PredStrain2Gms, Strain2BasicInfo, and Strain2Compd, which combine to form a rapid strain information query system conveniently curated, integrated, and accessible on a single platform. To date, 1426 CRISPR/Cas gene-editing records of 157 microbial strains have been manually extracted from the literature in the Strain2Gms module. For strains without established CRISPR/Cas systems, the PredStrain2Gms module recommends the system of the most closely related strain as a reference to facilitate the construction of a new CRISPR/Cas gene-editing system. The database contains 139,499 records of strain cultivation and genome sequences, and 773,298 records of strain-related compounds. To facilitate simple and intuitive data application, all microbial strains are also labeled with stars based on the order and availability of strain information.
SynBioStrainFinder provides a user-friendly interface for querying, browsing, and visualizing detailed information on microbial strains, and it is publicly available at http://design.rxnfinder.org/biosynstrain/. CONCLUSION: SynBioStrainFinder is the first microbial strain database with manually curated information on the strain CRISPR/Cas system as well as other microbial strain information. It also provides reference information for the construction of new CRISPR/Cas systems. SynBioStrainFinder will serve as a useful resource to extend microbial strain research and application for biomanufacturing. Key Words: CRISPR/Cas system; Database; Gene-editing method; Genome sequence; Microorganism; Strain cultivation; Strain-related compound. MESH Headings: CRISPR-Cas Systems; Gene Editing. Grants: 2019YFA0904300 (National Key Research and Development Program of China); 31700081 (National Natural Science Foundation of China); 31570092 (National Natural Science Foundation of China); QYZDB-SSW-SMC012 (CAS STS program); 153D31KYSB20170121 (International Partnership Program of Chinese Academy of Sciences of China).
15	Development of 3D-QSAR models for predicting the activities of chemicals to stimulate muscle growth via β₂-adrenoceptor. Toxicol In Vitro 2021;77:105251. [PMID: 34601065 DOI: 10.1016/j.tiv.2021.105251] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2021] [Revised: 09/03/2021] [Accepted: 09/27/2021] [Indexed: 10/20/2022] Abstract: β₂-adrenoceptor (β₂AR) agonists can stimulate skeletal muscle growth. Their illegal use in food-producing animals, human athletes and bodybuilders causes adverse health effects. In the present study, we developed 3D-QSAR models for predicting the activity of chemicals which can stimulate skeletal muscle growth through β₂AR. The activity of 25 β₂AR agonists was measured by β₂AR-cAMP response element (CRE)-luciferase (Luc) reporter assay. The 3D-QSAR models were built using comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA). The CoMFA and CoMSIA models displayed high external predictability (R² 0.996 and 0.992, respectively) and good statistical robustness, and revealed that electrostatic effects were the most prominent forces influencing the activity of β₂AR agonists. The CoMFA and CoMSIA contour plots provided clues regarding the main chemical features responsible for the activity variations and also resulted in predictions which correlate very well with the observed activity. In vitro study with differentiated myotubes showed that the potency orders of β₂AR agonists in activating the β₂AR-CRE-Luc reporter and in upregulating CREB target genes related to muscle growth were consistent. These 3D-QSAR models provide tools for predicting the activity of chemicals which might be illegally used in livestock or humans to stimulate skeletal muscle growth.
Key Words: 3D-QSAR models; CRE-luciferase reporter; CoMFA; CoMSIA; Myotube gene expression; β₂AR agonists.
16	Cell2Chem: mining explored and unexplored biosynthetic chemical spaces. Bioinformatics 2021;36:5269-5270. [PMID: 32697815 DOI: 10.1093/bioinformatics/btaa660] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/20/2020] [Revised: 06/14/2020] [Accepted: 07/16/2020] [Indexed: 11/12/2022] Abstract: SUMMARY: Living cell strains have important applications in synthesizing their native compounds and potential for use in studies exploring the universal chemical space. Here, we present a web server named Cell2Chem, which accelerates the search for explored compounds in organisms, facilitating investigations of biosynthesis in unexplored chemical spaces. Cell2Chem uses co-occurrence networks and natural language processing to provide a systematic method for linking living organisms to biosynthesized compounds and the processes that produce these compounds. The Cell2Chem platform comprises 40 370 species and 125 212 compounds. Using reaction pathway and enzyme function in silico prediction methods, Cell2Chem reveals possible biosynthetic pathways of compounds and catalytic functions of proteins to expand unexplored biosynthetic chemical spaces. Cell2Chem can help improve biosynthesis research and enhance the efficiency of synthetic biology. AVAILABILITY AND IMPLEMENTATION: Cell2Chem is available at: http://www.rxnfinder.org/cell2chem/.
17	The Matrix Expression, Topological Index and Atomic Attribute of Molecular Topological Structure. ACTA ACUST UNITED AC 2021. [DOI: 10.6339/jds.2003.01(4).172] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 11/09/2022]
18	SARS2020: an integrated platform for identification of novel coronavirus by a consensus sequence-function model. Bioinformatics 2021;37:1182-1183. [PMID: 32871007 PMCID: PMC7558763 DOI: 10.1093/bioinformatics/btaa767] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/31/2020] [Revised: 08/21/2020] [Accepted: 08/25/2020] [Indexed: 12/02/2022] Open Abstract Motivation The 2019 novel coronavirus outbreak has significantly affected global health and society. Thus, predicting biological function from pathogen sequence is crucial and urgently needed. However, little work has been performed to identify viruses by the enzymes that they encode, and which are key to pathogen propagation. Results We built a comprehensive scientific resource, SARS2020, that integrates coronavirus-related research, genomic sequences, and results of anti-viral drug trials. In addition, we built a consensus sequence-catalytic function model from which we identified the novel coronavirus as encoding the same proteinase as the Severe Acute Respiratory Syndrome virus. This data-driven sequence-based strategy will enable rapid identification of agents responsible for future epidemics. Availability SARS2020 is available at http://design.rxnfinder.org/sars2020/.
19	ChemHub: a knowledgebase of functional chemicals for synthetic biology studies. Bioinformatics 2021;37:4275-4276. [PMID: 33970229 DOI: 10.1093/bioinformatics/btab360] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/02/2020] [Revised: 03/29/2021] [Accepted: 05/07/2021] [Indexed: 11/14/2022] Open Abstract SUMMARY The field of synthetic biology lacks a comprehensive knowledgebase for selecting synthetic target molecules according to their functions, economic applications, and known biosynthetic pathways. We implemented ChemHub, a knowledgebase containing >90,000 chemicals and their functions, along with related biosynthesis information for these chemicals that was manually extracted from >600,000 published studies by more than 100 people over the past 10 years. AVAILABILITY AND IMPLEMENTATION Multiple algorithms were implemented to enable biosynthetic pathway design and precursor discovery, which can support investigation of the biosynthetic potential of these functional chemicals. ChemHub is freely available at: http://www.rxnfinder.org/chemhub/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
20	Transcriptor: a comprehensive platform for annotation of the enzymatic functions of transcripts. Bioinformatics 2021;37:434-435. [PMID: 32717064 DOI: 10.1093/bioinformatics/btaa685] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2020] [Revised: 05/29/2020] [Accepted: 07/22/2020] [Indexed: 11/14/2022] Open Abstract MOTIVATION Rapid advances in sequencing technology have resulted in huge increases in the accessibility of sequencing data. Moreover, researchers are focusing more on organisms that lack a reference genome. However, few easy-to-use web servers focusing on annotations of enzymatic functions are available. Accordingly, in this study, we describe Transcriptor, a novel platform for annotating transcripts encoding enzymes. RESULTS The transcripts were evaluated using more than 300 000 in-house enzymatic reactions through bridges of Enzyme Commission numbers. Transcriptor also enabled ontology term identification along with associated enzymes, visualization and prediction of domains, and annotation of regulatory structures, such as long noncoding RNAs, which could facilitate the discovery of new functions in model or nonmodel species. Transcriptor may have applications in elucidation of the roles of organ transcriptomes and secondary metabolite biosynthesis in organisms lacking a reference genome. AVAILABILITY AND IMPLEMENTATION Transcriptor is available at http://design.rxnfinder.org/transcriptor/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
21	A data-driven integrative platform for computational prediction of toxin biotransformation with a case study. J Hazard Mater 2021;408:124810. [PMID: 33360695 DOI: 10.1016/j.jhazmat.2020.124810] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/10/2020] [Revised: 11/24/2020] [Accepted: 12/06/2020] [Indexed: 06/12/2023] Abstract Recently, biogenic toxins have received increasing attention owing to their high contamination levels in feed and food as well as in the environment. However, there is a lack of an integrative platform for seamless linking of data-driven computational methods with 'wet' experimental validations. To this end, we constructed a novel platform that integrates the technical aspects of toxin biotransformation methods. First, a biogenic toxin database termed ToxinDB (http://www.rxnfinder.org/toxindb/), containing multifaceted data on more than 4836 toxins, was built. Next, more than 8000 biotransformation reaction rules were extracted from over 300,000 biochemical reactions extracted from ~580,000 literature reports curated by more than 100 people over the past decade. Based on these reaction rules, a toxin biotransformation prediction model was constructed. Finally, the global chemical space of biogenic toxins was constructed, comprising ~550,000 toxins and putative toxin metabolites, of which 94.7% of the metabolites have not been previously reported. Additionally, we performed a case study to investigate citrinin metabolism in Trichoderma, and a novel metabolite was identified with the assistance of the biotransformation prediction tool of ToxinDB. This unique integrative platform will assist exploration of the 'dark matter' of a toxin's metabolome and promote the discovery of detoxification enzymes.
Key Words: Bioinformatics; Cheminformatics; Metabolites; Synthesis biology; Systems biology. MESH Headings: Biotransformation; Computational Biology; Databases, Factual; Humans; Metabolome
22	FADB-China: A molecular-level food adulteration database in China based on molecular fingerprints and similarity algorithms prediction expansion. Food Chem 2020;327:127010. [DOI: 10.1016/j.foodchem.2020.127010] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 12/24/2019] [Revised: 04/18/2020] [Accepted: 05/06/2020] [Indexed: 12/19/2022]
23	Data-driven rational biosynthesis design: from molecules to cell factories. Brief Bioinform 2020;21:1238-1248. [PMID: 31243440 DOI: 10.1093/bib/bbz065] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 01/25/2019] [Revised: 04/28/2019] [Accepted: 05/08/2019] [Indexed: 11/12/2022] Open Abstract A proliferation of chemical, reaction and enzyme databases, new computational methods and software tools for data-driven rational biosynthesis design have emerged in recent years. With the coming of the era of big data, particularly in the bio-medical field, data-driven rational biosynthesis design could potentially be useful to construct target-oriented chassis organisms. Engineering the complicated metabolic systems of chassis organisms to biosynthesize target molecules from inexpensive biomass is the main goal of cell factory design. The process of data-driven cell factory design could be divided into several parts: (1) target molecule selection; (2) metabolic reaction and pathway design; (3) prediction of novel enzymes based on protein domain and structure transformation of biosynthetic reactions; (4) construction of large-scale DNA for metabolic pathways; and (5) DNA assembly methods and visualization tools. The construction of a one-stop cell factory system could achieve automated design from the molecule level to the chassis level. In this article, we outline data-driven rational biosynthesis design steps and provide an overview of related tools in individual steps. Key Words: cell factory design; data-driven biosynthesis design; enzyme function prediction; gene construction and assembly; target selection
24	novoPathFinder: a webserver of designing novel-pathway with integrating GEM-model. Nucleic Acids Res 2020;48:W477-W487. [PMID: 32313937 PMCID: PMC7319456 DOI: 10.1093/nar/gkaa230] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 02/13/2020] [Revised: 03/16/2020] [Accepted: 03/28/2020] [Indexed: 12/14/2022] Open Abstract To increase the number of value-added chemicals that can be produced by metabolic engineering and synthetic biology, constructing metabolic space with novel reactions/pathways is crucial. However, with the large number of reactions that exist in the metabolic space and the complicated metabolism within hosts, identifying novel pathways linking two molecules or heterologous pathways when engineering a host to produce a target molecule is an arduous task. Hence, we built a user-friendly web server, novoPathFinder, which has several features: (i) enumerate novel pathways between two specified molecules without considering hosts; (ii) construct heterologous pathways with known or putative reactions for producing a target molecule within Escherichia coli or yeast without giving a precursor; (iii) evaluate novel pathways by considering several categories, including enzyme promiscuity, Synthetic Complex Score (SCScore) and LD50 of intermediates, overall stoichiometric conversions, pathway length, theoretical yields and thermodynamic feasibility. According to the results, novoPathFinder is better able to recover experimentally validated pathways than other rule-based web server tools. Besides, more efficient pathways with novel reactions could also be retrieved for further experimental exploration. novoPathFinder is available at http://design.rxnfinder.org/novopathfinder/.
MESH Headings: Algorithms; Benzaldehydes/metabolism; Biosynthetic Pathways; Cannabidiol/metabolism; Escherichia coli/metabolism; Internet; Metabolic Engineering; Saccharomyces cerevisiae/metabolism; Software. Grants: National Key Research and Development Program of China National Natural Science Foundation of China Scientific Research Conditions and Technical Support System Program CAS STS program International Partnership Program of Chinese Academy of Sciences of China Natural Science Foundation of Tianjin
25	BCSExplorer: a customized biosynthetic chemical space explorer with multifunctional objective function analysis. Bioinformatics 2020;36:1642-1643. [PMID: 31593245 DOI: 10.1093/bioinformatics/btz755] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/21/2019] [Revised: 09/11/2019] [Accepted: 10/01/2019] [Indexed: 11/14/2022] Open Abstract SUMMARY The biosynthetic ability of living organisms has important applications in producing bulk chemicals, biofuels and natural products. Based on the most comprehensive biosynthesis knowledgebase, a computational system, BCSExplorer, is proposed to discover the unexplored chemical space using nature's biosynthetic potential. BCSExplorer first integrates the most comprehensive biosynthetic reaction database with 280 000 biochemical reactions and 60 000 chemicals biosynthesized globally over the past 130 years. Second, in this study, a biosynthesis tree is computed for a starting chemical molecule based on a comprehensive biotransformation rule library covering almost all biosynthetic possibilities, in which redundant rules are removed using a new algorithm. Moreover, biosynthesis feasibility, drug-likeness and toxicity analysis of a new generation of compounds will be pursued in further studies to meet various needs. BCSExplorer represents a novel method to explore biosynthetically available chemical space. AVAILABILITY AND IMPLEMENTATION BCSExplorer is available at: http://www.rxnfinder.org/bcsexplorer/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
26	Bio2Rxn: sequence-based enzymatic reaction predictions by a consensus strategy. Bioinformatics 2020;36:3600-3601. [DOI: 10.1093/bioinformatics/btaa135] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 10/18/2019] [Revised: 02/01/2020] [Accepted: 02/25/2020] [Indexed: 01/11/2023] Open Abstract Summary The development of sequencing technologies has generated large amounts of protein sequence data. The automated prediction of the enzymatic reactions of uncharacterized proteins is a major challenge in the field of bioinformatics. Here, we present Bio2Rxn as a web-based tool to provide putative enzymatic reaction predictions for uncharacterized protein sequences. Bio2Rxn adopts a consensus strategy by incorporating six types of enzyme prediction tools. It allows for the efficient integration of these computational resources to maximize the accuracy and comprehensiveness of enzymatic reaction predictions, which facilitates the characterization of the functional roles of target proteins in metabolism. Bio2Rxn further links the enzyme function prediction with more than 300 000 enzymatic reactions, which were manually curated by more than 100 people over the past 9 years from more than 580 000 publications. Availability and implementation Bio2Rxn is available at: http://design.rxnfinder.org/bio2rxn/. Contact qnhu@sibs.ac.cn Supplementary information Supplementary data are available at Bioinformatics online.
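As a rough illustration of the consensus idea mentioned in the Bio2Rxn abstract (entry 26), the sketch below combines the outputs of several predictors by simple vote counting. The abstract does not say how Bio2Rxn actually merges its six tools, so the voting rule, the `min_votes` threshold, the function name `consensus_ec` and the tool names are all assumptions made for illustration only.

```python
from collections import Counter


def consensus_ec(predictions, min_votes=2):
    """Return (EC number, vote count) pairs supported by at least `min_votes` tools.

    `predictions` maps a hypothetical tool name to the EC numbers it predicts.
    Results are ordered by descending vote count, then by EC number.
    """
    votes = Counter()
    for ecs in predictions.values():
        for ec in set(ecs):  # each tool votes at most once per EC number
            votes[ec] += 1
    ranked = sorted(votes.items(), key=lambda item: (-item[1], item[0]))
    return [(ec, count) for ec, count in ranked if count >= min_votes]


# Made-up predictions from four hypothetical tools for one protein sequence.
example = {
    "tool_a": ["1.1.1.1", "1.1.1.2"],
    "tool_b": ["1.1.1.1"],
    "tool_c": ["2.7.1.1", "1.1.1.1"],
    "tool_d": ["1.1.1.2"],
}

if __name__ == "__main__":
    print(consensus_ec(example))
```

A plain majority vote treats every tool as equally reliable; a weighted vote would be a natural variation if per-tool accuracy were known.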
27	FRCD: A comprehensive food risk component database with molecular scaffold, chemical diversity, toxicity, and biodegradability analysis. Food Chem 2020;318:126470. [PMID: 32120139 DOI: 10.1016/j.foodchem.2020.126470] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 11/19/2019] [Revised: 02/21/2020] [Accepted: 02/22/2020] [Indexed: 12/26/2022] Abstract The presence of natural toxins, pesticide residues, and illegal additives in food products has been associated with a range of potential health hazards. However, no systematic database exists that comprehensively includes and integrates all research information on these compounds, and valuable information remains scattered across numerous databases and extensive literature reports. Thus, using natural language processing technology, we curated 12,018 food risk components from 152,737 literature reports, 12 authoritative databases, and numerous related regulatory documents. Data on molecular structures, physicochemical properties, chemical taxonomy, absorption, distribution, metabolism, excretion, toxicity properties, and physiological targets within the human body were integrated to afford the comprehensive food risk component database (FRCD, http://www.rxnfinder.org/frcd/). We also analyzed the molecular scaffold and chemical diversity, in addition to evaluating the toxicity and biodegradability of the food risk components. The FRCD could be considered a highly promising tool for future food safety studies. Key Words: Bioinformatics; Chemical diversity; Cheminformatics; Food toxin; Scaffold analysis
28	RxnBLAST: molecular scaffold and reactive chemical environment feature extractor for biochemical reactions. Bioinformatics 2020;36:2946-2947. [DOI: 10.1093/bioinformatics/btaa036] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/24/2019] [Revised: 12/11/2019] [Accepted: 01/14/2020] [Indexed: 12/28/2022] Open Abstract Motivation Molecular scaffolds are useful in medicinal chemistry to describe, discuss and visualize series of chemical compounds, biochemical transformations and associated biological properties. Results Here, we present RxnBLAST as a web-based tool for analyzing scaffold transformations and reactive chemical environment features in bioreactions. RxnBLAST extracts chemical features from bioreactions including atom–atom mapping, reaction centers, rules and functional groups to help understand chemical compositions and reaction patterns. Core-to-Core is proposed, which can be utilized in scaffold networks and for constructing a reaction space, as well as providing guidance for subsequent biosynthesis efforts. Availability and implementation RxnBLAST is available at: http://design.rxnfinder.org/rxnblast/.
29	CF-Targeter: A Rational Biological Cell Factory Targeting Platform for Biosynthetic Target Chemicals. ACS Synth Biol 2019;8:2280-2286. [PMID: 31518497 DOI: 10.1021/acssynbio.9b00070] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 12/30/2022] Abstract Biosynthesis is a promising method for chemical synthesis. However, due to differences between microorganism hosts, the yield and the heterologous pathways needed for production of a target chemical may also vary between strains. One of the main challenges in metabolic engineering is to select an appropriate chassis host for specified target chemical production. However, with thousands of microorganisms existing in nature and extremely complicated metabolism within them, it is still time-consuming and error-prone work to achieve such a goal only through experimental methods, even with some existing computational methods. Hence, more efficient methods should be proposed to assist in selecting appropriate chassis hosts. In this article, based on symbolic reaction repositories and a pathway search algorithm which performed 1 400 000 searches per target compound, we established a biological reasoning system for appropriate chassis host selection by coupling with various GEM-models. By using a supercomputer to calculate the biosynthetic pathways for more than 1 month, nearly 50 000 000 biosynthetic pathways are computed for production of 6026 compounds within 70 microorganisms. With retrieved organisms for specified target production, several heterologous biosynthetic pathways can be shown in length order, and then the maximum theoretical yields and thermodynamic feasibility can be calculated in real time under customized growth conditions and physiological states.
From the computation results, the system not only identifies experimentally validated pathways but also outputs more efficient solutions with fewer heterologous steps or higher maximum possible theoretical yield by engineering other organism hosts. CF-Targeter is available at http://www.rxnfinder.org/cf_targeter/. Key Words: chassis hosts selection; genome-scale metabolic model; heterologous pathway design; theoretical yield
30	PrecursorFinder: a customized biosynthetic precursor explorer. Bioinformatics 2018;35:1603-1604. [DOI: 10.1093/bioinformatics/bty838] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 06/21/2018] [Revised: 09/20/2018] [Accepted: 09/09/2018] [Indexed: 01/26/2023] Open Abstract Summary Synthetic biology has a great potential to produce high value pharmaceuticals, commodities or bulk chemicals. However, many biosynthetic target molecules have no defined or predicted biosynthetic pathways. Biosynthetic precursors are crucial to create biosynthetic pathways. Thus computer-assisted tools for precursor identification are urgently needed to develop novel metabolic pathways. To this end, we present PrecursorFinder, a computational tool that explores biosynthetic precursors for the query target molecules using chemical structure, similarity as well as MCS (maximum common substructure). This platform comprises more than 60 000 compounds biosynthesized for being promising precursors, which are extracted from >500 000 scientific literatures and manually curated by more than 100 people over the past 8 years. The PrecursorFinder could speed up the process of biosynthesis research and make synthetic biology or metabolic engineering more efficient. Availability and implementation PrecursorFinder is available at: http://www.rxnfinder.org/precursorfinder/. Supplementary information Supplementary data are available at Bioinformatics online.
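The PrecursorFinder abstract (entry 30) mentions similarity-based precursor search. A minimal sketch of one common way to rank candidates by structural similarity is the Tanimoto coefficient over sets of structural features. The abstract does not specify which similarity measure or fingerprint the tool uses, so the measure, the feature labels and the names `tanimoto` and `rank_precursors` are illustrative assumptions; the MCS step is not covered here.

```python
def tanimoto(features_a, features_b):
    """Tanimoto (Jaccard) coefficient between two collections of features."""
    a, b = set(features_a), set(features_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty feature sets
    return len(a & b) / len(union)


def rank_precursors(target, library, top_n=3):
    """Rank library compounds by similarity to the target, best first."""
    scored = [(name, tanimoto(target, feats)) for name, feats in library.items()]
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_n]


# Hypothetical feature sets; a real system would derive them from structures.
target = {"aromatic_ring", "hydroxyl", "carboxyl"}
library = {
    "cand_a": {"aromatic_ring", "hydroxyl"},
    "cand_b": {"carboxyl", "amine"},
    "cand_c": {"aromatic_ring", "hydroxyl", "carboxyl", "methyl"},
}

if __name__ == "__main__":
    for name, score in rank_precursors(target, library):
        print(name, round(score, 3))
```

In practice the feature sets would come from a cheminformatics toolkit's molecular fingerprints rather than hand-written labels.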
31	EcoSynther: A Customized Platform To Explore the Biosynthetic Potential in E. coli. ACS Chem Biol 2017;12:2823-2829. [PMID: 28952720 DOI: 10.1021/acschembio.7b00605] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 12/16/2022] Abstract Developing computational tools for a chassis-centered biosynthetic pathway design is very important for a productive heterologous biosynthesis system by considering enormous foreign biosynthetic reactions. For many cases, a pathway to produce a target molecule consists of both native and heterologous reactions when utilizing a microbial organism as the host organism. Due to tens of thousands of biosynthetic reactions existing in nature, it is not trivial to identify which could be served as heterologous ones to produce the target molecule in a specific organism. In the present work, we integrate more than 10,000 E. coli non-native reactions and utilize a probability-based algorithm to search pathways. Moreover, we built a user-friendly Web server named EcoSynther. It is able to explore the precursors and heterologous reactions needed to produce a target molecule in Escherichia coli K12 MG1655 and then applies flux balance analysis to calculate theoretical yields of each candidate pathway. Compared with other chassis-centered biosynthetic pathway design tools, EcoSynther has two unique features: (1) allow for automatic search without knowing a precursor in E. coli and (2) evaluate the candidate pathways under constraints from E. coli physiological states and growth conditions. EcoSynther is available at http://www.rxnfinder.org/ecosynther/.
32	PhID: An Open-Access Integrated Pharmacology Interactions Database for Drugs, Targets, Diseases, Genes, Side-Effects, and Pathways. J Chem Inf Model 2017;57:2395-2400. [PMID: 28906116 DOI: 10.1021/acs.jcim.7b00175] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 11/29/2022] Abstract The current network pharmacology study encountered a bottleneck with a lot of public data scattered in different databases. There is a lack of an open-access and consolidated platform that integrates this information for systemic research. To address this issue, we have developed PhID, an integrated pharmacology database which integrates >400 000 pharmacology elements (drug, target, disease, gene, side-effect, and pathway) and >200 000 element interactions in branches of public databases. PhID has three major applications: (1) assisting scientists searching through the overwhelming amount of pharmacology element interaction data by names, public IDs, molecule structures, or molecular substructures; (2) helping visualize pharmacology elements and their interactions with a web-based network graph; and (3) providing prediction of drug-target interactions through two modules: PreDPI-ki and FIM, by which users can predict drug-target interactions of PhID entities or some drug-target pairs of their own interest. To get a systems-level understanding of drug action and disease complexity, PhID as a network pharmacology tool was established from the perspective of data layer, visualization layer, and prediction model layer to present information untapped by current databases.
33	SynBioEcoli: a comprehensive metabolism network of engineered E. coli in three dimensional visualization. Quant Biol 2017. [DOI: 10.1007/s40484-017-0098-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
34	Multi-fields model for predicting target–ligand interaction. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2016.03.079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
35	BioSynther: a customized biosynthetic potential explorer. Bioinformatics 2015;32:472-3. [DOI: 10.1093/bioinformatics/btv599] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 07/17/2015] [Accepted: 10/12/2015] [Indexed: 11/13/2022] Open
36	Predicting target-ligand interactions using protein ligand-binding site and ligand substructures. BMC Syst Biol 2015;9 Suppl 1:S2. [PMID: 25707321 PMCID: PMC4331677 DOI: 10.1186/1752-0509-9-s1-s2] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Indexed: 01/16/2023] Abstract Background Cell proliferation, differentiation, gene expression, metabolism, immunization and signal transduction require the participation of ligands and targets. It is a great challenge to identify rules governing molecular recognition between chemical topological substructures of ligands and the binding sites of the targets. Methods We suppose that the ligand-target interactions are determined by ligand substructures as well as the physical-chemical properties of the binding sites. Therefore, we propose a fragment interaction model (FIM) to describe the interactions between ligands and targets, with the purpose of facilitating the chemical interpretation of ligand-target binding. First we extract target-ligand complexes from the sc-PDB database, from which we obtain the target binding sites and the ligands. Then we represent each binding site as a fragment vector based on a target fragment dictionary that is composed of 199 clusters (denoted as fragments in this work) obtained by clustering 4200 trimers according to their physical-chemical properties. Next, we represent each ligand as a substructure vector based on a dictionary containing 747 substructures. Finally, we build the FIM by generating the interaction matrix M (representing the fragment interaction network), and the FIM can later be used for predicting unknown ligand-target interactions as well as providing the binding details of the interactions.
Results The five-fold cross validation results show that the proposed model can get a higher AUC score (92%) than three prevalent algorithms, CS-PD (80%), BLM-NII (85%) and RF (85%), demonstrating the remarkable predictive ability of FIM. We also show that the ligand binding sites (local information) outweigh the sequence similarities (global information) in ligand-target binding, and introducing too much global information would be harmful to the predictive ability. Moreover, the derived fragment interaction network can provide chemical insights into the interactions. Conclusions The target and ligand bindings are local events, and the local information dominates the binding ability. Though integration of the global information can promote the predictive ability, the role is very limited. The fragment interaction network is helpful for understanding the mechanism of the ligand-target interaction.
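The representation described in entry 36 (a binding-site fragment vector, a ligand substructure vector and an interaction matrix M) suggests a simple way to picture how such a model could score a pair. The abstract does not give the scoring formula, so the bilinear form below, the function name `interaction_score`, and the tiny made-up vectors and matrix are assumptions for illustration, not the published FIM.

```python
def interaction_score(site_vec, ligand_vec, matrix):
    """Bilinear score: sum over i, j of site_vec[i] * matrix[i][j] * ligand_vec[j].

    `matrix` has one row per binding-site fragment and one column per
    ligand substructure (199 x 747 in the paper's dictionaries; 3 x 2 here).
    """
    if len(matrix) != len(site_vec):
        raise ValueError("matrix needs one row per binding-site fragment")
    if any(len(row) != len(ligand_vec) for row in matrix):
        raise ValueError("matrix needs one column per ligand substructure")
    return sum(
        s * sum(m * l for m, l in zip(row, ligand_vec))
        for s, row in zip(site_vec, matrix)
    )


# Toy example: 3 site fragments, 2 ligand substructures, made-up weights.
site = [1, 0, 1]      # fragments 0 and 2 are present in the binding site
ligand = [0, 1]       # substructure 1 is present in the ligand
M = [
    [0.2, 0.5],
    [0.9, 0.1],
    [-0.3, 0.4],
]

if __name__ == "__main__":
    print(interaction_score(site, ligand, M))
```

Under this assumed form, each entry of M can be read as the contribution of one fragment-substructure pair, which is one way to interpret the "binding details" the abstract refers to.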
37	Transferability of retrotransposon primers derived from Persimmon (Diospyros kaki Thunb.) across other plant species. Genet Mol Res 2013;12:1781-95. [PMID: 23913371 DOI: 10.4238/2013.june.6.2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/03/2022] Abstract Retrotransposon-based molecular markers are powerful molecular tools. However, these markers are not readily available due to the difficulty in obtaining species-specific retrotransposon primers. Although recent techniques enabling the rapid isolation of retrotransposon sequences have facilitated primer development, this process nonetheless remains time-consuming and costly. Therefore, research into the transferability of retrotransposon primers developed from one plant species onto others would be of great value. The present study investigated the transferability of retrotransposon primers derived from 'Luotian-tianshi' persimmon (Diospyros kaki Thunb.) across other fruit crops, as well as within the genus using inter-retrotransposon amplified polymorphism molecular marker. Fourteen of the 26 retrotransposon primers tested (53.85%) produced robust and reproducible amplification products across all fruit crops tested, indicating their applicability across plant species. Four of the 13 fruit crops showed the best transferability performances: persimmon, grape, citrus, and peach. Furthermore, similarity coefficients and UPGMA clustering indicated that these primers could further offer a potential tool for germplasm differentiation, parentage identification, genetic diversity assessment, classification, and phylogenetic studies across a variety of plant species. Transferability was further confirmed by examining published primers derived from Rosaceae, Gramineae, and Solanaceae.
This study is one of the few currently available studies concerning the transferability of retrotransposon primers across plant species in general, and is the first successful study of the transferability of retrotransposon primers derived from persimmon. The primers presented here will help reduce costs for future retrotransposon primer development and therefore contribute to the popularization of retrotransposon molecular markers.
38	Synthesis, biological evaluation and molecular modeling of substituted 2-aminobenzimidazoles as novel inhibitors of acetylcholinesterase and butyrylcholinesterase. Bioorg Med Chem 2013;21:4218-24. [PMID: 23719283 DOI: 10.1016/j.bmc.2013.05.001] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Received: 02/16/2013] [Revised: 05/01/2013] [Accepted: 05/02/2013] [Indexed: 10/26/2022] Abstract A series of novel 2-aminobenzimidazole derivatives were synthesized under microwave irradiation. Their biological activities were evaluated on acetylcholinesterase (AChE) and butyrylcholinesterase (BuChE). A number of the 2-aminobenzimidazole derivatives showed good inhibitory activities to AChE and BuChE. Among them, compounds 9, 12 and 13 were found to be >25-fold more selective for BuChE than AChE. No evidence of cytotoxicity was observed by MTT assay in PC12 cells or HepG2 cells exposed to 100 μM of the compounds. Molecular modeling studies indicate that the benzimidazole moiety of compounds 9, 12 and 13 forms a face-to-face π-π stacking interaction in a 'sandwich' form with the indole ring of Trp82 (4.09 Å) in the active gorge, and compounds 12 and 13 form a hydrogen bond with His438 at the catalytic site of BuChE. In addition, compounds 12 and 13 fit well into the hydrophobic pocket formed by Ala328, Trp430 and Tyr332 of BuChE. Our data suggest the 2-aminobenzimidazole drugs as promising new selective inhibitors for AChE and BuChE, potentially useful to treat neurodegenerative diseases.
39	Genome-scale screening of drug-target associations relevant to Ki using a chemogenomics approach. PLoS One 2013;8:e57680. [PMID: 23577055 PMCID: PMC3618265 DOI: 10.1371/journal.pone.0057680] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Received: 05/04/2012] [Accepted: 01/27/2013] [Indexed: 11/18/2022] Open Abstract The identification of interactions between drugs and target proteins plays a key role in genomic drug discovery. In the present study, the quantitative binding affinities of drug-target pairs are differentiated as a measurement to define whether a drug interacts with a protein or not, and then a chemogenomics framework using an unbiased set of general integrated features and random forest (RF) is employed to construct a predictive model which can accurately classify drug-target pairs. The predictability of the model is further investigated and validated by several independent validation sets. The built model is used to predict drug-target associations, some of which were confirmed by comparing experimental data from public biological resources. A drug-target interaction network with high confidence drug-target pairs was also reconstructed. This network provides further insight for the action of drugs and targets. Finally, a web-based server called PreDPI-Ki was developed to predict drug-target interactions for drug discovery. In addition to providing a high-confidence list of drug-target associations for subsequent experimental investigation guidance, these results also contribute to the understanding of drug-target interactions. We can also see that quantitative information of drug-target associations could greatly promote the development of more accurate models. The PreDPI-Ki server is freely available via: http://sdd.whu.edu.cn/dpiki. 
MESH Headings: Drug Evaluation, Preclinical/methods; Genomics/methods; Humans; Pharmaceutical Preparations/metabolism; Probability; Protein Binding; Proteins/metabolism; ROC Curve
40	ChemoPy: freely available python package for computational biology and chemoinformatics. Bioinformatics 2013;29:1092-4. [PMID: 23493324 DOI: 10.1093/bioinformatics/btt105] [Citation(s) in RCA: 123] [Impact Index Per Article: 11.2] [Indexed: 01/15/2023] Abstract MOTIVATION Molecular representation for small molecules has been routinely used in QSAR/SAR, virtual screening, database search, ranking, drug ADME/T prediction and other drug discovery processes. To facilitate extensive studies of drug molecules, we developed a freely available, open-source Python package called chemoinformatics in python (ChemoPy) for calculating the commonly used structural and physicochemical features. It computes 16 drug feature groups composed of 19 descriptors that include 1135 descriptor values. In addition, it provides seven types of molecular fingerprint systems for drug molecules, including topological fingerprints, electro-topological state (E-state) fingerprints, MACCS keys, FP4 keys, atom pairs fingerprints, topological torsion fingerprints and Morgan/circular fingerprints. By applying a semi-empirical quantum chemistry program MOPAC, ChemoPy can also compute a large number of 3D molecular descriptors conveniently. AVAILABILITY The Python package, ChemoPy, is freely available via http://code.google.com/p/pychem/downloads/list, and it runs on Linux and MS-Windows. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
41	Assignment of EC numbers to enzymatic reactions with reaction difference fingerprints. PLoS One 2012;7:e52901. [PMID: 23285222 PMCID: PMC3532301 DOI: 10.1371/journal.pone.0052901] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 09/21/2012] [Accepted: 11/23/2012] [Indexed: 11/18/2022] Open Abstract The EC numbers represent enzymes and enzyme genes (genomic information), but they are also utilized as identifiers of enzymatic reactions (chemical information). In the present work (ECAssigner), our newly proposed reaction difference fingerprints (RDF) are applied to assign EC numbers to enzymatic reactions. The fingerprints of reactant molecules minus the fingerprints of product molecules will generate reaction difference fingerprints, which are then used to calculate reaction Euclidean distance, a reaction similarity measurement, of two reactions. The EC number of the most similar training reaction will be assigned to an input reaction. For 5120 balanced enzymatic reactions, the RDF with a fingerprint length of 3 achieved cross-validation accuracies of 83.1%, 86.7%, and 92.6% at the sub-subclass, subclass, and main class levels, respectively. Compared with three published methods, ECAssigner is the first fully automatic server for EC number assignment. The EC assignment system (ECAssigner) is freely available via: http://cadd.whu.edu.cn/ecassigner/. MESH Headings: Advisory Committees; Catalysis; DNA Fingerprinting/methods; Enzyme Activation; Enzymes/genetics; Enzymes/metabolism; Genotyping Techniques/methods; Humans; Models, Biological; Molecular Sequence Data; Reproducibility of Results; Software; Substrate Specificity; Validation Studies as Topic
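The assignment procedure described in the abstract of entry 41 (reactant fingerprints minus product fingerprints, Euclidean distance between reactions, EC number of the nearest training reaction) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the ECAssigner implementation: the fragment-count fingerprints, the toy training reactions, and every function name are hypothetical, and the "fingerprint length" parameter mentioned in the abstract is not modeled.

```python
from collections import Counter
from math import sqrt


def reaction_difference_fingerprint(reactant_fps, product_fps):
    """Sum the reactant fingerprints and subtract the summed product fingerprints.

    Each fingerprint is a dict mapping a fragment label to its count
    (a hypothetical stand-in for a real molecular fingerprint).
    """
    diff = Counter()
    for fp in reactant_fps:
        for fragment, count in fp.items():
            diff[fragment] += count
    for fp in product_fps:
        for fragment, count in fp.items():
            diff[fragment] -= count
    # Drop fragments that cancel out between the two sides of the reaction.
    return {fragment: n for fragment, n in diff.items() if n != 0}


def euclidean_distance(rdf_a, rdf_b):
    """Euclidean distance between two sparse difference fingerprints."""
    fragments = set(rdf_a) | set(rdf_b)
    return sqrt(sum((rdf_a.get(f, 0) - rdf_b.get(f, 0)) ** 2 for f in fragments))


def assign_ec(query_rdf, training):
    """Return the EC number of the training reaction closest to the query.

    `training` is a list of (ec_number, rdf) pairs.
    """
    ec_number, _ = min(training, key=lambda item: euclidean_distance(query_rdf, item[1]))
    return ec_number


# Toy training set (hypothetical fragment labels, cofactors omitted).
TRAINING = [
    # alcohol -> aldehyde
    ("1.1.1.1", reaction_difference_fingerprint(
        [{"C-OH": 1, "CH3": 1}], [{"C=O": 1, "CH3": 1}])),
    # ester + water -> acid + alcohol
    ("3.1.1.1", reaction_difference_fingerprint(
        [{"C(=O)O-C": 1}, {"H2O": 1}], [{"COOH": 1}, {"C-OH": 1}])),
]

# Query: a longer-chain alcohol oxidised to its aldehyde.
QUERY = reaction_difference_fingerprint(
    [{"C-OH": 1, "CH2": 1, "CH3": 1}], [{"C=O": 1, "CH2": 1, "CH3": 1}])

print(assign_ec(QUERY, TRAINING))
```

Because unchanged fragments cancel in the subtraction, the query and the first training reaction end up with the same difference fingerprint even though the molecules differ in chain length, which is the property a nearest-neighbour lookup of this kind relies on.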
42	Using core hydrophobicity to identify phosphorylation sites of human G protein-coupled receptors. Biochimie 2012;94:1697-704. [PMID: 22503742 DOI: 10.1016/j.biochi.2012.03.022] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 12/10/2011] [Accepted: 03/28/2012] [Indexed: 01/23/2023] Abstract As the most frequent drug target, G protein-coupled receptors (GPCRs) are a large family of seven trans-membrane receptors that sense molecules outside the cell and activate inside signal transduction pathways. The activity and lifetime of activated receptors are regulated by receptor phosphorylation. Therefore, investigating the exact positions of phosphorylation sites in GPCR sequences could provide useful clues for drug design and other biotechnology applications. Experimental identification of phosphorylation sites is expensive and laborious. Hence, there is significant interest in the development of computational methods for reliable prediction of phosphorylation sites from amino acid sequences. In this article, we present a simple and effective method to recognize phosphorylation sites of human GPCRs by combining amino acid hydrophobicity and a support vector machine. The prediction accuracy, sensitivity, specificity, Matthews correlation coefficient and area under the curve values for phosphoserine, phosphothreonine, and phosphotyrosine were 0.964, 0.790, 0.999, 0.866, 0.941; 0.954, 0.800, 0.985, 0.828, 0.958; and 0.976, 0.820, 0.993, 0.861, 0.959, respectively. The establishment of such a fast and accurate prediction method will speed up the pace of identifying proper GPCR sites to facilitate drug discovery.
43	A novel kernel Fisher discriminant analysis: Constructing informative kernel by decision tree ensemble for metabolomics data analysis. Anal Chim Acta 2011;706:97-104. [DOI: 10.1016/j.aca.2011.08.025] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Received: 03/25/2011] [Revised: 06/12/2011] [Accepted: 08/16/2011] [Indexed: 10/17/2022]
44	In silico classification of human maximum recommended daily dose based on modified random forest and substructure fingerprint. Anal Chim Acta 2011;692:50-6. [DOI: 10.1016/j.aca.2011.02.010] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.3] [Received: 08/12/2010] [Revised: 12/07/2010] [Accepted: 02/03/2011] [Indexed: 10/18/2022]
45	A new strategy of exploring metabolomics data using Monte Carlo tree. Analyst 2011;136:947-54. [DOI: 10.1039/c0an00383b] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Indexed: 11/21/2022]
46	COMDECOM: predicting the lifetime of screening compounds in DMSO solution. J Biomol Screen 2009;14:557-65. [PMID: 19483143 DOI: 10.1177/1087057109336953] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Indexed: 11/17/2022] Abstract The technological evolution of the 1990s in both combinatorial chemistry and high-throughput screening created the demand for rapid access to the compound deck to support the screening process. The common strategy within the pharmaceutical industry is to store the screening library in DMSO solution. Several studies have shown that a percentage of these compounds decompose in solution, varying from a few percent of the total to a substantial part of the library. In the COMDECOM (COMpound DECOMposition) project, the compound stability of screening compounds in DMSO solution is monitored in an accelerated thermal, hydrolytic, and oxidative decomposition program. A large database with stability data is collected, and from this database, a predictive model is being developed. The aim of this program is to build an algorithm that can flag compounds that are likely to decompose, information that is considered to be of utmost importance (e.g., in the compound acquisition process and when evaluating screening results of library compounds, as well as in the determination of optimal storage conditions).
47	LC-DAD-APCI-MS-based screening and analysis of the absorption and metabolite components in plasma from a rabbit administered an oral solution of danggui. Anal Bioanal Chem 2005;383:247-54. [PMID: 16132135 DOI: 10.1007/s00216-005-0008-7] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.5] [Received: 04/22/2005] [Revised: 07/01/2005] [Accepted: 07/03/2005] [Indexed: 11/24/2022] Abstract A valid chromatographic fingerprint method using liquid chromatography-diode array detection-atmospheric pressure chemical ionization mass spectrometry in negative mode (LC-DAD-APCI-MS) is proposed for studying the absorption and metabolites of a traditional Chinese medicine (TCM) Angelica sinensis (danggui) in rabbit plasma, after the rabbit is administered with danggui oral solution (DOS). More than thirty-two common components were detected in both DOS and rabbit plasma, which shows that the components in the DOS were absorbed into the body of the rabbit. Of these, senkyunolide I, senkyunolide H, Z-6,7-epoxyligustilide, 3-butylidene-7-hydroxyphthalide, Z-ligustilide, Z-butylidenephthalide, Diels-Alder dimers of ligustilide, linolenic acid, linoleic acid and falcarindiol were tentatively identified from their MS, UV spectra and retention behavior by comparing the results with the published literature. At least ten components were found in rabbit plasma but not in DOS, indicating that these components must be metabolites of some of the components in the original extract. The results prove that the proposed method can be used to rapidly analyze multiple constituents in TCMs, and to screen for bioactive compounds by comparing and contrasting the chromatographic fingerprints of DOS and plasma samples. 
MESH Headings: 4-Butyrolactone/analogs & derivatives; 4-Butyrolactone/blood; Absorption; Administration, Oral; Aldehydes/blood; Angelica sinensis/chemistry; Animals; Benzofurans/blood; Diynes; Drugs, Chinese Herbal/administration & dosage; Drugs, Chinese Herbal/chemistry; Fatty Alcohols/blood; Linoleic Acid/blood; Mass Spectrometry/methods; Phthalic Anhydrides/blood; Plant Extracts/administration & dosage; Plant Extracts/chemistry; Rabbits; alpha-Linolenic Acid/blood
48	Impersonality of the connectivity index and recomposition of topological indices according to different properties. Molecules 2004;9:1089-99. [PMID: 18007506 DOI: 10.3390/91201089] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Received: 06/02/2004] [Revised: 12/07/2004] [Accepted: 12/08/2004] [Indexed: 11/16/2022] Open Abstract The connectivity index chi can be regarded as the sum of bond contributions. In this article, boiling point (bp)-oriented contributions for each kind of bond are obtained by decomposing the connectivity indices into ten connectivity character bases and then doing a linear regression between bps and the bases. From the comparison of bp-oriented contributions with the contributions assigned by chi, it can be found that they are very similar in percentage, i.e. the relative importance of each particular kind of bond is nearly the same in the two forms of combinations (one is obtained from the regression with boiling point, and the other is decided by the constructor of the chi index). This coincidence shows an impersonality of chi on bond weighting and may provide us another interpretation of the efficiency of the connectivity index on many quantitative structure-activity/property relationship (QSAR or QSPR) results. However, we also found that chi's weighting formula may not be appropriate for some other properties. In fact, there is no universal weighting formula appropriate for all properties/activities. Recomposition of some topological indices by adjusting the weights upon character bases according to different properties/activities is suggested. This idea of recomposition is applied to the first Zagreb group index M(1) and a large improvement has been achieved. 
MESH Headings: Models, Chemical; Models, Molecular; Molecular Structure; Quantitative Structure-Activity Relationship; Transition Temperature
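The statement in entry 48 that the connectivity index chi is a sum of bond contributions can be made concrete with the standard first-order (Randić) definition, in which each bond between atoms of degree d_i and d_j contributes (d_i d_j)^(-1/2) on the hydrogen-suppressed graph. The sketch below assumes that definition, together with the usual first Zagreb index M1 (sum of squared vertex degrees). Grouping bonds by degree pair is only one plausible reading of the "connectivity character bases" mentioned in the abstract, not a confirmed reproduction of the paper's scheme, and all function names are illustrative.

```python
from collections import Counter
from math import sqrt


def vertex_degrees(bonds):
    """Degree of each atom in a hydrogen-suppressed molecular graph."""
    degrees = Counter()
    for a, b in bonds:
        degrees[a] += 1
        degrees[b] += 1
    return degrees


def connectivity_index(bonds):
    """First-order connectivity index: sum over bonds of (d_i * d_j) ** -0.5."""
    deg = vertex_degrees(bonds)
    return sum(1.0 / sqrt(deg[a] * deg[b]) for a, b in bonds)


def bond_type_counts(bonds):
    """Count bonds by the sorted degree pair of their end atoms.

    For alkane skeletons (degrees 1 to 4) there are ten possible pairs;
    treating these as regression bases is an assumption made here.
    """
    deg = vertex_degrees(bonds)
    return Counter(tuple(sorted((deg[a], deg[b]))) for a, b in bonds)


def first_zagreb_index(bonds):
    """First Zagreb index M1: sum of squared vertex degrees."""
    return sum(d * d for d in vertex_degrees(bonds).values())


# Carbon skeletons as edge lists.
N_BUTANE = [(0, 1), (1, 2), (2, 3)]
ISOBUTANE = [(0, 1), (0, 2), (0, 3)]

for name, bonds in (("n-butane", N_BUTANE), ("isobutane", ISOBUTANE)):
    print(name, round(connectivity_index(bonds), 4),
          dict(bond_type_counts(bonds)), first_zagreb_index(bonds))
```

With the bond-type counts in hand, chi is just a fixed weighting of those counts, so replacing the fixed weights with coefficients fitted to a property such as boiling point is the kind of recomposition the abstract describes.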
49	Structural Interpretation of the Topological Index. 2. The Molecular Connectivity Index, the Kappa Index, and the Atom-type E-State Index. J Chem Inf Comput Sci 2004;44:1193-201. [PMID: 15272826 DOI: 10.1021/ci049973z] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Indexed: 11/28/2022] Abstract The structural interpretation is extended to the topological indices describing cyclic structures. Three representative topological indices, namely the molecular connectivity index, the Kappa index, and the atom-type E-State index, are interpreted by mining out, through projection pursuit combined with a number theory method generating uniformly distributed directions on the unit sphere, the structural features hidden in the spaces spanned by the three series of indices individually. Some interesting results, which can hardly be found by an individual index, are obtained from the multidimensional spaces spanned by several topological indices. The results support quantitatively the former studies on the topological indices, and some new insights are obtained during the analysis. The combinations of several molecular connectivity indices describe mainly three general categories of molecular structure information, which include degree of branching, size, and degree of cyclicity. The cyclicity can also be coded by the combination of chi cluster and path/cluster indices. The Kappa shape indices encode, in combination, significant information on size, the degree of cyclicity, and the degree of centralization/separation in branching. The size, branch number, and cyclicity information has also been mined out to interpret atom-type E-State indices. A structural feature such as the number of quaternary atoms is found to be an important factor. The results indicate that collinearity might be a serious problem in the applications of the topological indices.
50	Structural Interpretation of a Topological Index. 1. External Factor Variable Connectivity Index (EFVCI). J Chem Inf Comput Sci 2004;44:437-46. [PMID: 15032523 DOI: 10.1021/ci034225f] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022] Abstract The external factor variable connectivity index (EFVCI) is interpreted by mining out the structural features hidden in the space spanned by the EFVCI indices through projection pursuit combined with a number-theory net (NT-net) on the unit sphere U(Us). Projection pursuit seeks to machine-pick "interesting" low-dimensional projections of a high-dimensional point cloud by numerically maximizing a certain objective function or projection index. At first, the optimal EFVCI index reaches -0.80 in the correlation with a retention index of 207 hydrocarbons produced by insects. The EFVCI indices, with regression results of R = 0.99998, s = 3.49, RMSECV = 3.90, and F = 7.9560e+005, obtain high regression quality. The model is proven valid by leave-one-out cross validation. Second, the EFVCI index is interpreted by the structure information, that is, size, branch number, graph center, and branching position of topological structures, which is searched out on the unit sphere U(Us) by projection pursuit. Finally, the interpretation information is used to discover some chemical knowledge concerning the variation of the retention index with the change in chemical structures.