1
Chambers BA, Basili D, Word L, Baker N, Middleton A, Judson RS, Shah I. Searching for LINCS to Stress: Using Text Mining to Automate Reference Chemical Curation. Chem Res Toxicol 2024; 37:878-893. [PMID: 38736322 DOI: 10.1021/acs.chemrestox.3c00335] [Indexed: 05/14/2024]
Abstract
Adaptive stress response pathways (SRPs) restore cellular homeostasis following perturbation but may activate terminal outcomes like apoptosis, autophagy, or cellular senescence if disruption exceeds critical thresholds. Because SRPs hold the key to vital cellular tipping points, they are targeted for therapeutic interventions and assessed as biomarkers of toxicity. Hence, we are developing a public database of chemicals that perturb SRPs to enable new data-driven tools to improve public health. Here, we report on the automated text-mining pipeline we used to build and curate the first version of this database. We started with 100 reference SRP chemicals gathered from published biomarker studies to bootstrap the database. Second, we used information retrieval to find co-occurrences of reference chemicals with SRP terms in PubMed abstracts and determined pairwise mutual information thresholds to filter biologically relevant relationships. Third, we applied these thresholds to find 1206 putative SRP perturbagens within thousands of substances in the Library of Integrated Network-Based Cellular Signatures (LINCS). To assign SRP activity to LINCS chemicals, domain experts had to manually review at least three publications for each of 1206 chemicals out of 181,805 total abstracts. To accomplish this efficiently, we implemented a machine learning approach to predict SRP classifications from texts to prioritize abstracts. In 5-fold cross-validation testing with a corpus derived from the 100 reference chemicals, artificial neural networks performed the best (F1-macro = 0.678) and prioritized 2479/181,805 abstracts for expert review, which resulted in 457 chemicals annotated with SRP activities. An independent analysis of enriched mechanisms of action and chemical use class supported the text-mined chemical associations (p < 0.05): heat shock inducers were linked with HSP90 and DNA damage inducers to topoisomerase inhibition. 
This database will enable novel applications of LINCS data to evaluate SRP activities and to further develop tools for biomedical information extraction from the literature.
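The co-occurrence filtering step described in the abstract can be illustrated with a minimal sketch. It assumes the mutual information score is pointwise mutual information computed from abstract-level counts; the abstract does not give the exact formula, so this is an assumption. The function names, toy counts, and the threshold value are all hypothetical and are not taken from the paper.

```python
import math


def pmi(n_xy, n_x, n_y, n_total):
    """Pointwise mutual information (in bits) from document counts.

    n_xy: abstracts mentioning both the chemical and the SRP term
    n_x, n_y: abstracts mentioning the chemical / the term
    n_total: abstracts in the corpus
    Assumption: this is one common definition, not necessarily the paper's.
    """
    if n_xy == 0:
        return float("-inf")  # the pair never co-occurs
    # log2( P(x, y) / (P(x) * P(y)) ), rearranged to use raw counts
    return math.log2(n_xy * n_total / (n_x * n_y))


def filter_pairs(pair_counts, chem_counts, term_counts, n_total, threshold):
    """Keep (chemical, term) pairs whose PMI meets a hypothetical threshold."""
    kept = {}
    for (chem, term), n_xy in pair_counts.items():
        score = pmi(n_xy, chem_counts[chem], term_counts[term], n_total)
        if score >= threshold:
            kept[(chem, term)] = score
    return kept


# Toy counts, invented for illustration only.
N_ABSTRACTS = 1000
chem_counts = {"chem_A": 10, "chem_B": 50}
term_counts = {"heat shock": 100}
pair_counts = {("chem_A", "heat shock"): 8, ("chem_B", "heat shock"): 5}

kept = filter_pairs(pair_counts, chem_counts, term_counts, N_ABSTRACTS, threshold=1.0)
print(kept)  # {('chem_A', 'heat shock'): 3.0}
```

In this toy corpus, chem_B co-occurs with the term exactly as often as chance predicts (PMI of 0), so it is dropped, while chem_A co-occurs eight times more often than chance and is kept.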
Affiliation(s)
- Bryant A Chambers
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, United States
- Danilo Basili
- Unilever, Safety and Environmental Assurance Centre (SEAC), Colworth Science Park, Sharnbrook, Bedfordshire MK44 1LQ, U.K.
- Laura Word
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, United States
- Nancy Baker
- Leidos, Research Triangle Park, North Carolina 27711, United States
- Alistair Middleton
- Unilever, Safety and Environmental Assurance Centre (SEAC), Colworth Science Park, Sharnbrook, Bedfordshire MK44 1LQ, U.K.
- Richard S Judson
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, United States
- Imran Shah
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, United States
2
El-Masri H, Paul Friedman K, Isaacs K, Wetmore BA. Advances in computational methods along the exposure to toxicological response paradigm. Toxicol Appl Pharmacol 2022; 450:116141. [PMID: 35777528 PMCID: PMC9619339 DOI: 10.1016/j.taap.2022.116141] [Received: 04/12/2022] [Revised: 05/27/2022] [Accepted: 06/23/2022] [Indexed: 10/17/2022]
Abstract
Human health risk assessment is a function of chemical toxicity, bioavailability to reach target biological tissues, and potential environmental exposure. These factors are complicated by many physiological, biochemical, physical and lifestyle factors. Furthermore, chemical health risk assessment is challenging in view of the large, and continually increasing, number of chemicals found in the environment. These challenges highlight the need to prioritize resources for the efficient and timely assessment of those environmental chemicals that pose greatest health risks. Computational methods, either predictive or investigative, are designed to assist in this prioritization in view of the lack of cost prohibitive in vivo experimental data. Computational methods provide specific and focused toxicity information using in vitro high throughput screening (HTS) assays. Information from the HTS assays can be converted to in vivo estimates of chemical levels in blood or target tissue, which in turn are converted to in vivo dose estimates that can be compared to exposure levels of the screened chemicals. This manuscript provides a review for the landscape of computational methods developed and used at the U.S. Environmental Protection Agency (EPA) highlighting their potentials and challenges.
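The conversion from in vitro potency to an in vivo dose that can be compared with exposure can be sketched as follows. This is a simplified illustration, not the method of any specific tool reviewed in the paper: it assumes linear toxicokinetics, so that the steady-state plasma concentration scales proportionally with dose. All function names, parameter names, and numbers below are hypothetical.

```python
def administered_equivalent_dose(ac50_um, css_um_per_unit_dose):
    """Estimate the oral dose (mg/kg/day) producing a plasma level equal to the AC50.

    ac50_um: in vitro half-maximal activity concentration (uM)
    css_um_per_unit_dose: predicted steady-state plasma concentration (uM)
        for a 1 mg/kg/day dose
    Assumption: concentration is linear in dose (simple reverse dosimetry).
    """
    if css_um_per_unit_dose <= 0:
        raise ValueError("steady-state concentration must be positive")
    return ac50_um / css_um_per_unit_dose


def bioactivity_exposure_ratio(dose_mg_kg_day, exposure_mg_kg_day):
    """Ratio of the bioactive dose estimate to the exposure estimate.

    Larger values mean exposure is further below the bioactive dose.
    """
    if exposure_mg_kg_day <= 0:
        raise ValueError("exposure must be positive")
    return dose_mg_kg_day / exposure_mg_kg_day


# Invented values for illustration only.
aed = administered_equivalent_dose(ac50_um=10.0, css_um_per_unit_dose=2.0)
ber = bioactivity_exposure_ratio(aed, exposure_mg_kg_day=0.5)
print(aed, ber)  # 5.0 10.0
```

Under these assumptions, chemicals with the smallest ratios would be the first candidates for further assessment, which is the kind of prioritization the abstract describes.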
Affiliation(s)
- Hisham El-Masri
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, NC, USA
- Katie Paul Friedman
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, NC, USA
- Kristin Isaacs
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, NC, USA
- Barbara A Wetmore
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, NC, USA