1
Biziukova N, Tarasova O, Ivanov S, Poroikov V. Automated Extraction of Information From Texts of Scientific Publications: Insights Into HIV Treatment Strategies. Front Genet 2021; 11:618862. [PMID: 33414815 PMCID: PMC7783389 DOI: 10.3389/fgene.2020.618862] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/18/2020] [Accepted: 11/26/2020] [Indexed: 12/16/2022] Open
Abstract
Text analysis can help to identify named entities (NEs) of small molecules, proteins, and genes. Such data are very important for the analysis of molecular mechanisms of disease progression and the development of new strategies for the treatment of various diseases and pathological conditions. The texts of publications are a primary source of information, which is especially important for collecting data of the highest quality because information becomes available there sooner than in databases. In our study, we aimed to develop and test an approach to named entity recognition in the abstracts of publications. More specifically, we have developed and tested an algorithm based on conditional random fields, which recognizes NEs of (i) genes and proteins and (ii) chemicals. Careful selection of abstracts strictly related to the subject of interest makes it possible to extract NEs strongly associated with that subject. To test the applicability of our approach, we applied it to the extraction of (i) potential HIV inhibitors and (ii) a set of proteins and genes potentially responsible for viremic control in HIV-positive patients. The computational experiments performed provide estimates of the accuracy of recognition of chemical NEs and of proteins (genes). The precision of chemical NE recognition is over 0.91, recall is 0.86, and the F1-score (harmonic mean of precision and recall) is 0.89; the precision of recognition of protein and gene names is over 0.86, recall is 0.83, and the F1-score is above 0.85. Evaluation of the algorithm on two case studies related to HIV treatment confirms our suggestion that it is possible to extract NEs strongly relevant to (i) HIV inhibitors and (ii) a group of patients, i.e., the group of HIV-positive individuals able to maintain an undetectable HIV-1 viral load over time in the absence of antiretroviral therapy.
Analysis of the results obtained provides insights into the function of proteins that can be responsible for viremic control. Our study demonstrated the applicability of the developed approach for the extraction of useful data on HIV treatment.
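The abstract defines the F1-score as the harmonic mean of precision and recall. As a minimal sketch of that definition (the function name `f1_score` is a hypothetical helper, not taken from the paper, and the inputs are only the rounded figures quoted in the abstract), the relationship can be checked as follows:

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values quoted in the abstract; the precisions are stated there
# only as lower bounds ("over 0.91", "over 0.86"), so these are assumptions.
chemical_f1 = f1_score(0.91, 0.86)
protein_f1 = f1_score(0.86, 0.83)

print(f"chemical NEs:     F1 = {chemical_f1:.3f}")
print(f"protein/gene NEs: F1 = {protein_f1:.3f}")
```

With these rounded inputs the formula gives values slightly below the reported F1-scores of 0.89 and 0.85. That is consistent with the abstract quoting the precisions as lower bounds rather than exact figures.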
Affiliation(s)
- Nadezhda Biziukova
- Laboratory of Structure-Function Based Drug Design, Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow, Russia
- Olga Tarasova
- Laboratory of Structure-Function Based Drug Design, Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow, Russia
- Sergey Ivanov
- Laboratory of Structure-Function Based Drug Design, Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow, Russia; Department of Bioinformatics, Faculty of Biomedicine, Pirogov Russian National Research Medical University, Moscow, Russia
- Vladimir Poroikov
- Laboratory of Structure-Function Based Drug Design, Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow, Russia
2
Mirarab A, Mohebbi A, Javid N, Moradi A, Vakili MA, Tabarraei A. Human cytomegalovirus pUL97 drug-resistance mutations in congenitally neonates and HIV-infected, no-drug-treated patients. Future Virol 2017. [DOI: 10.2217/fvl-2016-0089] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 12/21/2022]
Abstract
Aim: Human cytomegalovirus (HCMV) treatment is hard to achieve because of sequence variations in viral protein targets. Objectives: We aimed to find HCMV pUL97 kinase variations in HIV-infected and congenitally infected patients. Methods: Twenty HCMV-positive DNA samples from non-ganciclovir-treated, congenitally infected neonates and HIV-positive patients were used for PCR restriction fragment length polymorphism analysis. Variations were assessed computationally for their effect on pUL97 functionality. Results: The P521L, D605E and N597Y substitutions were significantly prevalent in congenital infection. Furthermore, we found that those mutations have a neutral or low impact on pUL97 functionality. In addition, we found a new K599Q substitution in an HIV-infected individual. Conclusion: Substitutions related to low-grade ganciclovir resistance were more prevalent in congenitally infected neonates than in HIV-infected patients.
Affiliation(s)
- Azam Mirarab
- Student Research Committee, School of Medicine, Golestan University of Medical Science, Gorgan, Iran
- Alireza Mohebbi
- Student Research Committee, School of Medicine, Golestan University of Medical Science, Gorgan, Iran
- Naeme Javid
- Infectious Diseases Research Centre, Golestan University of Medical Science, Gorgan, Iran
- Abdolvahab Moradi
- Infectious Diseases Research Centre, Golestan University of Medical Science, Gorgan, Iran
- Mohammad A Vakili
- Infectious Diseases Research Centre, Golestan University of Medical Science, Gorgan, Iran
- Alijan Tabarraei
- Infectious Diseases Research Centre, Golestan University of Medical Science, Gorgan, Iran
3
Slavov SN, Otaguiri KK, de Figueiredo GG, Yamamoto AY, Mussi-Pinhata MM, Kashima S, Covas DT. Development and optimization of a sensitive TaqMan® real-time PCR with synthetic homologous extrinsic control for quantitation of Human cytomegalovirus viral load. J Med Virol 2016; 88:1604-12. [PMID: 26890091 DOI: 10.1002/jmv.24499] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Accepted: 02/13/2016] [Indexed: 02/03/2023]
Abstract
Human cytomegalovirus (Human herpesvirus 5, HCMV) causes frequent asymptomatic infections in the general population. However, in immunosuppressed patients or congenitally infected infants, HCMV is related to high morbidity and mortality. In such cases, rapid viral detection is crucial for monitoring the clinical outcome and the antiviral treatment. In this study, we optimized a sensitive biplex TaqMan® real-time PCR for the simultaneous detection and differentiation of a partial HCMV UL97 sequence and a homologous extrinsic control (HEC) in the same tube. The HEC was represented by a plasmid containing a modified HCMV sequence that retained the original primer binding sites, while the probe sequence was substituted by a phylogenetically divergent one (chloroplast CF0 subunit plant gene). The optimal HEC concentration, which did not influence HCMV amplification, was estimated to be 1,000 copies/reaction. The optimized TaqMan® PCR demonstrated high analytical sensitivity (6.97 copies/reaction, CI = 95%) and specificity (100%). Moreover, the reaction showed adequate precision (repeatability, CV = 0.03; reproducibility, CV = 0.0027) and robustness (no carry-over or cross-contamination). The diagnostic sensitivity (100%) and specificity (97.8%) were adequate for the clinical application of the molecular platform. The optimized TaqMan® real-time PCR is suitable for HCMV detection and quantitation in predisposed patients and for monitoring of the applied antiviral therapy. J. Med. Virol. 88:1604-1612, 2016. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Svetoslav Nanev Slavov
- Faculty of Medicine of Ribeirão Preto, Blood Center of Ribeirão Preto, University of São Paulo, Brazil; Faculty of Medicine of Ribeirão Preto, Department of Clinical Medicine, University of São Paulo, Brazil
- Katia Kaori Otaguiri
- Faculty of Medicine of Ribeirão Preto, Blood Center of Ribeirão Preto, University of São Paulo, Brazil; Faculty of Pharmaceutical Sciences, Department of Clinical, Toxicological and Bromatological Analyses, University of São Paulo, Brazil
- Aparecida Yulie Yamamoto
- Faculty of Medicine of Ribeirão Preto, Department of Pediatrics, University of São Paulo, Brazil
- Simone Kashima
- Faculty of Medicine of Ribeirão Preto, Blood Center of Ribeirão Preto, University of São Paulo, Brazil; Faculty of Pharmaceutical Sciences, Department of Clinical, Toxicological and Bromatological Analyses, University of São Paulo, Brazil
- Dimas Tadeu Covas
- Faculty of Medicine of Ribeirão Preto, Blood Center of Ribeirão Preto, University of São Paulo, Brazil; Faculty of Medicine of Ribeirão Preto, Department of Clinical Medicine, University of São Paulo, Brazil