1
Deng J, Zhang L, Zhang H, Wang X, Huang X. Chromosome-level genome assembly of the cottony cushion scale Icerya purchasi. Sci Data 2024; 11:639. [PMID: 38886361 PMCID: PMC11183206 DOI: 10.1038/s41597-024-03502-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2024] [Accepted: 06/10/2024] [Indexed: 06/20/2024]
Abstract
The cottony cushion scale, Icerya purchasi, is a polyphagous pest that poses a significant threat to the global citrus industry. The hermaphroditic self-fertilization observed in I. purchasi is an exceptionally rare reproductive mode among insects. In this study, we assembled a chromosome-level genome sequence for I. purchasi using PacBio long reads and the Hi-C technique, resulting in a total size of 1,103.38 Mb and a contig N50 of 12.81 Mb. The genome comprises 14,046 predicted protein-coding genes and 462,722,633 bp of repetitive sequences. BUSCO analysis revealed a completeness score of 93.20%. The genome sequence of I. purchasi serves as a crucial resource for understanding reproductive modes in insects, with particular emphasis on hermaphroditic self-fertilization.
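The contig N50 quoted in this abstract is a standard assembly statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. The sketch below illustrates that definition only. The function name `n50` and the toy lengths are assumptions made for illustration and are not taken from the paper or its pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy contig lengths (hypothetical, not from the I. purchasi assembly).
contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(contigs))  # prints 8
```

Here the total length is 32, and the two longest contigs (8 + 8) already reach half of it, so the N50 is 8.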
Affiliation(s)
- Jun Deng
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, College of Plant Protection, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
- Lin Zhang
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, College of Plant Protection, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
- Hui Zhang
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, College of Plant Protection, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
- Xubo Wang
- Key Laboratory of Forest Disaster Warning and Control in Yunnan Province, College of Biodiversity Conservation, Southwest Forestry University, Kunming, 650224, China
- Yunnan Academy of Biodiversity, Southwest Forestry University, Kunming, 650224, China
- Xiaolei Huang
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, College of Plant Protection, Fujian Agriculture and Forestry University, Fuzhou, 350002, China.
2
Tang XF, Huang YH, Sun YF, Zhang PF, Huo LZ, Li HS, Pang H. The transcriptome of Icerya aegyptiaca (Hemiptera: Monophlebidae) and comparison with neococcoids reveal genetic clues of evolution in the scale insects. BMC Genomics 2023; 24:231. [PMID: 37138224 PMCID: PMC10158165 DOI: 10.1186/s12864-023-09327-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2022] [Accepted: 04/21/2023] [Indexed: 05/05/2023]
Abstract
BACKGROUND Scale insects are worldwide sap-sucking parasites, which can be divided into neococcoids and non-neococcoids. Neococcoids are monophyletic with a peculiar reproductive system, paternal genome elimination (PGE). Unlike neococcoids, Iceryini, a tribe of non-neococcoids that includes several damaging pests, has abdominal spiracles, compound eyes in males, relatively abundant wax, a unique hermaphrodite system, and specific symbionts. However, current studies on the gene resources and genomic mechanisms of scale insects are mainly limited to the neococcoids and lack comparison within an evolutionary framework. RESULT We sequenced and de novo assembled a transcriptome of Icerya aegyptiaca (Douglas), a worldwide pest of Iceryini, and used it as a representative of non-neococcoids to compare with the genomes or transcriptomes of six other species from different families of neococcoids. We found that the genes under positive selection or negative selection intensification (simplified as "selected genes" below) in I. aegyptiaca included those related to neurogenesis and development, especially eye development. Some genes related to fatty acid biosynthesis were unique to its transcriptome, with relatively high expression, and were not detected in neococcoids. These results may indicate a potential link to the unique structures and abundant wax of I. aegyptiaca compared with neococcoids. Meanwhile, genes related to DNA repair, mitosis, spindle, cytokinesis and oogenesis were included in the selected genes in I. aegyptiaca, which is possibly associated with cell division and germ cell formation of the hermaphrodite system. Chromatin-related processes were enriched among selected genes in neococcoids, and some mitosis-related genes were also detected, which may be related to their unique PGE system. Moreover, in neococcoid species, male-biased genes tend to undergo negative selection relaxation under the PGE system.
We also found that the candidate horizontally transferred genes (HTGs) in the scale insects were mainly derived from bacteria and fungi. bioD and bioB, the two biotin-synthesizing HTGs, were exclusively found in the scale insects and neococcoids, respectively, which possibly reflects changing demands in the symbiotic relationships. CONCLUSION Our study reports the first I. aegyptiaca transcriptome and provides preliminary insights into the genetic changes of structures, reproductive systems and symbiont relationships from an evolutionary perspective. This will provide a basis for further research and control of scale insects.
Affiliation(s)
- Xue-Fei Tang
- State Key Laboratory of Biocontrol, School of Ecology, Sun Yat-Sen University, Shenzhen, China
- Yu-Hao Huang
- State Key Laboratory of Biocontrol, School of Ecology, Sun Yat-Sen University, Shenzhen, China
- Yi-Fei Sun
- State Key Laboratory of Biocontrol, School of Ecology, Sun Yat-Sen University, Shenzhen, China
- Pei-Fang Zhang
- State Key Laboratory of Biocontrol, School of Ecology, Sun Yat-Sen University, Shenzhen, China
- Li-Zhi Huo
- Guangzhou Institute of Forestry and Landscape Architecture, Guangzhou, China
- Hao-Sen Li
- State Key Laboratory of Biocontrol, School of Ecology, Sun Yat-Sen University, Shenzhen, China
- Hong Pang
- State Key Laboratory of Biocontrol, School of Ecology, Sun Yat-Sen University, Shenzhen, China.
3
Hoang VT, Jeon HJ, You ES, Yoon Y, Jung S, Lee OJ. Graph Representation Learning and Its Applications: A Survey. Sensors (Basel, Switzerland) 2023; 23:4168. [PMID: 37112507 PMCID: PMC10144941 DOI: 10.3390/s23084168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2023] [Revised: 04/16/2023] [Accepted: 04/17/2023] [Indexed: 06/19/2023]
Abstract
Graphs are data structures that effectively represent relational data in the real world. Graph representation learning is a significant task since it could facilitate various downstream tasks, such as node classification, link prediction, etc. Graph representation learning aims to map graph entities to low-dimensional vectors while preserving graph structure and entity relationships. Over the decades, many models have been proposed for graph representation learning. This paper aims to show a comprehensive picture of graph representation learning models, including traditional and state-of-the-art models on various graphs in different geometric spaces. First, we begin with five types of graph embedding models: graph kernels, matrix factorization models, shallow models, deep-learning models, and non-Euclidean models. In addition, we also discuss graph transformer models and Gaussian embedding models. Second, we present practical applications of graph embedding models, from constructing graphs for specific domains to applying models to solve tasks. Finally, we discuss challenges for existing models and future research directions in detail. As a result, this paper provides a structured overview of the diversity of graph embedding models.
Affiliation(s)
- Van Thuy Hoang
- Department of Artificial Intelligence, The Catholic University of Korea, 43, Jibong-ro, Bucheon-si 14662, Gyeonggi-do, Republic of Korea
- Hyeon-Ju Jeon
- Data Assimilation Group, Korea Institute of Atmospheric Prediction Systems (KIAPS), 35, Boramae-ro 5-gil, Dongjak-gu, Seoul 07071, Republic of Korea
- Eun-Soon You
- Department of Artificial Intelligence, The Catholic University of Korea, 43, Jibong-ro, Bucheon-si 14662, Gyeonggi-do, Republic of Korea
- Yoewon Yoon
- Department of Social Welfare, Dongguk University, 30, Pildong-ro 1-gil, Jung-gu, Seoul 04620, Republic of Korea
- Sungyeop Jung
- Semiconductor Devices and Circuits Laboratory, Advanced Institute of Convergence Technology (AICT), Seoul National University, 145, Gwanggyo-ro, Yeongtong-gu, Suwon-si 16229, Gyeonggi-do, Republic of Korea
- O-Joun Lee
- Department of Artificial Intelligence, The Catholic University of Korea, 43, Jibong-ro, Bucheon-si 14662, Gyeonggi-do, Republic of Korea
4
Van Meenen J, Leysen H, Chen H, Baccarne R, Walter D, Martin B, Maudsley S. Making Biomedical Sciences publications more accessible for machines. Medicine, Health Care, and Philosophy 2022; 25:179-190. [PMID: 35039972 DOI: 10.1007/s11019-022-10069-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Accepted: 01/08/2022] [Indexed: 06/14/2023]
Abstract
With the rapidly expanding catalogue of scientific publications, especially within the Biomedical Sciences field, it is becoming increasingly difficult for researchers to search for, read or even interpret emerging scientific findings. PubMed, just one of the current biomedical data repositories, comprises over 33 million citations for biomedical research, and over 2500 publications are added each day. To further strengthen the impact of biomedical research, we suggest that there should be more synergy between publications and machines. By bringing machines into the realm of research and publication, we can greatly augment the assessment, investigation and cataloging of the biomedical literary corpus. The effective application of machine-based manuscript assessment and interpretation is now crucial, and potentially stands as the most effective way for researchers to comprehend and process the tsunami of biomedical data and literature. Many biomedical manuscripts are currently published online in poorly searchable document types, with figures and data presented in formats that are partially inaccessible to machine-based approaches. The structure and format of biomedical manuscripts should be adapted to facilitate machine-assisted interrogation of this important literary corpus. In this context, it is important to embrace the concept that biomedical scientists should also write manuscripts that can be read by machines. It is likely that an enhanced human-machine synergy in reading biomedical publications will greatly enhance biomedical data retrieval and reveal novel insights into complex datasets.
Affiliation(s)
- Joris Van Meenen
- Receptor Biology Lab, Department of Biomedical Sciences, University of Antwerp, Wilrijk, 2610, Antwerp, Belgium
- Antwerp Research Group for Ocular Science, Department of Translational Neurosciences, University of Antwerp, Wilrijk, 2610, Antwerp, Belgium
- Hanne Leysen
- Receptor Biology Lab, Department of Biomedical Sciences, University of Antwerp, Wilrijk, 2610, Antwerp, Belgium
- Hongyu Chen
- Weill Cornell Medical College, New York, NY, USA
- Rudi Baccarne
- Anet Library Automation, University of Antwerp, Wilrijk, 2610, Antwerp, Belgium
- Deborah Walter
- Receptor Biology Lab, Department of Biomedical Sciences, University of Antwerp, Wilrijk, 2610, Antwerp, Belgium
- Bronwen Martin
- Faculty of Pharmaceutical, Veterinary and Biomedical Sciences, University of Antwerp, Wilrijk, 2610, Antwerp, Belgium
- Stuart Maudsley
- Receptor Biology Lab, Department of Biomedical Sciences, University of Antwerp, Wilrijk, 2610, Antwerp, Belgium.
5
Peng Y, Tang Y, Lee S, Zhu Y, Summers RM, Lu Z. COVID-19-CT-CXR: A Freely Accessible and Weakly Labeled Chest X-Ray and CT Image Collection on COVID-19 From Biomedical Literature. IEEE Transactions on Big Data 2021; 7:3-12. [PMID: 33997112 PMCID: PMC8117951 DOI: 10.1109/tbdata.2020.3035935] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 06/29/2020] [Revised: 10/09/2020] [Accepted: 10/19/2020] [Indexed: 05/06/2023]
Abstract
The latest threat to global health is the COVID-19 outbreak. Although there exist large datasets of chest X-rays (CXR) and computed tomography (CT) scans, few COVID-19 image collections are currently available due to patient privacy. At the same time, there is a rapid growth of COVID-19-relevant articles in the biomedical literature, including those that report findings on radiographs. Here, we present COVID-19-CT-CXR, a public database of COVID-19 CXR and CT images, which are automatically extracted from COVID-19-relevant articles from the PubMed Central Open Access (PMC-OA) Subset. We extracted figures, associated captions, and relevant figure descriptions in the article and separated compound figures into subfigures. Because a large portion of figures in COVID-19 articles are not CXR or CT, we designed a deep-learning model to distinguish them from other figure types and to classify them accordingly. The final database includes 1,327 CT and 263 CXR images (as of May 9, 2020) with their relevant text. To demonstrate the utility of COVID-19-CT-CXR, we conducted four case studies. (1) We show that COVID-19-CT-CXR, when used as additional training data, is able to contribute to improved deep-learning (DL) performance for the classification of COVID-19 and non-COVID-19 CT. (2) We collected CT images of influenza, another common infectious respiratory illness that may present similarly to COVID-19, and fine-tuned a baseline deep neural network to distinguish a diagnosis of COVID-19, influenza, or normal or other types of diseases on CT. (3) We fine-tuned an unsupervised one-class classifier from non-COVID-19 CXR and performed anomaly detection to detect COVID-19 CXR. (4) From text-mined captions and figure descriptions, we compared 15 clinical symptoms and 20 clinical findings of COVID-19 versus those of influenza to demonstrate the disease differences in the scientific publications. 
Our database is unique, as the figures are retrieved along with relevant text with fine-grained descriptions, and it can be extended easily in the future. We believe that our work is complementary to existing resources and hope that it will contribute to medical image analysis of the COVID-19 pandemic. The dataset, code, and DL models are publicly available at https://github.com/ncbi-nlp/COVID-19-CT-CXR.
Affiliation(s)
- Yifan Peng
- NCBI/NLM/NIH and Department of Population Health Sciences, Weill Cornell Medicine, New York, NY 10065, USA
- Yuxing Tang
- Imaging Biomarkers and Computer-Aided Diagnosis Laboratory, Radiology and Imaging Sciences Department, National Institutes of Health (NIH) Clinical Center, Bethesda, MD 20892, USA
- Sungwon Lee
- Imaging Biomarkers and Computer-Aided Diagnosis Laboratory, Radiology and Imaging Sciences Department, National Institutes of Health (NIH) Clinical Center, Bethesda, MD 20892, USA
- Yingying Zhu
- Imaging Biomarkers and Computer-Aided Diagnosis Laboratory, Radiology and Imaging Sciences Department, National Institutes of Health (NIH) Clinical Center, Bethesda, MD 20892, USA
- Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, TX 76019, USA
- Ronald M. Summers
- Imaging Biomarkers and Computer-Aided Diagnosis Laboratory, Radiology and Imaging Sciences Department, National Institutes of Health (NIH) Clinical Center, Bethesda, MD 20892, USA
- Zhiyong Lu
- National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, MD 20894, USA
6
Li P, Jiang X, Shatkay H. Figure and caption extraction from biomedical documents. Bioinformatics 2020; 35:4381-4388. [PMID: 30949681 PMCID: PMC6821181 DOI: 10.1093/bioinformatics/btz228] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 08/13/2018] [Revised: 03/22/2019] [Accepted: 04/02/2019] [Indexed: 12/16/2022]
Abstract
Motivation Figures and captions convey essential information in biomedical documents. As such, there is a growing interest in mining published biomedical figures and in utilizing their respective captions as a source of knowledge. Notably, an essential step underlying such mining is the extraction of figures and captions from publications. While several PDF parsing tools that extract information from such documents are publicly available, they attempt to identify images by analyzing the PDF encoding and structure and the complex graphical objects embedded within. As such, they often incorrectly identify figures and captions in scientific publications, whose structure is often non-trivial. The extraction of figures, captions and figure-caption pairs from biomedical publications is thus neither well-studied nor yet well-addressed. Results We introduce a new and effective system for figure and caption extraction, PDFigCapX. Unlike existing methods, we first separate between text and graphical contents, and then utilize layout information to effectively detect and extract figures and captions. We generate files containing the figures and their associated captions and provide those as output to the end-user. We test our system both over a public dataset of computer science documents previously used by others, and over two newly collected sets of publications focusing on the biomedical domain. Our experiments and results comparing PDFigCapX to other state-of-the-art systems show a significant improvement in performance, and demonstrate the effectiveness and robustness of our approach. Availability and implementation Our system is publicly available for use at: https://www.eecis.udel.edu/~compbio/PDFigCapX. The two new datasets are available at: https://www.eecis.udel.edu/~compbio/PDFigCapX/Downloads
Affiliation(s)
- Pengyuan Li
- Department of Computer and Information Sciences, University of Delaware, Newark, DE, USA
- Xiangying Jiang
- Department of Computer and Information Sciences, University of Delaware, Newark, DE, USA
- Hagit Shatkay
- Department of Computer and Information Sciences, University of Delaware, Newark, DE, USA
7
Zhang T, Leng J, Liu Y. Deep learning for drug–drug interaction extraction from the literature: a review. Brief Bioinform 2019; 21:1609-1627. [DOI: 10.1093/bib/bbz087] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 03/20/2019] [Revised: 06/20/2019] [Accepted: 06/21/2019] [Indexed: 01/07/2023]
Abstract
Drug–drug interactions (DDIs) are crucial for drug research and pharmacovigilance. These interactions may cause adverse drug effects that threaten public health and patient safety. Therefore, DDI extraction from biomedical literature has been widely studied and emphasized in modern biomedical research. Previous rule-based and machine learning approaches rely on tedious feature engineering, which is laborious, time-consuming and unsatisfactory. With the development of deep learning technologies, this problem is alleviated by learning feature representations automatically. Here, we review the recent deep learning methods that have been applied to the extraction of DDIs from biomedical literature. We describe each method briefly and compare its performance on the DDI corpus systematically. Next, we summarize the advantages and disadvantages of these deep learning models for this task. Furthermore, we discuss some challenges and future perspectives of DDI extraction via deep learning methods. This review aims to serve as a useful guide for interested researchers to further advance bioinformatics algorithms for DDI extraction from the literature.
Affiliation(s)
- Tianlin Zhang
- School of Computer Science and Technology, University of Chinese Academy of Sciences, China
- Jiaxu Leng
- School of Computer Science and Technology, University of Chinese Academy of Sciences, China
- Ying Liu
- University of Chinese Academy of Sciences, Key Lab of Big Data Mining and Knowledge Management
8
Deng J, Lu C, Huang X. The first mitochondrial genome of scale insects (Hemiptera: Coccoidea). Mitochondrial DNA Part B-Resources 2019; 4:2094-2095. [PMID: 33365423 PMCID: PMC7687528 DOI: 10.1080/23802359.2019.1622464] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 11/10/2022]
Abstract
Here, we report the first mitochondrial genome of scale insects, sequenced from Ceroplastes japonicus (Hemiptera: Coccidae). The genome is circular and 14,979 bp in length, with a high A + T content of 85.15%. Twelve protein-coding genes (excluding atp8), 13 tRNA genes, and 2 rRNA genes were detected and annotated using the MITOS web server. The absence of atp8 and some tRNAs might indicate possible novel structures or loss of genes.
Affiliation(s)
- Jun Deng
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, College of Plant Protection, Fujian Agriculture and Forestry University, Fuzhou, China
- Congcong Lu
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, College of Plant Protection, Fujian Agriculture and Forestry University, Fuzhou, China
- Xiaolei Huang
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, College of Plant Protection, Fujian Agriculture and Forestry University, Fuzhou, China
9
Li P, Jiang X, Kambhamettu C, Shatkay H. Compound image segmentation of published biomedical figures. Bioinformatics 2018; 34:1192-1199. [PMID: 29040394 DOI: 10.1093/bioinformatics/btx611] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 03/10/2017] [Accepted: 09/22/2017] [Indexed: 12/28/2022]
Abstract
Motivation Images convey essential information in biomedical publications. As such, there is a growing interest within the bio-curation and the bio-databases communities, to store images within publications as evidence for biomedical processes and for experimental results. However, many of the images in biomedical publications are compound images consisting of multiple panels, where each individual panel potentially conveys a different type of information. Segmenting such images into constituent panels is an essential first step toward utilizing images. Results In this article, we develop a new compound image segmentation system, FigSplit, which is based on Connected Component Analysis. To overcome shortcomings typically manifested by existing methods, we develop a quality assessment step for evaluating and modifying segmentations. Two methods are proposed to re-segment the images if the initial segmentation is inaccurate. Experimental results show the effectiveness of our method compared with other methods. Availability and implementation The system is publicly available for use at: https://www.eecis.udel.edu/~compbio/FigSplit. The code is available upon request. Contact shatkay@udel.edu. Supplementary information Supplementary data are available online at Bioinformatics.
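The abstract states that FigSplit is based on Connected Component Analysis but gives no implementation detail. As a rough illustration of the general technique only, and not of FigSplit itself, the sketch below labels 4-connected foreground regions in a small binary grid and returns their bounding boxes, which is how candidate panels might be located once a figure has been binarized. The function name `component_boxes`, the choice of 4-connectivity, and the toy grid are all assumptions.

```python
from collections import deque


def component_boxes(mask):
    """Return bounding boxes (top, left, bottom, right) of 4-connected
    foreground regions in a binary grid, in row-major discovery order."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground cell.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Toy binarized "figure" with three separate regions (hypothetical data).
toy = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
]
print(component_boxes(toy))  # prints [(0, 0, 1, 1), (0, 4, 1, 4), (3, 1, 3, 3)]
```

A real system would still need the quality assessment and re-segmentation steps the abstract mentions; this sketch covers only the labeling stage.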
Affiliation(s)
- Pengyuan Li
- Department of Computer and Information Sciences, University of Delaware, Newark, DE 19716, USA
- Xiangying Jiang
- Department of Computer and Information Sciences, University of Delaware, Newark, DE 19716, USA
- Chandra Kambhamettu
- Department of Computer and Information Sciences, University of Delaware, Newark, DE 19716, USA
- Hagit Shatkay
- Department of Computer and Information Sciences, University of Delaware, Newark, DE 19716, USA
10
Vilar S, Friedman C, Hripcsak G. Detection of drug-drug interactions through data mining studies using clinical sources, scientific literature and social media. Brief Bioinform 2018; 19:863-877. [PMID: 28334070 PMCID: PMC6454455 DOI: 10.1093/bib/bbx010] [Citation(s) in RCA: 82] [Impact Index Per Article: 13.7] [Received: 10/14/2016] [Revised: 12/28/2016] [Indexed: 11/13/2022]
Abstract
Drug-drug interactions (DDIs) constitute an important concern in drug development and postmarketing pharmacovigilance. They are considered the cause of many adverse drug effects, exposing patients to higher risks and increasing public health system costs. Methods to follow up and discover possible DDIs causing harm to the population are a primary aim of drug safety researchers. Here, we review different methodologies and recent advances using data mining to detect DDIs with impact on patients. We focus on data mining of different pharmacovigilance sources, such as the US Food and Drug Administration Adverse Event Reporting System and electronic health records from medical institutions, as well as on the diverse data mining studies that use narrative text available in the scientific biomedical literature and social media. We pay attention to the strengths but also further explain challenges related to these methods. Data mining has important applications in the analysis of DDIs: showing the impact of the interactions as a cause of adverse effects, extracting interactions to create knowledge data sets and gold standards, and discovering novel and dangerous DDIs.
Affiliation(s)
- Santiago Vilar
- Department of Biomedical Informatics, Columbia University, New York, USA
- Department of Organic Chemistry, University of Santiago de Compostela, Spain
- Carol Friedman
- Department of Biomedical Informatics, Columbia University, New York, USA
- George Hripcsak
- Department of Biomedical Informatics, Columbia University, New York, USA
11
He Y, Yu X, Gan Y, Zhu T, Xiong S, Peng J, Hu L, Xu G, Yuan X. Bar charts detection and analysis in biomedical literature of PubMed Central. AMIA ... Annual Symposium Proceedings. AMIA Symposium 2018; 2017:859-865. [PMID: 29854152 PMCID: PMC5977659] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
Bar charts are crucial to summarize and present multi-faceted data sets in biomedical publications. Quantitative information carried by bar charts is of great interest to scientists and practitioners, which makes it valuable to parse bar charts. This fact, together with the abundance of bar chart images and their shared common patterns, makes bar charts good candidates for automated image mining and parsing. We demonstrate a workflow to analyze bar charts and give a few feasible solutions to apply it. We are able to detect bar segments and panels with promising performance in terms of both accuracy and recall, and we also perform extensive experiments to identify the entities of bar charts in the images of biomedical literature collected from PubMed Central. While we cannot provide a complete instance of the application using our method, we present evidence that this kind of image mining is feasible.
Affiliation(s)
- Ying He
- School of Computer Science and Technology, Wuhan University of Technology, Wuhan, China
- Xiaohan Yu
- School of Computer Science and Technology, Wuhan University of Technology, Wuhan, China
- Yangjing Gan
- School of Computer Science and Technology, Wuhan University of Technology, Wuhan, China
- Tujin Zhu
- School of Computer Science and Technology, Wuhan University of Technology, Wuhan, China
- Hubei Co-Innovation Center of Basic Education Information Technology Services, College of Computer, Hubei University of Education, Wuhan, China
- Shengwu Xiong
- School of Computer Science and Technology, Wuhan University of Technology, Wuhan, China
- Jing Peng
- School of Computer Science and Technology, Wuhan University of Technology, Wuhan, China
- Lun Hu
- School of Computer Science and Technology, Wuhan University of Technology, Wuhan, China
- Guang Xu
- Hubei Co-Innovation Center of Basic Education Information Technology Services, College of Computer, Hubei University of Education, Wuhan, China
- Xiaohui Yuan
- School of Computer Science and Technology, Wuhan University of Technology, Wuhan, China