1
Jamaludeen N, Beyer C, Billing U, Vogel K, Brunner-Weinzierl M, Spiliopoulou M. Potential of Point-of-Care and At-Home Assessment of Immune Status via Rapid Cytokine Detection and Questionnaire-Based Anamnesis. Sensors (Basel, Switzerland) 2021; 21:4960. [PMID: 34372196 PMCID: PMC8348245 DOI: 10.3390/s21154960] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/30/2021] [Revised: 06/30/2021] [Accepted: 07/07/2021] [Indexed: 12/29/2022]
Abstract
Monitoring the immune system's status has emerged as an urgent demand in critical health conditions. Circulating cytokine levels in the blood provide thorough insight into the immune system's status. Indeed, measuring a single cytokine may deliver information equivalent to detecting multiple diseases at a time. However, if the reported cytokine levels are interpreted in light of the individual's lifestyle and any comorbid health conditions, the assessment of the immune status becomes more precise. Therefore, this study addresses the most recent advanced assays that deliver rapid, accurate measurement of cytokine levels in human blood, focusing on add-on potential for point-of-care (PoC) or personal at-home usage, and investigates existing health questionnaires as supportive assessment tools that collect the information needed for a concrete analysis of the measured cytokine levels. We introduced a ten-dimensional characterization of cytokine measurement assays. We found 15 rapid cytokine assays with an assay time of less than 1 h; some could operate on unprocessed blood samples, while others are mature commercial products available on the market. In addition, we retrieved several health questionnaires that addressed various health conditions such as chronic diseases and psychological issues. We then present a machine learning-based solution to determine what makes the immune system fit. To this end, we discuss how to employ topic modeling to derive the definition of immune fitness automatically from the literature. Finally, we propose a prototype model that assesses the fitness of the immune system by leveraging the derived definition of immune fitness, the cytokine measurements delivered by a rapid PoC immunoassay, and the complementary information about other health factors collected by the health questionnaire.
In conclusion, we found various advanced rapid cytokine detection technologies that are promising candidates for point-of-care or at-home usage; when these are paired with a health status questionnaire, the assessment of the immune system status becomes more robust, and we demonstrated the potential of enhancing the assessment tool with data mining techniques.
Affiliation(s)
- Noor Jamaludeen
- Knowledge Management & Discovery Lab, Otto-von-Guericke University, 39106 Magdeburg, Germany
- Christian Beyer
- Knowledge Management & Discovery Lab, Otto-von-Guericke University, 39106 Magdeburg, Germany
- Ulrike Billing
- Department of Experimental Pediatrics, University Hospital, Otto-von-Guericke University, 39120 Magdeburg, Germany
- Katrin Vogel
- Department of Experimental Pediatrics, University Hospital, Otto-von-Guericke University, 39120 Magdeburg, Germany
- Monika Brunner-Weinzierl
- Department of Experimental Pediatrics, University Hospital, Otto-von-Guericke University, 39120 Magdeburg, Germany
- Myra Spiliopoulou
- Knowledge Management & Discovery Lab, Otto-von-Guericke University, 39106 Magdeburg, Germany
2
Kavvadias S, Drosatos G, Kaldoudi E. Supporting topic modeling and trends analysis in biomedical literature. J Biomed Inform 2020; 110:103574. [DOI: 10.1016/j.jbi.2020.103574] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 02/07/2020] [Revised: 08/24/2020] [Accepted: 09/12/2020] [Indexed: 11/25/2022]
3
Jiang Y, Wu C, Zhang Y, Zhang S, Yu S, Lei P, Lu Q, Xi Y, Wang H, Song Z. GTX.Digest.VCF: an online NGS data interpretation system based on intelligent gene ranking and large-scale text mining. BMC Med Genomics 2019; 12:193. [PMID: 31856831 PMCID: PMC6923899 DOI: 10.1186/s12920-019-0637-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 11/19/2019] [Accepted: 11/26/2019] [Indexed: 02/07/2023]
Abstract
Background: An important task in the interpretation of sequencing data is to highlight pathogenic genes (or detrimental variants) in the field of Mendelian diseases. It is still challenging despite the recent rapid development of genomics and bioinformatics. A typical interpretation workflow includes annotation, filtration, manual inspection and literature review. Those steps are time-consuming and error-prone in the absence of systematic support. Therefore, we developed GTX.Digest.VCF, an online DNA sequencing interpretation system, which prioritizes genes and variants for novel disease-gene relation discovery and integrates text mining results to provide literature evidence for the discovery. Its phenotype-driven ranking and biological data mining approach significantly speeds up the whole interpretation process. Results: The GTX.Digest.VCF system is freely available as a web portal at http://vcf.gtxlab.com for academic research. Evaluation on the DDD project dataset demonstrates an accuracy of 77% (235 out of 305 cases) for top-50 genes and an accuracy of 41.6% (127 out of 305 cases) for top-5 genes. Conclusions: GTX.Digest.VCF provides an intelligent web portal for genomics data interpretation via the integration of bioinformatics tools, distributed parallel computing, and biomedical text mining. It can facilitate the application of genomic analytics in clinical research and practice.
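The top-k accuracies quoted in this abstract follow directly from the reported case counts. The sketch below is only an illustration: the function name `top_k_accuracy` and the toy gene rankings are assumptions made here, not part of GTX.Digest.VCF, and only the counts 235, 127 and 305 come from the abstract.

```python
def top_k_accuracy(ranked_genes_per_case, causal_gene_per_case, k):
    """Fraction of cases whose causal gene appears among the top-k ranked genes."""
    hits = sum(
        1
        for ranked, causal in zip(ranked_genes_per_case, causal_gene_per_case)
        if causal in ranked[:k]
    )
    return hits / len(causal_gene_per_case)


# Toy example (hypothetical gene names): the causal gene is in the top 2
# for the first case only, so the top-2 accuracy is 1 out of 2.
toy_rankings = [["A", "B", "C"], ["D", "E", "F"]]
toy_causal = ["B", "F"]
toy_accuracy = top_k_accuracy(toy_rankings, toy_causal, k=2)

# Counts reported in the abstract: 235/305 (top-50) and 127/305 (top-5).
top50 = 235 / 305
top5 = 127 / 305
print(f"{top50:.1%}, {top5:.1%}")  # prints "77.0%, 41.6%"
```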
Affiliation(s)
- Chengkun Wu
- State Key Laboratory of High-Performance Computing, College of Computer, National University of Defense Technology, Changsha, 410073, China
- Yanghui Zhang
- NHC key laboratory of birth defects research, prevention and treatment (Hunan Provincial Maternal and Child Health Care Hospital), NO.53 Xiangchun Road, Changsha, 410008, Hunan, China
- Shaowei Zhang
- Genetalks Biotech. Co., Ltd., Changsha, 410000, China
- Shuojun Yu
- Genetalks Biotech. Co., Ltd., Changsha, 410000, China
- Peng Lei
- Genetalks Biotech. Co., Ltd., Changsha, 410000, China
- Qin Lu
- Genetalks Biotech. Co., Ltd., Changsha, 410000, China
- Yanwei Xi
- Cytogenetics and Human Molecular Genetics Laboratories, Royal University Hospital, Saskatoon, SK, Canada
- Hua Wang
- NHC key laboratory of birth defects research, prevention and treatment (Hunan Provincial Maternal and Child Health Care Hospital), NO.53 Xiangchun Road, Changsha, 410008, Hunan, China
- Hunan Provincial Maternal and Child Health Care Hospital, Changsha, 410073, China
- Zhuo Song
- Genetalks Biotech. Co., Ltd., Changsha, 410000, China
4
Zhang G, Wang W, Huang W, Xie X, Liang Z, Cao H. Cross-disease analysis identified novel common genes for both lung adenocarcinoma and lung squamous cell carcinoma. Oncol Lett 2019; 18:3463-3470. [PMID: 31516564 PMCID: PMC6732964 DOI: 10.3892/ol.2019.10678] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 09/06/2018] [Accepted: 05/25/2019] [Indexed: 12/25/2022]
Abstract
Lung squamous cell carcinoma (LSCC) exhibits a number of similarities with lung adenocarcinoma (LA) in terms of copy number alterations. However, compared with LA, the range of genetic alterations in LSCC is less understood. In the present study, a large-scale literature-based search of LA-associated genes and LSCC-associated genes was performed to identify the genetic basis common to these two diseases. For each of the LA-associated genes, a mega-analysis was performed to test its expression variations in LSCC using 11 RNA expression datasets, with significant genes identified using statistical analysis. Subsequently, a functional pathway analysis was performed to identify a possible association between any of the significant genes identified from the mega-analysis and LSCC, followed by a co-expression analysis. A multiple linear regression (MLR) model was employed to investigate the possible influence of sample size, country of origin and study date on gene expression in patients with LSCC. Disease-gene association data analysis identified 1,178 genes involved in LA and 334 in LSCC, with a significant overlap of 187 genes (P<1.02×10^−161). Mega-analysis revealed that three LA-associated genes, namely solute carrier family 2 member 1 (SLC2A1), endothelial PAS domain protein 1 (EPAS1) and cyclin-dependent kinase 4 (CDK4), were significantly associated with LSCC (P<1.60×10^−8), with multiple potential pathways identified by functional pathway analysis, which were further validated by co-expression analysis. The present MLR analysis suggested that the country of origin was a significant factor for the levels of expression of all three genes in patients with LSCC (P<4.0×10^−3). Collectively, the present results suggested that genes associated with LA should be further investigated for their association with LSCC. In addition, SLC2A1, EPAS1 and CDK4 may be novel risk genes associated with LA and LSCC.
Affiliation(s)
- Guanghui Zhang
- Department of Cardiothoracic Surgery, Ningbo Fourth Hospital, Ningbo, Zhejiang 315037, P.R. China
- Weijie Wang
- Department of Cardiothoracic Surgery, Ningbo Fourth Hospital, Ningbo, Zhejiang 315037, P.R. China
- Weiyang Huang
- Department of Cardiothoracic Surgery, Ningbo Fourth Hospital, Ningbo, Zhejiang 315037, P.R. China
- Xiaoli Xie
- Department of Cardiothoracic Surgery, Ningbo Fourth Hospital, Ningbo, Zhejiang 315037, P.R. China
- Zhigang Liang
- Department of Thoracic Surgery, Ningbo First Hospital, Ningbo, Zhejiang 315000, P.R. China
- Hongbao Cao
- Statistical Genomics and Data Analysis Core, National Institutes of Health, Bethesda, MD 20852, USA
5
Shen F, Zhao Y, Wang L, Mojarad MR, Wang Y, Liu S, Liu H. Rare disease knowledge enrichment through a data-driven approach. BMC Med Inform Decis Mak 2019; 19:32. [PMID: 30764825 PMCID: PMC6376651 DOI: 10.1186/s12911-019-0752-9] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 08/13/2018] [Accepted: 02/01/2019] [Indexed: 01/03/2023]
Abstract
BACKGROUND: Existing resources to assist the diagnosis of rare diseases are usually curated from the literature, which can be of limited clinical use. It often takes substantial effort before the suspicion of a rare disease is even raised and those resources are utilized. The primary goal of this study was to apply a data-driven approach to enrich existing rare disease resources by mining phenotype-disease associations from electronic medical records (EMR). METHODS: We first applied association rule mining algorithms to EMR to extract significant phenotype-disease associations and enriched existing rare disease resources (Human Phenotype Ontology and Orphanet (HPO-Orphanet)). We generated phenotype-disease bipartite graphs for HPO-Orphanet, EMR, and the enriched knowledge base HPO-Orphanet+ and conducted a case study on Hodgkin lymphoma to compare performance on differential diagnosis among these three graphs. RESULTS: We used disease-disease similarity generated by eRAM, an existing rare disease encyclopedia, as a gold standard to compare the three graphs, with sensitivity and specificity of (0.17, 0.36, 0.46) and (0.52, 0.47, 0.51) for the three graphs, respectively. We also compared the top 15 diseases generated by the HPO-Orphanet+ graph with eRAM and another clinical diagnostic tool, the Phenomizer. CONCLUSIONS: Per our evaluation results, our approach was able to enrich existing rare disease knowledge resources with phenotype-disease associations from EMR and thus support rare disease differential diagnosis.
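Association rule mining over phenotype-disease pairs, as named in this abstract, can be illustrated with a support and confidence filter. This is a minimal sketch under stated assumptions: the patient records are made up, the function name `phenotype_disease_rules` and both thresholds are hypothetical, and the abstract does not say which algorithm or cut-offs the authors used.

```python
from collections import Counter

# Made-up records for illustration: a set of phenotypes plus one diagnosis each.
records = [
    ({"fever", "night sweats"}, "Hodgkin lymphoma"),
    ({"fever", "weight loss"}, "Hodgkin lymphoma"),
    ({"fever"}, "influenza"),
    ({"cough"}, "influenza"),
]


def phenotype_disease_rules(records, min_support=0.25, min_confidence=0.6):
    """Return sorted (phenotype, disease, support, confidence) tuples for rules
    of the form phenotype -> disease that pass both thresholds."""
    n = len(records)
    phenotype_counts = Counter()
    pair_counts = Counter()
    for phenotypes, disease in records:
        for phenotype in phenotypes:
            phenotype_counts[phenotype] += 1
            pair_counts[(phenotype, disease)] += 1

    rules = []
    for (phenotype, disease), count in pair_counts.items():
        support = count / n                            # share of all records
        confidence = count / phenotype_counts[phenotype]  # P(disease | phenotype)
        if support >= min_support and confidence >= min_confidence:
            rules.append((phenotype, disease, support, confidence))
    return sorted(rules)


rules = phenotype_disease_rules(records)
for phenotype, disease, support, confidence in rules:
    print(f"{phenotype} -> {disease}: support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data, "fever -> influenza" is dropped because only one of the three fever records carries that diagnosis, which falls below the assumed confidence threshold.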
Affiliation(s)
- Feichen Shen
- Department of Health Sciences Research, Mayo Clinic, 205 3rd Ave SW, Rochester, MN, 55905, USA
- Yiqing Zhao
- Department of Health Sciences Research, Mayo Clinic, 205 3rd Ave SW, Rochester, MN, 55905, USA
- Liwei Wang
- Department of Health Sciences Research, Mayo Clinic, 205 3rd Ave SW, Rochester, MN, 55905, USA
- Majid Rastegar Mojarad
- Department of Health Sciences Research, Mayo Clinic, 205 3rd Ave SW, Rochester, MN, 55905, USA
- Yanshan Wang
- Department of Health Sciences Research, Mayo Clinic, 205 3rd Ave SW, Rochester, MN, 55905, USA
- Sijia Liu
- Department of Health Sciences Research, Mayo Clinic, 205 3rd Ave SW, Rochester, MN, 55905, USA
- Hongfang Liu
- Department of Health Sciences Research, Mayo Clinic, 205 3rd Ave SW, Rochester, MN, 55905, USA
6
Wang Y, Sohn S, Liu S, Shen F, Wang L, Atkinson EJ, Amin S, Liu H. A clinical text classification paradigm using weak supervision and deep representation. BMC Med Inform Decis Mak 2019; 19:1. [PMID: 30616584 PMCID: PMC6322223 DOI: 10.1186/s12911-018-0723-6] [Citation(s) in RCA: 131] [Impact Index Per Article: 26.2] [Received: 05/23/2018] [Accepted: 12/10/2018] [Indexed: 01/02/2023]
Abstract
BACKGROUND: Automatic clinical text classification is a natural language processing (NLP) technology that unlocks information embedded in clinical narratives. Machine learning approaches have been shown to be effective for clinical text classification tasks. However, a successful machine learning model usually requires extensive human effort to create labeled training data and conduct feature engineering. In this study, we propose a clinical text classification paradigm using weak supervision and deep representation to reduce this human effort. METHODS: We develop a rule-based NLP algorithm to automatically generate labels for the training data, and then use pre-trained word embeddings as deep representation features for training machine learning models. Since the machine learning models are trained on labels generated by the automatic NLP algorithm, this training process is called weak supervision. We evaluate the effectiveness of the paradigm on two institutional case studies at Mayo Clinic: smoking status classification and proximal femur (hip) fracture classification, and on one case study using a public dataset: the i2b2 2006 smoking status classification shared task. We test four widely used machine learning models, namely, Support Vector Machine (SVM), Random Forest (RF), Multilayer Perceptron Neural Networks (MLPNN), and Convolutional Neural Networks (CNN), using this paradigm. Precision, recall, and F1 score are used as metrics to evaluate performance. RESULTS: CNN achieves the best performance in both institutional tasks (F1 score: 0.92 for Mayo Clinic smoking status classification and 0.97 for fracture classification). We show that word embeddings significantly outperform tf-idf and topic modeling features in the paradigm, and that CNN captures additional patterns from the weak supervision compared to the rule-based NLP algorithms.
We also observe two drawbacks of the proposed paradigm: CNN is more sensitive to the size of the training data, and the paradigm might not be effective for complex multiclass classification tasks. CONCLUSION: The proposed clinical text classification paradigm could reduce the human effort of creating labeled training data and of feature engineering when applying machine learning to clinical text classification, by leveraging weak supervision and deep representation. The experiments have validated the effectiveness of the paradigm on two institutional and one shared clinical text classification tasks.
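The weak supervision loop described in this abstract (a rule produces labels, then a model is trained on them) can be sketched with the standard library alone. Everything specific here is an assumption for illustration: the cue-word lists, the toy notes and the function names are hypothetical, and a small bag-of-words naive Bayes classifier stands in for the pre-trained word embeddings and CNN that the paper evaluates.

```python
import math
from collections import Counter, defaultdict

# Hypothetical cue words; a real rule-based NLP algorithm is far richer.
SMOKING_CUES = {"smokes", "smoker", "cigarettes"}
NEGATION_CUES = {"denies", "never", "non-smoker"}


def rule_label(note):
    """Weak label from a keyword rule instead of a human annotator."""
    tokens = set(note.lower().split())
    if tokens & NEGATION_CUES:
        return "non-smoker"
    if tokens & SMOKING_CUES:
        return "smoker"
    return None  # the rule abstains


def train_naive_bayes(notes, labels):
    """Collect per-class word counts, class counts and the vocabulary."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    for note, label in zip(notes, labels):
        word_counts[label].update(note.lower().split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, class_counts, vocab


def predict(model, note):
    """Pick the class with the highest Laplace-smoothed log-probability."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in note.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up, unlabeled notes; labels come only from the rule above.
unlabeled_notes = [
    "patient smokes one pack daily",
    "long term smoker with chronic cough",
    "buys cigarettes weekly and has chronic cough",
    "patient denies tobacco use",
    "never used tobacco products",
    "lifelong non-smoker no tobacco exposure",
]
weak_pairs = [(note, rule_label(note)) for note in unlabeled_notes]
train_pairs = [(note, label) for note, label in weak_pairs if label is not None]
model = train_naive_bayes(
    [note for note, _ in train_pairs], [label for _, label in train_pairs]
)

# The rule abstains on this note (no cue word), but the trained model does not.
new_note = "chronic cough pack daily"
print(rule_label(new_note), predict(model, new_note))  # prints "None smoker"
```

The last line mirrors, on a toy scale, the abstract's observation that a model trained on weak labels can pick up patterns that the labeling rule itself does not encode.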
Affiliation(s)
- Yanshan Wang
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, 200 1st ST SW, Rochester, MN 55905 USA
- Sunghwan Sohn
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, 200 1st ST SW, Rochester, MN 55905 USA
- Sijia Liu
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, 200 1st ST SW, Rochester, MN 55905 USA
- Feichen Shen
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, 200 1st ST SW, Rochester, MN 55905 USA
- Liwei Wang
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, 200 1st ST SW, Rochester, MN 55905 USA
- Elizabeth J. Atkinson
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, 200 1st ST SW, Rochester, MN 55905 USA
- Shreyasee Amin
- Division of Rheumatology, Department of Medicine, Mayo Clinic, 200 1st ST SW, Rochester, MN 55905 USA
- Division of Epidemiology, Department of Health Sciences Research, Mayo Clinic, 200 1st ST SW, Rochester, MN 55905 USA
- Hongfang Liu
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, 200 1st ST SW, Rochester, MN 55905 USA