1
|
Zhang Z, Wang ZX, Chen YX, Wu HX, Yin L, Zhao Q, Luo HY, Zeng ZL, Qiu MZ, Xu RH. Integrated analysis of single-cell and bulk RNA sequencing data reveals a pan-cancer stemness signature predicting immunotherapy response. Genome Med 2022; 14:45. [PMID: 35488273 PMCID: PMC9052621 DOI: 10.1186/s13073-022-01050-w] [Citation(s) in RCA: 149] [Impact Index Per Article: 49.7] [Received: 07/24/2021] [Accepted: 04/19/2022] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND Although immune checkpoint inhibitor (ICI) therapy is regarded as a breakthrough in cancer therapy, only a limited fraction of patients benefit from it. Cancer stemness may be a culprit in ICI resistance, but direct clinical evidence is lacking. METHODS Publicly available scRNA-Seq datasets derived from ICI-treated patients were collected and analyzed to elucidate the association between cancer stemness and ICI response. A novel stemness signature (Stem.Sig) was developed and validated using large-scale pan-cancer data, including 34 scRNA-Seq datasets, The Cancer Genome Atlas (TCGA) pan-cancer cohort, and 10 ICI transcriptomic cohorts. The therapeutic value of Stem.Sig genes was further explored using 17 CRISPR datasets that screened potential immunotherapy targets. RESULTS Cancer stemness, as evaluated by CytoTRACE, was found to be significantly associated with ICI resistance in melanoma and basal cell carcinoma (both P < 0.001). A significantly negative association was found between Stem.Sig and anti-tumor immunity, while positive correlations were detected between Stem.Sig and intra-tumoral heterogeneity (ITH) / total mutational burden (TMB). Based on this signature, a machine learning model predicted ICI response with an AUC of 0.71 in both the validation and testing sets. Remarkably, compared with previous well-established signatures, Stem.Sig achieved better predictive performance across multiple cancers. Moreover, we generated a gene list ranked by the average effect of each gene's knockout on enhancing tumor immune response across different CRISPR datasets. We then matched Stem.Sig to this gene list and found that Stem.Sig was significantly enriched for the top-ranked 3% of genes from the list (P = 0.03), including EMC3, BECN1, VPS35, PCBP2, VPS29, PSMF1, GCLC, KXD1, SPRR1B, PTMA, YBX1, CYP27B1, NACA, PPP1CA, TCEB2, PIGC, NR0B2, PEX13, SERF2, and ZBTB43, which were potential therapeutic targets.
CONCLUSIONS We revealed a robust link between cancer stemness and immunotherapy resistance and developed a promising signature, Stem.Sig, which showed better performance than other signatures in ICI response prediction. This signature could serve as a competitive tool for selecting patients for immunotherapy. Meanwhile, our study potentially paves the way for overcoming immune resistance by targeting stemness-associated genes.
|
research-article |
3 |
149 |
2
|
Tsimberidou AM. Targeted therapy in cancer. Cancer Chemother Pharmacol 2015; 76:1113-32. [PMID: 26391154 DOI: 10.1007/s00280-015-2861-1] [Citation(s) in RCA: 143] [Impact Index Per Article: 14.3] [Received: 07/10/2015] [Accepted: 08/30/2015] [Indexed: 01/02/2023]
Abstract
PURPOSE To describe the emergence of targeted therapies that have led to significant breakthroughs in cancer therapy, as well as completed or ongoing clinical trials of novel agents for the treatment of patients with advanced cancer. METHODS The literature was systematically reviewed, based on clinical experience and the use of technologies that improved our understanding of carcinogenesis. RESULTS Genomics and model systems have enabled the validation of novel therapeutic strategies. Tumor molecular profiling has enabled the reclassification of cancer and elucidated some mechanisms of disease progression or resistance to treatment, the heterogeneity between primary and metastatic tumors, and the dynamic changes of tumor molecular profiling over time. Despite the notable technologic advances, there is a gap between the plethora of preclinical data and the lack of effective therapies, which is attributed to suboptimal drug development for "driver" alterations of human cancer, the high cost of clinical trials and available drugs, and limited access of patients to clinical trials. Bioinformatic analyses of complex data to characterize tumor biology, function, and the dynamic tumor changes in time and space may improve cancer diagnosis. The application of discoveries in cancer biology in the clinic holds the promise of improving clinical outcomes for a large number of patients with cancer. Increased harmonization between discoveries, policies, and practices will expedite the development of anticancer drugs and will accelerate the implementation of precision medicine. CONCLUSIONS Combinations of targeted, immunomodulating, antiangiogenic, or chemotherapeutic agents are in clinical development. Innovative adaptive study design is used to expedite effective drug development.
|
Review |
10 |
143 |
3
|
Qian S, Golubnitschaja O, Zhan X. Chronic inflammation: key player and biomarker-set to predict and prevent cancer development and progression based on individualized patient profiles. EPMA J 2019; 10:365-381. [PMID: 31832112 PMCID: PMC6882964 DOI: 10.1007/s13167-019-00194-x] [Citation(s) in RCA: 136] [Impact Index Per Article: 22.7] [Received: 10/28/2019] [Accepted: 11/06/2019] [Indexed: 12/24/2022]
Abstract
A strong relationship exists between tumors and inflammation, which is a hot topic in cancer research. Inflammation can promote the occurrence and development of cancer by promoting blood vessel growth, cancer cell proliferation, and tumor invasiveness, negatively regulating immune response, and changing the efficacy of certain anti-tumor drugs. It has been demonstrated that there are a large number of inflammatory factors and inflammatory cells in the tumor microenvironment, and tumor-promoting immunity and anti-tumor immunity exist simultaneously in the tumor microenvironment. The typical relationship between chronic inflammation and tumors is illustrated by the relationships between Helicobacter pylori, chronic gastritis, and gastric cancer; between smoking, development of chronic pneumonia, and lung cancer; and between hepatitis virus (mainly hepatitis B and C viruses), development of chronic hepatitis, and liver cancer. Preventing chronic inflammation can help prevent cancer, so effectively inhibiting or blocking the occurrence, development, and progression of chronic inflammation plays an important role in cancer prevention. Monitoring of the causes and inflammatory factors in chronic inflammation processes is a useful way to predict cancer and assess the efficiency of cancer prevention. Chronic inflammation-based biomarkers are useful tools to predict and prevent cancer.
|
Review |
6 |
136 |
4
|
Hillestad EMR, van der Meeren A, Nagaraja BH, Bjørsvik BR, Haleem N, Benitez-Paez A, Sanz Y, Hausken T, Lied GA, Lundervold A, Berentsen B. Gut bless you: The microbiota-gut-brain axis in irritable bowel syndrome. World J Gastroenterol 2022; 28:412-431. [PMID: 35125827 PMCID: PMC8790555 DOI: 10.3748/wjg.v28.i4.412] [Citation(s) in RCA: 57] [Impact Index Per Article: 19.0] [Received: 04/17/2021] [Revised: 06/24/2021] [Accepted: 01/13/2022] [Indexed: 12/16/2022] Open
Abstract
Irritable bowel syndrome (IBS) is a common clinical label for medically unexplained gastrointestinal symptoms, recently described as a disturbance of the microbiota-gut-brain axis. Despite decades of research, the pathophysiology of this highly heterogeneous disorder remains elusive. However, a dramatic change in the understanding of the underlying pathophysiological mechanisms surfaced when the importance of gut microbiota entered the scientific picture. Are we getting any closer to understanding the etiology of IBS, or are we drowning in unspecific, conflicting data because we possess limited tools to unravel the cluster of secrets our gut microbiota is concealing? In this comprehensive review we discuss some of the major features of IBS and their interaction with gut microbiota, clinical microbiota-altering treatments such as the low FODMAP diet and fecal microbiota transplantation, neuroimaging and methods in microbiota analyses, and current and future challenges with big data analysis in IBS.
|
Review |
3 |
57 |
5
|
Huang F, Ding H, Liu Z, Wu P, Zhu M, Li A, Zhu T. How fear and collectivism influence public's preventive intention towards COVID-19 infection: a study based on big data from the social media. BMC Public Health 2020; 20:1707. [PMID: 33198699 PMCID: PMC7667474 DOI: 10.1186/s12889-020-09674-6] [Citation(s) in RCA: 57] [Impact Index Per Article: 11.4] [Received: 07/20/2020] [Accepted: 10/12/2020] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND Despite worldwide calls for precautionary measures to combat COVID-19, the public's preventive intention still varies significantly among different regions. Exploring the factors that influence the public's preventive intention is very important for curtailing the spread of COVID-19. Previous studies have found that fear can effectively improve the public's preventive intention, but they ignore the impact of differences in cultural values. The present study examines the combined effect of fear and collectivism on the public's preventive intention towards COVID-19 through the analysis of social media big data. METHODS The Sina microblog posts of 108,914 active users from 31 provinces of the Chinese mainland were downloaded. The data were retrieved from January 11 to February 21, 2020. Afterwards, we conducted a province-level analysis of the contents of the downloaded posts. Three lexicons were applied to automatically score fear, collectivism, and preventive intention for the 31 provinces. After that, a multiple regression model was established to examine the combined effect of fear and collectivism on the public's preventive intention towards COVID-19. The simple slope test and the Johnson-Neyman technique were used to test the interaction of fear and collectivism on preventive intention. RESULTS The study reveals that: (a) both fear and collectivism can positively predict people's preventive intention and (b) there is an interaction of fear and collectivism on people's preventive intention, where fear and collectivism reduce each other's positive influence on people's preventive intention. CONCLUSION The promotion of fear on people's preventive intention may be limited and conditional, and values of collectivism can well compensate for the promotion of fear on preventive intention. These results provide scientific insight into how to enhance the public's preventive intention towards COVID-19 effectively.
|
research-article |
5 |
57 |
6
|
Hasselwander M, Tamagusko T, Bigotte JF, Ferreira A, Mejia A, Ferranti EJS. Building back better: The COVID-19 pandemic and transport policy implications for a developing megacity. SUSTAINABLE CITIES AND SOCIETY 2021; 69:102864. [PMID: 36568855 PMCID: PMC9760281 DOI: 10.1016/j.scs.2021.102864] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 10/20/2020] [Revised: 03/14/2021] [Accepted: 03/16/2021] [Indexed: 05/03/2023]
Abstract
The COVID-19 pandemic has affected human mobility via lockdowns, social distancing rules, home quarantines, and the full or partial suspension of transportation. Evidence-based policy recommendations are urgently needed to ensure that transport systems are resilient to future pandemic outbreaks, particularly within Global South megacities where demand for public transport is high and reduced access can exacerbate socio-economic inequalities. This study focuses on Metro Manila, a characteristic megacity that experienced one of the most stringent lockdowns worldwide. It analyzes aggregated cell phone and GPS data from Google and Apple that provide a comprehensive representation of mobility behavior before and during the lockdown. While significant decreases are observed for all transport modes, public transport experienced the largest drop (-74.5%, on average). The study demonstrates that: (i) those most reliant on public transport were disproportionately affected by lockdowns; (ii) public transport was unable to fulfil its role as a public service; and (iii) this drove a paradigm shift towards active mobility. Moving forward, in the short term, policymakers must promote active mobility and prioritize public transport to reduce unequal access to transport. In the longer term, policymakers must leverage the increased active transport to encourage modal shift via infrastructure investment, and better utilize big data to support decision-making.
|
research-article |
4 |
27 |
7
|
Mazur DM, Detenchuk EA, Sosnova AA, Artaev VB, Lebedev AT. GC-HRMS with Complementary Ionization Techniques for Target and Non-target Screening for Chemical Exposure: Expanding the Insights of the Air Pollution Markers in Moscow Snow. THE SCIENCE OF THE TOTAL ENVIRONMENT 2021; 761:144506. [PMID: 33360203 DOI: 10.1016/j.scitotenv.2020.144506] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 11/04/2020] [Revised: 12/08/2020] [Accepted: 12/11/2020] [Indexed: 06/12/2023]
Abstract
Environmental exposure assessment is an important step in establishing a list of local priority pollutants and finding the sources of the threats for proposing appropriate protection measures. Exposome targeted and non-targeted analysis as well as suspect screening may be applied to reveal these pollutants. Non-targeted screening is a challenging task and requires the application of the most powerful analytical tools available, assuring wide analytical coverage, sensitivity, identification reliability, and quantitation. Moscow, Russia, is the largest and most rapidly growing European city. That rapid growth is causing changes in the environment which require periodic clarification of the real environmental situation regarding the presence of the classic pollutants and possible new contaminants. Gas chromatography - high resolution time-of-flight mass spectrometry (GC-HR-TOFMS) with electron ionization (EI), positive chemical ionization (PCI), and electron capture negative ionization (ECNI) ion sources was used for the analysis of Moscow snow samples collected in the early spring of 2018 in nine different locations. Collection of snow samples represents an efficient approach for the estimation of long-term air pollution, due to accumulation and preservation of environmental contaminants by snow during the winter period. The high separation power of GC, complementary ionization methods, high mass accuracy, and wide mass range of TOFMS allowed for the identification of several hundred organic compounds belonging to the various classes of pollutants, exposure to which could represent a danger to the health of the population. Although quantitative analysis was not a primary aim of the study, targeted analysis revealed that some priority pollutants exceeded the established safe levels. Thus, dibutylphthalate concentration was over 10-fold higher than its safe level (0.001 mg/L), while benz[a]pyrene concentration exceeded the Russian maximal permissible concentration value of 5 ng/L in three samples. The large amount of information generated by combining targeted and non-targeted analysis with suspect screening of samples makes it feasible to apply big data analysis to observe the trends and tendencies in the pollution exposome across the city.
|
Review |
4 |
26 |
8
|
Chen Z, Bird VY, Ruchi R, Segal MS, Bian J, Khan SR, Elie MC, Prosperi M. Development of a personalized diagnostic model for kidney stone disease tailored to acute care by integrating large clinical, demographics and laboratory data: the diagnostic acute care algorithm - kidney stones (DACA-KS). BMC Med Inform Decis Mak 2018; 18:72. [PMID: 30119627 PMCID: PMC6098647 DOI: 10.1186/s12911-018-0652-4] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 10/31/2017] [Accepted: 08/06/2018] [Indexed: 11/10/2022] Open
Abstract
Background Kidney stone (KS) disease has high, increasing prevalence in the United States and poses a massive economic burden. Diagnostic algorithms for KS use only a few variables and have limited sensitivity and specificity. In this study, we tested a big data approach to infer and validate a ‘multi-domain’ personalized diagnostic acute care algorithm for KS (DACA-KS), merging demographic, vital signs, clinical, and laboratory information. Methods We utilized a large, single-center database of patients admitted to acute care units in a large tertiary care hospital. Patients diagnosed with KS were compared to groups of patients with acute abdominal/flank/groin pain, genitourinary diseases, and other conditions. We analyzed multiple information domains (several thousands of variables) using a collection of statistical and machine learning models with feature selectors. We compared sensitivity, specificity and area under the receiver operating characteristic curve (AUROC) of our approach with the STONE score, using cross-validation. Results A total of 38,597 distinct adult patients were admitted to critical care between 2001 and 2012, of which 217 were diagnosed with KS, and 7446 with acute pain (non-KS). The multi-domain approach using logistic regression yielded an AUROC of 0.86 and a sensitivity/specificity of 0.81/0.82 in cross-validation. An increase in performance was obtained by fitting a super-learner, at the price of lower interpretability. We discussed in detail comorbidity and lab marker variables independently associated with KS (e.g. blood chloride, candidiasis, sleep disorders). Conclusions Although external validation is warranted, DACA-KS could be integrated into electronic health systems; the algorithm has the potential to be used as an effective tool to help nurses and healthcare personnel during triage or clinicians making a diagnosis, streamlining patients’ management in acute care.
|
Research Support, Non-U.S. Gov't |
7 |
19 |
9
|
Mei Y, Yang JP, Qian CN. For robust big data analyses: a collection of 150 important pro-metastatic genes. CHINESE JOURNAL OF CANCER 2017; 36:16. [PMID: 28109319 PMCID: PMC5251273 DOI: 10.1186/s40880-016-0178-z] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 10/19/2016] [Accepted: 11/03/2016] [Indexed: 02/08/2023]
Abstract
Metastasis is the greatest contributor to cancer-related death. In the era of precision medicine, it is essential to predict and to prevent the spread of cancer cells to significantly improve patient survival. Thanks to the application of a variety of high-throughput technologies, accumulating big data enables researchers and clinicians to identify aggressive tumors as well as patients with a high risk of cancer metastasis. However, there have been few large-scale gene collection studies to enable metastasis-related analyses. In the last several years, emerging efforts have identified pro-metastatic genes in a variety of cancers, providing us the ability to generate a pro-metastatic gene cluster for big data analyses. We carefully selected 285 genes with in vivo evidence of promoting metastasis reported in the literature. These genes have been investigated in different tumor types. We used two datasets downloaded from The Cancer Genome Atlas database, specifically, datasets of clear cell renal cell carcinoma and hepatocellular carcinoma, for validation tests, and excluded any genes for which elevated expression level correlated with longer overall survival in any of the datasets. Ultimately, 150 pro-metastatic genes remained in our analyses. We believe this collection of pro-metastatic genes will be helpful for big data analyses, and eventually will accelerate anti-metastasis research and clinical intervention.
|
Review |
8 |
18 |
10
|
Zhou M, Wang R, Cheng S, Xu Y, Luo S, Zhang Y, Kong L. Bibliometrics and visualization analysis regarding research on the development of microplastics. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2021; 28:8953-8967. [PMID: 33447976 DOI: 10.1007/s11356-021-12366-2] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 10/21/2020] [Accepted: 01/03/2021] [Indexed: 06/12/2023]
Abstract
Microplastics have caused considerable harm to the environment and threatened human health due to their strong adsorption and resistance to biodegradation. Therefore, research on microplastics has received increasing attention recently, producing numerous related results. To comprehensively grasp the quantitative information of published papers on "microplastics," we analyzed the research progress and hotspots of "microplastics" using the visualization software "VOSviewer." The results show that the number of publications on microplastics from 2009 to 2019 increased exponentially (R² = 0.9873). The top 10 cited references mainly concern "zooplankton ingesting microplastics," "microplastics in artificially cultivated bivalve," "microplastics in surface waters such as lakes," etc. Cutting-edge microplastics research covers adsorption, biodegradation, ingestion and accumulation models, and toxicity analysis. In addition, the results predict that the combination of constructed wetland, biotechnology, and photocatalysis to remove microplastics will become a new hotspot. The study provides researchers in microplastics with an overview of existing research and directional guidance for future research.
|
Review |
4 |
18 |
11
|
Maturo MG, Soligo M, Gibson G, Manni L, Nardini C. The greater inflammatory pathway-high clinical potential by innovative predictive, preventive, and personalized medical approach. EPMA J 2020; 11:1-16. [PMID: 32140182 PMCID: PMC7028895 DOI: 10.1007/s13167-019-00195-w] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 10/31/2019] [Accepted: 11/13/2019] [Indexed: 12/13/2022]
Abstract
BACKGROUND AND LIMITATIONS Impaired wound healing (WH) and chronic inflammation are hallmarks of non-communicable diseases (NCDs). However, despite WH being a recognized player in NCDs, mainstream therapies focus on (un)targeted damping of the inflammatory response, leaving WH largely unaddressed, owing to three main factors. The first is the complexity of the pathway that links inflammation and wound healing; the second is the dual nature, local and systemic, of WH; and the third is the limited acknowledgement of genetic and contingent causes that disrupt physiologic progression of WH. PROPOSED APPROACH Here, in the frame of Predictive, Preventive, and Personalized Medicine (PPPM), we integrate and revisit current literature to offer a novel systemic view on the cues that can affect the fate (acute or chronic inflammation) of WH, beyond the compartmentalization of medical disciplines and with the support of advanced computational biology. CONCLUSIONS This should open the way to a broader understanding of the causes of WH going awry, offering new operational criteria for patient stratification (prediction and personalization). While this may also offer improved options for targeted prevention, we envisage new therapeutic strategies to reboot and/or boost WH, to enable its progression across its physiological phases, the first of which is a transient acute inflammatory response versus the chronic low-grade inflammation characteristic of NCDs.
|
Review |
5 |
16 |
12
|
Annual prevalence and economic burden of genital warts in Korea: Health Insurance Review and Assessment (HIRA) service data from 2007 to 2015. Epidemiol Infect 2017; 146:177-186. [PMID: 29235433 DOI: 10.1017/s0950268817002813] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Indexed: 01/12/2023] Open
Abstract
This study evaluated the annual prevalence of anogenital warts (AGW) caused by human papillomavirus (HPV) and analysed the trend in annual per cent changes (APC) by using national claims data from the Health Insurance Review and Assessment of Korea, 2007-2015. We also estimated the socio-economic burden and co-morbidities of AGW. All analyses were performed based on data for primary A63.0, the specific diagnosis code for AGW. The socio-economic cost of AGW was calculated based on the direct medical cost, direct non-medical cost and indirect cost. The overall AGW prevalence and socio-economic burden have increased during the last 9 years. However, the prevalence of AGW differed significantly by sex. The female prevalence increased until 2012, and decreased thereafter (APC +3.6%). It would fall after the introduction of routine HPV vaccination, principally for females, in Korea. The male prevalence increased continuously over time (APC +11.6%), especially in those aged 20-49 years. Given the increasing AGW prevalence and its disease burden, active HPV infection control surveillance and prevention in males are worth consideration.
|
Research Support, Non-U.S. Gov't |
8 |
16 |
13
|
Elkin LS, Topal K, Bebek G. Network based model of social media big data predicts contagious disease diffusion. INFORMATION DISCOVERY AND DELIVERY 2017; 45:110-120. [PMID: 31179401 PMCID: PMC6554721 DOI: 10.1108/idd-05-2017-0046] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 11/17/2022]
Abstract
PURPOSE – Predicting future outbreaks and understanding how they spread from location to location can improve the patient care provided. Recently, mining social media big data has provided the ability to track patterns and trends across the world. This study aims to analyze social media micro-blogs and geographical locations to understand how disease outbreaks spread over geographies and to enhance forecasting of future disease outbreaks. DESIGN/METHODOLOGY/APPROACH – In this paper, the authors use Twitter data as the social media data source, influenza-like illnesses (ILI) as the disease epidemic and states in the USA as geographical locations. They present a novel network-based model to make predictions about the spread of diseases a week in advance utilizing social media big data. FINDINGS – The authors showed that flu-related tweets align well with ILI data from the Centers for Disease Control and Prevention (CDC) (p < 0.049). The authors compared this model to earlier approaches that utilized airline traffic, and showed that ILI activity estimates of their model were more accurate. They also found that their disease diffusion model yielded accurate predictions for upcoming ILI activity (p < 0.04), and they predicted the diffusion of flu across states based on geographical surroundings at 76 per cent accuracy. The equations and procedures can be translated to apply to any social media data, other contagious diseases and geographies to mine large data sets. ORIGINALITY/VALUE – First, while extensive work has been presented utilizing time-series analysis on single geographies, or post-analysis of highly contagious diseases, no previous work has provided a generalized solution to identify how contagious diseases diffuse across geographies, such as states in the USA. Secondly, due to the nature of the social media data, various statistical models have been extensively used to address these problems.
|
research-article |
8 |
13 |
14
|
Jamshidi A, Faghih-Roohi S, Hajizadeh S, Núñez A, Babuska R, Dollevoet R, Li Z, De Schutter B. A Big Data Analysis Approach for Rail Failure Risk Assessment. RISK ANALYSIS : AN OFFICIAL PUBLICATION OF THE SOCIETY FOR RISK ANALYSIS 2017; 37:1495-1507. [PMID: 28561899 DOI: 10.1111/risa.12836] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 06/07/2023]
Abstract
Railway infrastructure monitoring is a vital task to ensure rail transportation safety. A rail failure could have a considerable impact not only on train delays and maintenance costs, but also on passenger safety. In this article, the aim is to assess the risk of a rail failure by analyzing a type of rail surface defect called squats that are detected automatically among the huge number of records from video cameras. We propose an image processing approach for automatic detection of squats, especially severe types that are prone to rail breaks. We measure the visual length of the squats and use them to model the failure risk. For the assessment of the rail failure risk, we estimate the probability of rail failure based on the growth of squats. Moreover, we perform severity and crack growth analyses to consider the impact of rail traffic loads on defects in three different growth scenarios. The failure risk estimations are provided for several samples of squats with different crack growth lengths on a busy rail track of the Dutch railway network. The results illustrate the practicality and efficiency of the proposed approach.
|
|
8 |
11 |
15
|
Mesiti M, Re M, Valentini G. Think globally and solve locally: secondary memory-based network learning for automated multi-species function prediction. Gigascience 2014; 3:5. [PMID: 24843788 PMCID: PMC4006453 DOI: 10.1186/2047-217x-3-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 12/04/2013] [Accepted: 04/01/2014] [Indexed: 01/08/2023] Open
Abstract
Background Network-based learning algorithms for automated function prediction (AFP) are negatively affected by the limited coverage of experimental data and limited a priori known functional annotations. As a consequence their application to model organisms is often restricted to well characterized biological processes and pathways, and their effectiveness with poorly annotated species is relatively limited. A possible solution to this problem might consist in the construction of big networks including multiple species, but this in turn poses challenging computational problems, due to the scalability limitations of existing algorithms and the main memory requirements induced by the construction of big networks. Distributed computation or the usage of big computers could in principle respond to these issues, but raises further algorithmic problems and require resources not satisfiable with simple off-the-shelf computers. Results We propose a novel framework for scalable network-based learning of multi-species protein functions based on both a local implementation of existing algorithms and the adoption of innovative technologies: we solve “locally” the AFP problem, by designing “vertex-centric” implementations of network-based algorithms, but we do not give up thinking “globally” by exploiting the overall topology of the network. This is made possible by the adoption of secondary memory-based technologies that allow the efficient use of the large memory available on disks, thus overcoming the main memory limitations of modern off-the-shelf computers. This approach has been applied to the analysis of a large multi-species network including more than 300 species of bacteria and to a network with more than 200,000 proteins belonging to 13 Eukaryotic species. To our knowledge this is the first work where secondary-memory based network analysis has been applied to multi-species function prediction using biological networks with hundreds of thousands of proteins. 
Conclusions The combination of these algorithmic and technological approaches makes feasible the analysis of large multi-species networks using ordinary computers with limited speed and primary memory, and in perspective could enable the analysis of huge networks (e.g. the whole proteomes available in SwissProt), using well-equipped stand-alone machines.
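To make the "vertex-centric" idea in the abstract above concrete, the following is a minimal sketch of a generic score-propagation step written from the point of view of a single vertex. It is not the authors' implementation: the function names (`vertex_update`, `propagate`), the damping parameter `alpha`, and the weighted-average update rule are all assumptions chosen for illustration, and the secondary-memory storage layer described in the abstract is not modeled here.

```python
def vertex_update(v, scores, neighbors, labels, alpha=0.5):
    """Recompute the score of one vertex using only its own neighborhood.

    Hypothetical update rule: blend the vertex's known label with the
    weighted average of its neighbors' current scores.
    Edge weights are assumed to be positive.
    """
    nbrs = neighbors[v]
    prior = labels.get(v, 0.0)
    if not nbrs:
        return prior
    total_weight = sum(w for _, w in nbrs)
    avg = sum(w * scores[u] for u, w in nbrs) / total_weight
    return (1 - alpha) * prior + alpha * avg


def propagate(neighbors, labels, n_iter=20, alpha=0.5):
    """Apply the local update to every vertex for a fixed number of rounds."""
    scores = {v: labels.get(v, 0.0) for v in neighbors}
    for _ in range(n_iter):
        # Each vertex reads only its neighbors' scores from the previous round.
        scores = {
            v: vertex_update(v, scores, neighbors, labels, alpha)
            for v in neighbors
        }
    return scores


# Toy chain a - b - c, where only protein "a" has a known annotation.
toy_graph = {
    "a": [("b", 1.0)],
    "b": [("a", 1.0), ("c", 1.0)],
    "c": [("b", 1.0)],
}
toy_scores = propagate(toy_graph, {"a": 1.0})
```

Because each update touches only one vertex and its adjacency list, a computation of this shape can in principle be streamed from disk one neighborhood at a time, which is the property the abstract relies on.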
|
Journal Article |
11 |
10 |
16
|
Feng Z, Bhat RR, Yuan X, Freeman D, Baslanti T, Bihorac A, Li X. Intelligent Perioperative System: Towards Real-time Big Data Analytics in Surgery Risk Assessment. DASC-PICOM-DATACOM-CYBERSCITECH 2017 : 2017 IEEE 15TH INTERNATIONAL CONFERENCE ON DEPENDABLE, AUTONOMIC AND SECURE COMPUTING ; 2017 IEEE 15TH INTERNATIONAL CONFERENCE ON PERVASIVE INTELLIGENCE AND COMPUTING ; 2017 IEEE 3RD INTERNATIONAL... 2017; 2017:1254-1259. [PMID: 30272054 DOI: 10.1109/dasc-picom-datacom-cyberscitec.2017.201] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 12/23/2022]
Abstract
Surgery risk assessment is an effective tool for physicians to manage the treatment of patients, but most current research projects fall short of providing a comprehensive platform to evaluate patients' surgery risk in terms of different complications. The recent evolution of big data analysis techniques makes it possible to develop a real-time platform that dynamically analyzes surgery risk from large-scale patient information. In this paper, we propose the Intelligent Perioperative System (IPS), a real-time system that assesses the risk of postoperative complications (PC) and dynamically interacts with physicians to improve the predictive results. To process large volumes of patient data in real time, we design the system by integrating several big data computing and storage frameworks with high-throughput streaming data processing components. We also implement a system prototype, along with visualization results, to show the feasibility of the system design.
|
Journal Article |
8 |
9 |
17
|
Ware AP, Kabekkodu SP, Chawla A, Paul B, Satyamoorthy K. Diagnostic and prognostic potential clustered miRNAs in bladder cancer. 3 Biotech 2022; 12:173. [PMID: 35845108 PMCID: PMC9279521 DOI: 10.1007/s13205-022-03225-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/02/2022] [Accepted: 06/18/2022] [Indexed: 12/13/2022] Open
Abstract
At specific genomic loci, miRNAs occur in clusters, and their association with copy number variations (CNVs) may lead to abnormal expression in several cancers. Hence, the current study aims to understand the expression of miRNA clusters residing within CNVs and the regulation of their target genes in bladder cancer. To achieve this, we used extensive bioinformatics resources and performed an integrated analysis of recurrent CNVs, clustered miRNA expression, gene expression, and drug-gene interaction datasets. The study identified nine upregulated miRNA clusters residing in CNV gain regions, and three miRNA clusters (hsa-mir-200c/mir-141, hsa-mir-216a/mir-217, and hsa-mir-15b/mir-16-2) that are correlated with patient survival. These clustered miRNAs targeted 89 genes that were downregulated in bladder cancer. Moreover, network and gene enrichment analysis revealed 10 hub genes (CCND2, ETS1, FGF2, FN1, JAK2, JUN, KDR, NOTCH1, PTEN, and ZEB1) that have significant potential for the diagnosis and prognosis of bladder cancer patients. Interestingly, candidates from the hsa-mir-200c/mir-141 and hsa-mir-15b/mir-16-2 clusters showed significant stage-specific differences in expression during cancer progression. Downregulation of NOTCH1 by hsa-mir-200c/mir-141 may also sensitize tumors to methotrexate, thus suggesting potential chemotherapeutic options for bladder cancer subjects. To overcome some computational challenges and reduce the complexity of multistep big data analysis, we developed an automated pipeline called CmiRClustFinder v1.0 (https://github.com/msls-bioinfo/CmiRClustFinder_v1.0), which can perform integrated data analysis of 35 TCGA cancer types. SUPPLEMENTARY INFORMATION The online version contains supplementary material available at 10.1007/s13205-022-03225-z.
|
research-article |
3 |
8 |
18
|
Park DI. Genomics, transcriptomics, proteomics and big data analysis in the discovery of new diagnostic markers and targets for therapy development. PROGRESS IN MOLECULAR BIOLOGY AND TRANSLATIONAL SCIENCE 2020; 173:61-90. [PMID: 32711818 DOI: 10.1016/bs.pmbts.2020.04.017] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/22/2022]
Abstract
Highly complex endophenotypes and underlying molecular mechanisms have prevented effective diagnosis and treatment of autism spectrum disorder. Despite extensive studies to identify relevant biosignatures, no biomarkers or therapeutic targets are available in current clinical practice. While our current knowledge is still largely incomplete, -omics technology and machine learning-based big data analysis have provided novel insights into the etiology of autism spectrum disorder, elucidating systemic impairments that can be translated into biomarker and therapy target candidates. However, more integrated and sophisticated approaches are vital to realize molecular stratification and individualized treatment strategies. Ultimately, systemic approaches based on -omics and big data analysis will significantly contribute to more effective biomarker and therapy development for autism spectrum disorder.
|
Review |
5 |
7 |
19
|
Gojobori T, Ikeo K, Katayama Y, Kawabata T, Kinjo AR, Kinoshita K, Kwon Y, Migita O, Mizutani H, Muraoka M, Nagata K, Omori S, Sugawara H, Yamada D, Yura K. VaProS: a database-integration approach for protein/genome information retrieval. JOURNAL OF STRUCTURAL AND FUNCTIONAL GENOMICS 2016; 17:69-81. [PMID: 28012137 PMCID: PMC5274651 DOI: 10.1007/s10969-016-9211-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 09/03/2016] [Accepted: 12/05/2016] [Indexed: 01/01/2023]
Abstract
Life science research now relies heavily on databases of genome sequences, transcription, protein three-dimensional (3D) structures, protein-protein interactions, phenotypes and so forth. The knowledge accumulated by omics research is so vast that a computer-aided search of data is now a prerequisite for starting a new study. In addition, a combinatory search across these databases can yield new ideas and hypotheses that can be examined by wet-lab experiments. By virtually integrating the related databases on the Internet, we have built a new web application that helps life science researchers retrieve the expert knowledge stored in the databases and build new hypotheses about their research target. This web application, named VaProS, emphasizes the interconnection between the functional information of genome sequences and protein 3D structures, such as the structural effect of a gene mutation. In this manuscript, we present the concept of VaProS, the databases and tools that can be accessed without any knowledge of database locations and data formats, and the power of the search, exemplified by an investigation of the molecular mechanisms of lysosomal storage disease. VaProS can be freely accessed at http://p4d-info.nig.ac.jp/vapros/ .
|
research-article |
9 |
7 |
20
|
Using data-driven sublanguage pattern mining to induce knowledge models: application in medical image reports knowledge representation. BMC Med Inform Decis Mak 2018; 18:61. [PMID: 29980203 PMCID: PMC6035419 DOI: 10.1186/s12911-018-0645-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 07/31/2017] [Accepted: 06/27/2018] [Indexed: 12/04/2022] Open
Abstract
Background The use of knowledge models facilitates information retrieval and knowledge base development, and therefore supports new knowledge discovery that ultimately enables decision support applications. Most existing works have employed machine learning techniques to construct a knowledge base. However, they often suffer from low precision in extracting entities and relationships. In this paper, we describe a data-driven sublanguage pattern mining method that can be used to create a knowledge model. We combined natural language processing (NLP) and semantic network analysis in our model generation pipeline. Methods As a use case of our pipeline, we utilized data from an open source imaging case repository, Radiopaedia.org, to generate a knowledge model that represents the contents of medical imaging reports. We extracted entities and relationships using the Stanford part-of-speech parser and the “Subject:Relationship:Object” syntactic data schema. The identified noun phrases were tagged with the Unified Medical Language System (UMLS) semantic types. An evaluation was done on a dataset composed of 83 image notes from four data sources. Results A semantic type network was built based on the co-occurrence of 135 UMLS semantic types in 23,410 medical image reports. By regrouping the semantic types and generalizing the semantic network, we created a knowledge model that contains 14 semantic categories. Our knowledge model was able to cover 98% of the content in the evaluation corpus and revealed 97% of the relationships. Machine annotation achieved a precision of 87%, recall of 79%, and F-score of 82%. Conclusion The results indicated that our pipeline was able to produce a comprehensive content-based knowledge model that could represent context from various sources in the same domain. Electronic supplementary material The online version of this article (10.1186/s12911-018-0645-3) contains supplementary material, which is available to authorized users.
|
Research Support, Non-U.S. Gov't |
7 |
7 |
21
|
Mahmoudi T, Naghdi T, Morales-Narváez E, Golmohammadi H. Toward smart diagnosis of pandemic infectious diseases using wastewater-based epidemiology. Trends Analyt Chem 2022; 153:116635. [PMID: 35440833 PMCID: PMC9010328 DOI: 10.1016/j.trac.2022.116635] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 09/13/2021] [Revised: 03/21/2022] [Accepted: 04/07/2022] [Indexed: 12/12/2022]
Abstract
The COVID-19 outbreak revealed fundamental weaknesses of current diagnostic systems, particularly in the prediction and subsequent prevention of pandemic infectious diseases (PIDs). Among PID detection methods, wastewater-based epidemiology (WBE) has been demonstrated to be a favorable means of estimating community-wide health. Moreover, by going beyond the purely sensing uses of WBE, it can be efficiently exploited in Healthcare 4.0/5.0 for surveillance, monitoring, control, and above all prediction and prevention, thereby resulting in smart sensing and management of potential outbreaks/epidemics/pandemics. Herein, an overview of WBE sensors for PIDs is presented. The philosophy behind the smart diagnosis of PIDs using WBE with the help of digital technologies is then discussed, as well as the characteristics such technologies must meet. Analytical techniques that are pushing the frontiers of smart sensing and have a high potential to be used in the smart diagnosis of PIDs via WBE are surveyed. In this context, we underscore key challenges ahead and provide recommendations for implementing and moving faster toward smart diagnostics.
|
Review |
3 |
7 |
22
|
Tian C, Feng C, Chen L, Wang Q. Impact of water source mixture and population changes on the Al residue in megalopolitan drinking water. WATER RESEARCH 2020; 186:116335. [PMID: 32882454 DOI: 10.1016/j.watres.2020.116335] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 07/01/2020] [Revised: 08/19/2020] [Accepted: 08/22/2020] [Indexed: 06/11/2023]
Abstract
This study establishes a new understanding of the contributions to Al residue in a megalopolitan drinking water supply system with mixed water sources. The different influences and contributions of a foreign water source, resident migration, and seasonal change on Al residue in drinking water were investigated. In particular, the role of Southern water, transferred over 1200 km via the South-to-North Water Diversion Project, in the Al residue of the drinking water supply system of a northern megalopolis was revealed for the first time. Comparisons of big data on Al residue in the water supply system with sole and mixed water sources showed that the introduction of Southern water increased the Al residue in drinking water by over 35%. The world's largest annual migration of residents, during the Chinese Lunar New Year, and seasonal changes affect the hydrodynamics of the water pipework, which manifested as periodic changes in particulate aluminium and their relationship with the residents' spatiotemporal distribution in the megalopolis. Because of the differences in water quality, Southern water promotes the release of historically deposited Al and facilitates the cleaning of old pipes.
|
|
5 |
6 |
23
|
Fine S, Chaudhri A, Englebright J, Dan Roberts W. Nursing process, derived from the clinical care classification system components, as an earlier indicator of nursing care during a pandemic. Int J Med Inform 2023; 173:104954. [PMID: 36842361 DOI: 10.1016/j.ijmedinf.2022.104954] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/26/2022] [Revised: 11/06/2022] [Accepted: 12/02/2022] [Indexed: 12/13/2022]
Abstract
BACKGROUND During the COVID pandemic response, an early signal was desired beyond typical financial classifications or order sets. The foundational work of Virginia K Saba informed the essential, symbiotic relationship of nursing practice and resource utilization by means of the Clinical Care Classification System [CCC]. Scholars have confirmed the use of the CCC as the structure for data modeling, focusing on the concept of nursing cost [1]. Therefore, the purpose of this retrospective, descriptive study was to determine whether analysis of CCC Care Component codes could provide a high-granularity signal of early shifts in patient demographics and in nursing care interventions and, then, to determine whether shifts in nursing care interventions indicated changes in resource utilization. METHODS For a large multi-facility healthcare system in the USA, patients cared for in an acute care setting/hospital-based care unit were the population of interest. Through prior and ongoing efforts to ensure Evidence-Based Clinical Documentation [EBCD], a data model was utilized to determine changes in patients' nursing diagnoses and nursing interventions during care episodes, for patients with acute symptoms or diagnosed/confirmed COVID. RESULTS The structure of CCC revealed 22 billion individual instances of the CCC Care Component/Concept codes in the data sets for 2017 and during COVID, a considerably large data set suitable for pre- and post-event analyses. The component codes were included in a string data set for concept/diagnosis/intervention. DISCUSSION In our analysis, these CCC Information Model elements showed a clear ability to detect increasing demands on nursing and resources earlier than other data models, including supply chain data, provider-documented diagnostic codes, or laboratory test codes. Therefore, we conclude that the CCC System structure and Nursing Intervention codes allow for earlier detection of pandemic care nursing resource demands, despite the perceived challenges of "timeliness of documentation" attributed to the more constrained timelines of data models of nursing care.
|
|
2 |
4 |
24
|
Wu Y, He Z, Lin H, Zheng Y, Zhang J, Xu D. A Fast Projection-Based Algorithm for Clustering Big Data. Interdiscip Sci 2018; 11:360-366. [PMID: 29882026 DOI: 10.1007/s12539-018-0294-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 11/13/2017] [Revised: 03/18/2018] [Accepted: 03/22/2018] [Indexed: 01/01/2023]
Abstract
With the fast development of various techniques, more and more data have been accumulated with the unique properties of large size (tall) and high dimension (wide). The era of big data is coming. How to understand and discover new knowledge from these data has attracted growing attention from scholars and has become the most important task in data mining. As one of the most important techniques in data mining, clustering analysis, a kind of unsupervised learning, can group a data set into clusters that are meaningful, useful, or both. Thus, the technique has played a very important role in knowledge discovery in big data. However, when facing large-sized and high-dimensional data, most current clustering methods exhibit poor computational efficiency and a high demand for computational resources, which prevents us from clarifying the intrinsic properties of the data and discovering the new knowledge behind them. Based on this consideration, we developed a powerful clustering method, called MUFOLD-CL. The principle of the method is to project the data points onto the centroid, and then to measure the similarity between any two points by calculating their projections on the centroid. The proposed method can achieve linear time complexity with respect to the sample size. Comparison with the K-Means method on very large data showed that our method produced better accuracy and required less computational time, demonstrating that MUFOLD-CL can serve as a valuable tool for big data clustering, or at least play a complementary role to other existing methods. Further comparisons with state-of-the-art clustering methods on smaller datasets showed that our method was the fastest and achieved comparable accuracy. For the convenience of most scholars, a free software package was constructed.
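The projection principle stated in the abstract above can be illustrated with a minimal sketch. This is not the MUFOLD-CL algorithm itself, whose details are not given here: the function names (`projection_scores`, `bin_by_projection`) and the equal-width binning step used to turn projections into groups are assumptions made only to show how a one-dimensional projection onto the centroid direction can be computed and used in time linear in the number of points.

```python
import math


def projection_scores(points):
    """Scalar projection of each point onto the direction of the data centroid."""
    n = len(points)
    dim = len(points[0])
    centroid = [sum(p[d] for p in points) / n for d in range(dim)]
    norm = math.sqrt(sum(c * c for c in centroid))
    if norm == 0:
        raise ValueError("centroid is at the origin; direction is undefined")
    return [sum(p[d] * centroid[d] for d in range(dim)) / norm for p in points]


def bin_by_projection(points, n_bins):
    """Hypothetical grouping step: equal-width bins over the projection values.

    Points whose projections are close fall into the same bin, so the
    projection acts as a cheap one-dimensional similarity measure.
    """
    scores = projection_scores(points)
    lo, hi = min(scores), max(scores)
    width = (hi - lo) / n_bins or 1.0  # avoid dividing by zero if all scores are equal
    return [min(int((s - lo) / width), n_bins - 1) for s in scores]


toy_points = [(1.0, 1.0), (1.1, 0.9), (5.0, 5.0), (5.2, 4.9)]
toy_bins = bin_by_projection(toy_points, n_bins=2)
```

Each point is visited a constant number of times, so the cost grows linearly with the sample size, which matches the complexity claim in the abstract; the trade-off is that points at different positions can share the same projection value.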
|
|
7 |
4 |
25
|
Pecoraro V, Pirotti T, Trenti T. Evidence of SARS-CoV-2 reinfection: analysis of 35,000 subjects and overview of systematic reviews. Clin Exp Med 2023; 23:1213-1224. [PMID: 36289100 PMCID: PMC9607758 DOI: 10.1007/s10238-022-00922-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/14/2022] [Accepted: 10/11/2022] [Indexed: 11/29/2022]
Abstract
BACKGROUND Reinfection by SARS-CoV-2 is a rare but possible event. We evaluated the prevalence of reinfections in the Province of Modena and performed an overview of systematic reviews to summarize the current knowledge. METHODS We applied big data analysis and retrospectively analysed the results of oro- or naso-pharyngeal swabs tested for molecular detection of SARS-CoV-2 viral RNA between 1 January 2021 and 30 June 2021 at a single center. We selected individuals whose samples showed a sequence of positive, negative and then positive results. We required a time interval of 90 days between the first and second positive results to be sure of a reinfection. We also searched for and evaluated systematic reviews reporting SARS-CoV-2 reinfection rates. The main information was collected and the methodological quality of each review was assessed according to A Measurement Tool to Assess systematic Reviews (AMSTAR). RESULTS Initial positive results were found in more than 35,000 (20%) subjects; most (28%) were aged 30-49 years. Reinfection was reported in 1,258 (3.5%); most (33%) were aged 30-49 years. Reinfection rates in vaccinated and non-vaccinated subjects were 0.6% vs 1.1% (p < 0.0001). Nine systematic reviews were identified and confirmed that SARS-CoV-2 reinfection is a rare event. AMSTAR revealed very low to moderate levels of quality among the selected systematic reviews. CONCLUSIONS There is a real, albeit rare, risk of SARS-CoV-2 reinfection. Big data analysis enabled accurate estimates of the reinfection rates. Nevertheless, a standardized approach to identifying and reporting reinfection cases should be developed.
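The selection rule described in the METHODS above (a positive, then a negative, then a positive result, with 90 days between the two positives) can be sketched as a small function. This is an illustration under stated assumptions, not the study's actual code: the function name `is_reinfection`, the `(date, is_positive)` record layout, and the choice to treat the 90-day interval as a minimum (greater than or equal to) are all hypothetical.

```python
from datetime import date


def is_reinfection(tests, min_gap_days=90):
    """Return True if the swab history matches positive -> negative -> positive.

    `tests` is a list of (date, is_positive) tuples for one subject.
    Assumption: the second positive must fall at least `min_gap_days`
    after the first positive.
    """
    first_positive = None
    seen_negative = False
    for day, positive in sorted(tests):
        if positive and first_positive is None:
            first_positive = day
        elif first_positive is not None and not positive:
            seen_negative = True
        elif positive and seen_negative:
            if (day - first_positive).days >= min_gap_days:
                return True
    return False


# Hypothetical swab histories, for illustration only.
history_a = [
    (date(2021, 1, 5), True),
    (date(2021, 1, 25), False),
    (date(2021, 4, 20), True),   # 105 days after the first positive
]
history_b = [
    (date(2021, 1, 5), True),
    (date(2021, 1, 25), False),
    (date(2021, 2, 20), True),   # only 46 days after the first positive
]
history_c = [
    (date(2021, 1, 5), True),
    (date(2021, 4, 20), True),   # no intervening negative result
]
```

Requiring an intervening negative result and a minimum interval is what separates a candidate reinfection from prolonged shedding after a single infection, which is why both conditions appear in the rule.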
|
research-article |
2 |
3 |