1
|
Germino J, Szymanski A, Eicher-Miller HA, Metoyer R, Chawla NV. Corrigendum: A community focused approach toward making healthy and affordable daily diet recommendations. Front Big Data 2024; 7:1396638. [PMID: 38638341 PMCID: PMC11024675 DOI: 10.3389/fdata.2024.1396638] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Accepted: 03/27/2024] [Indexed: 04/20/2024] Open
Abstract
[This corrects the article DOI: 10.3389/fdata.2023.1086212.].
Affiliation(s)
- Joe Germino
- Department of Computer Science and Engineering, Lucy Family Institute, University of Notre Dame, Notre Dame, IN, United States
| | - Annalisa Szymanski
- Department of Computer Science and Engineering, Lucy Family Institute, University of Notre Dame, Notre Dame, IN, United States
| | | | - Ronald Metoyer
- Department of Computer Science and Engineering, Lucy Family Institute, University of Notre Dame, Notre Dame, IN, United States
| | - Nitesh V. Chawla
- Department of Computer Science and Engineering, Lucy Family Institute, University of Notre Dame, Notre Dame, IN, United States
| |
|
2
|
Wang D, Zhao T, Yu W, Chawla NV, Jiang M. Deep Multimodal Complementarity Learning. IEEE Trans Neural Netw Learn Syst 2023; 34:10213-10224. [PMID: 35436202 DOI: 10.1109/tnnls.2022.3165180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Complementarity plays a significant role in the synergistic effect created by different components of a complex data object. Complementarity learning on multimodal data has fundamental challenges of representation learning because the complementarity exists along with multiple modalities and one or multiple items of each modality. Also, an appropriate metric is needed for measuring the complementarity in the representation space. Existing methods that rely on similarity-based metrics cannot adequately capture the complementarity. In this work, we propose a novel deep architecture for systematically learning the complementarity of components from multimodal multi-item data. The proposed model consists of three major modules: 1) unimodal aggregation for extracting the intramodal complementarity; 2) cross-modal fusion for extracting the intermodal complementarity at the modality level; and 3) interactive aggregation for extracting the intermodal complementarity at the item level. To quantify complementarity, we utilize the TUBE distance metric to measure the difference between the composited data object and its label in the representation space. Experiments on three real datasets show that our model outperforms the state-of-the-art by +6.8% of mean reciprocal rank (MRR) on object classification and +3.0% of MRR on hold-out item prediction. Qualitative analyses reveal that complementarity is significantly different from similarity.
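The abstract reports its gains in mean reciprocal rank (MRR). As a minimal sketch of how that metric is usually computed, the function below averages the reciprocal of the rank at which the correct item appears. The function name and the toy rankings are hypothetical and are not taken from the paper.

```python
def mean_reciprocal_rank(ranked_lists, targets):
    """Average of 1/rank of the correct item over all queries.

    ranked_lists[i] is a best-first list of candidate ids for query i;
    targets[i] is the correct id. A missing target contributes 0.
    """
    total = 0.0
    for ranked, target in zip(ranked_lists, targets):
        if target in ranked:
            total += 1.0 / (ranked.index(target) + 1)
    return total / len(targets)


# Hypothetical toy data: the correct item "a" is ranked 1st, 2nd, and 3rd.
rankings = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
truth = ["a", "a", "a"]
mrr = mean_reciprocal_rank(rankings, truth)
print(round(mrr, 4))
```

Because MRR rewards only the position of the first correct item, it suits tasks such as hold-out item prediction where each query has a single right answer.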
|
3
|
Germino J, Szymanski A, Eicher-Miller HA, Metoyer R, Chawla NV. A community focused approach toward making healthy and affordable daily diet recommendations. Front Big Data 2023; 6:1086212. [PMID: 38025946 PMCID: PMC10661405 DOI: 10.3389/fdata.2023.1086212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2022] [Accepted: 07/26/2023] [Indexed: 12/01/2023] Open
Abstract
Introduction: Maintaining an affordable and nutritious diet can be challenging, especially for those living in poverty. To achieve a healthy diet, consumers must make difficult decisions within a complicated food landscape. Decisions must factor in information on health and budget constraints, the food supply and pricing options at local grocery stores, and nutrition and portion guidelines provided by government services. Information to support food choice decisions is often inconsistent and challenging to find, making it difficult for consumers to make informed, optimal decisions. This is especially true for low-income and Supplemental Nutrition Assistance Program (SNAP) households, which have additional time and cost constraints that impact their food purchases and ultimately leave them more susceptible to malnutrition and obesity. The goal of this paper is to demonstrate how the integration of data from local grocery stores and federal government databases can be used to assist specific communities in meeting their unique health and budget challenges. Methods: We discuss many of the challenges of integrating multiple data sources, such as inconsistent data availability and misleading nutrition labels. We conduct a case study using linear programming to identify a healthy meal plan that stays within a limited SNAP budget and also adheres to the Dietary Guidelines for Americans. Finally, we explore the main drivers of cost of local food products with emphasis on the nutrients determined by the USDA as areas of focus: added sugars, saturated fat, and sodium. Results and discussion: Our case study results suggest that such an optimization model can be used to facilitate food purchasing decisions within a given community. By focusing on the community level, our results will inform future work navigating the complex networks of food information to build global recommendation systems.
Affiliation(s)
- Joe Germino
- Department of Computer Science and Engineering, Lucy Family Institute, University of Notre Dame, Notre Dame, IN, United States
| | - Annalisa Szymanski
- Department of Computer Science and Engineering, Lucy Family Institute, University of Notre Dame, Notre Dame, IN, United States
| | | | - Ronald Metoyer
- Department of Computer Science and Engineering, Lucy Family Institute, University of Notre Dame, Notre Dame, IN, United States
| | - Nitesh V. Chawla
- Department of Computer Science and Engineering, Lucy Family Institute, University of Notre Dame, Notre Dame, IN, United States
| |
|
4
|
Dablain D, Krawczyk B, Chawla NV. DeepSMOTE: Fusing Deep Learning and SMOTE for Imbalanced Data. IEEE Trans Neural Netw Learn Syst 2023; 34:6390-6404. [PMID: 35085094 DOI: 10.1109/tnnls.2021.3136503] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Indexed: 06/14/2023]
Abstract
Despite over two decades of progress, imbalanced data is still considered a significant challenge for contemporary machine learning models. Modern advances in deep learning have further magnified the importance of the imbalanced data problem, especially when learning from images. Therefore, there is a need for an oversampling method that is specifically tailored to deep learning models, can work on raw images while preserving their properties, and is capable of generating high-quality, artificial images that can enhance minority classes and balance the training set. We propose Deep synthetic minority oversampling technique (SMOTE), a novel oversampling algorithm for deep learning models that leverages the properties of the successful SMOTE algorithm. It is simple, yet effective in its design. It consists of three major components: 1) an encoder/decoder framework; 2) SMOTE-based oversampling; and 3) a dedicated loss function that is enhanced with a penalty term. An important advantage of DeepSMOTE over generative adversarial network (GAN)-based oversampling is that DeepSMOTE does not require a discriminator, and it generates high-quality artificial images that are both information-rich and suitable for visual inspection. DeepSMOTE code is publicly available at https://github.com/dd1github/DeepSMOTE.
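The SMOTE-based oversampling component can be illustrated with the classic interpolation step: pick a minority sample, pick one of its nearest minority neighbours, and generate a point on the line segment between them. The sketch below is not the DeepSMOTE implementation (see the repository linked above for that). It omits the encoder/decoder and the penalized loss, and treats plain lists of floats as hypothetical stand-ins for encoded images.

```python
import random


def smote_sample(minority, k=2, rng=None):
    """Create one synthetic point by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    base = rng.choice(minority)
    # Sort the other minority points by squared Euclidean distance to base.
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbour = rng.choice(others[:k])
    gap = rng.random()  # interpolation weight in [0, 1)
    return [a + gap * (b - a) for a, b in zip(base, neighbour)]


# Hypothetical latent vectors for a minority class.
minority_latents = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = smote_sample(minority_latents, k=2, rng=random.Random(42))
print(synthetic)
```

In the setting the abstract describes, the interpolation would presumably happen on encoded representations, with the decoder mapping each synthetic vector back to an image.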
|
5
|
Saebi M, Nan B, Herr JE, Wahlers J, Guo Z, Zurański AM, Kogej T, Norrby PO, Doyle AG, Chawla NV, Wiest O. On the use of real-world datasets for reaction yield prediction. Chem Sci 2023; 14:4997-5005. [PMID: 37206399 PMCID: PMC10189898 DOI: 10.1039/d2sc06041h] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 11/01/2022] [Accepted: 03/09/2023] [Indexed: 09/30/2023] Open
Abstract
The lack of publicly available, large, and unbiased datasets is a key bottleneck for the application of machine learning (ML) methods in synthetic chemistry. Data from electronic laboratory notebooks (ELNs) could provide less biased, large datasets, but no such datasets have been made publicly available. The first real-world dataset from the ELNs of a large pharmaceutical company is disclosed and its relationship to high-throughput experimentation (HTE) datasets is described. For chemical yield predictions, a key task in chemical synthesis, an attributed graph neural network (AGNN) performs as well as or better than the best previous models on two HTE datasets for the Suzuki-Miyaura and Buchwald-Hartwig reactions. However, training the AGNN on an ELN dataset does not lead to a predictive model. The implications of using ELN data for training ML-based models are discussed in the context of yield predictions.
Affiliation(s)
- Mandana Saebi
- Department of Computer Science and Engineering and Lucy Family Institute for Data and Society, University of Notre Dame Notre Dame IN 46556 USA
| | - Bozhao Nan
- Department of Chemistry and Biochemistry, University of Notre Dame Notre Dame IN 46556 USA
| | - John E Herr
- Department of Chemistry and Biochemistry, University of Notre Dame Notre Dame IN 46556 USA
| | - Jessica Wahlers
- Department of Chemistry and Biochemistry, University of Notre Dame Notre Dame IN 46556 USA
| | - Zhichun Guo
- Department of Computer Science and Engineering and Lucy Family Institute for Data and Society, University of Notre Dame Notre Dame IN 46556 USA
| | - Andrzej M Zurański
- Department of Chemistry, Princeton University Princeton New Jersey 08544 USA
| | - Thierry Kogej
- Molecular AI, Discovery Sciences, R&D, AstraZeneca Pepparedsleden 1, SE-431 83 Mölndal Gothenburg Sweden
| | - Per-Ola Norrby
- Data Science and Modelling, Pharmaceutical Sciences, R&D, AstraZeneca Pepparedsleden 1, SE-431 83 Mölndal Gothenburg Sweden
| | - Abigail G Doyle
- Department of Chemistry, Princeton University Princeton New Jersey 08544 USA
- Department of Chemistry and Biochemistry, University of California Los Angeles California 90095 USA
| | - Nitesh V Chawla
- Department of Computer Science and Engineering and Lucy Family Institute for Data and Society, University of Notre Dame Notre Dame IN 46556 USA
| | - Olaf Wiest
- Department of Chemistry and Biochemistry, University of Notre Dame Notre Dame IN 46556 USA
| |
|
6
|
Chaudhry BM, Dasgupta D, Chawla NV. Successful Aging for Community-Dwelling Older Adults: An Experimental Study with a Tablet App. Int J Environ Res Public Health 2022; 19:13148. [PMID: 36293730 PMCID: PMC9603432 DOI: 10.3390/ijerph192013148] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/09/2022] [Revised: 09/17/2022] [Accepted: 09/24/2022] [Indexed: 05/08/2023]
Abstract
Mobile health (mHealth) technologies offer an opportunity to enable the care and support of community-dwelling older adults; however, research examining the use of mHealth in delivering quality of life (QoL) improvements in the older population is limited. We developed a tablet application (eSeniorCare) based on the Successful Aging framework and investigated its feasibility among older adults with low socioeconomic status. Twenty-five participants (females = 14, mean age = 65 years) used the app to set and track medication intake reminders and health goals, and to play selected casual mobile games for 24 weeks. The Older person QoL and Short Health (SF12v2) surveys were administered before and after the study. Wilcoxon rank tests were used to determine differences from baseline, and thematic analysis was used to analyze post-study interview data. The improvements in health-related QoL (HRQoL) scores were statistically significant (V = 41.5, p = 0.005856) across all participants. Frequent eSeniorCare users experienced statistically significant improvements in their physical health (V = 13, p = 0.04546) and HRQoL (V = 7.5, p = 0.0050307) scores. Participants reported that the eSeniorCare app motivated timely medication intake and achievement of health goals, whereas tablet games promoted mental stimulation. Participants were willing to use mobile apps to self-manage their medications (70%) and adopt healthy activities (72%), while 92% wanted to recommend eSeniorCare to a friend. This study shows the feasibility and possible impact of an mHealth tool on the health-related QoL in older adults with a low socioeconomic status. mHealth support tools and future research to determine their effects are warranted for this population.
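The V values quoted in this abstract are Wilcoxon signed-rank statistics. As a rough sketch of what such a statistic measures, the function below drops zero differences, ranks the absolute paired differences (averaging tied ranks), and sums the ranks of the positive differences. It is assumed here that V follows the common convention of summing ranks for positive after-minus-before differences; the scores are hypothetical, not study data, and no p-value is computed.

```python
def wilcoxon_v(before, after):
    """Sum of ranks of positive paired differences (after - before)."""
    diffs = [a - b for b, a in zip(before, after) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Extend j over the run of tied absolute differences.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return sum(r for r, d in zip(ranks, diffs) if d > 0)


# Hypothetical pre/post survey scores for five participants.
pre_scores = [10, 12, 9, 14, 7]
post_scores = [12, 11, 13, 18, 7]
v_stat = wilcoxon_v(pre_scores, post_scores)
print(v_stat)  # 9.0
```

A large V relative to the total rank sum indicates that most of the larger changes were improvements.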
Affiliation(s)
- Beenish Moalla Chaudhry
- School of Computing and Informatics, University of Louisiana at Lafayette, 104 E. University Circle, Lafayette, LA 70501, USA
| | - Dipanwita Dasgupta
- Department of Computer Science and Engineering, University of Notre Dame, Indiana, IN 46656, USA
| | - Nitesh V. Chawla
- Department of Computer Science and Engineering, University of Notre Dame, Indiana, IN 46656, USA
| |
|
7
|
Krieg SJ, Schnur JJ, Miranda ML, Pfrender ME, Chawla NV. Symptomatic, Presymptomatic, and Asymptomatic Transmission of SARS-CoV-2 in a University Student Population, August-November 2020. Public Health Rep 2022; 137:1023-1030. [PMID: 35848117 PMCID: PMC9358125 DOI: 10.1177/00333549221110300] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/12/2023] Open
Abstract
OBJECTIVES The impact and risk of SARS-CoV-2 transmission from asymptomatic and presymptomatic hosts remains an open question. This study measured the secondary attack rates (SARs) and relative risk (RR) of SARS-CoV-2 transmission from asymptomatic and presymptomatic index cases as compared with symptomatic index cases. METHODS We used COVID-19 test results, daily health check reports, and contact tracing data to measure SARs and corresponding RRs among close contacts of index cases in a cohort of 12 960 young adults at the University of Notre Dame in Indiana for 103 days, from August 10 to November 20, 2020. Further analysis included Fisher exact tests to determine the association between symptoms and COVID-19 infection and z tests to determine statistical differences between SARs. RESULTS Asymptomatic rates of transmission of SARS-CoV-2 were higher (SAR = 0.19; 95% CI, 0.14-0.24) than was estimated in prior studies, producing an RR of 0.75 (95% CI, 0.54-1.07) when compared with symptomatic transmission. In addition, the transmission rate associated with presymptomatic cases (SAR = 0.25; 95% CI, 0.21-0.30) was approximately the same as that for symptomatic cases (SAR = 0.25; 95% CI, 0.19-0.31). Furthermore, different symptoms were associated with different transmission rates. CONCLUSIONS Asymptomatic and presymptomatic hosts of SARS-CoV-2 are a risk for community spread of COVID-19, especially with new variants emerging. Moreover, typical symptom checks may easily miss people who are asymptomatic or presymptomatic but still infectious. Our study results may be used as a guide to analyze the spread of SARS-CoV-2 variants and help inform appropriate public health measures as they relate to asymptomatic and presymptomatic cases.
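To make the two headline quantities concrete, the sketch below computes a secondary attack rate (infected close contacts divided by all close contacts) and a relative risk with a log-scale confidence interval. The counts are hypothetical and are not the study's data. The Katz log method for the interval is an assumption, since the abstract does not say how its intervals were derived.

```python
import math


def secondary_attack_rate(infected_contacts, total_contacts):
    """Fraction of close contacts who became infected."""
    return infected_contacts / total_contacts


def relative_risk(a, n1, c, n2, z=1.96):
    """Relative risk of group 1 vs group 2 with a Katz log-method CI.

    a / n1: infected / total contacts of group-1 index cases
    c / n2: infected / total contacts of group-2 index cases
    """
    rr = (a / n1) / (c / n2)
    se = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n2)
    low = math.exp(math.log(rr) - z * se)
    high = math.exp(math.log(rr) + z * se)
    return rr, low, high


# Hypothetical counts: contacts of asymptomatic vs symptomatic index cases.
sar_asymptomatic = secondary_attack_rate(19, 100)
sar_symptomatic = secondary_attack_rate(25, 100)
rr, rr_low, rr_high = relative_risk(19, 100, 25, 100)
print(sar_asymptomatic, sar_symptomatic, round(rr, 2))
```

An interval that spans 1, as the one reported in the abstract does, means the data do not rule out equal transmission risk in the two groups.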
Affiliation(s)
- Steven J. Krieg
- Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN, USA
| | - Jennifer J. Schnur
- Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN, USA
| | - Marie Lynn Miranda
- Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN, USA
| | - Michael E. Pfrender
- Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN, USA
| | - Nitesh V. Chawla
- Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN, USA
| |
|
8
|
Avilés-Robles M, Schnur JJ, Dorantes-Acosta E, Márquez-González H, Ocampo-Ramírez LA, Chawla NV. Predictors of Septic Shock or Bacteremia in Children Experiencing Febrile Neutropenia Post-Chemotherapy. J Pediatric Infect Dis Soc 2022; 11:498-503. [PMID: 35924573 PMCID: PMC9720364 DOI: 10.1093/jpids/piac080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2022] [Accepted: 07/20/2022] [Indexed: 11/15/2022]
Abstract
BACKGROUND Febrile neutropenia (FN) is an early indicator of infection in oncology patients post-chemotherapy. We aimed to determine clinical predictors of septic shock and/or bacteremia in pediatric cancer patients experiencing FN and to create a model that classifies patients as low-risk for these outcomes. METHODS This is a retrospective analysis of clinical data from a cohort of pediatric oncology patients admitted with FN from July 2015 to September 2017. One FN episode per patient was randomly selected. Statistical analyses include distribution analysis, hypothesis testing, and multivariate logistic regression to determine clinical feature association with outcomes. RESULTS A total of 865 episodes of FN occurred in 429 subjects. In the 404 sampled episodes that were analyzed, 20.8% experienced outcomes of septic shock and/or bacteremia. Gram-negative bacteria account for 70% of bacteremias. Features with statistically significant influence in predicting these outcomes were hematological malignancy (P < .001), cancer relapse (P = .011), platelet count (P = .004), and age (P = .023). The multivariate logistic regression model achieves AUROC = 0.66 (95% CI 0.56-0.76). The optimal classification threshold achieves sensitivity = 0.96, specificity = 0.33, PPV = 0.40, and NPV = 0.95. CONCLUSIONS This model, based on simple clinical variables, can be used to identify patients at low risk of septic shock and/or bacteremia. The model's NPV of 95% satisfies the priority to avoid discharging patients at high risk for adverse infection outcomes. The model will require further validation on a prospective population.
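The four threshold metrics quoted above all derive from a single confusion matrix. The sketch below shows the standard definitions. The counts are hypothetical, chosen only to produce values of a similar magnitude, and are not the study's confusion matrix.

```python
def screening_metrics(tp, fp, tn, fn):
    """Standard screening metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among all with the outcome
        "specificity": tn / (tn + fp),  # true negatives among all without it
        "ppv": tp / (tp + fp),          # outcome present among flagged high-risk
        "npv": tn / (tn + fn),          # outcome absent among flagged low-risk
    }


# Hypothetical counts for illustration only.
metrics = screening_metrics(tp=48, fp=72, tn=38, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

A threshold tuned this way trades specificity for sensitivity: few patients with the outcome are labelled low-risk, so the NPV stays high, at the cost of many false alarms.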
Affiliation(s)
| | | | - Elisa Dorantes-Acosta
- Department of Oncology and Leukemia Cell Research Biobank, Hospital Infantil de México Federico Gómez, Mexico City, Mexico
| | - Horacio Márquez-González
- Department of Clinical Research, Hospital Infantil de México Federico Gómez, Mexico City, Mexico
| | - Luis A Ocampo-Ramírez
- Department of Infectious Diseases, Hospital Infantil de México Federico Gómez, Mexico City, Mexico
| | - Nitesh V Chawla
- Corresponding Author: Nitesh V. Chawla, Ph.D., Lucy Family Institute for Data and Society, 384E Nieuwland Science Hall, Notre Dame, IN 46556 USA. E-mail:
| |
|
9
|
Wu X, Huang C, Granda PR, Chawla NV. Representation Learning on Variable Length and Incomplete Wearable-Sensory Time Series. ACM T INTEL SYST TEC 2022. [DOI: 10.1145/3531228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
Abstract
The prevalence of wearable sensors (e.g., smart wristband) is creating unprecedented opportunities to not only inform health and wellness states of individuals, but also assess and infer personal attributes, including demographic and personality attributes. However, the data captured from wearables, such as heart rate or number of steps, present two key challenges: 1) the time series is often of variable-length and incomplete due to different data collection periods (e.g., wearing behavior varies by person); and 2) there is inter-individual variability to external factors like stress and environment. This paper addresses these challenges and brings us closer to the potential of personalized insights about an individual, taking the leap from quantified self to qualified self. Specifically, HeartSpace proposed in this paper learns embedding of the time series data with variable-length and missing values via the integration of a time series encoding module and a pattern aggregation network. Additionally, HeartSpace implements a Siamese-triplet network to optimize representations by jointly capturing intra- and inter-series correlations during the embedding learning process. The empirical evaluation over two different real-world data presents significant performance gains over state-of-the-art baselines in a variety of applications, including user identification, personality prediction, demographics inference, job performance prediction and sleep duration estimation.
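A Siamese-triplet network is typically trained with a triplet margin loss, which pulls an anchor embedding toward a positive example and pushes it away from a negative one. The sketch below shows that generic loss on plain Python lists. The abstract does not give HeartSpace's exact objective, so the squared-distance form, the margin value, and the toy embeddings are all assumptions.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero once the negative is at least `margin` farther (in squared
    distance) from the anchor than the positive is."""
    return max(
        0.0,
        squared_distance(anchor, positive)
        - squared_distance(anchor, negative)
        + margin,
    )


# Hypothetical embeddings: anchor and positive could be two segments from the
# same person's series, the negative a segment from someone else.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]
easy_negative = [3.0, 0.0]
hard_negative = [1.0, 0.0]
print(triplet_loss(anchor, positive, easy_negative))  # 0.0
print(triplet_loss(anchor, positive, hard_negative))  # 1.0
```

Only triplets that violate the margin contribute a gradient, which is why such training schemes usually need a strategy for choosing informative negatives.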
|
10
|
|
11
|
Krieg SJ, Avendano C, Grantham-Brown E, Lilienfeld Asbun A, Schnur JJ, Miranda ML, Chawla NV. Data-driven testing program improves detection of COVID-19 cases and reduces community transmission. NPJ Digit Med 2022; 5:17. [PMID: 35149754 PMCID: PMC8837751 DOI: 10.1038/s41746-022-00562-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/23/2021] [Accepted: 01/07/2022] [Indexed: 11/16/2022] Open
Abstract
COVID-19 remains a global threat in the face of emerging SARS-CoV-2 variants and gaps in vaccine administration and availability. In this study, we analyze a data-driven COVID-19 testing program implemented at a mid-sized university, which utilized two simple, diverse, and easily interpretable machine learning models to predict which students were at elevated risk and should be tested. The program produced a positivity rate of 0.53% (95% CI 0.34–0.77%) from 20,862 tests, with 1.49% (95% CI 1.15–1.89%) of students testing positive within five days of the initial test—a significant increase from the general surveillance baseline, which produced a positivity rate of 0.37% (95% CI 0.28–0.47%) with 0.67% (95% CI 0.55–0.81%) testing positive within five days. Close contacts who were predicted by the data-driven models were tested much more quickly on average (0.94 days from reported exposure; 95% CI 0.78–1.11) than those who were manually contact traced (1.92 days; 95% CI 1.81–2.02). We further discuss how other universities, business, and organizations could adopt similar strategies to help quickly identify positive cases and reduce community transmission.
Affiliation(s)
- Steven J Krieg
- Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN, 46556, USA
| | - Carolina Avendano
- Children's Environmental Health Initiative, University of Notre Dame, Notre Dame, IN, 46556, USA
| | - Evan Grantham-Brown
- Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN, 46556, USA
| | - Aaron Lilienfeld Asbun
- Children's Environmental Health Initiative, University of Notre Dame, Notre Dame, IN, 46556, USA
| | - Jennifer J Schnur
- Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN, 46556, USA
| | - Marie Lynn Miranda
- Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN, 46556, USA; Children's Environmental Health Initiative, University of Notre Dame, Notre Dame, IN, 46556, USA
| | - Nitesh V Chawla
- Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN, 46556, USA.
| |
|
12
|
Abstract
Recipe recommendation systems play an important role in helping people find recipes that interest them and fit their eating habits. Whereas content-based and collaborative filtering approaches to recipe recommendation are well developed, the relational information among users, recipes, and food items is less explored. In this paper, we incorporate this relational information into recipe recommendation and propose a graph learning approach to exploit it. In particular, we propose HGAT, a novel hierarchical graph attention network for recipe recommendation. The proposed model can capture user history behavior, recipe content, and relational information through several neural network modules, including type-specific transformation, node-level attention, and relation-level attention. We further introduce a ranking-based objective function to optimize the model. Thorough experiments demonstrate that HGAT outperforms numerous baseline methods.
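The abstract does not spell out its ranking-based objective. One widely used form, shown here purely as an assumed illustration, is a pairwise (BPR-style) loss that penalizes the model whenever a recipe the user did not interact with scores higher than one the user did. The function name and scores below are hypothetical.

```python
import math


def pairwise_ranking_loss(pos_score, neg_score):
    """Negative log-sigmoid of the score gap between an interacted
    (positive) recipe and a sampled non-interacted (negative) recipe."""
    gap = pos_score - neg_score
    return -math.log(1.0 / (1.0 + math.exp(-gap)))


# Hypothetical model scores for one user.
well_ranked = pairwise_ranking_loss(pos_score=2.0, neg_score=0.0)
mis_ranked = pairwise_ranking_loss(pos_score=0.0, neg_score=2.0)
tied = pairwise_ranking_loss(pos_score=1.0, neg_score=1.0)
print(well_ranked < tied < mis_ranked)  # True
```

Minimizing this loss over many (user, positive, negative) triples optimizes relative ordering rather than absolute rating values, which matches how top-k recommendation is evaluated.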
Affiliation(s)
- Yijun Tian
- Department of Computer Science and Engineering and Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN, United States
| | - Chuxu Zhang
- Department of Computer Science, Brandeis University, Waltham, MA, United States
| | - Ronald Metoyer
- Department of Computer Science and Engineering and Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN, United States
| | - Nitesh V. Chawla
- Department of Computer Science and Engineering and Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN, United States
- *Correspondence: Nitesh V. Chawla
| |
|
13
|
Bielak P, Tagowski K, Falkiewicz M, Kajdanowicz T, Chawla NV. FILDNE: A Framework for Incremental Learning of Dynamic Networks Embeddings. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2021.107453] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/17/2022]
|
14
|
Syed M, Wang D, Jiang M, Conway O, Juneja V, Subramanian S, Chawla NV. Unified Representation of Twitter and Online News Using Graph and Entities. Front Big Data 2021; 4:699070. [PMID: 34514380 PMCID: PMC8432963 DOI: 10.3389/fdata.2021.699070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2021] [Accepted: 07/27/2021] [Indexed: 11/13/2022] Open
Abstract
To improve consumer engagement and satisfaction, online news services employ strategies for personalizing and recommending articles to their users based on their interests. In addition to their own digital platforms, news agencies also leverage social media to reach a broad user base. These engagement efforts are often disconnected from each other, but they present a compelling opportunity to use engagement data from social media to inform the digital news platform and vice versa, leading to a more personalized experience for users. While this idea seems intuitive, there are several challenges due to the disparate nature of the two sources. In this paper, we propose a model to build a generalized graph of news articles and tweets that can be used for different downstream tasks such as identifying sentiment, trending topics, and misinformation, as well as sharing relevant articles on social media in a timely fashion. We evaluate our framework on a downstream task of identifying related pairs of news articles and tweets with promising results. The content unification problem addressed by our model is not unique to the domain of news, and thus the model may be applicable to other problems linking different content platforms.
Affiliation(s)
- Munira Syed
- University of Notre Dame, Notre Dame, IN, United States
| | - Daheng Wang
- University of Notre Dame, Notre Dame, IN, United States
| | - Meng Jiang
- University of Notre Dame, Notre Dame, IN, United States
| | | | | | | | | |
|
15
|
Zhang C, Yao H, Yu L, Huang C, Song D, Chen H, Jiang M, Chawla NV. Inductive Contextual Relation Learning for Personalization. ACM T INFORM SYST 2021. [DOI: 10.1145/3450353] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Abstract
Web personalization, e.g., recommendation or relevance search, which tailors a service or product to specific online users, is becoming increasingly important. Inductive personalization aims to infer the relations between existing entities and unseen new ones, e.g., searching relevant authors for new papers or recommending new items to users. This problem, however, is challenging since most recent studies focus on the transductive problem for existing entities. In addition, although some inductive learning approaches have been introduced recently, their performance is sub-optimal due to relatively simple and inflexible architectures for aggregating an entity’s content. To this end, we propose the inductive contextual personalization (ICP) framework through contextual relation learning. Specifically, we first formulate the pairwise relations between entities with a ranking optimization scheme that employs a neural aggregator to fuse an entity’s heterogeneous contents. Next, we introduce a node embedding term to capture an entity’s contextual relations, as a smoothness constraint over the prior ranking objective. Finally, a gradient descent procedure with adaptive negative sampling is employed to learn the model parameters. The learned model is capable of inferring the relations between existing entities and inductive ones. Thorough experiments demonstrate that ICP outperforms numerous baseline methods for two different applications, i.e., relevant author search and new item recommendation.
Affiliation(s)
| | | | - Lu Yu
- Ant Financial Services Group, Hangzhou, China
| | | | | | - Haifeng Chen
- NEC Laboratories America Inc, Princeton, NJ, USA
| | - Meng Jiang
- University of Notre Dame, Notre Dame, IN, USA
| | | |
|
16
|
Wang D, Zeng Q, Chawla NV, Jiang M. Modeling Complementarity in Behavior Data with Multi-Type Itemset Embedding. ACM T INTEL SYST TEC 2021. [DOI: 10.1145/3458724] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Abstract
People are looking for complementary contexts, such as team members of complementary skills for project team building and/or reading materials of complementary knowledge for effective student learning, to make their behaviors more likely to be successful. Complementarity has been revealed by behavioral sciences as one of the most important factors in decision making. Existing computational models that learn low-dimensional context representations from behavior data have poor scalability and recent network embedding methods only focus on preserving the similarity between the contexts. In this work, we formulate a behavior entry as a set of context items and propose a novel representation learning method, Multi-type Itemset Embedding, to learn the context representations preserving the itemset structures. We propose a measurement of complementarity between context items in the embedding space. Experiments demonstrate both effectiveness and efficiency of the proposed method over the state-of-the-art methods on behavior prediction and context recommendation. We discover that the complementary contexts and similar contexts are significantly different in human behaviors.
Affiliation(s)
- Daheng Wang
- University of Notre Dame, Notre Dame, IN 46556, USA
- Qingkai Zeng
- University of Notre Dame, Notre Dame, IN 46556, USA
- Nitesh V. Chawla
- University of Notre Dame, Notre Dame, IN 46556, USA and Department of Computational Intelligence, Wrocław University of Science and Technology, Wrocław, Poland
- Meng Jiang
- University of Notre Dame, Notre Dame, IN 46556, USA
|
17
|
Faust L, Feldman K, Lin S, Mattingly S, D'Mello S, Chawla NV. Examining Response to Negative Life Events Through Fitness Tracker Data. Front Digit Health 2021; 3:659088. [PMID: 34713131 PMCID: PMC8521839 DOI: 10.3389/fdgth.2021.659088] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/27/2021] [Accepted: 03/24/2021] [Indexed: 11/29/2022] Open
Abstract
Negative life events, such as the death of a loved one, are an unavoidable part of life. These events can be overwhelmingly stressful and may lead to the development of mental health disorders. To mitigate these adverse developments, prior literature has utilized measures of psychological responses to negative life events to better understand their effects on mental health. However, psychological changes represent only one aspect of an individual's potential response. We posit measuring additional dimensions of health, such as physical health, may also be beneficial, as physical health itself may be affected by negative life events and measuring its response could provide context to changes in mental health. Therefore, the primary aim of this work was to quantify how an individual's physical health changes in response to negative life events by testing for deviations in their physiological and behavioral state (PB-state). After capturing post-event PB-state responses, our second aim sought to contextualize changes within known factors of psychological response to negative life events, namely coping strategies. To do so, we utilized a cohort of professionals across the United States who were monitored for 1 year and experienced a negative life event while under observation. Garmin Vivosmart-3 devices provided a multidimensional representation of one's PB-state by collecting measures of resting heart rate, physical activity, and sleep. To test for deviations in PB-state following negative life events, One-Class Support Vector Machines were trained on a window of time prior to the event, which established a PB-state baseline. The model then evaluated each participant's PB-state on the day of the life event and each day that followed, assigning each day a level of deviance relative to the participant's baseline. Resulting response curves were then examined in association with the use of various coping strategies using Bayesian gamma-hurdle regression models.
The results from our objectives suggest that physical determinants of health also deviate in response to negative life events and that these deviations can be mitigated through different coping strategies. Taken together, these observations stress the need to examine physical determinants of health alongside psychological determinants when investigating the effects of negative life events.
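The idea of scoring each post-event day against a pre-event baseline can be sketched in a few lines. The sketch below is not the authors' method: it swaps the One-Class Support Vector Machine for a simple root-mean-squared z-score, and the function name `deviation_scores`, the three-feature layout, and all data values are hypothetical.

```python
from statistics import mean, stdev

def deviation_scores(baseline, followup):
    """Score each follow-up day by its distance from the baseline window.

    baseline, followup: lists of (resting_hr, active_minutes, sleep_hours).
    Returns one non-negative score per follow-up day (root mean squared
    z-score). This is a simple stand-in, NOT a One-Class SVM.
    """
    n_features = len(baseline[0])
    # Per-feature mean and standard deviation over the pre-event window.
    mus = [mean(day[j] for day in baseline) for j in range(n_features)]
    sds = [stdev(day[j] for day in baseline) for j in range(n_features)]
    scores = []
    for day in followup:
        z2 = [((day[j] - mus[j]) / sds[j]) ** 2 for j in range(n_features)]
        scores.append((sum(z2) / n_features) ** 0.5)
    return scores

# Hypothetical pre-event window (5 days) and 3 days starting at the event.
baseline = [(60, 30, 7.5), (62, 35, 7.0), (61, 25, 8.0), (59, 30, 7.5), (63, 40, 7.0)]
followup = [(70, 5, 4.5), (66, 15, 6.0), (61, 30, 7.5)]
scores = deviation_scores(baseline, followup)
```

With these made-up values the score shrinks from day to day as the follow-up measurements return toward the baseline, which is the shape of a "response curve" in the sense used above.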
Affiliation(s)
- Louis Faust
- Department of Computer Science & Engineering, University of Notre Dame, Notre Dame, IN, United States
- Keith Feldman
- Children's Mercy Kansas City, Kansas City, MO, United States
- Department of Pediatrics, University of Missouri-Kansas City School of Medicine, Kansas City, MO, United States
- Suwen Lin
- Department of Computer Science & Engineering, University of Notre Dame, Notre Dame, IN, United States
- Stephen Mattingly
- Department of Computer Science & Engineering, University of Notre Dame, Notre Dame, IN, United States
- Sidney D'Mello
- Institute of Cognitive Science, University of Colorado, Boulder, CO, United States
- Nitesh V. Chawla
- Department of Computer Science & Engineering, University of Notre Dame, Notre Dame, IN, United States
|
18
|
Feldman K, Rohan AJ, Chawla NV. Discrete Heart Rate Values or Continuous Streams? Representation, Variability, and Meaningful Use of Vital Sign Data. Comput Inform Nurs 2021; 39:793-803. [PMID: 34747895 DOI: 10.1097/cin.0000000000000728] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/13/2023]
Abstract
Documentation and review of patient heart rate is a fundamental process across a myriad of clinical settings. While historically recorded manually, bedside monitors now provide for the automated collection of such data. Despite the availability of continuous streaming data, patients' charts continue to reflect only a subset of this information as snapshots recorded throughout a hospitalization. Over the past decade, prominent works have explored the implications of such practices and established fundamental differences in the alignment of discrete charted vitals and streaming data captured by monitoring systems. Limited work has examined the temporal properties of these differences, how they manifest, and their relation to clinical applications. The work presented in this article addresses this disparity, providing evidence that differences between charting techniques extend to measures of variability. Our results demonstrate how variability manifests with respect to temporal elements of charting timing and how it can facilitate personalized care by contextualizing deviations in magnitude. This work also highlights the utility of variability metrics with relation to clinical measures including associations to severity scores and a case study utilizing complex variability metrics derived from the complete set of monitor data.
Affiliation(s)
- Keith Feldman
- Author Affiliations: Department of Computer Science and Engineering and iCeNSA, University of Notre Dame, IN (Drs Feldman and Chawla); SUNY Downstate Health Sciences University, College of Nursing, Brooklyn, NY (Dr Rohan)
|
19
|
Robles-Granda P, Lin S, Wu X, Martinez GJ, Mattingly SM, Moskal E, Striegel A, Chawla NV, D'Mello S, Gregg J, Nies K, Mark G, Grover T, Campbell AT, Mirjafari S, Saha K, De Choudhury M, Dey AK. Jointly Predicting Job Performance, Personality, Cognitive Ability, Affect, and Well-Being. IEEE COMPUT INTELL M 2021. [DOI: 10.1109/mci.2021.3061877] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/06/2022]
|
20
|
Jiang T, Zeng Q, Zhao T, Qin B, Liu T, Chawla NV, Jiang M. Biomedical Knowledge Graphs Construction From Conditional Statements. IEEE/ACM Trans Comput Biol Bioinform 2021; 18:823-835. [PMID: 32167907 DOI: 10.1109/tcbb.2020.2979959] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Conditions play an essential role in biomedical statements. However, existing biomedical knowledge graphs (BioKGs) only focus on factual knowledge, organized as a flat relational network of biomedical concepts. These BioKGs ignore the conditions under which the facts are valid, which loses essential context for knowledge exploration and inference. We consider both facts and their conditions in biomedical statements and propose a three-layered information-lossless representation of BioKG. The first layer has biomedical concept nodes and attribute nodes. The second layer represents both biomedical fact and condition tuples by nodes of the relation phrases, connecting to the subject and object in the first layer. The third layer has nodes of statements connecting to a set of fact tuples and/or condition tuples in the second layer. We transform the BioKG construction problem into a sequence labeling problem based on a newly designed tag schema. We design a Multi-Input Multi-Output sequence labeling model (MIMO) that learns from multiple input signals and generates the proper number of output sequences for tuple extraction. Experiments on a newly constructed dataset show that MIMO outperforms the existing methods. A further case study demonstrates that the BioKGs constructed provide a good understanding of the biomedical statements.
|
21
|
Saebi M, Xu J, Curasi SR, Grey EK, Chawla NV, Lodge DM. Network analysis of ballast-mediated species transfer reveals important introduction and dispersal patterns in the Arctic. Sci Rep 2020; 10:19558. [PMID: 33177658 PMCID: PMC7658980 DOI: 10.1038/s41598-020-76602-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 04/11/2020] [Accepted: 10/22/2020] [Indexed: 11/09/2022] Open
Abstract
Rapid climate change has wide-ranging implications for the Arctic region, including sea ice loss, increased geopolitical attention, and expanding economic activity resulting in a dramatic increase in shipping activity. As a result, the risk of harmful non-native marine species being introduced into this critical region will increase unless policy and management steps are implemented in response. Using data about shipping, ecoregions, and environmental conditions, we leverage network analysis and data mining techniques to assess, visualize, and project ballast water-mediated species introductions into the Arctic and dispersal of non-native species within the Arctic. We first identify high-risk connections between the Arctic and non-Arctic ports that could be sources of non-native species over 15 years (1997-2012) and observe the emergence of shipping hubs in the Arctic where the cumulative risk of non-native species introduction is increasing. We then consider how environmental conditions can constrain this Arctic introduction network for species with different physiological limits, thus providing a tool that will allow decision-makers to evaluate the relative risk of different shipping routes. Next, we focus on within-Arctic ballast-mediated species dispersal where we use higher-order network analysis to identify critical shipping routes that may facilitate species dispersal within the Arctic. The risk assessment and projection framework we propose could inform risk-based assessment and management of ship-borne invasive species in the Arctic.
Affiliation(s)
- Mandana Saebi
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556, USA
- Center for Network and Data Science (CNDS), Notre Dame, IN, 46556, USA
- Jian Xu
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556, USA
- Citadel LLC, Chicago, IL, 60603, USA
- Salvatore R Curasi
- Department of Biological Sciences, University of Notre Dame, Notre Dame, IN, 46556, USA
- Erin K Grey
- Division of Science, Mathematics and Technology, Governors State University, University Park, IL, 60484, USA
- Nitesh V Chawla
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556, USA
- Center for Network and Data Science (CNDS), Notre Dame, IN, 46556, USA
- David M Lodge
- Cornell Atkinson Center for Sustainability, and Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY, 14850, USA
|
22
|
Abstract
Representation learning on networks offers a powerful alternative to the oft painstaking process of manual feature engineering, and, as a result, has enjoyed considerable success in recent years. However, all the existing representation learning methods are based on the first-order network, that is, the network that only captures the pairwise interactions between the nodes. As a result, these methods may fail to incorporate non-Markovian higher order dependencies in the network. Thus, the embeddings that are generated may not accurately represent the underlying phenomena in a network, resulting in inferior performance in different inductive or transductive learning tasks. To address this challenge, this study presents higher order network embedding (HONEM), a higher order network (HON) embedding method that captures the non-Markovian higher order dependencies in a network. HONEM is specifically designed for the HON structure and outperforms other state-of-the-art methods in node classification, network reconstruction, link prediction, and visualization for networks that contain non-Markovian higher order dependencies.
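To make "non-Markovian higher order dependencies" concrete, the following sketch compares first-order and second-order transition probabilities estimated from observed paths. It is only an illustration of the dependency that a first-order network misses, not an implementation of HONEM; the function name `transition_probs` and the toy paths are hypothetical.

```python
from collections import Counter, defaultdict

def transition_probs(paths, order):
    """Estimate P(next node | previous `order` nodes) from observed paths."""
    counts = defaultdict(Counter)
    for path in paths:
        for i in range(order, len(path)):
            context = tuple(path[i - order:i])  # the last `order` nodes visited
            counts[context][path[i]] += 1
    return {
        ctx: {nxt: c / sum(ctr.values()) for nxt, c in ctr.items()}
        for ctx, ctr in counts.items()
    }

# Hypothetical paths: traffic arriving at C from A continues to D,
# while traffic arriving at C from B continues to E.
paths = [("A", "C", "D")] * 3 + [("B", "C", "E")] * 3

first_order = transition_probs(paths, order=1)
second_order = transition_probs(paths, order=2)
```

In this toy data the first-order view sees an even split out of C, whereas conditioning on the previous node makes the next step fully determined. That gap is the kind of information a first-order embedding cannot represent.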
Affiliation(s)
- Mandana Saebi
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, Indiana, USA
- Giovanni Luca Ciampaglia
- Department of Computer Science and Engineering, University of South Florida, Tampa, Florida, USA
- Nitesh V Chawla
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, Indiana, USA
|
23
|
Abstract
Knowledge graphs (KGs) serve as useful resources for various natural language processing applications. Previous KG completion approaches require a large number of training instances (i.e., head-tail entity pairs) for every relation. The real case is that for most of the relations, very few entity pairs are available. Existing work of one-shot learning limits method generalizability for few-shot scenarios and does not fully use the supervisory information; however, few-shot KG completion has not been well studied yet. In this work, we propose a novel few-shot relation learning model (FSRL) that aims at discovering facts of new relations with few-shot references. FSRL can effectively capture knowledge from heterogeneous graph structure, aggregate representations of few-shot references, and match similar entity pairs of reference set for every relation. Extensive experiments on two public datasets demonstrate that FSRL outperforms the state-of-the-art.
|
24
|
Faust L, Feldman K, Mattingly SM, Hachen D, Chawla NV. Deviations from normal bedtimes are associated with short-term increases in resting heart rate. NPJ Digit Med 2020; 3:39. [PMID: 32219180 PMCID: PMC7090013 DOI: 10.1038/s41746-020-0250-6] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 12/10/2019] [Accepted: 02/28/2020] [Indexed: 12/22/2022] Open
Abstract
Despite proper sleep hygiene being critical to our health, guidelines for improving sleep habits often focus on only a single component, namely, sleep duration. Recent works, however, have brought to light the importance of another aspect of sleep: bedtime regularity, given its ties to cognitive and metabolic health outcomes. To further our understanding of this often-neglected component of sleep, the objective of this work was to investigate the association between bedtime regularity and resting heart rate (RHR): an important biomarker for cardiovascular health. Utilizing Fitbit Charge HRs to measure bedtimes, sleep and RHR, 255,736 nights of data were collected from a cohort of 557 college students. We observed that going to bed even 30 minutes later than one's normal bedtime was associated with a significantly higher RHR throughout sleep (Coeff +0.18; 95% CI: +0.11, +0.26 bpm), persisting into the following day and converging with one's normal RHR in the early evening. Bedtimes of at least 1 hour earlier were also associated with significantly higher RHRs throughout sleep; however, they converged with one's normal rate by the end of the sleep session, not extending into the following day. These observations stress the importance of maintaining proper sleep habits, beyond sleep duration, as high variability in bedtimes may be detrimental to one's cardiovascular health.
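A deviation from one's normal bedtime has to be computed on a clock that wraps at midnight. The sketch below shows one way to do that; it assumes the "normal" bedtime is the person's median bedtime, which the abstract does not specify, and the function names and sample nights are hypothetical.

```python
from statistics import median

def minutes_after_noon(hhmm):
    """Map 'HH:MM' to minutes elapsed since 12:00, so 23:30 sorts before 00:30."""
    h, m = map(int, hhmm.split(":"))
    return ((h - 12) % 24) * 60 + m

def bedtime_deviations(bedtimes):
    """Deviation of each bedtime, in minutes, from the median bedtime.

    Positive values mean a later-than-normal bedtime. Using the median as
    the 'normal' bedtime is an assumption made for this sketch.
    """
    mins = [minutes_after_noon(b) for b in bedtimes]
    normal = median(mins)
    return [x - normal for x in mins]

# Hypothetical week of bedtimes for one person.
nights = ["23:00", "23:15", "22:45", "23:00", "00:30"]
devs = bedtime_deviations(nights)
late_nights = [n for n, d in zip(nights, devs) if d >= 30]
```

Anchoring the clock at noon keeps bedtimes on either side of midnight in the right order; a person who routinely goes to bed near noon would need a different anchor.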
Affiliation(s)
- Louis Faust
- Department of Computer Science & Engineering, University of Notre Dame, Notre Dame, IN USA
- Center for Network and Data Science (CNDS), University of Notre Dame, Notre Dame, IN USA
- Keith Feldman
- Department of Computer Science & Engineering, University of Notre Dame, Notre Dame, IN USA
- Center for Network and Data Science (CNDS), University of Notre Dame, Notre Dame, IN USA
- Stephen M. Mattingly
- Department of Computer Science & Engineering, University of Notre Dame, Notre Dame, IN USA
- Center for Network and Data Science (CNDS), University of Notre Dame, Notre Dame, IN USA
- David Hachen
- Center for Network and Data Science (CNDS), University of Notre Dame, Notre Dame, IN USA
- Department of Sociology, University of Notre Dame, Notre Dame, IN USA
- Nitesh V. Chawla
- Department of Computer Science & Engineering, University of Notre Dame, Notre Dame, IN USA
- Center for Network and Data Science (CNDS), University of Notre Dame, Notre Dame, IN USA
|
25
|
Feldman K, Solymos GMB, de Albuquerque MP, Chawla NV. Unraveling Complexity about Childhood Obesity and Nutritional Interventions: Modeling Interactions Among Psychological Factors. Sci Rep 2019; 9:18807. [PMID: 31827160 PMCID: PMC6906362 DOI: 10.1038/s41598-019-55260-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/21/2019] [Accepted: 11/26/2019] [Indexed: 12/15/2022] Open
Abstract
As the global prevalence of childhood obesity continues to rise, researchers and clinicians have sought to develop more effective and personalized intervention techniques. In doing so, obesity interventions have expanded beyond the traditional context of nutrition to address several facets of a child’s life, including their psychological state. While the consideration of psychological features has significantly advanced the view of obesity as a holistic condition, attempts to associate such features with outcomes of treatment have been inconclusive. We posit that such uncertainty may arise from the univariate manner in which features are evaluated, focusing on a particular aspect such as loneliness or insecurity, but failing to account for the impact of co-occurring psychological characteristics. Moreover, the co-occurrence of psychological characteristics (both child and parent/guardian) has not been studied from the perspective of its relationship with nutritional intervention outcomes. To that end, this work looks to broaden the prevailing view, laying the foundation for the existence of complex interactions among psychological features. In collaboration with a non-profit nutritional clinic in Brazil, this paper demonstrates and models these interactions and their associations with the outcomes of a nutritional intervention.
Affiliation(s)
- Keith Feldman
- Department of Computer Science and Engineering, Interdisciplinary Center for Network Science and Applications (iCeNSA), University of Notre Dame, Notre Dame, IN, 46556, USA; Health Services and Outcomes Research, Children's Mercy Kansas City, Kansas City, MO, USA; Department of Pediatrics, University of Missouri-Kansas City School of Medicine, Kansas City, MO, USA
- Gisela M B Solymos
- Department of Computer Science and Engineering, Interdisciplinary Center for Network Science and Applications (iCeNSA), University of Notre Dame, Notre Dame, IN, 46556, USA; Kellogg Institute for International Studies, University of Notre Dame, Notre Dame, IN, USA
- Maria Paula de Albuquerque
- Department of Physiology, Section Physiology of Nutrition, Federal University of São Paulo (UNIFESP), São Paulo, Brazil; CREN, São Paulo, Brazil
- Nitesh V Chawla
- Department of Computer Science and Engineering, Interdisciplinary Center for Network Science and Applications (iCeNSA), University of Notre Dame, Notre Dame, IN, 46556, USA
|
26
|
Abstract
BACKGROUND Known colloquially as the "weekend effect," the association between weekend admissions and increased mortality within hospital settings has become a highly contested topic over the last two decades. Drawing interest from practitioners and researchers alike, a variety of works have emerged arguing for and against the presence of the effect across various patient cohorts. However, it has become evident that simply studying population characteristics is insufficient for understanding how the effect manifests. Rather, to truly understand the effect, investigations into its underlying factors must be considered. As such, the work presented in this manuscript serves to address this consideration by moving beyond identification of patient cohorts to examining the role of ICU performance. METHODS Employing a comprehensive, publicly available database of electronic medical records (EMR), we began by utilizing multiple logistic regression to identify and isolate a specific cohort in which the weekend effect was present. Next, we leveraged the highly detailed nature of the EMR to evaluate ICU performance using well-established ICU quality scorecards to assess differences in clinical factors among patients admitted to an ICU on the weekend versus weekday. RESULTS Our results demonstrate the weekend effect to be most prevalent among emergency surgery patients (OR 1.53; 95% CI 1.19, 1.96), specifically those diagnosed with circulatory diseases (P<.001). Differences between weekday and weekend admissions for this cohort included a variety of clinical factors such as ventilatory support and night-time discharges. CONCLUSIONS This work reinforces the importance of accounting for differences in clinical factors as well as patient cohorts in studies investigating the weekend effect.
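For readers unfamiliar with how an odds ratio and its 95% confidence interval are read, the sketch below computes an unadjusted odds ratio with a Wald interval from a 2x2 table. This is a simpler stand-in for the multiple logistic regression described above, which adjusts for covariates; the function name `odds_ratio_ci` and the counts are hypothetical and are not the study's data.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: weekend admissions who died     b: weekend admissions who survived
    c: weekday admissions who died     d: weekday admissions who survived
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Hypothetical counts, for illustration only.
odds_ratio, lower, upper = odds_ratio_ci(a=30, b=70, c=40, d=160)
significant = lower > 1.0  # an interval excluding 1 suggests an association
```

An interval that lies entirely above 1, as in the reported result (1.19, 1.96), indicates higher odds of death for weekend admissions at the chosen confidence level.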
Affiliation(s)
- Louis Faust
- Department of Computer Science & Engineering, University of Notre Dame, Notre Dame, USA
- Keith Feldman
- Department of Computer Science & Engineering, University of Notre Dame, Notre Dame, USA
- Nitesh V Chawla
- Department of Computer Science & Engineering, University of Notre Dame, Notre Dame, USA
|
27
|
Lin S, Faust L, Robles-Granda P, Kajdanowicz T, Chawla NV. Social network structure is predictive of health and wellness. PLoS One 2019; 14:e0217264. [PMID: 31170181 PMCID: PMC6553705 DOI: 10.1371/journal.pone.0217264] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 03/19/2019] [Accepted: 05/06/2019] [Indexed: 11/19/2022] Open
Abstract
Social networks influence health-related behavior, such as obesity and smoking. While researchers have studied social networks as a driver for diffusion of influences and behavior, it is less understood how the structure or topology of the network, in itself, impacts an individual’s health behavior and wellness state. In this paper, we investigate whether the structure or topology of a social network offers additional insight and predictability on an individual’s health and wellness. We develop a method called the Network-Driven health predictor (NetCARE) that leverages features representative of social network structure. Using a large longitudinal data set of students enrolled in the NetHealth study at the University of Notre Dame, we show that the NetCARE method improves the overall prediction performance over the baseline models—that use demographics and physical attributes—by 38%, 65%, 55%, and 54% for the wellness states—stress, happiness, positive attitude, and self-assessed health—considered in this paper.
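As an illustration of what "features representative of social network structure" can look like, the sketch below computes two standard per-node measures, degree and local clustering coefficient, from an adjacency mapping. The abstract does not list the features NetCARE actually uses, so this choice is an assumption; the function name and the toy network are hypothetical.

```python
def structural_features(adj):
    """Compute degree and local clustering coefficient for every node.

    adj: dict mapping each node to the set of its neighbours (undirected).
    """
    feats = {}
    for node, nbrs in adj.items():
        k = len(nbrs)
        # Count edges that exist between pairs of this node's neighbours.
        links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
        clustering = 2 * links / (k * (k - 1)) if k > 1 else 0.0
        feats[node] = {"degree": k, "clustering": clustering}
    return feats

# Hypothetical friendship network.
adj = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann", "cat"},
    "cat": {"ann", "bob"},
    "dan": {"ann"},
}
feats = structural_features(adj)
```

Feature vectors like these could then be joined with demographic and physical attributes as inputs to a predictive model.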
Affiliation(s)
- Suwen Lin
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, United States of America
- Interdisciplinary Center for Network Science and Applications, University of Notre Dame, Notre Dame, IN, United States of America
- Louis Faust
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, United States of America
- Interdisciplinary Center for Network Science and Applications, University of Notre Dame, Notre Dame, IN, United States of America
- Pablo Robles-Granda
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, United States of America
- Interdisciplinary Center for Network Science and Applications, University of Notre Dame, Notre Dame, IN, United States of America
- Tomasz Kajdanowicz
- Department of Computational Intelligence, Wroclaw University of Science and Technology, Wrocław, Poland
- Nitesh V. Chawla
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, United States of America
- Interdisciplinary Center for Network Science and Applications, University of Notre Dame, Notre Dame, IN, United States of America
- Department of Computational Intelligence, Wroclaw University of Science and Technology, Wrocław, Poland
|
28
|
Faust L, Wang C, Hachen D, Lizardo O, Chawla NV. Physical Activity Trend eXtraction: A Framework for Extracting Moderate-Vigorous Physical Activity Trends From Wearable Fitness Tracker Data. JMIR Mhealth Uhealth 2019; 7:e11075. [PMID: 30860488 PMCID: PMC6434402 DOI: 10.2196/11075] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 06/16/2018] [Revised: 11/26/2018] [Accepted: 12/10/2018] [Indexed: 12/11/2022] Open
Abstract
Background Moderate-vigorous physical activity (MVPA) offers extensive health benefits but is neglected by many. As a result, a wide body of research investigating physical activity behavior change has been conducted. As many of these studies transition from paper-based methods of MVPA data collection to fitness trackers, a series of challenges arise in extracting insights from these new data. Objective The objective of this research was to develop a framework for preprocessing and extracting MVPA trends from wearable fitness tracker data to support MVPA behavior change studies. Methods Using heart rate data collected from fitness trackers, we propose Physical Activity Trend eXtraction (PATX), a framework that imputes missing data, recalculates personalized target heart zones, and extracts MVPA trends. We tested our framework on a dataset of 123 college study participants observed across 2 academic years (18 months) using Fitbit Charge HRs. To demonstrate the value of our framework's output in supporting MVPA behavior change studies, we applied it to 2 case studies. Results Among the 123 participants analyzed, PATX labeled 41 participants as experiencing a significant increase in MVPA and 44 participants as experiencing a significant decrease in MVPA, with significance defined as P<.05. Our first case study was consistent with previous works investigating the associations between MVPA and mental health. The second, exploring how individuals perceive their own levels of MVPA relative to their friends, led to a novel observation that individuals were less likely to notice changes in their own MVPA when close ties in their social network mimicked their changes. Conclusions By providing meaningful and flexible outputs, PATX alleviates data concerns common with fitness trackers to support MVPA behavior change studies as they shift to more objective assessments of MVPA.
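The two core steps, counting MVPA minutes against a personalized heart rate threshold and fitting a trend, can be sketched as follows. This is not the PATX implementation: the threshold uses the common rule of thumb of 64% of an age-predicted maximum heart rate (220 minus age), missing readings are skipped rather than imputed, and the function names and data are hypothetical.

```python
def mvpa_minutes(heart_rates, age, fraction=0.64):
    """Count minute-level readings at or above a moderate-intensity threshold.

    Assumes one reading per minute, with None for missing minutes. The
    threshold (fraction of 220 - age) is a rule of thumb, not PATX's rule.
    """
    threshold = fraction * (220 - age)
    return sum(1 for hr in heart_rates if hr is not None and hr >= threshold)

def linear_trend(values):
    """Least-squares slope of values against their index (change per period)."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den

# Hypothetical minute-level readings for a 20-year-old (threshold near 128 bpm).
minutes = mvpa_minutes([90, 130, 140, None, 129], age=20)

# Hypothetical weekly MVPA totals, in minutes.
slope = linear_trend([100, 110, 120, 130])
```

A positive slope indicates increasing MVPA over the observation window. Deciding whether a trend is significant, as in the P<.05 labelling above, would additionally require a statistical test that this sketch omits.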
Affiliation(s)
- Louis Faust
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, United States; Interdisciplinary Center for Network Science and Applications, University of Notre Dame, Notre Dame, IN, United States
- Cheng Wang
- Interdisciplinary Center for Network Science and Applications, University of Notre Dame, Notre Dame, IN, United States; Department of Sociology, University of Notre Dame, Notre Dame, IN, United States
- David Hachen
- Interdisciplinary Center for Network Science and Applications, University of Notre Dame, Notre Dame, IN, United States; Department of Sociology, University of Notre Dame, Notre Dame, IN, United States
- Omar Lizardo
- Department of Sociology, University of California, Los Angeles, CA, United States
- Nitesh V Chawla
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, United States; Interdisciplinary Center for Network Science and Applications, University of Notre Dame, Notre Dame, IN, United States
|
29
|
Tao J, Wang C, Chawla NV, Shi L, Kim SH. Semantic Flow Graph: A Framework for Discovering Object Relationships in Flow Fields. IEEE Trans Vis Comput Graph 2018; 24:3200-3213. [PMID: 29990237 DOI: 10.1109/tvcg.2017.2773071] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 06/08/2023]
Abstract
Visual exploration of flow fields is important for studying dynamic systems. We introduce semantic flow graph (SFG), a novel graph representation and interaction framework that enables users to explore the relationships among key objects (i.e., field lines, features, and spatiotemporal regions) of both steady and unsteady flow fields. The objects and their relationships are organized as a heterogeneous graph. We assign each object a set of attributes, based on which a semantic abstraction of the heterogeneous graph is generated. This semantic abstraction is SFG. We design a suite of operations to explore the underlying flow fields based on this graph representation and abstraction mechanism. Users can flexibly reconfigure SFG to examine the relationships among groups of objects at different abstraction levels. Three linked views are developed to display SFG, its node split criteria and history, and the objects in the spatial volume. For simplicity, we introduce SFG construction and exploration for steady flow fields with critical points being the only features. Then we demonstrate that SFG can be naturally extended to deal with unsteady flow fields and multiple types of features. We experiment with multiple data sets and conduct an expert evaluation to demonstrate the effectiveness of our approach.
|
30
|
Thomas PB, Robertson DH, Chawla NV. Predicting onset of complications from diabetes: a graph based approach. Appl Netw Sci 2018; 3:48. [PMID: 30581983 PMCID: PMC6245137 DOI: 10.1007/s41109-018-0106-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 04/03/2018] [Accepted: 10/17/2018] [Indexed: 06/09/2023]
Abstract
Diabetes is a significant health concern, with more than 30 million Americans living with the disease. Onset of diabetes increases the risk for various complications, including kidney disease, myocardial infarctions, heart failure, stroke, retinopathy, and liver disease. In this paper, we study and predict the onset of these complications using a network-based approach by identifying fast and slow progressors. That is, given a patient's diagnosis of diabetes, we predict the likelihood of developing one or more of the possible complications, and which patients will develop complications quickly. This combination of "if a complication will be developed" with "how fast it will be developed" can aid the physician in developing a better diabetes management program for a given patient.
Affiliation(s)
- Pamela Bilo Thomas
- iCeNSA, Department of Computer Science and Engineering, University of Notre Dame, 384E Nieuwland Science Hall, Notre Dame, 46656 Indiana USA
- Indiana Biosciences Research Institute, 1345 W. 16th Street Suite 300, Indianapolis, 46202 IN USA
| | - Daniel H. Robertson
- Indiana Biosciences Research Institute, 1345 W. 16th Street Suite 300, Indianapolis, 46202 IN USA
| | - Nitesh V. Chawla
- iCeNSA, Department of Computer Science and Engineering, University of Notre Dame, 384E Nieuwland Science Hall, Notre Dame, 46656 Indiana USA
- Indiana Biosciences Research Institute, 1345 W. 16th Street Suite 300, Indianapolis, 46202 IN USA
|
31
|
Tao J, Imre M, Wang C, Chawla NV, Guo H, Sever G, Kim SH. Exploring Time-Varying Multivariate Volume Data Using Matrix of Isosurface Similarity Maps. IEEE Trans Vis Comput Graph 2018; 25:1236-1245. [PMID: 30130208 DOI: 10.1109/tvcg.2018.2864808] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 06/08/2023]
Abstract
We present a novel visual representation and interface named the matrix of isosurface similarity maps (MISM) for effective exploration of large time-varying multivariate volumetric data sets. MISM synthesizes three types of similarity maps (i.e., self, temporal, and variable similarity maps) to capture the essential relationships among isosurfaces of different variables and time steps. Additionally, it serves as the main visual mapping and navigation tool for examining the vast number of isosurfaces and exploring the underlying time-varying multivariate data set. We present temporal clustering, variable grouping, and interactive filtering to reduce the huge exploration space of MISM. In conjunction with the isovalue and isosurface views, MISM allows users to identify important isosurfaces or isosurface pairs and compare them over space, time, and value range. More importantly, we introduce path recommendation that suggests, animates, and compares traversal paths for effectively exploring MISM under varied criteria and at different levels-of-detail. A silhouette-based method is applied to render multiple surfaces of interest in a visually succinct manner. We demonstrate the effectiveness of our approach with case studies of several time-varying multivariate data sets and an ensemble data set, and evaluate our work with two domain experts.
|
32
|
Feldman K, Johnson RA, Chawla NV. The State of Data in Healthcare: Path Towards Standardization. J Healthc Inform Res 2018; 2:248-271. [DOI: 10.1007/s41666-018-0019-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 06/09/2017] [Revised: 03/21/2018] [Accepted: 03/29/2018] [Indexed: 12/23/2022]
|
33
|
Fernandez A, Garcia S, Herrera F, Chawla NV. SMOTE for Learning from Imbalanced Data: Progress and Challenges, Marking the 15-year Anniversary. J ARTIF INTELL RES 2018. [DOI: 10.1613/jair.1.11192] [Citation(s) in RCA: 490] [Impact Index Per Article: 81.7] [Indexed: 11/04/2022] Open
Abstract
The Synthetic Minority Oversampling Technique (SMOTE) preprocessing algorithm is considered the "de facto" standard in the framework of learning from imbalanced data. This is due to the simplicity of its design, as well as its robustness when applied to different types of problems. Since its publication in 2002, SMOTE has proven successful in a variety of applications from several different domains. SMOTE has also inspired several approaches to counter the issue of class imbalance, and has significantly contributed to new supervised learning paradigms, including multilabel classification, incremental learning, semi-supervised learning, and multi-instance learning, among others. It is a standard benchmark for learning from imbalanced data. It is also featured in a number of different software packages, from open source to commercial. In this paper, marking the fifteen-year anniversary of SMOTE, we reflect on the SMOTE journey, discuss the current state of affairs with SMOTE and its applications, and identify the next set of challenges to extend SMOTE for Big Data problems.
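For readers unfamiliar with the technique named in this abstract, the following is a minimal sketch of the core SMOTE idea as it is commonly described: a synthetic minority sample is placed at a random point on the segment between a minority sample and one of its k nearest minority-class neighbours. The function name `smote_sketch`, its parameters, and the toy points are illustrative assumptions, not code from the paper or from any particular software package.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Illustrative only: brute-force neighbour search, numeric features only.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of `base`, excluding `base` itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # uniform in [0, 1)
        # new point lies on the segment from `base` towards `neighbour`
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy minority class: the four corners of the unit square (hypothetical data).
corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sketch(corners, n_new=10, k=2)
```

Because every synthetic point is a convex combination of two existing minority points, the new samples stay inside the region spanned by the minority class rather than duplicating existing records.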
|
34
|
Feldman K, Kotoulas S, Chawla NV. TIQS: Targeted Iterative Question Selection for Health Interventions. J Healthc Inform Res 2018; 2:205-227. [PMID: 35415407 DOI: 10.1007/s41666-018-0015-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/31/2017] [Revised: 02/08/2018] [Accepted: 02/21/2018] [Indexed: 11/28/2022]
Abstract
While healthcare has traditionally existed within the confines of formal clinical environments, the emergence of population health initiatives has given rise to a new and diverse set of community interventions. As the number of interventions continues to grow, the ability to quickly and accurately identify those most relevant to an individual's specific need has become essential in the care process. However, due to the diverse nature of the interventions, the determination of need often requires non-clinical social and behavioral information that must be collected from the individuals themselves. Although survey tools have demonstrated success in the collection of this data, time restrictions and diminishing respondent interest have presented barriers to obtaining up-to-date information on a regular basis. In response, researchers have turned to analytical approaches to optimize surveys and quantify the importance of each question. To date, the majority of these works have approached the task from a univariate standpoint, identifying the next most important question to ask. However, such an approach fails to address the interconnected nature of the health conditions inherently captured by the broader set of survey questions. Utilizing data mining and machine learning methodology, this work demonstrates the value of capturing these relations. We present a novel framework that identifies a variable-length subset of survey questions most relevant in determining the need for a particular health intervention for a given individual. We evaluate the framework using a large national longitudinal dataset centered on aging, demonstrating the ability to identify the questions with the highest impact across a variety of interventions.
Affiliation(s)
- Keith Feldman
- Department of Computer Science and Engineering, and iCeNSA, University of Notre Dame, Notre Dame, IN 46656 USA
| | - Spyros Kotoulas
- IBM Research Ireland, IBM Technology Campus, Dublin, Ireland
| | - Nitesh V Chawla
- Department of Computer Science and Engineering, and iCeNSA, University of Notre Dame, Notre Dame, IN 46656 USA; Wrocław University of Science and Technology, Wrocław, Poland
|
35
|
Abstract
Nonstandard insurers suffer from a peculiar variant of fraud wherein an overwhelming majority of claims have the semblance of fraud. We show that state-of-the-art fraud detection performs poorly when deployed at underwriting. Our proposed framework "FraudBuster" represents a new paradigm in predicting segments of fraud at underwriting in an interpretable and regulation compliant manner. We show that the most actionable and generalizable profile of fraud is represented by market segments with high confidence of fraud and high loss ratio. We show how these segments can be reported in terms of their constituent policy traits, expected loss ratios, support, and confidence of fraud. Overall, our predictive models successfully identify fraud with an area under the precision-recall curve of 0.63 and an f-1 score of 0.769.
Affiliation(s)
- Saurabh Nagrecha
- Department of Computer Science and Engineering, iCeNSA, University of Notre Dame, Notre Dame, Indiana
| | - Reid A Johnson
- Department of Computer Science and Engineering, iCeNSA, University of Notre Dame, Notre Dame, Indiana
| | - Nitesh V Chawla
- Department of Computer Science and Engineering, iCeNSA, University of Notre Dame, Notre Dame, Indiana
|
36
|
Nigam A, Dambanemuya HK, Joshi M, Chawla NV. Harvesting Social Signals to Inform Peace Processes Implementation and Monitoring. Big Data 2017; 5:337-355. [PMID: 29235916 PMCID: PMC5734239 DOI: 10.1089/big.2017.0055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]
Abstract
Peace processes are complex, protracted, and contentious, involving significant bargaining and compromising among various societal and political stakeholders. In civil war terminations, it is pertinent to measure the pulse of the nation to ensure that the peace process is responsive to citizens' concerns. Social media wields tremendous power as a tool for dialogue, debate, organization, and mobilization, thereby adding more complexity to the peace process. Using Colombia's final peace agreement and national referendum as a case study, we investigate the influence of two important indicators: intergroup polarization and public sentiment toward the peace process. We present a detailed linguistic analysis to detect intergroup polarization and a predictive model that leverages Tweet structure, content, and user-based features to predict public sentiment toward the Colombian peace process. We demonstrate that had proaccord stakeholders leveraged public opinion from social media, the outcome of the Colombian referendum could have been different.
Affiliation(s)
- Aastha Nigam
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, Indiana
- Interdisciplinary Center for Network Science and Applications (iCeNSA), University of Notre Dame, Notre Dame, Indiana
| | - Henry K. Dambanemuya
- Interdisciplinary Center for Network Science and Applications (iCeNSA), University of Notre Dame, Notre Dame, Indiana
- Kroc Institute for International Peace Studies, University of Notre Dame, Notre Dame, Indiana
| | - Madhav Joshi
- Kroc Institute for International Peace Studies, University of Notre Dame, Notre Dame, Indiana
| | - Nitesh V. Chawla
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, Indiana
- Interdisciplinary Center for Network Science and Applications (iCeNSA), University of Notre Dame, Notre Dame, Indiana
- Kroc Institute for International Peace Studies, University of Notre Dame, Notre Dame, Indiana
|
37
|
Abstract
Users with demographic profiles in social networks offer the potential to understand the social principles that underpin our highly connected world, from individuals, to groups, to societies. In this article, we harness the power of network and data sciences to model the interplay between user demographics and social behavior and further study to what extent users’ demographic profiles can be inferred from their mobile communication patterns. By modeling over 7 million users and 1 billion mobile communication records, we find that during the active dating period (i.e., 18-35 years old), users are active in broadening social connections with males and females alike, while after reaching 35 years of age people tend to keep small, closed, and same-gender social circles. Further, we formalize the demographic prediction problem of inferring users’ gender and age simultaneously. We propose a factor graph-based WhoAmI method to address the problem by leveraging not only the correlations between network features and users’ gender/age, but also the interrelations between gender and age. In addition, we identify a new problem, coupled network demographic prediction across multiple mobile operators, and present a coupled variant of the WhoAmI method to address its unique challenges. Our extensive experiments demonstrate the effectiveness, scalability, and applicability of the WhoAmI methods. Finally, our study finds a greater than 80% potential predictability for inferring users’ gender from phone call behavior and 73% for users’ age from text messaging interactions.
Affiliation(s)
| | | | - Jie Tang
- Tsinghua University, Beijing, P. R. China
| | - Yang Yang
- Zhejiang University, Hangzhou, P. R. China
|
38
|
Wang S, Song J, Yang Y, Zhang Y, Chawla NV, Ma J, Wang H. Interaction between obesity and the Hypoxia Inducible Factor 3 Alpha Subunit rs3826795 polymorphism in relation with plasma alanine aminotransferase. BMC Med Genet 2017; 18:80. [PMID: 28754107 PMCID: PMC5534125 DOI: 10.1186/s12881-017-0437-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 10/14/2016] [Accepted: 07/13/2017] [Indexed: 12/14/2022]
Abstract
BACKGROUND Hypoxia Inducible Factor 3 Alpha Subunit (HIF3A) DNA has been demonstrated to be associated with obesity in the methylation level, and it also has a Body Mass Index (BMI)-independent association with plasma alanine aminotransferase (ALT). However, the relation among obesity, plasma ALT, HIF3A polymorphism and methylation remains unclear. This study aims to identify the association between HIF3A polymorphism and plasma ALT, and further to determine whether the effect of HIF3A polymorphism on ALT could be modified by obesity or mediated by DNA methylation. METHODS The HIF3A rs3826795 polymorphism was genotyped in a case-control study including 2030 Chinese children aged 7-18 years (705 obese cases and 1325 non-obese controls). Furthermore, the HIF3A DNA methylation of the peripheral blood was measured in 110 severely obese children and 110 age- and gender- matched normal-weight controls. RESULTS There was no overall association between the HIF3A rs3826795 polymorphism and ALT. A significant interaction between obesity and rs3826795 in relation with ALT was found (P inter = 0.042), with rs3826795 G-allele number elevating ALT significantly only in obese children (β' = 0.075, P = 0.037), but not in non-obese children (β' = -0.009, P = 0.741). Additionally, a mediation effect of HIF3A methylation was found in the association between the HIF3A rs3826795 polymorphism and ALT among obese children (β' = 0.242, P = 0.014). CONCLUSION This is the first study to report the interaction between obesity and HIF3A gene in relation with ALT, and also to reveal a mediation effect among the HIF3A polymorphism, methylation and ALT. This study provides new evidence to the function of HIF3A gene, which would be helpful for future risk assessment and personalized treatment of liver diseases.
Affiliation(s)
- Shuo Wang
- Institute of Child and Adolescent Health of Peking University, School of Public Health, Peking University Health Science Center, Beijing, 100191, China; Interdisciplinary Center for Network Science and Applications (iCeNSA), University of Notre Dame, Notre Dame, IN, 46556, USA
| | - Jieyun Song
- Institute of Child and Adolescent Health of Peking University, School of Public Health, Peking University Health Science Center, Beijing, 100191, China
| | - Yide Yang
- Institute of Child and Adolescent Health of Peking University, School of Public Health, Peking University Health Science Center, Beijing, 100191, China
| | - Yining Zhang
- Institute of Child and Adolescent Health of Peking University, School of Public Health, Peking University Health Science Center, Beijing, 100191, China
| | - Nitesh V Chawla
- Interdisciplinary Center for Network Science and Applications (iCeNSA), University of Notre Dame, Notre Dame, IN, 46556, USA; Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556, USA
| | - Jun Ma
- Institute of Child and Adolescent Health of Peking University, School of Public Health, Peking University Health Science Center, Beijing, 100191, China.
| | - Haijun Wang
- Division of Maternal and Child Health, School of Public Health, Peking University Health Science Center, Beijing, 100191, China.
|
39
|
Mursalin M, Zhang Y, Chen Y, Chawla NV. Automated epileptic seizure detection using improved correlation-based feature selection with random forest classifier. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2017.02.053] [Citation(s) in RCA: 151] [Impact Index Per Article: 21.6] [Indexed: 11/28/2022]
|
40
|
Wang S, Song J, Yang Y, Chawla NV, Ma J, Wang H. Rs12970134 near MC4R is associated with appetite and beverage intake in overweight and obese children: A family-based association study in Chinese population. PLoS One 2017; 12:e0177983. [PMID: 28520814 PMCID: PMC5433775 DOI: 10.1371/journal.pone.0177983] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 07/11/2016] [Accepted: 05/05/2017] [Indexed: 12/15/2022] Open
Abstract
Background Recent studies indicated that eating behaviors are under genetic influence, and the melanocortin 4 receptor (MC4R) gene polymorphisms can affect the total energy intake and the consumption of fat, protein and carbohydrates. Our study aims at investigating the association of the MC4R polymorphism with appetite and food intake among Chinese children. Methods A family-based association study was conducted among 151 Chinese trios whose offspring were overweight/obese children aged 9–15 years. The rs12970134 near MC4R was genotyped, and the Children Eating Behavior Questionnaire (CEBQ) and a self-designed questionnaire measuring food intake were administered. The FBAT and PBAT software packages were used. Results The family-based association analysis showed that there was a significant association between rs12970134 and obesity (Z = 2.449, P = 0.014). After adjusting for age, gender and standardized BMI, rs12970134 was significantly associated with food responsiveness (FR) among children (β'b = 0.077, Pb = 0.028), and with satiety responsiveness (SR) in trios (P = -0.026). The polymorphism was associated with beverage intake (β'b = 0.331, Pb = 0.00016 in children; P = 0.043 in trios), but not significantly associated with vegetable, fruit or meat intake (P>0.050). We further found a significant mediation effect among the rs12970134, FR and beverage intake (b = 0.177, P = 0.047). Conclusions Our study is the first to report that rs12970134 near MC4R was associated with appetite and beverage intake, and that food responsiveness could mediate the effect of rs12970134 on beverage intake in overweight and obese Chinese children. Further studies are needed to uncover the genetic basis for eating behaviors, which could lead to the development and implementation of effective intervention strategies early in life.
Affiliation(s)
- Shuo Wang
- Institute of Child and Adolescent Health of Peking University, School of Public Health, Peking University Health Science Center, Beijing, China
- Interdisciplinary Center for Network Science and Applications (iCeNSA), University of Notre Dame, Notre Dame, IN, United States of America
| | - Jieyun Song
- Institute of Child and Adolescent Health of Peking University, School of Public Health, Peking University Health Science Center, Beijing, China
| | - Yide Yang
- Institute of Child and Adolescent Health of Peking University, School of Public Health, Peking University Health Science Center, Beijing, China
| | - Nitesh V. Chawla
- Interdisciplinary Center for Network Science and Applications (iCeNSA), University of Notre Dame, Notre Dame, IN, United States of America
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, United States of America
| | - Jun Ma
- Institute of Child and Adolescent Health of Peking University, School of Public Health, Peking University Health Science Center, Beijing, China
| | - Haijun Wang
- Institute of Child and Adolescent Health of Peking University, School of Public Health, Peking University Health Science Center, Beijing, China
|
41
|
Dasgupta D, Johnson RA, Chaudhry B, Reeves KG, Willaert P, Chawla NV. Design and Evaluation of a Medication Adherence Application with Communication for Seniors in Independent Living Communities. AMIA Annu Symp Proc 2017; 2016:480-489. [PMID: 28269843 PMCID: PMC5333254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2023]
Abstract
Medication non-adherence is a pressing concern among seniors, leading to a lower quality of life and higher healthcare costs. While mobile applications provide a viable medium for medication management, their utility can be limited without tackling the specific needs of seniors and facilitating the active involvement of care providers. To address these limitations, we are developing a tablet-based application designed specifically for seniors to track their medications and a web portal for their care providers to track medication adherence. In collaboration with a local Aging in Place program, we conducted a three-month study with sixteen participants from an independent living facility. Our study found that the application helped participants to effectively track their medications and improved their sense of wellbeing. Our findings highlight the importance of catering to the needs of seniors and of involving care providers in this process, with specific recommendations for the development of future medication management applications.
|
42
|
Abstract
To ensure the correctness of network analysis methods, the network (as the input) has to be a sufficiently accurate representation of the underlying data. However, when representing sequential data from complex systems, such as global shipping traffic or Web clickstream traffic as networks, conventional network representations that implicitly assume the Markov property (first-order dependency) can quickly become limiting. This assumption holds that, when movements are simulated on the network, the next movement depends only on the current node, discounting the fact that the movement may depend on several previous steps. However, we show that data derived from many complex systems can show up to fifth-order dependencies. In these cases, the oversimplifying assumption of the first-order network representation can lead to inaccurate network analysis results. To address this problem, we propose the higher-order network (HON) representation that can discover and embed variable orders of dependencies in a network representation. Through a comprehensive empirical evaluation and analysis, we establish several desirable characteristics of HON, including accuracy, scalability, and direct compatibility with the existing suite of network analysis methods. We illustrate how HON can be applied to a broad variety of tasks, such as random walking, clustering, and ranking, and we demonstrate that, by using it as input, HON yields more accurate results without any modification to these tasks.
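The gap between a first-order (Markov) view and a higher-order view described in this abstract can be illustrated with a small sketch. It only estimates next-step probabilities conditioned on the last one or the last two nodes of each sequence; it is not the HON construction procedure from the paper, and the function name `transition_probs` and the toy trips are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def transition_probs(sequences, order=1):
    """Estimate P(next node | last `order` nodes) from observed sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(order, len(seq)):
            context = tuple(seq[i - order:i])  # the `order` most recent nodes
            counts[context][seq[i]] += 1
    return {
        context: {nxt: c / sum(ctr.values()) for nxt, c in ctr.items()}
        for context, ctr in counts.items()
    }


# Hypothetical trips: where a traveller goes after C depends on where
# they came from (A or B), which a first-order model cannot express.
trips = [
    ["A", "C", "D"],
    ["A", "C", "D"],
    ["B", "C", "E"],
    ["B", "C", "E"],
]
first_order = transition_probs(trips, order=1)
second_order = transition_probs(trips, order=2)
```

In this toy data the first-order estimate from `C` splits evenly between `D` and `E`, while the second-order estimates for `(A, C)` and `(B, C)` are each concentrated on a single destination, which is the kind of dependency a first-order network representation discards.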
Affiliation(s)
- Jian Xu
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
- Interdisciplinary Center for Network Science and Applications, University of Notre Dame, Notre Dame, IN 46556, USA; Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN 46556, USA; Environmental Change Initiative, University of Notre Dame, Notre Dame, IN 46556, USA
| | - Thanuka L. Wickramarathne
- Interdisciplinary Center for Network Science and Applications, University of Notre Dame, Notre Dame, IN 46556, USA; Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN 46556, USA; Environmental Change Initiative, University of Notre Dame, Notre Dame, IN 46556, USA
| | - Nitesh V. Chawla
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
- Interdisciplinary Center for Network Science and Applications, University of Notre Dame, Notre Dame, IN 46556, USA; Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN 46556, USA; Environmental Change Initiative, University of Notre Dame, Notre Dame, IN 46556, USA
|
43
|
Abstract
On April 2nd, 2014, the Department of Health and Human Services (HHS) announced a historic policy in its effort to increase transparency in the American healthcare system. The Centers for Medicare & Medicaid Services (CMS) would publicly release a dataset containing information about the types of Medicare services, requested charges, and payments issued by providers across the country. In its release, HHS stated that the data would shed light on "Medicare fraud, waste, and abuse." While this is most certainly true, we believe that it can provide so much more. Beyond the purely financial aspects of procedure charges and payments, the procedures themselves may provide us with additional information, not only about the Medicare population, but also about the physicians themselves. The procedures a physician performs are for the most part not novel, but rather recommended, observed, and studied. However, whether a physician decides on advocating a procedure is somewhat discretionary. Some patients require a clear course of action, while others may benefit from a variety of options. This article poses the following question: How does a physician's past experience in medical school shape his or her practicing decisions? This article aims to open the analysis into how data, such as the CMS Medicare release, can help further our understanding of knowledge transfer and how experiences during education can shape a physician's decisions over the course of his or her career. This work begins with an evaluation of similarities between medical school charges, procedures, and payments. It then details how schools' procedure choices may link them in other, more interesting ways. Finally, the article includes a geographic analysis of how medical school procedure payments and charges are distributed nationally, highlighting potential deviations.
Affiliation(s)
- Keith Feldman
- Department of Computer Science & Engineering, iCeNSA, University of Notre Dame, Notre Dame, Indiana
| | - Nitesh V. Chawla
- Department of Computer Science & Engineering, iCeNSA, University of Notre Dame, Notre Dame, Indiana
|
44
|
Dong Y, Tang J, Chawla NV, Lou T, Yang Y, Wang B. Inferring social status and rich club effects in enterprise communication networks. PLoS One 2015; 10:e0119446. [PMID: 25822343 PMCID: PMC4379184 DOI: 10.1371/journal.pone.0119446] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 04/16/2014] [Accepted: 01/19/2015] [Indexed: 11/30/2022] Open
Abstract
Social status, defined as the relative rank or position that an individual holds in a social hierarchy, is known to be among the most important motivating forces in social behaviors. In this paper, we consider the notion of status from the perspective of a position or title held by a person in an enterprise. We study the intersection of social status and social networks in an enterprise. We study whether enterprise communication logs can help reveal how social interactions and individual status manifest themselves in social networks. To that end, we use two enterprise datasets with three communication channels — voice call, short message, and email — to demonstrate the social-behavioral differences among individuals with different status. We have several interesting findings and based on these findings we also develop a model to predict social status. On the individual level, high-status individuals are more likely to be spanned as structural holes by linking to people in parts of the enterprise networks that are otherwise not well connected to one another. On the community level, the principle of homophily, social balance and clique theory generally indicate a “rich club” maintained by high-status individuals, in the sense that this community is much more connected, balanced and dense. Our model can predict social status of individuals with 93% accuracy.
Affiliation(s)
- Yuxiao Dong
- Interdisciplinary Center for Network Science and Applications, Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, United States of America
| | - Jie Tang
- Department of Computer Science and Technology, Tsinghua University, Beijing, P. R. China
| | - Nitesh V. Chawla
- Interdisciplinary Center for Network Science and Applications, Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, United States of America
| | - Tiancheng Lou
- Google Inc, Mountain View, CA, United States of America
| | - Yang Yang
- Interdisciplinary Center for Network Science and Applications, Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, United States of America
| | - Bai Wang
- Department of Computer Science and Technology, Beijing University of Posts and Telecommunications, Beijing, P. R. China
|
45
|
Dong Y, Pinelli F, Gkoufas Y, Nabi Z, Calabrese F, Chawla NV. Inferring Unusual Crowd Events from Mobile Phone Call Detail Records. Machine Learning and Knowledge Discovery in Databases 2015. [DOI: 10.1007/978-3-319-23525-7_29] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Indexed: 12/12/2022]
|
46
|
Abstract
Centrality of a node measures its relative importance within a network. There are a number of applications of centrality, including inferring the influence or success of an individual in a social network, and the resulting social network dynamics. While we can compute the centrality of any node in a given network snapshot, a number of applications are also interested in knowing the potential importance of an individual in the future. However, current centrality is not necessarily an effective predictor of future centrality. While there are different measures of centrality, we focus on degree centrality in this paper. We develop a method that reconciles preferential attachment and triadic closure to capture a node's prominence profile. We show that the proposed node prominence profile method is an effective predictor of degree centrality. Notably, our analysis reveals that individuals in the early stage of evolution display a distinctive and robust signature in degree centrality trend, adequately predicted by their prominence profile. We evaluate our work across four real-world social networks. Our findings have important implications for the applications that require prediction of a node's future degree centrality, as well as the study of social network dynamics.
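As a point of reference for the measure this abstract focuses on, here is a minimal sketch of normalized degree centrality on an undirected graph: a node's number of distinct neighbours divided by the largest possible number, n - 1. It does not implement the paper's node prominence profile; the function name `degree_centrality` and the toy edge list are illustrative assumptions.

```python
def degree_centrality(edges):
    """Normalized degree centrality for an undirected edge list.

    Self-loops are ignored and repeated edges are counted once.
    """
    neighbours = {}
    for u, v in edges:
        if u == v:
            continue  # a self-loop adds no neighbour
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    n = len(neighbours)
    # each node can have at most n - 1 distinct neighbours
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


# Hypothetical four-person network in which "a" knows everyone.
toy_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
centrality = degree_centrality(toy_edges)
```

Computing this on successive snapshots of a growing network gives the degree centrality trend that the abstract sets out to predict.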
Affiliation(s)
- Yang Yang
- Interdisciplinary Center for Network Science and Applications (iCeNSA), Department of Computer Science and Engineering, University of Notre Dame
| | - Yuxiao Dong
- Interdisciplinary Center for Network Science and Applications (iCeNSA), Department of Computer Science and Engineering, University of Notre Dame
| | - Nitesh V Chawla
- Interdisciplinary Center for Network Science and Applications (iCeNSA), Department of Computer Science and Engineering, University of Notre Dame
| |
|
47
|
Zhou ZH, Chawla NV, Jin Y, Williams GJ. Big Data Opportunities and Challenges: Discussions from Data Analytics Perspectives [Discussion Forum]. IEEE COMPUT INTELL M 2014. [DOI: 10.1109/mci.2014.2350953] [Citation(s) in RCA: 166] [Impact Index Per Article: 16.6] [Indexed: 11/10/2022]
|
48
|
Abstract
We describe the vertex collocation profile (VCP) concept. VCPs provide rich information about the surrounding local structure of embedded vertex pairs. VCP analysis offers a new tool for researchers and domain experts to understand the underlying growth mechanisms in their networks and to analyze link formation mechanisms in the appropriate sociological, biological, physical, or other context. The same resolution that gives the VCP method its analytical power also enables it to perform well when used to accomplish link prediction. We first develop the theory, mathematics, and algorithms underlying VCPs. We provide timing results to demonstrate that the algorithms scale well even for large networks. Then we demonstrate VCP methods performing link prediction competitively with unsupervised and supervised methods across different network families. Unlike many analytical tools, VCPs inherently generalize to multirelational data, which provides them with unique power in complex modeling tasks. To demonstrate this, we apply the VCP method to longitudinal networks by encoding temporally resolved information into different relations. In this way, the transitions between VCP elements represent temporal evolutionary patterns in the longitudinal network data. Results show that VCPs can use this additional data, typically challenging to employ, to improve predictive model accuracies. We conclude with our perspectives on the VCP method and its future in network science, particularly link prediction.
Affiliation(s)
- Ryan N Lichtenwalter
- Interdisciplinary Center for Network Science and Applications (iCeNSA), The University of Notre Dame, 384 Nieuwland Hall, 46556 Notre Dame, USA ; Department of Computer Science, The University of Notre Dame, 384 Fitzpatrick Hall, 46556 Notre Dame, USA
| | - Nitesh V Chawla
- Interdisciplinary Center for Network Science and Applications (iCeNSA), The University of Notre Dame, 384 Nieuwland Hall, 46556 Notre Dame, USA ; Department of Computer Science, The University of Notre Dame, 384 Fitzpatrick Hall, 46556 Notre Dame, USA
| |
|
49
|
Rider AK, Siwo G, Emrich SJ, Ferdig MT, Chawla NV. A supervised learning approach to the ensemble clustering of genes. INT J DATA MIN BIOIN 2014; 9:199-219. [DOI: 10.1504/ijdmb.2014.059062] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/21/2022]
|
50
|
Abstract
One of the concerns patients have when confronted with a medical condition is which physician to trust. Any recommendation system that seeks to answer this question must ensure that any sensitive medical information collected by the system is properly secured. In this article, we codify these privacy concerns in a privacy-friendly framework and present two architectures that realize it: the Secure Processing Architecture (SPA) and the Anonymous Contributions Architecture (ACA). In SPA, patients submit their ratings in a protected form without revealing any information about their data, and the computation of recommendations proceeds over the protected data using secure multiparty computation techniques. In ACA, patients submit their ratings in the clear, but no link between a submission and patient data can be made. We discuss various aspects of both architectures, including techniques for ensuring reliability of computed recommendations and system performance, and compare the two.
|