1
Kang HYJ, Batbaatar E, Choi DW, Choi KS, Ko M, Ryu KS. Synthetic Tabular Data Based on Generative Adversarial Networks in Health Care: Generation and Validation Using the Divide-and-Conquer Strategy. JMIR Med Inform 2023; 11:e47859. [PMID: 37999942 DOI: 10.2196/47859] [Received: 04/03/2023] [Revised: 08/02/2023] [Accepted: 10/28/2023] [Indexed: 11/25/2023]
Abstract
BACKGROUND Synthetic data generation (SDG) based on generative adversarial networks (GANs) is used in health care, but preserving logical relationships between data in synthetic tabular data (STD) remains challenging. Filtering methods for SDG can lead to the loss of important information.
OBJECTIVE This study proposed a divide-and-conquer (DC) method to generate STD based on the GAN algorithm, while preserving data with logical relationships.
METHODS The proposed method was evaluated on data from the Korea Association for Lung Cancer Registry (KALC-R) and 2 benchmark data sets (breast cancer and diabetes). The DC-based SDG strategy comprises 3 steps: (1) we used 2 different partitioning methods (the class-specific criterion distinguished between survival and death groups, while the Cramér's V criterion identified the highest correlation between columns in the original data); (2) the entire data set was divided into a number of subsets, which were then used as input for the conditional tabular generative adversarial network and the copula generative adversarial network to generate synthetic data; and (3) the generated synthetic data were consolidated into a single entity. For validation, we compared DC-based SDG and conditional sampling (CS)-based SDG through the performances of machine learning models. In addition, we generated imbalanced and balanced synthetic data for each of the 3 data sets and compared their performance using 4 classifiers: decision tree (DT), random forest (RF), Extreme Gradient Boosting (XGBoost), and light gradient-boosting machine (LGBM) models.
RESULTS The synthetic data of the 3 diseases (non-small cell lung cancer [NSCLC], breast cancer, and diabetes) generated by our proposed model performed better across the 4 classifiers (DT, RF, XGBoost, and LGBM). The CS- versus DC-based model performances were compared using mean area under the curve values with SDs:
- DT: 74.87 (SD 0.77) versus 63.87 (SD 2.02) for NSCLC, 73.31 (SD 1.11) versus 67.96 (SD 2.15) for breast cancer, and 61.57 (SD 0.09) versus 60.08 (SD 0.17) for diabetes
- RF: 85.61 (SD 0.29) versus 79.01 (SD 1.20) for NSCLC, 78.05 (SD 1.59) versus 73.48 (SD 4.73) for breast cancer, and 59.98 (SD 0.24) versus 58.55 (SD 0.17) for diabetes
- XGBoost: 85.20 (SD 0.82) versus 76.42 (SD 0.93) for NSCLC, 77.86 (SD 2.27) versus 68.32 (SD 2.37) for breast cancer, and 60.18 (SD 0.20) versus 58.98 (SD 0.29) for diabetes
- LGBM: 85.14 (SD 0.77) versus 77.62 (SD 1.85) for NSCLC, 78.16 (SD 1.52) versus 70.02 (SD 2.17) for breast cancer, and 61.75 (SD 0.13) versus 61.12 (SD 0.23) for diabetes
In addition, we found that balanced synthetic data performed better than imbalanced synthetic data.
CONCLUSIONS This study is the first attempt to generate and validate STD based on a DC approach and shows improved performance using STD. The necessity for balanced SDG was also demonstrated.
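The three DC steps described in the methods (choose a partitioning column, split the data, generate per subset and merge) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `cramers_v`, `most_associated_pair`, `partition`, and `divide_and_conquer` are hypothetical, rows are assumed to be plain dictionaries of categorical values, and the `generate` argument is a stand-in for a trained generator such as a conditional tabular GAN or copula GAN, neither of which is called here.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import sqrt


def cramers_v(xs, ys):
    """Cramér's V between two equal-length sequences of categorical values."""
    n = len(xs)
    rows, cols = Counter(xs), Counter(ys)
    joint = Counter(zip(xs, ys))
    k = min(len(rows), len(cols)) - 1
    if n == 0 or k == 0:
        return 0.0  # association is undefined for a constant column
    chi2 = 0.0
    for r, r_count in rows.items():
        for c, c_count in cols.items():
            expected = r_count * c_count / n
            chi2 += (joint.get((r, c), 0) - expected) ** 2 / expected
    return sqrt(chi2 / (n * k))


def most_associated_pair(records, columns):
    """Return the pair of columns with the highest Cramér's V."""
    def score(pair):
        a, b = pair
        return cramers_v([r[a] for r in records], [r[b] for r in records])
    return max(combinations(columns, 2), key=score)


def partition(records, key):
    """Step 2 (divide): group records by the value of one column."""
    subsets = defaultdict(list)
    for record in records:
        subsets[record[key]].append(record)
    return dict(subsets)


def divide_and_conquer(records, key, generate):
    """Steps 2 and 3: generate per subset, then merge into one data set.

    `generate` is a hypothetical callable (subset -> synthetic rows).
    """
    synthetic = []
    for subset in partition(records, key).values():
        synthetic.extend(generate(subset))
    return synthetic
```

Because each subset is generated separately, a synthetic row can only combine values that the generator saw together within that subset, which is the intuition behind preserving logical relationships without post hoc filtering.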
Affiliation(s)
- Ha Ye Jin Kang
  - Department of Applied Artificial Intelligence, Hanyang University, Ansan, Republic of Korea
  - Department of Cancer AI & Digital Health, Graduate School of Cancer Science and Policy, National Cancer Center, Gyeonggi-do, Republic of Korea
- Erdenebileg Batbaatar
  - National Cancer Data Center, National Cancer Control Institute, National Cancer Center, Gyeonggi-do, Republic of Korea
- Dong-Woo Choi
  - National Cancer Data Center, National Cancer Control Institute, National Cancer Center, Gyeonggi-do, Republic of Korea
- Kui Son Choi
  - National Cancer Data Center, National Cancer Control Institute, National Cancer Center, Gyeonggi-do, Republic of Korea
  - Department of Cancer Control and Policy, Graduate School of Cancer Science and Policy, National Cancer Center, Gyeonggi-do, Republic of Korea
- Minsam Ko
  - Department of Applied Artificial Intelligence, Hanyang University, Ansan, Republic of Korea
  - Department of Human-Computer Interaction, Hanyang University, Ansan, Republic of Korea
- Kwang Sun Ryu
  - Department of Cancer AI & Digital Health, Graduate School of Cancer Science and Policy, National Cancer Center, Gyeonggi-do, Republic of Korea
  - National Cancer Data Center, National Cancer Control Institute, National Cancer Center, Gyeonggi-do, Republic of Korea
2
Ganbat M, Batbaatar E, Bazarragchaa G, Ider T, Gantumur E, Dashkhorol L, Altantsatsralt K, Nemekh M, Dashdondog E, Namsrai OE. Effect of Psychological Factors on Credit Risk: A Case Study of the Microlending Service in Mongolia. Behav Sci (Basel) 2021; 11:bs11040047. [PMID: 33916498 PMCID: PMC8067141 DOI: 10.3390/bs11040047] [Received: 02/04/2021] [Revised: 03/17/2021] [Accepted: 03/26/2021] [Indexed: 11/16/2022]
Abstract
This paper identified the determinants of loan repayment behavior based on psychological and behavioral economics theories. The purpose of this research is to determine whether an individual’s credit risk can be predicted from psychometric tests measuring psychological factors such as effective economic decision-making, self-control, conscientiousness, selflessness and a giving attitude, neuroticism, and attitude toward money. In addition, we compared the psychological indicators with the financial indicators, and across different age and gender groups, to assess whether the former can predict loan default prospects. This research covered the psychometric test results, financial information, and loan default information of 1118 borrowers who used loan-issuing applications on mobile phones. We validated the questionnaire using confirmatory factor analysis (CFA) and achieved an overall Cronbach’s alpha reliability coefficient greater than 0.90 (α = 0.937). We used the empirical data to construct prediction models with logistic regression. The results indicate that positive results from the psychometric testing of effective financial decision-making, self-control, conscientiousness, selflessness and a giving attitude, and attitude toward money improve individuals’ prospects of accessing debt. On the other hand, one of the variables, neuroticism, was found to be insignificant. Finally, the model used only psychological variables proven to have significant default predictability, and psychological variables and psychometric credit scoring offered the best prediction capacities.
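For readers unfamiliar with the reliability coefficient reported above, the following is a minimal Python sketch of Cronbach's alpha, computed as k/(k-1) × (1 - sum of item variances / variance of total scores). The function name `cronbach_alpha` and the toy scores are illustrative assumptions; the study's questionnaire data are not reproduced here, and sample variances are assumed throughout.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Assumes at least 2 items, at least 2 respondents, and non-constant totals.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

When all items move together perfectly, the total-score variance equals k² times a single item's variance and alpha reaches 1; values above 0.90, as reported in the abstract, are conventionally read as high internal consistency.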
Affiliation(s)
- Mandukhai Ganbat
  - Department of Research and Development, Optimal N Max LLC, Bogd Javzandamba Street, Khan-Uul District, LS Plaza 801, Ulaanbaatar 17011, Mongolia
- Erdenebileg Batbaatar
  - Department of Information and Computer Sciences, School of Engineering and Applied Sciences, National University of Mongolia, Ikh surguuliin gudamj 3, Sukhbaatar District, P.O. Box-46A/600, Ulaanbaatar 14201, Mongolia
- Ganzul Bazarragchaa
  - Department of Research and Development, Optimal N Max LLC, Bogd Javzandamba Street, Khan-Uul District, LS Plaza 801, Ulaanbaatar 17011, Mongolia
- Togtuunaa Ider
  - Department of Research and Development, Optimal N Max LLC, Bogd Javzandamba Street, Khan-Uul District, LS Plaza 801, Ulaanbaatar 17011, Mongolia
- Enkhjargalan Gantumur
  - Department of Research and Development, Optimal N Max LLC, Bogd Javzandamba Street, Khan-Uul District, LS Plaza 801, Ulaanbaatar 17011, Mongolia
- Lkhamsuren Dashkhorol
  - Department of Research and Development, Optimal N Max LLC, Bogd Javzandamba Street, Khan-Uul District, LS Plaza 801, Ulaanbaatar 17011, Mongolia
- Khosgarig Altantsatsralt
  - Department of Research and Development, Optimal N Max LLC, Bogd Javzandamba Street, Khan-Uul District, LS Plaza 801, Ulaanbaatar 17011, Mongolia
- Mandakhbayar Nemekh
  - Department of Research and Development, Optimal N Max LLC, Bogd Javzandamba Street, Khan-Uul District, LS Plaza 801, Ulaanbaatar 17011, Mongolia
- Erdenebaatar Dashdondog
  - Department of Physics, School of Arts and Sciences, National University of Mongolia, Ikh surguuliin gudamj 1, Sukhbaatar District, P.O. Box-46A/600, Ulaanbaatar 14201, Mongolia
  - Correspondence: (E.D.); (O.-E.N.)
- Oyun-Erdene Namsrai
  - Department of Information and Computer Sciences, School of Engineering and Applied Sciences, National University of Mongolia, Ikh surguuliin gudamj 3, Sukhbaatar District, P.O. Box-46A/600, Ulaanbaatar 14201, Mongolia
  - Correspondence: (E.D.); (O.-E.N.)
3
Dorjdagva J, Batbaatar E, Dorjsuren B, Kauhanen J. Socioeconomic Inequalities in Mental Health in Mongolia. Eur J Public Health 2020. [DOI: 10.1093/eurpub/ckaa166.1062] [Indexed: 11/14/2022]
Abstract
Background
Promotion of mental health and well-being has recently been recognized as a health priority at the global level. In Mongolia, mental health issues have been on the rise. However, little is known about socioeconomic inequality in mental health in the country. The aim of this study is to examine socioeconomic inequality in mental health in the adult population in Mongolia.
Methods
This study analyzed the data of 30,567 adults from the Household Socio-Economic Survey, collected in 2012 by the National Statistical Office of Mongolia. Self-reported mental health was used as the health outcome variable. Socioeconomic status was measured by household income. We employed Wagstaff's concentration index to assess the degree of socioeconomic inequality in mental health.
Results
The results show that the prevalence of self-reported mental health problems was 1.17% among the respondents. Adults living in urban areas suffered significantly more from mental illness than adults living in rural settlements. Wagstaff's concentration index for mental health was significantly negative (-0.243), indicating that mental health problems were concentrated among the lower-income groups. The decomposition results show that education, economic activity status, and marital status were the main contributors to socioeconomic inequalities in mental health after removing age- and sex-related contributions.
Conclusions
Socioeconomic inequality in mental health exists in the adult population in Mongolia and is mainly explained by education level, employment, and marital status. Policies are needed to reduce socioeconomic inequality in mental health in the country.
Key messages
Socioeconomic inequality in mental health exists in Mongolia. It calls for further policy actions.
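The sign convention of the concentration index used in this abstract can be made concrete with a short Python sketch. It assumes the standard covariance formulation, C = 2 cov(h, r) / μ, where r is the fractional rank in the income distribution, and treats the Wagstaff normalization for a binary outcome as C / (1 - μ). The abstract does not state which variant or weighting the authors applied, so both function names and the toy data below are illustrative assumptions; ties in income and survey weights are ignored.

```python
def concentration_index(health, income):
    """Standard concentration index: 2 * cov(health, fractional rank) / mean."""
    n = len(health)
    mu = sum(health) / n
    if mu == 0:
        raise ValueError("mean of the health variable must be nonzero")
    # Fractional rank by income, poorest first: (position + 0.5) / n.
    order = sorted(range(n), key=lambda i: income[i])
    rank = [0.0] * n
    for position, i in enumerate(order):
        rank[i] = (position + 0.5) / n
    mean_rank = sum(rank) / n
    cov = sum((h - mu) * (r - mean_rank) for h, r in zip(health, rank)) / n
    return 2 * cov / mu


def wagstaff_normalized(health, income):
    """Normalization for a binary (0/1) outcome: C / (1 - mean)."""
    mu = sum(health) / len(health)
    if not 0 < mu < 1:
        raise ValueError("binary outcome needs a mean strictly between 0 and 1")
    return concentration_index(health, income) / (1 - mu)
```

With ill health coded as 1, a negative value means the condition is concentrated among lower-income respondents, which is how the reported value of -0.243 is interpreted in the results.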
Affiliation(s)
- J Dorjdagva
  - Institute of Public Health and Clinical Nutrition, University of Eastern Finland, Kuopio, Finland
- E Batbaatar
  - Department of Social Sciences, University of Eastern Finland, Kuopio, Finland
- B Dorjsuren
  - Department of Health Systems Governance and Financing, WHO, Geneva, Switzerland
- J Kauhanen
  - Institute of Public Health and Clinical Nutrition, University of Eastern Finland, Kuopio, Finland