1
Wieditz J, Miller C, Scholand J, Nemeth M. A Brief Introduction on Latent Variable Based Ordinal Regression Models With an Application to Survey Data. Stat Med 2024; 43:5618-5634. [PMID: 39466627 PMCID: PMC11588990 DOI: 10.1002/sim.10208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Revised: 01/31/2024] [Accepted: 08/12/2024] [Indexed: 10/30/2024]
Abstract
The analysis of survey data is a frequently arising issue in clinical trials, particularly when capturing quantities that are difficult to measure. Typical examples are questionnaires about patients' well-being, pain, or consent to an intervention. In these, data are captured on a discrete scale containing only a limited number of possible answers, from which the respondent has to pick the answer that best fits their personal opinion. Such data are generally located on an ordinal scale, as answers can usually be arranged in ascending order, for example, "bad", "neutral", "good" for well-being. Since responses are usually stored numerically for data processing purposes, ordinary linear regression models are commonly applied to analyze survey data. However, the assumptions of these models are often not met, as linear regression requires constant variability of the response variable and can yield predictions outside the range of response categories. By using linear models, one only gains insights about the mean response, which may affect representativeness. In contrast, ordinal regression models can provide probability estimates for all response categories and yield information about the full response scale beyond the mean. In this work, we provide a concise overview of the fundamentals of latent variable based ordinal models, applications to a real data set, and outline the use of state-of-the-art software for this purpose. Moreover, we discuss strengths, limitations and typical pitfalls. This is a companion work to a current vignette-based structured interview study in pediatric anesthesia.
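The claim that ordinal models yield probability estimates for every response category can be illustrated with a minimal sketch of a cumulative logit (proportional odds) model, one common latent variable formulation. The abstract does not state which link function or parameter values the authors use, so the logistic link, the function name `category_probabilities`, and the numbers below are assumptions chosen purely for illustration.

```python
import math


def category_probabilities(eta, thresholds):
    """Category probabilities under a cumulative logit model (illustrative sketch).

    Assumes a latent variable Y* = eta + eps with standard logistic noise and
    strictly increasing thresholds; the observed category is k when Y* falls
    between thresholds k-1 and k.
    """
    if any(a >= b for a, b in zip(thresholds, thresholds[1:])):
        raise ValueError("thresholds must be strictly increasing")
    # Cumulative probabilities P(Y <= k) = logistic(theta_k - eta), padded with 0 and 1.
    cumulative = (
        [0.0]
        + [1.0 / (1.0 + math.exp(-(t - eta))) for t in thresholds]
        + [1.0]
    )
    # Each category probability is the difference of neighbouring cumulative values.
    return [hi - lo for lo, hi in zip(cumulative, cumulative[1:])]


# Hypothetical example: three answers ("bad", "neutral", "good") need two thresholds.
probs = category_probabilities(eta=0.5, thresholds=[-1.0, 1.0])
```

Unlike a linear model, the result is a full distribution over the answer categories, and a larger linear predictor `eta` shifts probability mass toward the higher categories.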
Affiliation(s)
- Johannes Wieditz
  - Department of Medical Statistics, University Medical Center Göttingen, Göttingen, Germany
  - Department of Anaesthesiology, University Medical Center Göttingen, Göttingen, Germany
- Clemens Miller
  - Department of Anaesthesiology, University Medical Center Göttingen, Göttingen, Germany
  - Department of Anesthesiology, Children's Orthopedic Hospital, Aschau im Chiemgau, Germany
- Jan Scholand
  - Department of Anaesthesiology, University Medical Center Göttingen, Göttingen, Germany
- Marcus Nemeth
  - Department of Anaesthesiology, University Medical Center Göttingen, Göttingen, Germany
2
Levy JJ, Chan N, Marotti JD, Kerr DA, Gutmann EJ, Glass RE, Dodge CP, Suriawinata AA, Christensen B, Liu X, Vaickus LJ. Large-scale validation study of an improved semiautonomous urine cytology assessment tool: AutoParis-X. Cancer Cytopathol 2023; 131:637-654. [PMID: 37377320 PMCID: PMC11251731 DOI: 10.1002/cncy.22732] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/01/2023] [Revised: 05/11/2023] [Accepted: 05/12/2023] [Indexed: 06/29/2023]
Abstract
BACKGROUND Adopting a computational approach for the assessment of urine cytology specimens has the potential to improve the efficiency, accuracy, and reliability of bladder cancer screening, which has heretofore relied on semisubjective manual assessment methods. As rigorous, quantitative criteria and guidelines have been introduced for improving screening practices (e.g., The Paris System for Reporting Urinary Cytology), algorithms to emulate semiautonomous diagnostic decision-making have lagged behind, in part because of the complex and nuanced nature of urine cytology reporting. METHODS In this study, the authors report on the development and large-scale validation of a deep-learning tool, AutoParis-X, which can facilitate rapid, semiautonomous examination of urine cytology specimens. RESULTS The results of this large-scale, retrospective validation study indicate that AutoParis-X can accurately determine urothelial cell atypia and aggregate a wide variety of cell-related and cluster-related information across a slide to yield an atypia burden score, which correlates closely with overall specimen atypia and is predictive of Paris system diagnostic categories. Importantly, this approach accounts for challenges associated with the assessment of overlapping cell cluster borders, which improve the ability to predict specimen atypia and accurately estimate the nuclear-to-cytoplasm ratio for cells in these clusters. CONCLUSIONS The authors developed a publicly available, open-source, interactive web application that features a simple, easy-to-use display for examining urine cytology whole-slide images and determining the level of atypia in specific cells, flagging the most abnormal cells for pathologist review. The accuracy of AutoParis-X (and other semiautomated digital pathology systems) indicates that these technologies are approaching clinical readiness and necessitates full evaluation of these algorithms in head-to-head clinical trials.
Affiliation(s)
- Joshua J. Levy
  - Emerging Diagnostic and Investigative Technologies, Department of Pathology and Laboratory Medicine, Dartmouth Hitchcock Medical Center, Lebanon, NH, 03766
  - Department of Dermatology, Dartmouth Hitchcock Medical Center, Lebanon, NH, 03766
  - Department of Epidemiology, Dartmouth College Geisel School of Medicine, Hanover, NH, 03756
  - Program in Quantitative Biomedical Sciences, Dartmouth College Geisel School of Medicine, Hanover, NH, 03756
- Natt Chan
  - Program in Quantitative Biomedical Sciences, Dartmouth College Geisel School of Medicine, Hanover, NH, 03756
- Jonathan D. Marotti
  - Emerging Diagnostic and Investigative Technologies, Department of Pathology and Laboratory Medicine, Dartmouth Hitchcock Medical Center, Lebanon, NH, 03766
  - Dartmouth College Geisel School of Medicine, Hanover, NH, 03756
- Darcy A. Kerr
  - Emerging Diagnostic and Investigative Technologies, Department of Pathology and Laboratory Medicine, Dartmouth Hitchcock Medical Center, Lebanon, NH, 03766
  - Dartmouth College Geisel School of Medicine, Hanover, NH, 03756
- Edward J. Gutmann
  - Emerging Diagnostic and Investigative Technologies, Department of Pathology and Laboratory Medicine, Dartmouth Hitchcock Medical Center, Lebanon, NH, 03766
  - Dartmouth College Geisel School of Medicine, Hanover, NH, 03756
- Arief A. Suriawinata
  - Emerging Diagnostic and Investigative Technologies, Department of Pathology and Laboratory Medicine, Dartmouth Hitchcock Medical Center, Lebanon, NH, 03766
  - Dartmouth College Geisel School of Medicine, Hanover, NH, 03756
- Brock Christensen
  - Department of Epidemiology, Dartmouth College Geisel School of Medicine, Hanover, NH, 03756
  - Department of Molecular and Systems Biology, Dartmouth College Geisel School of Medicine, Hanover, NH, 03756
  - Department of Community and Family Medicine, Dartmouth College Geisel School of Medicine, Hanover, NH, 03756
- Xiaoying Liu
  - Emerging Diagnostic and Investigative Technologies, Department of Pathology and Laboratory Medicine, Dartmouth Hitchcock Medical Center, Lebanon, NH, 03766
  - Dartmouth College Geisel School of Medicine, Hanover, NH, 03756
- Louis J. Vaickus
  - Emerging Diagnostic and Investigative Technologies, Department of Pathology and Laboratory Medicine, Dartmouth Hitchcock Medical Center, Lebanon, NH, 03766
  - Dartmouth College Geisel School of Medicine, Hanover, NH, 03756
3
Yao Y. Bayesian network model structure based on binary evolutionary algorithm. PeerJ Comput Sci 2023; 9:e1466. [PMID: 37547397 PMCID: PMC10403175 DOI: 10.7717/peerj-cs.1466] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Accepted: 06/08/2023] [Indexed: 08/08/2023]
Abstract
With the continuous development of new technologies, the scale of training data is also expanding. Machine learning algorithms are gradually beginning to be studied and applied in places where the scale of data is relatively large. Because current structure learning algorithms focus only on the identification of dependencies and ignore the direction of dependencies, multiple labeled samples cannot be assigned to categories. Multiple labels need to be classified using techniques such as machine learning and then applied to solve the problem. In an environment with more training data, it is very meaningful to explore structure extension that identifies the dependencies between attributes and takes into account the direction of dependencies. In this article, Bayesian network structure learning, analysis of the shortcomings of traditional algorithms, and a binary evolutionary algorithm are applied to the randomized algorithm to generate the initial population. In the optimization process, the algorithm uses a Bayesian network to do a local search and uses a depth-first algorithm to break loops. Finally, it finds a network structure with a higher score. In the simulation experiment, the classic data sets ALARM and INSURANCE are introduced to verify the effectiveness of the algorithm. Compared with NOTEARS and the Expectation-Maximization (EM) algorithm, the weight evaluation index of this article was 4.5% and 7.3% better than other schemes. The clustering effect was improved by 13.5% and 15.2%. The smallest error and the highest accuracy are also better than other schemes. The discussion of Bayesian reasoning in this article has very important theoretical and practical significance. This article further improves the Bayesian network structure and optimizes the performance of the classifier, which plays a very important role in promoting the expansion of the network structure and provides innovative thinking.
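The step of using a depth-first algorithm to break loops can be sketched as follows. The abstract does not say which edge of a cycle is removed, so this sketch assumes the simplest rule: drop every back edge met during a depth-first traversal, which always leaves a directed acyclic graph. The function name `break_cycles` and the example graph are hypothetical.

```python
def break_cycles(edges, nodes):
    """Return `edges` without the back edges found by a depth-first search.

    Assumption (not from the source): any edge that points to a node still on
    the current DFS path closes a cycle and is simply dropped.
    """
    adjacency = {node: [] for node in nodes}
    for parent, child in edges:
        adjacency[parent].append(child)

    UNVISITED, ON_PATH, DONE = 0, 1, 2
    state = {node: UNVISITED for node in nodes}
    removed = set()

    def visit(node):
        state[node] = ON_PATH
        for child in adjacency[node]:
            if state[child] == ON_PATH:
                removed.add((node, child))  # back edge: closes a cycle
            elif state[child] == UNVISITED:
                visit(child)
        state[node] = DONE

    for node in nodes:
        if state[node] == UNVISITED:
            visit(node)
    return [edge for edge in edges if edge not in removed]


# Hypothetical candidate structure containing the loop A -> B -> C -> A.
dag_edges = break_cycles([("A", "B"), ("B", "C"), ("C", "A")], ["A", "B", "C"])
```

In a score-based search, a rule that prefers removing the edge with the smallest score contribution may be more appropriate; the traversal order used here is only the simplest choice.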
Affiliation(s)
- Yongna Yao
  - School of Information and Electronic Engineering, Shangqiu Institute of Technology, Shangqiu, China
6
Error-Correcting Output Codes in the Framework of Deep Ordinal Classification. Neural Process Lett 2022. [DOI: 10.1007/s11063-022-10824-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Abstract
Automatic classification tasks on structured data have been revolutionized by Convolutional Neural Networks (CNNs), but the focus has been on binary and nominal classification tasks. Only recently, ordinal classification (where class labels present a natural ordering) has been tackled through the framework of CNNs. Also, ordinal classification datasets commonly present a high imbalance in the number of samples of each class, making it an even harder problem. Focus should be shifted from classic classification metrics towards per-class metrics (like AUC or Sensitivity) and rank agreement metrics (like Cohen’s Kappa or Spearman’s rank correlation coefficient). We present a new CNN architecture based on the Ordinal Binary Decomposition (OBD) technique using Error-Correcting Output Codes (ECOC). We aim to show experimentally, using four different CNN architectures and two ordinal classification datasets, that the OBD+ECOC methodology significantly improves the mean results on the relevant ordinal and class-balancing metrics. The proposed method is able to outperform a nominal approach as well as already existing ordinal approaches, achieving a mean performance of $\mathrm{RMSE} = 1.0797$ for the Retinopathy dataset and $\mathrm{RMSE} = 1.1237$ for the Adience dataset averaged over 4 different architectures.
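The combination of Ordinal Binary Decomposition with error-correcting output codes can be sketched in a few lines. In the usual OBD scheme, a problem with K ordered classes is split into K-1 binary questions of the form "is the label greater than k?", and a prediction is decoded by choosing the class whose codeword lies closest to the network outputs. The abstract does not specify the distance or loss used, so the squared Euclidean distance and the function names below are assumptions.

```python
def obd_codeword(label, num_classes):
    """Encode an ordinal label as answers to 'is label > k?' for k = 0..K-2."""
    return [1.0 if label > k else 0.0 for k in range(num_classes - 1)]


def decode(outputs, num_classes):
    """Pick the class whose codeword is nearest to the binary outputs.

    Squared Euclidean distance is an assumption made for this sketch.
    """
    def distance(candidate):
        codeword = obd_codeword(candidate, num_classes)
        return sum((o - c) ** 2 for o, c in zip(outputs, codeword))

    return min(range(num_classes), key=distance)


# Hypothetical outputs of three binary units for a four-class problem.
code = obd_codeword(2, 4)
predicted = decode([0.9, 0.6, 0.2], 4)
```

Because neighbouring classes differ in a single code position, a mistake in one binary unit tends to move the decoded label to an adjacent class rather than a distant one, which is the property that ordinal metrics reward.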
7
A novel deep ordinal classification approach for aesthetic quality control classification. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-07050-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Abstract
Nowadays, decision support systems (DSSs) are widely used in several application domains, from industry to healthcare and medicine. Concerning the industrial scenario, we propose a DSS oriented to the aesthetic quality control (AQC) task, which has quickly established itself as one of the most crucial challenges of Industry 4.0. Taking into account the increasing amount of data in this domain, the application of machine learning (ML) and deep learning (DL) techniques offers great opportunities to automate the overall AQC process. The state of the art mainly approaches this problem with a nominal DL classification method that does not exploit the ordinal structure of the AQC task, thus not penalizing errors among distant AQC classes (which is a relevant aspect for the real use case). The paper introduces a DL ordinal methodology for AQC classification. Unlike other deep ordinal methods, we combined the standard categorical cross-entropy with the cumulative link model and imposed the ordinal constraint via the threshold and slope parameters. Experiments were performed to solve an AQC task on a novel image dataset originating from a specific company’s demand (i.e., aesthetic assessment of wooden stocks). We demonstrated how the proposed methodology is able to reduce misclassification errors (up to 0.937 quadratic weighted kappa loss) among distant classes while outperforming other state-of-the-art deep ordinal models and reducing the bias factor related to the item geometry. The proposed DL approach was integrated as the main core of a DSS supported by an Internet of Things (IoT) architecture that can support the human operator by reducing by up to 90% the time needed for the qualitative analysis carried out manually in this specific domain.
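The quadratic weighted kappa named in the abstract penalizes a misclassification by the squared distance between the true and the predicted class, so errors among distant classes cost more than errors among neighbours. The sketch below implements the standard agreement statistic, not the loss used by the authors, whose exact form is not given in the abstract; the function name and the labels in the example are hypothetical.

```python
def quadratic_weighted_kappa(y_true, y_pred, num_classes):
    """Standard quadratic weighted kappa for integer labels 0..num_classes-1.

    Returns 1.0 for perfect agreement and about 0.0 for chance-level agreement.
    Undefined (division by zero) when all labels and predictions share one class.
    """
    n = len(y_true)
    observed = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    true_hist = [sum(row) for row in observed]
    pred_hist = [sum(observed[i][j] for i in range(num_classes))
                 for j in range(num_classes)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            weight = (i - j) ** 2 / (num_classes - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count if true and predicted labels were independent.
            weighted_expected += weight * true_hist[i] * pred_hist[j] / n
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical labels on a three-level quality scale.
perfect = quadratic_weighted_kappa([0, 1, 2, 1], [0, 1, 2, 1], 3)
reversed_extremes = quadratic_weighted_kappa([0, 0, 2, 2], [2, 2, 0, 0], 3)
```

A loss for training is often derived as one minus a differentiable version of this statistic, but that construction is an assumption here and may differ from the paper's.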
8
Romeo L, Frontoni E. A Unified Hierarchical XGBoost model for classifying priorities for COVID-19 vaccination campaign. Pattern Recognition 2022; 121:108197. [PMID: 34312570 PMCID: PMC8295058 DOI: 10.1016/j.patcog.2021.108197] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 03/31/2021] [Revised: 06/21/2021] [Accepted: 07/20/2021] [Indexed: 05/03/2023]
Abstract
Current ML approaches do not fully address a still unresolved and topical challenge, namely the prediction of priorities for COVID-19 vaccine administration. Our task therefore includes some additional methodological challenges, mainly related to avoiding unwanted bias while handling categorical and ordinal data with a highly imbalanced nature. Hence, the main contribution of this study is to propose a machine learning algorithm, namely Hierarchical Priority Classification eXtreme Gradient Boosting, for priority classification for COVID-19 vaccine administration using the Italian Federation of General Practitioners dataset, which contains Electronic Health Record data of 17k patients. We measured the effectiveness of the proposed methodology for classifying all the priority classes while demonstrating a significant improvement with respect to the state of the art. The proposed ML approach, which is integrated into a clinical decision support system, is currently supporting General Practitioners in assigning COVID-19 vaccine administration priorities to their patients.
Affiliation(s)
- Luca Romeo
  - Department of Information Engineering (DII), Università Politecnica delle Marche, Ancona, Italy
  - Computational Statistics and Machine Learning, Istituto Italiano di Tecnologia, Genova, Italy
- Emanuele Frontoni
  - Department of Information Engineering (DII), Università Politecnica delle Marche, Ancona, Italy