1
Tadege M, Tegegne AS, Dessie ZG. Post-surgery survival and associated factors for cardiac patients in Ethiopia: applications of machine learning, semi-parametric and parametric modelling. BMC Med Inform Decis Mak 2024; 24:91. [PMID: 38553701 PMCID: PMC10979627 DOI: 10.1186/s12911-024-02480-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Accepted: 03/11/2024] [Indexed: 04/02/2024] Open
Abstract
INTRODUCTION People living in poverty, especially in low-income countries, are more affected by cardiovascular disease. Unlike in developed countries, it remains a significant cause of preventable heart disease in the Sub-Saharan region, including Ethiopia. According to the Ethiopian Ministry of Health, around 40,000 cardiac patients have been waiting for surgery in Ethiopia since September 2020. There is insufficient information about the long-term survival of patients after cardiac surgery in Ethiopia. Therefore, the main objective of the current study was to determine the long-term survival status of post-cardiac surgery patients in Ethiopia. METHODS All patients who attended from 2012 to 2023 throughout the country were included in the current study. The total number of participants was 1520 heart disease patients. Data collection was conducted from February 2022 to January 2023. Machine learning algorithms were applied. Gompertz regression was also used for the multivariable analysis. RESULTS Among the candidate machine learning models, the random survival forest was preferred. It indicated that the most important variables for clinical prediction were SPO2, age, waiting time to surgery, and creatinine value, accounting for 42.55%, 25.17%, 11.82%, and 12.19%, respectively. From the Gompertz regression, lower oxygen saturation, higher age, lower ejection fraction, a short stay in the cardiac center after surgery, prolonged waiting time to surgery, and creatinine value were statistically significant predictors of death for post-cardiac surgery patients in Ethiopia. CONCLUSION Some of the risk factors for death after cardiac surgery were identified in the current investigation. Particular attention should be given to patients with prolonged waiting times and to older patients. With only two fully active cardiac centers in Ethiopia, far from adequate for a population of more than 120 million, the study strongly recommends increasing the number of centers that provide cardiac surgery in Ethiopia.
Affiliation(s)
- Melaku Tadege
- College of Science, Bahir Dar University, Bahir Dar, Ethiopia
- Department of Statistics, Injibara University, Injibara, Amhara, Ethiopia
- Regional Data Management Center for Health (RDMC), Amhara Public Health Institute (APHI), Bahir Dar, Ethiopia
- Zelalem G Dessie
- College of Science, Bahir Dar University, Bahir Dar, Ethiopia
- School of Mathematics, Statistics and Computer Science, University of KwaZulu-Natal, Durban, South Africa
2
Wang PW, Su YH, Chou PH, Huang MY, Chen TW. Survival-related genes are diversified across cancers but generally enriched in cancer hallmark pathways. BMC Genomics 2022; 22:918. [PMID: 35508961 PMCID: PMC9066720 DOI: 10.1186/s12864-022-08581-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2022] [Accepted: 04/22/2022] [Indexed: 11/28/2022] Open
Abstract
Background Pan-cancer studies have disclosed many commonalities and differences in mutations, copy number variations, and gene expression alterations among cancers. Some of these features are significantly associated with clinical outcomes, and many prognosis-predictive biomarkers or biosignatures have been proposed for specific cancer types. Here, we systematically explored the biological functions and the distribution of survival-related genes (SRGs) across cancers. Results We carried out two different statistical survival models on the mRNA expression profiles in 33 cancer types from TCGA. We identified SRGs in each cancer type based on the Cox proportional hazards model and the log-rank test. We found a large difference in the number of SRGs among different cancer types, and most of the identified SRGs were specific to a particular cancer type. While these SRGs were unique to each cancer type, they were found mostly enriched in cancer hallmark pathways, e.g., cell proliferation, cell differentiation, DNA metabolism, and RNA metabolism. We also analyzed the association between cancer driver genes and SRGs and did not find significant over-representation amongst most cancers. Conclusions In summary, our work identified all the SRGs for 33 cancer types from TCGA. In addition, the pan-cancer analysis revealed the similarities and the differences in the biological functions of SRGs across cancers. Given the potential of SRGs in clinical utility, our results can serve as a resource for basic research and biotech applications. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-022-08581-x.
Affiliation(s)
- Po-Wen Wang
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, 30068, Taiwan; Center for Intelligent Drug Systems and Smart Bio-devices (IDS2B), National Yang Ming Chiao Tung University, Hsinchu, 30068, Taiwan
- Yi-Hsun Su
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, 30068, Taiwan; Industrial Development PhD Program of the College of Biological Science and Technology, National Yang Ming Chiao Tung University, Hsinchu, 30068, Taiwan
- Po-Hao Chou
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, 30068, Taiwan; Center for Intelligent Drug Systems and Smart Bio-devices (IDS2B), National Yang Ming Chiao Tung University, Hsinchu, 30068, Taiwan
- Ming-Yueh Huang
- Institute of Statistical Science, Academia Sinica, Taipei, 11529, Taiwan
- Ting-Wen Chen
- Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, 30068, Taiwan; Center for Intelligent Drug Systems and Smart Bio-devices (IDS2B), National Yang Ming Chiao Tung University, Hsinchu, 30068, Taiwan; Department of Biological Science and Technology, National Yang Ming Chiao Tung University, Hsinchu, 30068, Taiwan
3
Su PF, Lin CCK, Hung JY, Lee JS. The Proper Use and Reporting of Survival Analysis and Cox Regression. World Neurosurg 2022; 161:303-309. [PMID: 35505548 DOI: 10.1016/j.wneu.2021.06.132] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/31/2021] [Accepted: 06/27/2021] [Indexed: 10/18/2022]
Abstract
BACKGROUND Survival analyses are heavily used to analyze data in which the time to event is of interest. The purpose of this paper is to introduce some fundamental concepts for survival analyses in medical studies. METHODS We comprehensively review current survival methodologies, such as the nonparametric Kaplan-Meier method used to estimate survival probability, the log-rank test, one of the most popular tests for comparing survival curves, and the Cox proportional hazard model, which is used for building the relationship between survival time and specific risk factors. More advanced methods, such as time-dependent receiver operating characteristic, restricted mean survival time, and time-dependent covariates are also introduced. RESULTS This tutorial is aimed toward covering the basics of survival analysis. We used a neurosurgical case series of surgically treated brain metastases from non-small cell lung cancer patients as an example. The survival time was defined from the date of craniotomy to the date of patient death. CONCLUSIONS This work is an attempt to encourage more investigators/medical practitioners to use survival analyses appropriately in medical research. We highlight some statistical issues, make recommendations, and provide more advanced survival modeling in this aspect.
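To make the log-rank test named in this abstract concrete, here is a minimal Python sketch of the standard two-group statistic. It is not code from the article: the function name `logrank_test`, the 1/0 event coding, and the toy data are assumptions chosen for illustration, and a validated statistics package should be preferred in real analyses.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; events use 1 = event observed, 0 = censored.

    Returns (chi_square, p_value) with 1 degree of freedom.
    """
    group_a = list(zip(times_a, events_a))
    group_b = list(zip(times_b, events_b))
    # Distinct times at which at least one event occurred in either group.
    event_times = sorted({t for t, e in group_a + group_b if e})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x, _ in group_a if x >= t)  # at risk in A just before t
        n_b = sum(1 for x, _ in group_b if x >= t)  # at risk in B just before t
        d_a = sum(1 for x, e in group_a if x == t and e)  # events in A at t
        d_b = sum(1 for x, e in group_b if x == t and e)  # events in B at t
        n, d = n_a + n_b, d_a + d_b
        # Observed events in A minus the number expected under equal hazards.
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of the event count in group A.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0.0:
        return 0.0, 1.0
    chi_square = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Toy data (hypothetical): every subject in group A fails before anyone in B.
chi_square, p_value = logrank_test([1, 2, 3], [1, 1, 1], [4, 5, 6], [1, 1, 1])
print(round(chi_square, 3), round(p_value, 3))
```

The sketch recomputes the risk sets at every event time, which is quadratic in the sample size but keeps each term of the statistic visible.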
Affiliation(s)
- Pei-Fang Su
- Department of Statistics, National Cheng Kung University, Tainan, Taiwan
- Chou-Ching K Lin
- Department of Neurology, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan, Taiwan
- Jo-Ying Hung
- Department of Statistics, National Cheng Kung University, Tainan, Taiwan
- Jung-Shun Lee
- Division of Neurosurgery, Department of Surgery, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan, Taiwan
4
Nakagawa Y, Yoshimoto T, Nakagawa S, Sugitani Y, Yamamoto H, Asakawa T. Impact of Tumor Assessment Frequency on Statistical Power in Randomized Cancer Clinical Trials Evaluating Progression-Free Survival. Ther Innov Regul Sci 2021; 55:1258-1264. [PMID: 34319577 DOI: 10.1007/s43441-021-00328-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2021] [Accepted: 07/22/2021] [Indexed: 11/30/2022]
Abstract
BACKGROUND Progression-free survival (PFS) is frequently used as a primary endpoint in late-phase clinical trials for anti-metastatic cancer agents. Previous studies have indicated that the frequency of tumor assessment affects the statistical power for PFS because progression dates are inaccurate; however, this finding may be difficult to generalize because of its unrealistic assumptions. Therefore, we re-examined this issue under realistic assumptions and various scenarios that approximate actual clinical trials. METHODS Randomized clinical trials comparing two interventions against a solid tumor were simulated under conditions where progressive disease (PD)-dominant PFS or a non-negligible number of deaths (death-competitive PFS) contributed to PFS events, which are conditions that resemble clinical trials of first-line therapy and later-line therapy, respectively. We assessed the impact of tumor assessment frequency on the statistical power. RESULTS Under the PD-dominant PFS condition, even in extreme scenarios, statistical power loss was only approximately 3%. Under the death-competitive PFS condition, tumor assessment frequency affected the statistical power of PFS if the effect of the treatment on overall survival was lower than that on time to progression. In this case, loss of statistical power was often more than 10% in some realistic scenarios. CONCLUSION In trials investigating first-line treatments (PD-dominant PFS), tumor assessment frequency has a negligible impact on statistical power, whereas in trials investigating late-line therapies (death-competitive PFS), the potential impact of tumor assessment frequency on statistical power should be carefully evaluated at the design stage.
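The mechanism behind this abstract, that scheduled tumor assessments make progression dates inexact, can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions made here, not the authors' simulation: progression is recorded at the first scheduled assessment on or after its true time, death is recorded exactly, and the name `observed_pfs` is hypothetical.

```python
import math


def observed_pfs(true_progression, death, interval):
    """Observed PFS time when progression is only detected at scheduled scans.

    All arguments share one time unit (for example, weeks since randomization).
    """
    # Progression is first seen at the next scheduled assessment on or after it occurs.
    detected = math.ceil(true_progression / interval) * interval
    # Death dates are assumed to be known exactly, whatever the scan schedule.
    return min(detected, death)


# Same patient (true progression at week 7, death at week 20), two schedules.
print(observed_pfs(7, 20, interval=6))
print(observed_pfs(7, 20, interval=12))
# A death before the next scan is observed exactly, so the schedule drops out.
print(observed_pfs(7, 9, interval=6))
```

When deaths make up a sizeable share of PFS events, only the progression component is shifted by the schedule, which is one way to read the difference between the PD-dominant and death-competitive conditions.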
Affiliation(s)
- Yuki Nakagawa
- Chugai Pharmaceutical Co., Ltd., 2-1-1 Nihonbashi-Muromachi, Chuo-ku, Tokyo, 103-8324, Japan
- Takuya Yoshimoto
- Chugai Pharmaceutical Co., Ltd., 2-1-1 Nihonbashi-Muromachi, Chuo-ku, Tokyo, 103-8324, Japan
- Shintaro Nakagawa
- Chugai Pharmaceutical Co., Ltd., 2-1-1 Nihonbashi-Muromachi, Chuo-ku, Tokyo, 103-8324, Japan
- Yasuo Sugitani
- Chugai Pharmaceutical Co., Ltd., 2-1-1 Nihonbashi-Muromachi, Chuo-ku, Tokyo, 103-8324, Japan
- Hideharu Yamamoto
- Chugai Pharmaceutical Co., Ltd., 2-1-1 Nihonbashi-Muromachi, Chuo-ku, Tokyo, 103-8324, Japan
- Takashi Asakawa
- Chugai Pharmaceutical Co., Ltd., 2-1-1 Nihonbashi-Muromachi, Chuo-ku, Tokyo, 103-8324, Japan
5
Le-Rademacher J, Wang X. Time-To-Event Data: An Overview and Analysis Considerations. J Thorac Oncol 2021; 16:1067-1074. [PMID: 33887465 DOI: 10.1016/j.jtho.2021.04.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/25/2021] [Revised: 04/13/2021] [Accepted: 04/14/2021] [Indexed: 12/19/2022]
Abstract
In oncology, overall survival and progression-free survival are common time-to-event end points used to measure treatment efficacy. Analyses of this type of data rely on a complex statistical framework and the analysis results are only valid when the data meet certain assumptions. This article provides an overview of time-to-event data, the basic mechanics of common analysis methods, and issues often encountered when analyzing such data. Our goal is to provide clinicians and other lung cancer researchers with the knowledge to choose the appropriate time-to-event analysis methods and to interpret the outcomes of such analyses appropriately. We strongly encourage investigators to seek out statisticians with expertise in survival analysis when embarking on studies that include time-to-event data to ensure that their data are collected and analyzed using the appropriate methods.
Affiliation(s)
- Xiaofei Wang
- Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina
6
Abstract
BACKGROUND Restricted mean survival time methods compare the areas under the Kaplan-Meier curves up to a time τ for the control and experimental treatments. Extraordinary claims have been made about the benefits (in terms of dramatically smaller required sample sizes) when using restricted mean survival time methods as compared to proportional hazards methods for analyzing noninferiority trials, even when the true survival distributions satisfy proportional hazards. METHODS Through some limited simulations and asymptotic power calculations, the authors compare the operating characteristics of restricted mean survival time and proportional hazards methods for analyzing both noninferiority and superiority trials under proportional hazards to understand what relative power benefits there are when using restricted mean survival time methods for noninferiority testing. RESULTS In the setting of low-event rates, very large targeted noninferiority margins, and limited follow-up past τ, restricted mean survival time methods have more power than proportional hazards methods. For superiority testing, proportional hazards methods have more power. This is not a small-sample phenomenon but requires a low-event rate and a large noninferiority margin. CONCLUSION Although there are special settings where restricted mean survival time methods have a power advantage over proportional hazards methods for testing noninferiority, the larger issue in these settings is defining appropriate noninferiority margins. We find the restricted mean survival time methods lacking in these regards.
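The quantity this abstract compares, the area under a Kaplan-Meier curve up to a time τ, can be computed directly. The following is a minimal Python sketch under assumptions made here (the function names `kaplan_meier` and `rmst`, the 1/0 event coding, and the toy data are illustrative and do not come from the article).

```python
from itertools import groupby


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time; events: 1 = event, 0 = censored."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    for t, group in groupby(data, key=lambda row: row[0]):
        rows = list(group)
        deaths = sum(e for _, e in rows)
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Everyone observed at t (event or censored) leaves the risk set.
        at_risk -= len(rows)
    return curve


def rmst(times, events, tau):
    """Restricted mean survival time: area under the KM step function on [0, tau]."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    # Last rectangle runs from the final step before tau up to tau itself.
    return area + prev_s * (tau - prev_t)


# Toy data (hypothetical): one arm with a censored observation, one without.
control = rmst([1, 2, 3, 4], [1, 1, 1, 1], tau=4)
treated = rmst([1, 2, 3], [1, 0, 1], tau=3)
print(control, treated)
```

A treatment comparison would then take the difference (or ratio) of the two arms' values at a common τ; variance estimation and the choice of τ are left out of this sketch.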
Affiliation(s)
- Boris Freidlin
- Biometric Research Program, National Cancer Institute, Bethesda, MD, USA
- Chen Hu
- Division of Biostatistics and Bioinformatics, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Edward L Korn
- Biometric Research Program, National Cancer Institute, Bethesda, MD, USA
7
Abstract
Survival analysis is a collection of statistical procedures employed on time-to-event data. The outcome variable of interest is time until an event occurs. Conventionally, it dealt with death as the event, but it can handle any event occurring in an individual, such as disease, relapse from remission, and recovery. Survival data describe the length of time from a time of origin to an endpoint of interest. By time, we mean years, months, weeks, or days from the beginning of enrollment in the study. The major limitation of time-to-event data is the possibility of an event not occurring in all the subjects during a specific study period. In addition, some of the study subjects may leave the study prematurely. Such situations lead to what are called censored observations, as complete information is not available for these subjects. Life table and Kaplan-Meier techniques are employed to obtain the descriptive measures of survival times. The main objectives of survival analysis include analysis of patterns of time-to-event data, evaluating reasons why data may be censored, comparing the survival curves, and assessing the relationship of explanatory variables to survival time. Survival analysis also offers different regression models that accommodate any number of covariates (categorical or continuous) and produce adjusted hazard ratios for individual factors.
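The censoring described in this abstract is usually encoded as a pair: an observed time and an event indicator. The short Python sketch below shows one way to derive that pair from dates; the function name `observe`, its argument names, and the example dates are assumptions for illustration only.

```python
from datetime import date


def observe(enrolled, event_date, last_followup):
    """Return (time_in_days, event_indicator) for one subject.

    event_indicator is 1 if the event was seen during follow-up, else 0 (censored).
    """
    if event_date is not None and event_date <= last_followup:
        return (event_date - enrolled).days, 1
    # No event seen: the subject contributes time only up to the last contact.
    return (last_followup - enrolled).days, 0


# Event observed 60 days after enrollment.
print(observe(date(2020, 1, 1), date(2020, 3, 1), date(2020, 12, 31)))
# Subject left the study after 30 days without an event: right-censored.
print(observe(date(2020, 1, 1), None, date(2020, 1, 31)))
```

Life table and Kaplan-Meier estimators take exactly this pair as input, so a censored subject still counts as at risk until the recorded time.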
8
Oller R, Gómez Melis G. A nonparametric test for the association between longitudinal covariates and censored survival data. Biostatistics 2020; 21:727-742. [PMID: 30796830 DOI: 10.1093/biostatistics/kxz002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/14/2018] [Revised: 12/28/2018] [Accepted: 01/03/2019] [Indexed: 11/12/2022] Open
Abstract
Many biomedical studies focus on the association between a longitudinal measurement and a time-to-event outcome while quantifying this association by means of a longitudinal-survival joint model. In this article we propose using the $LLR$ test - a longitudinal extension of the log-rank test statistic given by Peto and Peto (1972) - to provide evidence of a plausible association between a time-to-event outcome (right- or interval-censored) and a time-dependent covariate. As joint model methods are complex and hard to interpret, it is wise to conduct a preliminary test such as $LLR$ for checking the association between both processes. The $LLR$ statistic can be expressed in the form of a weighted difference of hazards, yielding a broad class of weighted log-rank test statistics known as $LWLR$, which allow a specific emphasis along the time axis of the effects of the time-dependent covariate on the survival. The asymptotic distribution of $LLR$ is derived by means of a permutation approach under the assumption that the censoring mechanism is independent of the survival time and the longitudinal covariate. A simulation study is conducted to evaluate the performance of the test statistics $LLR$ and $LWLR$, showing that the empirical size is close to the nominal significance level and that the power of the test depends on the association between the covariates and the survival time. A data set together with a toy example are used to illustrate the $LLR$ test. The data set explores the study Epidemiology of Diabetes Interventions and Complications (Sparling and others, 2006) which includes interval-censored data. A software implementation of our method is available on github (https://github.com/RamonOller/LWLRtest).
Affiliation(s)
- Ramon Oller
- Departament d'Economia i Empresa, Universitat de Vic-Universitat Central de Catalunya, Sagrada Família 7, 08500 Vic, Spain
- Guadalupe Gómez Melis
- Departament d'Estadística i Investigació Operativa, Universitat Politècnica de Catalunya, Jordi Girona 1-3, 08034 Barcelona, Spain
9
Abstract
BACKGROUND The data from immuno-oncology (IO) therapy trials often show delayed effects, cure rate, crossing hazards, or some mixture of these phenomena. Thus, the proportional hazards (PH) assumption is often violated such that the commonly used log-rank test can be very underpowered. In these trials, the conventional hazard ratio for describing the treatment effect may not be a good estimand due to the lack of an easily understandable interpretation. To overcome this challenge, restricted mean survival time (RMST) has been strongly recommended for survival analysis in clinical literature due to its independence of the PH assumption as well as a more clinically meaningful interpretation. The RMST also aligns well with the estimand associated with the analysis from the recommendation in ICH E-9 (R1), and the test/estimation coherency. Currently, the Kaplan-Meier (KM) curve is commonly applied to RMST-related analyses. Due to some drawbacks of the KM approach, such as the limitation in extrapolating to time points beyond the follow-up time and the large variance at time points with small numbers of events, the use of RMST may be hindered. METHODS The dynamic RMST curve using a mixture model is proposed in this paper to fully enhance the RMST method for survival analysis in clinical trials. It is constructed so that the RMST difference or ratio is computed over a range of values of the restriction time τ, which traces out an evolving treatment effect profile over time. RESULTS This new dynamic RMST curve overcomes the drawbacks of the KM approach. The good performance of this proposal is illustrated through three real examples. CONCLUSIONS The RMST provides a clinically meaningful and easily interpretable measure for survival clinical trials. The proposed dynamic RMST approach provides a useful tool for assessing treatment effect over different time frames for survival clinical trials. This dynamic RMST curve also allows one to check whether the follow-up time for a study is long enough to demonstrate a treatment difference. The prediction feature of the dynamic RMST analysis may be used for determining an appropriate time point for an interim analysis, and the data monitoring committee (DMC) can use this evaluation tool for study recommendation.
Affiliation(s)
- G Frank Liu
- Merck & Co., Inc, North Wales, PA, 19454, USA
- Wen-Chi Wu
- Merck & Co., Inc, North Wales, PA, 19454, USA
10
Ter Wee MM, Lissenberg-Witte BI. Biostatistics in Cardiovascular Research with Emphasis on Sex-Related Aspects. Adv Exp Med Biol 2018; 1065:71-92. [PMID: 30051378 DOI: 10.1007/978-3-319-77932-4_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
Research on sex differences related to cardiovascular dysfunction has become a topic of interest in the last decade. Although scientific research has been carried out since ancient times, we may still struggle to perform scientific research in the way that best achieves high-quality data and solid conclusions. In this chapter, every step of scientific research is explained: from formulating the research question and hypotheses to analyzing the collected data to interpreting and reporting the results. Several fundamental biostatistical techniques, such as the independent samples t-test, the chi-square test, the log-rank test, and different regression models, are presented. In addition, methods that can deal with variables influencing the association of interest are discussed. All examples are focused on investigating sex differences in cardiac outcomes, but this chapter is written in such a way that it easily translates to other fields of medical research on every disease or health state.
11
Abstract
Length of time is a variable often encountered during data analysis. Survival analysis provides simple, intuitive results concerning time-to-event for events of interest, which are not confined to death. This review introduces methods of analyzing time-to-event. The Kaplan-Meier survival analysis, log-rank test, and Cox proportional hazards regression modeling method are described with examples of hypothetical data.
Affiliation(s)
- Junyong In
- Department of Anesthesiology and Pain Medicine, Dongguk University Ilsan Hospital, Goyang, Korea
- Dong Kyu Lee
- Guro Hospital, Korea University School of Medicine, Seoul, Korea. Corresponding author: Dong Kyu Lee, M.D., Ph.D., Department of Anesthesiology and Pain Medicine, Guro Hospital, Korea University School of Medicine, 148, Gurodong-ro, Guro-gu, Seoul 08308, Korea. Tel: 82-2-2626-3237, Fax: 82-2-2626-1438
12
Abstract
Background Survival analysis methods have been widely applied in different areas of health and medicine, spanning varying events of interest and target diseases. They can be utilized to provide relationships between the survival time of individuals and factors of interest, rendering them useful in searching for biomarkers in diseases such as cancer. However, some disease progression can be very unpredictable because the conventional approaches have failed to consider multiple-marker interactions. An exponential increase in the number of candidate markers requires a large correction factor in the multiple-testing correction and hides the significance. Methods We address the issue of testing marker combinations that affect survival by adapting the recently developed Limitless Arity Multiple-testing Procedure (LAMP), a p-value correction technique for statistical tests for combinations of markers. LAMP cannot handle survival data statistics, and hence we extended LAMP for the log-rank test, making it more appropriate for clinical data, with a newly introduced theoretical lower bound of the p-value. Results We applied the proposed method to gene combination detection for cancer and obtained gene interactions with statistically significant log-rank p-values. Gene combinations with orders of up to 32 genes were detected by our algorithm, and effects of some genes in these combinations are also supported by existing literature. Conclusion The novel approach for detecting prognostic markers presented here can identify statistically significant markers with no limitations on the order of interaction. Furthermore, it can be applied to different types of genomic data, provided that binarization is possible.
Affiliation(s)
- Raissa T Relator
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology, 2-4-7 Aomi, Koto-ku, Tokyo, 135-0064, Japan
- Aika Terada
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology, 2-4-7 Aomi, Koto-ku, Tokyo, 135-0064, Japan; PRESTO, Japan Science and Technology Agency, 4-1-8 Honcho, Kawaguchi, Saitama, 332-0012, Japan; Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5, Kashiwanoha, Kashiwa, Chiba, 277-8561, Japan
- Jun Sese
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology, 2-4-7 Aomi, Koto-ku, Tokyo, 135-0064, Japan; AIST-Tokyo Tech Real World Big-Data Computation Open Innovation Laboratory (RWBC-OIL), 2-12-1 Okayama, Meguro-ku, Tokyo, 152-8550, Japan
13
Abstract
We investigate the survival distribution of the patients who have survived over a certain time period. This is called a conditional survival distribution. In this paper, we show that one-sample estimation, two-sample comparison and regression analysis of conditional survival distributions can be conducted using the regular methods for unconditional survival distributions that are provided by the standard statistical software, such as SAS and SPSS. We conduct extensive simulations to evaluate the finite sample property of these conditional survival analysis methods. We illustrate these methods with real clinical data.
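For reference, the conditional survival distribution this abstract refers to is commonly written as follows. The notation is an assumption made here (T for the survival time, S for the unconditional survival function, s for the time already survived, t for additional time); the paper itself may parameterize it differently.

```latex
S(t \mid s) \;=\; \Pr(T > s + t \mid T > s) \;=\; \frac{S(s + t)}{S(s)},
\qquad t \ge 0,\; S(s) > 0 .
```

As a worked example with made-up numbers: if S(1) = 0.8 and S(3) = 0.6 (in years), a patient who has already survived 1 year has probability S(2 | 1) = 0.6 / 0.8 = 0.75 of surviving 2 more years.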
Affiliation(s)
- Sin-Ho Jung
- Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, USA
- Ho Yun Lee
- Department of Radiology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
- Shein-Chung Chow
- Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, USA
14
Abstract
In clinical trials survival endpoints are usually compared using the log-rank test. Sequential methods for the log-rank test and the Cox proportional hazards model are largely reported in the statistical literature. When the proportional hazards assumption is violated the hazard ratio is ill-defined and the power of the log-rank test depends on the distribution of the censoring times. The average hazard ratio was proposed as an alternative effect measure, which has a meaningful interpretation in the case of non-proportional hazards, and is equal to the hazard ratio, if the hazards are indeed proportional. In the present work we prove that the average hazard ratio based sequential test statistics are asymptotically multivariate normal with the independent increments property. This allows for the calculation of group-sequential boundaries using standard methods and existing software. The finite sample characteristics of the new method are examined in a simulation study in a proportional and a non-proportional hazards setting.
Affiliation(s)
- Matthias Brückner
- Competence Center for Clinical Trials, University of Bremen, Linzer Str. 4, 28359, Bremen, Germany
- Werner Brannath
- Competence Center for Clinical Trials, University of Bremen, Linzer Str. 4, 28359, Bremen, Germany
15
Abstract
Background Survival analysis is an important element of reasoning from data. Applied in a number of fields, it has become particularly useful in medicine to estimate the survival rate of patients on the basis of their condition, examination results, and ongoing treatment. Recent developments in next-generation sequencing open new opportunities in survival studies, as they allow a vast number of genome-, transcriptome-, and proteome-related features to be investigated. These include single nucleotide and structural variants, expressions of genes and microRNAs, DNA methylation, and many others. Results We present LR-Rules, a new algorithm for rule induction from survival data. It works according to the separate-and-conquer heuristic, using the log-rank test to establish the rule body. Extensive experiments show LR-Rules to generate models of superior accuracy and comprehensibility. The detailed analysis of rules rendered by the presented algorithm on four medical datasets concerning leukemia as well as breast, lung, and thyroid cancers reveals the ability to discover true relations between attributes and patients' survival rate. Two of the case studies incorporate features obtained with high-throughput technologies, showing the usability of the algorithm in the analysis of bioinformatics data. Conclusions LR-Rules is a viable alternative to existing approaches to survival analysis, particularly when the interpretability of a resulting model is crucial. The presented algorithm may be especially useful when applied to genomic and proteomic data, as it may contribute to a better understanding of the background of diseases and support their treatment. Electronic supplementary material The online version of this article (doi:10.1186/s12859-017-1693-x) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Łukasz Wróbel
- Institute of Informatics, Silesian Univ. of Technology, Akademicka 16, Gliwice, 44-100, Poland
- Adam Gudyś
- Institute of Informatics, Silesian Univ. of Technology, Akademicka 16, Gliwice, 44-100, Poland
- Marek Sikora
- Institute of Innovative Technologies, EMAG, Leopolda 31, Katowice, 40-189, Poland
16
Abstract
Background Relative survival analysis is a subfield of survival analysis where competing risks data are observed, but the causes of death are unknown. A first step in the analysis of such data is usually the estimation of a net survival curve, possibly followed by regression modelling. Recently, a log-rank type test for comparison of net survival curves has been introduced, and the goal of this paper is to explore its properties and put this methodological advance into the context of the field. Methods We build on the association between the log-rank test and the univariate or stratified Cox model and show the analogy in the relative survival setting. We study the properties of the methods using both theoretical arguments and simulations. We provide an R function to enable practical usage of the log-rank type test. Results Both the log-rank type test and its model alternatives perform satisfactorily under the null, even if the correlation between their p-values is rather low, implying that both approaches cannot be used simultaneously. The stratified version has a higher power in case of non-homogeneous hazards, but also carries a different interpretation. Conclusions The log-rank type test and its stratified version can be interpreted in the same way as the results of an analogous semi-parametric additive regression model, despite the fact that no direct theoretical link can be established between the test statistics. Electronic supplementary material The online version of this article (doi:10.1186/s12874-017-0351-3) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Klemen Pavlič: University of Ljubljana, Faculty of Medicine, Institute for Biostatistics and Medical Informatics, Vrazov trg 2, Ljubljana, 1000, Slovenia
- Maja Pohar Perme: University of Ljubljana, Faculty of Medicine, Institute for Biostatistics and Medical Informatics, Vrazov trg 2, Ljubljana, 1000, Slovenia
|
17
|
Beisel C, Benner A, Kunz C, Kopp-Schneider A. Heterogeneous treatment effects in stratified clinical trials with time-to-event endpoints. Biom J 2017; 59:511-530. [PMID: 28263395 DOI: 10.1002/bimj.201600047] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 02/26/2016] [Revised: 10/18/2016] [Accepted: 12/20/2016] [Indexed: 11/08/2022]
Abstract
When analyzing clinical trials with a stratified population, homogeneity of treatment effects is a common assumption in survival analysis. However, in the context of recent developments in clinical trial design, which aim to test multiple targeted therapies in corresponding subpopulations simultaneously, the assumption that there is no treatment-by-stratum interaction seems inappropriate. It becomes an issue if the expected sample size of the strata makes it unfeasible to analyze the trial arms individually. Alternatively, one might choose as primary aim to prove efficacy of the overall (targeted) treatment strategy. When testing for the overall treatment effect, a violation of the no-interaction assumption renders it necessary to deviate from standard methods that rely on this assumption. We investigate the performance of different methods for sample size calculation and data analysis under heterogeneous treatment effects. The commonly used sample size formula by Schoenfeld is compared to another formula by Lachin and Foulkes, and to an extension of Schoenfeld's formula allowing for stratification. Beyond the widely used (stratified) Cox model, we explore the lognormal shared frailty model, and a two-step analysis approach as potential alternatives that attempt to adjust for interstrata heterogeneity. We carry out a simulation study for a trial with three strata and violations of the no-interaction assumption. The extension of Schoenfeld's formula to heterogeneous strata effects provides the most reliable sample size with respect to desired versus actual power. The two-step analysis and frailty model prove to be more robust against loss of power caused by heterogeneous treatment effects than the stratified Cox model and should be preferred in such situations.
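As a rough illustration of the Schoenfeld formula mentioned in this abstract, the sketch below computes the required number of events for a two-sided log-rank test in a single, unstratified population. It is not the stratified extension that the authors evaluate; the function name `schoenfeld_events`, its parameters, and the example hazard ratio are assumptions chosen for illustration.

```python
from math import ceil, log
from statistics import NormalDist


def schoenfeld_events(hazard_ratio, alpha=0.05, power=0.80, alloc=0.5):
    """Approximate number of events needed by a two-sided log-rank test.

    Schoenfeld approximation: d = (z_{1-alpha/2} + z_{power})^2
                                  / (p * (1 - p) * log(HR)^2),
    where p is the fraction of patients allocated to one arm.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return (z_alpha + z_beta) ** 2 / (alloc * (1 - alloc) * log(hazard_ratio) ** 2)


# Hypothetical design: HR = 0.75, 80% power, 1:1 allocation.
events = schoenfeld_events(0.75)
print(ceil(events))  # round up to a whole number of events
```

The formula depends on the hazard ratio only through its logarithm, so a hazard ratio of 0.75 and its reciprocal require the same number of events. Under heterogeneous stratum effects a single hazard ratio no longer describes the trial, which is the situation the abstract addresses.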
Affiliation(s)
- Christina Beisel: Department of Biostatistics, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 280, D-69120, Heidelberg, Germany
- Axel Benner: Department of Biostatistics, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 280, D-69120, Heidelberg, Germany
- Christina Kunz: Department of Biostatistics, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 280, D-69120, Heidelberg, Germany
- Annette Kopp-Schneider: Department of Biostatistics, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 280, D-69120, Heidelberg, Germany
|
18
|
Lin DY, Dai L, Cheng G, Sailer MO. On confidence intervals for the hazard ratio in randomized clinical trials. Biometrics 2016; 72:1098-1102. [PMID: 27123760 DOI: 10.1111/biom.12528] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 09/01/2015] [Revised: 02/01/2016] [Accepted: 03/01/2016] [Indexed: 11/30/2022]
Abstract
The log-rank test is widely used to compare two survival distributions in a randomized clinical trial, while partial likelihood (Cox, 1975) is the method of choice for making inference about the hazard ratio under the Cox (1972) proportional hazards model. The Wald 95% confidence interval of the hazard ratio may include the null value of 1 when the p-value of the log-rank test is less than 0.05. Peto et al. (1977) provided an estimator for the hazard ratio based on the log-rank statistic; the corresponding 95% confidence interval excludes the null value of 1 if and only if the p-value of the log-rank test is less than 0.05. However, Peto's estimator is not consistent, and the corresponding confidence interval does not have correct coverage probability. In this article, we construct the confidence interval by inverting the score test under the (possibly stratified) Cox model, and we modify the variance estimator such that the resulting score test for the null hypothesis of no treatment difference is identical to the log-rank test in the possible presence of ties. Like Peto's method, the proposed confidence interval excludes the null value if and only if the log-rank test is significant. Unlike Peto's method, however, this interval has correct coverage probability. An added benefit of the proposed confidence interval is that it tends to be more accurate and narrower than the Wald confidence interval. We demonstrate the advantages of the proposed method through extensive simulation studies and a colon cancer study.
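The link between Peto's estimator and the log-rank test described in this abstract can be sketched as follows. This is Peto's one-step estimator, not the score-based interval the authors propose: with O observed and E expected events in the treatment arm and log-rank variance V, the estimate is exp((O - E)/V) with standard error 1/sqrt(V) on the log scale. The function names and the input numbers are hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist


def logrank_z(observed, expected, variance):
    """Standardized log-rank statistic (O - E) / sqrt(V)."""
    return (observed - expected) / sqrt(variance)


def peto_hazard_ratio(observed, expected, variance, level=0.95):
    """Peto's one-step hazard ratio estimate with its confidence interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    log_hr = (observed - expected) / variance
    half_width = z / sqrt(variance)  # standard error on the log scale is 1/sqrt(V)
    return exp(log_hr), exp(log_hr - half_width), exp(log_hr + half_width)


# Hypothetical summary statistics for the treatment arm.
O, E, V = 30, 40.0, 17.5
hr, lower, upper = peto_hazard_ratio(O, E, V)
print(f"HR = {hr:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
print(f"log-rank Z = {logrank_z(O, E, V):.3f}")
```

The interval excludes 1 exactly when |O - E|/V exceeds z/sqrt(V), which is the same condition as |Z| > z. That algebraic identity is why Peto's interval always agrees with the log-rank test, although, as the abstract notes, the estimator itself is not consistent.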
Affiliation(s)
- Dan-Yu Lin: Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina 27599, U.S.A.
- Luyan Dai: Boehringer Ingelheim Investment Co., Ltd., 1601 Nanjing Road West, Shanghai 200040, P.R. China
- Gang Cheng: Boehringer Ingelheim Investment Co., Ltd., 1601 Nanjing Road West, Shanghai 200040, P.R. China
- Martin Oliver Sailer: Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorfer Strasse 65, 88397 Biberach an der Riss, Germany
|
19
|
Abstract
This article develops joint inferential methods for the cause-specific hazard function and the cumulative incidence function of a specific type of failure to assess the effects of a variable on the time to the type of failure of interest in the presence of competing risks. Joint inference for the two functions is needed in practice because (i) they describe different characteristics of a given type of failure, (ii) they do not uniquely determine each other, and (iii) the effects of a variable on the two functions can be different and one often does not know which effects are to be expected. We study both the group comparison problem and the regression problem. We also discuss joint inference for other related functions. Our simulation shows that our joint tests can be considerably more powerful than the Bonferroni method, which has important practical implications for the analysis and design of clinical studies with competing risks data. We illustrate our method using a Hodgkin disease dataset and a lymphoma dataset. Supplementary materials for this article are available online.
Affiliation(s)
- Gang Li: Department of Biostatistics, University of California Los Angeles, Los Angeles, CA, USA
- Qing Yang: School of Nursing, Duke University, Durham, NC, USA
|
20
|
Tomeczkowski J, Lange A, Güntert A, Thilakarathne P, Diels J, Xiu L, De Porre P, Tapprich C. Converging or Crossing Curves: Untie the Gordian Knot or Cut it? Appropriate Statistics for Non-Proportional Hazards in Decitabine DACO-016 Study (AML). Adv Ther 2015; 32:854-62. [PMID: 26369324 PMCID: PMC4604504 DOI: 10.1007/s12325-015-0238-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 07/20/2015] [Indexed: 11/17/2022]
Abstract
Introduction Among patients with acute myeloid leukemia (AML), the DACO-016 randomized study showed reduction in mortality for decitabine [Dacogen® (DAC), Eisai Inc., Woodcliff Lake, NJ, USA] compared with treatment choice (TC): at primary analysis the hazard ratio (HR) was 0.85 (95% confidence interval 0.69–1.04; stratified log-rank P = 0.108). With two interim analyses, two-sided alpha was adjusted to 0.0462. With 1-year additional follow-up the HR reached 0.82 (nominal P = 0.0373). These data resulted in approval of DAC in the European Union, though not in the United States. Though pre-specified, the log-rank test could be considered not optimal to assess the observed survival difference because of the non-proportional hazard nature of the survival curves. Methods We applied the Wilcoxon test as a sensitivity analysis. Patients were randomized to DAC (N = 242) or TC (N = 243). One-hundred and eight (44.4%) patients in the TC arm and 91 (37.6%) patients in the DAC arm selectively crossed over to subsequent disease modifying therapies at progression, which might impact the survival beyond the median with resultant converging curves (and disproportional hazards). Results The stratified Wilcoxon test showed a significant improvement in median (CI 95%) overall survival with DAC [7.7 (6.2; 9.2) months] versus TC [5.0 (4.3; 6.3) months; P = 0.0458]. Conclusion Wilcoxon test indicated significant increase in survival for DAC versus TC compared to log-rank test. Funding Janssen-Cilag GmbH. Electronic supplementary material The online version of this article (doi:10.1007/s12325-015-0238-9) contains supplementary material, which is available to authorized users.
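To illustrate why a Wilcoxon-type test can react differently from the log-rank test when survival curves converge, the sketch below implements a two-sample weighted log-rank statistic. With weight 1 at every event time it is the ordinary log-rank test; with weight equal to the number at risk it is the Gehan-Breslow generalized Wilcoxon test, which emphasizes early differences. The abstract does not state which Wilcoxon variant was used, and the analysis there was stratified, so this unstratified version, the function name, and the toy data are assumptions for illustration only.

```python
from math import sqrt


def weighted_logrank(times, events, groups, weight="logrank"):
    """Two-sample weighted log-rank Z statistic for group 1 versus group 0.

    times:  follow-up times
    events: 1 = event observed, 0 = right-censored
    groups: 0 or 1
    weight: "logrank" (w = 1) or "gehan" (w = number at risk)
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    numerator, variance = 0.0, 0.0
    for t in event_times:
        n = sum(1 for s in times if s >= t)  # at risk overall
        n1 = sum(1 for s, g in zip(times, groups) if s >= t and g == 1)
        d = sum(1 for s, e in zip(times, events) if s == t and e == 1)
        d1 = sum(1 for s, e, g in zip(times, events, groups)
                 if s == t and e == 1 and g == 1)
        w = n if weight == "gehan" else 1.0
        numerator += w * (d1 - d * n1 / n)  # observed minus expected
        if n > 1:  # hypergeometric variance term
            variance += w ** 2 * d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return numerator / sqrt(variance)


# Hypothetical toy data: group 1 has most of its events early.
times = [2, 3, 4, 5, 8, 9, 10, 12, 14, 15]
events = [1, 1, 1, 1, 0, 1, 1, 0, 1, 1]
groups = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
print("log-rank Z:", weighted_logrank(times, events, groups))
print("Gehan Z:   ", weighted_logrank(times, events, groups, weight="gehan"))
```

Because the risk set shrinks over time, the Gehan weights down-weight late event times, where crossover to subsequent therapies may have pulled the curves together.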
|
21
|
Mangili F, Benavoli A, de Campos CP, Zaffalon M. Reliable survival analysis based on the Dirichlet process. Biom J 2015; 57:1002-19. [PMID: 26296502 DOI: 10.1002/bimj.201500062] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 04/24/2015] [Revised: 04/24/2015] [Accepted: 05/12/2015] [Indexed: 11/10/2022]
Abstract
We present a robust Dirichlet process for estimating survival functions from samples with right-censored data. It adopts a prior near-ignorance approach to avoid almost any assumption about the distribution of the population lifetimes, as well as the need of eliciting an infinite dimensional parameter (in case of lack of prior information), as it happens with the usual Dirichlet process prior. We show how such model can be used to derive robust inferences from right-censored lifetime data. Robustness is due to the identification of the decisions that are prior-dependent, and can be interpreted as an analysis of sensitivity with respect to the hypothetical inclusion of fictitious new samples in the data. In particular, we derive a nonparametric estimator of the survival probability and a hypothesis test about the probability that the lifetime of an individual from one population is shorter than the lifetime of an individual from another. We evaluate these ideas on simulated data and on the Australian AIDS survival dataset. The methods are publicly available through an easy-to-use R package.
|
22
|
Abstract
Two parametric tests are proposed for designing randomized two-arm phase III survival trials under the Weibull model. The properties of the two parametric tests are compared with the nonparametric log-rank test through simulation studies. Power and sample size formulas of the two parametric tests are derived. The sensitivity of sample size under misspecification of the Weibull shape parameter is also investigated. The study can be designed by planning the study duration and handling nonuniform entry and loss to follow-up under the Weibull model using either the proposed parametric tests or the well-known nonparametric log-rank test.
Affiliation(s)
- Jianrong Wu: Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
|