26
|
Claesen J, Valkenborg D, Burzykowski T. Predicting the number of sulfur atoms in peptides and small proteins based on the observed aggregated isotope distribution. RAPID COMMUNICATIONS IN MASS SPECTROMETRY : RCM 2021; 35:e9162. [PMID: 34240492 PMCID: PMC8459233 DOI: 10.1002/rcm.9162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2020] [Revised: 07/06/2021] [Accepted: 07/06/2021] [Indexed: 06/13/2023]
Abstract
RATIONALE Identification of peptides and proteins is a challenging task in mass spectrometry-based proteomics. Knowledge of the number of sulfur atoms can improve the identification of peptides and proteins. METHODS In this article, we propose a method for the prediction of the number of S-atoms based on the aggregated isotope distribution. The Mahalanobis distance is used as a dissimilarity measure to compare mass- and intensity-based features from the observed and theoretical isotope distributions. RESULTS The relative abundance of the second and the third aggregated isotopic variants (as compared to the monoisotopic one) and the mass difference between the second and third aggregated isotopic variants are the most important features to predict the number of S-atoms. CONCLUSIONS The mass and intensity accuracies of the observed aggregated isotopic variants are insufficient to accurately predict the number of S-atoms. However, using a limited set of predictions for a peptide, rather than predicting a single number of S-atoms, yields a reasonably high prediction accuracy.
|
27
|
Burzykowski T. Semi-parametric accelerated failure-time model: A useful alternative to the proportional-hazards model in cancer clinical trials. Pharm Stat 2021; 21:292-308. [PMID: 34553482 DOI: 10.1002/pst.2169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2021] [Revised: 07/14/2021] [Accepted: 08/25/2021] [Indexed: 11/11/2022]
Abstract
The accelerated failure-time (AFT) model has long been recognized as a useful alternative to the proportional-hazards (PH) model. The semi-parametric AFT model has been known since 1981, but its use has been hampered by the difficulty in solving the estimating equations for the model's coefficients. In recent years, however, important developments have taken place regarding the methods of solving the equations. In this article, we briefly review the developments, focusing mainly on rank-based estimation. We conduct a simulation study that directly focuses on the applicability of the model in the context of (cancer) clinical trials. We also investigate the robustness of the AFT model to the omission of covariates. Finally, we conduct a meta-analysis of multiple clinical trials in gastric cancer to illustrate the benefits of the use of the model in practice.
|
28
|
Garcia Barrado L, Burzykowski T, Legrand C, Buyse M. Using an interim analysis based exclusively on an early outcome in a randomized clinical trial with a long-term clinical endpoint. Pharm Stat 2021; 21:209-219. [PMID: 34505395 DOI: 10.1002/pst.2165] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2021] [Revised: 06/04/2021] [Accepted: 07/15/2021] [Indexed: 11/10/2022]
Abstract
In randomized clinical trials (RCTs) with an interest in a long-term efficacy endpoint, the follow-up time necessary to observe the endpoint may be substantial. In order to reduce the expected duration of such trials, early-outcome data may be collected to enrich an interim analysis aimed at stopping the trial early for efficacy. We propose to extend such a design with an additional interim analysis using solely early-outcome data in order to expedite the evaluation of the treatment's efficacy. We evaluate the potential gain in operating characteristics (power, expected trial duration, and expected sample size) when introducing such an early interim analysis, as a function of the properties of the early outcome as a surrogate for the long-term endpoint. In the context of a longitudinal age-related macular degeneration (ARMD) ophthalmology trial, results show potentially substantial gains in both the expected trial duration and the expected sample size. A prerequisite, though, is that the treatment effect on the early outcome has to be strongly correlated with the treatment effect on the long-term endpoint, that is, that the early outcome is a validated surrogate for the long-term endpoint.
|
29
|
Deltuvaite-Thomas V, Burzykowski T. Operational characteristics of generalized pairwise comparisons for hierarchically ordered endpoints. Pharm Stat 2021; 21:122-132. [PMID: 34346169 DOI: 10.1002/pst.2156] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2020] [Revised: 05/10/2021] [Accepted: 07/12/2021] [Indexed: 11/10/2022]
Abstract
The method of generalized pairwise comparisons (GPC) is a multivariate extension of the well-known non-parametric Wilcoxon-Mann-Whitney test. It allows comparing two groups of observations based on multiple hierarchically ordered endpoints, regardless of the number or type of the latter. The summary measure, "net benefit," quantifies the difference between the probabilities that a random observation from one group is doing better than an observation from the opposite group. The method takes into account the correlations between the endpoints. We have performed a simulation study for the case of two hierarchical endpoints to evaluate the impact of their correlation on the type-I error probability and power of the test based on GPC. The simulations show that the power of the GPC test for the primary endpoint is modified if the secondary endpoint is included in the hierarchical GPC analysis. The change in power depends on the correlation between the endpoints. Interestingly, a decrease in power can occur, regardless of whether there is any marginal treatment effect on the secondary endpoint. It appears that the overall power of the hierarchical GPC procedure depends, in a complex manner, on the entire variance-covariance structure of the set of outcomes. Any additional factors (such as thresholds of clinical relevance, drop out, or censoring scheme) will also affect the power and will have to be taken into account when designing a trial based on the hierarchical GPC procedure.
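The hierarchical comparison described in this abstract can be illustrated with a minimal sketch. It assumes completely observed (uncensored) outcomes for which larger values are better; the function names, the toy data, and the per-endpoint thresholds are hypothetical and are not taken from the article.

```python
def compare(a, b, threshold=0.0):
    """Return +1 if a beats b by more than threshold, -1 if b beats a, else 0."""
    if a - b > threshold:
        return 1
    if b - a > threshold:
        return -1
    return 0


def gpc_net_benefit(group_t, group_c, thresholds):
    """Net benefit = (wins - losses) / number of pairs.

    Each observation is a tuple of endpoints ordered by priority.
    A pair is decided by the first endpoint that is not tied.
    """
    wins = losses = 0
    for t in group_t:
        for c in group_c:
            for k, thr in enumerate(thresholds):
                result = compare(t[k], c[k], thr)
                if result != 0:
                    wins += result > 0
                    losses += result < 0
                    break  # lower-priority endpoints are ignored
    return (wins - losses) / (len(group_t) * len(group_c))


# Toy data (hypothetical): (primary endpoint, secondary endpoint)
treated = [(10, 3), (8, 5)]
control = [(8, 4), (12, 1)]

nb_primary = gpc_net_benefit(treated, control, thresholds=(0,))
nb_hierarchical = gpc_net_benefit(treated, control, thresholds=(0, 0))
```

In this toy example the pair tied on the primary endpoint is resolved by the secondary endpoint, so the hierarchical net benefit differs from the one based on the primary endpoint alone.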
|
30
|
Plakwicz P, Andreasen JO, Górska R, Burzykowski T, Czochrowska E. Status of the alveolar bone after autotransplantation of developing premolars to the anterior maxilla assessed by CBCT measurements. Dent Traumatol 2021; 37:691-698. [PMID: 33942473 PMCID: PMC8453749 DOI: 10.1111/edt.12680] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 02/25/2021] [Revised: 03/23/2021] [Accepted: 03/24/2021] [Indexed: 11/30/2022]
Abstract
Background/Aims Autotransplantation of developing premolars is an established treatment to replace missing teeth in the anterior maxilla in growing patients with a reported success rate of over 90%. The normal shape of the alveolus is observed after transplantation, but data on the presence and amount of alveolar bone after healing has not been previously reported. The aim of this study was to look for potential differences in alveolar bone dimensions between sites where autotransplanted premolars replaced missing incisors and control sites of contralateral incisors. Material/Methods There were 11 patients aged between 10 and 12 years five months (mean age: 10 years and 7 months) who underwent autotransplantation of a premolar to replace a central incisor. Cone Beam Computed Tomography (CBCT) performed at least 1 year after transplantation served to evaluate bone at sites of autotransplanted premolars and controls (contralateral maxillary central incisor). The thickness of the labial bone, plus the height and width of the alveolar process were measured on scans and compared at transplant and control sites. Results Mean thicknesses of the labial bone at the transplant and control sites were 0.78 mm and 0.82 mm respectively. Mean alveolar bone height was 15.15 mm at the transplant sites and 15.12 mm at the control sites. The mean marginal thickness of the alveolus was 7.75 mm at the transplant sites and 7.98 mm at the control sites. Mean thicknesses of the alveolus for half of its vertical dimension at the transplant and control sites were 7.54 mm and 8.03 mm, respectively. Conclusion The mean values of bone thickness, width and height of the alveolar process at sites of transplanted premolars were comparable to the mean values for the control incisors. Successful autotransplantation of developing premolars to replace missing central incisors allowed preservation of alveolar bone in the anterior maxilla.
|
31
|
Verbeeck J, Deltuvaite-Thomas V, Berckmoes B, Burzykowski T, Aerts M, Thas O, Buyse M, Molenberghs G. Unbiasedness and efficiency of non-parametric and UMVUE estimators of the probabilistic index and related statistics. Stat Methods Med Res 2020; 30:747-768. [PMID: 33256560 DOI: 10.1177/0962280220966629] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 11/17/2022]
Abstract
In reliability theory, diagnostic accuracy, and clinical trials, the quantity P(X>Y)+1/2P(X=Y), also known as the Probabilistic Index (PI), is a common treatment effect measure when comparing two groups of observations. The quantity P(X>Y)-P(Y>X), a linear transformation of PI known as the net benefit, has also been advocated as an intuitively appealing treatment effect measure. Parametric estimation of PI has received a lot of attention in the past 40 years, with the formulation of the Uniformly Minimum-Variance Unbiased Estimator (UMVUE) for many distributions. However, the non-parametric Mann-Whitney estimator of the PI is also known to be UMVUE in some situations. To understand this seeming contradiction, in this paper a systematic comparison is performed between the non-parametric estimator for the PI and parametric UMVUE estimators in various settings. We show that the Mann-Whitney estimator is always an unbiased estimator of the PI with univariate, completely observed data, while the parametric UMVUE is not when the distribution is misspecified. Additionally, the Mann-Whitney estimator is the UMVUE when observations belong to an unrestricted family. When observations come from a more restrictive family of distributions, the loss in efficiency for the non-parametric estimator is limited in realistic clinical scenarios. In conclusion, the Mann-Whitney estimator is simple to use and is a reliable estimator for the PI and net benefit in realistic clinical scenarios.
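As a small illustration of the two quantities defined in this abstract, the sketch below computes the Mann-Whitney estimate of the PI and the corresponding net benefit for univariate, completely observed data. The function names and the toy data are hypothetical; the relation net benefit = 2 PI - 1 follows directly from the two definitions.

```python
from itertools import product


def probabilistic_index(x, y):
    """Mann-Whitney estimate of P(X > Y) + 0.5 * P(X = Y)."""
    wins = sum(1 for a, b in product(x, y) if a > b)
    ties = sum(1 for a, b in product(x, y) if a == b)
    return (wins + 0.5 * ties) / (len(x) * len(y))


def net_benefit(x, y):
    """Estimate of P(X > Y) - P(Y > X), computed as 2 * PI - 1."""
    return 2.0 * probabilistic_index(x, y) - 1.0


# Toy data (hypothetical)
treated = [5, 7, 8, 9]
control = [4, 7, 6]

pi_hat = probabilistic_index(treated, control)
nb_hat = net_benefit(treated, control)
```

Of the 12 pairs, 9 favor the treated group, 2 favor the control group, and 1 is tied, so the estimates are 9.5/12 for the PI and 7/12 for the net benefit.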
|
32
|
Garcia Barrado L, Burzykowski T. Bayesian biomarker-driven outcome-adaptive randomization with an imperfect biomarker assay. Clin Trials 2020; 18:137-146. [PMID: 33231131 DOI: 10.1177/1740774520964202] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
OBJECTIVE We investigate the impact of the biomarker assay's accuracy on the operating characteristics of a Bayesian biomarker-driven outcome-adaptive randomization design. METHODS In a simulation study, we assume a trial with two treatments, two biomarker-based strata, and a binary clinical outcome (response). Pbt denotes the probability of response for treatment t (t = 0 or 1) in biomarker stratum b (b = 0 or 1). Four different scenarios in terms of true underlying response probabilities are considered: a null (P00 = P01 = 0.25, P10 = P11 = 0.25) and a consistent (P00 = P10 = 0.25, P01 = P11 = 0.5) treatment effect scenario, as well as a quantitative (P00 = P01 = P10 = 0.25, P11 = 0.5) and a qualitative (P00 = P11 = 0.5, P01 = P10 = 0.25) stratum-treatment interaction. For each scenario, we compare the case of a perfect biomarker assay with the case of an imperfect assay with sensitivity and specificity of 0.8 and 0.7, respectively. In addition, biomarker-positive prevalence values P(B = 1) = 0.2 and 0.5 are investigated. RESULTS Results show that the use of an imperfect assay affects the operating characteristics of the Bayesian biomarker-based outcome-adaptive randomization design. In particular, the misclassification causes a substantial reduction in power accompanied by a considerable increase in the type-I error probability. The magnitude of these effects depends on the sensitivity and specificity of the assay, as well as on the distribution of the biomarker in the patient population. CONCLUSION With an imperfect biomarker assay, the decision to apply a biomarker-based outcome-adaptive randomization design may require careful reflection.
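One way to see why misclassification matters is to work out, by Bayes' rule, how an imperfect assay mixes the true strata. The sketch below is not the authors' simulation; it is a hypothetical illustration that assumes misclassification is unrelated to the outcome given the true biomarker status, and it uses the sensitivity (0.8), specificity (0.7), and prevalence (0.2) quoted in the abstract together with the response probabilities 0.25 and 0.5 of the quantitative-interaction scenario for treatment 1.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) of the assay by Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


def observed_response(p_true_negative, p_true_positive, prob_truly_positive):
    """Response probability in an assay-defined stratum (mixture of true strata)."""
    return (prob_truly_positive * p_true_positive
            + (1.0 - prob_truly_positive) * p_true_negative)


ppv, npv = predictive_values(sensitivity=0.8, specificity=0.7, prevalence=0.2)

# Treatment 1: true response probability 0.25 (B = 0) and 0.5 (B = 1).
p_assay_positive = observed_response(0.25, 0.5, ppv)
p_assay_negative = observed_response(0.25, 0.5, 1.0 - npv)
```

Under these assumptions only 40% of assay-positive patients are truly biomarker-positive, so the response probability observed in the assay-positive stratum is 0.35 rather than 0.5, and the apparent difference between strata shrinks.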
|
33
|
Agten A, Van Houtven J, Askenazi M, Burzykowski T, Laukens K, Valkenborg D. Visualizing the agreement of peptide assignments between different search engines. JOURNAL OF MASS SPECTROMETRY : JMS 2020; 55:e4471. [PMID: 31713933 DOI: 10.1002/jms.4471] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/04/2019] [Revised: 10/23/2019] [Accepted: 10/28/2019] [Indexed: 06/10/2023]
Abstract
There is a trend in the analysis of shotgun proteomics data that aims to combine information from multiple search engines to increase the number of peptide annotations in an experiment. Typically, the degree of search engine complementarity and search engine agreement is visually illustrated by means of Venn diagrams that present the findings of a database search on the level of the nonredundant peptide annotations. We argue that this practice is not fit for purpose, since the diagrams do not take into account and often conceal the information on complementarity and agreement at the level of the spectrum identification. We promote a new type of visualization that provides insight on the peptide sequence agreement at the level of the peptide-spectrum match (PSM) as a measure of consensus between two search engines with nominal outcomes. We applied the visualizations and percentage sequence agreement to an in-house data set of our benchmark organism, Caenorhabditis elegans, and illustrated that when assessing the agreement between search engines, one should disentangle the notion of PSM confidence and PSM identity. The visualizations presented in this manuscript provide a more informative assessment of pairs of search engines and are made available as an R function in the Supporting Information.
|
34
|
Claesen J, Valkenborg D, Burzykowski T. De novo prediction of the elemental composition of peptides and proteins based on a single mass. JOURNAL OF MASS SPECTROMETRY : JMS 2020; 55:e4367. [PMID: 31035305 DOI: 10.1002/jms.4367] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/17/2018] [Revised: 03/06/2019] [Accepted: 04/24/2019] [Indexed: 06/09/2023]
Abstract
Identification of peptides and proteins is a common task in mass spectrometry-based proteomics but often fails to deliver a comprehensive list of identifications. Downstream analysis, quantitative or qualitative, depends on the outcome of this process. Despite continuous improvement of computational methods, a large fraction of the screened peptides and/or proteins remains unidentified. We introduce here pacMASS, a method that predicts de novo the elemental composition of peptides and small proteins based on a single accurate mass, i.e., the observed monoisotopic or average mass. This novel approach returns, in a fast and memory-efficient manner, a limited number of elemental compositions per queried peptide or protein.
|
35
|
Oba K, Paoletti X, Bang YJ, Bouché O, Ducreux M, Michiels S, Moehler MH, Morita S, Ohashi Y, Sakamoto J, Sasako M, Shitara K, Van Cutsem E, Buyse ME, Burzykowski T. Progression-free survival (PFS) as a surrogate endpoint for overall survival (OS) in advanced/recurrent gastric cancer (AGC) treatment: Individual-patient-data (IPD) based meta-analysis of randomized trials. J Clin Oncol 2020. [DOI: 10.1200/jco.2020.38.15_suppl.e16506] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]
Abstract
e16506 Background: In 2013, the GASTRIC (Global Advanced/Adjuvant Stomach Tumor Research through International Collaboration) group evaluated the surrogacy of PFS based on IPD of 4,069 patients from 20 randomized trials of AGC. Treatment effects on PFS and on OS were only moderately correlated, and we could not validate PFS as a surrogate endpoint for OS. More recent trials, with refined inclusion criteria and higher standards for evaluating progression, may allow for a more accurate estimate of the correlation. The 2nd round of GASTRIC sought to re-evaluate the surrogacy of PFS for OS in AGC. Methods: The GASTRIC database was updated with trials published after 2010 which used RECIST (Response Evaluation Criteria in Solid Tumors). Since the proportional hazards assumption was questionable for PFS, we primarily used the mean-time ratio as a treatment effect measure, estimated by using the log-logistic model. Using the meta-analytic approach, correlations between PFS and OS at the individual level (Rindiv), and between treatment effects on PFS and on OS at the trial level (Rtrial), were estimated using Spearman’s rank-correlation and estimation-error-adjusted regression, respectively. The surrogate threshold effect (STE) was estimated as well. Results: We analyzed data from 10,912 patients (1st round: 4,069 patients from 20 trials; 2nd round: 6,843 patients from 17 trials). Overall, moderate correlations were found at the individual level (Rindiv = 0.75, 95%CI = 0.75 to 0.76 in Hougaard copula) and at the trial level (Rtrial = 0.77, 95%CI = 0.32 to 1.00). The STE was equal to 1.29, i.e., observing a 29% increase in mean PFS time would predict a significant increase in OS time. In the subgroup of patients with measurable disease in the 2nd round dataset (4,866 patients), Rtrial was higher and equal to 0.93 (95%CI = 0.70 to 1.00), with STE equal to 1.21. These results were the same for 1st- and 2nd-line trials. Conclusions: The meta-analysis indicates a strong correlation between treatment effects (expressed as log-mean-ratios) on PFS and OS in patients with measurable disease.
|
36
|
Claesen J, Valkenborg D, Burzykowski T. The (generalized) hydrogen rule for organic molecules. JOURNAL OF MASS SPECTROMETRY : JMS 2020; 55:e4485. [PMID: 31814214 DOI: 10.1002/jms.4485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2019] [Revised: 11/27/2019] [Accepted: 12/02/2019] [Indexed: 06/10/2023]
|
37
|
Buyse M, Saad ED, Burzykowski T, Péron J. Assessing Treatment Benefit in Immuno-oncology. STATISTICS IN BIOSCIENCES 2020. [DOI: 10.1007/s12561-020-09268-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/11/2022]
|
38
|
Górczak K, Claesen J, Burzykowski T. A Conceptual Framework for Abundance Estimation of Genomic Targets in the Presence of Ambiguous Short Sequencing Reads. J Comput Biol 2020; 27:1232-1247. [PMID: 31895597 DOI: 10.1089/cmb.2019.0272] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022]
Abstract
RNA sequencing (RNA-seq) is widely used to study gene, transcript, or exon expression. To quantify the expression level, millions of short sequenced reads need to be mapped back to a reference genome or transcriptome. Read mapping makes it possible to find a location to which a read is identical or similar. Based upon this alignment, expression summaries, that is, read counts, are generated. However, reads may be matched to multiple locations. Such ambiguously mapped reads are often ignored in the analysis, which is a potential loss of information and may cause bias in expression estimation. We present the general principles underlying multiread allocation and unbiased estimation of the expression level of genes, exons, or transcripts in the presence of reads mapped to multiple locations. The underlying principles are derived from a theoretical concept that identifies important sources of information such as the number of uniquely mapped reads, the total target length, and the length of the shared target regions. We show with simulation studies that methods incorporating some or all of the aforementioned sources of information estimate the expression levels of genes, exons, and/or transcripts with a higher precision and accuracy than methods that do not use this information. We identify important sources of information that should be taken into account by methods that estimate the abundance of genes, exons, and/or transcripts to achieve good precision and accuracy.
|
39
|
Claesen J, Valkenborg D, Burzykowski T. A "Refined Hydrogen Rule" and a "Refined Hydrogen and Halogen Rule" for Organic Molecules. JOURNAL OF THE AMERICAN SOCIETY FOR MASS SPECTROMETRY 2020; 31:132-136. [PMID: 32881509 DOI: 10.1021/jasms.9b00064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Deriving chemical formulas of organic molecules, based on spectral information, with heuristic rules is a commonly recurring task. The computational effort and the potentially extensive list of candidate formulas put a strain on the downstream analysis. In this paper, we introduce a set of redefined heuristics based on the hydrogen and halogen rules that reduce the computational burden and the number of candidate formulas for organic molecules, such as peptides and lipids.
|
40
|
Burzykowski T, Coart E, Saad ED, Shi Q, Sommeijer DW, Bokemeyer C, Díaz-Rubio E, Douillard JY, Falcone A, Fuchs CS, Goldberg RM, Hecht JR, Hoff PM, Hurwitz H, Kabbinavar FF, Koopman M, Maughan TS, Punt CJA, Saltz L, Schmoll HJ, Seymour MT, Tebbutt NC, Tournigand C, Van Cutsem E, de Gramont A, Zalcberg JR, Buyse M. Evaluation of Continuous Tumor-Size-Based End Points as Surrogates for Overall Survival in Randomized Clinical Trials in Metastatic Colorectal Cancer. JAMA Netw Open 2019; 2:e1911750. [PMID: 31539075 PMCID: PMC6755539 DOI: 10.1001/jamanetworkopen.2019.11750] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 12/25/2022]
Abstract
IMPORTANCE Tumor measurements can be used to estimate time to nadir and depth of nadir as potential surrogates for overall survival (OS). OBJECTIVE To assess time to nadir and depth of nadir as surrogates for OS in metastatic colorectal cancer. DESIGN, SETTING, AND PARTICIPANTS Pooled analysis of 20 randomized clinical trials within the Aide et Recherche en Cancerologie Digestive database, which contains academic and industry-sponsored trials, was conducted. Three sets of comparisons were performed: chemotherapy alone, antiangiogenic agents, and anti-epidermal growth factor receptor agents in first-line treatment for patients with metastatic colorectal cancer. MAIN OUTCOMES AND MEASURES Surrogacy of time to nadir and depth of nadir was assessed at the trial level based on joint modeling of relative tumor-size change vs baseline and OS. Treatment effects on time to nadir and on depth of nadir were defined in terms of between-arm differences in time to nadir and in depth of nadir, and both were assessed in linear regressions for their correlation with treatment effects (hazard ratios) on OS within each set. The strengths of association were quantified using sample-size-weighted coefficients of determination (R2), with values closer to 1.00 indicating stronger association. At the patient level, the correlation was assessed between modeled relative tumor-size change and OS. RESULTS For 14 chemotherapy comparisons in 4289 patients, the R2 value was 0.63 (95% CI, 0.30-0.96) for the association between treatment effects on time to nadir and OS and 0.08 (95% CI, 0-0.37) for depth of nadir and OS. For 11 antiangiogenic agent comparisons (4854 patients), corresponding values of R2 were 0.25 (95% CI, 0-0.72) and 0.06 (95% CI, 0-0.35). For 8 anti-epidermal growth factor receptor comparisons (2684 patients), corresponding values of R2 were 0.24 (95% CI, 0-0.83) and 0.21 (95% CI, 0-0.78). 
CONCLUSIONS AND RELEVANCE In contrast with early reports favoring depth of response as a surrogate, these results suggest that neither time to nadir nor depth of nadir is an acceptable surrogate for OS in the first-line treatment of metastatic colorectal cancer.
|
41
|
Plakwicz P, Abramczyk J, Wojtaszek-Lis J, Sajkowska J, Warych B, Gawron K, Burzykowski T, Zadurska M, Czochrowska EM, Wojtowicz A, Górska R, Kukuła K. The retrospective study of 93 patients with transmigration of mandibular canine and a comparative analysis with a control group. Eur J Orthod 2019; 41:390-396. [PMID: 30295778 PMCID: PMC6686080 DOI: 10.1093/ejo/cjy067] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 12/04/2022]
Abstract
Objectives The aim of this study was to evaluate characteristics of patients with unilateral transmigration of a mandibular canine in the largest study group presented until now. Materials and methods The study group consisted of 93 patients with unilateral transmigration of a mandibular canine; the control group included 85 non-affected patients. Type of transmigration, status of deciduous and permanent canines, prevalence of missing teeth, class of occlusion, and space conditions were assessed to draw comparisons between groups. Results In this study, 64.5 per cent of patients presented type 1 transmigration; types 2, 3, 4, and 5 were present in, respectively, 23.7, 5.4, 4.3, and 2.1 per cent of patients. There was a clear, statistically significant difference (P < 0.0001) between the mean crown and apex migration and angulation for the three groups of canines (transmigrated, contralateral, and control), whereas no differences were observed for the total number of permanent teeth present. In the study group, 73.1 per cent of patients retained their primary canine on the affected side and 18.3 per cent on the contralateral side; in the control group, 22.3 per cent of subjects had at least one primary canine. There was a statistically significant difference in the distribution of types of malocclusion between the study and the control groups. Conclusions Transmigration of a mandibular canine was associated with the presence of a retained primary canine on the affected side and with higher mesial tilting of the contralateral mandibular canine when compared to the canines in the control group. Additionally, a higher prevalence of Angle’s Class I occlusion in patients with canine transmigration was recorded.
|
42
|
Trotta L, Kabeya Y, Buyse M, Doffagne E, Venet D, Desmet L, Burzykowski T, Tsuburaya A, Yoshida K, Miyashita Y, Morita S, Sakamoto J, Praveen P, Oba K. Detection of atypical data in multicenter clinical trials using unsupervised statistical monitoring. Clin Trials 2019; 16:512-522. [PMID: 31331195 DOI: 10.1177/1740774519862564] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 11/16/2022]
Abstract
BACKGROUND/AIMS A risk-based approach to clinical research may include a central statistical assessment of data quality. We investigated the operating characteristics of unsupervised statistical monitoring aimed at detecting atypical data in multicenter experiments. The approach is premised on the assumption that, save for random fluctuations and natural variations, data coming from all centers should be comparable and statistically consistent. Unsupervised statistical monitoring consists of performing as many statistical tests as possible on all trial data, in order to detect centers whose data are inconsistent with data from other centers. METHODS We conducted simulations using data from a large multicenter trial conducted in Japan for patients with advanced gastric cancer. The actual trial data were contaminated in computer simulations for varying percentages of centers, percentages of patients modified within each center and numbers and types of modified variables. The unsupervised statistical monitoring software was run by a blinded team on the contaminated data sets, with the purpose of detecting the centers with contaminated data. The operating characteristics (sensitivity, specificity and Youden's J-index) were calculated for three detection methods: one using the p-values of individual statistical tests after adjustment for multiplicity, one using a summary of all p-values for a given center, called the Data Inconsistency Score, and one using both of these methods. RESULTS The operating characteristics of the three methods were satisfactory in situations of data contamination likely to occur in practice, specifically when a single or a few centers were contaminated. As expected, the sensitivity increased for increasing proportions of patients and increasing numbers of variables contaminated. The three methods showed a specificity better than 93% in all scenarios of contamination. 
The method based on the Data Inconsistency Score and individual p-values adjusted for multiplicity generally had slightly higher sensitivity at the expense of a slightly lower specificity. CONCLUSIONS The use of brute force (a computer-intensive approach that generates large numbers of statistical tests) is an effective way to check data quality in multicenter clinical trials. It can provide a cost-effective complement to other data-management and monitoring techniques.
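The three operating characteristics named in this abstract have simple definitions, sketched below for a single simulated data set. The function name, the center identifiers, and the example sets are hypothetical; Youden's J-index is sensitivity + specificity - 1. The sketch assumes at least one contaminated and one uncontaminated center, so that neither denominator is zero.

```python
def operating_characteristics(flagged, contaminated, all_centers):
    """Return (sensitivity, specificity, Youden's J) for one detection run.

    flagged:      centers the monitoring method declared atypical
    contaminated: centers whose data were actually modified
    all_centers:  every center in the trial
    """
    true_pos = len(flagged & contaminated)
    false_neg = len(contaminated - flagged)
    false_pos = len(flagged - contaminated)
    true_neg = len(all_centers - flagged - contaminated)
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity, sensitivity + specificity - 1.0


# Hypothetical example: 20 centers, 2 contaminated, 2 flagged (1 correctly).
centers = set(range(1, 21))
contaminated = {3, 7}
flagged = {3, 11}

sens, spec, youden_j = operating_characteristics(flagged, contaminated, centers)
```

Here one of the two contaminated centers is detected (sensitivity 0.5) and one of the 18 clean centers is wrongly flagged (specificity 17/18), giving a J-index of about 0.44.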
|
43
|
Kazakiewicz D, Claesen J, Górczak K, Plewczynski D, Burzykowski T. A Multivariate Negative-Binomial Model with Random Effects for Differential Gene-Expression Analysis of Correlated mRNA Sequencing Data. J Comput Biol 2019; 26:1339-1348. [PMID: 31314581 DOI: 10.1089/cmb.2019.0168] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/12/2022]
Abstract
Experimental designs such as matched-pair or longitudinal studies yield mRNA sequencing (mRNA-Seq) counts that are correlated across samples. Most of the approaches for the analysis of correlated mRNA-Seq data are restricted to a specific design and/or balanced data only (with the same number of samples in each group). We propose a model that is applicable to the analysis of correlated mRNA-Seq data of different types: paired, clustered, longitudinal, or others. Any combination of explanatory variables, as well as unbalanced data, can be processed within the proposed modeling framework. The model assumes that exon counts of a particular gene of an individual sample jointly follow a multivariate negative-binomial distribution. Additional correlation between exon counts obtained, for example, for individual samples within the same pair or cluster is taken into account by including in the model a cluster-level normally distributed random effect. An interesting feature of the model is that it provides an explicit expression for the marginal correlation between exon counts at different levels. The performance of the model is evaluated by using a simulation study and an analysis of two real-life data sets: a paired mRNA-Seq experiment for 24 patients with clear-cell renal-cell carcinoma and a longitudinal mRNA-Seq experiment for 29 patients with Lyme disease.
|
44
|
Vinh-Hung V, Burzykowski T, Van de Steene J, Voordeckers M, Lamote J, Storme G. Statistical Interaction in the Survival Analysis of Early Breast Cancer using Registry Data: Role of Breast Conserving Surgery and Radiotherapy. TUMORI JOURNAL 2019; 91:9-14. [PMID: 15849998 DOI: 10.1177/030089160509100103] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/15/2022]
Abstract
Purpose To identify subgroup effects that might influence the survival results of postoperative radiotherapy. Patients and methods Women selected from the Surveillance, Epidemiology, and End Results database, aged 40-69 years, with non-metastasized T1-T2 breast carcinoma, in whom axillary lymph node dissection was performed. Subgroup analyses were performed using proportional hazards models with interactions. Joint significance of subgroups was evaluated with the Wald test. Event was death from any cause. Results Statistically significant interactions were found between type of surgery (breast-conserving [BCS] or mastectomy [ME]), radiotherapy [RT], T stage, and extent of nodal involvement, but not between treatments and nodal examination. For each treatment combination, ME-no RT, ME+RT, BCS-no RT, BCS+RT, the mortality hazard ratios were respectively: 1, 1.12, 1.11, 0.78 in T1, 0-3 positive nodes; 2.45, 2.77, 2.71, 1.92 in T1, 4+ nodes; 1.31, 1.38, 1.33, 1.19 in T2, 0-3+ nodes; and 3.41, 2.79, 3.44, 2.40 in T2, 4+ nodes. The corresponding joint tests showed: in the absence of radiotherapy, no significant survival disadvantage for breast-conserving surgery vs mastectomy; with radiotherapy, significant survival advantage for breast-conserving surgery irrespective of stage and for mastectomy in T2, 4+ nodes. For mastectomy in less advanced stages receiving radiotherapy, excess breast cancer deaths suggested undocumented adverse selection. The corresponding result was considered inconclusive. Conclusions The analyses found subgroup effects that should be taken into account to interpret treatment results in breast cancer.
|
45
|
Rutkowski J, Saad ED, Burzykowski T, Buyse M, Jassem J. Chronological Trends in Progression-Free, Overall, and Post-Progression Survival in First-Line Therapy for Advanced NSCLC. J Thorac Oncol 2019; 14:1619-1627. [PMID: 31163279 DOI: 10.1016/j.jtho.2019.05.030] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 01/23/2019] [Revised: 05/14/2019] [Accepted: 05/16/2019] [Indexed: 11/18/2022]
Abstract
BACKGROUND There is a debate about the merits of progression-free survival (PFS) versus overall survival (OS) as primary endpoints in NSCLC. It has been postulated that post-progression therapy may influence OS in both arms. To investigate this issue, we analyzed chronological trends in PFS and OS in advanced NSCLC using restricted mean survival times (RMSTs). METHODS We digitized survival curves from first-line phase III trials published between 1998 and 2015 in 13 leading journals to compute RMSTs for PFS and OS at three truncation landmarks (5, 12, and 18 months). RESULTS Among the 161 trials identified, RMSTs could be computed for both endpoints in 102, 97, and 82 trials for the 5-, 12-, and 18-month truncation landmarks, respectively. Post-progression survival in the control arm, quantified as mean OS minus mean PFS truncated at 18 months, was on average 3.3 months between 1998 and 2003, 4.4 months between 2004 and 2009, and 5.4 months between 2010 and 2015. This increase was due to increasing RMST for OS over time, with no increase in RMST for PFS. The average within-trial difference in RMSTs between experimental and control arm was close to 0 for OS and less than 1 month for PFS. CONCLUSIONS There is a progressive increase in post-progression survival in NSCLC trials, likely from salvage therapy. These results question both PFS and OS as sensitive endpoints in first-line trials, but suggest that the outlook for patients is improving regardless of within-trial gains.
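As an illustration of the quantities named in this abstract, the following is a minimal sketch of a restricted mean survival time (the area under a survival curve up to a truncation landmark) and of post-progression survival as RMST for OS minus RMST for PFS. The function name `rmst`, the step-curve representation, and all numbers are assumptions made for illustration; they are not taken from the paper, whose curves were digitized from published figures.

```python
def rmst(times, surv, tau):
    """Restricted mean survival time: area under a step survival curve on [0, tau].

    times: increasing time points at which the curve drops (hypothetical input format)
    surv:  survival probability immediately after each drop
    The curve is assumed to start at S(0) = 1.
    """
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in zip(times, surv):
        if t >= tau:
            break
        # Rectangle between the previous drop and this one
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    # Final rectangle up to the truncation landmark
    area += prev_s * (tau - prev_t)
    return area


# Hypothetical control-arm curves (months), truncated at 18 months
os_rmst = rmst([6, 12], [0.5, 0.25], tau=18)
pfs_rmst = rmst([3, 9], [0.5, 0.0], tau=18)
post_progression = os_rmst - pfs_rmst
```

With these made-up curves, the difference between the two areas plays the role of the "mean OS minus mean PFS truncated at 18 months" quantity described above.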
|
46
|
Barrado LG, Coart E, Burzykowski T. A Bayesian Framework Allowing Incorporation of Retrospective Information in Prospective Diagnostic Biomarker-Validation Designs. Stat Biopharm Res 2019. [DOI: 10.1080/19466315.2019.1574489] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 10/27/2022]
|
47
|
Padayachee T, Khamiakova T, Shkedy Z, Salo P, Perola M, Burzykowski T. A multivariate linear model for investigating the association between gene-module co-expression and a continuous covariate. Stat Appl Genet Mol Biol 2019; 18:/j/sagmb.ahead-of-print/sagmb-2018-0008/sagmb-2018-0008.xml. [PMID: 30875332 DOI: 10.1515/sagmb-2018-0008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
A way to enhance our understanding of the development and progression of complex diseases is to investigate the influence of cellular environments on gene co-expression (i.e. gene-pair correlations). Often, changes in gene co-expression are investigated across two or more biological conditions defined by categorizing a continuous covariate. However, the selection of arbitrary cut-off points may have an influence on the results of an analysis. To address this issue, we use a general linear model (GLM) for correlated data to study the relationship between gene-module co-expression and a covariate like metabolite concentration. The GLM specifies the gene-pair correlations as a function of the continuous covariate. The use of the GLM allows for investigating different (linear and non-linear) patterns of co-expression. Furthermore, the modeling approach offers a formal framework for testing hypotheses about possible patterns of co-expression. In our paper, a simulation study is used to assess the performance of the GLM. The performance is compared with that of a previously proposed GLM that utilizes categorized covariates. The versatility of the model is illustrated by using a real-life example. We discuss the theoretical issues related to the construction of the test statistics and the computational challenges related to fitting of the proposed model.
|
48
|
Saad ED, Squifflet P, Burzykowski T, Quinaux E, Delaloge S, Mavroudis D, Perez E, Piccart-Gebhart M, Schneider BP, Slamon D, Wolmark N, Buyse M. Disease-free survival as a surrogate for overall survival in patients with HER2-positive, early breast cancer in trials of adjuvant trastuzumab for up to 1 year: a systematic review and meta-analysis. Lancet Oncol 2019; 20:361-370. [PMID: 30709633 PMCID: PMC7050571 DOI: 10.1016/s1470-2045(18)30750-2] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Received: 07/24/2018] [Revised: 09/25/2018] [Accepted: 10/03/2018] [Indexed: 01/03/2023]
Abstract
BACKGROUND Although frequently used as a primary endpoint, disease-free survival has not been validated as a surrogate for overall survival in early breast cancer. We investigated this surrogacy in the adjuvant setting of treatment with anti-HER2 antibodies. METHODS In a systematic review and meta-analysis, we identified published and non-published randomised controlled trials with completed accrual and available disease-free survival and overall survival results for the intention-to-treat population as of September 2016. Bibliographic databases (MEDLINE, Embase, and Cochrane Central Register of Controlled Trials), clinical trial registries (Clinicaltrials.gov, EU Clinical Trials Register, WHO International Clinical Trials Registry Platform, and PharmNet.Bund), and trial registries from relevant pharmaceutical companies were searched. Eligibility for treatment of HER2-positive early breast cancer required at least one group to have an anti-HER2 antibody treatment (ie, trastuzumab, pertuzumab, or trastuzumab emtansine) planned for 12 months, and at least one control arm with chemotherapy without the antibody, a lower total dose or duration of the antibody, or observation alone. Units of analysis were contrasts: two-group trials gave rise to one contrast, whereas trials with more than two groups gave rise to more than one contrast. We excluded trials enrolling patients with recurrent, metastatic, or non-invasive disease, and those testing neoadjuvant therapy exclusively. Our primary objective was to estimate patient-level and trial-level correlations between disease-free survival and overall survival. We measured the association between disease-free survival and overall survival using Spearman's correlation coefficient (rs), and the association between hazard ratios (HRs) for disease-free survival and overall survival using R². We computed the surrogate threshold effect, the maximum HR for disease-free survival that statistically predicts an HR for overall survival less than 1·00 in a future trial. FINDINGS Eight trials (n=21 480 patients) gave rise to a full set (12 contrasts). Patient-level associations between disease-free and overall survival were strong (rs=0·90 [95% CI 0·89-0·90]). Trial-level associations gave rise to values of R² of 0·75 (95% CI 0·50-1·00) for the full set. Subgroups defined by nodal status and hormone receptor status yielded qualitatively similar results. Depending on the expected number of deaths in a future trial, the surrogate threshold effects ranged from 0·56 to 0·81, based on the full set. INTERPRETATION These findings suggest that it is appropriate to continue to use disease-free survival as a surrogate for overall survival in trials in HER2-positive, early breast cancer. The key limitation of this study is the dependence of its results on the trials included and on the existence of an outlying trial. FUNDING Roche Pharma AG.
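To make the trial-level association concrete, here is a minimal sketch of a coefficient of determination computed between log hazard ratios for a surrogate endpoint and for the true endpoint across contrasts. It uses a plain unweighted least-squares R², which is an assumption made for illustration; the abstract does not state the estimation details, and the published analysis may well use a weighted or error-adjusted model. The function name and the hazard ratios below are hypothetical.

```python
import math


def trial_level_r2(hr_surrogate, hr_true):
    """Unweighted R^2 between log hazard ratios of two endpoints (illustrative only)."""
    x = [math.log(h) for h in hr_surrogate]
    y = [math.log(h) for h in hr_true]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # For simple linear regression, R^2 equals the squared Pearson correlation
    return sxy ** 2 / (sxx * syy)


# Hypothetical contrasts: HRs for the surrogate and for the true endpoint
hr_dfs = [0.60, 0.70, 0.80, 0.90]
hr_os = [h ** 0.5 for h in hr_dfs]  # exactly linear on the log scale
r2 = trial_level_r2(hr_dfs, hr_os)
```

Because the made-up true-endpoint HRs are an exact power of the surrogate HRs, the two sets are perfectly linear on the log scale and `r2` is 1 up to rounding; real contrasts would give a value below 1.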
|
49
|
Malta TM, Sokolov A, Gentles AJ, Burzykowski T, Poisson L, Weinstein JN, Kamińska B, Huelsken J, Omberg L, Gevaert O, Colaprico A, Czerwińska P, Mazurek S, Mishra L, Heyn H, Krasnitz A, Godwin AK, Lazar AJ, Stuart JM, Hoadley KA, Laird PW, Noushmehr H, Wiznerowicz M. Machine Learning Identifies Stemness Features Associated with Oncogenic Dedifferentiation. Cell 2019; 173:338-354.e15. [PMID: 29625051 DOI: 10.1016/j.cell.2018.03.034] [Citation(s) in RCA: 1220] [Impact Index Per Article: 244.0] [Received: 09/05/2017] [Revised: 01/30/2018] [Accepted: 03/14/2018] [Indexed: 12/16/2022]
Abstract
Cancer progression involves the gradual loss of a differentiated phenotype and acquisition of progenitor and stem-cell-like features. Here, we provide novel stemness indices for assessing the degree of oncogenic dedifferentiation. We used an innovative one-class logistic regression (OCLR) machine-learning algorithm to extract transcriptomic and epigenetic feature sets derived from non-transformed pluripotent stem cells and their differentiated progeny. Using OCLR, we were able to identify previously undiscovered biological mechanisms associated with the dedifferentiated oncogenic state. Analyses of the tumor microenvironment revealed unanticipated correlation of cancer stemness with immune checkpoint expression and infiltrating immune cells. We found that the dedifferentiated oncogenic phenotype was generally most prominent in metastatic tumors. Application of our stemness indices to single-cell data revealed patterns of intra-tumor molecular heterogeneity. Finally, the indices allowed for the identification of novel targets and possible targeted therapies aimed at tumor differentiation.
|
50
|
Padayachee T, Khamiakova T, Louis E, Adriaensens P, Burzykowski T. The impact of the method of extracting metabolic signal from 1H-NMR data on the classification of samples: A case study of binning and BATMAN in lung cancer. PLoS One 2019; 14:e0211854. [PMID: 30726273 PMCID: PMC6364941 DOI: 10.1371/journal.pone.0211854] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 03/14/2018] [Accepted: 01/23/2019] [Indexed: 11/23/2022]
Abstract
Nuclear magnetic resonance (NMR) spectroscopy is a principal analytical technique in metabolomics. Extracting metabolic information from NMR spectra is complex because an immense amount of detail on the chemical composition of a biological sample is expressed through a single spectrum. The simplest approach to quantify the signal is spectral binning, which involves subdividing the spectra into regions along the chemical shift axis and integrating the peaks within each region. However, due to overlapping resonance signals, the integration values do not always correspond to the concentrations of specific metabolites. An alternate, more advanced statistical approach is spectral deconvolution. BATMAN (Bayesian AuTomated Metabolite Analyser for NMR data) performs spectral deconvolution using prior information on the spectral signatures of metabolites. In this way, BATMAN estimates relative metabolic concentrations. In this study, both spectral binning and spectral deconvolution using BATMAN were applied to 400 MHz and 900 MHz NMR spectra of blood plasma samples from lung cancer patients and control subjects. The relative concentrations estimated by BATMAN were compared with the binning integration values in terms of their ability to discriminate between lung cancer patients and controls. For the 400 MHz data, the spectral binning approach provided greater discriminatory power. However, for the 900 MHz data, the relative metabolic concentrations obtained by using BATMAN provided greater predictive power. While spectral binning is computationally advantageous and less laborious, models developed using BATMAN-estimated features can add complementary information regarding the biological interpretation of the data and are therefore clinically useful.
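The spectral binning step described in this abstract (subdividing the chemical shift axis into regions and integrating within each) can be sketched as follows. This is a minimal illustration under stated assumptions: a uniformly spaced chemical shift grid, equal-width bins, and a simple rectangle-rule integral. The function name `bin_spectrum`, the bin width, and the toy spectrum are hypothetical and do not come from the paper.

```python
def bin_spectrum(ppm, intensity, bin_width):
    """Integrate a spectrum within consecutive equal-width chemical-shift bins.

    ppm:       chemical shift values on a uniformly spaced grid (assumption)
    intensity: signal intensity at each ppm value
    bin_width: width of each bin in ppm (hypothetical choice)
    Returns one integral per bin, ordered from the lowest ppm upward.
    """
    low = min(ppm)
    n_bins = int((max(ppm) - low) / bin_width) + 1
    integrals = [0.0] * n_bins
    step = abs(ppm[1] - ppm[0])  # grid spacing, valid only for a uniform grid
    for shift, value in zip(ppm, intensity):
        index = int((shift - low) / bin_width)
        # Rectangle rule: intensity times grid spacing
        integrals[index] += value * step
    return integrals


# Toy spectrum: six points spaced 0.5 ppm apart, binned at 1.0 ppm
toy_ppm = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
toy_intensity = [1.0, 3.0, 0.0, 2.0, 4.0, 4.0]
bins = bin_spectrum(toy_ppm, toy_intensity, bin_width=1.0)
```

Each bin integral pools every resonance that falls inside its region, which is why, as the abstract notes, overlapping signals prevent the values from mapping cleanly onto individual metabolites.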
|