51
Xiong D, Wang Y, You M. Reply to: "Inconsistent prediction capability of ImmuneCells.Sig across different RNA-seq datasets". Nat Commun 2021; 12:4168. [PMID: 34234120 PMCID: PMC8263738 DOI: 10.1038/s41467-021-24304-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/08/2020] [Accepted: 06/10/2021] [Indexed: 12/29/2022] Open
Affiliation(s)
- Donghai Xiong
- Center for Cancer Prevention, Houston Methodist Cancer Center, Houston Methodist Research Institute, Houston, TX, United States
- Yian Wang
- Center for Cancer Prevention, Houston Methodist Cancer Center, Houston Methodist Research Institute, Houston, TX, United States
- Ming You
- Center for Cancer Prevention, Houston Methodist Cancer Center, Houston Methodist Research Institute, Houston, TX, United States
52
Da-ano R, Lucia F, Masson I, Abgral R, Alfieri J, Rousseau C, Mervoyer A, Reinhold C, Pradier O, Schick U, Visvikis D, Hatt M. A transfer learning approach to facilitate ComBat-based harmonization of multicentre radiomic features in new datasets. PLoS One 2021; 16:e0253653. [PMID: 34197503 PMCID: PMC8248970 DOI: 10.1371/journal.pone.0253653] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 01/21/2021] [Accepted: 06/09/2021] [Indexed: 12/15/2022] Open
Abstract
PURPOSE To facilitate the demonstration of the prognostic value of radiomics, multicentre radiomics studies are needed. Pooling radiomic features of such data in a statistical analysis is, however, challenging, as they are sensitive to the variability in scanner models, acquisition protocols and reconstruction settings, which is often unavoidable in a multicentre retrospective analysis. A statistical harmonization strategy called ComBat has been used in radiomics studies to deal with the "center-effect". The goal of the present work was to integrate a transfer learning (TL) technique within ComBat, and within recently developed alternate versions of ComBat with improved flexibility (M-ComBat) and robustness (B-ComBat), to allow a previously determined harmonization transform to be applied to the radiomic feature values of new patients from an already known center. MATERIAL AND METHODS The proposed TL approach was incorporated in the four versions of ComBat (standard, B, M, and B-M ComBat). The proposed approach was evaluated using a dataset of 189 locally advanced cervical cancer patients from 3 centers, with magnetic resonance imaging (MRI) and positron emission tomography (PET) images, with the clinical endpoint of predicting local failure. The impact of the TL approach was evaluated by comparing the harmonization achieved using only parts of the data to the reference (harmonization achieved using all the available data). This comparison was performed through three different machine learning pipelines. RESULTS The proposed TL technique was successful in harmonizing features of new patients from a known center in all versions of ComBat, leading to predictive models reaching similar performance to the ones developed using the features harmonized with all the data available. CONCLUSION The proposed TL approach enables applying a previously determined ComBat transform to new, previously unseen data.
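The idea of estimating a harmonization transform once and reusing it for new patients from an already known center can be illustrated with a deliberately simplified sketch. This is not the ComBat algorithm or the authors' TL implementation: it only aligns each center's mean and spread to the pooled data for a single feature, and it omits covariates and the empirical Bayes shrinkage that ComBat uses. The function names `fit_center_transform` and `apply_transform` and the toy values are assumptions made for illustration.

```python
from statistics import mean, pstdev


def fit_center_transform(values_by_center):
    """Estimate a per-center location/scale transform for one feature.

    Simplified stand-in for a ComBat-style fit (assumption: no covariates,
    no empirical Bayes). Each center needs at least two distinct values,
    otherwise its standard deviation is zero.
    """
    pooled = [v for values in values_by_center.values() for v in values]
    return {
        "grand": (mean(pooled), pstdev(pooled)),
        "centers": {
            center: (mean(values), pstdev(values))
            for center, values in values_by_center.items()
        },
    }


def apply_transform(transform, center, value):
    """Apply the stored transform to a value from an already known center."""
    grand_mean, grand_sd = transform["grand"]
    center_mean, center_sd = transform["centers"][center]
    return (value - center_mean) / center_sd * grand_sd + grand_mean


# Hypothetical feature values from two centers with a clear center effect.
reference = {"A": [1.0, 2.0, 3.0], "B": [11.0, 12.0, 13.0]}
transform = fit_center_transform(reference)

# A new patient from known center "A" reuses the stored transform;
# the reference data do not need to be refitted.
new_patient_value = apply_transform(transform, "A", 3.0)
```

The design point mirrored here is only the separation between fitting (done once on reference data) and applying (done later for unseen patients from a known center).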
Affiliation(s)
- Ronrick Da-ano
- INSERM, UMR 1101, LaTIM, University of Brest, Brest, France
- François Lucia
- INSERM, UMR 1101, LaTIM, University of Brest, Brest, France
- Radiation Oncology Department, University Hospital, Brest, France
- Ingrid Masson
- INSERM, UMR 1101, LaTIM, University of Brest, Brest, France
- Department of Radiation Oncology, Institut de cancérologie de l’Ouest René-Gauducheau, Saint-Herblain, France
- Ronan Abgral
- Department of Nuclear Medicine, University of Brest, Brest, France
- Joanne Alfieri
- Department of Radiation Oncology, McGill University Health Centre, Montreal, Quebec
- Caroline Rousseau
- Department of Nuclear Medicine, Institut de cancérologie de l’Ouest René-Gauducheau, Saint-Herblain, France
- Augustin Mervoyer
- Department of Radiation Oncology, Institut de cancérologie de l’Ouest René-Gauducheau, Saint-Herblain, France
- Caroline Reinhold
- Department of Radiology, McGill University Health Centre, Montreal, Canada
- Augmented Intelligence & Precision Health Laboratory of the Research Institute of McGill University Health Centre, Montreal, Canada
- Olivier Pradier
- INSERM, UMR 1101, LaTIM, University of Brest, Brest, France
- Radiation Oncology Department, University Hospital, Brest, France
- Ulrike Schick
- INSERM, UMR 1101, LaTIM, University of Brest, Brest, France
- Radiation Oncology Department, University Hospital, Brest, France
- Mathieu Hatt
- INSERM, UMR 1101, LaTIM, University of Brest, Brest, France
53
Aliverti E, Lum K, Johndrow JE, Dunson DB. Removing the influence of group variables in high-dimensional predictive modelling. Journal of the Royal Statistical Society. Series A (Statistics in Society) 2021; 184:791-811. [PMID: 35755858 PMCID: PMC9221581 DOI: 10.1111/rssa.12613] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/15/2023]
Abstract
In many application areas, predictive models are used to support or make important decisions. There is increasing awareness that these models may contain spurious or otherwise undesirable correlations. Such correlations may arise from a variety of sources, including batch effects, systematic measurement errors, or sampling bias. Without explicit adjustment, machine learning algorithms trained using these data can produce poor out-of-sample predictions which propagate these undesirable correlations. We propose a method to pre-process the training data, producing an adjusted dataset that is statistically independent of the nuisance variables with minimum information loss. We develop a conceptually simple approach for creating an adjusted dataset in high-dimensional settings based on a constrained form of matrix decomposition. The resulting dataset can then be used in any predictive algorithm with the guarantee that predictions will be statistically independent of the group variable. We develop a scalable algorithm for implementing the method, along with theory support in the form of independence guarantees and optimality. The method is illustrated on some simulation examples and applied to two case studies: removing machine-specific correlations from brain scan data, and removing race and ethnicity information from a dataset used to predict recidivism. That the motivation for removing undesirable correlations is quite different in the two applications illustrates the broad applicability of our approach.
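As a rough illustration of what "adjusting the data so that it carries no group information" can mean in the simplest linear case, the sketch below subtracts each group's column means from every row. This is not the constrained matrix decomposition proposed in the paper and it does not give the statistical independence guarantee described in the abstract: it only makes each column orthogonal to the group indicators. The function name `remove_group_means` and the toy data are assumptions for illustration.

```python
def remove_group_means(rows, groups):
    """Center every column within each group.

    After this step the within-group mean of each column is zero, so the
    columns carry no linear (mean-level) information about the group label.
    This is a simplified stand-in, not the method of the paper.
    """
    n_cols = len(rows[0])
    sums, counts = {}, {}
    for row, group in zip(rows, groups):
        acc = sums.setdefault(group, [0.0] * n_cols)
        for j, x in enumerate(row):
            acc[j] += x
        counts[group] = counts.get(group, 0) + 1
    return [
        [x - sums[group][j] / counts[group] for j, x in enumerate(row)]
        for row, group in zip(rows, groups)
    ]


# Hypothetical data: two features measured on two machines ("a" and "b").
data = [[1.0, 10.0], [3.0, 14.0], [10.0, 0.0], [12.0, 2.0]]
machine = ["a", "a", "b", "b"]
adjusted = remove_group_means(data, machine)
```

Any predictive algorithm could then be trained on `adjusted` instead of `data`; higher-order dependence on the group variable is not removed by this simple centering.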
54
Zhao Z, Guo Y, Liu Y, Sun L, Chen B, Wang C, Chen T, Wang Y, Li Y, Dong Q, Ai L, Wang R, Gu Y, Li X. Individualized lncRNA differential expression profile reveals heterogeneity of breast cancer. Oncogene 2021; 40:4604-4614. [PMID: 34131286 PMCID: PMC8266678 DOI: 10.1038/s41388-021-01883-6] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 11/19/2020] [Revised: 05/18/2021] [Accepted: 06/01/2021] [Indexed: 02/05/2023]
Abstract
Long non-coding RNAs (lncRNAs) play key regulatory roles in breast cancer. However, population-level differential expression analysis methods disregard the heterogeneous expression of lncRNAs in individual patients. Therefore, we individualized lncRNA expression profiles for breast invasive carcinoma (BRCA) using the method of LncRNA Individualization (LncRIndiv). After evaluating the robustness of LncRIndiv, we constructed an individualized differentially expressed lncRNA (IDElncRNA) profile for BRCA and investigated the subtype-specific IDElncRNAs. The breast cancer subtype-specific IDElncRNAs showed frequent co-occurrence with alterations of protein-coding genes, including mutations, copy number variation and differential methylation. We performed hierarchical clustering to subdivide TNBC and revealed a mesenchymal subtype and an immune subtype of TNBC. The TNBC immune subtype showed a better prognosis than the TNBC mesenchymal subtype. LncRNA PTOV1-AS1 was the top differentially expressed lncRNA in the mesenchymal subtype. Biological experiments validated that the upregulation of PTOV1-AS1 could downregulate TJP1 (ZO-1) and E-Cadherin, and upregulate Vimentin, which suggests PTOV1-AS1 may promote epithelial-mesenchymal transition and lead to migration and invasion of TNBC cells. The mesenchymal subtype showed a higher fraction of M2 macrophages, whereas the immune subtype was more associated with CD4+ T cells. The immune subtype is characterized by genomic instability and upregulation of immune checkpoint genes, thereby suggesting a potential response to immunosuppressive drugs. Finally, drug response analysis revealed that lncRNA ENSG00000230082 (PRRT3-AS1) is a potential resistance biomarker for paclitaxel in BRCA treatment. Our analysis highlights that IDElncRNAs can characterize inter-tumor heterogeneity in BRCA, and the new TNBC subtypes provide novel insights into TNBC immunotherapy.
Affiliation(s)
- Zhangxiang Zhao
- Department of Systems Biology, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- YingYing Guo
- Department of Pharmacology (State-Province Key Laboratories of Biomedicine-Pharmaceutics of China Key Laboratory of Cardiovascular Research, Ministry of Education), College of Pharmacy, Harbin Medical University, Harbin, China
- Northern Translational Medicine Research and Cooperation, Heilongjiang Academy of Medical Sciences, Harbin Medical University, Harbin, China
- Yaoyao Liu
- Department of Systems Biology, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Lichun Sun
- Department of Breast Medical Oncology, Harbin Medical University Cancer Hospital, Harbin, China
- Bo Chen
- Department of Systems Biology, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Chengyu Wang
- Department of Systems Biology, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Tingting Chen
- Department of Systems Biology, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Yuquan Wang
- Department of Systems Biology, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Yawei Li
- Department of Systems Biology, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Qi Dong
- Department of Systems Biology, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Liqiang Ai
- Department of Systems Biology, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Ran Wang
- Department of Physiology, Harbin Medical University, Harbin, China
- Yunyan Gu
- Department of Systems Biology, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Xia Li
- Department of Bioinformatics, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
55
Castellano-Escuder P, González-Domínguez R, Carmona-Pontaque F, Andrés-Lacueva C, Sánchez-Pla A. POMAShiny: A user-friendly web-based workflow for metabolomics and proteomics data analysis. PLoS Comput Biol 2021; 17:e1009148. [PMID: 34197462 PMCID: PMC8279420 DOI: 10.1371/journal.pcbi.1009148] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 02/11/2021] [Revised: 07/14/2021] [Accepted: 06/05/2021] [Indexed: 12/14/2022] Open
Abstract
Metabolomics and proteomics, like other omics domains, usually face a data mining challenge in providing an understandable output to advance in biomarker discovery and precision medicine. Often, statistical analysis is one of the most difficult challenges and it is critical in the subsequent biological interpretation of the results. Because of this, combined with the computational programming skills needed for this type of analysis, several bioinformatic tools aimed at simplifying metabolomics and proteomics data analysis have emerged. However, sometimes the analysis is still limited to a few hidebound statistical methods and to data sets with limited flexibility. POMAShiny is a web-based tool that provides a structured, flexible and user-friendly workflow for the visualization, exploration and statistical analysis of metabolomics and proteomics data. This tool integrates several statistical methods, some of them widely used in other types of omics, and it is based on the POMA R/Bioconductor package, which increases the reproducibility and flexibility of analyses outside the web environment. POMAShiny and POMA are both freely available at https://github.com/nutrimetabolomics/POMAShiny and https://github.com/nutrimetabolomics/POMA, respectively.
Metabolomics and proteomics are two growing areas in human health and personalized medicine fields. Often, one of the main applications of metabolomics and proteomics is the discovery of novel biomarkers and new therapeutic targets in these areas. However, these data are extremely complex and hard to analyse, since they have a large number of features, several missing values, and often important clinical variables to consider in the analyses. Therefore, powerful and versatile tools are needed to provide efficient methods for data visualization and exploration, as well as a wide range of robust statistical methods to meet all data and user requirements. Although powerful tools do exist for the analysis of these data, many of them still limit the analyses in terms of visualization and statistical analysis. To address this limitation and complement the existing tools, we have developed a web-based application, named POMAShiny, for the data analysis of metabolomics and proteomics. This novel and versatile tool offers a wholly interactive and easy-to-use environment for the analysis of these data, including numerous methods for preprocessing, data visualization and statistical analysis. The POMAShiny open-source tool is extremely flexible and portable, as it can be installed locally and freely accessed online at https://webapps.nutrimetabolomics.com/POMAShiny.
Affiliation(s)
- Pol Castellano-Escuder
- Biomarkers and Nutritional & Food Metabolomics Research Group, Department of Nutrition, Food Science and Gastronomy, Food Innovation Network (XIA), University of Barcelona, Barcelona, Spain
- Statistics and Bioinformatics Research Group, Department of Genetics, Microbiology and Statistics, University of Barcelona, Barcelona, Spain
- CIBERFES, Instituto de Salud Carlos III, Madrid, Spain
- Raúl González-Domínguez
- Biomarkers and Nutritional & Food Metabolomics Research Group, Department of Nutrition, Food Science and Gastronomy, Food Innovation Network (XIA), University of Barcelona, Barcelona, Spain
- CIBERFES, Instituto de Salud Carlos III, Madrid, Spain
- Francesc Carmona-Pontaque
- Statistics and Bioinformatics Research Group, Department of Genetics, Microbiology and Statistics, University of Barcelona, Barcelona, Spain
- CIBERFES, Instituto de Salud Carlos III, Madrid, Spain
- Cristina Andrés-Lacueva
- Biomarkers and Nutritional & Food Metabolomics Research Group, Department of Nutrition, Food Science and Gastronomy, Food Innovation Network (XIA), University of Barcelona, Barcelona, Spain
- CIBERFES, Instituto de Salud Carlos III, Madrid, Spain
- Alex Sánchez-Pla
- Statistics and Bioinformatics Research Group, Department of Genetics, Microbiology and Statistics, University of Barcelona, Barcelona, Spain
- CIBERFES, Instituto de Salud Carlos III, Madrid, Spain
56
Skogheim TS, Weyde KVF, Engel SM, Aase H, Surén P, Øie MG, Biele G, Reichborn-Kjennerud T, Caspersen IH, Hornig M, Haug LS, Villanger GD. Metal and essential element concentrations during pregnancy and associations with autism spectrum disorder and attention-deficit/hyperactivity disorder in children. Environment International 2021; 152:106468. [PMID: 33765546 DOI: 10.1016/j.envint.2021.106468] [Citation(s) in RCA: 97] [Impact Index Per Article: 24.3] [Received: 10/15/2020] [Revised: 02/10/2021] [Accepted: 02/13/2021] [Indexed: 06/12/2023]
Abstract
BACKGROUND Prenatal exposure to toxic metals or variations in maternal levels of essential elements during pregnancy may be a risk factor for neurodevelopmental disorders such as attention-deficit/hyperactivity disorder (ADHD) and autism spectrum disorder (ASD) in offspring. OBJECTIVES We investigated whether maternal levels of toxic metals and essential elements measured in mid-pregnancy, individually and as mixtures, were associated with childhood diagnosis of ADHD or ASD. METHODS This study is based on the Norwegian Mother, Father and Child Cohort Study and included 705 ADHD cases, 397 ASD cases and 1034 controls. Cases were identified through linkage with the Norwegian Patient Registry. Maternal concentrations of 11 metals/elements were measured in blood at week 17 of gestation: cadmium, cesium, cobalt, copper, lead, magnesium, manganese, selenium, zinc, total arsenic, and total mercury. Multivariable adjusted logistic regression models were used to examine associations between quartile levels of individual metals/elements and outcomes. We also investigated non-linear associations using restricted cubic spline models. The joint effects of the metal/element mixture on ASD and ADHD diagnoses were estimated using a quantile-based g-computation approach. RESULTS For ASD, we identified positive associations (increased risks) in the second quartile of arsenic [OR = 1.77 (CI: 1.26, 2.49)] and the fourth quartiles of cadmium and manganese [OR = 1.57 (CI: 1.07, 2.31) and OR = 1.84 (CI: 1.30, 2.59), respectively]. In addition, there were negative associations between cesium, copper, mercury, and zinc and ASD. For ADHD, we found increased risk in the fourth quartiles of cadmium and magnesium [OR = 1.59 (CI: 1.15, 2.18); OR = 1.42 (CI: 1.06, 1.91)]. There were also some negative associations, among others with mercury. In addition, we identified non-linear associations between ASD and arsenic, mercury, magnesium, and lead, and between ADHD and arsenic, copper, manganese, and mercury. There were no significant findings in the mixture approach analyses. CONCLUSION Results from the present study show several associations between levels of metals and elements during gestation and ASD and ADHD in children. The most notable ones involved arsenic, cadmium, copper, mercury, manganese, magnesium, and lead. Our results suggest that even population levels of these compounds may have negative impacts on neurodevelopment. As we observed mainly similarities among the metals' and elements' impact on ASD and ADHD, it could be that the two disorders share some neurochemical and neurodevelopmental pathways. The results warrant further investigation and replication, as well as studies of combined effects of metals/elements and mechanistic underpinnings.
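For readers unfamiliar with the "OR (CI: lower, upper)" notation used in the results above, the following sketch computes an unadjusted odds ratio and a Wald 95% confidence interval from a 2x2 table. It is an illustration of the notation only: the study itself reports estimates from multivariable adjusted logistic regression, which this sketch does not reproduce, and the function name `odds_ratio_ci` and the counts are hypothetical.

```python
import math


def odds_ratio_ci(exposed_cases, exposed_controls,
                  unexposed_cases, unexposed_controls, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    The interval is built on the log-odds scale and then exponentiated,
    so it is symmetric around the OR only on that scale. All four counts
    must be positive.
    """
    odds_ratio = (exposed_cases * unexposed_controls) / (
        exposed_controls * unexposed_cases
    )
    standard_error = math.sqrt(
        1 / exposed_cases + 1 / exposed_controls
        + 1 / unexposed_cases + 1 / unexposed_controls
    )
    lower = math.exp(math.log(odds_ratio) - z * standard_error)
    upper = math.exp(math.log(odds_ratio) + z * standard_error)
    return odds_ratio, lower, upper


# Hypothetical counts, e.g. highest versus lowest exposure quartile.
odds_ratio, lower, upper = odds_ratio_ci(20, 10, 10, 20)
```

An interval that excludes 1 corresponds to an association that is statistically significant at the chosen level.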
Affiliation(s)
- Thea S Skogheim
- Division of Mental and Physical Health, Norwegian Institute of Public Health, PO Box 222 Skøyen, 0213 Oslo, Norway
- Kjell Vegard F Weyde
- Division of Mental and Physical Health, Norwegian Institute of Public Health, PO Box 222 Skøyen, 0213 Oslo, Norway
- Stephanie M Engel
- Gillings School of Global Public Health, University of North Carolina at Chapel Hill, 135 Dauer Drive, Campus Box 7435, Chapel Hill, NC 27599-7435, USA
- Heidi Aase
- Division of Mental and Physical Health, Norwegian Institute of Public Health, PO Box 222 Skøyen, 0213 Oslo, Norway
- Pål Surén
- Division of Mental and Physical Health, Norwegian Institute of Public Health, PO Box 222 Skøyen, 0213 Oslo, Norway
- Merete G Øie
- Department of Psychology, University of Oslo, PO Box 1094 Blindern, 0317 Oslo, Norway
- Guido Biele
- Division of Mental and Physical Health, Norwegian Institute of Public Health, PO Box 222 Skøyen, 0213 Oslo, Norway
- Ted Reichborn-Kjennerud
- Division of Mental and Physical Health, Norwegian Institute of Public Health, PO Box 222 Skøyen, 0213 Oslo, Norway
- Institute of Clinical Medicine, University of Oslo, PO Box 1171 Blindern, 0318 Oslo, Norway
- Ida H Caspersen
- Centre for Fertility and Health, Norwegian Institute of Public Health, PO Box 222 Skøyen, 0213 Oslo, Norway
- Mady Hornig
- Department of Epidemiology, Columbia University, Mailman School of Public Health, 722 W 168th St, Rm. 736, New York, NY 10032, USA
- Line S Haug
- Division of Infection Control and Environmental Health, Norwegian Institute of Public Health, PO Box 222 Skøyen, 0213 Oslo, Norway
- Gro D Villanger
- Division of Mental and Physical Health, Norwegian Institute of Public Health, PO Box 222 Skøyen, 0213 Oslo, Norway
57
Transcriptomic analysis of castration, chemo-resistant and metastatic prostate cancer elucidates complex genetic crosstalk leading to disease progression. Funct Integr Genomics 2021; 21:451-472. [PMID: 34184132 DOI: 10.1007/s10142-021-00789-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/05/2020] [Revised: 06/05/2020] [Accepted: 05/06/2021] [Indexed: 12/22/2022]
Abstract
Prostate adenocarcinoma, with its rising numbers and high fatality rate, is a daunting healthcare challenge to clinicians and researchers alike. The main aim of our meta-analysis was to decipher differentially expressed genes (DEGs), their corresponding transcription factors (TFs), microRNAs (miRNAs) and interacting pathways underlying the progression of prostate cancer (PCa). We chose multiple datasets from primary, castration-resistant, chemo-resistant and metastatic prostate cancer stages for investigation. From our tissue-specific and disease-specific co-expression networks, fifteen hub genes (ACTB, ACTN1, CDH1, CDKN1A, DDX21, ELF3, FLNA, FLNC, IKZF1, ILK, KRT13, KRT18, KRT19, SVIL and TRIM29) were identified and validated by molecular complex detection analysis as well as survival analysis. In our attempt to highlight hub gene-associated mutations and drug interactions, FLNC was found to be the most commonly mutated gene and CDKN1A was found to have the highest druggability. Moreover, in DAVID and gene set enrichment analysis, the focal adhesion and oestrogen signalling pathways were found to be enriched, which indicates the involvement of hub genes in tumour invasiveness and metastasis. Finally, using the Enrichr tool and miRNet, we identified the transcription factors SNAI2, TP63, CEBPB and KLF11 and the microRNAs hsa-mir-1-3p, hsa-mir-145-5p, hsa-mir-124-3p and hsa-mir-218-5p as significantly controlling hub gene expression. In a nutshell, our report will help to gain a deeper insight into complex molecular intricacies and thereby unveil the probable biomarkers and therapeutic targets involved in PCa progression.
58
Boakye D, Jansen L, Schöttker B, Jansen EHJM, Halama N, Maalmi H, Gào X, Chang-Claude J, Hoffmeister M, Brenner H. The association of vitamin D with survival in colorectal cancer patients depends on antioxidant capacity. Am J Clin Nutr 2021; 113:1458-1467. [PMID: 33740035 DOI: 10.1093/ajcn/nqaa405] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/23/2020] [Accepted: 12/01/2020] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND Vitamin D plays a role in detoxifying free radicals, which might explain the previously reported lower mortality in colorectal cancer (CRC) patients with higher vitamin D concentrations. OBJECTIVES We aimed to assess whether the associations of 25-hydroxyvitamin D [25(OH)D] with prognosis in CRC patients differ by total thiol concentration (TTC), a biomarker of antioxidant capacity. METHODS CRC patients who were diagnosed from 2003 to 2010 and recruited into a population-based study in southern Germany (n = 2,592) were followed over a period of 6 y. 25(OH)D and TTC were evaluated from blood samples collected shortly after CRC diagnosis. Associations of 25(OH)D with all-cause and CRC mortality according to TTC were estimated using multivariable Cox proportional hazards regression. RESULTS There was a weak positive correlation between 25(OH)D and TTC (r = 0.26, P < 0.001). 25(OH)D was inversely associated with mortality among patients in the lowest and middle TTC tertiles, but no associations were found among patients in the highest TTC tertile (P-interaction = 0.01). Among patients in the lowest/middle TTC tertiles, those in the middle and highest (compared with lowest) 25(OH)D tertiles had 31% and 44% lower all-cause mortality (P < 0.001) and 25% and 45% lower CRC mortality (P < 0.001), respectively. However, in the highest TTC tertile, 25(OH)D was not associated with all-cause (P = 0.638) or CRC mortality (P = 0.395). CONCLUSIONS The survival advantages in CRC patients with adequate vitamin D strongly depend on antioxidant capacity and are most pronounced in cases of low antioxidant capacity. These findings suggest that TTC and other biomarkers of antioxidant status may be useful as the basis for enhanced selection criteria of patients for vitamin D supplementation, in addition to the conventional judgment based on blood 25(OH)D concentrations, and also for refining selection of patients for clinical trials aiming to estimate the effect of vitamin D supplementation.
Affiliation(s)
- Daniel Boakye
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Lina Jansen
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Ben Schöttker
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Network of Aging Research, Heidelberg University, Heidelberg, Germany
- Eugene H J M Jansen
- Centre for Health Protection, National Institute for Public Health and the Environment, Bilthoven, The Netherlands
- Niels Halama
- Division of Translational Immunotherapy, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Heidelberg, Germany
- Haifa Maalmi
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Xin Gào
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Jenny Chang-Claude
- Unit of Genetic Epidemiology, Division of Cancer Epidemiology, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Cancer Epidemiology Group, University Cancer Center Hamburg (UCCH), University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
- Michael Hoffmeister
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Hermann Brenner
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Division of Preventive Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Heidelberg, Germany
59
Bhavani SV, Wolfe KS, Hrusch CL, Greenberg JA, Krishack PA, Lin J, Lecompte-Osorio P, Carey KA, Kress JP, Coopersmith CM, Sperling AI, Verhoef PA, Churpek MM, Patel BK. Temperature Trajectory Subphenotypes Correlate With Immune Responses in Patients With Sepsis. Crit Care Med 2020; 48:1645-1653. [PMID: 32947475 DOI: 10.1097/ccm.0000000000004610] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Indexed: 12/29/2022]
Abstract
OBJECTIVES We recently found that distinct body temperature trajectories of infected patients correlated with survival. Understanding the relationship between the temperature trajectories and the host immune response to infection could allow us to immunophenotype patients at the bedside using temperature. The objective was to identify whether temperature trajectories have consistent associations with specific cytokine responses in two distinct cohorts of infected patients. DESIGN Prospective observational study. SETTING Large academic medical center between 2013 and 2019. SUBJECTS Two cohorts of infected patients: 1) patients in the ICU with septic shock and 2) hospitalized patients with Staphylococcus aureus bacteremia. INTERVENTIONS Clinical data (including body temperature) and plasma cytokine concentrations were measured. Patients were classified into four temperature trajectory subphenotypes using their temperature measurements in the first 72 hours from the onset of infection. Log-transformed cytokine levels were standardized to the mean and compared with the subphenotypes in both cohorts. MEASUREMENTS AND MAIN RESULTS The cohorts consisted of 120 patients with septic shock (cohort 1) and 88 patients with S. aureus bacteremia (cohort 2). Patients from both cohorts were classified into one of four previously validated temperature subphenotypes: "hyperthermic, slow resolvers" (n = 19 cohort 1; n = 13 cohort 2), "hyperthermic, fast resolvers" (n = 18 C1; n = 24 C2), "normothermic" (n = 54 C1; n = 31 C2), and "hypothermic" (n = 29 C1; n = 20 C2). Both "hyperthermic, slow resolvers" and "hyperthermic, fast resolvers" had high levels of G-CSF, CCL2, and interleukin-10 compared with the "hypothermic" group when controlling for cohort and timing of cytokine measurement (p < 0.05). In contrast to the "hyperthermic, slow resolvers," the "hyperthermic, fast resolvers" showed significant decreases in the levels of several cytokines over a 24-hour period, including interleukin-1RA, interleukin-6, interleukin-8, G-CSF, and M-CSF (p < 0.001). CONCLUSIONS Temperature trajectory subphenotypes are associated with consistent cytokine profiles in two distinct cohorts of infected patients. These subphenotypes could play a role in the bedside identification of cytokine profiles in patients with sepsis.
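The preprocessing step mentioned in the abstract, log-transforming cytokine levels and standardizing them, can be sketched as follows. The abstract does not spell out the exact procedure, so this assumes a natural logarithm followed by a z-score (subtract the mean, divide by the sample standard deviation); the function name `standardize_log` and the concentrations are hypothetical.

```python
import math
from statistics import mean, stdev


def standardize_log(levels):
    """Log-transform positive concentrations, then z-score them.

    Assumption: standardization means centering on the mean of the
    log values and scaling by their sample standard deviation.
    Requires at least two values that are not all identical.
    """
    logs = [math.log(x) for x in levels]
    center, spread = mean(logs), stdev(logs)
    return [(value - center) / spread for value in logs]


# Hypothetical cytokine concentrations spanning two orders of magnitude.
z_scores = standardize_log([1.0, 10.0, 100.0])
```

Working on the log scale keeps a few very high concentrations from dominating the comparison between subphenotypes, and the z-score puts different cytokines on a common scale.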
Collapse
Affiliation(s)
| | - Krysta S Wolfe
- Department of Medicine, University of Chicago Medical Center, Chicago, IL
| | - Cara L Hrusch
- Department of Medicine, University of Chicago Medical Center, Chicago, IL
| | | | | | - Julie Lin
- Department of Medicine, University of Chicago Medical Center, Chicago, IL
| | | | - Kyle A Carey
- Department of Medicine, University of Chicago Medical Center, Chicago, IL
| | - John P Kress
- Department of Medicine, University of Chicago Medical Center, Chicago, IL
| | | | - Anne I Sperling
- Department of Medicine, University of Chicago Medical Center, Chicago, IL
| | | | - Matthew M Churpek
- Department of Medicine, University of Wisconsin-Madison, Madison, WI
| | - Bhakti K Patel
- Department of Medicine, University of Chicago Medical Center, Chicago, IL
| |
|
60
|
Nguyen H, Tran D, Tran B, Pehlivan B, Nguyen T. A comprehensive survey of regulatory network inference methods using single cell RNA sequencing data. Brief Bioinform 2021; 22:bbaa190. [PMID: 34020546 PMCID: PMC8138892 DOI: 10.1093/bib/bbaa190] [Citation(s) in RCA: 78] [Impact Index Per Article: 19.5] [Received: 10/07/2019] [Revised: 06/19/2020] [Accepted: 07/24/2020] [Indexed: 12/13/2022] Open
Abstract
A gene regulatory network is a complicated set of interactions between genetic materials, which dictates how cells develop in living organisms and react to their surrounding environment. Robust comprehension of these interactions would help explain how cells function as well as predict their reactions to external factors. This knowledge can benefit both developmental biology and clinical research such as drug development or epidemiology research. Recently, the rapid advance of single-cell sequencing technologies, which has pushed the limit of transcriptomic profiling to the individual cell level, has opened up an entirely new area for regulatory network research. To exploit this new abundant source of data and take advantage of data at single-cell resolution, a number of computational methods have been proposed to uncover the interactions hidden by the averaging process in standard bulk sequencing. In this article, we review 15 such network inference methods developed for single-cell data. We discuss their underlying assumptions, inference techniques, usability, and pros and cons. In an extensive analysis using simulation, we also assess the methods' performance, sensitivity to dropout and time complexity. The main objective of this survey is to assist not only life scientists in selecting suitable methods for their data and analysis purposes but also computational scientists in developing new methods by highlighting outstanding challenges in the field that remain to be addressed in future development.
Affiliation(s)
- Hung Nguyen
- Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557
| | - Duc Tran
- Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557
| | - Bang Tran
- Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557
| | - Bahadir Pehlivan
- Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557
| | - Tin Nguyen
- Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557
| |
|
61
|
Chen Y, Wu T, Zhu Z, Huang H, Zhang L, Goel A, Yang M, Wang X. An integrated workflow for biomarker development using microRNAs in extracellular vesicles for cancer precision medicine. Semin Cancer Biol 2021; 74:134-155. [PMID: 33766650 DOI: 10.1016/j.semcancer.2021.03.011] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 12/07/2020] [Revised: 03/13/2021] [Accepted: 03/16/2021] [Indexed: 02/06/2023]
Abstract
EV-miRNAs are microRNA (miRNA) molecules encapsulated in extracellular vesicles (EVs), which play crucial roles in tumor pathogenesis, progression, and metastasis. Recent studies about EV-miRNAs have gained novel insights into cancer biology and have demonstrated a great potential to develop novel liquid biopsy assays for various applications. Notably, compared to conventional liquid biomarkers, EV-miRNAs are more advantageous in representing host-cell molecular architecture and exhibiting higher stability and specificity. Despite various available techniques for EV-miRNA separation, concentration, profiling, and data analysis, a standardized approach for EV-miRNA biomarker development is yet lacking. In this review, we performed a substantial literature review and distilled an integrated workflow encompassing important steps for EV-miRNA biomarker development, including sample collection and EV isolation, EV-miRNA extraction and quantification, high-throughput data preprocessing, biomarker prioritization and model construction, functional analysis, as well as validation. With the rapid growth of "big data", we highlight the importance of efficient mining of high-throughput data for the discovery of EV-miRNA biomarkers and integrating multiple independent datasets for in silico and experimental validations to increase the robustness and reproducibility. Furthermore, as an efficient strategy in systems biology, network inference provides insights into the regulatory mechanisms and can be used to select functionally important EV-miRNAs to refine the biomarker candidates. Despite the encouraging development in the field, a number of challenges still hinder the clinical translation. We finally summarize several common challenges in various biomarker studies and discuss potential opportunities emerging in the related fields.
Affiliation(s)
- Yu Chen
- Department of Biomedical Sciences, City University of Hong Kong, 31 To Yuen Street, Kowloon Tong, Hong Kong
| | - Tan Wu
- Department of Biomedical Sciences, City University of Hong Kong, 31 To Yuen Street, Kowloon Tong, Hong Kong
| | - Zhongxu Zhu
- Department of Biomedical Sciences, City University of Hong Kong, 31 To Yuen Street, Kowloon Tong, Hong Kong
| | - Hao Huang
- Department of Biomedical Sciences, City University of Hong Kong, 31 To Yuen Street, Kowloon Tong, Hong Kong
| | - Liang Zhang
- Department of Biomedical Sciences, City University of Hong Kong, 31 To Yuen Street, Kowloon Tong, Hong Kong; Tung Biomedical Sciences Centre, City University of Hong Kong, Hong Kong; Key Laboratory of Biochip Technology, Biotech and Health Centre, Shenzhen Research Institute, City University of Hong Kong, Shenzhen, Guangdong Province, China
| | - Ajay Goel
- Department of Molecular Diagnostics and Experimental Therapeutics, Beckman Research Institute of City of Hope Comprehensive Cancer Center, Duarte, CA, USA
| | - Mengsu Yang
- Department of Biomedical Sciences, City University of Hong Kong, 31 To Yuen Street, Kowloon Tong, Hong Kong; Tung Biomedical Sciences Centre, City University of Hong Kong, Hong Kong; Key Laboratory of Biochip Technology, Biotech and Health Centre, Shenzhen Research Institute, City University of Hong Kong, Shenzhen, Guangdong Province, China
| | - Xin Wang
- Department of Biomedical Sciences, City University of Hong Kong, 31 To Yuen Street, Kowloon Tong, Hong Kong; Tung Biomedical Sciences Centre, City University of Hong Kong, Hong Kong; Key Laboratory of Biochip Technology, Biotech and Health Centre, Shenzhen Research Institute, City University of Hong Kong, Shenzhen, Guangdong Province, China.
| |
|
62
|
Ligero M, Jordi-Ollero O, Bernatowicz K, Garcia-Ruiz A, Delgado-Muñoz E, Leiva D, Mast R, Suarez C, Sala-Llonch R, Calvo N, Escobar M, Navarro-Martin A, Villacampa G, Dienstmann R, Perez-Lopez R. Minimizing acquisition-related radiomics variability by image resampling and batch effect correction to allow for large-scale data analysis. Eur Radiol 2021. [PMID: 32909055 DOI: 10.1007/s00330-020-07174-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 05/01/2023]
Abstract
OBJECTIVE To identify CT-acquisition parameters accounting for radiomics variability and to develop a post-acquisition CT-image correction method to reduce variability and improve radiomics classification in both phantom and clinical applications. METHODS CT-acquisition protocols were prospectively tested in a phantom. The multi-centric retrospective clinical study included CT scans of patients with colorectal/renal cancer liver metastases. Ninety-three first-order and texture radiomics features were extracted. Intraclass correlation coefficients (ICCs) between CT-acquisition protocols were evaluated to define sources of variability. Voxel size resampling, ComBat, and singular value decomposition (SVD) compensation methods were explored for reducing the radiomics variability. The number of robust features was compared before and after correction using a two-proportion z test. The radiomics classification accuracy (K-means purity) was assessed before and after ComBat- and SVD-based correction. RESULTS Fifty-three acquisition protocols in 13 tissue densities were analyzed. Ninety-seven liver metastases from 43 patients with CT from two vendors were included. Pixel size, reconstruction slice spacing, convolution kernel, and acquisition slice thickness are relevant sources of radiomics variability, each with a percentage of robust features lower than 80%. Resampling to isometric voxels increased the number of robust features when images were acquired with different pixel sizes (p < 0.05). SVD-based correction for thickness and ComBat correction for thickness and combined thickness-kernel increased the number of reproducible features (p < 0.05). ComBat showed the highest improvement of radiomics-based classification in both the phantom and clinical applications (K-means purity 65.98 vs 73.20). 
CONCLUSION CT-image post-acquisition processing and radiomics normalization by means of batch effect correction allow for standardization of large-scale data analysis and improve the classification accuracy. KEY POINTS • The voxel size (accounting for the pixel size and slice spacing), slice thickness, and convolution kernel are relevant sources of CT-radiomics variability. • Voxel size resampling increased the mean percentage of robust CT-radiomics features from 59.50 to 89.25% when comparing CT scans acquired with different pixel sizes and from 71.62 to 82.58% when the scans were acquired with different slice spacings. • ComBat batch effect correction reduced the CT-radiomics variability secondary to the slice thickness and convolution kernel, improving the capacity of CT-radiomics to differentiate tissues (in the phantom application) and the primary tumor type from liver metastases (in the clinical application).
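The abstract above reports classification accuracy as K-means purity. As a rough illustration of that metric, the sketch below uses the standard definition of cluster purity (each cluster is credited with its majority class); whether the authors computed it exactly this way is an assumption, and the function name `cluster_purity` and the toy labels are hypothetical.

```python
from collections import Counter


def cluster_purity(cluster_ids, true_labels):
    """Fraction of samples that carry the majority label of their cluster."""
    by_cluster = {}
    for cluster, label in zip(cluster_ids, true_labels):
        by_cluster.setdefault(cluster, []).append(label)
    # For each cluster, count only the samples of its most common class.
    majority_total = sum(
        Counter(labels).most_common(1)[0][1] for labels in by_cluster.values()
    )
    return majority_total / len(true_labels)


# Toy example: six lesions assigned to two clusters (hypothetical data).
clusters = [0, 0, 0, 1, 1, 1]
labels = ["colorectal", "colorectal", "renal", "renal", "renal", "colorectal"]
purity = cluster_purity(clusters, labels)
```

A purity of 1.0 means every cluster contains a single class, so a batch correction that raises purity is pulling same-class samples closer together in feature space.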
Affiliation(s)
- Marta Ligero
- Radiomics Group, Vall d'Hebron Institute of Oncology (VHIO), Hospital Universitari Vall d'Hebron, Vall d'Hebron Barcelona Hospital Campus (Spain), Barcelona, Spain
| | - Olivia Jordi-Ollero
- Medical Physics and Radiation Protection Department, Catalan Institute of Oncology (ICO), Duran i Reynals Hospital, Barcelona, Spain
| | - Kinga Bernatowicz
- Radiomics Group, Vall d'Hebron Institute of Oncology (VHIO), Hospital Universitari Vall d'Hebron, Vall d'Hebron Barcelona Hospital Campus (Spain), Barcelona, Spain
| | - Alonso Garcia-Ruiz
- Radiomics Group, Vall d'Hebron Institute of Oncology (VHIO), Hospital Universitari Vall d'Hebron, Vall d'Hebron Barcelona Hospital Campus (Spain), Barcelona, Spain
| | - Eric Delgado-Muñoz
- Radiomics Group, Vall d'Hebron Institute of Oncology (VHIO), Hospital Universitari Vall d'Hebron, Vall d'Hebron Barcelona Hospital Campus (Spain), Barcelona, Spain
| | - David Leiva
- Radiology Department, Bellvitge University Hospital, Barcelona, Spain
| | - Richard Mast
- Radiology Department, Vall d'Hebron University Hospital, Barcelona, Spain
| | - Cristina Suarez
- Medical Oncology, Vall d'Hebron Institute of Oncology (VHIO), Hospital Universitari Vall d'Hebron, Vall d'Hebron Barcelona Hospital Campus (Spain), Barcelona, Spain
| | - Roser Sala-Llonch
- Department of Biomedicine, Faculty of Medicine, University of Barcelona, Barcelona, Spain
| | - Nahum Calvo
- Radiology Department, Bellvitge University Hospital, Barcelona, Spain
| | - Manuel Escobar
- Radiology Department, Vall d'Hebron University Hospital, Barcelona, Spain
| | - Arturo Navarro-Martin
- Radiation Oncology Department, Catalan Institute of Oncology (ICO), Duran i Reynals Hospital, Barcelona, Spain
| | - Guillermo Villacampa
- Oncology Data Science (ODysSey) Group, Vall d'Hebron Institute of Oncology (VHIO), Hospital Universitari Vall d'Hebron, Vall d'Hebron Barcelona Hospital Campus (Spain), Barcelona, Spain
| | - Rodrigo Dienstmann
- Oncology Data Science (ODysSey) Group, Vall d'Hebron Institute of Oncology (VHIO), Hospital Universitari Vall d'Hebron, Vall d'Hebron Barcelona Hospital Campus (Spain), Barcelona, Spain
| | - Raquel Perez-Lopez
- Radiomics Group, Vall d'Hebron Institute of Oncology (VHIO), Hospital Universitari Vall d'Hebron, Vall d'Hebron Barcelona Hospital Campus (Spain), Barcelona, Spain.
- Radiology Department, Vall d'Hebron University Hospital, Barcelona, Spain.
| |
|
63
|
Tomin T, Schittmayer M, Sedej S, Bugger H, Gollmer J, Honeder S, Darnhofer B, Liesinger L, Zuckermann A, Rainer PP, Birner-Gruenberger R. Mass Spectrometry-Based Redox and Protein Profiling of Failing Human Hearts. Int J Mol Sci 2021; 22:1787. [PMID: 33670142 PMCID: PMC7916846 DOI: 10.3390/ijms22041787] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 01/20/2021] [Revised: 02/04/2021] [Accepted: 02/08/2021] [Indexed: 12/11/2022] Open
Abstract
Oxidative stress contributes to detrimental functional decline of the myocardium, leading to the impairment of the antioxidative defense, dysregulation of redox signaling, and protein damage. In order to precisely dissect the changes of the myocardial redox state correlated with oxidative stress and heart failure, we subjected left-ventricular tissue specimens collected from control or failing human hearts to comprehensive mass spectrometry-based redox and quantitative proteomics, as well as glutathione status analyses. As a result, we report that failing hearts have lower glutathione to glutathione disulfide ratios and increased oxidation of a number of different proteins, including constituents of the contractile machinery as well as glycolytic enzymes. Furthermore, quantitative proteomics of failing hearts revealed a higher abundance of proteins responsible for extracellular matrix remodeling and reduced abundance of several ion transporters, corroborating contractile impairment. Similar effects were recapitulated by an in vitro cell culture model under a controlled oxygen atmosphere. Together, this study provides to our knowledge the most comprehensive report integrating analyses of protein abundance and global and peptide-level redox state in end-stage failing human hearts as well as oxygen-dependent redox and global proteome profiles of cultured human cardiomyocytes.
Affiliation(s)
- Tamara Tomin
- Faculty of Technical Chemistry, Institute of Chemical Technologies and Analytics, Vienna University of Technology-TU Wien, Getreidemarkt 9/164, 1060 Vienna, Austria
- Diagnostic and Research Institute of Pathology, Medical University of Graz, Stiftingtalstrasse 6, 8010 Graz, Austria
- BiotechMed-Graz, Mozartgasse 12/II, 8010 Graz, Austria
| | - Matthias Schittmayer
- Faculty of Technical Chemistry, Institute of Chemical Technologies and Analytics, Vienna University of Technology-TU Wien, Getreidemarkt 9/164, 1060 Vienna, Austria
- Diagnostic and Research Institute of Pathology, Medical University of Graz, Stiftingtalstrasse 6, 8010 Graz, Austria
- BiotechMed-Graz, Mozartgasse 12/II, 8010 Graz, Austria
| | - Simon Sedej
- BiotechMed-Graz, Mozartgasse 12/II, 8010 Graz, Austria
- Division of Cardiology, Medical University of Graz, Auenbruggerplatz 15, 8036 Graz, Austria
- Faculty of Medicine, University of Maribor, 2000 Maribor, Slovenia
| | - Heiko Bugger
- Division of Cardiology, Medical University of Graz, Auenbruggerplatz 15, 8036 Graz, Austria
| | - Johannes Gollmer
- Division of Cardiology, Medical University of Graz, Auenbruggerplatz 15, 8036 Graz, Austria
| | - Sophie Honeder
- Diagnostic and Research Institute of Pathology, Medical University of Graz, Stiftingtalstrasse 6, 8010 Graz, Austria
- BiotechMed-Graz, Mozartgasse 12/II, 8010 Graz, Austria
| | - Barbara Darnhofer
- Diagnostic and Research Institute of Pathology, Medical University of Graz, Stiftingtalstrasse 6, 8010 Graz, Austria
- BiotechMed-Graz, Mozartgasse 12/II, 8010 Graz, Austria
| | - Laura Liesinger
- Diagnostic and Research Institute of Pathology, Medical University of Graz, Stiftingtalstrasse 6, 8010 Graz, Austria
- BiotechMed-Graz, Mozartgasse 12/II, 8010 Graz, Austria
| | - Andreas Zuckermann
- Cardiac Transplantation, Department of Cardiac Surgery, Medical University of Vienna, Spitalgasse 23, 1090 Vienna, Austria
| | - Peter P. Rainer
- BiotechMed-Graz, Mozartgasse 12/II, 8010 Graz, Austria
- Division of Cardiology, Medical University of Graz, Auenbruggerplatz 15, 8036 Graz, Austria
| | - Ruth Birner-Gruenberger
- Faculty of Technical Chemistry, Institute of Chemical Technologies and Analytics, Vienna University of Technology-TU Wien, Getreidemarkt 9/164, 1060 Vienna, Austria
- Diagnostic and Research Institute of Pathology, Medical University of Graz, Stiftingtalstrasse 6, 8010 Graz, Austria
- BiotechMed-Graz, Mozartgasse 12/II, 8010 Graz, Austria
| |
|
64
|
Lehmann U, Stark H, Bartels S, Schlue J, Büsche G, Kreipe H. Genome-wide DNA methylation profiling is able to identify prefibrotic PMF cases at risk for progression to myelofibrosis. Clin Epigenetics 2021; 13:28. [PMID: 33541399 PMCID: PMC7860011 DOI: 10.1186/s13148-021-01010-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/25/2020] [Accepted: 01/11/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Patients suffering from the BCR-ABL1-negative myeloproliferative disease prefibrotic primary myelofibrosis (pre-PMF) have a certain risk for progression to myelofibrosis. Accurate risk estimation for this fibrotic progression is of prognostic importance and clinically relevant. Commonly applied risk scores are based on clinical, cytogenetic, and genetic data but do not include epigenetic modifications. Therefore, we evaluated the assessment of genome-wide DNA methylation patterns for their ability to predict fibrotic progression in PMF patients. RESULTS For this purpose, the DNA methylation profile was analyzed genome-wide in a training set of 22 bone marrow trephines from patients with either fibrotic progression (n = 12) or stable disease over several years (n = 10) using the 850 k EPIC array from Illumina. The DNA methylation classifier constructed from this data set was validated in an independently measured test set of additional 11 bone marrow trephines (7 with stable disease, 4 with fibrotic progress). Hierarchical clustering of methylation β-values and linear discriminant classification yielded very good discrimination between both patient groups. By gene ontology analysis, the most differentially methylated CpG sites are primarily associated with genes involved in cell-cell and cell-matrix interactions. CONCLUSIONS In conclusion, we could show that genome-wide DNA methylation profiling of bone marrow trephines is feasible under routine diagnostic conditions and, more importantly, is able to predict fibrotic progression in pre-fibrotic primary myelofibrosis with high accuracy.
Affiliation(s)
- Ulrich Lehmann
- Institute of Pathology, Medical School Hannover, Medizinische Hochschule Hannover, Carl-Neuberg-Str. 1, 30625, Hannover, Germany.
| | - Helge Stark
- Institute of Pathology, Medical School Hannover, Medizinische Hochschule Hannover, Carl-Neuberg-Str. 1, 30625, Hannover, Germany
| | - Stephan Bartels
- Institute of Pathology, Medical School Hannover, Medizinische Hochschule Hannover, Carl-Neuberg-Str. 1, 30625, Hannover, Germany
| | - Jerome Schlue
- Institute of Pathology, Medical School Hannover, Medizinische Hochschule Hannover, Carl-Neuberg-Str. 1, 30625, Hannover, Germany
| | - Guntram Büsche
- Institute of Pathology, Medical School Hannover, Medizinische Hochschule Hannover, Carl-Neuberg-Str. 1, 30625, Hannover, Germany
| | - Hans Kreipe
- Institute of Pathology, Medical School Hannover, Medizinische Hochschule Hannover, Carl-Neuberg-Str. 1, 30625, Hannover, Germany
| |
|
65
|
Gutiérrez-Díez PJ, Gomez-Pilar J, Hornero R, Martínez-Rodríguez J, López-Marcos MA, Russo J. The role of gene to gene interaction in the breast's genomic signature of pregnancy. Sci Rep 2021; 11:2643. [PMID: 33514799 PMCID: PMC7846553 DOI: 10.1038/s41598-021-81704-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/27/2020] [Accepted: 12/18/2020] [Indexed: 12/20/2022] Open
Abstract
Full-term pregnancy at an early age confers long-term protection against breast cancer. Published data show a specific transcriptomic profile controlling chromatin remodeling that could play a relevant role in the pregnancy-induced protection. This process of chromatin remodeling, induced by the breast differentiation caused by the first full-term pregnancy, has mainly been measured by the expression level of genes considered individually. However, genes equally expressed during the process of chromatin remodeling may behave differently in their interaction with other genes. These changes at the gene cluster level could constitute an additional dimension of chromatin remodeling and therefore of the pregnancy-induced protection. In this research, we apply Information and Graph Theories, Differential Co-expression Network Analysis, and Multiple Regression Analysis, specially designed to examine structural and informational aspects of data sets, to analyze this question. Our findings demonstrate that, independently of the changes in gene expression at the individual level, there are significant changes in gene-gene interactions and gene cluster behaviors. These changes indicate that the parous breast, through the process of early full-term pregnancy, generates more modules in the networks, with higher density, and a genomic structure performing additional and more complex functions than those found in the nulliparous breast.
Affiliation(s)
- Pedro J Gutiérrez-Díez
- IMUVA Mathematical Institute, University of Valladolid, Valladolid, Spain
- Faculty of Economics, University of Valladolid, Valladolid, Spain
| | - Javier Gomez-Pilar
- Biomedical Engineering Group, University of Valladolid, Paseo de Belén, 15, 47011, Valladolid, Spain.
- Centro de Investigación Biomédica en Red en Bioingeniería, Biomateriales Y Nanomedicina (CIBER-BBN), Valladolid, Spain.
| | - Roberto Hornero
- IMUVA Mathematical Institute, University of Valladolid, Valladolid, Spain
- Biomedical Engineering Group, University of Valladolid, Paseo de Belén, 15, 47011, Valladolid, Spain
- Centro de Investigación Biomédica en Red en Bioingeniería, Biomateriales Y Nanomedicina (CIBER-BBN), Valladolid, Spain
| | - Julia Martínez-Rodríguez
- IMUVA Mathematical Institute, University of Valladolid, Valladolid, Spain
- Faculty of Economics, University of Valladolid, Valladolid, Spain
| | - Miguel A López-Marcos
- IMUVA Mathematical Institute, University of Valladolid, Valladolid, Spain
- Faculty of Science, University of Valladolid, Valladolid, Spain
| | - Jose Russo
- The Irma H. Russo, MD Breast Cancer Research Laboratory, Fox Chase Cancer Center - Temple University Health System, Philadelphia, USA
| |
|
66
|
Tell-Marti G, Puig Sarda S, Puig-Butille JA. Gene Expression Microarray: Technical Fundamentals and Data Analysis. Comprehensive Foodomics 2021:291-312. [DOI: 10.1016/b978-0-08-100596-5.22905-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/03/2025]
|
67
|
Akpe V, Murhekar S, Kim TH, Brown CL, Cock IE. Batch Effect Adjustment to Lower the Drug Attrition Rate of MCF-7 Breast Cancer Cells Exposed to Silica Nanomaterial-Derived Scaffolds. Assay Drug Dev Technol 2021; 19:46-61. [PMID: 33443468 DOI: 10.1089/adt.2020.1016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
Abstract
Drug attrition rate is the calculation or measure of the clinical efficacy of a candidate drug on a screen platform for a specific period. Determining the attrition rate of a prospective cancer drug is a reliable way of testing the clinical efficacy. A low attrition rate in the last phase of a preclinical trial increases the success of a drug discovery process. It has been reported that the attrition rates of antineoplastic drugs are much higher than for other therapeutic drugs. Among the factors identified for the high attrition rates in antineoplastic drugs are the nature of the screen-based platforms involving human-derived xenografts, extracellular matrix-derived scaffold systems, and the synthetic scaffolds, which all have a propensity to proliferate tumor cells at faster rates than in vivo primary tumors. Other factors that affect the high attrition rates are induced scaffold toxicity and the use of assays that are irrelevant, yet affect data processing. These factors contribute to the wide variation in data and systematic errors. As a result, it becomes imperative to filter batch variations and to standardize the data. Importantly, understanding the interplay between the biological milieu and scaffold connections is also crucial. Here the cell viability of MCF-7 (breast cancer cell line) cells exposed to different scaffolds was screened before cisplatin dosing using the calculated p-values. The statistical significance (p-value) of data was calculated using the one-way analysis of variance, with the p-value set as: 0 < p < 0.06. In addition, the half-maximal inhibitory concentration (IC50) of the different scaffolds exposed to MCF-7 cells was calculated with the probit extension model and cumulative distribution (%) of the extension data. 
The chemotherapeutic dose (cisplatin, 56 mg/m²) reduced the cell viability of MCF-7 cells to 5% within 24 h on the scaffold developed from a mixture of silica nanoparticles (SNPs) and polyethylene glycol (PEG) at an SNP:PEG ratio of 1:10.
Affiliation(s)
- Victor Akpe
- Environmental Futures Research Institute, Griffith University, Nathan Campus, Nathan, Australia; School of Environment and Science, Griffith University, Nathan Campus, Nathan, Australia
| | - Shweta Murhekar
- Environmental Futures Research Institute, Griffith University, Nathan Campus, Nathan, Australia; School of Environment and Science, Griffith University, Nathan Campus, Nathan, Australia
| | - Tak H Kim
- Environmental Futures Research Institute, Griffith University, Nathan Campus, Nathan, Australia; School of Environment and Science, Griffith University, Nathan Campus, Nathan, Australia
| | - Christopher L Brown
- Environmental Futures Research Institute, Griffith University, Nathan Campus, Nathan, Australia; School of Environment and Science, Griffith University, Nathan Campus, Nathan, Australia
| | - Ian E Cock
- Environmental Futures Research Institute, Griffith University, Nathan Campus, Nathan, Australia; School of Environment and Science, Griffith University, Nathan Campus, Nathan, Australia
| |
|
68
|
Dincer AB, Janizek JD, Lee SI. Adversarial deconfounding autoencoder for learning robust gene expression embeddings. Bioinformatics 2020; 36:i573-i582. [PMID: 33381842 DOI: 10.1093/bioinformatics/btaa796] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION The increasing number of gene expression profiles has enabled the use of complex models, such as deep unsupervised neural networks, to extract a latent space from these profiles. However, expression profiles, especially when collected in large numbers, inherently contain variations introduced by technical artifacts (e.g. batch effects) and uninteresting biological variables (e.g. age) in addition to the true signals of interest. These sources of variation, called confounders, produce embeddings that fail to transfer to different domains, i.e. an embedding learned from one dataset with a specific confounder distribution does not generalize to different distributions. To remedy this problem, we attempt to disentangle confounders from true signals to generate biologically informative embeddings. RESULTS In this article, we introduce the Adversarial Deconfounding AutoEncoder (AD-AE) approach to deconfounding gene expression latent spaces. The AD-AE model consists of two neural networks: (i) an autoencoder to generate an embedding that can reconstruct original measurements, and (ii) an adversary trained to predict the confounder from that embedding. We jointly train the networks to generate embeddings that can encode as much information as possible without encoding any confounding signal. By applying AD-AE to two distinct gene expression datasets, we show that our model can (i) generate embeddings that do not encode confounder information, (ii) conserve the biological signals present in the original space and (iii) generalize successfully across different confounder domains. We demonstrate that AD-AE outperforms standard autoencoder and other deconfounding approaches. AVAILABILITY AND IMPLEMENTATION Our code and data are available at https://gitlab.cs.washington.edu/abdincer/ad-ae. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
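The joint training described in the abstract above (an autoencoder that reconstructs the input while an adversary tries to predict the confounder from the embedding) is commonly expressed as a two-term objective. The sketch below is only an illustration of that general idea under stated assumptions: mean squared error for both terms, a weighting parameter `lam`, and the function names are hypothetical and are not taken from the AD-AE code.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def autoencoder_objective(x, x_recon, confounder, confounder_pred, lam=1.0):
    """Loss the autoencoder would minimize in an adversarial setup.

    Low reconstruction error is rewarded; a successful adversary (low
    confounder prediction error) is penalized, hence the minus sign.
    The adversary itself would separately minimize its own error.
    """
    recon_loss = mse(x_recon, x)
    adversary_loss = mse(confounder_pred, confounder)
    return recon_loss - lam * adversary_loss


# Toy values: perfect reconstruction, adversary guessing 0.5 for a 0/1 confounder.
objective = autoencoder_objective(
    x=[1.0, 2.0],
    x_recon=[1.0, 2.0],
    confounder=[0.0, 1.0],
    confounder_pred=[0.5, 0.5],
)
```

In practice the two networks would be updated in alternation with a deep learning framework; the weighting `lam` controls how strongly confounder removal is traded against reconstruction quality.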
Affiliation(s)
- Ayse B Dincer
- Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA 98195, USA
| | - Joseph D Janizek
- Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA 98195, USA; Medical Scientist Training Program, University of Washington, Seattle, WA 98195, USA
| | - Su-In Lee
- Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA 98195, USA
| |
|
69
|
Lazareva O, Canzar S, Yuan K, Baumbach J, Blumenthal DB, Tieri P, Kacprowski T, List M. BiCoN: network-constrained biclustering of patients and omics data. Bioinformatics 2020; 37:2398-2404. [PMID: 33367514 DOI: 10.1093/bioinformatics/btaa1076] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 07/20/2020] [Revised: 11/25/2020] [Accepted: 12/15/2020] [Indexed: 12/21/2022] Open
Abstract
Motivation
Unsupervised learning approaches are frequently used to stratify patients into clinically relevant subgroups and to identify biomarkers such as disease-associated genes. However, clustering and biclustering techniques are oblivious to the functional relationship of genes and are thus not ideally suited to pinpoint molecular mechanisms along with patient subgroups.
Results
We developed the network-constrained biclustering approach Biclustering Constrained by Networks (BiCoN) which (i) restricts biclusters to functionally related genes connected in molecular interaction networks and (ii) maximizes the difference in gene expression between two subgroups of patients. This allows BiCoN to simultaneously pinpoint molecular mechanisms responsible for the patient grouping. Network-constrained clustering of genes makes BiCoN more robust to noise and batch effects than typical clustering and biclustering methods. BiCoN can faithfully reproduce known disease subtypes as well as novel, clinically relevant patient subgroups, as we could demonstrate using breast and lung cancer datasets. In summary, BiCoN is a novel systems medicine tool that combines several heuristic optimization strategies for robust disease mechanism extraction. BiCoN is well-documented and freely available as a python package or a web interface.
Availability and implementation
PyPI package: https://pypi.org/project/bicon.
Web interface
https://exbio.wzw.tum.de/bicon.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Olga Lazareva
- Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Weihenstephan, 80333 Munich, Germany
| | - Stefan Canzar
- Gene Center, Ludwig-Maximilians-University of Munich, 81377 Munich, Germany
| | - Kevin Yuan
- Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Weihenstephan, 80333 Munich, Germany
| | - Jan Baumbach
- Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Weihenstephan, 80333 Munich, Germany
| | - David B Blumenthal
- Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Weihenstephan, 80333 Munich, Germany
| | - Paolo Tieri
- CNR National Research Council, IAC Institute for Applied Computing, Rome 00185, Italy
- La Sapienza University of Rome, Rome 00185, Italy
| | - Tim Kacprowski
- Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Weihenstephan, 80333 Munich, Germany
- Division of Data Science in Biomedicine, Peter L. Reichertz Institute for Medical Informatics, TU Braunschweig and Hannover Medical School, Brunswick 38106, Germany
| | - Markus List
- Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Weihenstephan, 80333 Munich, Germany
| |
|
70
|
Abstract
Carrying out large multicenter studies is one of the key goals to be achieved towards a faster transfer of the radiomics approach to the clinical setting. This requires large-scale radiomics data analysis, hence the need for integrating radiomic features extracted from images acquired in different centers. This is challenging as radiomic features exhibit variable sensitivity to differences in scanner model, acquisition protocols and reconstruction settings, which is similar to the so-called 'batch-effects' in genomics studies. In this review we discuss existing methods to perform data integration with the aim of reducing the unwanted variation associated with batch effects. We also discuss the future potential role of deep learning methods in providing solutions for addressing radiomic multicentre studies.
Affiliation(s)
- R Da-Ano
- LaTiM, INSERM, UMR 1101, Univ Brest, Brest, France
| | - D Visvikis
- LaTiM, INSERM, UMR 1101, Univ Brest, Brest, France
- equally contributed
| | - M Hatt
- LaTiM, INSERM, UMR 1101, Univ Brest, Brest, France
- equally contributed
| |
|
71
|
Fantauzzi MF, Aguiar JA, Tremblay BJM, Mansfield MJ, Yanagihara T, Chandiramohan A, Revill S, Ryu MH, Carlsten C, Ask K, Stämpfli M, Doxey AC, Hirota JA. Expression of endocannabinoid system components in human airway epithelial cells: impact of sex and chronic respiratory disease status. ERJ Open Res 2020; 6:00128-2020. [PMID: 33344628 PMCID: PMC7737429 DOI: 10.1183/23120541.00128-2020] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 03/11/2020] [Accepted: 09/18/2020] [Indexed: 12/12/2022] Open
Abstract
Cannabis smoking is the dominant route of delivery, with the airway epithelium functioning as the site of first contact. The endocannabinoid system is responsible for mediating the physiological effects of inhaled phytocannabinoids. The expression of the endocannabinoid system in the airway epithelium and contribution to normal physiological responses remains to be defined. To begin to address this knowledge gap, a curated dataset of 1090 unique human bronchial brushing gene expression profiles was created. The dataset included 616 healthy subjects, 136 subjects with asthma, and 338 subjects with COPD. A 32-gene endocannabinoid signature was analysed across all samples with sex and disease-specific analyses performed. Immunohistochemistry and immunoblots were performed to probe in situ and in vitro protein expression. CB1, CB2, and TRPV1 protein signal is detectable in human airway epithelial cells in situ and in vitro, justifying examining the downstream endocannabinoid pathway. Sex status was associated with differential expression of 7 of 32 genes. In contrast, disease status was associated with differential expression of 21 of 32 genes in people with asthma and 26 of 32 genes in people with COPD. We confirm at the protein level that TRPV1, the most differentially expressed candidate in our analyses, was upregulated in airway epithelial cells from people with asthma relative to healthy subjects. Our data demonstrate that the endocannabinoid system is expressed in human airway epithelial cells with expression impacted by disease status and minimally by sex. The data suggest that cannabis consumers may have differential physiological responses in the respiratory mucosa.
Affiliation(s)
- Matthew F Fantauzzi
- Firestone Institute for Respiratory Health - Division of Respirology, Dept of Medicine, McMaster University, Hamilton, ON, Canada; McMaster Immunology Research Centre, McMaster University, Hamilton, ON, Canada
| | | | | | - Michael J Mansfield
- Genomics and Regulatory Systems Unit, Okinawa Institute of Science and Technology Graduate University, Onna, Japan
| | - Toyoshi Yanagihara
- Firestone Institute for Respiratory Health - Division of Respirology, Dept of Medicine, McMaster University, Hamilton, ON, Canada
| | - Abiram Chandiramohan
- Firestone Institute for Respiratory Health - Division of Respirology, Dept of Medicine, McMaster University, Hamilton, ON, Canada
| | - Spencer Revill
- Firestone Institute for Respiratory Health - Division of Respirology, Dept of Medicine, McMaster University, Hamilton, ON, Canada
| | - Min Hyung Ryu
- Division of Respiratory Medicine, Dept of Medicine, University of British Columbia, Vancouver, BC, Canada
| | - Chris Carlsten
- Division of Respiratory Medicine, Dept of Medicine, University of British Columbia, Vancouver, BC, Canada
| | - Kjetil Ask
- Firestone Institute for Respiratory Health - Division of Respirology, Dept of Medicine, McMaster University, Hamilton, ON, Canada; McMaster Immunology Research Centre, McMaster University, Hamilton, ON, Canada
| | - Martin Stämpfli
- Firestone Institute for Respiratory Health - Division of Respirology, Dept of Medicine, McMaster University, Hamilton, ON, Canada; McMaster Immunology Research Centre, McMaster University, Hamilton, ON, Canada
| | - Andrew C Doxey
- Firestone Institute for Respiratory Health - Division of Respirology, Dept of Medicine, McMaster University, Hamilton, ON, Canada; Dept of Biology, University of Waterloo, Waterloo, ON, Canada
| | - Jeremy A Hirota
- Firestone Institute for Respiratory Health - Division of Respirology, Dept of Medicine, McMaster University, Hamilton, ON, Canada; McMaster Immunology Research Centre, McMaster University, Hamilton, ON, Canada; Dept of Biology, University of Waterloo, Waterloo, ON, Canada; Division of Respiratory Medicine, Dept of Medicine, University of British Columbia, Vancouver, BC, Canada
| |
|
72
|
Ben Azzouz F, Michel B, Lasla H, Gouraud W, François AF, Girka F, Lecointre T, Guérin-Charbonnel C, Juin PP, Campone M, Jézéquel P. Development of an absolute assignment predictor for triple-negative breast cancer subtyping using machine learning approaches. Comput Biol Med 2020; 129:104171. [PMID: 33316552 DOI: 10.1016/j.compbiomed.2020.104171] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 07/23/2020] [Revised: 12/01/2020] [Accepted: 12/05/2020] [Indexed: 12/12/2022]
Abstract
Triple-negative breast cancer (TNBC) heterogeneity represents one of the main obstacles to precision medicine for this disease. Recent concordant transcriptomics studies have shown that TNBC could be divided into at least three subtypes with potential therapeutic implications. Although a few studies have been conducted to predict TNBC subtype using transcriptomics data, the subtyping was partially sensitive and limited by batch effect and dependence on a given dataset, which may penalize the switch to routine diagnostic testing. Therefore, we sought to build an absolute predictor (i.e., intra-patient diagnosis) based on machine learning algorithms with a limited number of probes. To that end, we started by introducing binary comparisons between probes for each patient (indicators) and based the predictive analysis on this transformed data. Probe selection first combined filter and wrapper methods for variable selection using cross-validation. We tested three prediction models (random forest, gradient boosting [GB], and extreme gradient boosting) using this optimal subset of indicators as inputs. Nested cross-validation consistently allowed us to choose the best model. The results showed that the fifty selected indicators highlighted the biological characteristics associated with each TNBC subtype. The GB model based on this subset of indicators performed better than the other models.
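The "probe binary comparison" idea in the abstract above can be illustrated with a minimal sketch. It assumes that an indicator is 1 when one probe's value exceeds another's within the same patient; the function name `pairwise_indicators`, the probe names, and the toy values are hypothetical, and the authors' exact construction may differ.

```python
from itertools import combinations

import numpy as np


def pairwise_indicators(expr, probe_names):
    """Turn a (patients x probes) matrix into within-patient binary comparisons.

    Assumed definition: indicator "a>b" is 1 when probe a is expressed above
    probe b in the same patient, else 0.
    """
    expr = np.asarray(expr, dtype=float)
    pairs = list(combinations(range(expr.shape[1]), 2))
    indicators = np.stack(
        [(expr[:, i] > expr[:, j]).astype(int) for i, j in pairs], axis=1
    )
    names = [f"{probe_names[i]}>{probe_names[j]}" for i, j in pairs]
    return indicators, names


# Two toy patients, three hypothetical probes.
expr = np.array([[5.0, 2.0, 3.0],
                 [1.0, 4.0, 2.0]])
ind, names = pairwise_indicators(expr, ["p1", "p2", "p3"])

# A per-dataset shift and positive rescaling leaves the indicators unchanged,
# which is why such features depend less on the batch a sample comes from.
ind_shifted, _ = pairwise_indicators(expr * 2.0 + 10.0, ["p1", "p2", "p3"])
```

Because each indicator is computed inside one patient, no reference cohort is needed at prediction time, which matches the "absolute" (intra-patient) framing of the abstract.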
Affiliation(s)
- Fadoua Ben Azzouz
- Unité de Bioinfomique, Institut de Cancérologie de L'Ouest, Bd Jacques Monod, 44805, Saint Herblain Cedex, France; SIRIC ILIAD, Nantes, Angers, France
| | - Bertrand Michel
- Unité de Bioinfomique, Institut de Cancérologie de L'Ouest, Bd Jacques Monod, 44805, Saint Herblain Cedex, France; SIRIC ILIAD, Nantes, Angers, France; Ecole Centrale de Nantes, 1 Rue de La Noë, 44300, Nantes, France; Laboratoire de Mathématiques Jean Leray, BP 92208, 2 Rue de La Houssinière, 44322, Nantes Cedex 03, France
| | - Hamza Lasla
- Unité de Bioinfomique, Institut de Cancérologie de L'Ouest, Bd Jacques Monod, 44805, Saint Herblain Cedex, France; SIRIC ILIAD, Nantes, Angers, France
| | - Wilfried Gouraud
- Unité de Bioinfomique, Institut de Cancérologie de L'Ouest, Bd Jacques Monod, 44805, Saint Herblain Cedex, France; SIRIC ILIAD, Nantes, Angers, France
| | | | - Fabien Girka
- Ecole Centrale de Nantes, 1 Rue de La Noë, 44300, Nantes, France
| | - Théo Lecointre
- Ecole Centrale de Nantes, 1 Rue de La Noë, 44300, Nantes, France
| | - Catherine Guérin-Charbonnel
- Unité de Bioinfomique, Institut de Cancérologie de L'Ouest, Bd Jacques Monod, 44805, Saint Herblain Cedex, France; SIRIC ILIAD, Nantes, Angers, France
| | - Philippe P Juin
- SIRIC ILIAD, Nantes, Angers, France; CRCINA, INSERM, CNRS, Université de Nantes, Université D'Angers, Institut de Recherche en Santé-Université de Nantes, 8 Quai Moncousu - BP 70721, 44007, Nantes Cedex 1, France
| | - Mario Campone
- SIRIC ILIAD, Nantes, Angers, France; CRCINA, INSERM, CNRS, Université de Nantes, Université D'Angers, Institut de Recherche en Santé-Université de Nantes, 8 Quai Moncousu - BP 70721, 44007, Nantes Cedex 1, France; Oncologie Médicale, Institut de Cancérologie de L'Ouest - René Gauducheau, Bd Jacques Monod, 44805, Saint Herblain Cedex, France
| | - Pascal Jézéquel
- Unité de Bioinfomique, Institut de Cancérologie de L'Ouest, Bd Jacques Monod, 44805, Saint Herblain Cedex, France; SIRIC ILIAD, Nantes, Angers, France; CRCINA, INSERM, CNRS, Université de Nantes, Université D'Angers, Institut de Recherche en Santé-Université de Nantes, 8 Quai Moncousu - BP 70721, 44007, Nantes Cedex 1, France.
| |
|
73
|
Mo W, Qi Z, Liu Y. Learning Optimal Distributionally Robust Individualized Treatment Rules. J Am Stat Assoc 2020; 116:659-674. [PMID: 34177007 PMCID: PMC8221611 DOI: 10.1080/01621459.2020.1796359] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 06/22/2019] [Revised: 06/24/2020] [Accepted: 07/02/2020] [Indexed: 10/23/2022]
Abstract
Recent developments in data-driven decision science have brought great advances in individualized decision making. Given data with individual covariates, treatment assignments and outcomes, policy makers seek the best individualized treatment rule (ITR) that maximizes the expected outcome, known as the value function. Many existing methods assume that the training and testing distributions are the same. However, the estimated optimal ITR may have poor generalizability when the training and testing distributions are not identical. In this paper, we consider the problem of finding an optimal ITR from a restricted ITR class where there are some unknown covariate changes between the training and testing distributions. We propose a novel distributionally robust ITR (DR-ITR) framework that maximizes the worst-case value function across the values under a set of underlying distributions that are "close" to the training distribution. The resulting DR-ITR can guarantee reasonably good performance across all such distributions. We further propose a calibrating procedure that tunes the DR-ITR adaptively to a small amount of calibration data from a target population. In this way, the calibrated DR-ITR can be shown to enjoy better generalizability than the standard ITR based on our numerical studies.
Affiliation(s)
- Weibin Mo
- Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599
| | - Zhengling Qi
- Department of Decision Sciences, George Washington University, Washington, D.C. 20052, USA
| | - Yufeng Liu
- Department of Statistics and Operations Research, Department of Genetics, Department of Biostatistics, Carolina Center for Genome Science, Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, NC 27599, USA
| |
|
74
|
Minimizing acquisition-related radiomics variability by image resampling and batch effect correction to allow for large-scale data analysis. Eur Radiol 2020; 31:1460-1470. [PMID: 32909055 PMCID: PMC7880962 DOI: 10.1007/s00330-020-07174-0] [Citation(s) in RCA: 103] [Impact Index Per Article: 20.6] [Received: 02/14/2020] [Revised: 06/23/2020] [Accepted: 08/10/2020] [Indexed: 01/20/2023]
Abstract
Objective To identify CT-acquisition parameters accounting for radiomics variability and to develop a post-acquisition CT-image correction method to reduce variability and improve radiomics classification in both phantom and clinical applications. Methods CT-acquisition protocols were prospectively tested in a phantom. The multi-centric retrospective clinical study included CT scans of patients with colorectal/renal cancer liver metastases. Ninety-three first-order and texture radiomics features were extracted. Intraclass correlation coefficients (ICCs) between CT-acquisition protocols were evaluated to define sources of variability. Voxel size, ComBat, and singular value decomposition (SVD) compensation methods were explored for reducing the radiomics variability. The number of robust features was compared before and after correction using a two-proportion z test. The radiomics classification accuracy (K-means purity) was assessed before and after ComBat- and SVD-based correction. Results Fifty-three acquisition protocols in 13 tissue densities were analyzed. Ninety-seven liver metastases from 43 patients with CT from two vendors were included. Pixel size, reconstruction slice spacing, convolution kernel, and acquisition slice thickness are relevant sources of radiomics variability with a percentage of robust features lower than 80%. Resampling to isometric voxels increased the number of robust features when images were acquired with different pixel sizes (p < 0.05). SVD-based correction for thickness and ComBat correction for thickness and combined thickness–kernel increased the number of reproducible features (p < 0.05). ComBat showed the highest improvement of radiomics-based classification in both the phantom and clinical applications (K-means purity 65.98 vs 73.20).
Conclusion CT-image post-acquisition processing and radiomics normalization by means of batch effect correction allow for standardization of large-scale data analysis and improve the classification accuracy. Key Points • The voxel size (accounting for the pixel size and slice spacing), slice thickness, and convolution kernel are relevant sources of CT-radiomics variability. • Voxel size resampling increased the mean percentage of robust CT-radiomics features from 59.50 to 89.25% when comparing CT scans acquired with different pixel sizes and from 71.62 to 82.58% when the scans were acquired with different slice spacings. • ComBat batch effect correction reduced the CT-radiomics variability secondary to the slice thickness and convolution kernel, improving the capacity of CT-radiomics to differentiate tissues (in the phantom application) and the primary tumor type from liver metastases (in the clinical application). Electronic supplementary material The online version of this article (10.1007/s00330-020-07174-0) contains supplementary material, which is available to authorized users.
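The "K-means purity" metric mentioned in the abstract above is not defined there. A minimal sketch, assuming the usual definition of cluster purity (each cluster is credited with its most frequent true class, and the credited counts are divided by the number of samples), could look like this; `cluster_purity` and the toy labels are hypothetical, and the cluster ids would normally come from a K-means run.

```python
from collections import Counter


def cluster_purity(cluster_ids, true_labels):
    """Fraction of samples that belong to the majority class of their cluster."""
    members = {}
    for cluster, label in zip(cluster_ids, true_labels):
        members.setdefault(cluster, []).append(label)
    # Credit each cluster with the count of its most common true label.
    majority_total = sum(
        Counter(labels).most_common(1)[0][1] for labels in members.values()
    )
    return majority_total / len(true_labels)


# Toy example: two clusters, each containing one sample of the "wrong" tissue.
clusters = [0, 0, 0, 1, 1, 1]
tissues = ["liver", "liver", "fat", "fat", "fat", "liver"]
purity = cluster_purity(clusters, tissues)
```

Under this definition a purity of 1.0 means every cluster contains a single tissue or tumor type; the abstract reports its values as percentages.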
|
75
|
Ai D, Wang Y, Li X, Pan H. Colorectal Cancer Prediction Based on Weighted Gene Co-Expression Network Analysis and Variational Auto-Encoder. Biomolecules 2020; 10:biom10091207. [PMID: 32825264 PMCID: PMC7563725 DOI: 10.3390/biom10091207] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 07/11/2020] [Revised: 07/30/2020] [Accepted: 08/01/2020] [Indexed: 02/06/2023] Open
Abstract
An effective feature extraction method is key to improving the accuracy of a prediction model. From the Gene Expression Omnibus (GEO) database, we obtained microarray gene expression data covering 13,487 genes for 238 colorectal cancer (CRC) and normal samples. Twelve gene modules were obtained by weighted gene co-expression network analysis (WGCNA) on 173 samples. By calculating the Pearson correlation coefficient (PCC) between the characteristic genes of each module and colorectal cancer, we obtained a key module that was highly correlated with CRC. We screened hub genes from the key module by considering module membership, gene significance, and intramodular connectivity. We selected 10 hub genes as one type of feature for the classifier. We applied a variational autoencoder (VAE) to 1159 significantly differentially expressed genes and mapped the data into a 10-dimensional representation, as another type of feature for the cancer classifier. The two types of features were fed to a support vector machine (SVM) classifier for CRC. The accuracy was 0.9692 with an AUC of 0.9981. The result shows the high accuracy of the two-step feature extraction method, which includes obtaining hub genes by WGCNA and a 10-dimensional representation by VAE.
Affiliation(s)
- Dongmei Ai
- Basic Experimental Center of Natural Science, University of Science and Technology Beijing, Beijing 100083, China
- School of Mathematics and Physics, University of Science and Technology Beijing, Beijing 100083, China; (Y.W.); (X.L.); (H.P.)
- Correspondence: Tel.: +86-136-2105-2939
| | - Yuduo Wang
- School of Mathematics and Physics, University of Science and Technology Beijing, Beijing 100083, China; (Y.W.); (X.L.); (H.P.)
| | - Xiaoxin Li
- School of Mathematics and Physics, University of Science and Technology Beijing, Beijing 100083, China; (Y.W.); (X.L.); (H.P.)
| | - Hongfei Pan
- School of Mathematics and Physics, University of Science and Technology Beijing, Beijing 100083, China; (Y.W.); (X.L.); (H.P.)
| |
|
76
|
Microarray Normalization Revisited for Reproducible Breast Cancer Biomarkers. BioMed Research International 2020; 2020:1363827. [PMID: 32832541 PMCID: PMC7428878 DOI: 10.1155/2020/1363827] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 12/04/2019] [Revised: 03/30/2020] [Accepted: 05/11/2020] [Indexed: 11/21/2022]
Abstract
Precision medicine for breast cancer relies on biomarkers to select therapies. However, the reliability of biomarkers drawn from gene expression arrays has been questioned and calls for reassessment, in particular for large datasets. We revisit widely used data-normalization procedures and evaluate differences in outcome in order to pinpoint the most reliable reprocessing methods on which biomarkers can be based. We generated a database of 3753 breast cancer patients from 38 studies by downloading and curating patient samples from NCBI-GEO. As gene-expression biomarkers, we selected the assessment of receptor status and breast cancer subtype classification. Each normalization procedure is applied separately, and biomarkers are then evaluated for each patient. Differences between normalization pipelines are quantified as the percentage of patients whose outcomes differ between pipelines. Some normalization procedures lead to quite consistent biomarkers, differing in only 1-2% of patients. Other normalization procedures, some of which have been used in many clinical studies, end up with troubling discrepancies (10% and more). A good deal of the doubt regarding the reliability of microarrays may be rooted in the haphazard application of inadequate preprocessing pipelines. Several modes of batch correction are evaluated regarding a possible improvement of receptor prediction from gene expression versus the gold standard of immunohistochemistry. Finally, we nominate those normalization methods yielding consistent and trustworthy results. Adequate bioinformatics data preprocessing is crucial for any subsequent statistics to arrive at trustworthy results. We conclude with a suggestion for future bioinformatics development to further increase the reliability of cancer biomarkers.
|
77
|
Qiu Z, Chen S, Qi Y, Liu C, Zhai J, Xie S, Ma C. Exploring transcriptional switches from pairwise, temporal and population RNA-Seq data using deepTS. Brief Bioinform 2020; 22:5877690. [PMID: 32728687 DOI: 10.1093/bib/bbaa137] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 03/02/2020] [Revised: 05/25/2020] [Accepted: 06/05/2020] [Indexed: 12/11/2022] Open
Abstract
Transcriptional switch (TS) is a widely observed phenomenon caused by changes in the relative expression of transcripts from the same gene, in spatial, temporal or other dimensions. TS has been associated with human diseases, plant development and stress responses. Its investigation is often hampered by a lack of suitable tools allowing comprehensive and flexible TS analysis for high-throughput RNA sequencing (RNA-Seq) data. Here, we present deepTS, a user-friendly web-based implementation that enables a fully interactive, multifunctional identification, visualization and analysis of TS events for large-scale RNA-Seq datasets from pairwise, temporal and population experiments. deepTS offers rich functionality to streamline RNA-Seq-based TS analysis for both model and non-model organisms and for those with or without reference transcriptome. The presented case studies highlight the capabilities of deepTS and demonstrate its potential for the transcriptome-wide TS analysis of pairwise, temporal and population RNA-Seq data. We believe deepTS will help research groups, regardless of their informatics expertise, perform accessible, reproducible and collaborative TS analyses of large-scale RNA-Seq data.
Affiliation(s)
| | | | | | | | | | | | - Chuang Ma
- Bioinformatics Laboratory at Northwest A&F University
| |
|
78
|
Jacob L, Witteveen A, Beumer I, Delahaye L, Wehkamp D, van den Akker J, Snel M, Chan B, Floore A, Bakx N, Brink G, Poncet C, Bogaerts J, Delorenzi M, Piccart M, Rutgers E, Cardoso F, Speed T, van 't Veer L, Glas A. Controlling technical variation amongst 6693 patient microarrays of the randomized MINDACT trial. Commun Biol 2020; 3:397. [PMID: 32719399 PMCID: PMC7385160 DOI: 10.1038/s42003-020-1111-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 01/03/2020] [Accepted: 06/23/2020] [Indexed: 12/12/2022] Open
Abstract
Gene expression data obtained in large studies hold great promise for discovering disease signatures or subtypes through data analysis. They are also prone to technical variation, whose removal is essential to avoid spurious discoveries. Because this variation is not always known and can be confounded with biological signals, its removal is a challenging task. Here we provide a step-wise procedure and comprehensive analysis of the MINDACT microarray dataset. The MINDACT trial enrolled 6693 breast cancer patients and prospectively validated the gene expression signature MammaPrint for outcome prediction. The study also yielded a full-transcriptome microarray for each tumor. We show for the first time in such a large dataset how technical variation can be removed while retaining expected biological signals. Because of its unprecedented size, we hope the resulting adjusted dataset will be an invaluable tool to discover or test gene expression signatures and to advance our understanding of breast cancer.
Affiliation(s)
- Laurent Jacob
- Université de Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Évolutive UMR 5558, Villeurbanne, France
| | | | - Inès Beumer
- Agendia NV/Agendia Inc, Amsterdam, The Netherlands
| | | | | | | | | | - Bob Chan
- Agendia NV/Agendia Inc, Amsterdam, The Netherlands
| | - Arno Floore
- Agendia NV/Agendia Inc, Amsterdam, The Netherlands
| | - Niels Bakx
- Agendia NV/Agendia Inc, Amsterdam, The Netherlands
| | - Guido Brink
- Agendia NV/Agendia Inc, Amsterdam, The Netherlands
| | | | | | - Mauro Delorenzi
- University Lausanne, Lausanne, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | | | - Emiel Rutgers
- Netherlands Cancer Institute, Amsterdam, The Netherlands
| | | | - Terence Speed
- Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, Australia
| | - Laura van 't Veer
- Agendia NV/Agendia Inc, Amsterdam, The Netherlands.
- Helen Diller Family Comprehensive Cancer Center, University California San Francisco, San Francisco, CA, USA.
| | - Annuska Glas
- Agendia NV/Agendia Inc, Amsterdam, The Netherlands.
| |
|
79
|
Classification of Microarray Gene Expression Data Using an Infiltration Tactics Optimization (ITO) Algorithm. Genes (Basel) 2020; 11:genes11070819. [PMID: 32708429 PMCID: PMC7397166 DOI: 10.3390/genes11070819] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 05/20/2020] [Revised: 07/08/2020] [Accepted: 07/09/2020] [Indexed: 11/17/2022] Open
Abstract
A number of different feature selection and classification techniques have been proposed in the literature, including parameter-free and parameter-based algorithms. The former are quick but may result in local maxima, while the latter use dataset-specific parameter tuning for higher accuracy. However, higher accuracy may not necessarily mean higher reliability of the model. Thus, generalized optimization is still a challenge open for further research. This paper presents a warzone-inspired "infiltration tactics" optimization algorithm (ITO), not to be confused with the ITO algorithm based on the Itô process in the field of stochastic calculus. The proposed ITO algorithm combines parameter-free and parameter-based classifiers to produce a high-accuracy-high-reliability (HAHR) binary classifier. The algorithm produces results in two phases: (i) the Lightweight Infantry Group (LIG) converges quickly to find non-local maxima and produces comparable results (i.e., 70 to 88% accuracy); (ii) the Followup Team (FT) uses advanced tuning to enhance the baseline performance (i.e., 75 to 99%). Every soldier of the ITO army is a base model with its own independently chosen subset selection method, pre-processing and validation methods, and classifier. The successful soldiers are combined through heterogeneous ensembles for optimal results. The proposed approach addresses a data scarcity problem, is flexible to the choice of heterogeneous base classifiers, and is able to produce HAHR models comparable to the established MAQC-II results.
|
80
|
Predicting and affecting response to cancer therapy based on pathway-level biomarkers. Nat Commun 2020; 11:3296. [PMID: 32620799 PMCID: PMC7335104 DOI: 10.1038/s41467-020-17090-y] [Citation(s) in RCA: 56] [Impact Index Per Article: 11.2] [Received: 10/05/2019] [Accepted: 06/12/2020] [Indexed: 12/15/2022] Open
Abstract
Identifying robust, patient-specific, and predictive biomarkers presents a major obstacle in precision oncology. To optimize patient-specific therapeutic strategies, here we couple pathway knowledge with large-scale drug sensitivity, RNAi, and CRISPR-Cas9 screening data from 460 cell lines. Pathway activity levels are found to be strong predictive biomarkers for the essentiality of 15 proteins, including the essentiality of MAD2L1 in breast cancer patients with high BRCA-pathway activity. We also find strong predictive biomarkers for the sensitivity to 31 compounds, including BCL2 and microtubule inhibitors (MTIs). Lastly, we show that Bcl-xL inhibition can modulate the activity of a predictive biomarker pathway and re-sensitize lung cancer cells and tumors to MTI therapy. Overall, our results support the use of pathways in helping to achieve the goal of precision medicine by uncovering dozens of predictive biomarkers.
|
81
|
Da-Ano R, Masson I, Lucia F, Doré M, Robin P, Alfieri J, Rousseau C, Mervoyer A, Reinhold C, Castelli J, De Crevoisier R, Rameé JF, Pradier O, Schick U, Visvikis D, Hatt M. Performance comparison of modified ComBat for harmonization of radiomic features for multicenter studies. Sci Rep 2020; 10:10248. [PMID: 32581221 PMCID: PMC7314795 DOI: 10.1038/s41598-020-66110-w] [Citation(s) in RCA: 127] [Impact Index Per Article: 25.4] [Received: 01/28/2020] [Accepted: 05/04/2020] [Indexed: 11/08/2022] Open
Abstract
Multicenter studies are needed to demonstrate the potential clinical value of radiomics as a prognostic tool. However, variability in scanner models, acquisition protocols and reconstruction settings is unavoidable, and radiomic features are notoriously sensitive to these factors, which hinders pooling them in a statistical analysis. A statistical harmonization method called ComBat was developed to deal with the "batch effect" in gene expression microarray data and was used in radiomics studies to deal with the "center-effect". Our goal was to evaluate modifications in ComBat allowing for more flexibility in choosing a reference and improving robustness of the estimation. Two modified ComBat versions were evaluated: M-ComBat allows all feature distributions to be transformed to a chosen reference, instead of the overall mean, providing more flexibility. B-ComBat adds bootstrap and Monte Carlo for improved robustness in the estimation. BM-ComBat combines both modifications. The four versions were compared regarding their ability to harmonize features in a multicenter context in two different clinical datasets. The first contains 119 locally advanced cervical cancer patients from 3 centers, with magnetic resonance imaging and positron emission tomography imaging. In that case ComBat was applied with 3 labels corresponding to each center. The second one contains 98 locally advanced laryngeal cancer patients from 5 centers with contrast-enhanced computed tomography. In that specific case, because imaging settings were highly heterogeneous even within each of the five centers, unsupervised clustering was used to determine two labels for applying ComBat. The impact of each harmonization was evaluated through three different machine learning pipelines for the modelling step in predicting the clinical outcomes, across two performance metrics (balanced accuracy and Matthews correlation coefficient).
Before harmonization, almost all radiomic features had significantly different distributions between labels. These differences were successfully removed with all ComBat versions. The predictive ability of the radiomic models was always improved with harmonization and the improved ComBat provided the best results. This was observed consistently in both datasets, through all machine learning pipelines and performance metrics. The proposed modifications allow for more flexibility and robustness in the estimation. They also slightly but consistently improve the predictive power of resulting radiomic models.
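To make the harmonization idea in this abstract concrete, here is a minimal sketch of a location-scale alignment for a single feature. It assumes no covariates and omits the empirical Bayes shrinkage of ComBat and the bootstrap of B-ComBat, so it is an illustration rather than the authors' implementation. The function name `harmonize` and its `reference` parameter are hypothetical: with `reference=None` the batches are aligned to the pooled mean and spread (ComBat-like), and with a label they are aligned to that batch (M-ComBat-like).

```python
from math import sqrt
from statistics import mean, pstdev


def harmonize(values, labels, reference=None):
    """Align every batch's mean and spread to a common target (one feature)."""
    # Group the feature values by batch label (e.g. imaging center).
    batches = {}
    for value, label in zip(values, labels):
        batches.setdefault(label, []).append(value)

    # Per-batch location (mean) and scale (population standard deviation).
    stats = {label: (mean(vs), pstdev(vs)) for label, vs in batches.items()}

    if reference is None:
        # ComBat-like target: grand mean and pooled within-batch spread.
        target_mu = mean(values)
        pooled_var = sum(len(vs) * stats[label][1] ** 2
                         for label, vs in batches.items()) / len(values)
        target_sd = sqrt(pooled_var)
    else:
        # M-ComBat-like target: the chosen reference batch.
        target_mu, target_sd = stats[reference]

    harmonized = []
    for value, label in zip(values, labels):
        mu, sd = stats[label]
        z = (value - mu) / sd if sd > 0 else 0.0  # standardize within batch
        harmonized.append(target_mu + target_sd * z)
    return harmonized
```

Calling `harmonize(values, labels, reference="A")` leaves batch `A` essentially unchanged and maps the other batches onto its distribution, which is the extra flexibility the M-ComBat variant is described as providing.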
Affiliation(s)
- R Da-Ano
- INSERM, UMR 1101, LaTIM, University of Brest, Brest, France.
| | - I Masson
- INSERM, UMR 1101, LaTIM, University of Brest, Brest, France
- Department of Radiation Oncology, Institut de cancérologie de l'Ouest René-Gauducheau, Saint-Herblain, France
| | - F Lucia
- INSERM, UMR 1101, LaTIM, University of Brest, Brest, France
- Radiation Oncology Department, University Hospital, Brest, France
| | - M Doré
- Department of Radiation Oncology, Institut de cancérologie de l'Ouest René-Gauducheau, Saint-Herblain, France
| | - P Robin
- Department of Nuclear Medicine, University of Brest, Brest, France
| | - J Alfieri
- Department of Radiation Oncology, McGill University Health Centre, Montreal, Quebec, Canada
| | - C Rousseau
- Department of Nuclear Medicine, Institut de cancérologie de l'Ouest René-Gauducheau, Saint-Herblain, France
- CRCINA, University of Nantes, INSERM UMR1232, CNRS-ERL6001, Nantes, France
| | - A Mervoyer
- Department of Radiation Oncology, Institut de cancérologie de l'Ouest René-Gauducheau, Saint-Herblain, France
| | - C Reinhold
- Department of Radiology, McGill University Health Centre, Montreal, Canada
| | - J Castelli
- Radiotherapy Department Cancer, Institute Eugene Marquis, Rennes, France
- University of Rennes 1, LTSI, Rennes, France
| | - R De Crevoisier
- Radiotherapy Department Cancer, Institute Eugene Marquis, Rennes, France
- University of Rennes 1, LTSI, Rennes, France
| | - J F Rameé
- Department of Medical Oncology, Centre Hospitalier de Vendee, La Roche sur Yon, France
| | - O Pradier
- INSERM, UMR 1101, LaTIM, University of Brest, Brest, France
- Radiation Oncology Department, University Hospital, Brest, France
| | - U Schick
- INSERM, UMR 1101, LaTIM, University of Brest, Brest, France
- Radiation Oncology Department, University Hospital, Brest, France
| | - D Visvikis
- INSERM, UMR 1101, LaTIM, University of Brest, Brest, France
| | - M Hatt
- INSERM, UMR 1101, LaTIM, University of Brest, Brest, France
| |
|
82
|
Li C, Zou H, Xiong Z, Xiong Y, Miyagishima DF, Wanggou S, Li X. Construction and Validation of a 13-Gene Signature for Prognosis Prediction in Medulloblastoma. Front Genet 2020; 11:429. [PMID: 32508873 PMCID: PMC7249855 DOI: 10.3389/fgene.2020.00429] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/29/2019] [Accepted: 04/07/2020] [Indexed: 01/28/2023] Open
Abstract
Background: Recent studies have identified several molecular subgroups of medulloblastoma associated with distinct clinical outcomes; however, no robust gene signature has been established for prognosis prediction. Our objective was to construct a robust gene signature-based model to predict the prognosis of patients with medulloblastoma. Methods: Expression data of medulloblastomas were acquired from the Gene Expression Omnibus (GSE85217, n = 763; GSE37418, n = 76). To identify genes associated with overall survival (OS), we performed univariate survival analysis and least absolute shrinkage and selection operator (LASSO) Cox regression. A risk score model was constructed based on selected genes and was validated using multiple datasets. Differentially expressed genes (DEGs) between the risk groups were identified. Kyoto Encyclopedia of Genes and Genomes (KEGG), Gene Ontology (GO), and protein–protein interaction (PPI) analyses were performed. Network modules and hub genes were identified using Cytoscape. Furthermore, the tumor microenvironment (TME) was evaluated using the ESTIMATE algorithm. Tumor-infiltrating immune cells (TIICs) were inferred using CIBERSORTx. Results: A 13-gene model was constructed and validated. Patients classified into the high-risk group had significantly worse OS than those in the low-risk group (Training set: p < 0.0001; Validation set 1: p < 0.0001; Validation set 2: p = 0.00052). The area under the curve (AUC) of the receiver operating characteristic (ROC) analysis indicated a good performance in predicting 1-, 3-, and 5-year OS in all datasets. Multivariate analysis integrating clinical factors demonstrated that the risk score was an independent predictor for the OS (validation set 1: p = 0.001, validation set 2: p = 0.004). We then identified 265 DEGs between risk groups and PPI analysis predicted modules that were highly related to central nervous system and embryonic development.
The risk score was significantly correlated with programmed death-ligand 1 (PD-L1) expression (p < 0.001), as well as immune score (p = 0.035), stromal score (p = 0.010), and tumor purity (p = 0.010) in Group 4 medulloblastomas. Correlations between the 13-gene signature and the TIICs in Sonic hedgehog and Group 4 medulloblastomas were revealed. Conclusion: Our study constructed and validated a robust 13-gene signature model estimating the prognosis of medulloblastoma patients. We also revealed genes and pathways that may be related to the development and prognosis of medulloblastoma, which might provide candidate targets for future investigation.
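A gene-signature risk score of the kind described here is usually a weighted sum of expression values, with weights taken from the fitted Cox model. The sketch below shows that calculation and a split into risk groups. The gene names and coefficients are hypothetical placeholders, and the median cutoff is an assumption, since the abstract does not state how the high- and low-risk groups were separated.

```python
from statistics import median


def risk_scores(samples, coefficients):
    """Weighted sum of expression values, one score per sample."""
    return [sum(coefficients[gene] * sample[gene] for gene in coefficients)
            for sample in samples]


def split_by_median(scores):
    """Label each score 'high' or 'low' relative to the cohort median (assumed cutoff)."""
    cutoff = median(scores)
    return ["high" if score > cutoff else "low" for score in scores]


# Hypothetical two-gene signature; a real model would carry 13 coefficients.
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}
samples = [
    {"GENE_A": 2.0, "GENE_B": 4.0},
    {"GENE_A": 4.0, "GENE_B": 0.0},
    {"GENE_A": 0.0, "GENE_B": 4.0},
    {"GENE_A": 6.0, "GENE_B": 0.0},
]
scores = risk_scores(samples, coefficients)
groups = split_by_median(scores)
```

A positive coefficient raises the score as expression increases, so genes with positive weights push a patient toward the high-risk group and genes with negative weights pull toward the low-risk group.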
Affiliation(s)
- Chang Li
- Department of Neurosurgery, Xiangya Hospital, Central South University, Changsha, China.,Xiangya School of Medicine, Central South University, Changsha, China.,Hunan International Scientific and Technological Cooperation Base of Brain Tumor Research, Xiangya Hospital, Central South University, Changsha, China
| | - Han Zou
- Department of Neurosurgery, Xiangya Hospital, Central South University, Changsha, China.,Xiangya School of Medicine, Central South University, Changsha, China.,Hunan International Scientific and Technological Cooperation Base of Brain Tumor Research, Xiangya Hospital, Central South University, Changsha, China
| | - Zujian Xiong
- Department of Neurosurgery, Xiangya Hospital, Central South University, Changsha, China.,Xiangya School of Medicine, Central South University, Changsha, China.,Hunan International Scientific and Technological Cooperation Base of Brain Tumor Research, Xiangya Hospital, Central South University, Changsha, China
| | - Yi Xiong
- Department of Neurosurgery, Xiangya Hospital, Central South University, Changsha, China.,Xiangya School of Medicine, Central South University, Changsha, China.,Hunan International Scientific and Technological Cooperation Base of Brain Tumor Research, Xiangya Hospital, Central South University, Changsha, China
| | - Danielle F Miyagishima
- Department of Neurosurgery, Yale School of Medicine, New Haven, CT, United States.,Department of Genetics, Yale School of Medicine, New Haven, CT, United States
| | - Siyi Wanggou
- Department of Neurosurgery, Xiangya Hospital, Central South University, Changsha, China.,Hunan International Scientific and Technological Cooperation Base of Brain Tumor Research, Xiangya Hospital, Central South University, Changsha, China
| | - Xuejun Li
- Department of Neurosurgery, Xiangya Hospital, Central South University, Changsha, China.,Hunan International Scientific and Technological Cooperation Base of Brain Tumor Research, Xiangya Hospital, Central South University, Changsha, China
| |
|
83
|
Boakye D, Jansen L, Schöttker B, Jansen EHJM, Schneider M, Halama N, Gào X, Chang-Claude J, Hoffmeister M, Brenner H. Blood markers of oxidative stress are strongly associated with poorer prognosis in colorectal cancer patients. Int J Cancer 2020; 147:2373-2386. [PMID: 32319674 DOI: 10.1002/ijc.33018] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 02/13/2020] [Revised: 03/18/2020] [Accepted: 04/02/2020] [Indexed: 12/24/2022]
Abstract
Oxidative stress has been implicated in the initiation of several cancers, including colorectal cancer (CRC). Whether it also plays a role in CRC prognosis is unclear. We assessed the associations of two oxidative stress biomarkers (Diacron's reactive oxygen metabolites [d-ROMs] and total thiol level [TTL]) with CRC prognosis. CRC patients who were diagnosed in 2003 to 2012 and recruited into a population-based study in Germany (n = 3361) were followed for up to 6 years. Hazard ratios (HRs) and 95% confidence intervals (95% CIs) for the associations of d-ROMs and TTL (measured from blood samples collected shortly after CRC diagnosis) with overall survival (OS) and disease-specific survival (DSS) were estimated using multivariable Cox regression. Particularly pronounced associations of higher d-ROMs with lower survival were observed in stage IV patients, with patients in the highest (vs lowest) tertile having much lower OS (HR = 1.52, 95% CI = 1.14-2.04) and DSS (HR = 1.61, 95% CI = 1.20-2.17). For TTL, strong inverse associations of TTL with mortality were observed within all stages. In patients of all stages, those in the highest (vs lowest) quintile had substantially higher OS (HR = 0.48, 95% CI = 0.38-0.62) and DSS (HR = 0.52, 95% CI = 0.39-0.69). The addition of these biomarkers to models that included age, sex, tumor stage and subsite significantly improved the prediction of CRC prognosis. The observed strong associations of higher d-ROMs and lower TTL levels with poorer prognosis even in stage IV patients suggest that oxidative stress contributes significantly to premature mortality in CRC patients and demonstrate a large potential of these biomarkers in enhancing the prediction of CRC prognosis beyond tumor stage.
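As a reading aid for the hazard ratios quoted above, the following sketch recovers the standard error of the log hazard ratio from a reported 95% confidence interval and converts the HR into a relative change in hazard. It assumes the interval is a Wald interval, symmetric on the log scale, which the abstract does not state; the helper name `log_hr_standard_error` is illustrative.

```python
from math import log


def log_hr_standard_error(ci_low, ci_high, z_crit=1.96):
    """Standard error of log(HR), assuming a symmetric Wald interval on the log scale."""
    return (log(ci_high) - log(ci_low)) / (2 * z_crit)


# Reported for overall survival, highest vs lowest TTL quintile, all stages.
hr, ci_low, ci_high = 0.48, 0.38, 0.62

se = log_hr_standard_error(ci_low, ci_high)  # roughly 0.125
z = log(hr) / se                             # Wald z statistic, strongly negative
hazard_reduction_pct = (1 - hr) * 100        # about 52% lower hazard of death
```

Under that assumption, an HR of 0.48 corresponds to roughly half the hazard of death in the highest TTL quintile compared with the lowest, and the interval is narrow enough that the estimate lies several standard errors away from no effect.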
Affiliation(s)
- Daniel Boakye
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany.,Medical Faculty Heidelberg, Heidelberg University, Heidelberg, Germany
| | - Lina Jansen
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Ben Schöttker
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany.,Network of Aging Research, Heidelberg University, Heidelberg, Germany
| | - Eugene H J M Jansen
- Centre for Health Protection, National Institute for Public Health and the Environment, Bilthoven, The Netherlands
| | - Martin Schneider
- Department of General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Heidelberg, Germany
| | - Niels Halama
- Division of Translational Immunotherapy, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Heidelberg, Germany
| | - Xin Gào
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Jenny Chang-Claude
- Unit of Genetic Epidemiology, Division of Cancer Epidemiology, German Cancer Research Center (DKFZ), Heidelberg, Germany.,Cancer Epidemiology Group, University Cancer Center Hamburg (UCCH), University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
| | - Michael Hoffmeister
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Hermann Brenner
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany.,Division of Preventive Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Heidelberg, Germany.,German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany
| |
|
84
|
Delestro F, Scheunemann L, Pedrazzani M, Tchenio P, Preat T, Genovesio A. In vivo large-scale analysis of Drosophila neuronal calcium traces by automated tracking of single somata. Sci Rep 2020; 10:7153. [PMID: 32346011 PMCID: PMC7188892 DOI: 10.1038/s41598-020-64060-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 10/17/2019] [Accepted: 04/07/2020] [Indexed: 01/30/2023] Open
Abstract
How does the concerted activity of neuronal populations shape behavior? Impediments to addressing this question are primarily due to critical experimental barriers. An integrated perspective on large scale neural information processing requires an in vivo approach that can combine the advantages of exhaustively observing all neurons dedicated to a given type of stimulus, and simultaneously achieve a resolution that is precise enough to capture individual neuron activity. Current experimental data from in vivo observations are either restricted to a small fraction of the total number of neurons, or are based on larger brain volumes but at a low spatial and temporal resolution. Consequently, fundamental questions as to how sensory information is represented on a population scale remain unanswered. In Drosophila melanogaster, the mushroom body (MB) represents an excellent model to analyze sensory coding and memory plasticity. In this work, we present an experimental setup coupled with a dedicated computational method that provides in vivo measurements of the activity of hundreds of densely packed somata uniformly spread in the MB. We exploit spinning-disk confocal 3D imaging over time of the whole MB cell body layer in vivo while it is exposed to olfactory stimulation. Importantly, to derive individual signals from densely packed somata, we have developed a fully automated image analysis procedure that takes advantage of the specificities of our data. After anisotropy correction, our approach performs a dedicated spot detection and registration over the entire time sequence to transform trajectories to identifiable clusters. This enabled us to discard spurious detections and reconstruct missing ones in a robust way. We demonstrate that this approach outperformed existing methods in this specific context and made possible high-throughput analysis of approximately 500 single somata uniformly spread over the MB in various conditions.
Applying this approach, we find that learned experiences change the population code of odor representations in the MB. After long-term memory (LTM) formation, we quantified an increase in responsive somata count and a stable single neuron signal. We predict that this method, which should further enable studying the population pattern of neuronal activity, has the potential to uncover fine details of sensory processing and memory plasticity.
Affiliation(s)
- Felipe Delestro
- Computational Bioimaging and Bioinformatics, IBENS, ENS, INSERM, CNRS, PSL, 46 rue d'Ulm, 75005, Paris, France
| | - Lisa Scheunemann
- Genes and Dynamics of Memory Systems, Brain Plasticity Unit, CNRS, ESPCI Paris, PSL, 10 Rue Vauquelin, 75005, Paris, France
| | - Mélanie Pedrazzani
- Genes and Dynamics of Memory Systems, Brain Plasticity Unit, CNRS, ESPCI Paris, PSL, 10 Rue Vauquelin, 75005, Paris, France
| | - Paul Tchenio
- Genes and Dynamics of Memory Systems, Brain Plasticity Unit, CNRS, ESPCI Paris, PSL, 10 Rue Vauquelin, 75005, Paris, France
| | - Thomas Preat
- Genes and Dynamics of Memory Systems, Brain Plasticity Unit, CNRS, ESPCI Paris, PSL, 10 Rue Vauquelin, 75005, Paris, France.
| | - Auguste Genovesio
- Computational Bioimaging and Bioinformatics, IBENS, ENS, INSERM, CNRS, PSL, 46 rue d'Ulm, 75005, Paris, France.
| |
|
85
|
Villanger GD, Drover SSM, Nethery RC, Thomsen C, Sakhi AK, Øvergaard KR, Zeiner P, Hoppin JA, Reichborn-Kjennerud T, Aase H, Engel SM. Associations between urine phthalate metabolites and thyroid function in pregnant women and the influence of iodine status. Environment International 2020; 137:105509. [PMID: 32044443 DOI: 10.1016/j.envint.2020.105509] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 07/21/2019] [Revised: 01/17/2020] [Accepted: 01/17/2020] [Indexed: 05/23/2023]
Abstract
BACKGROUND Human populations, including susceptible subpopulations such as pregnant women and their fetuses, are continuously exposed to phthalates. Phthalates may affect the thyroid hormone system, causing concern for pregnancy health, birth outcomes and child development. Few studies have investigated the joint effect of phthalates on thyroid function in pregnant women, although they are present as a mixture with highly inter-correlated compounds. Additionally, no studies have investigated whether the key nutrient for thyroid health, iodine, modifies these relationships. METHODS In this study, we examined the cross-sectional relationships between concentrations of 12 urinary phthalate metabolites and 6 plasma thyroid function biomarkers measured mid-pregnancy (~17 weeks' gestation) in pregnant women (N = 1072) who were selected from a population-based prospective birth cohort, The Norwegian Mother, Father and Child Cohort study (MoBa). We investigated whether the phthalate metabolite-thyroid function biomarker associations differed by iodine status by using a validated estimate of habitual dietary iodine intake based on a food frequency questionnaire from the 22nd gestation week. We accounted for the phthalate metabolite mixture by factor analyses, ultimately reducing the exposure into two uncorrelated factors. These factors were used as predictors in multivariable adjusted linear regression models with thyroid function biomarkers as the outcomes. RESULTS Factor 1, which included high loadings for mono-iso-butyl phthalate (MiBP), mono-n-butyl phthalate (MnBP), and monobenzyl phthalate (MBzP), was associated with increased total triiodothyronine (TT3) and free T3 index (fT3i). These associations appeared to be driven primarily by women with low iodine intake (<150 µg/day, ~70% of our sample).
Iodine intake significantly modified (p-interaction < 0.05) the association of factor 1 with thyroid stimulating hormone (TSH), total thyroxine (TT4) and free T4 index (fT4i), such that only among women in the high iodine intake category (≥150 µg/day, i.e. sufficient) was this factor associated with increased TSH and decreased TT4 and fT4i, respectively. In contrast, factor 2, which included high loadings for di-2-ethylhexyl phthalate metabolites (∑DEHP) and di-iso-nonyl phthalate metabolites (∑DiNP), was associated with a decrease in TT3 and fT3i, which appeared fairly uniform across iodine intake categories. CONCLUSION We find that phthalate exposure is associated with thyroid function in mid-pregnancy among Norwegian women, and that iodine intake, which is essential for thyroid health, could influence some of these relationships.
Affiliation(s)
- Gro D Villanger
- Norwegian Institute of Public Health, PO Box 222 Skøyen, N-0213 Oslo, Norway.
| | - Samantha S M Drover
- Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | | | - Cathrine Thomsen
- Norwegian Institute of Public Health, PO Box 222 Skøyen, N-0213 Oslo, Norway
| | - Amrit K Sakhi
- Norwegian Institute of Public Health, PO Box 222 Skøyen, N-0213 Oslo, Norway
| | - Kristin R Øvergaard
- Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway
| | - Pal Zeiner
- Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway; Institute of Clinical Medicine, University of Oslo, Oslo, Norway
| | - Jane A Hoppin
- Department of Biological Sciences, NC State University, Raleigh, NC, USA
| | - Ted Reichborn-Kjennerud
- Norwegian Institute of Public Health, PO Box 222 Skøyen, N-0213 Oslo, Norway; Institute of Clinical Medicine, University of Oslo, Oslo, Norway
| | - Heidi Aase
- Norwegian Institute of Public Health, PO Box 222 Skøyen, N-0213 Oslo, Norway
| | - Stephanie M Engel
- Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| |
|
86
|
Rong Z, Tan Q, Cao L, Zhang L, Deng K, Huang Y, Zhu ZJ, Li Z, Li K. NormAE: Deep Adversarial Learning Model to Remove Batch Effects in Liquid Chromatography Mass Spectrometry-Based Metabolomics Data. Anal Chem 2020; 92:5082-5090. [DOI: 10.1021/acs.analchem.9b05460] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Indexed: 12/19/2022]
Affiliation(s)
- Zhiwei Rong
- Department of Epidemiology and Biostatistics, School of Public Health, Harbin Medical University, Harbin 150086, China
| | - Qilong Tan
- Department of Epidemiology and Biostatistics, School of Public Health, Harbin Medical University, Harbin 150086, China
| | - Lei Cao
- Department of Epidemiology and Biostatistics, School of Public Health, Harbin Medical University, Harbin 150086, China
| | - Liuchao Zhang
- Department of Epidemiology and Biostatistics, School of Public Health, Harbin Medical University, Harbin 150086, China
| | - Kui Deng
- Department of Epidemiology and Biostatistics, School of Public Health, Harbin Medical University, Harbin 150086, China
| | - Yue Huang
- Department of Epidemiology and Biostatistics, School of Public Health, Harbin Medical University, Harbin 150086, China
| | - Zheng-Jiang Zhu
- Interdisciplinary Research Center on Biology and Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai, 200032, China
| | - Zhenzi Li
- Department of Epidemiology and Biostatistics, School of Public Health, Harbin Medical University, Harbin 150086, China
| | - Kang Li
- Department of Epidemiology and Biostatistics, School of Public Health, Harbin Medical University, Harbin 150086, China
| |
|
87
|
Stavropoulos G, Jonkers DMAE, Mujagic Z, Koek GH, Masclee AAM, Pierik MJ, Dallinga JW, Van Schooten FJ, Smolinska A. Implementation of quality controls is essential to prevent batch effects in breathomics data and allow for cross-study comparisons. J Breath Res 2020; 14:026012. [PMID: 32120348 DOI: 10.1088/1752-7163/ab7b8d] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 12/12/2022]
Abstract
Exhaled breath analysis has become a promising monitoring tool for various ailments by identifying volatile organic compounds (VOCs) as indicative biomarkers excreted in the human body. Throughout the process of sampling, measuring, and data processing, non-biological variations are introduced in the data, leading to batch effects. Algorithmic approaches have been developed to cope with within-study batch effects. Batch differences, however, may also occur among different studies, and to date, ways to correct for cross-study batch effects are lacking; ultimately, cross-study comparisons to verify the uniqueness of found VOC profiles for a specific disease may be challenging. This study applies within-study batch-effect-correction approaches to correct for cross-study batch effects; suggestions are made that may help prevent the introduction of cross-study variations. Three batch-effect-correction algorithms were investigated: zero-centering, ComBat, and the analysis of covariance framework. The breath samples were collected from inflammatory bowel disease ([Formula: see text]), chronic liver disease ([Formula: see text]), and irritable bowel syndrome ([Formula: see text]) patients at different periods, and they were analysed via gas chromatography-mass spectrometry. Multivariate statistics were used to visualise and verify the results. The visualisation of the data before any batch-effect-correction technique was applied showed a clear distinction due to probable batch effects among the datasets of the three cohorts. The visualisation of the three datasets after implementing all three correction techniques showed that the batch effects were still present in the data. Predictions made using partial least squares discriminant analysis and random forest confirmed this observation. The within-study batch-effect-correction approaches fail to correct for cross-study batch effects present in the data.
The present study proposes a framework for systematically standardising future breathomics data by using internal standards or quality control samples at regular analysis intervals. Further knowledge regarding the nature of the unsolicited variations among cross-study batches must be obtained to move the field further.
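Of the three correction algorithms named in this abstract, zero-centering is the simplest: each batch has its own mean subtracted, feature by feature. The sketch below shows it for one feature; the function name `zero_center` is illustrative and the data are made up. It also hints at why the method can fall short across studies, since it removes only a constant per-batch offset and leaves any difference in spread or any batch-by-disease confounding untouched.

```python
from statistics import mean


def zero_center(values, batches):
    """Subtract each batch's own mean from its values (single feature)."""
    # Collect the values that belong to each batch.
    by_batch = {}
    for value, batch in zip(values, batches):
        by_batch.setdefault(batch, []).append(value)

    # One offset per batch, removed from every sample in that batch.
    batch_mean = {batch: mean(vs) for batch, vs in by_batch.items()}
    return [value - batch_mean[batch] for value, batch in zip(values, batches)]


centered = zero_center([1.0, 3.0, 10.0, 14.0], ["a", "a", "b", "b"])
```

After centering, both batches have mean zero, but batch `b` still has twice the spread of batch `a`, which a location-only correction cannot remove.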
Affiliation(s)
- Georgios Stavropoulos
- Department of Pharmacology and Toxicology, NUTRIM School of Nutrition and Translational Research, Maastricht University, Maastricht, The Netherlands
| |
|
88
|
Terkelsen T, Krogh A, Papaleo E. CAncer bioMarker Prediction Pipeline (CAMPP)-A standardized framework for the analysis of quantitative biological data. PLoS Comput Biol 2020; 16:e1007665. [PMID: 32176694 PMCID: PMC7108742 DOI: 10.1371/journal.pcbi.1007665] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/23/2019] [Revised: 03/31/2020] [Accepted: 01/18/2020] [Indexed: 01/21/2023] Open
Abstract
With the improvement of -omics and next-generation sequencing (NGS) methodologies, along with the lowered cost of generating these types of data, the analysis of high-throughput biological data has become standard both for forming and testing biomedical hypotheses. Our knowledge of how to normalize datasets to remove latent undesirable variances has grown extensively, making for standardized data that are easily compared between studies. Here we present the CAncer bioMarker Prediction Pipeline (CAMPP), an open-source R-based wrapper (https://github.com/ELELAB/CAncer-bioMarker-Prediction-Pipeline-CAMPP) intended to aid bioinformatic software-users with data analyses. CAMPP is called from a terminal command line and is supported by a user-friendly manual. The pipeline may be run on a local computer and requires little or no knowledge of programming. To avoid issues relating to R-package updates, a renv.lock file is provided to ensure R-package stability. Data-management includes missing value imputation, data normalization, and distributional checks. CAMPP performs (I) k-means clustering, (II) differential expression/abundance analysis, (III) elastic-net regression, (IV) correlation and co-expression network analyses, (V) survival analysis, and (VI) protein-protein/miRNA-gene interaction networks. The pipeline returns tabular files and graphical representations of the results. We hope that CAMPP will assist in streamlining bioinformatic analysis of quantitative biological data, whilst ensuring an appropriate bio-statistical framework.
Affiliation(s)
- Thilde Terkelsen
- Computational Biology Laboratory, Danish Cancer Society Research Center and Center for Autophagy, Recycling and Disease, Copenhagen, Denmark
| | - Anders Krogh
- Unit of Computational and RNA biology, Department of Biology, University of Copenhagen, Copenhagen Denmark
| | - Elena Papaleo
- Computational Biology Laboratory, Danish Cancer Society Research Center and Center for Autophagy, Recycling and Disease, Copenhagen, Denmark
- Translational Disease System Biology, Faculty of Health and Medical Science, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
| |
|
89
|
Ferrari E, Retico A, Bacciu D. Measuring the effects of confounders in medical supervised classification problems: the Confounding Index (CI). Artif Intell Med 2020; 103:101804. [PMID: 32143800 DOI: 10.1016/j.artmed.2020.101804] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 05/10/2019] [Revised: 11/08/2019] [Accepted: 01/10/2020] [Indexed: 10/25/2022]
Abstract
Over the years, there has been growing interest in using machine learning techniques for biomedical data processing. When tackling these tasks, one needs to bear in mind that biomedical data depend on a variety of characteristics, such as demographic aspects (age, gender, etc.) or the acquisition technology, which might be unrelated to the target of the analysis. In supervised tasks, failing to match the ground truth targets with respect to such characteristics, called confounders, may lead to very misleading estimates of the predictive performance. Many strategies have been proposed to handle confounders, ranging from data selection, to normalization techniques, up to the use of training algorithms for learning with imbalanced data. However, all these solutions require the confounders to be known a priori. To this aim, we introduce a novel index that is able to measure the confounding effect of a data attribute in a bias-agnostic way. This index can be used to quantitatively compare the confounding effects of different variables and to inform correction methods such as normalization procedures or ad-hoc-prepared learning algorithms. The effectiveness of this index is validated on both simulated data and real-world neuroimaging data.
Affiliation(s)
- Elisa Ferrari
- Scuola Normale Superiore, Italy; Pisa Division, INFN, Italy.
| | | | - Davide Bacciu
- Dipartimento di Informatica, Università di Pisa, Italy.
| |
|
90
|
Schultze AE, Bennet B, Rae JC, Chiang AY, Frazier K, Katavolos P, McKinney L, Patrick DJ, Tripathi N. Scientific Regulatory Policy Committee Points to Consider*: Nuisance Factors, Block Effects, and Batch Effects in Nonclinical Safety Assessment Studies. Toxicol Pathol 2020; 48:537-548. [PMID: 32122253 DOI: 10.1177/0192623320906385] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 12/22/2022]
Abstract
Detection of test article-related effects and the determination of the adversity of those changes are the primary goals of nonclinical safety assessment studies for drugs and chemicals in development. During these studies, variables that are not of primary interest to investigators may change and influence data interpretation. These variables, often referred to as "nuisance factors," may influence other groups of data and result in "block or batch effects" that complicate data interpretation. Definitions of the terms "nuisance factors," "block effects," and "batch effects," as they apply to nonclinical safety assessment studies, are reviewed. Multiple case examples of block and batch effects in safety assessment studies are provided, and the challenges these bring to pathology data interpretation are discussed. Methods to mitigate the occurrence of block and batch effects in safety assessment studies, including statistical blocking and utilization of study designs that minimize potential confounding variables, incorporation of adequate randomization, and use of an appropriate number of animals or repeated measurement of specific parameters for increased precision, are reviewed. [Box: see text].
|
91
|
Elamin AA, Klunkelfuß S, Kämpfer S, Oehlmann W, Stehr M, Smith C, Simpson GR, Morgan R, Pandha H, Singh M. A Specific Blood Signature Reveals Higher Levels of S100A12: A Potential Bladder Cancer Diagnostic Biomarker Along With Urinary Engrailed-2 Protein Detection. Front Oncol 2020; 9:1484. [PMID: 31993369 PMCID: PMC6962349 DOI: 10.3389/fonc.2019.01484] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 10/02/2019] [Accepted: 12/10/2019] [Indexed: 12/13/2022] Open
Abstract
Urothelial carcinoma of the urinary bladder (UCB) or bladder cancer remains a major health problem with high morbidity and mortality rates, especially in the western world. UCB is also associated with the highest cost per patient. In recent years numerous markers have been evaluated for suitability in UCB detection and surveillance. However, to date none of these markers can replace or even reduce the use of routine tools (cytology and cystoscopy). Our current study described the extensive expression profile of UCB and highlighted its differences from normal bladder tissue. Our data revealed that JUP, PTGDR, KLRF1, MT-TC, and RNU6-135P are associated with prognosis in patients with UCB. The microarray expression data also identified S100A12, S100A8, and NAMPT as potential UCB biomarkers. Pathway analysis revealed that natural killer cell mediated cytotoxicity is the most involved pathway. Our analysis showed that S100A12 protein may be useful as a biomarker for early UCB detection. Plasma S100A12 was detected in patients with UCB with an overall sensitivity of 90.5% and a specificity of 75%. S100A12 is highly expressed, particularly in high-grade and high-stage UCB. Furthermore, using a panel of more than a hundred urine samples, a prototype lateral flow test for the transcription factor Engrailed-2 (EN2) also showed reasonable sensitivity (85%) and specificity (71%). Such findings provide confidence to further improve and refine the EN2 rapid test for use in clinical practice. In conclusion, S100A12 and EN2 have shown potential value as biomarker candidates for UCB patients. These results can speed up the discovery of biomarkers, improve diagnostic accuracy, and may help the management of UCB.
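For readers less familiar with the two diagnostic metrics quoted in this abstract, the following is a minimal sketch of how sensitivity and specificity are computed from a confusion table. The counts used here are hypothetical and were chosen only so that the arithmetic reproduces the reported percentages for plasma S100A12; the abstract does not give the underlying counts, and the function names are illustrative.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of diseased subjects the test flags as positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-diseased subjects the test flags as negative."""
    return true_neg / (true_neg + false_pos)


# HYPOTHETICAL counts, not taken from the paper: 21 patients and 20 controls.
s100a12_sensitivity = sensitivity(true_pos=19, false_neg=2)
s100a12_specificity = specificity(true_neg=15, false_pos=5)

print(f"sensitivity = {s100a12_sensitivity:.1%}")
print(f"specificity = {s100a12_specificity:.1%}")
```

A sensitivity near 90% with a specificity of 75% means few cases are missed, while roughly one in four unaffected subjects would test positive.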
Affiliation(s)
- Ayssar A Elamin
- LIONEX Diagnostics and Therapeutics GmbH, Brunswick, Germany
| | | | - Susanne Kämpfer
- LIONEX Diagnostics and Therapeutics GmbH, Brunswick, Germany
| | - Wulf Oehlmann
- LIONEX Diagnostics and Therapeutics GmbH, Brunswick, Germany
| | - Matthias Stehr
- LIONEX Diagnostics and Therapeutics GmbH, Brunswick, Germany
| | - Christopher Smith
- Department of Oncology, Faculty of Health and Medical Sciences, University of Surrey, Guildford, United Kingdom
| | - Guy R Simpson
- Department of Oncology, Faculty of Health and Medical Sciences, University of Surrey, Guildford, United Kingdom
| | - Richard Morgan
- Institute of Cancer Therapeutics, Faculty of Life Sciences, University of Bradford, Bradford, United Kingdom
| | - Hardev Pandha
- Department of Oncology, Faculty of Health and Medical Sciences, University of Surrey, Guildford, United Kingdom
| | - Mahavir Singh
- LIONEX Diagnostics and Therapeutics GmbH, Brunswick, Germany
| |
|
92
|
Mehtonen J, Pölönen P, Häyrynen S, Dufva O, Lin J, Liuksiala T, Granberg K, Lohi O, Hautamäki V, Nykter M, Heinäniemi M. Data-driven characterization of molecular phenotypes across heterogeneous sample collections. Nucleic Acids Res 2020; 47:e76. [PMID: 31329928 PMCID: PMC6648337 DOI: 10.1093/nar/gkz281] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 02/23/2019] [Revised: 04/02/2019] [Accepted: 04/10/2019] [Indexed: 12/31/2022] Open
Abstract
Existing large gene expression data repositories hold enormous potential to elucidate disease mechanisms, characterize changes in cellular pathways, and to stratify patients based on molecular profiles. To achieve this goal, integrative resources and tools are needed that allow comparison of results across datasets and data types. We propose an intuitive approach for data-driven stratifications of molecular profiles and benchmark our methodology using the dimensionality reduction algorithm t-distributed stochastic neighbor embedding (t-SNE) with multi-study and multi-platform data on hematological malignancies. Our approach enables assessing the contribution of biological versus technical variation to sample clustering, direct incorporation of additional datasets to the same low dimensional representation, comparison of molecular disease subtypes identified from separate t-SNE representations, and characterization of the obtained clusters based on pathway databases and additional data. In this manner, we performed an integrative analysis across multi-omics acute myeloid leukemia studies. Our approach indicated new molecular subtypes with differential survival and drug responsiveness among samples lacking fusion genes, including a novel myelodysplastic syndrome-like cluster and a cluster characterized with CEBPA mutations and differential activity of the S-adenosylmethionine-dependent DNA methylation pathway. In summary, integration across multiple studies can help to identify novel molecular disease subtypes and generate insight into disease biology.
Affiliation(s)
- Juha Mehtonen
- Institute of Biomedicine, School of Medicine, University of Eastern Finland, Kuopio, Finland
| | - Petri Pölönen
- Institute of Biomedicine, School of Medicine, University of Eastern Finland, Kuopio, Finland
| | - Sergei Häyrynen
- Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
| | - Olli Dufva
- Hematology Research Unit Helsinki, University of Helsinki and Department of Hematology, Helsinki University Hospital Comprehensive Cancer Center, Helsinki, Finland
| | - Jake Lin
- Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
| | - Thomas Liuksiala
- Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland.,Tampere Center for Child Health Research, Tampere University and Tampere University Hospital, Tampere, Finland
| | - Kirsi Granberg
- Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
| | - Olli Lohi
- Tampere Center for Child Health Research, Tampere University and Tampere University Hospital, Tampere, Finland
| | - Ville Hautamäki
- School of Computing, University of Eastern Finland, Joensuu, Finland
| | - Matti Nykter
- Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
| | - Merja Heinäniemi
- Institute of Biomedicine, School of Medicine, University of Eastern Finland, Kuopio, Finland
| |
|
93
|
Čuklina J, Pedrioli PGA, Aebersold R. Review of Batch Effects Prevention, Diagnostics, and Correction Approaches. Methods Mol Biol 2020; 2051:373-387. [PMID: 31552638 DOI: 10.1007/978-1-4939-9744-2_16] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Indexed: 06/10/2023]
Abstract
Systematic technical variation in high-throughput studies consisting of the serial measurement of large sample cohorts is known as batch effects. Batch effects reduce the sensitivity of biological signal extraction and can cause significant artifacts. The systematic bias in the data caused by batch effects is more common in studies in which logistical considerations restrict the number of samples that can be prepared or profiled in a single experiment, thus necessitating the arrangement of subsets of study samples in batches. To mitigate the negative impact of batch effects, statistical approaches for batch correction are used at the stage of experimental design and data processing. Whereas in genomics batch effects and possible remedies have been extensively discussed, they are a relatively new challenge in proteomics because methods with sufficient throughput to systematically measure through large sample cohorts have only recently become available. Here we provide general recommendations to mitigate batch effects: we discuss the design of large-scale proteomic studies, review the most commonly used tools for batch effect correction and overview their application in proteomics.
Affiliation(s)
- Jelena Čuklina
- Department of Biology, Institute of Molecular Systems Biology, ETH Zürich, Zürich, Switzerland
- Ph.D. Program in Systems Biology, University of Zurich and ETH Zurich, Zürich, Switzerland
| | - Patrick G A Pedrioli
- Department of Biology, Institute of Molecular Systems Biology, ETH Zürich, Zürich, Switzerland
- ETH Zürich, PHRT-MS, Zürich, Switzerland
| | - Ruedi Aebersold
- Department of Biology, Institute of Molecular Systems Biology, ETH Zürich, Zürich, Switzerland.
- Faculty of Science, University of Zürich, Zürich, Switzerland.
| |
|
94
|
Machine learning and data mining frameworks for predicting drug response in cancer: An overview and a novel in silico screening process based on association rule mining. Pharmacol Ther 2019; 203:107395. [DOI: 10.1016/j.pharmthera.2019.107395] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Received: 05/08/2019] [Accepted: 07/11/2019] [Indexed: 12/20/2022]
|
95
|
Saurty-Seerunghen MS, Bellenger L, El-Habr EA, Delaunay V, Garnier D, Chneiweiss H, Antoniewski C, Morvan-Dubois G, Junier MP. Capture at the single cell level of metabolic modules distinguishing aggressive and indolent glioblastoma cells. Acta Neuropathol Commun 2019; 7:155. [PMID: 31619292 PMCID: PMC6796454 DOI: 10.1186/s40478-019-0819-y] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 08/06/2019] [Accepted: 09/29/2019] [Indexed: 02/01/2023] Open
Abstract
Glioblastoma cell ability to adapt their functioning to microenvironment changes is a source of the extensive intra-tumor heterogeneity characteristic of this devastating malignant brain tumor. A systemic view of the metabolic pathways underlying glioblastoma cell functioning states is lacking. We analyzed public single cell RNA-sequencing data from glioblastoma surgical resections, which offer the closest available view of tumor cell heterogeneity as encountered at the time of patients’ diagnosis. Unsupervised analyses revealed that information dispersed throughout the cell transcript repertoires encoded the identity of each tumor and masked information related to cell functioning states. Data reduction based on an experimentally-defined signature of transcription factors overcame this hurdle. It allowed cell grouping according to their tumorigenic potential, regardless of their tumor of origin. The approach relevance was validated using independent datasets of glioblastoma cell and tissue transcriptomes, patient-derived cell lines and orthotopic xenografts. Overexpression of genes coding for amino acid and lipid metabolism enzymes involved in anti-oxidative, energetic and cell membrane processes characterized cells with high tumorigenic potential. Modeling of their expression network highlighted the very long chain polyunsaturated fatty acid synthesis pathway at the core of the network. Expression of its most downstream enzymatic component, ELOVL2, was associated with worsened patient survival, and required for cell tumorigenic properties in vivo. Our results demonstrate the power of signature-driven analyses of single cell transcriptomes to obtain an integrated view of metabolic pathways at play within the heterogeneous cell landscape of patient tumors.
|
96
|
Schmidt F, List M, Cukuroglu E, Köhler S, Göke J, Schulz MH. An ontology-based method for assessing batch effect adjustment approaches in heterogeneous datasets. Bioinformatics 2019; 34:i908-i916. [PMID: 30423059 PMCID: PMC6129283 DOI: 10.1093/bioinformatics/bty553] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 12/04/2022] Open
Abstract
Motivation: International consortia such as the Genotype-Tissue Expression (GTEx) project, The Cancer Genome Atlas (TCGA) or the International Human Epigenetics Consortium (IHEC) have produced a wealth of genomic datasets with the goal of advancing our understanding of cell differentiation and disease mechanisms. However, utilizing all of these data effectively through integrative analysis is hampered by batch effects, large cell type heterogeneity and low replicate numbers. To study if batch effects across datasets can be observed and adjusted for, we analyze RNA-seq data of 215 samples from ENCODE, Roadmap, BLUEPRINT and DEEP as well as 1336 samples from GTEx and TCGA. While batch effects are a considerable issue, it is non-trivial to determine if batch adjustment leads to an improvement in data quality, especially in cases of low replicate numbers. Results: We present a novel method for assessing the performance of batch effect adjustment methods on heterogeneous data. Our method borrows information from the Cell Ontology to establish if batch adjustment leads to a better agreement between observed pairwise similarity and similarity of cell types inferred from the ontology. A comparison of state-of-the-art batch effect adjustment methods suggests that batch effects in heterogeneous datasets with low replicate numbers cannot be adequately adjusted. Better methods need to be developed, which can be assessed objectively in the framework presented here. Availability and implementation: Our method is available online at https://github.com/SchulzLab/OntologyEval. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Florian Schmidt
- Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbrücken, Germany.,Cluster of Excellence MMCI, Saarland University, Saarland Informatics Campus, Saarbrücken, Germany.,Graduate School of Computer Science, Saarland Informatics Campus, Saarbrücken, Germany.,Genome Institute of Singapore, Computational Genomics and Transcriptomics, Singapore
| | - Markus List
- Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbrücken, Germany.,Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
| | - Engin Cukuroglu
- Genome Institute of Singapore, Computational Genomics and Transcriptomics, Singapore
| | | | - Jonathan Göke
- Genome Institute of Singapore, Computational Genomics and Transcriptomics, Singapore
| | - Marcel H Schulz
- Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbrücken, Germany.,Cluster of Excellence MMCI, Saarland University, Saarland Informatics Campus, Saarbrücken, Germany.,Institute for Cardiovascular Regeneration, Goethe University, Frankfurt am Main, Germany.,German Center for Cardiovascular Research, Partner Site Rhein-Main, Frankfurt am Main, Germany
| |
|
97
|
Hackley RK, Schmid AK. Global Transcriptional Programs in Archaea Share Features with the Eukaryotic Environmental Stress Response. J Mol Biol 2019; 431:4147-4166. [PMID: 31437442 PMCID: PMC7419163 DOI: 10.1016/j.jmb.2019.07.029] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 03/22/2019] [Revised: 07/18/2019] [Accepted: 07/18/2019] [Indexed: 01/06/2023]
Abstract
The environmental stress response (ESR), a global transcriptional program originally identified in yeast, is characterized by a rapid and transient transcriptional response composed of large, oppositely regulated gene clusters. Genes induced during the ESR encode core components of stress tolerance, macromolecular repair, and maintenance of homeostasis. In this review, we investigate the possibility for conservation of the ESR across the eukaryotic and archaeal domains of life. We first re-analyze existing transcriptomics data sets to illustrate that a similar transcriptional response is identifiable in Halobacterium salinarum, an archaeal model organism. To substantiate the archaeal ESR, we calculated gene-by-gene correlations, gene function enrichment, and comparison of temporal dynamics. We note reported examples of variation in the ESR across fungi, then synthesize high-level trends present in expression data of other archaeal species. In particular, we emphasize the need for additional high-throughput time series expression data to further characterize stress-responsive transcriptional programs in the Archaea. Together, this review explores an open question regarding features of global transcriptional stress response programs shared across domains of life.
Affiliation(s)
- Rylee K Hackley
- Department of Biology, Duke University, Durham, NC 27708, USA; University Program in Genetics and Genomics, Duke University, Durham, NC 27708, USA
| | - Amy K Schmid
- Department of Biology, Duke University, Durham, NC 27708, USA; University Program in Genetics and Genomics, Duke University, Durham, NC 27708, USA; Center for Genomics and Computational Biology, Duke University, Durham, NC 27708, USA.
| |
|
98
|
Zhou L, Chi-Hau Sue A, Bin Goh WW. Examining the practical limits of batch effect-correction algorithms: When should you care about batch effects? J Genet Genomics 2019; 46:433-443. [PMID: 31611172 DOI: 10.1016/j.jgg.2019.08.002] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 05/11/2019] [Revised: 08/02/2019] [Accepted: 08/04/2019] [Indexed: 12/20/2022]
Abstract
Batch effects are technical sources of variation and can confound analysis. While many performance ranking exercises have been conducted to establish the best batch effect-correction algorithm (BECA), we hold the viewpoint that the notion of best is context-dependent. Moreover, alternative questions beyond the simplistic notion of "best" are also interesting: are BECAs robust against various degrees of confounding and if so, what is the limit? Using two different methods for simulating class (phenotype) and batch effects and taking various representative datasets across both genomics (RNA-Seq) and proteomics platforms, we demonstrate that under situations where sample classes and batch factors are moderately confounded, most BECAs are remarkably robust and only weakly affected by upstream normalization procedures. This observation is consistently supported across the multitude of test datasets. BECAs do have limits: When sample classes and batch factors are strongly confounded, BECA performance declines, with variable performance in precision, recall and also batch correction. We also report that while conventional normalization methods have minimal impact on batch effect correction, they do not affect downstream statistical feature selection, and in strongly confounded scenarios, may even outperform BECAs. In other words, removing batch effects is no guarantee of optimal functional analysis. Overall, this study suggests that simplistic performance ranking exercises are quite trivial, and all BECAs are compromises in some context or another.
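The effect of confounding between sample class and batch can be illustrated with a noise-free toy example. The sketch below is not one of the BECAs or simulation methods evaluated in the study; it assumes a deliberately naive correction (subtracting each batch's mean) and hypothetical effect sizes, purely to show why a strongly confounded design leaves a correction nothing to separate.

```python
from statistics import mean


def simulate(classes, batches, class_effect=2.0, batch_effect=5.0):
    """Noise-free toy feature: class signal plus an additive batch shift."""
    return [class_effect * c + batch_effect * b for c, b in zip(classes, batches)]


def center_by_batch(values, batches):
    """Naive correction: subtract each batch's mean from its own samples."""
    batch_means = {
        b: mean(v for v, vb in zip(values, batches) if vb == b)
        for b in set(batches)
    }
    return [v - batch_means[b] for v, b in zip(values, batches)]


def class_difference(values, classes):
    """Mean of class 1 minus mean of class 0."""
    ones = mean(v for v, c in zip(values, classes) if c == 1)
    zeros = mean(v for v, c in zip(values, classes) if c == 0)
    return ones - zeros


# Balanced design: every batch contains both classes.
classes_bal, batches_bal = [0, 1, 0, 1], [0, 0, 1, 1]
# Fully confounded design: batch membership equals class membership.
classes_conf, batches_conf = [0, 0, 1, 1], [0, 0, 1, 1]

diff_balanced = class_difference(
    center_by_batch(simulate(classes_bal, batches_bal), batches_bal), classes_bal
)
diff_confounded = class_difference(
    center_by_batch(simulate(classes_conf, batches_conf), batches_conf), classes_conf
)

print(diff_balanced)    # class signal survives the correction
print(diff_confounded)  # class signal is removed together with the batch shift
```

In the balanced design the class difference of 2.0 is preserved after centering, whereas in the fully confounded design it collapses to 0.0, because the batch mean and the class mean are the same quantity.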
Affiliation(s)
- Longjian Zhou
- School of Pharmaceutical Science and Technology, Tianjin University, Tianjin, 30072, China
| | - Andrew Chi-Hau Sue
- School of Pharmaceutical Science and Technology, Tianjin University, Tianjin, 30072, China
| | - Wilson Wen Bin Goh
- School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, 637551, Singapore.
| |
|
99
|
Zhou W, Koudijs KKM, Böhringer S. Influence of batch effect correction methods on drug induced differential gene expression profiles. BMC Bioinformatics 2019; 20:437. [PMID: 31438848 PMCID: PMC6706913 DOI: 10.1186/s12859-019-3028-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 05/10/2019] [Accepted: 08/13/2019] [Indexed: 01/17/2023] Open
Abstract
Background: Batch effects were not accounted for in most of the studies of computational drug repositioning based on gene expression signatures. It is unknown how batch effect removal methods impact the results of signature-based drug repositioning. Herein, we conducted differential analyses on the Connectivity Map (CMAP) database using several batch effect correction methods to evaluate the influence of batch effect correction methods on computational drug repositioning using microarray data and compare several batch effect correction methods. Results: Differences in average signature size were observed with different methods applied. The gene signatures identified by the Latent Effect Adjustment after Primary Projection (LEAPP) method and the methods fitted with Linear Models for Microarray Data (limma) software demonstrated little agreement. The external validity of the gene signatures was evaluated by connectivity mapping between the CMAP database and the Library of Integrated Network-based Cellular Signatures (LINCS) database. The results of connectivity mapping indicate that the genes identified were not reliable for drugs with total sample size (drug + control samples) smaller than 40, irrespective of the batch effect correction method applied. With total sample size larger than 40, the methods correcting for batch effects produced significantly better results than the method with no batch effect correction. In a simulation study, the power was generally low for simulated data with sample size smaller than 40. We observed best performance when using the limma method correcting for two principal components. Conclusion: Batch effect correction methods strongly impact differential gene expression analysis, and thus the downstream drug repositioning, when the sample size is large enough to contain sufficient information. We recommend including two or three principal components as covariates in fitting models with limma when sample size is sufficient (larger than 40 drug and controls combined). Electronic supplementary material: The online version of this article (10.1186/s12859-019-3028-6) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Wei Zhou
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands.,Department of Internal Medicine, Erasmus Medical Center, Rotterdam, The Netherlands.
| | - Karel K M Koudijs
- Department of Clinical Pharmacy & Toxicology, Leiden University Medical Center, Leiden, The Netherlands
| | - Stefan Böhringer
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
| |
|
100
|
Bruderer T, Gaisl T, Gaugg MT, Nowak N, Streckenbach B, Müller S, Moeller A, Kohler M, Zenobi R. On-Line Analysis of Exhaled Breath Focus Review. Chem Rev 2019; 119:10803-10828. [PMID: 31594311 DOI: 10.1021/acs.chemrev.9b00005] [Citation(s) in RCA: 122] [Impact Index Per Article: 20.3] [Indexed: 12/30/2022]
Abstract
On-line analysis of exhaled breath offers insight into a person's metabolism without the need for sample preparation or sample collection. Due to its noninvasive nature and the possibility to sample continuously, the analysis of breath has great clinical potential. The unique features of this technology make it an attractive candidate for applications in medicine, beyond the task of diagnosis. We review the current methodologies for on-line breath analysis, discuss current and future applications, and critically evaluate challenges and pitfalls such as the need for standardization. Special emphasis is given to the use of the technology in diagnosing respiratory diseases, potential niche applications, and the promise of breath analysis for personalized medicine. The analytical methodologies used range from very small and low-cost chemical sensors, which are ideal for continuous monitoring of disease status, to optical spectroscopy and state-of-the-art, high-resolution mass spectrometry. The latter can be utilized for untargeted analysis of exhaled breath, with the capability to identify hitherto unknown molecules. The interpretation of the resulting big data sets is complex and often constrained due to a limited number of participants. Even larger data sets will be needed for assessing reproducibility and for validation of biomarker candidates. In addition, molecular structures and quantification of compounds are generally not easily available from on-line measurements and require complementary measurements, for example, a separation method coupled to mass spectrometry. Furthermore, a lack of standardization still hampers the application of the technique to screen larger cohorts of patients. This review summarizes the present status and continuous improvements of the principal on-line breath analysis methods and evaluates obstacles for their wider application.
Affiliation(s)
- Tobias Bruderer
- Department of Chemistry and Applied Biosciences, Swiss Federal Institute of Technology, CH-8093 Zurich, Switzerland.,Division of Respiratory Medicine, University Children's Hospital Zurich and Children's Research Center Zurich, CH-8032 Zurich, Switzerland
| | - Thomas Gaisl
- Department of Pulmonology, University Hospital Zurich, CH-8091 Zurich, Switzerland.,Zurich Center for Interdisciplinary Sleep Research, University of Zurich, CH-8091 Zurich, Switzerland
| | - Martin T Gaugg
- Department of Chemistry and Applied Biosciences, Swiss Federal Institute of Technology, CH-8093 Zurich, Switzerland
| | - Nora Nowak
- Department of Chemistry and Applied Biosciences, Swiss Federal Institute of Technology, CH-8093 Zurich, Switzerland
| | - Bettina Streckenbach
- Department of Chemistry and Applied Biosciences, Swiss Federal Institute of Technology, CH-8093 Zurich, Switzerland
| | - Simona Müller
- Department of Chemistry and Applied Biosciences, Swiss Federal Institute of Technology, CH-8093 Zurich, Switzerland
| | - Alexander Moeller
- Division of Respiratory Medicine, University Children's Hospital Zurich and Children's Research Center Zurich, CH-8032 Zurich, Switzerland
| | - Malcolm Kohler
- Department of Pulmonology, University Hospital Zurich, CH-8091 Zurich, Switzerland.,Center for Integrative Human Physiology, University of Zurich, CH-8091 Zurich, Switzerland.,Zurich Center for Interdisciplinary Sleep Research, University of Zurich, CH-8091 Zurich, Switzerland
| | - Renato Zenobi
- Department of Chemistry and Applied Biosciences, Swiss Federal Institute of Technology, CH-8093 Zurich, Switzerland
| |
|