1
Predicting which patients with cancer will see a psychiatrist or counsellor from their initial oncology consultation document using natural language processing. Communications Medicine 2024; 4:69. [PMID: 38589545 PMCID: PMC11001970 DOI: 10.1038/s43856-024-00495-x] [Received: 12/23/2023] [Accepted: 03/28/2024] [Indexed: 04/10/2024] Open
Abstract
BACKGROUND Patients with cancer often have unmet psychosocial needs. Early detection of who requires referral to a counsellor or psychiatrist may improve their care. This work used natural language processing to predict which patients will see a counsellor or psychiatrist from a patient's initial oncology consultation document. We believe this is the first use of artificial intelligence to predict psychiatric outcomes from non-psychiatric medical documents. METHODS This retrospective prognostic study used data from 47,625 patients at BC Cancer. We analyzed initial oncology consultation documents using traditional and neural language models to predict whether patients would see a counsellor or psychiatrist in the 12 months following their initial oncology consultation. RESULTS Here, we show our best models achieved a balanced accuracy (receiver-operating-characteristic area-under-curve) of 73.1% (0.824) for predicting seeing a psychiatrist, and 71.0% (0.784) for seeing a counsellor. Different words and phrases are important for predicting each outcome. CONCLUSION These results suggest natural language processing can be used to predict psychosocial needs of patients with cancer from their initial oncology consultation document. Future research could extend this work to predict the psychosocial needs of medical patients in other settings.
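Note on the reported metrics: balanced accuracy is the mean of sensitivity and specificity, and the receiver-operating-characteristic area-under-curve (AUC) equals the probability that a randomly chosen positive case is scored above a randomly chosen negative one. The Python sketch below illustrates both on invented toy data. It is not the authors' code; the function names and the 0.5 decision threshold are assumptions made for illustration.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    return 0.5 * (tp / pos + tn / neg)


def roc_auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.4, 0.6, 0.1, 0.8]
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off

print(balanced_accuracy(labels, preds))
print(roc_auc(labels, scores))
```

Because balanced accuracy depends on a chosen cut-off while AUC does not, the two numbers can diverge, which may explain why the abstract reports them side by side.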
2
Assessing privacy leakage in synthetic 3-D PET imaging using transversal GAN. Computer Methods and Programs in Biomedicine 2024; 243:107910. [PMID: 37976611 DOI: 10.1016/j.cmpb.2023.107910] [Received: 12/06/2022] [Revised: 10/27/2023] [Accepted: 10/31/2023] [Indexed: 11/19/2023]
Abstract
BACKGROUND AND OBJECTIVE Training computer-vision related algorithms on medical images for disease diagnosis or image segmentation is difficult in large part due to privacy concerns. For this reason, generative image models are highly sought after to facilitate data sharing. However, 3-D generative models are understudied, and investigation of their privacy leakage is needed. METHODS We introduce our 3-D generative model, Transversal GAN (TrGAN), using head & neck PET images which are conditioned on tumor masks as a case study. We define quantitative measures of image fidelity and utility, and propose a novel framework for evaluating privacy-utility trade-off through membership inference attack. These metrics are evaluated in the course of training to identify ideal fidelity, utility and privacy trade-offs and establish the relationships between these parameters. RESULTS We show that the discriminator of the TrGAN is vulnerable to attack, and that an attacker can identify which samples were used in training with almost perfect accuracy (AUC = 0.99). We also show that an attacker with access to only the generator cannot reliably classify whether a sample had been used for training (AUC = 0.51). We also propose and demonstrate a general decision procedure for any deep learning based generative model, which allows the user to quantify and evaluate the decision trade-off between downstream utility and privacy protection. CONCLUSIONS TrGAN can generate 3-D medical images that retain important image features and statistical properties of the training data set, with minimal privacy loss as determined by a membership inference attack. Our utility-privacy decision procedure may be beneficial to researchers who wish to share data or lack a sufficient number of large labeled image datasets.
3
Regulatory T Cell Biomarkers Identify Patients at Risk of Developing Acute Cellular Rejection in the First Year Following Heart Transplantation. Transplantation 2023; 107:1810-1819. [PMID: 37365692 DOI: 10.1097/tp.0000000000004607] [Indexed: 06/28/2023]
Abstract
BACKGROUND Acute cellular rejection (ACR), an alloimmune response involving CD4+ and CD8+ T cells, occurs in up to 20% of patients within the first year following heart transplantation. The balance between a conventional versus regulatory CD4+ T cell alloimmune response is believed to contribute to developing ACR. Therefore, tracking these cells may elucidate whether changes in these cell populations could signal ACR risk. METHODS We used a CD4+ T cell gene signature (TGS) panel that tracks CD4+ conventional T cells (Tconv) and regulatory T cells (Treg) on longitudinal samples from 94 adult heart transplant recipients. We evaluated combined diagnostic performance of the TGS panel with a previously developed biomarker panel for ACR diagnosis, HEARTBiT, while also investigating TGS' prognostic utility. RESULTS Compared with nonrejection samples, rejection samples showed decreased Treg- and increased Tconv-gene expression. The TGS panel was able to discriminate between ACR and nonrejection samples and, when combined with HEARTBiT, showed improved specificity compared with either model alone. Furthermore, the increased risk of ACR in the TGS model was associated with lower expression of Treg genes in patients who later developed ACR. Reduced Treg gene expression was positively associated with younger recipient age and higher intrapatient tacrolimus variability. CONCLUSIONS We demonstrated that expression of genes associated with CD4+ Tconv and Treg could identify patients at risk of ACR. In our post hoc analysis, complementing HEARTBiT with TGS resulted in an improved classification of ACR. Our study suggests that HEARTBiT and TGS may serve as useful tools for further research and test development.
4
Microbial dysbiosis and the host airway epithelial response: insights into HIV-associated COPD using multi'omics profiling. Respir Res 2023; 24:124. [PMID: 37143066 PMCID: PMC10161506 DOI: 10.1186/s12931-023-02431-4] [Received: 12/03/2022] [Accepted: 04/21/2023] [Indexed: 05/06/2023] Open
Abstract
BACKGROUND People living with HIV (PLWH) are at increased risk of developing Chronic Obstructive Pulmonary Disease (COPD) independent of cigarette smoking. We hypothesized that dysbiosis in PLWH is associated with epigenetic and transcriptomic disruptions in the airway epithelium. METHODS Airway epithelial brushings were collected from 18 COPD+HIV+, 16 COPD-HIV+, 22 COPD+HIV-, and 20 COPD-HIV- subjects. The microbiome, methylome, and transcriptome were profiled using 16S sequencing, Illumina Infinium Methylation EPIC chip, and RNA sequencing, respectively. Multi-'omic integration was performed using Data Integration Analysis for Biomarker discovery using Latent cOmponents. A correlation > 0.7 was used to identify key interactions between the 'omes. RESULTS The COPD+HIV-, COPD-HIV+, and COPD+HIV+ groups had reduced Shannon Diversity (p = 0.004, p = 0.023, and p = 5.5e-06, respectively) compared to individuals with neither COPD nor HIV, with the COPD+HIV+ group demonstrating the most reduced diversity. Microbial communities were significantly different between the four groups (p = 0.001). Multi-'omic integration identified correlations between Bacteroidetes Prevotella, genes FUZ, FASTKD3, and ACVR1B, and epigenetic features CpG-FUZ and CpG-PHLDB3. CONCLUSION PLWH with COPD manifest decreased diversity and altered microbial communities in their airway epithelial microbiome. The reduction in Prevotella in this group was linked with epigenetic and transcriptomic disruptions in host genes including FUZ, FASTKD3, and ACVR1B.
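Note on Shannon diversity: the index is commonly computed as H = -Σ p_i ln p_i over the relative abundances p_i of the taxa in a sample, so a community dominated by one taxon scores lower than an even one. The Python sketch below is a minimal illustration on invented counts. The use of the natural logarithm and the function name are assumptions; the abstract does not state which implementation or log base the authors used.

```python
import math


def shannon_diversity(counts):
    """Shannon index H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Invented read counts per taxon, for illustration only.
even_sample = [25, 25, 25, 25]   # four equally abundant taxa
skewed_sample = [85, 5, 5, 5]    # one dominant taxon

print(shannon_diversity(even_sample))    # equals ln(4) for a perfectly even sample
print(shannon_diversity(skewed_sample))  # lower, reflecting reduced diversity
```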
5
Predicting the Survival of Patients With Cancer From Their Initial Oncology Consultation Document Using Natural Language Processing. JAMA Netw Open 2023; 6:e230813. [PMID: 36848085 PMCID: PMC9972192 DOI: 10.1001/jamanetworkopen.2023.0813] [Indexed: 03/01/2023] Open
Abstract
IMPORTANCE Predicting short- and long-term survival of patients with cancer may improve their care. Prior predictive models either use data with limited availability or predict the outcome of only 1 type of cancer. OBJECTIVE To investigate whether natural language processing can predict survival of patients with general cancer from a patient's initial oncologist consultation document. DESIGN, SETTING, AND PARTICIPANTS This retrospective prognostic study used data from 47 625 of 59 800 patients who started cancer care at any of the 6 BC Cancer sites located in the province of British Columbia between April 1, 2011, and December 31, 2016. Mortality data were updated until April 6, 2022, and data were analyzed from update until September 30, 2022. All patients with a medical or radiation oncologist consultation document generated within 180 days of diagnosis were included; patients seen for multiple cancers were excluded. EXPOSURES Initial oncologist consultation documents were analyzed using traditional and neural language models. MAIN OUTCOMES AND MEASURES The primary outcome was the performance of the predictive models, including balanced accuracy and receiver operating characteristics area under the curve (AUC). The secondary outcome was investigating what words the models used. RESULTS Of the 47 625 patients in the sample, 25 428 (53.4%) were female and 22 197 (46.6%) were male, with a mean (SD) age of 64.9 (13.7) years. A total of 41 447 patients (87.0%) survived 6 months, 31 143 (65.4%) survived 36 months, and 27 880 (58.5%) survived 60 months, calculated from their initial oncologist consultation. The best models achieved a balanced accuracy of 0.856 (AUC, 0.928) for predicting 6-month survival, 0.842 (AUC, 0.918) for 36-month survival, and 0.837 (AUC, 0.918) for 60-month survival, on a holdout test set. Differences in what words were important for predicting 6- vs 60-month survival were found. 
CONCLUSIONS AND RELEVANCE These findings suggest that models performed comparably with or better than previous models predicting cancer survival and that they may be able to predict survival using readily available data without focusing on 1 cancer type.
6
The molecular and cellular mechanisms associated with the destruction of terminal bronchioles in COPD. Eur Respir J 2022; 59:2101411. [PMID: 34675046 DOI: 10.1183/13993003.01411-2021] [Received: 12/22/2020] [Accepted: 09/27/2021] [Indexed: 11/05/2022]
Abstract
RATIONALE Peripheral airway obstruction is a key feature of chronic obstructive pulmonary disease (COPD), but the mechanisms of airway loss are unknown. This study aims to identify the molecular and cellular mechanisms associated with peripheral airway obstruction in COPD. METHODS Ten explanted lung specimens donated by patients with very severe COPD treated by lung transplantation and five unused donor control lungs were sampled using systematic uniform random sampling (SURS), resulting in 240 samples. These samples were further examined by micro-computed tomography (CT), quantitative histology and gene expression profiling. RESULTS Micro-CT analysis showed that the loss of terminal bronchioles in COPD occurs in regions of microscopic emphysematous destruction with an average airspace size of ≥500 and <1000 µm, which we have termed a "hot spot". Based on microarray gene expression profiling, the hot spot was associated with an 11-gene signature, with upregulation of pro-inflammatory genes and downregulation of inhibitory immune checkpoint genes, indicating immune response activation. Results from both quantitative histology and the bioinformatics computational tool CIBERSORT, which predicts the percentage of immune cells in tissues from transcriptomic data, showed that the hot spot regions were associated with increased infiltration of CD4 and CD8 T-cell and B-cell lymphocytes. INTERPRETATION The reduction in terminal bronchioles observed in lungs from patients with COPD occurs in a hot spot of microscopic emphysema, where there is upregulation of IFNG signalling, co-stimulatory immune checkpoint genes and genes related to the inflammasome pathway, and increased infiltration of immune cells. These could be potential targets for therapeutic interventions in COPD.
7
Mian: interactive web-based microbiome data table visualization and machine learning platform. Bioinformatics 2022; 38:1176-1178. [PMID: 34788784 DOI: 10.1093/bioinformatics/btab754] [Received: 06/11/2021] [Revised: 09/21/2021] [Accepted: 11/03/2021] [Indexed: 02/03/2023] Open
Abstract
SUMMARY Mian is a web application to interactively visualize, run statistical tools and train machine learning models on operational taxonomic unit (OTU) or amplicon sequence variant (ASV) datasets to identify key taxonomic groups, diversity trends or taxonomic composition shifts in the context of provided categorical or numerical sample metadata. Tools, including Fisher's exact test, Boruta feature selection, alpha and beta diversity, and random forest and deep neural network classifiers, facilitate open-ended data exploration and hypothesis generation on microbial datasets. AVAILABILITY Mian is freely available at: miandata.org. Mian is an open-source platform licensed under the MIT license with source code available at github.com/tbj128/mian. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
8
The transition from normal lung anatomy to minimal and established fibrosis in idiopathic pulmonary fibrosis (IPF). EBioMedicine 2021; 66:103325. [PMID: 33862585 PMCID: PMC8054143 DOI: 10.1016/j.ebiom.2021.103325] [Received: 01/16/2021] [Revised: 03/12/2021] [Accepted: 03/19/2021] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND The transition from normal lung anatomy to minimal and established fibrosis is an important feature of the pathology of idiopathic pulmonary fibrosis (IPF). The purpose of this report is to examine the molecular and cellular mechanisms associated with this transition. METHODS Pre-operative thoracic Multidetector Computed Tomography (MDCT) scans of patients with severe IPF (n = 9) were used to identify regions of minimal (n = 27) and established fibrosis (n = 27). MDCT, Micro-CT, quantitative histology, and next-generation sequencing were used to compare 24 samples from donor controls (n = 4) to minimal and established fibrosis samples. FINDINGS The present results extended earlier reports about the transition from normal lung anatomy to minimal and established fibrosis by showing that there are activations of TGFBI, T cell co-stimulatory genes, and the down-regulation of inhibitory immune-checkpoint genes compared to controls. The expression patterns of these genes indicated activation of a field immune response, which is further supported by the increased infiltration of inflammatory immune cells dominated by lymphocytes that are capable of forming lymphoid follicles. Moreover, fibrosis pathways, mucin secretion, surfactant, TLRs, and cytokine storm-related genes also participate in the transitions from normal lung anatomy to minimal and established fibrosis. INTERPRETATION The transition from normal lung anatomy to minimal and established fibrosis is associated with genes that are involved in the tissue repair processes, the activation of immune responses as well as the increased infiltration of CD4, CD8, B cell lymphocytes, and macrophages. These molecular and cellular events correlate with the development of structural abnormality of IPF and probably contribute to its pathogenesis.
9
Analytical Validation of HEARTBiT: A Blood-Based Multiplex Gene Expression Profiling Assay for Exclusionary Diagnosis of Acute Cellular Rejection in Heart Transplant Patients. Clin Chem 2021; 66:1063-1071. [PMID: 32705124 DOI: 10.1093/clinchem/hvaa123] [Received: 12/04/2019] [Accepted: 05/15/2020] [Indexed: 12/23/2022]
Abstract
BACKGROUND HEARTBiT is a whole blood-based gene profiling assay using the nucleic acid counting NanoString technology for the exclusionary diagnosis of acute cellular rejection in heart transplant patients. The HEARTBiT score measures the risk of acute cellular rejection in the first year following heart transplant, distinguishing patients with stable grafts from those at risk for acute cellular rejection. Here, we provide the analytical performance characteristics of the HEARTBiT assay and the results on pilot clinical validation. METHODS We used purified RNA collected from PAXgene blood samples to evaluate the characteristics of a 12-gene panel HEARTBiT assay, for its linearity range, quantitative bias, precision, and reproducibility. These parameters were estimated either from serial dilutions of individual samples or from repeated runs on pooled samples. RESULTS We found that all 12 genes showed linear behavior within the recommended assay input range of 125 ng to 500 ng of purified RNA, with most genes showing 3% or lower quantitative bias and around 5% coefficient of variation. Total variation resulting from unique operators, reagent lots, and runs was less than 0.02 units standard deviation (SD). The performance of the analytically validated assay (AUC = 0.75) was equivalent to what we observed in the signature development dataset. CONCLUSION The analytical performance of the assay within the specification input range demonstrated reliable quantification of the HEARTBiT score within 0.02 SD units, measured on a 0 to 1 unit scale. This assay may therefore be of high utility in clinical validation of HEARTBiT in future biomarker observational trials.
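Note on precision: the coefficient of variation (CV) quoted above is the standard deviation of repeated measurements divided by their mean, usually expressed as a percentage. The Python sketch below shows the calculation on invented replicate counts. It is an illustration only; the use of the sample standard deviation and the function name are assumptions, not details taken from the assay's validation protocol.

```python
import statistics


def coefficient_of_variation(values):
    """Percent CV: sample standard deviation divided by the mean, times 100."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Invented replicate counts for one gene across repeated runs.
replicate_counts = [100.0, 105.0, 95.0]

print(coefficient_of_variation(replicate_counts))
```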
10
ProbeRating: a recommender system to infer binding profiles for nucleic acid-binding proteins. Bioinformatics 2021; 36:4797-4804. [PMID: 32573679 PMCID: PMC7750938 DOI: 10.1093/bioinformatics/btaa580] [Received: 12/12/2019] [Revised: 05/18/2020] [Accepted: 06/18/2020] [Indexed: 12/15/2022] Open
Abstract
MOTIVATION The interaction between proteins and nucleic acids plays a crucial role in gene regulation and cell function. Determining the binding preferences of nucleic acid-binding proteins (NBPs), namely RNA-binding proteins (RBPs) and transcription factors (TFs), is the key to deciphering the protein-nucleic acid interaction code. Today, available NBP binding data from in vivo or in vitro experiments are still limited, which leaves a large portion of NBPs uncovered. Unfortunately, existing computational methods that model NBP binding preferences are mostly protein specific: they need experimental data for the specific protein of interest, and thus focus only on experimentally characterized NBPs. The binding preferences of experimentally unexplored NBPs remain largely unknown. RESULTS Here, we introduce ProbeRating, a nucleic acid recommender system that utilizes techniques from deep learning and word embeddings of natural language processing. ProbeRating is developed to predict binding profiles for unexplored or poorly studied NBPs by exploiting their homologous NBPs that currently have available binding data. Requiring only sequence information as input, ProbeRating adapts FastText from Facebook AI Research to extract biological features. It then builds a neural network-based recommender system. We evaluate the performance of ProbeRating on two different tasks: one for RBP and one for TF. ProbeRating outperforms previous methods on both tasks. The results show that ProbeRating can be a useful tool to study the binding mechanism for the many NBPs that lack direct experimental evidence. AVAILABILITY AND IMPLEMENTATION The source code is freely available at <https://github.com/syang11/ProbeRating>. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
11
HEARTBiT: A Transcriptomic Signature for Excluding Acute Cellular Rejection in Adult Heart Allograft Patients. Can J Cardiol 2019; 36:1217-1227. [PMID: 32553820 DOI: 10.1016/j.cjca.2019.11.017] [Received: 07/05/2019] [Revised: 10/30/2019] [Accepted: 11/07/2019] [Indexed: 01/06/2023] Open
Abstract
BACKGROUND Nine mRNA transcripts associated with acute cellular rejection (ACR) in previous microarray studies were ported to the clinically amenable NanoString nCounter platform. Here we report the diagnostic performance of the resulting blood test to exclude ACR in heart allograft recipients: HEARTBiT. METHODS Blood samples for transcriptomic profiling were collected during routine post-transplantation monitoring in 8 Canadian transplant centres participating in the Biomarkers in Transplantation initiative, a large (n = 1622) prospective observational study conducted between 2009 and 2014. All adult cardiac transplant patients were invited to participate (median age = 56 [17 to 71]). The reference standard for rejection status was histopathology grading of tissue from endomyocardial biopsy (EMB). All locally graded ISHLT ≥ 2R rejection samples were selected for analysis (n = 36). ISHLT 1R (n = 38) and 0R (n = 86) samples were randomly selected to create a cohort approximately matched for site, age, sex, and days post-transplantation, with a focus on early time points (median days post-transplant = 42 [7 to 506]). RESULTS ISHLT ≥ 2R rejection was confirmed by EMB in 18 and excluded in 92 samples in the test set. HEARTBiT achieved 47% specificity (95% confidence interval [CI], 36%-57%) given ≥ 90% sensitivity, with a corresponding area under the receiver operating characteristic curve of 0.69 (95% CI, 0.56-0.81). CONCLUSIONS HEARTBiT's diagnostic performance compares favourably to the only currently approved minimally invasive diagnostic test to rule out ACR, AlloMap (CareDx, Brisbane, CA) and may be used to inform care decisions in the first 2 months post-transplantation, when AlloMap is not approved, and most ACR episodes occur.
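Note on "specificity given ≥ 90% sensitivity": for a rule-out test, the decision threshold is typically chosen so that sensitivity stays at or above a fixed floor, and specificity is then read off at that threshold. The Python sketch below illustrates one way to do this on invented scores. It is a hypothetical illustration, not the procedure used in the study; the function name, the "score at or above threshold means rejection" convention, and the toy data are all assumptions.

```python
def specificity_at_sensitivity(y_true, scores, min_sensitivity=0.90):
    """Return (threshold, sensitivity, specificity) for the highest threshold
    whose sensitivity is at least min_sensitivity.

    A sample is called positive when its score is >= threshold.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # Scan candidate thresholds from high to low; specificity can only fall
    # as the threshold drops, so the first qualifying threshold is the best.
    for thr in sorted(set(scores), reverse=True):
        sens = sum(s >= thr for s in pos) / len(pos)
        if sens >= min_sensitivity:
            spec = sum(s < thr for s in neg) / len(neg)
            return thr, sens, spec
    return None


# Invented scores: 10 rejection samples (label 1), 5 non-rejection (label 0).
labels = [1] * 10 + [0] * 5
scores = [0.9, 0.8, 0.85, 0.7, 0.75, 0.6, 0.65, 0.55, 0.5, 0.2,
          0.1, 0.3, 0.52, 0.4, 0.7]

threshold, sensitivity, specificity = specificity_at_sensitivity(labels, scores)
print(threshold, sensitivity, specificity)
```

Fixing sensitivity first reflects the rule-out use case: missed rejections are the costly error, so specificity is whatever remains once that constraint is met.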
12
Effect of short-term oral prednisone therapy on blood gene expression: a randomised controlled clinical trial. Respir Res 2019; 20:176. [PMID: 31382977 PMCID: PMC6683462 DOI: 10.1186/s12931-019-1147-2] [Received: 05/27/2019] [Accepted: 07/28/2019] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Effects of systemic corticosteroids on blood gene expression are largely unknown. This study determined gene expression signature associated with short-term oral prednisone therapy in patients with chronic obstructive pulmonary disease (COPD) and its relationship to 1-year mortality following an acute exacerbation of COPD (AECOPD). METHODS Gene expression in whole blood was profiled using the Affymetrix Human Gene 1.1 ST microarray chips from two cohorts: 1) a prednisone cohort with 37 stable COPD patients randomly assigned to prednisone 30 mg/d + standard therapy for 4 days or standard therapy alone and 2) the Rapid Transition Program (RTP) cohort with 218 COPD patients who experienced AECOPD and were treated with systemic corticosteroids. All gene expression data were adjusted for the total number of white blood cells and their differential cell counts. RESULTS In the prednisone cohort, 51 genes were differentially expressed between prednisone and standard therapy group at a false discovery rate of < 0.05. The top 3 genes with the largest fold-changes were KLRF1, GZMH and ADGRG1; and 21 genes were significantly enriched in immune system pathways including the natural killer cell mediated cytotoxicity. In the RTP cohort, 27 patients (12.4%) died within 1 year after hospitalisation of AECOPD; 32 of 51 genes differentially expressed in the prednisone cohort significantly changed from AECOPD to the convalescent state and were enriched in similar cellular immune pathways to that in the prednisone cohort. Of these, 10 genes including CX3CR1, KLRD1, S1PR5 and PRF1 were significantly associated with 1-year mortality. CONCLUSIONS Short-term daily prednisone therapy produces a distinct blood gene signature that may be used to determine and monitor treatment responses to prednisone in COPD patients during AECOPD. 
TRIAL REGISTRATION The prednisone cohort was registered at ClinicalTrials.gov (NCT02534402) and the RTP cohort was registered at ClinicalTrials.gov (NCT02050022).
13
Development and Validation of Apolipoprotein AI-Associated Lipoprotein Proteome Panel for the Prediction of Cholesterol Efflux Capacity and Coronary Artery Disease. Clin Chem 2019; 65:282-290. [DOI: 10.1373/clinchem.2018.291922] [Received: 05/11/2018] [Accepted: 10/23/2018] [Indexed: 11/06/2022]
Abstract
BACKGROUND
Cholesterol efflux capacity (CEC) is a measure of HDL function that, in cell-based studies, has demonstrated an inverse association with cardiovascular disease. The cell-based measure of CEC is complex and low-throughput. We hypothesized that assessment of the lipoprotein proteome would allow for precise, high-throughput CEC prediction.
METHODS
After isolating lipoprotein particles from serum, we used LC-MS/MS to quantify 21 lipoprotein-associated proteins. A bioinformatic pipeline was used to identify proteins with univariate correlation to cell-based CEC measurements and generate a multivariate algorithm for CEC prediction (pCE). Using logistic regression, protein coefficients in the pCE model were reweighted to yield a new algorithm predicting coronary artery disease (pCAD).
RESULTS
Discovery using targeted LC-MS/MS analysis of 105 training and test samples yielded a pCE model comprising 5 proteins (Spearman r = 0.86). Evaluation of pCE in a case–control study of 231 specimens from healthy individuals and patients with coronary artery disease revealed lower pCE in cases (P = 0.03). Derived within this same study, the pCAD model significantly improved classification (P < 0.0001). Following analytical validation of the multiplexed proteomic method, we conducted a case–control study of myocardial infarction in 137 postmenopausal women that confirmed significant separation of specimen cohorts in both the pCE (P = 0.015) and pCAD (P = 0.001) models.
CONCLUSIONS
Development of a proteomic pCE provides a reproducible high-throughput alternative to traditional cell-based CEC assays. The pCAD model improves stratification of case and control cohorts and, with further studies to establish clinical validity, presents a new opportunity for the assessment of cardiovascular health.
14
Phenotyping and outcomes of hospitalized COPD patients using rapid molecular diagnostics on sputum samples. Int J Chron Obstruct Pulmon Dis 2019; 14:311-319. [PMID: 30774328 PMCID: PMC6350828 DOI: 10.2147/copd.s188186] [Indexed: 11/27/2022] Open
Abstract
Background Etiologies of acute exacerbations of chronic obstructive pulmonary disease (AECOPD) are heterogeneous. We phenotyped severe AECOPD based on molecular pathogen detection of sputum samples collected at hospitalization of COPD patients and determined their outcomes. Methods We phenotyped 72 sputum samples of COPD patients who were hospitalized with a primary diagnosis of AECOPD using a molecular array that detected common bacterial and viral respiratory pathogens. Based on these results, the patients were classified into positive or negative pathogen groups. The pathogen-positive group was further divided into virus or bacteria subgroups. Admission day 1 blood samples were assayed for N-terminal prohormone brain natriuretic peptide, CRP, and complete blood counts. Results A total of 52 patients had a positive result on the array, while 20 patients had no pathogens detected. The most common bacterial pathogen detected was Haemophilus influenzae and the most common virus was rhinovirus. The pathogen-negative group had the worst outcomes, with longer hospital stays (median 6.5 vs 5 days for the bacteria-positive group, P=0.02) and a trend toward increased 1-year mortality (P=0.052). The bacteria-positive group had the best prognosis, whereas the virus-positive group had outcomes between those of the bacteria-positive and pathogen-negative groups. Conclusion Molecular diagnostics on sputum can rapidly phenotype serious AECOPD into bacteria-positive, virus-positive, or pathogen-negative groups. The bacteria-positive group appears to have the best prognosis, while the pathogen-negative group has the worst. These data suggest that AECOPD is a heterogeneous event and that accurate phenotyping of AECOPD may lead to novel management strategies that are personalized and more precise.
15
Abstract A07: Alterations in G2/M phase associated transcriptional networks highlight lung cancer predisposition in COPD patients. Clin Cancer Res 2018. [DOI: 10.1158/1557-3265.aacriaslc18-a07] [Indexed: 11/16/2022]
Abstract
Background: Patients with chronic obstructive pulmonary disease (COPD) are at increased risk of developing lung cancer. COPD, clinically defined by reduced lung function measurements, is characterized by chronic airway inflammation, remodeling and loss as well as destruction of alveoli (emphysema). While this disease is an important lung cancer risk factor independent of smoking, the molecular progression from COPD to lung cancer tumourigenesis is relatively understudied.
Method: We first analyzed small-airway epithelial gene expression profiles obtained from bronchial brushings from 127 COPD and 140 non-COPD ever-smoker patients. We performed weighted gene correlation network analysis (WGCNA) on these gene expression profiles to discover deregulated gene modules (“metagenes”) associated with reduced lung function (Forced Expiratory Volume at 1 second, FEV-1)—a clinical measure of COPD severity most robustly negatively correlated with lung cancer risk. We then assessed the preservation of these modules in two non-small cell lung cancer (NSCLC) tumor/normal data sets (lung adenocarcinoma (LUAD) and squamous cell carcinoma (LUSC), n= 887 tumors total) to examine the molecular overlap between COPD and lung cancer. Airway and tumor patient cohorts were matched for age, gender, tumor stage, and smoking status.
Result: We discovered 10 distinct small-airway gene expression modules, two of which were significantly negatively correlated (p < 0.05) with patient FEV-1. One of these FEV-1 modules was the top overall module preserved in both NSCLC subtypes. This lung cancer-FEV-1 module contained 31 genes solely enriched for two related mitotic functions—G2/M phase transition (BH-p = 0.02) and mitotic roles of polo-like kinase (BH-p = 0.001, n=31). Of these, 28 genes were significantly overexpressed in both LUAD and LUSC, and mapped to a highly clustered sub-network of 23 proteins with 465 known and in silico-predicted protein-protein interactions. When tumors enriched for this lung cancer-FEV-1 gene signature were further examined, we observed a significant co-occurrence of DNA-level alterations in DNA damage-associated checkpoints, specifically mutated TP53.
Conclusion: Coordinated gene expression changes associated with COPD severity measures in small airways and preserved in NSCLC tumors are enriched for G2/M phase transition genes. These genes are further disrupted in tumors, where co-occurring mutations to gatekeeper genes are present. Progression of mitosis during abnormal aneuploidy in lung tissues of COPD patients may confer increased risk of oncogenic transformation in this population, and may underlie the molecular progression from COPD to lung cancer.
Citation Format: Erin A. Marshall, Emily A. Vucic, Victor D. Martinez, Raymond T. Ng, Wan L. Lam. Alterations in G2/M phase associated transcriptional networks highlight lung cancer predisposition in COPD patients [abstract]. In: Proceedings of the Fifth AACR-IASLC International Joint Conference: Lung Cancer Translational Science from the Bench to the Clinic; Jan 8-11, 2018; San Diego, CA. Philadelphia (PA): AACR; Clin Cancer Res 2018;24(17_Suppl):Abstract nr A07.
|
16
|
Abstract
Background Characterizing the binding preference of RNA-binding proteins (RBP) is essential for us to understand the interaction between an RBP and its RNA targets, and to decipher the mechanism of post-transcriptional regulation. Experimental methods have been used to generate protein-RNA binding data for a number of RBPs in vivo and in vitro. Utilizing the binding data, a couple of computational methods have been developed to detect the RNA sequence or structure preferences of the RBPs. However, the majority of RBPs have not yet been experimentally characterized and lack RNA binding data. For these poorly studied RBPs, the identification of their binding preferences cannot be performed by most existing computational methods because the experimental binding data are prerequisite to these methods. Results Here we propose a new method based on co-evolution to predict the sequence preferences for the poorly studied RBPs, waiving the requirement of their binding data. First, we demonstrate the co-evolutionary relationship between RBPs and their RNA partners. We then present a K-nearest neighbors (KNN) based algorithm to infer the sequence preference of an RBP using only the preference information from its homologous RBPs. By benchmarking against several in vitro and in vivo datasets, our proposed method outperforms the existing alternative which uses the closest neighbor’s preference on all the datasets. Moreover, it shows comparable performance with two state-of-the-art methods that require the presence of the experimental binding data. Finally, we demonstrate the usage of this method to infer sequence preferences for novel proteins which have no binding preference information available. Conclusion For a poorly studied RBP, the current methods used to determine its binding preference need experimental data, which is expensive and time consuming. Therefore, determining RBP’s preference is not practical in many situations. 
This study provides an economical solution for inferring the sequence preference of such a protein based on co-evolution. The source code and related datasets are available at https://github.com/syang11/KNN. Electronic supplementary material The online version of this article (10.1186/s12859-018-2091-8) contains supplementary material, which is available to authorized users.
|
17
|
Abstract
Rationale Acute exacerbations of chronic obstructive pulmonary disease (AECOPD) are caused by a variety of different etiologic agents. Our aim was to phenotype COPD exacerbations using imaging (chest X-ray [CXR] and computed tomography [CT]) and to determine the possible role of the blood tests (C-reactive protein [CRP], the N-terminal prohormone brain natriuretic peptide [NT-proBNP]) as diagnostic biomarkers. Materials and methods Subjects who were hospitalized with a primary diagnosis of AECOPD and who had had CXRs, CT scans, and blood collection for CRP and NT-proBNP were assessed in this study. Radiologist blinded to the clinical and laboratory characteristics of the subjects interpreted their CXRs and CT images. ANOVA and Spearman’s correlation were performed to test for associations between these imaging parameters and the blood-based biomarkers NT-proBNP and CRP; logistic regression models were used to assess the performance of these biomarkers in predicting the radiological parameters. Results A total of 309 subjects were examined for this study. Subjects had a mean age of 65.6±11.1 years, 66.7% of them were males, and 62.4% were current smokers, with a mean FEV1 54.4%±21.5% of predicted. Blood NT-proBNP concentrations were associated with cardiac enlargement (area under the curve [AUC] =0.72, P<0.001), pulmonary edema (AUC =0.63, P=0.009), and pleural effusion on CXR (AUC =0.64, P=0.01); whereas on CT images, NT-proBNP concentrations were associated with pleural effusion (AUC =0.71, P=0.002). Serum CRP concentrations, on the other hand, were associated with consolidation on CT images (AUC =0.75, P<0.001), ground glass opacities (AUC =0.64, P=0.028), and pleural effusion (AUC =0.72, P<0.001) on CT images. A serum CRP sensitivity-oriented cutoff point of 11.5 mg/L was selected for the presence of consolidation on CT images in subjects admitted as cases of AECOPD, which has a sensitivity of 91% and a specificity of 53% (P<0.001). 
Conclusion Elevated CRP may indicate the presence of pneumonia, while elevated NT-proBNP may indicate cardiac dysfunction. These readily available blood-based biomarkers may provide more accurate phenotyping of AECOPD and enable the discovery of more precise therapies.
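The sensitivity and specificity quoted for the CRP cutoff can be illustrated with a minimal Python sketch. The function name, the choice of ">= cutoff" as a positive test, and the toy values are assumptions made for illustration; they are not the study's data or code.

```python
def sensitivity_specificity(values, has_finding, cutoff):
    """Return (sensitivity, specificity) when `value >= cutoff` counts as a positive test.

    Treating values equal to the cutoff as positive is an assumption of this sketch.
    """
    tp = fn = tn = fp = 0
    for value, finding in zip(values, has_finding):
        test_positive = value >= cutoff
        if finding and test_positive:
            tp += 1
        elif finding:
            fn += 1
        elif test_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Made-up CRP values (mg/L) and made-up consolidation labels, for illustration only.
crp = [3.0, 8.0, 12.0, 20.0, 45.0, 11.5, 60.0, 5.0]
consolidation = [False, False, False, True, True, True, True, False]

sens, spec = sensitivity_specificity(crp, consolidation, cutoff=11.5)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

A sensitivity-oriented cutoff, as described in the abstract, trades false positives for fewer missed cases, which is why a high sensitivity can coexist with a modest specificity.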
|
18
|
Multiple reaction monitoring mass spectrometry to identify novel plasma protein biomarkers of treatment response in cystic fibrosis pulmonary exacerbations. J Cyst Fibros 2017; 17:333-340. [PMID: 29174082 DOI: 10.1016/j.jcf.2017.10.013] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 06/19/2017] [Revised: 10/10/2017] [Accepted: 10/12/2017] [Indexed: 11/28/2022]
Abstract
BACKGROUND Systemic inflammation decreases with IV antibiotics during the treatment of CF pulmonary exacerbations (PEx). We used multiple reaction monitoring mass spectrometry and immunoassays to monitor blood proteins during PEx treatment to determine if early changes could be used to predict PEx outcomes following treatment. METHODS Blood samples from 25 PEx (22 unique adults) were collected within 24h of admission, day 5, day 10, and at IV antibiotic completion. Ninety-two blood proteins involved in host immunity and inflammation were measured. RESULTS Levels of several blood proteins changed from admission to end of IV antibiotics, most increasing with treatment. Early changes (admission to day 5) in fibrinogen levels had the strongest correlation with overall improvement in CFRSD-CRISS and FEV1% predicted by the end of treatment. CONCLUSIONS Several plasma proteins changed significantly with IV antibiotics. Future studies will evaluate fibrinogen as an early biomarker of PEx treatment response in CF.
|
19
|
Integrative Genomics of Emphysema-Associated Genes Reveals Potential Disease Biomarkers. Am J Respir Cell Mol Biol 2017; 57:411-418. [PMID: 28459279 DOI: 10.1165/rcmb.2016-0284oc] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Indexed: 12/11/2022] Open
Abstract
Chronic obstructive pulmonary disease is the third leading cause of death worldwide. Gene expression profiling across multiple regions of the same lung identified genes significantly related to emphysema. We sought to determine whether the lung and epithelial expression of 127 emphysema-related genes was also related to lung function in independent cohorts, and whether any of these genes could be used as biomarkers in the peripheral blood of patients with chronic obstructive pulmonary disease. To that end, we examined whether the expression levels of these genes were under genetic control in lung tissue (n = 1,111). We then determined whether the mRNA levels of these genes in lung tissue (n = 727), small airway epithelial cells (n = 238), and peripheral blood (n = 620) were significantly related to lung function measurements. The expression of 63 of the 127 genes (50%) was under genetic control in lung tissue. The lung and epithelial mRNA expression of a subset of the emphysema-associated genes, including ASRGL1, LPHN2, and EDNRB, was strongly associated with lung function. In peripheral blood, the expression of 40 genes was significantly associated with lung function. Twenty-nine of these genes (73%) were also associated with lung function in lung tissue, but with the opposite direction of effect for 24 of the 29 genes, including those involved in hypoxia and B cell-related responses. The integrative genomics approach uncovered a significant overlap between lung and blood in the associations of emphysema genes with lung function, with opposite directions of effect in the two tissues. These results support the use of peripheral blood to detect disease biomarkers.
|
20
|
PGCA: An algorithm to link protein groups created from MS/MS data. PLoS One 2017; 12:e0177569. [PMID: 28562641 PMCID: PMC5451011 DOI: 10.1371/journal.pone.0177569] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/23/2016] [Accepted: 04/28/2017] [Indexed: 11/19/2022] Open
Abstract
The quantitation of proteins using shotgun proteomics has gained popularity in recent decades, simplifying sample handling procedures, removing extensive protein separation steps and achieving a relatively high throughput readout. The process starts with the digestion of the protein mixture into peptides, which are then separated by liquid chromatography and sequenced by tandem mass spectrometry (MS/MS). At the end of the workflow, recovering the identity of the proteins originally present in the sample is often a difficult and ambiguous process, because more than one protein identifier may match a set of peptides identified from the MS/MS spectra. To address this identification problem, many MS/MS data processing software tools combine all plausible protein identifiers matching a common set of peptides into a protein group. However, this solution introduces new challenges in studies with multiple experimental runs, which can be characterized by three main factors: i) protein groups' identifiers are local, i.e., they vary from run to run, ii) the composition of each group may change across runs, and iii) the supporting evidence of proteins within each group may also change across runs. Since in general there is no conclusive evidence about the absence of proteins in the groups, protein groups need to be linked across different runs in subsequent statistical analyses. We propose an algorithm, called Protein Group Code Algorithm (PGCA), to link groups from multiple experimental runs by forming global protein groups from connected local groups. The algorithm is computationally inexpensive and enables the connection and analysis of lists of protein groups across runs needed in biomarker studies. We illustrate the identification problem and the stability of the PGCA mapping using 65 iTRAQ experimental runs. 
Further, we use two biomarker studies to show how PGCA enables the discovery of relevant candidate protein group markers with similar but non-identical compositions in different runs.
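The idea of forming global protein groups from connected local groups can be sketched in Python as a connected-components computation. This is a minimal sketch under a stated assumption: two local groups are treated as connected when they share at least one protein identifier. The function `link_protein_groups`, the input layout, and the toy accessions are hypothetical and are not taken from the published PGCA implementation.

```python
def link_protein_groups(runs):
    """Map every (run, local_group_id) pair to a global group id.

    `runs` maps a run name to {local_group_id: set_of_protein_accessions}.
    Assumption of this sketch: local groups sharing at least one accession
    are connected, and each connected component is one global group.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen = {}  # accession -> first local group that contained it
    for run, groups in runs.items():
        for local_id, proteins in groups.items():
            key = (run, local_id)
            parent[key] = key
            for accession in proteins:
                if accession in first_seen:
                    union(first_seen[accession], key)
                else:
                    first_seen[accession] = key

    # Number the components in order of first appearance.
    global_ids = {}
    mapping = {}
    for key in parent:
        root = find(key)
        if root not in global_ids:
            global_ids[root] = len(global_ids) + 1
        mapping[key] = global_ids[root]
    return mapping


# Toy input: local ids differ between runs and group composition drifts.
runs = {
    "run1": {1: {"P1", "P2"}, 2: {"P9"}},
    "run2": {1: {"P2", "P3"}, 2: {"P7"}},
    "run3": {5: {"P3"}, 6: {"P7", "P8"}},
}
mapping = link_protein_groups(runs)
for key, global_id in mapping.items():
    print(key, "->", global_id)
```

In this toy input, group 1 of run1, group 1 of run2 and group 5 of run3 end up in the same global group even though run1 and run3 share no accession directly, because both are connected through run2.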
|
21
|
Network-based analysis reveals novel gene signatures in peripheral blood of patients with chronic obstructive pulmonary disease. Respir Res 2017; 18:72. [PMID: 28438154 PMCID: PMC5404332 DOI: 10.1186/s12931-017-0558-1] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 10/30/2016] [Accepted: 04/20/2017] [Indexed: 01/27/2023] Open
Abstract
BACKGROUND Chronic obstructive pulmonary disease (COPD) is currently the third leading cause of death and there is a huge unmet clinical need to identify disease biomarkers in peripheral blood. Compared to gene level differential expression approaches to identify gene signatures, network analyses provide a biologically intuitive approach which leverages the co-expression patterns in the transcriptome to identify modules of co-expressed genes. METHODS A weighted gene co-expression network analysis (WGCNA) was applied to peripheral blood transcriptome from 238 COPD subjects to discover co-expressed gene modules. We then determined the relationship between these modules and forced expiratory volume in 1 s (FEV1). In a second, independent cohort of 381 subjects, we determined the preservation of these modules and their relationship with FEV1. For those modules that were significantly related to FEV1, we determined the biological processes as well as the blood cell-specific gene expression that were over-represented using additional external datasets. RESULTS Using WGCNA, we identified 17 modules of co-expressed genes in the discovery cohort. Three of these modules were significantly correlated with FEV1 (FDR < 0.1). In the replication cohort, these modules were highly preserved and their FEV1 associations were reproducible (P < 0.05). Two of the three modules were negatively related to FEV1 and were enriched in IL8 and IL10 pathways and correlated with neutrophil-specific gene expression. The positively related module, on the other hand, was enriched in DNA transcription and translation and was strongly correlated to CD4+, CD8+ T cell-specific gene expression. CONCLUSIONS Network based approaches are promising tools to identify potential biomarkers for COPD. TRIAL REGISTRATION The ECLIPSE study was funded by GlaxoSmithKline, under ClinicalTrials.gov identifier NCT00292552 and GSK No. SCO104960.
|
22
|
Differentiating heart failure phenotypes using sex-specific transcriptomic and proteomic biomarker panels. ESC Heart Fail 2017; 4:301-311. [PMID: 28772032 PMCID: PMC5542716 DOI: 10.1002/ehf2.12136] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 06/16/2016] [Revised: 10/25/2016] [Accepted: 12/28/2016] [Indexed: 12/31/2022] Open
Abstract
Aims Heart failure with preserved ejection fraction (HFpEF) accounts for 30–50% of patients with heart failure (HF). A major obstacle in HF management is the difficulty in differentiating between HFpEF and heart failure with reduced ejection fraction (HFrEF) using conventional clinical and laboratory investigations. The aim of this study is to develop robust transcriptomic and proteomic biomarker signatures that can differentiate HFpEF from HFrEF. Methods and results A total of 210 HF patients were recruited in participating institutions from the Alberta HEART study. An expert clinical adjudicating panel differentiated between patients with HFpEF and HFrEF. The discovery cohort consisted of 61 patients, and the replication cohort consisted of 70 patients. Transcriptomic and proteomic data were analysed to find panels differentiating HFpEF from HFrEF. In the discovery cohort, a 22‐transcript panel was found to differentiate HFpEF from HFrEF in male patients with a cross‐validation AUC of 0.74, as compared with 0.70 for N‐terminal pro‐B‐type natriuretic peptide (NT‐proBNP) in those same patients. An ensemble of the transcript panel and NT‐proBNP yielded a cross‐validation AUC of 0.80. This performance improvement was also observed in the replication cohort. An ensemble of the transcriptomic panel with NT‐proBNP produced a replication AUC of 0.90, as compared with 0.74 for NT‐proBNP alone and 0.73 for the transcriptomic panel. Conclusions We have identified a male‐specific transcriptomic biomarker panel that can differentiate between HFpEF and HFrEF. These biosignatures could be further replicated on other patients and potentially be developed into a blood test for better management of HF patients.
|
23
|
Association of Serum MiR-142-3p and MiR-101-3p Levels with Acute Cellular Rejection after Heart Transplantation. PLoS One 2017; 12:e0170842. [PMID: 28125729 PMCID: PMC5268768 DOI: 10.1371/journal.pone.0170842] [Citation(s) in RCA: 44] [Impact Index Per Article: 6.3] [Received: 09/29/2016] [Accepted: 01/11/2017] [Indexed: 11/28/2022] Open
Abstract
Background Identifying non-invasive and reliable blood-derived biomarkers for early detection of acute cellular rejection in heart transplant recipients is of great importance in clinical practice. MicroRNAs are small molecules found to be stable in serum and their expression patterns reflect both physiological and underlying pathological conditions in humans. Methods We compared a group of heart transplant recipients with histologically-verified acute cellular rejection (ACR, n = 26) with a control group of heart transplant recipients without allograft rejection (NR, n = 37) by assessing the levels of a select set of microRNAs in serum specimens. Results The levels of seven microRNAs, miR-142-3p, miR-101-3p, miR-424-5p, miR-27a-3p, miR-144-3p, miR-339-3p and miR-326, were significantly higher in the ACR group compared to the control group and could discriminate between patients with and without allograft rejection. MiR-142-3p and miR-101-3p had the best diagnostic test performance among the microRNAs tested. Serum levels of miR-142-3p and miR-101-3p were independent of calcineurin inhibitor levels, as measured by tacrolimus and cyclosporin; kidney function, as measured by creatinine level; and general inflammation state, as measured by CRP level. Conclusion This study demonstrated two microRNAs, miR-142-3p and miR-101-3p, that could be relevant as non-invasive diagnostic tools for identifying heart transplant patients with acute cellular rejection.
|
24
|
Enumerateblood - an R package to estimate the cellular composition of whole blood from Affymetrix Gene ST gene expression profiles. BMC Genomics 2017; 18:43. [PMID: 28061752 PMCID: PMC5219701 DOI: 10.1186/s12864-016-3460-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 02/17/2016] [Accepted: 12/22/2016] [Indexed: 11/20/2022] Open
Abstract
Background Measuring genome-wide changes in transcript abundance in circulating peripheral whole blood is a useful way to study disease pathobiology and may help elucidate the molecular mechanisms of disease, or discovery of useful disease biomarkers. The sensitivity and interpretability of analyses carried out in this complex tissue, however, are significantly affected by its dynamic cellular heterogeneity. It is therefore desirable to quantify this heterogeneity, either to account for it or to better model interactions that may be present between the abundance of certain transcripts, specific cell types and the indication under study. Accurate enumeration of the many component cell types that make up peripheral whole blood can further complicate the sample collection process, however, and result in additional costs. Many approaches have been developed to infer the composition of a sample from high-dimensional transcriptomic and, more recently, epigenetic data. These approaches rely on the availability of isolated expression profiles for the cell types to be enumerated. These profiles are platform-specific, suitable datasets are rare, and generating them is expensive. No such dataset exists on the Affymetrix Gene ST platform. Results We present ‘Enumerateblood’, a freely-available and open source R package that exposes a multi-response Gaussian model capable of accurately predicting the composition of peripheral whole blood samples from Affymetrix Gene ST expression profiles, outperforming other current methods when applied to Gene ST data. Conclusions ‘Enumerateblood’ significantly improves our ability to study disease pathobiology from whole blood gene expression assayed on the popular Affymetrix Gene ST platform by allowing a more complete study of the various components of this complex tissue without the need for additional data collection. 
Future use of the model may allow for novel insights to be generated from the ~400 Affymetrix Gene ST blood gene expression datasets currently available on the Gene Expression Omnibus (GEO) website. Electronic supplementary material The online version of this article (doi:10.1186/s12864-016-3460-1) contains supplementary material, which is available to authorized users.
|
25
|
Biomarker Development in COPD: Moving From P Values to Products to Impact Patient Care. Chest 2016; 151:455-467. [PMID: 27693595 DOI: 10.1016/j.chest.2016.09.012] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 04/14/2016] [Revised: 08/06/2016] [Accepted: 09/21/2016] [Indexed: 01/02/2023] Open
Abstract
There is a great interest in developing biomarkers to enable precision medicine and improve health outcomes of patients with COPD. However, biomarker development is extremely challenging and expensive, and translation of research endeavors to date has been largely unsuccessful. In most cases, biomarkers fail because of poor replication of initial promising results in independent cohorts and/or inability to transfer the biomarker from a discovery platform to a clinical assay. Ultimately, new biomarker assays must address 5 questions for optimal clinical translation. They include the following: is the biomarker likely to be (1) superior (will the test outperform current standards?); (2) actionable (will the test change patient management?); (3) valuable (will the test improve patient outcomes?); (4) economical (will the implementation of the biomarker in the target population be cost-saving or cost-effective?); and (5) clinically deployable (is there a pathway for the biomarker and analytical technology to be implemented in a clinical laboratory?)? In this article we review some of the major barriers to biomarker development in COPD and provide possible solutions to overcome these limitations, enabling translation of promising biomarkers from discovery experiments to clinical implementation.
|
26
|
The Effect of Different Case Definitions of Current Smoking on the Discovery of Smoking-Related Blood Gene Expression Signatures in Chronic Obstructive Pulmonary Disease. Nicotine Tob Res 2016; 18:1903-9. [PMID: 27154971 PMCID: PMC4978988 DOI: 10.1093/ntr/ntw129] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 12/07/2015] [Accepted: 04/26/2016] [Indexed: 11/13/2022]
Abstract
INTRODUCTION Smoking is the number one modifiable environmental risk factor for chronic obstructive pulmonary disease (COPD). Clinical, epidemiological and increasingly "omics" studies assess or adjust for current smoking status using only self-report, which may be inaccurate. Objective measures such as exhaled carbon monoxide (eCO) may also be problematic owing to limitations in the measurements and the relatively short half-life of the molecule. In this study, we determined the impact of different case definitions of current cigarette smoking on gene expression in peripheral blood of patients with COPD. METHODS Peripheral blood gene expression from 573 former and current smokers with COPD in the ECLIPSE study was used to find genes whose expression was associated with smoking status. Current smoking was defined using self-report, eCO concentrations, or both. Linear regression was used to determine the association of current smoking status with gene expression, adjusting for age, sex and propensity score. Pathway enrichment analyses were performed on genes with P < .001. RESULT Using self-report or eCO, only two genes were differentially expressed between current and ex-smokers, with no enrichment in biological processes. When current smoking was defined using both eCO and self-report, four genes were differentially expressed (LRRN3, PID1, FUCA1, GPR15) with enrichment in 40 biological pathways related to metabolic processes, response to hypoxia and hormonal stimulus. Additionally, the combined definition provided better distributions of test statistics for differential gene expression. CONCLUSION A combined phenotype of eCO and self-report allows for better discovery of genes and pathways related to current smoking. IMPLICATIONS Studies relying only on self-report of smoking status to assess or adjust for the impact of smoking may not fully capture its effect and will lead to residual confounding of results.
|
27
|
COPD Exacerbation Biomarkers Validated Using Multiple Reaction Monitoring Mass Spectrometry. PLoS One 2016; 11:e0161129. [PMID: 27525416 PMCID: PMC4985129 DOI: 10.1371/journal.pone.0161129] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 04/03/2016] [Accepted: 07/30/2016] [Indexed: 12/03/2022] Open
Abstract
BACKGROUND Acute exacerbations of chronic obstructive pulmonary disease (AECOPD) result in considerable morbidity and mortality. However, there are no objective biomarkers to diagnose AECOPD. METHODS We used multiple reaction monitoring mass spectrometry to quantify 129 distinct proteins in plasma samples from patients with COPD. This analytical approach was first performed in a biomarker cohort of patients hospitalized with AECOPD (Cohort A, n = 72). Proteins differentially expressed between AECOPD and convalescent states were chosen using a false discovery rate <0.01 and fold change >1.2. Protein selection and classifier building were performed using an elastic net logistic regression model. The performance of the biomarker panel was then tested in two independent AECOPD cohorts (Cohort B, n = 37, and Cohort C, n = 109) using leave-pair-out cross-validation methods. RESULTS Five proteins were identified distinguishing AECOPD and convalescent states in Cohort A. Biomarker scores derived from this model were significantly higher during AECOPD than in the convalescent state in the discovery cohort (p<0.001). The receiver operating characteristic cross-validation area under the curve (CV-AUC) statistic was 0.73 in Cohort A, while in the replication cohorts the CV-AUC was 0.77 for Cohort B and 0.79 for Cohort C. CONCLUSIONS A panel of five biomarkers shows promise in distinguishing AECOPD from convalescence and may provide the basis for a clinical blood test to diagnose AECOPD. Further validation in larger cohorts is necessary for future clinical translation.
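The selection rule in the Methods (false discovery rate <0.01 and fold change >1.2) can be sketched in Python as follows. The abstract does not say how the false discovery rate was computed, so the Benjamini-Hochberg adjustment used here is an assumption, as is treating fold change symmetrically (a ratio of 0.7 counts as a 1/0.7 fold decrease). Function names and the toy values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_proteins(names, p_values, fold_changes, fdr=0.01, min_fold=1.2):
    """Keep proteins passing both the FDR and the fold-change threshold.

    `fold_changes` are positive ratios (exacerbation / convalescent);
    changes in either direction are accepted (an assumption of this sketch).
    """
    q_values = benjamini_hochberg(p_values)
    return [
        name
        for name, q, fc in zip(names, q_values, fold_changes)
        if q < fdr and max(fc, 1.0 / fc) > min_fold
    ]


# Made-up values for four hypothetical proteins.
names = ["A", "B", "C", "D"]
p_values = [0.0001, 0.002, 0.004, 0.5]
fold_changes = [1.5, 1.1, 0.7, 2.0]

q_values = benjamini_hochberg(p_values)
selected = select_proteins(names, p_values, fold_changes)
print(selected)
```

Here protein B passes the FDR threshold but not the fold-change threshold, and protein D shows the reverse, which is why both filters are applied together.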
|
28
|
Biomarker Development for Chronic Obstructive Pulmonary Disease. From Discovery to Clinical Implementation. Am J Respir Crit Care Med 2016; 192:1162-70. [PMID: 26176936 DOI: 10.1164/rccm.201505-0871pp] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.1] [Indexed: 12/11/2022] Open
Abstract
Chronic obstructive pulmonary disease (COPD) is one of the major causes of morbidity and mortality in the world. Regrettably, there are no biomarkers to objectively diagnose COPD exacerbations, which are the major drivers of hospitalization and deaths from COPD. Moreover, there are no biomarkers to guide therapeutic choices or to risk stratify patients for imminent exacerbations and no objective biomarkers of disease activity or disease progression. Although there has been a tremendous investment in COPD biomarker discovery over the past 2 decades, clinical translation and implementation have not matched these efforts. In this article, we outline the challenges of biomarker development in COPD and provide an overview of a developmental pipeline that may be able to surmount these challenges and bring novel biomarker solutions to accelerate therapeutic discoveries and to improve the care and outcomes of the millions of individuals worldwide with COPD.
|
29
|
Abstract
Background COPD is currently the fourth leading cause of death worldwide. Statins are lipid-lowering agents with documented cardiovascular benefits. Observational studies have shown that statins may have a beneficial role in COPD. The impact of statins on blood gene expression from COPD patients is largely unknown. Objective To identify the blood gene signature associated with statin use in COPD patients, and the pathways underpinning this signature that could explain any potential benefits in COPD. Methods Whole blood gene expression was measured on 168 statin users and 451 non-users from the ECLIPSE study using the Affymetrix Human Gene 1.1 ST microarray chips. Factor Analysis for Robust Microarray Summarization (FARMS) was used to process the expression data. Differential gene expression analysis was undertaken using the Linear Models for Microarray data (Limma) package, adjusting for propensity score and surrogate variables. Similarity of the expression signal with published gene expression profiles was assessed in ProfileChaser. Results 25 genes were differentially expressed between statin users and non-users at an FDR of 10%, including LDLR, CXCR2, SC4MOL, FAM108A1, IFI35, FRYL, ABCG1, MYLIP, and DHCR24. The 25 genes were significantly enriched in cholesterol homeostasis and metabolism pathways. The resulting gene signature showed correlation with Huntington’s disease, Parkinson’s disease and acute myeloid leukemia gene signatures. Conclusion The blood gene signature of statin use in COPD patients was enriched in cholesterol homeostasis pathways. Further studies are needed to delineate the role of these pathways in lung biology.
|
30
|
Discovery of novel plasma protein biomarkers to predict imminent cystic fibrosis pulmonary exacerbations using multiple reaction monitoring mass spectrometry. Thorax 2015; 71:216-22. [PMID: 25777587 DOI: 10.1136/thoraxjnl-2014-206710] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Received: 12/17/2014] [Accepted: 02/26/2015] [Indexed: 01/10/2023]
Abstract
BACKGROUND Despite the significant morbidity and mortality related to pulmonary exacerbations in cystic fibrosis (CF), there remains no reliable predictor of imminent exacerbation. OBJECTIVE To identify blood-based biomarkers to predict imminent (<4 months from stable blood draw) CF pulmonary exacerbations using targeted proteomics. METHODS 104 subjects provided plasma samples when clinically stable and were randomly split into discovery (n=70) and replication (n=34) cohorts. Multiple reaction monitoring mass spectrometry (MRM-MS) was used to measure 117 peptides (79 proteins) from plasma. Plasma proteins with differential abundance between subjects who did versus did not develop an imminent exacerbation were analysed and proteins with fold difference >1.5 between the groups were included in an MRM-MS classifier model to predict imminent exacerbations. Performance characteristics were compared with clinical predictors and candidate plasma protein biomarkers. RESULTS Six proteins were included in the final MRM-MS protein panel. The area under the curve (AUC) for the prediction of imminent exacerbations was highest for the MRM-MS protein panel (AUC 0.74) in comparison to FEV1% predicted (AUC 0.55) and the top candidate plasma protein biomarkers, including C-reactive protein (AUC 0.61) and interleukin-6 (AUC 0.60). The MRM-MS protein panel performed similarly in the replication cohort (AUC 0.73). CONCLUSIONS Using MRM-MS, a six-protein panel measured from plasma can distinguish individuals with versus without an imminent exacerbation. With further replication and assay development, this biomarker panel may be clinically applicable for prediction of exacerbations in individuals with CF.
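The AUC values compared in the Results can be read as the probability that a randomly chosen subject with an imminent exacerbation receives a higher score than a randomly chosen subject without one. A minimal Python sketch of that pairwise definition follows; the function name and the toy scores are illustrative assumptions, not the study's classifier or data.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    wins = sum(
        (p > n) + 0.5 * (p == n)
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Made-up classifier scores; label 1 marks a subject who later exacerbated.
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.4]
labels = [1, 1, 1, 0, 0, 0]

result = auc(scores, labels)
print(f"AUC = {result:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking, which gives context for the reported 0.55 for FEV1% predicted against 0.74 for the protein panel.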
|
31
|
Novel multivariate methods for integration of genomics and proteomics data: applications in a kidney transplant rejection study. OMICS: A Journal of Integrative Biology 2014; 18:682-95. [PMID: 25387159 PMCID: PMC4229708 DOI: 10.1089/omi.2014.0062] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Indexed: 12/20/2022]
Abstract
Multi-omics research is a key ingredient of data-intensive life sciences research, permitting measurement of biological molecules at different functional levels in the same individual. For a complete picture at the biological systems level, however, appropriate statistical techniques must be developed to integrate different 'omics' data sets (e.g., genomics and proteomics). We report here multivariate projection-based analysis approaches for genomics and proteomics data sets, using as a case study observations from kidney transplant patients who experienced an acute rejection event (n=20) versus non-rejecting controls (n=20). In these data sets, we show how these novel methodologies might serve as promising tools for dimension reduction and selection of relevant features for different analytical frameworks. Unsupervised analyses highlighted the importance of post-transplant time of rejection, while supervised analyses identified gene and protein signatures that together predicted rejection status with little time effect. The selected genes are part of biological pathways that are representative of immune responses. Gene enrichment profiles revealed increases in innate immune responses and neutrophil activities and a depletion of T lymphocyte related processes in rejection samples as compared to controls. In all, this article offers candidate biomarkers for future detection and monitoring of acute kidney transplant rejection, as well as ways forward for methodological advances to better harness multi-omics data sets.
|
32
|
Two-stage, in silico deconvolution of the lymphocyte compartment of the peripheral whole blood transcriptome in the context of acute kidney allograft rejection. PLoS One 2014; 9:e95224. [PMID: 24733377 PMCID: PMC3986379 DOI: 10.1371/journal.pone.0095224] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 11/12/2013] [Accepted: 03/24/2014] [Indexed: 01/21/2023] Open
Abstract
Acute rejection is a major complication of solid organ transplantation that prevents the long-term assimilation of the allograft. Various populations of lymphocytes are principal mediators of this process, infiltrating graft tissues and driving cell-mediated cytotoxicity. Understanding the lymphocyte-specific biology associated with rejection is therefore critical. Measuring genome-wide changes in transcript abundance in peripheral whole blood cells can deliver a comprehensive view of the status of the immune system. The heterogeneous nature of the tissue significantly affects the sensitivity and interpretability of traditional analyses, however. Experimental separation of cell types is an obvious solution, but is often impractical and, more worrying, may affect expression, leading to spurious results. Statistical deconvolution of the cell type-specific signal is an attractive alternative, but existing approaches still present some challenges, particularly in a clinical research setting. Obtaining time-matched sample composition to biologically interesting, phenotypically homogeneous cell sub-populations is costly and adds significant complexity to study design. We used a two-stage, in silico deconvolution approach that first predicts sample composition to biologically meaningful and homogeneous leukocyte sub-populations, and then performs cell type-specific differential expression analysis in these same sub-populations, from peripheral whole blood expression data. We applied this approach to a peripheral whole blood expression study of kidney allograft rejection. The patterns of differential composition uncovered are consistent with previous studies carried out using flow cytometry and provide a relevant biological context when interpreting cell type-specific differential expression results. We identified cell type-specific differential expression in a variety of leukocyte sub-populations at the time of rejection. 
The tissue-specificity of these differentially expressed probe-set lists is consistent with the originating tissue and their functional enrichment consistent with allograft rejection. Finally, we demonstrate that the strategy described here can be used to derive useful hypotheses by validating a cell type-specific ratio in an independent cohort using the nanoString nCounter assay.
|
33
|
Variation in RNA-Seq transcriptome profiles of peripheral whole blood from healthy individuals with and without globin depletion. PLoS One 2014; 9:e91041. [PMID: 24608128 PMCID: PMC3946641 DOI: 10.1371/journal.pone.0091041] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.0] [Received: 08/20/2013] [Accepted: 02/08/2014] [Indexed: 12/21/2022] Open
Abstract
Background The molecular profile of circulating blood can reflect physiological and pathological events occurring in other tissues and organs of the body and delivers a comprehensive view of the status of the immune system. Blood has been useful in studying the pathobiology of many diseases. It is accessible and easily collected, making it ideally suited to the development of diagnostic biomarker tests. The blood transcriptome has a high complement of globin RNA that could potentially saturate next-generation sequencing platforms, masking lower abundance transcripts. Methods to deplete globin mRNA are available, but their effect has not been comprehensively studied in peripheral whole blood RNA-Seq data. In this study we aimed to assess technical variability associated with globin depletion in addition to assessing general technical variability in RNA-Seq from whole blood derived samples. Results We compared technical and biological replicates having undergone globin depletion or not and found that the experimental globin depletion protocol employed removed approximately 80% of globin transcripts, improved the correlation of technical replicates, allowed for reliable detection of thousands of additional transcripts and generally increased transcript abundance measures. Differential expression analysis revealed thousands of genes significantly up-regulated as a result of globin depletion. In addition, globin depletion resulted in the down-regulation of genes involved in both iron and zinc metal ion bonding. Conclusions Globin depletion appears to meaningfully improve the quality of peripheral whole blood RNA-Seq data, and may improve our ability to detect true biological variation. Some concerns remain, however. Key amongst them is the significant reduction in RNA yields following globin depletion. 
More generally, our investigation of technical and biological variation with and without globin depletion finds that high-throughput sequencing by RNA-Seq is highly reproducible within a large dynamic range of detection and provides an accurate estimation of RNA concentration in peripheral whole blood. High-throughput sequencing is thus a promising technology for whole blood transcriptomics and biomarker discovery.
|
34
|
Proteomic biomarkers of recovered heart function. Eur J Heart Fail 2014; 16:551-9. [PMID: 24574204 DOI: 10.1002/ejhf.65] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 10/14/2013] [Revised: 01/22/2014] [Accepted: 01/24/2014] [Indexed: 11/08/2022] Open
Abstract
AIMS Chronic heart failure is a costly epidemic that affects up to 2% of people in developed countries. The purpose of this study was to discover novel blood proteomic biomarker signatures of recovered heart function that could lead to more effective heart failure patient management by both primary care and specialty physicians. METHODS AND RESULTS The discovery cohort included 41 heart transplant patients and 20 healthy individuals. Plasma levels of 138 proteins were detected in at least 75% of these subjects by iTRAQ mass spectrometry. Eighteen proteins were identified that had (i) differential levels between pre-transplant patients with end-stage heart failure and healthy individuals; and (ii) levels that returned to normal by 1 month post-transplant in patients with stable heart function after transplantation. Seventeen of the 18 markers were validated by multiple reaction monitoring mass spectrometry in a cohort of 39 heart failure patients treated with drug therapy, of which 30 had recovered heart function and 9 had not. This 17-protein biomarker panel had 93% sensitivity and 89% specificity, while the RAMP® NT-proBNP assay had the same specificity but 80% sensitivity. Performance further improved when the panel was combined with NT-proBNP, yielding a net reclassification index relative to NT-proBNP of 0.28. CONCLUSIONS We have identified potential blood biomarkers of recovered heart function by harnessing data from transplant patients. These biomarkers can lead to the development of an inexpensive protein-based blood test that could be used by physicians to monitor response to therapy in heart failure, resulting in more personalized, front-line heart failure patient management.
|
35
|
Longitudinal Analysis of Whole Blood Transcriptomes to Explore Molecular Signatures Associated with Acute Renal Allograft Rejection. Bioinform Biol Insights 2014; 8:17-33. [PMID: 24526836 PMCID: PMC3921155 DOI: 10.4137/bbi.s13376] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 10/08/2013] [Revised: 11/17/2013] [Accepted: 11/17/2013] [Indexed: 11/05/2022] Open
Abstract
In this study, we explored a time course of peripheral whole blood transcriptomes from kidney transplantation patients who either experienced an acute rejection episode or did not, in order to better delineate the immunological and biological processes measurable in blood leukocytes that are associated with acute renal allograft rejection. Using microarrays, we generated gene expression data from 24 acute rejectors and 24 nonrejectors. We filtered the data to obtain the most unambiguous and robustly expressing probe sets and selected a subset of patients with the clearest phenotype. We then performed a data-driven exploratory analysis using data reduction and differential gene expression analysis tools in order to reveal gene expression signatures associated with acute allograft rejection. Using a template-matching algorithm, we then expanded our analysis to include time course data, identifying genes whose expression is modulated leading up to acute rejection. We have identified molecular phenotypes associated with acute renal allograft rejection, including a significantly upregulated signature of neutrophil activation and accumulation following transplant surgery that is common to both acute rejectors and nonrejectors. Our analysis shows that this expression signature appears to stabilize over time in nonrejectors but persists in patients who go on to reject the transplanted organ. In addition, we describe an expression signature characteristic of lymphocyte activity and proliferation. This lymphocyte signature is significantly downregulated in both acute rejectors and nonrejectors following surgery; however, patients who go on to reject the organ show a persistent downregulation of this signature relative to the neutrophil signature.
|
36
|
Longitudinal analysis of whole blood transcriptomes to explore molecular signatures associated with acute renal allograft rejection. Bioinform Biol Insights 2014. [PMID: 24526836 DOI: 10.4137/bbi.s13376] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/05/2022] Open
|
37
|
Biomarkers of Diastolic and Systolic Heart Failure. Can J Cardiol 2013. [DOI: 10.1016/j.cjca.2013.07.197] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022] Open
|
38
|
Plasma protein biosignatures for detection of cardiac allograft vasculopathy. J Heart Lung Transplant 2013; 32:723-33. [DOI: 10.1016/j.healun.2013.04.011] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 05/04/2012] [Revised: 04/03/2013] [Accepted: 04/09/2013] [Indexed: 10/26/2022] Open
|
39
|
Computational biomarker pipeline from discovery to clinical implementation: plasma proteomic biomarkers for cardiac transplantation. PLoS Comput Biol 2013; 9:e1002963. [PMID: 23592955 PMCID: PMC3617196 DOI: 10.1371/journal.pcbi.1002963] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Received: 05/06/2012] [Accepted: 01/16/2013] [Indexed: 11/19/2022] Open
Abstract
Recent technical advances in the field of quantitative proteomics have stimulated a large number of biomarker discovery studies of various diseases, providing avenues for new treatments and diagnostics. However, inherent challenges have limited the successful translation of candidate biomarkers into clinical use, thus highlighting the need for a robust analytical methodology to transition from biomarker discovery to clinical implementation. We have developed an end-to-end computational proteomic pipeline for biomarker studies. At the discovery stage, the pipeline emphasizes different aspects of experimental design, appropriate statistical methodologies, and quality assessment of results. At the validation stage, the pipeline focuses on the migration of the results to a platform appropriate for external validation, and the development of a classifier score based on corroborated protein biomarkers. At the last stage towards clinical implementation, the main aims are to develop and validate an assay suitable for clinical deployment, and to calibrate the biomarker classifier using the developed assay. The proposed pipeline was applied to a biomarker study in cardiac transplantation aimed at developing a minimally invasive clinical test to monitor acute rejection. Starting with an untargeted screening of the human plasma proteome, five candidate biomarker proteins were identified. Rejection-regulated proteins reflect cellular and humoral immune responses, acute phase inflammatory pathways, and lipid metabolism biological processes. A multiplex multiple reaction monitoring mass-spectrometry (MRM-MS) assay was developed for the five candidate biomarkers and validated by enzyme-linked immunosorbent assay (ELISA) and immunonephelometric assay (INA). A classifier score based on corroborated proteins demonstrated that the developed MRM-MS assay provides an appropriate methodology for an external validation, which is still in progress. 
Plasma proteomic biomarkers of acute cardiac rejection may offer a relevant post-transplant monitoring tool to effectively guide clinical care. The proposed computational pipeline is highly applicable to a wide range of biomarker proteomic studies.
|
40
|
Predicting acute cardiac rejection from donor heart and pre-transplant recipient blood gene expression. J Heart Lung Transplant 2013; 32:259-65. [DOI: 10.1016/j.healun.2012.11.008] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 04/26/2012] [Revised: 11/03/2012] [Accepted: 11/10/2012] [Indexed: 12/21/2022] Open
|
41
|
A computational pipeline for the development of multi-marker bio-signature panels and ensemble classifiers. BMC Bioinformatics 2012; 13:326. [PMID: 23216969 PMCID: PMC3575305 DOI: 10.1186/1471-2105-13-326] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Received: 04/03/2012] [Accepted: 12/04/2012] [Indexed: 02/08/2023] Open
Abstract
Background Biomarker panels derived separately from genomic and proteomic data and with a variety of computational methods have demonstrated promising classification performance in various diseases. An open question is how to create effective proteo-genomic panels. The framework of ensemble classifiers has been applied successfully in various analytical domains to combine classifiers so that the performance of the ensemble exceeds the performance of individual classifiers. Using blood-based diagnosis of acute renal allograft rejection as a case study, we address the following question in this paper: Can acute rejection classification performance be improved by combining individual genomic and proteomic classifiers in an ensemble? Results The first part of the paper presents a computational biomarker development pipeline for genomic and proteomic data. The pipeline begins with data acquisition (e.g., from bio-samples to microarray data), quality control, statistical analysis and mining of the data, and finally various forms of validation. The pipeline ensures that the various classifiers to be combined later in an ensemble are diverse and adequate for clinical use. Five mRNA genomic and five proteomic classifiers were developed independently using single time-point blood samples from 11 acute-rejection and 22 non-rejection renal transplant patients. The second part of the paper examines five ensembles ranging in size from two to 10 individual classifiers. Performance of ensembles is characterized by area under the curve (AUC), sensitivity, and specificity, as derived from the probability of acute rejection for individual classifiers in the ensemble in combination with one of two aggregation methods: (1) Average Probability or (2) Vote Threshold. 
One ensemble demonstrated superior performance and was able to improve sensitivity and AUC beyond the best values observed for any of the individual classifiers in the ensemble, while staying within the range of observed specificity. The Vote Threshold aggregation method achieved improved sensitivity for all 5 ensembles, but typically at the cost of decreased specificity. Conclusion Proteo-genomic biomarker ensemble classifiers show promise in the diagnosis of acute renal allograft rejection and can improve classification performance beyond that of individual genomic or proteomic classifiers alone. Validation of our results in an international multicenter study is currently underway.
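The two aggregation methods named in the Results can be sketched as follows. This is an illustrative reading of "Average Probability" and "Vote Threshold", not the paper's implementation: the 0.5 probability cutoff, the `min_votes` parameter, and the sample probabilities are all assumptions introduced here.

```python
from statistics import mean


def average_probability(probs, cutoff=0.5):
    """Call acute rejection when the mean classifier probability reaches the cutoff."""
    return mean(probs) >= cutoff


def vote_threshold(probs, min_votes=1, cutoff=0.5):
    """Call acute rejection when at least `min_votes` classifiers reach the cutoff."""
    votes = sum(p >= cutoff for p in probs)
    return votes >= min_votes


# Hypothetical probabilities of acute rejection from five classifiers for one sample.
sample = [0.62, 0.35, 0.41, 0.30, 0.28]

avg_call = average_probability(sample)          # mean is 0.392, below the cutoff
vote_call = vote_threshold(sample, min_votes=1)  # one classifier votes for rejection
print(avg_call, vote_call)
```

With a low vote requirement, a single confident classifier is enough to flag a sample, which is consistent with the reported pattern of higher sensitivity at the cost of specificity for the Vote Threshold method.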
|
42
|
Abstract
ChemModLab, written by the ECCR @ NCSU consortium under NIH support, is a toolbox for fitting and assessing quantitative structure-activity relationships (QSARs). Its elements are: a cheminformatic front end used to supply molecular descriptors for use in modeling; a set of methods for fitting models; and methods for validating the resulting model. Compounds may be input as structures from which standard descriptors will be calculated using the freely available cheminformatic front end PowerMV; PowerMV also supports compound visualization. In addition, the user can directly input their own choices of descriptors, so the capability for comparing descriptors is effectively unlimited. The statistical methodologies comprise a comprehensive collection of approaches whose validity and utility have been accepted by experts in the fields. As far as possible, these tools are implemented in open-source software linked into the flexible R platform, giving the user the capability of applying many different QSAR modeling methods in a seamless way. As promising new QSAR methodologies emerge from the statistical and data-mining communities, they will be incorporated in the laboratory. The web site also incorporates links to public-domain data sets that can be used as test cases for proposed new modeling methods. The capabilities of ChemModLab are illustrated using a variety of biological responses, with different modeling methodologies being applied to each. These show clear differences in quality of the fitted QSAR model, and in computational requirements. The laboratory is web-based, and use is free. Researchers with new assay data, a new descriptor set, or a new modeling method may readily build QSAR models and benchmark their results against other findings. Users may also examine the diversity of the molecules identified by a QSAR model. 
Moreover, users have the choice of placing their data sets in a public area to facilitate communication with other researchers; or can keep them hidden to preserve confidentiality.
|
43
|
White blood cell differentials enrich whole blood expression data in the context of acute cardiac allograft rejection. Bioinform Biol Insights 2012; 6:49-61. [PMID: 22550401 PMCID: PMC3329187 DOI: 10.4137/bbi.s9197] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 12/20/2022] Open
Abstract
Acute cardiac allograft rejection is a serious complication of heart transplantation. Investigating molecular processes in whole blood via microarrays is a promising avenue of research in transplantation, particularly due to the non-invasive nature of blood sampling. However, whole blood is a complex tissue and the consequent heterogeneity in composition amongst samples is ignored in traditional microarray analysis. This complicates the biological interpretation of microarray data. Here we have applied a statistical deconvolution approach, cell-specific significance analysis of microarrays (csSAM), to whole blood samples from subjects either undergoing acute heart allograft rejection (AR) or not (NR). We identified eight differentially expressed probe-sets significantly correlated to monocytes (mapping to 6 genes, all down-regulated in ARs versus NRs) at a false discovery rate (FDR) ≤ 15%. None of the genes identified are present in a biomarker panel of acute heart rejection previously published by our group and discovered in the same data.
|
44
|
Long non-coding RNAs are expressed in oral mucosa and altered in oral premalignant lesions. Oral Oncol 2011; 47:1055-61. [PMID: 21835683 DOI: 10.1016/j.oraloncology.2011.07.008] [Citation(s) in RCA: 66] [Impact Index Per Article: 5.1] [Received: 05/03/2011] [Revised: 06/17/2011] [Accepted: 07/09/2011] [Indexed: 01/01/2023]
Abstract
Oral epithelial dysplasias are believed to progress through a series of histopathological stages: from mild to severe dysplasia, to carcinoma in situ, and finally to invasive oral squamous cell carcinoma (OSCC). Underlying this change in histopathological grade are gross chromosome alterations and changes in gene expression of both protein-coding genes and non-coding RNAs. Recent papers have described associations of aberrant expression of microRNAs, one class of non-coding RNAs, with oral cancer. However, expression profiling of long non-coding RNAs (lncRNAs) has not been reported. Long non-coding RNAs are a novel class of mRNA-like transcripts with no protein coding capacity, but with a variety of functions including roles in epigenetics and gene regulation. In recent reports, the aberrant expression of lncRNAs has been associated with human cancers, suggesting a critical role in tumorigenesis. Here, we present the first long non-coding RNA expression map for the human oral mucosa. We describe the expression of 325 long non-coding RNAs, suggesting lncRNA expression contributes significantly to the oral transcriptome. Intriguingly, ∼60% of the detected lncRNAs show aberrant expression in oral premalignant lesions. A number of these lncRNAs have been previously associated with other human cancers.
|
45
|
Predicting Acute Cardiac Allograft Rejection Using Donor and Recipient Gene Expression. J Card Fail 2011. [DOI: 10.1016/j.cardfail.2011.06.145] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
|
46
|
A sequence-based approach to identify reference genes for gene expression analysis. BMC Med Genomics 2010; 3:32. [PMID: 20682026 PMCID: PMC2928167 DOI: 10.1186/1755-8794-3-32] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Received: 02/17/2010] [Accepted: 08/03/2010] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND An important consideration when analyzing both microarray and quantitative PCR expression data is the selection of appropriate genes as endogenous controls or reference genes. This step is especially critical when identifying genes differentially expressed between datasets. Moreover, reference genes suitable in one context (e.g. lung cancer) may not be suitable in another (e.g. breast cancer). Currently, the main approach to identify reference genes involves the mining of expression microarray data for highly expressed and relatively constant transcripts across a sample set. A caveat here is the requirement for transcript normalization prior to analysis, and measurements obtained are relative, not absolute. Alternatively, as sequencing-based technologies provide digital quantitative output, absolute quantification ensues, and reference gene identification becomes more accurate. METHODS Serial analysis of gene expression (SAGE) profiles of non-malignant and malignant lung samples were compared using a permutation test to identify the most stably expressed genes across all samples. Subsequently, the specificity of the reference genes was evaluated across multiple tissue types, their constancy of expression was assessed using quantitative RT-PCR (qPCR), and their impact on differential expression analysis of microarray data was evaluated. RESULTS We show that (i) conventional reference genes such as ACTB and GAPDH are highly variable between cancerous and non-cancerous samples, (ii) reference genes identified for lung cancer do not perform well for other cancer types (breast and brain), (iii) reference genes identified through SAGE show low variability using qPCR in a different cohort of samples, and (iv) normalization of a lung cancer gene expression microarray dataset with or without our reference genes yields different results for differential gene expression and subsequent analyses. 
Specifically, key established pathways in lung cancer exhibit higher statistical significance using a dataset normalized with our reference genes relative to normalization without using our reference genes. CONCLUSIONS Our analyses found NDUFA1, RPL19, RAB5C, and RPS18 to occupy the top ranking positions among 15 suitable reference genes optimal for normalization of lung tissue expression data. Significantly, the approach used in this study can be applied to data generated using new generation sequencing platforms for the identification of reference genes optimal within diverse contexts.
|
47
|
Abstract
Acute graft rejection is an important clinical problem in renal transplantation and an adverse predictor for long term graft survival. Plasma biomarkers may offer an important option for post-transplant monitoring and permit timely and effective therapeutic intervention to minimize graft damage. This case-control discovery study (n = 32) used isobaric tagging for relative and absolute protein quantification (iTRAQ) technology to quantitate plasma protein relative concentrations in precise cohorts of patients with and without biopsy-confirmed acute rejection (BCAR). Plasma samples were depleted of the 14 most abundant plasma proteins to enhance detection sensitivity. A total of 18 plasma proteins that encompassed processes related to inflammation, complement activation, blood coagulation, and wound repair exhibited significantly different relative concentrations between patient cohorts with and without BCAR (p value <0.05). Twelve proteins with a fold-change ≥1.15 were selected for diagnostic purposes: seven were increased (titin, lipopolysaccharide-binding protein, peptidase inhibitor 16, complement factor D, mannose-binding lectin, protein Z-dependent protease and β2-microglobulin) and five were decreased (kininogen-1, afamin, serine protease inhibitor, phosphatidylcholine-sterol acyltransferase, and sex hormone-binding globulin) in patients with BCAR. The first three principal components of these proteins showed clear separation of cohorts with and without BCAR. Performance improved with the inclusion of sequential proteins, reaching a primary asymptote after the first three (titin, kininogen-1, and lipopolysaccharide-binding protein). Longitudinal monitoring over the first 3 months post-transplant based on ratios of these three proteins showed clear discrimination between the two patient cohorts at time of rejection. 
The score then declined to baseline following treatment and resolution of the rejection episode and remained comparable between cases and controls throughout the period of quiescent follow-up. Results were validated using ELISA where possible, and initial cross-validation estimated a sensitivity of 80% and specificity of 90% for classification of BCAR based on a four-protein ELISA classifier. This study provides evidence that protein concentrations in plasma may provide a relevant measure for the occurrence of BCAR and offers a potential tool for immunologic monitoring.
|
48
|
Patient-derived first generation xenografts of non-small cell lung cancers: promising tools for predicting drug responses for personalized chemotherapy. Clin Cancer Res 2010; 16:1442-51. [PMID: 20179238 DOI: 10.1158/1078-0432.ccr-09-2878] [Citation(s) in RCA: 138] [Impact Index Per Article: 9.9] [Indexed: 12/24/2022]
Abstract
PURPOSE Current chemotherapeutic regimens have only modest benefit for non-small cell lung cancer (NSCLC) patients. Cumulative toxicities/drug resistance limit chemotherapy given after the first-line regimen. For personalized chemotherapy, clinically relevant NSCLC models are needed for quickly predicting the most effective regimens for therapy with curative intent. In this study, first generation subrenal capsule xenografts of primary NSCLCs were examined for (a) determining responses to conventional chemotherapeutic regimens and (b) selecting regimens most effective for individual patients. EXPERIMENTAL DESIGN Pieces (1 × 3 × 3 mm³) of 32 nontreated, completely resected patients' NSCLCs were grafted under renal capsules of nonobese diabetic/severe combined immunodeficient mice and treated with (A) cisplatin+vinorelbine, (B) cisplatin+docetaxel, or (C) cisplatin+gemcitabine; positive responses (treated tumor area ≤50% of control, P < 0.05) were determined. Clinical outcomes of treated patients were acquired. RESULTS Xenografts from all NSCLCs were established (engraftment rate, 90%) with the retention of major biological characteristics of the original cancers. The entire process of drug assessment took 6 to 8 weeks. Response rates to regimens A, B, and C were 28% (9 of 32), 42% (8 of 19), and 44% (7 of 16), respectively. Certain cancers that were resistant to a particular regimen were sensitive to others. The majority of responsive tumors contained foci of nonresponding cancer cells, indicative of tumor heterogeneity and potential drug resistance. Xenografts from six of seven patients who developed recurrence/metastasis were nonresponsive. CONCLUSIONS Models based on first generation NSCLC subrenal capsule xenografts have been developed, which are suitable for quick assessment (6-8 weeks) of the chemosensitivity of patients' cancers and selection of the most effective regimens. They hold promise for application in personalized chemotherapy of NSCLC patients.
|
49
|
Transcriptome profiles of carcinoma-in-situ and invasive non-small cell lung cancer as revealed by SAGE. PLoS One 2010; 5:e9162. [PMID: 20161782 PMCID: PMC2820080 DOI: 10.1371/journal.pone.0009162] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Received: 10/04/2009] [Accepted: 01/07/2010] [Indexed: 12/29/2022] Open
Abstract
Background Non-small cell lung cancer (NSCLC) presents as a progressive disease spanning precancerous, preinvasive, locally invasive, and metastatic lesions. Identification of biological pathways reflective of these progressive stages, and aberrantly expressed genes associated with these pathways, would conceivably enhance therapeutic approaches to this devastating disease. Methodology/Principal Findings Through the construction and analysis of SAGE libraries, we have determined transcriptome profiles for preinvasive carcinoma-in-situ (CIS) and invasive squamous cell carcinoma (SCC) of the lung, and compared these with expression profiles generated from both bronchial epithelium, and precancerous metaplastic and dysplastic lesions using Ingenuity Pathway Analysis. Expression of genes associated with epidermal development, and loss of expression of genes associated with mucociliary biology, are predominant features of CIS, largely shared with precancerous lesions. Additionally, expression of genes associated with xenobiotic metabolism/detoxification is a notable feature of CIS, and is largely maintained in invasive cancer. Genes related to tissue fibrosis and acute phase immune response are characteristic of the invasive SCC phenotype. Moreover, the data presented here suggest that tissue remodeling/fibrosis is initiated at the early stages of CIS. Additionally, this study indicates that alteration in copy-number status represents a plausible mechanism for differential gene expression in CIS and invasive SCC. Conclusions/Significance This study is the first report of large-scale expression profiling of CIS of the lung. Unbiased expression profiling of these preinvasive and invasive lesions provides a platform for further investigations into the molecular genetic events relevant to early stages of squamous NSCLC development. Additionally, up-regulated genes detected at extreme differences between CIS and invasive cancer may have potential to serve as biomarkers for early detection.
|
50
|
Abstract
Ensemble methods have become popular for QSAR modeling, but most studies have assumed balanced data, consisting of approximately equal numbers of active and inactive compounds. Cheminformatics data are often far from being balanced. We extend the application of ensemble methods to include cases of imbalance of class membership and to more adequately assess model output. Based on the extension, we propose an ensemble method called MBEnsemble that automatically determines the appropriate tuning parameters to provide reliable predictions and maximize the F-measure. Results from multiple data sets demonstrate that the proposed ensemble technique works well on imbalanced data.
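As a rough illustration of why tuning matters on imbalanced data, the sketch below averages the scores of several ensemble members and then picks the classification threshold that maximizes the F-measure. It is not the MBEnsemble algorithm, whose details the abstract does not give; the function names (`f_measure`, `average_scores`, `best_threshold`) and the score-averaging scheme are assumptions made only for this example.

```python
from typing import List, Sequence, Tuple


def f_measure(labels: Sequence[int], predictions: Sequence[int]) -> float:
    """F-measure (harmonic mean of precision and recall) for the active class (label 1)."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_scores(member_scores: Sequence[Sequence[float]]) -> List[float]:
    """Combine ensemble members by averaging their per-compound scores (an assumed scheme)."""
    n_members = len(member_scores)
    return [sum(column) / n_members for column in zip(*member_scores)]


def best_threshold(labels: Sequence[int], scores: Sequence[float]) -> Tuple[float, float]:
    """Try each observed score as a cut-off; return (threshold, F-measure) of the best one."""
    best_t, best_f = 0.0, -1.0
    for t in sorted(set(scores)):
        predictions = [1 if s >= t else 0 for s in scores]
        f = f_measure(labels, predictions)
        if f > best_f:
            best_t, best_f = t, f
    return best_t, best_f
```

With few active compounds, averaged scores tend to sit below 0.5, so a fixed 0.5 cut-off can miss actives that a tuned threshold recovers. In practice the threshold would be chosen on held-out or cross-validated data rather than on the data used for evaluation.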
|