1
Dacic S, Travis WD, Giltnane JM, Kos F, Abel J, Hilz S, Fujimoto J, Sholl L, Ritter J, Khalil F, Liu Y, Taylor-Weiner A, Resnick M, Yu H, Hirsch FR, Bunn PA, Carbone DP, Rusch V, Kwiatkowski DJ, Johnson BE, Lee JM, Hennek SR, Wapinski I, Nicholas A, Johnson A, Schulze K, Kris MG, Wistuba II. Artificial Intelligence-Powered Assessment of Pathologic Response to Neoadjuvant Atezolizumab in Patients With NSCLC: Results From the LCMC3 Study. J Thorac Oncol 2023:S1556-0864(23)02415-2. [PMID: 38070597 DOI: 10.1016/j.jtho.2023.12.010] [Received: 06/22/2023] [Revised: 11/28/2023] [Accepted: 12/04/2023] [Indexed: 12/31/2023]
Abstract
Introduction: Pathologic response (PathR) by histopathologic assessment of resected specimens may be an early clinical end point associated with long-term outcomes with neoadjuvant therapy. Digital pathology may improve the efficiency and precision of PathR assessment. LCMC3 (NCT02927301) evaluated neoadjuvant atezolizumab in patients with resectable NSCLC and reported a 20% major PathR rate.
Methods: We determined PathR in primary tumor resection specimens using guidelines-based visual techniques and developed a convolutional neural network model using the same criteria to digitally measure the percent viable tumor on whole-slide images. Concordance was evaluated between visual determination of percent viable tumor (n = 151) performed by one of the 47 local pathologists and three central pathologists.
Results: For concordance among visual determination of percent viable tumor, the interclass correlation coefficient was 0.87 (95% confidence interval [CI]: 0.84-0.90). Agreement for visually assessed 10% or less viable tumor (major PathR [MPR]) in the primary tumor was 92.1% (Fleiss kappa = 0.83). Digitally assessed percent viable tumor (n = 136) correlated with visual assessment (Pearson r = 0.73; digital/visual slope = 0.28). Digitally assessed MPR predicted visually assessed MPR with outstanding discrimination (area under receiver operating characteristic curve, 0.98) and was associated with longer disease-free survival (hazard ratio [HR] = 0.30; 95% CI: 0.09-0.97, p = 0.033) and overall survival (HR = 0.14, 95% CI: 0.02-1.06, p = 0.027) versus no MPR. Digitally assessed PathR strongly correlated with visual measurements.
Conclusions: Artificial intelligence-powered digital pathology exhibits promise in assisting pathologic assessments in neoadjuvant NSCLC clinical trials.
The development of artificial intelligence-powered approaches in clinical settings may aid pathologists in clinical operations, including routine PathR assessments, and subsequently support improved patient care and long-term outcomes.
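As a rough illustration of the MPR definition quoted in the abstract (10% or less viable tumor), the sketch below dichotomizes percent-viable-tumor readings and measures agreement between two readers. It is not the study's analysis: the abstract reports Fleiss kappa across several pathologists, whereas this sketch swaps in Cohen's kappa for two raters, and the function names and sample readings are hypothetical.

```python
from typing import Sequence

# MPR is "10% or less viable tumor" per the abstract.
MPR_THRESHOLD = 10.0


def is_mpr(percent_viable_tumor: float) -> bool:
    """Return True when a reading meets the major pathologic response cutoff."""
    return percent_viable_tumor <= MPR_THRESHOLD


def percent_agreement(a: Sequence[bool], b: Sequence[bool]) -> float:
    """Fraction of cases on which two raters give the same call."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a: Sequence[bool], b: Sequence[bool]) -> float:
    """Two-rater, two-category Cohen's kappa (a stand-in for Fleiss kappa)."""
    n = len(a)
    observed = percent_agreement(a, b)
    rate_a = sum(a) / n
    rate_b = sum(b) / n
    # Agreement expected by chance from each rater's marginal MPR rate.
    expected = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical percent-viable-tumor readings for six specimens.
visual = [5.0, 40.0, 10.0, 80.0, 0.0, 55.0]
digital = [3.0, 12.0, 8.0, 30.0, 1.0, 9.0]
visual_mpr = [is_mpr(v) for v in visual]
digital_mpr = [is_mpr(d) for d in digital]
agreement = percent_agreement(visual_mpr, digital_mpr)
kappa = cohen_kappa(visual_mpr, digital_mpr)
```

Kappa discounts the agreement two readers would reach by chance, which is why it is reported alongside raw percent agreement.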
Affiliation(s)
- Sanja Dacic
  - Department of Pathology, Yale School of Medicine, New Haven, Connecticut
- William D Travis
  - Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
- Filip Kos
  - Department of Machine Learning, PathAI, Inc., Boston, Massachusetts
- John Abel
  - Department of Machine Learning, PathAI, Inc., Boston, Massachusetts
- Stephanie Hilz
  - Research Pathology, Genentech, Inc., South San Francisco, California
- Junya Fujimoto
  - Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, Texas
- Lynette Sholl
  - Department of Anatomic Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
- Jon Ritter
  - Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, Missouri
- Farah Khalil
  - Department of Pathology, Moffitt Cancer Center, Tampa, Florida
- Yi Liu
  - Department of Machine Learning, PathAI, Inc., Boston, Massachusetts
- Murray Resnick
  - Department of Pathology, PathAI, Inc., Boston, Massachusetts
- Hui Yu
  - Department of Pathology, University of Colorado Anschutz Medical Campus, Aurora, Colorado
- Fred R Hirsch
  - Department of Hematology and Medical Oncology, University of Colorado/Icahn School of Medicine, Mount Sinai, New York
- Paul A Bunn
  - Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, Colorado
- David P Carbone
  - Division of Medical Oncology, The Ohio State University Medical Center and Pelotonia Institute for Immuno-Oncology, Columbus, Ohio
- Valerie Rusch
  - Thoracic Surgery Service, Memorial Sloan Kettering Cancer Center, New York, New York
- David J Kwiatkowski
  - Department of Anatomic Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
- Bruce E Johnson
  - Lowe Center for Thoracic Oncology, Dana-Farber Cancer Institute, Boston, Massachusetts
- Jay M Lee
  - Division of Thoracic Surgery, University of California, Los Angeles, Los Angeles, California
- Stephanie R Hennek
  - Department of Translational Research, PathAI, Inc., Boston, Massachusetts
- Ilan Wapinski
  - Department of Translational Research, PathAI, Inc., Boston, Massachusetts
- Alan Nicholas
  - U.S. Medical Affairs, Genentech, Inc., South San Francisco, California
- Ann Johnson
  - U.S. Medical Affairs, Genentech, Inc., South San Francisco, California
- Katja Schulze
  - Research Pathology, Genentech, Inc., South San Francisco, California
- Mark G Kris
  - Department of Thoracic Oncology, Memorial Sloan Kettering Cancer Center, New York, New York
- Ignacio I Wistuba
  - Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, Texas
2
Najdawi F, Sucipto K, Mistry P, Hennek S, Jayson CKB, Lin M, Fahy D, Kinsey S, Wapinski I, Beck AH, Resnick MB, Khosla A, Drage MG. Artificial Intelligence Enables Quantitative Assessment of Ulcerative Colitis Histology. Mod Pathol 2023; 36:100124. [PMID: 36841434 DOI: 10.1016/j.modpat.2023.100124] [Received: 09/29/2022] [Revised: 12/23/2022] [Accepted: 01/28/2023] [Indexed: 02/17/2023]
Abstract
Ulcerative colitis is a chronic inflammatory bowel disease that is characterized by a relapsing and remitting course. Assessment of disease activity critically informs treatment decisions. In addition to endoscopic remission, histologic remission is emerging as a treatment target and a key factor in the evaluation of disease activity and therapeutic efficacy. However, manual pathologist evaluation is semiquantitative and limited in granularity. Machine learning approaches are increasingly being developed to aid pathologists in accurate and reproducible scoring of histology, enabling precise quantitation of clinically relevant features. Here, we report the development and validation of convolutional neural network models that quantify histologic features pertinent to ulcerative colitis disease activity, directly from hematoxylin and eosin-stained whole slide images. Tissue and cell model predictions were used to generate quantitative human-interpretable features to fully characterize the histology samples. Tissue and cell predictions showed comparable agreement to pathologist annotations, and the extracted slide-level human-interpretable features demonstrated strong correlations with disease severity and pathologist-assigned Nancy histological index scores. Moreover, using a random forest classifier based on 13 human-interpretable features derived from the tissue and cell models, we were able to accurately predict Nancy histological index scores, with a weighted kappa (κ = 0.91) and Spearman correlation (ρ = 0.89, P < .001) when compared with pathologist consensus Nancy histological index scores. We were also able to predict histologic remission, based on the absence of neutrophil extravasation, with a high accuracy of 0.97. This work demonstrates the potential of computer vision to enable a standardized and robust assessment of ulcerative colitis histopathology for translational research and improved evaluation of disease activity and prognosis.
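The weighted kappa mentioned above can be sketched for ordinal grades as follows. This is an illustrative implementation, not the authors' code: the abstract does not state which weighting scheme was used, so quadratic weights (power = 2) are an assumption here, as is treating the Nancy index as five ordinal grades (0 to 4). The function name and sample scores are hypothetical.

```python
from typing import Sequence


def weighted_kappa(rater_a: Sequence[int], rater_b: Sequence[int],
                   n_categories: int, power: int = 2) -> float:
    """Weighted kappa for ordinal grades 0..n_categories-1.

    power=1 gives linear weights, power=2 quadratic weights (assumed here).
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    if n_categories < 2:
        raise ValueError("need at least two categories")
    n = len(rater_a)
    k = n_categories
    # Joint distribution of the two raters' grades.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            # Disagreement weight grows with the distance between grades.
            w = (abs(i - j) / (k - 1)) ** power
            weighted_observed += w * observed[i][j]
            weighted_expected += w * row[i] * col[j]
    if weighted_expected == 0.0:
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical model-predicted and consensus grades for five slides.
predicted = [0, 1, 2, 3, 4]
consensus = [0, 1, 2, 3, 4]
kappa_perfect = weighted_kappa(predicted, consensus, n_categories=5)
```

Unlike unweighted kappa, the weighted form penalizes a two-grade disagreement more heavily than a one-grade disagreement, which suits an ordinal histologic index.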
Affiliation(s)
- Mary Lin
  - PathAI, Inc, Boston, Massachusetts
3
Iyer JS, Pokkalla H, Biddle-Snead C, Carrasco-Zevallos O, Lin M, Shanis Z, Le Q, Juyal D, Pouryahya M, Pedawi A, Hoffman S, Elliott H, Leidal K, Myers RP, Chung C, Billin AN, Watkins TR, Resnick M, Wack K, Glickman J, Burt AD, Loomba R, Sanyal AJ, Montalto MC, Beck AH, Taylor-Weiner A, Wapinski I. AI-based histologic scoring enables automated and reproducible assessment of enrollment criteria and endpoints in NASH clinical trials. medRxiv 2023:2023.04.20.23288534. [PMID: 37162870 PMCID: PMC10168404 DOI: 10.1101/2023.04.20.23288534] [Indexed: 05/11/2023]
Abstract
Clinical trials in nonalcoholic steatohepatitis (NASH) require histologic scoring for assessment of inclusion criteria and endpoints. However, guidelines for scoring key features have led to variability in interpretation, impacting clinical trial outcomes. We developed an artificial intelligence (AI)-based measurement (AIM) tool for scoring NASH histology (AIM-NASH). AIM-NASH predictions for NASH Clinical Research Network (CRN) grades of necroinflammation and stages of fibrosis aligned with expert consensus scores and were reproducible. Continuous scores produced by AIM-NASH for key histological features of NASH correlated with mean pathologist scores and with noninvasive biomarkers and strongly predicted patient outcomes. In a retrospective analysis of the ATLAS trial, previously unmet pathological endpoints were met when scored by the AIM-NASH algorithm alone. Overall, these results suggest that AIM-NASH may assist pathologists in histologic review of NASH clinical trials, reducing inter-rater variability on trial outcomes and offering a more sensitive and reproducible measure of patient therapeutic response.
Affiliation(s)
- Oscar Carrasco-Zevallos
  - PathAI, Boston, MA, USA
  - Affiliation shown is that during the time of study; current affiliation is Johnson & Johnson, New Brunswick, NJ, USA
- Maryam Pouryahya
  - PathAI, Boston, MA, USA
  - Affiliation shown is that during the time of study; current affiliation is AstraZeneca, Gaithersburg, MD, USA
- Aryan Pedawi
  - PathAI, Boston, MA, USA
  - Affiliation shown is that during the time of study; current affiliation is Atomwise, San Francisco, CA, USA
- Hunter Elliott
  - PathAI, Boston, MA, USA
  - Affiliation shown is that during the time of study; current affiliation is BigHat Biosciences, San Mateo, CA, USA
- Kenneth Leidal
  - PathAI, Boston, MA, USA
  - Affiliation shown is that during the time of study; current affiliation is Genesis Therapeutics, Burlingame, CA, USA
- Robert P. Myers
  - Gilead Sciences, Inc., Foster City, CA, USA
  - Affiliation shown is that during the time of study; current affiliation is OrsoBio, Inc., Palo Alto, CA, USA
- Chuhan Chung
  - Gilead Sciences, Inc., Foster City, CA, USA
  - Affiliation shown is that during the time of study; current affiliation is Inipharm, San Diego, CA, USA
- Murray Resnick
  - PathAI, Boston, MA, USA
  - Affiliation shown is that during the time of study; current affiliation is Rhode Island Hospital and The Miriam Hospital, Providence, RI, USA
- Rohit Loomba
  - NAFLD Research Center, Division of Gastroenterology and Hepatology, University of California at San Diego, San Diego, CA, USA
- Arun J. Sanyal
  - Stravitz-Sanyal Institute for Liver Disease and Metabolic Health, VCU School of Medicine, Richmond, VA, USA
4
Conway J, Pouryahya M, Gindin Y, Pan DZ, Carrasco-Zevallos OM, Mountain V, Subramanian GM, Montalto MC, Resnick M, Beck AH, Huss RS, Myers RP, Taylor-Weiner A, Wapinski I, Chung C. Integration of deep learning-based histopathology and transcriptomics reveals key genes associated with fibrogenesis in patients with advanced NASH. Cell Rep Med 2023; 4:101016. [PMID: 37075704 PMCID: PMC10140650 DOI: 10.1016/j.xcrm.2023.101016] [Received: 04/14/2022] [Revised: 12/31/2022] [Accepted: 03/21/2023] [Indexed: 04/21/2023]
Abstract
Nonalcoholic steatohepatitis (NASH) is the most common chronic liver disease globally and a leading cause for liver transplantation in the US. Its pathogenesis remains imprecisely defined. We applied two high-resolution modalities, machine learning (ML)-based quantification of histological features and transcriptomics, to tissue samples from NASH clinical trials to identify genes that are associated with disease progression and clinical events. A histopathology-driven 5-gene expression signature predicted disease progression and clinical events in patients with NASH with F3 (pre-cirrhotic) and F4 (cirrhotic) fibrosis. Notably, the Notch signaling pathway and genes implicated in liver-related diseases were enriched in this expression signature. In a validation cohort where pharmacologic intervention improved disease histology, multiple Notch signaling components were suppressed.
5
Qamra A, Srivastava MK, Fuentes E, Trotter B, Biju R, Chhor G, Cowan J, Gendreau S, Lincoln W, McGinnis L, Molinero L, Patil NS, Schedlbauer A, Schulze K, Stanford-Moore A, Chambre L, Wapinski I, Shames DS, Koeppen H, Hennek S, Fridlyand J, Giltnane JM, Amitai A. Abstract 5705: Digital pathology based prognostic & predictive biomarkers in metastatic non-small cell lung cancer. Cancer Res 2023. [DOI: 10.1158/1538-7445.am2023-5705] [Indexed: 04/07/2023]
Abstract
Background: In recent years, a relationship between the tumor microenvironment (TME) and patient response to targeted cancer immunotherapy has been suggested. We applied machine-learning algorithms on H&E stained tissue to study the TME in metastatic non-small cell lung cancer (NSCLC) patients. Our goal was to identify digital pathology (DP) features associated with outcome under combination treatment or monotherapy with atezolizumab (atezo), an anti-PD-L1 therapy, and relate those features to other data modalities. We analyzed patient data from two phase 3 clinical trials, OAK (docetaxel versus atezo in 2L+ NSCLC) and IMpower150 (bevacizumab, carboplatin, and paclitaxel (BCP) versus BCP+atezo (ABCP) in advanced 1L non-squamous NSCLC).
Methods: As part of our effort to build a DP-based tumor-immune microenvironment atlas, digitized H&E images were registered onto the PathAI research platform. Over 200K annotations from 90 pathologists were used to train convolutional neural networks (CNNs) that classify slide-level human-interpretable features (HIFs) of cells and tissue structures from images; the trained models were then deployed on images from OAK and IMpower150. HIFs and PD-L1 status were associated with outcome in all samples in each arm in OAK, and results were validated in IMpower150, using Cox proportional hazards models. Bulk RNAseq was run using samples extracted from the same area as the H&E slide.
Results: We identified a composite feature capturing the ratio of immune cells to fibroblasts in the stroma predictive of both overall survival (OS) (HR=0.74 p=0.0046) and progression-free survival (PFS) (HR=0.87 p=0.14). While patients primarily benefit from atezo if they are PD-L1 high, we found that even PD-L1 negative patients benefited from atezo when enriched for this feature (22C3 PD-L1 assay: OS HR=0.59 p=0.015, PFS HR=0.8 p=0.25; SP142 PD-L1 assay: OS HR=0.74 p=0.12, PFS HR=0.88 p=0.45). We thus recognized a DP feature that was predictive for positive outcome with atezo treatment, independent of PD-L1 levels. This association was then validated in IMpower150 comparing ABCP to BCP, both overall (OS HR=0.69 p=0.012) and in PD-L1 negative patients (SP263 assay OS HR=0.56 p=0.034). Integrating with RNAseq, patients enriched for this DP feature showed similar enrichment for B and T gene signatures and depletion in CAF-related gene signatures, thus showing the harmonization of TME between different data modalities.
Conclusions: Using a deep learning-based assay for quantifying pathology features of the TME from H&E images in two NSCLC trials, we identified a novel biomarker predictive of outcome to PD-L1 targeting therapy, even in PD-L1 low & negative patients. Importantly, our work shows how different data modalities (DP, gene expression) can be integrated to further our understanding of the TME.
Citation Format: Aditi Qamra, Minu K. Srivastava, Eloisa Fuentes, Ben Trotter, Raymond Biju, Guillaume Chhor, James Cowan, Steven Gendreau, Webster Lincoln, Lisa McGinnis, Luciana Molinero, Namrata S. Patil, Amber Schedlbauer, Katja Schulze, Adam Stanford-Moore, Laura Chambre, Ilan Wapinski, David S. Shames, Hartmut Koeppen, Stephanie Hennek, Jane Fridlyand, Jennifer M. Giltnane, Assaf Amitai. Digital pathology based prognostic & predictive biomarkers in metastatic non-small cell lung cancer. [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 5705.
Affiliation(s)
- Aditi Qamra
  - Hoffmann-La Roche Limited, Mississauga, Ontario, Canada
6
Kirkup C, Vasudevan S, Kos F, Trotter B, Resnick M, Beck AH, Montalto M, Wapinski I, Glass B, Lin M, Hennek S, Khosla A, Drage MG, Chambre L. Abstract P6-04-08: Machine learning-based characterization of the breast cancer tumor microenvironment for assessment of neoadjuvant-treatment response. Cancer Res 2023. [DOI: 10.1158/1538-7445.sabcs22-p6-04-08] [Indexed: 03/06/2023]
Abstract
Background: Neoadjuvant treatment of breast cancer has been shown to potentially reduce the extent and morbidity of subsequent surgery. Response to neoadjuvant therapy may also be prognostic; complete pathologic response (pCR) following neoadjuvant treatment is associated with improved long-term outcomes. pCR, defined as the absence of residual invasive cancer, is determined by evaluation of H&E-stained breast resections and regional lymph nodes following neoadjuvant treatment; however, pathologist assessment is subject to intra- and inter-reader variability. Here we report machine learning (ML)-based models to identify tissue regions and cell types in the tumor microenvironment (TME) of H&E-stained breast cancer specimens. Model predictions were used to derive tumor bed area, a key component of the residual cancer burden score (RCB) used to assess neoadjuvant-treatment pathological response.
Methods: Convolutional neural network (CNN) models were trained using digitized H&E-stained whole slide images (WSIs) of 2700 neoadjuvant-treated breast cancer specimens (resections and biopsies) from 4 sources, and an additional 1100 breast cancer primary resections from TCGA. 229,901 pathologist annotations were used to train CNN models to segment tissue regions (cancer epithelium, stroma, diffuse inflammatory infiltrate, ductal carcinoma in situ, lymph nodes and necrosis) and cell types (cancer epithelial cells, fibroblasts, lymphocytes, macrophages, foamy macrophages and plasma cells) at single-pixel resolution. These tissue region segmentations were then used to derive tumor bed area using a convex hull algorithm. Each model was evaluated by board certified pathologists for performance. Model predictions of tumor bed area were evaluated in comparison to mean measurements from 3 pathologists for each of 22 held-out test slides. To further assess cell model performance, 5 pathologists exhaustively annotated 120 frames (300 x 300 pixels) on test samples from a dataset not used in model development (N=536; resections and biopsies) to produce consensus ground truth cell labels. Model predictions were compared with pathologist annotations in these frames using Pearson correlation, precision, recall, and F1 metrics. Only those classes with greater than 50 consensus cells identified were evaluated.
Results: CNN predictions of tissue and cell classes within H&E breast cancer WSIs showed concordance with manual pathologist consensus labels. The weighted average Pearson correlation (across the relevant cell types) between the model and consensus was 0.75, comparable to the correlation of 0.81 between pathologists and consensus. Classification metrics for each cell class are reported in Table 1. Reduced performance of the model relative to the average pathologist performance may be due to heterogeneous slide characteristics and infrequency of some cell types in the data. For prediction of tumor bed area, CNN model predictions showed moderate correlation with pathologist consensus (Pearson r=0.65, 95% CI: 0.38-0.81).
Conclusions: CNN model classification of cell types and tissue regions across entire H&E breast cancer WSIs shows concordance with pathologist consensus. Model predictions of tumor bed area also show concordance with pathologist assessment and can be used to derive the RCB score. These models can be reproducibly applied to quantify diverse histological features in large datasets, potentially enabling improved standardization and efficiency of pathologist evaluation of the breast cancer TME and neoadjuvant response.
Table 1. Classification Metrics for Individual Cell Classes
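The Methods above derive tumor bed area from tissue segmentations with a convex hull algorithm. A minimal sketch of that idea follows, assuming the input is a list of (x, y) pixel coordinates predicted as tumor-bed tissue and that a microns-per-pixel scale is known. The abstract does not say which hull algorithm was used; Andrew's monotone chain is swapped in here, and all names and values are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: Sequence[Point]) -> List[Point]:
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        # Pop points that would make a clockwise or collinear turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices: Sequence[Point]) -> float:
    """Shoelace formula for the area of a simple polygon."""
    n = len(vertices)
    if n < 3:
        return 0.0
    total = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def tumor_bed_area(tissue_pixels: Sequence[Point],
                   microns_per_pixel: float) -> float:
    """Hull area in square microns (the scale is a hypothetical parameter)."""
    hull = convex_hull(tissue_pixels)
    return polygon_area(hull) * microns_per_pixel ** 2


# Hypothetical pixel coordinates; (2, 1) lies inside the hull.
pixels = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1)]
area_um2 = tumor_bed_area(pixels, microns_per_pixel=0.5)
```

A convex hull encloses gaps between scattered tumor-bed regions, so it yields a single contiguous area estimate even when the segmented tissue is fragmented.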
Citation Format: Christian Kirkup, Sanjana Vasudevan, Filip Kos, Benjamin Trotter, Murray Resnick, Andrew H. Beck, Michael Montalto, Ilan Wapinski, Ben Glass, Mary Lin, Stephanie Hennek, Archit Khosla, Michael G. Drage, Laura Chambre. Machine learning-based characterization of the breast cancer tumor microenvironment for assessment of neoadjuvant-treatment response [abstract]. In: Proceedings of the 2022 San Antonio Breast Cancer Symposium; 2022 Dec 6-10; San Antonio, TX. Philadelphia (PA): AACR; Cancer Res 2023;83(5 Suppl):Abstract nr P6-04-08.
7
Nguyen TH, Mirzadeh M, Prakash A, Krause EL, Zhang J, Pyle M, Ogayo ER, Cramer HC, Kurt BB, Brosnan-Cashman J, Drage MG, Schnitt S, Beck AH, Montalto M, Wapinski I, Chambre L, Tolaney S, Waks A, Lee J, Mittendorf EA. Abstract P5-02-09: Quantitative analysis of fiber-level collagen features in H&E whole-slide images predicts neoadjuvant therapy response in patients with HER2+ breast cancer. Cancer Res 2023. [DOI: 10.1158/1538-7445.sabcs22-p5-02-09] [Indexed: 03/06/2023]
Abstract
Background: Neoadjuvant treatment (NAT) combining chemotherapy and HER2-targeted agents is frequently administered to HER2-positive (HER2+) breast cancer (BC) patients, with some experiencing a pathological complete response (pCR) and others having residual disease measured by the residual cancer burden (RCB) score. Here, we use a physics-guided machine learning (ML)-based approach to extract fiber-level collagen features from hematoxylin and eosin (H&E)-stained whole slide images (WSIs) and identify collagen-related associations with treatment response in HER2+ patients receiving NAT.
Methods: Clinical data and specimens from stage II-III HER2+ BC patients enrolled on the De-escalation to Adjuvant Antibodies Post-pCR to Neoadjuvant THP (DAPHNe; NCT03716180) clinical trial and treated with neoadjuvant paclitaxel/trastuzumab/pertuzumab were analyzed. An ML-based model trained to identify regions of BC tissue as invasive carcinoma, ductal carcinoma in situ (DCIS), diffuse inflammatory infiltrate, stroma, necrosis, or normal tissue was deployed on WSIs of H&E-stained diagnostic core needle biopsies (N=89) to generate tissue overlays. Additional tissue areas were computed from the tissue model predictions using heatmap transformation, including tumor nests (continuous regions predicted as invasive cancer epithelium or DCIS), tumor nest borders (stromal region boundaries 10 μm from tumor nests), and bulk tumor borders (stromal region boundaries 300 μm from aggregated tumor nests). A separate ML-based model trained to identify fiber-level collagen features in WSIs of H&E-stained specimens was also deployed to generate collagen overlays. A fiber feature extraction pipeline was utilized to characterize properties of all identified collagen fibers in the WSI (on the order of hundreds of thousands per slide), including length, width, tortuosity, and angle. These fiber features were then assessed based on their position within the tumor (e.g., relative to the tumor nest border). Combinatorial features (e.g., angle of fibers with respect to tumor boundary) were then explored univariately for associations (N=609) with treatment response. Patients with pCR (RCB=0; N=53) were considered responders, while all other cases (RCB I-III; N=36) were designated non-responders. Due to the small size of the cohort analyzed here, raw p-values are reported.
Results: Using estrogen receptor status as a clinical covariate, a logistic regression-based univariate analysis of 609 collagen-associated features revealed six features to strongly associate with pCR (p< 0.05, AUC≥0.75; Table 1). Notable feature themes were identified: 1) fiber tortuosity in tumor nest borders and tumor borders, 2) angle of fibers in tumor border with respect to tumor boundary, and 3) distribution patterns of fiber width in tumor nest borders. The presence of fibers perpendicular to tumor boundary tangents was negatively associated with pCR, as was higher fiber tortuosity and thickness in tumor nest borders.
Conclusions: Improved prediction of response to NAT in patients with BC is needed to determine appropriate treatment strategies for each patient. Here, using ML-based models to identify tissue features and collagen fibers, we identify collagen-associated features, measured directly from WSIs of H&E-stained diagnostic BC biopsies, that negatively correlate with pCR. Additional development of this strategy, including the addition of cell identification models and known clinical information, is underway to further refine this novel predictive model.
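Two of the fiber properties named in the Methods, tortuosity and angle relative to the tumor boundary, can be illustrated with a short sketch. The abstract does not define either quantity, so the definitions below are assumptions: tortuosity is taken as path length divided by end-to-end distance, and the angle is the acute angle between a fiber direction and the local boundary tangent. Function names and coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def path_length(points: Sequence[Point]) -> float:
    """Sum of segment lengths along a fiber traced as a polyline."""
    return sum(math.dist(points[i], points[i + 1])
               for i in range(len(points) - 1))


def tortuosity(points: Sequence[Point]) -> float:
    """Assumed definition: path length / straight-line end-to-end distance."""
    if len(points) < 2:
        raise ValueError("a fiber needs at least two points")
    chord = math.dist(points[0], points[-1])
    if chord == 0.0:
        raise ValueError("fiber endpoints coincide; tortuosity is undefined")
    return path_length(points) / chord


def angle_to_tangent(fiber_dir: Point, tangent_dir: Point) -> float:
    """Acute angle in degrees (0 to 90) between a fiber and a boundary tangent."""
    norm = math.hypot(*fiber_dir) * math.hypot(*tangent_dir)
    if norm == 0.0:
        raise ValueError("direction vectors must be non-zero")
    dot = fiber_dir[0] * tangent_dir[0] + fiber_dir[1] * tangent_dir[1]
    # abs() ignores fiber orientation; min() guards against rounding above 1.
    cosine = min(1.0, abs(dot) / norm)
    return math.degrees(math.acos(cosine))


# Hypothetical fibers: one straight, one with a right-angle bend.
straight = tortuosity([(0.0, 0.0), (3.0, 4.0)])
bent = tortuosity([(0.0, 0.0), (0.0, 3.0), (4.0, 3.0)])
perpendicular = angle_to_tangent((0.0, 1.0), (1.0, 0.0))
```

Under these definitions a perfectly straight fiber has tortuosity 1, and a fiber at 90 degrees to the tangent is perpendicular to the tumor boundary, the orientation the Results report as negatively associated with pCR.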
Citation Format: Tan H. Nguyen, Mohammad Mirzadeh, Aaditya Prakash, Emma L. Krause, Jun Zhang, Michael Pyle, Esther R. Ogayo, Harry C. Cramer, Busem Binboga Kurt, Jacqueline Brosnan-Cashman, Michael G. Drage, Stuart Schnitt, Andrew H. Beck, Michael Montalto, Ilan Wapinski, Laura Chambre, Sara Tolaney, Adrienne Waks, Justin Lee, Elizabeth A. Mittendorf. Quantitative analysis of fiber-level collagen features in H&E whole-slide images predicts neoadjuvant therapy response in patients with HER2+ breast cancer [abstract]. In: Proceedings of the 2022 San Antonio Breast Cancer Symposium; 2022 Dec 6-10; San Antonio, TX. Philadelphia (PA): AACR; Cancer Res 2023;83(5 Suppl):Abstract nr P5-02-09.
8
Abel J, Kirkup C, Kos F, Gerardin Y, Srinivasan S, Brosnan-Cashman J, Leidal K, Vasudevan S, Rajan D, Jain S, Prakash A, Padigela H, Conway J, Patel N, Trotter B, Yu L, Taylor-Weiner A, Krause EL, Bronnimann M, Chambre L, Glass B, Parmar C, Hennek S, Khosla A, Resnick M, Beck AH, Montalto M, Najdawi F, Drage MG, Wapinski I. Abstract P4-09-08: AI-based quantitation of cancer cell and fibroblast nuclear morphology reflects transcriptomic heterogeneity and predicts survival in breast cancer. Cancer Res 2023. [DOI: 10.1158/1538-7445.sabcs22-p4-09-08] [Indexed: 03/06/2023]
Abstract
Background: Morphological features of cancer cell nuclei are routinely used to assess disease severity and prognosis, and cancer nuclear morphology has been linked to genomic alterations. Quantitative analyses of the nuclear features of cancer cells and other tumor-resident cell types, such as cancer-associated fibroblasts (CAFs), may reveal novel biomarkers for prognosis and treatment response. Here, we applied a pan-cancer nucleus detection and segmentation algorithm and a cell classification model to hematoxylin and eosin (H&E)-stained whole slide images (WSIs) of breast cancer specimens, enabling the measurement of morphological features of nuclei of multiple cell types within a tumor.
Methods: Convolutional Neural Network models for 1) nucleus detection and segmentation and 2) cell classification were deployed on H&E-stained WSIs from The Cancer Genome Atlas (TCGA) breast cancer dataset (primary surgical resections; N=890). Separate models were trained to segment regions of stromal subtypes, such as inflamed and fibroblastic stroma. Nuclear features (area, axis length, eccentricity, color, and texture) were computed and aggregated across each slide to summarize slide-level nuclear morphology for each cell type. Next-generation sequencing-based metrics of genomic instability (N=774) and gene expression (N=868) were acquired and paired with TCGA WSIs. Gene set enrichment analysis was performed using the Molecular Signatures Database. Spearman correlation compared nuclear features to genomic instability metrics. Linear regression was used to assess the relationship between nuclear features and bulk gene expression. Multivariable Cox regression with age and ordinal tumor stage as covariates was used to find association between overall survival (OS) and nuclear features. All reported results were significant (p< 0.05) when adjusted for false discovery rate via the Benjamini-Hochberg procedure.
Results: Variation in cancer cell nuclear area, a quantitative metric related to pathologist-assessed nuclear pleomorphism, was calculated by the standard deviation of the nuclear area of cancer cells across a WSI. This feature was associated with genomic instability, as measured by aneuploidy score (r=0.448) and homologous recombination deficiency score (r=0.382), and reduced OS. In contrast, the variability in fibroblast and lymphocyte nuclear areas did not correlate with either metric of genomic instability (all r< 0.1, p>0.05). Furthermore, an association between variation in cancer cell nuclear area with the expression of cell cycle and proliferation pathway genes was observed, suggesting that increased nuclear size heterogeneity may indicate a more aggressive cancer phenotype. Features quantifying CAF nuclear morphology were also assessed, revealing that CAF nucleus shape (larger minor axis length) was associated with lower OS, as well as the expression of gene sets involved in extracellular matrix remodeling and degradation.
Conclusions: The nuclear morphologies of breast cancer cells and CAFs reflect underlying genomic and transcriptomic properties of the tumor and correlate with patient outcome. The application of digital pathology analysis of breast cancer histopathology slides enables the integrative study of genomics, transcriptomics, tumor morphology, and overall survival to support research into disease biology and biomarker discovery.
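Two computations named in this abstract are simple enough to sketch: the slide-level variation in nuclear area (a standard deviation) and the Benjamini-Hochberg adjustment for false discovery rate. This is an illustration, not the authors' pipeline. Whether the study used the sample or population standard deviation is not stated, so the sample form is an assumption, and the function names and input values are hypothetical.

```python
import statistics
from typing import List, Sequence


def nuclear_area_variation(areas: Sequence[float]) -> float:
    """Slide-level spread of nuclear area (sample SD; assumed, not stated)."""
    return statistics.stdev(areas)


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical nuclear areas (square microns) and raw p-values.
variation = nuclear_area_variation([100.0, 110.0, 90.0])
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [q < 0.05 for q in adjusted]
```

A feature would then be reported as significant when its adjusted value falls below the chosen false discovery rate, 0.05 in the abstract.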
Citation Format: John Abel, Christian Kirkup, Filip Kos, Ylaine Gerardin, Sandhya Srinivasan, Jacqueline Brosnan-Cashman, Ken Leidal, Sanjana Vasudevan, Deepta Rajan, Suyog Jain, Aaditya Prakash, Harshith Padigela, Jake Conway, Neel Patel, Benjamin Trotter, Limin Yu, Amaro Taylor-Weiner, Emma L. Krause, Matthew Bronnimann, Laura Chambre, Ben Glass, Chintan Parmar, Stephanie Hennek, Archit Khosla, Murray Resnick, Andrew H. Beck, Michael Montalto, Fedaa Najdawi, Michael G. Drage, Ilan Wapinski. AI-based quantitation of cancer cell and fibroblast nuclear morphology reflects transcriptomic heterogeneity and predicts survival in breast cancer [abstract]. In: Proceedings of the 2022 San Antonio Breast Cancer Symposium; 2022 Dec 6-10; San Antonio, TX. Philadelphia (PA): AACR; Cancer Res 2023;83(5 Suppl):Abstract nr P4-09-08.
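The abstract above reports significance after false discovery rate adjustment via the Benjamini-Hochberg procedure. As a rough illustration of that adjustment (not the authors' code; the function name and the p-values are hypothetical), a minimal Python sketch might look like this:

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for five nuclear features (illustration only).
raw_p = [0.001, 0.008, 0.039, 0.041, 0.60]
adjusted_p = benjamini_hochberg(raw_p)
significant = [q < 0.05 for q in adjusted_p]
```

Features whose adjusted value falls below 0.05 would be the ones reported as significant under this scheme.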
|
9
|
Michener C, Kirkup C, Rahsepar B, Iyer J, Abel J, Leidal K, Khosla A, Trotter B, Lin M, Resnick M, Glass B, Wapinski I, Najdawi F. 593P AI-powered analysis of nuclear morphology associated with prognosis in high-grade serous carcinoma. Ann Oncol 2022. [DOI: 10.1016/j.annonc.2022.07.721]
|
10
|
Griffin M, Gemici M, Javed A, Agrawal N, Resnick M, Yu L, Hoffman S, Mountain V, Harisiades J, Rothney M, Glass B, Wapinski I, Beck A, Walk E. Abstract 471: AIM PD-L1-NSCLC: Artificial intelligence-powered PD-L1 quantification for accurate prediction of tumor proportion score in diverse, multi-stain clinical tissue samples. Cancer Res 2022. [DOI: 10.1158/1538-7445.am2022-471]
Abstract
Introduction: Important immunotherapy drugs targeting PD-L1 are approved for first- and second-line treatment for various stages of NSCLC. Reproducible and precise evaluation of PD-L1 expression is essential to accurately evaluate patients’ eligibility for treatment and for enrollment in clinical trials. Current guidelines rely on pathologists to interpret tumor samples, which is challenging in part because different PD-L1 assays have distinct scoring criteria. As a result, determining eligibility by manual assessment can be inconsistent and inaccurate, leading to untreated patients. To support pathologist quantification of PD-L1 in clinical trials, PathAI has developed scanner- and antibody-agnostic machine learning (ML) models, AI-based histologist measurement of PD-L1 in NSCLC (AIM-PD-L1-NSCLC), for the quantification of PD-L1 expression in NSCLC using four PD-L1 immunohistochemistry (IHC) clones.
Methods: AIM-PD-L1-NSCLC was trained using convolutional neural networks to identify and quantify PD-L1-positive cells in digitized whole slide images (WSI) of tissue samples. Models were developed using over 5,000 diverse clinical biopsies and resections, including primary and metastatic adenocarcinoma and squamous cell carcinoma samples collected from 10 clinical trials and from two clinical laboratories, each stained for PD-L1 with one of four IHC clones: SP263 (N=1,320), SP142 (N=1,829) (both Ventana Medical Systems Inc., Tucson, AZ), 28-8 (N=1,331), or 22C3 (N=843) (both Agilent Technologies, Santa Clara, USA). Slides were digitized using Aperio, Philips, and Ventana scanners, and WSI were split into training (N=3,818) and test (N=1,505) datasets. The training dataset was annotated by board-certified pathologists (313,770 annotations) to label tissue regions and cells. Human Interpretable features representing the number of tumor cells were automatically extracted from the model, and a slide-level Tumor Proportion Score (TPS) was calculated as the number of PD-L1+ cancer cells divided by the total number of cancer cells in tumor regions. Model-predicted slide-level TPS values were compared with the median TPS of five pathologists’ scores using intraclass correlation coefficient (ICC) statistics.
Results: There was high concordance between ML model-predicted and median pathologists’ slide-level TPS for all PD-L1 clones combined (ICC 0.93; 95% CI 0.90-0.94) and for each individual clone: 22C3, ICC 0.93 (95% CI 0.89-0.96); SP142, ICC 0.88 (95% CI 0.79-0.93); SP263, ICC 0.96 (95% CI 0.93-0.97); 28-8, ICC 0.90 (95% CI 0.85-0.93).
Conclusions: AIM PD-L1 NSCLC is highly concordant with the gold standard pathologist consensus score across four PD-L1 clones in a large diverse dataset. This model could support patient enrollment and stratification in prospective clinical trials, as well as quality control of staining and pathology drift.
Citation Format: Michael Griffin, Mevlana Gemici, Ashar Javed, Nishant Agrawal, Murray Resnick, Limin Yu, Sara Hoffman, Victoria Mountain, Jamie Harisiades, Megan Rothney, Benjamin Glass, Ilan Wapinski, Andrew Beck, Eric Walk. AIM PD-L1-NSCLC: Artificial intelligence-powered PD-L1 quantification for accurate prediction of tumor proportion score in diverse, multi-stain clinical tissue samples [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2022; 2022 Apr 8-13. Philadelphia (PA): AACR; Cancer Res 2022;82(12_Suppl):Abstract nr 471.
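The slide-level Tumor Proportion Score described in the Methods above is a ratio of cell counts. The following Python sketch shows that arithmetic only; the function name, the percentage scaling, and the example counts are assumptions made for illustration and are not taken from the AIM-PD-L1-NSCLC implementation.

```python
def tumor_proportion_score(pdl1_positive_cancer_cells, total_cancer_cells):
    """Return TPS as a percentage: PD-L1+ cancer cells over all cancer cells."""
    if total_cancer_cells <= 0:
        raise ValueError("total_cancer_cells must be positive")
    if not 0 <= pdl1_positive_cancer_cells <= total_cancer_cells:
        raise ValueError("positive cell count must lie between 0 and the total")
    return 100.0 * pdl1_positive_cancer_cells / total_cancer_cells


# Hypothetical counts for one slide (illustration only).
tps = tumor_proportion_score(pdl1_positive_cancer_cells=450, total_cancer_cells=900)
```

Only cancer cells inside tumor regions enter either count, so the quality of the upstream cell and tissue classification determines the score.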
|
11
|
Abel J, Rivard C, Kos F, Chhor G, Liu Y, Giltnane J, Hoffman S, Resnick M, Hedvat C, Taylor-Weiner A, Khalil F, Nicholas A, Fishbein GA, Sholl LM, Rekhtman N, Hennek S, Wapinski I, Johnson A, Montalto M, Schulze K, Johnson BE, Carbone DP, Shilo K, Beck AH, Dacic S, Travis WD, Wistuba I. Abstract CT112: AI-powered and manual assessment of PD-L1 are comparable in predicting response to neoadjuvant atezolizumab in patients (pts) with resectable non-squamous, non-small cell lung cancer (NSCLC). Cancer Res 2022. [DOI: 10.1158/1538-7445.am2022-ct112]
Abstract
Background: PD-L1 expression evaluated by immunohistochemistry (IHC) is a well-established predictor of response to anti-PD-L1/PD-1 cancer immunotherapy (CIT). The Phase II LCMC3 (NCT02927301) study evaluated pre-operative treatment (tx) with atezolizumab (anti-PD-L1) in pts with untreated early stage resectable NSCLC, achieving a 20% major pathologic response (MPR) rate (primary efficacy pts, n=143). A digital PD-L1 scoring method was developed to assess PD-L1 expression as a potential predictive marker for MPR in squamous and non-squamous tumor samples from LCMC3.
Methods: Manual scoring was used to determine PD-L1 status on pre-tx biopsy samples using the tumor proportion score (TPS) with a positive threshold of TPS≥50 (22C3). Binary results were correlated with MPR and stratified by squamous/non-squamous histology. A digital pathology workflow for automated PD-L1 scoring was developed to yield a precise continuous PD-L1 TPS. Deep convolutional neural networks trained using pathologist annotations were used to detect individual cells within the tumor and tumor microenvironment and quantify their PD-L1 expression. These cell type predictions were used to compute a digital PD-L1 TPS. LCMC3 pts with available digital and manual PD-L1 scores were then used to assess the role of PD-L1 expression in predicting MPR.
Results: PD-L1 scores were available for pre-tx biopsies from 108 pts. No significant difference in scores was seen between histological subtypes. At cutoff (Oct 15, 2021), TPS≥50 was seen in 41 (non-squamous, n=26 [39%]; squamous, n=15 [36%]) of 108 pts and was associated with MPR in non-squamous (odds ratio [OR], 28.6; P<0.001; Fisher’s exact test) but not squamous histology (OR, 1.3; P=1.0). Continuous digital PD-L1 scores (range: 0-100) were highly correlated with local manual PD-L1 scores (range: 0-100) for squamous (n=42, Pearson r=0.90, P<0.001) and non-squamous stained histology slides (n=66, Pearson r=0.90, P<0.001). Continuous digital and manual PD-L1 TPS on pre-tx biopsies (n=108) were predictive of MPR (digital: area under the receiver operating characteristic curve [AUROC]=0.678, logistic regression [LR] P=0.01; manual: AUROC=0.675, LR P=0.003). Strikingly, when pts were stratified by histology, PD-L1 scores were predictive of MPR from pre-tx biopsies for non-squamous samples (n=66; digital: AUROC=0.821, LR P=0.002; manual: AUROC=0.819, LR P=0.001) but not for squamous samples (n=42; digital: AUROC=0.519, LR P=0.93; manual: AUROC=0.506, LR P=0.90), despite no significant difference in MPR rate between the 2 groups.
Conclusions: These findings support using digitally assessed PD-L1 IHC as a centralized and standardized scoring system and suggest that tumor histological subtype could be an important factor in the utility of PD-L1 as a predictive biomarker for neoadjuvant CIT in early stage NSCLC.
Citation Format: John Abel, Christopher Rivard, Filip Kos, Guillaume Chhor, Yi Liu, Jennifer Giltnane, Sara Hoffman, Murray Resnick, Cyrus Hedvat, Amaro Taylor-Weiner, Farah Khalil, Alan Nicholas, Gregory A. Fishbein, Lynette M. Sholl, Natasha Rekhtman, Stephanie Hennek, Ilan Wapinski, Ann Johnson, Michael Montalto, Katja Schulze, Bruce E. Johnson, David P. Carbone, Konstantin Shilo, Andrew H. Beck, Sanja Dacic, William D. Travis, Ignacio Wistuba. AI-powered and manual assessment of PD-L1 are comparable in predicting response to neoadjuvant atezolizumab in patients (pts) with resectable non-squamous, non-small cell lung cancer (NSCLC) [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2022; 2022 Apr 8-13. Philadelphia (PA): AACR; Cancer Res 2022;82(12_Suppl):Abstract nr CT112.
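The AUROC values reported above summarize how well a continuous PD-L1 score separates responders from non-responders. One way to compute an AUROC is as the fraction of responder/non-responder pairs in which the responder has the higher score, with ties counted as half. The Python sketch below illustrates that definition; the function name and the scores and labels are hypothetical and are not LCMC3 data.

```python
def auroc(scores, labels):
    """AUROC as the probability that a positive case outscores a negative case."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical continuous TPS values and MPR labels (1 = MPR, 0 = no MPR).
example_scores = [90, 60, 5, 70, 10, 1]
example_labels = [1, 1, 0, 0, 0, 0]
example_auroc = auroc(example_scores, example_labels)
```

A value near 0.5, as reported for the squamous subgroup, means the score ranks responders above non-responders about as often as chance would.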
Affiliation(s)
- Christopher Rivard
- University of Colorado School of Medicine, Division of Medical Oncology, Aurora, CO
- David P. Carbone
- The Ohio State University Comprehensive Cancer Center, Columbus, OH
- Sanja Dacic
- University of Pittsburgh Medical Center, Pittsburgh, PA
- Ignacio Wistuba
- The University of Texas MD Anderson Cancer Center, Houston, TX
|
12
|
Egger R, Ieong M, Resnick M, Taylor-Weiner A, Mountain V, Wapinski I, Montalto M, Beck A, Hayes J, Glass B. Abstract 449: Machine learning models identify histological features that can predict KEAP1 mutations in lung adenocarcinoma. Cancer Res 2022. [DOI: 10.1158/1538-7445.am2022-449]
Abstract
Background: In lung cancer, the KEAP1/NRF2 pathway modulates an anti-tumor response by regulating cellular metabolism and inflammatory processes. Approximately 19% of lung adenocarcinoma (LUAD) tumors have a mutation in KEAP1. Clinically, KEAP1-mutated LUAD has poor prognosis, and there is a need for rapid and accurate patient genotyping to inform treatment decisions. Here, we describe machine learning (ML) models that can predict KEAP1 mutation status from tissue histology.
Methods: ML models, pre-trained to identify and quantify areas of tissue (cancer epithelium, cancer stroma, and necrosis), counts of cancer, fibroblast, and immune cells (lymphocytes, macrophages, plasma cells) in non-small cell lung cancer (NSCLC), were applied to 208 hematoxylin and eosin (H&E)-stained whole slide images (WSI) of LUAD from The Cancer Genome Atlas (TCGA) without further training. Genomic analyses indicated that 17% (N=35) of these cases are KEAP1MUT. Human Interpretable Features (HIFs), based on histology predictions, are automatically extracted from the model and provide a quantitative description of the tumor microenvironment of each WSI. Associations between HIFs and KEAP1MUT were determined by univariate analysis followed by false discovery rate (FDR) correction. Hierarchical clustering using cross correlation and combining p-values for HIF groups using an Empirical Brown’s method identified correlations between HIFs. Confounding factor influence was accounted for after positive associations were identified. Independent validation of associations between KEAP1MUT and HIFs was performed using TCGA transcriptomic data to correlate specific mutations with mRNA expression of relevant markers.
Results: ML models generated 4,443 HIFs from the TCGA LUAD WSI, which were reduced to 2,684 HIFs after removal of outlier HIFs and exclusion of HIFs that are degenerate, have missing features, or are of an absolute value. KEAP1MUT was significantly associated with 193 HIFs in univariate analyses (p<0.05 after FDR correction), and with four groups of HIFs after taking correlations into account (p<0.05 after group-wise FDR correction; 211-264 HIFs per group). 161 HIFs were identified by both methods. Further assessment of the KEAP1MUT-associated HIFs showed that these mutations were correlated with a reduction in macrophages in the tumor microenvironment (TME). This was supported by analysis of transcriptomic data from KEAP1MUT (N=35) and KEAP1WT (N=164) samples, which showed significantly reduced expression of the macrophage marker CD14 (p<0.001) in KEAP1MUT tissue samples.
Conclusions: ML model quantification of TME histological features can generate HIFs that correlate with the KEAP1MUT status of a LUAD biopsy. These results exemplify how ML-powered digital pathology could predict molecular markers directly from standard H&E biopsy slides.
Citation Format: Robert Egger, Martin Ieong, Murray Resnick, Amaro Taylor-Weiner, Victoria Mountain, Ilan Wapinski, Michael Montalto, Andrew Beck, Josie Hayes, Benjamin Glass. Machine learning models identify histological features that can predict KEAP1 mutations in lung adenocarcinoma [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2022; 2022 Apr 8-13. Philadelphia (PA): AACR; Cancer Res 2022;82(12_Suppl):Abstract nr 449.
|
13
|
Shen C, Schlager C, Rajan D, Pouryahya M, Lin M, Mountain V, Wapinski I, Taylor-Weiner A, Glass B, Egger R, Beck A. Abstract 1922: Application of an interpretable graph neural network to predict gene expression signatures associated with tertiary lymphoid structures in histopathological images. Cancer Res 2022. [DOI: 10.1158/1538-7445.am2022-1922]
Abstract
Background: Tertiary lymphoid structures (TLS) are vascularized lymphocyte aggregates in the tumor microenvironment (TME) that correlate with better patient outcomes. Previous studies identified a 12-chemokine gene expression signature associated with disease progression and the type and degree of TLS. These signatures could provide insight important for clinical decision making during pathologic evaluation, but predicting gene expression from whole slide images (WSI) may be impeded by low prediction accuracy and lack of interpretability. Here we report an artificial intelligence (AI)-based, state-of-the-art workflow to predict the 12-chemokine TLS gene signature from lung cancer WSI, and identify histological features relevant to model predictions.
Methods: Models were trained using 538 cases of paired lung cancer WSI and mRNA-seq expression data (The Cancer Genome Atlas). Cell and tissue classifiers, based on convolutional neural networks (CNN), were trained on WSI, and a graph neural network (GNN) model that leverages the relative spatial arrangement of the CNN-identified cells and tissues was used to predict gene expression. GNN predictions of TLS signature genes were compared with the predictions of models trained using hand-crafted, task-specific features (TLS feature models) describing the number, size, and cellular composition of identified TLS. The Pearson correlation coefficient was used to assess the accuracy of GNN and TLS feature model predictions. GNNExplainer [1], a tool that simultaneously identifies a subgraph and a subset of node features important for predictions, was applied to interpret the GNN model predictions.
Results: GNN model predictions show reasonable accuracy: GNN models significantly predicted mRNA expression of all 12 genes (p<0.05), and the predicted expression of six genes was moderately correlated with ground-truth measurements (Pearson r>0.5). The correlation of GNN predictions was higher than that of the TLS feature models for all 12 signature genes. The GNNExplainer identified relevant features including the mean and standard deviation of lymphocyte count, and fraction of lymphocytes in cancer stroma. Subgraphs selected by the GNNExplainer focus on, but extend beyond, regions of human-annotated TLS objects, indicating that TLS may influence gene expression and the TME in regions beyond their immediate vicinity.
Conclusion: Here, we show a comparison of two interpretable AI methods for the prediction of TLS-induced gene expression from WSI. The outperforming GNN-based approach is highly reproducible and accurate, predicting histopathology features relevant to TLS that may be used to inform patient prognosis and treatment. These methods could be applied to predict additional clinically relevant transcriptomic signatures. 1. Ying, R, et al. 2019. arXiv:1903.03894v4
Citation Format: Ciyue Shen, Collin Schlager, Deepta Rajan, Maryam Pouryahya, Mary Lin, Victoria Mountain, Ilan Wapinski, Amaro Taylor-Weiner, Benjamin Glass, Robert Egger, Andrew Beck. Application of an interpretable graph neural network to predict gene expression signatures associated with tertiary lymphoid structures in histopathological images [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2022; 2022 Apr 8-13. Philadelphia (PA): AACR; Cancer Res 2022;82(12_Suppl):Abstract nr 1922.
|
14
|
Abel J, Jain S, Rajan D, Leidal K, Padigela H, Prakash A, Conway J, Nercessian M, Kirkup C, Egger R, Trotter B, Beck A, Wapinski I, Drage MG, Yu L, Taylor-Weiner A. Abstract 464: AI-powered segmentation and analysis of nuclei morphology predicts genomic and clinical markers in multiple cancer types. Cancer Res 2022. [DOI: 10.1158/1538-7445.am2022-464]
Abstract
Morphological features of cancer cell nuclei are linked to gene expression signatures and genomic alterations. In addition, pathologists have leveraged nuclear morphology as diagnostic and prognostic markers. To enable the use of nuclear morphology in digital pathology, we developed a pan-tissue, deep-learning-based digital pathology pipeline for exhaustive nucleus detection, instance segmentation, and classification. We collected > 29,000 manual nucleus annotations from hematoxylin and eosin (H&E)-stained pathology images from 21 tumor types at 40x and 20x magnification from The Cancer Genome Atlas (TCGA) project, as well as a proprietary set of H&E-stained tissue biopsies of skin, liver non-alcoholic steatohepatitis (NASH), colon inflammatory bowel disease (IBD), and kidney lupus. Annotations were used to train an object detection and segmentation model for identifying nuclei. Application of the model to held-out test data, including held-out tissue types, demonstrated performance comparable to state-of-the-art models described in the literature (mean Dice score = 0.80, aggregated Jaccard index = 0.60). We deployed our model to segment nuclei in H&E slides from the breast cancer (BRCA, N = 941) and prostate adenocarcinoma (PRAD, N = 457) TCGA cohorts. We extracted interpretable features describing the shape (circularity, eccentricity), size, staining intensity (mean and standard deviation), and texture of each nucleus. Nuclei were assigned as cancer or other cell types using separately trained convolutional neural networks for BRCA and PRAD. We used the mean and standard deviation of each feature sampled from a random subset of cancer nuclei to summarize the nuclear morphology on each slide (mean (range) = 10,068 (5,981-10,452) cancer cells from each BRCA slide; mean (range) = 10,053 (5,029-10,495) cancer cells from each PRAD slide). 
We used nuclear features to construct random forest classification models for predicting markers of genomic instability and prognosis: whole-genome doubling (WGD) and homologous recombination deficiency (HRD) status separately in BRCA and PRAD, HER2 subtype in BRCA, and Gleason grade in PRAD. Nuclear features were predictive of WGD (area under the receiver operating characteristic curve (AUROC) = 0.78 BRCA, = 0.69 PRAD) and binarized HRD status (AUROC = 0.65 BRCA, = 0.68 PRAD) on held-out test sets. Nuclear features were predictive of HER2-enriched breast cancer vs. other molecular subtypes (AUROC = 0.72), and distinguished between low risk (6) and moderate/high risk (7-10) Gleason grade in PRAD (AUROC = 0.72). In summary, we present a powerful pan-tissue approach for nucleus segmentation and featurization, which enables the construction of predictive models and the identification of features linking nuclear morphology with clinically-relevant prognostic biomarkers across multiple cancer types.
Citation Format: John Abel, Suyog Jain, Deepta Rajan, Ken Leidal, Harshith Padigela, Aaditya Prakash, Jake Conway, Michael Nercessian, Christian Kirkup, Robert Egger, Ben Trotter, Andrew Beck, Ilan Wapinski, Michael G. Drage, Limin Yu, Amaro Taylor-Weiner. AI-powered segmentation and analysis of nuclei morphology predicts genomic and clinical markers in multiple cancer types [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2022; 2022 Apr 8-13. Philadelphia (PA): AACR; Cancer Res 2022;82(12_Suppl):Abstract nr 464.
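The segmentation performance above is reported as a mean Dice score and an aggregated Jaccard index. The Python sketch below illustrates the basic pixel-level Dice and Jaccard overlap measures on a single predicted mask and a single reference mask. It is a simplified illustration under stated assumptions: the function names and the toy masks are hypothetical, and the aggregated Jaccard index used in the abstract is an instance-level extension that this sketch does not implement.

```python
def dice_score(predicted, reference):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two sets of pixel coordinates."""
    if not predicted and not reference:
        return 1.0  # two empty masks are treated as a perfect match
    overlap = len(predicted & reference)
    return 2.0 * overlap / (len(predicted) + len(reference))


def jaccard_index(predicted, reference):
    """Jaccard = |A ∩ B| / |A ∪ B| for two sets of pixel coordinates."""
    union = predicted | reference
    if not union:
        return 1.0  # two empty masks are treated as a perfect match
    return len(predicted & reference) / len(union)


# Hypothetical 2x2-pixel nucleus masks, offset by one column (illustration only).
predicted_mask = {(0, 0), (0, 1), (1, 0), (1, 1)}
reference_mask = {(0, 1), (1, 1), (0, 2), (1, 2)}
example_dice = dice_score(predicted_mask, reference_mask)
example_jaccard = jaccard_index(predicted_mask, reference_mask)
```

For the same pair of masks the Dice score is never lower than the Jaccard index, which is consistent with the reported Dice value exceeding the reported Jaccard-based value.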
|
15
|
Bosch J, Chung C, Carrasco-Zevallos OM, Harrison SA, Abdelmalek MF, Shiffman ML, Rockey DC, Shanis Z, Juyal D, Pokkalla H, Le QH, Resnick M, Montalto M, Beck AH, Wapinski I, Han L, Jia C, Goodman Z, Afdhal N, Myers RP, Sanyal AJ. A Machine Learning Approach to Liver Histological Evaluation Predicts Clinically Significant Portal Hypertension in NASH Cirrhosis. Hepatology 2021; 74:3146-3160. [PMID: 34333790 DOI: 10.1002/hep.32087] [Received: 03/19/2021] [Revised: 06/23/2021] [Accepted: 06/28/2021]
Abstract
BACKGROUND AND AIMS The hepatic venous pressure gradient (HVPG) is the standard for estimating portal pressure but requires expertise for interpretation. We hypothesized that HVPG could be extrapolated from liver histology using a machine learning (ML) algorithm. APPROACH AND RESULTS Patients with NASH with compensated cirrhosis from a phase 2b trial were included. HVPG and biopsies from baseline and weeks 48 and 96 were reviewed centrally, and biopsies were evaluated with a convolutional neural network (PathAI, Boston, MA). Using trichrome-stained biopsies in the training set (n = 130), an ML model was developed to recognize fibrosis patterns associated with HVPG, and the resultant ML-HVPG score was validated in a held-out test set (n = 88). Associations between the ML-HVPG score with measured HVPG and liver-related events, and performance of the ML-HVPG score for clinically significant portal hypertension (CSPH) (HVPG ≥ 10 mm Hg), were determined. The ML-HVPG score was more strongly correlated with HVPG than hepatic collagen by morphometry (ρ = 0.47 vs. ρ = 0.28; P < 0.001). The ML-HVPG score differentiated patients with normal (0-5 mm Hg) and elevated (5.5-9.5 mm Hg) HVPG and CSPH (median: 1.51 vs. 1.93 vs. 2.60; all P < 0.05). The areas under receiver operating characteristic curve (AUROCs) (95% CI) of the ML-HVPG score for CSPH were 0.85 (0.80, 0.90) and 0.76 (0.68, 0.85) in the training and test sets, respectively. Discrimination of the ML-HVPG score for CSPH improved with the addition of an ML parameter for nodularity, Enhanced Liver Fibrosis, platelets, aspartate aminotransferase (AST), and bilirubin (AUROC in test set: 0.85; 95% CI: 0.78, 0.92). Although baseline ML-HVPG score was not prognostic, changes were predictive of clinical events (HR: 2.13; 95% CI: 1.26, 3.59) and associated with hemodynamic response and fibrosis improvement. CONCLUSIONS An ML model based on trichrome-stained liver biopsy slides can predict CSPH in patients with NASH with cirrhosis.
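The abstract above groups measured HVPG into three bands: normal (0-5 mm Hg), elevated (5.5-9.5 mm Hg), and CSPH (HVPG ≥ 10 mm Hg). The Python sketch below encodes those bands as a simple lookup. The function name is hypothetical, and the handling of values that fall between the stated bands (for example 5.2 or 9.8 mm Hg, both assigned to "elevated" here) is an assumption of this sketch, not something the abstract specifies.

```python
def hvpg_category(hvpg_mm_hg):
    """Map a measured HVPG value (mm Hg) to the bands named in the abstract."""
    if hvpg_mm_hg < 0:
        raise ValueError("HVPG must be non-negative")
    if hvpg_mm_hg >= 10:
        return "CSPH"      # clinically significant portal hypertension
    if hvpg_mm_hg > 5:
        return "elevated"  # assumption: everything above 5 and below 10
    return "normal"


# Hypothetical measurements (illustration only).
categories = [hvpg_category(v) for v in (3.0, 5.5, 9.5, 10.0, 14.5)]
```

The binary CSPH label used for the AUROC analysis corresponds to the first branch of this function.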
Affiliation(s)
- Jaime Bosch
- Department of Biomedical Research, University of Bern, Bern, Switzerland
- University of Barcelona-IDIBAPS and CIBERehd, Barcelona, Spain
- Don C Rockey
- Medical University of South Carolina, Charleston, SC
- Ling Han
- Gilead Sciences, Inc, Foster City, CA
- Nezam Afdhal
- Beth Israel Deaconess Medical Center and Harvard Medical School, Boston, MA
|
16
|
Glass B, Adam Stanford-Moore S, Meghwal D, Agrawal N, Lin M, Hedvat C, Lee G, Ely S, Montalto M, Wapinski I, Baxi V, Beck A. 821 Machine learning models can quantify CD8 positivity in lymphocytes in melanoma clinical trial samples. J Immunother Cancer 2021. [DOI: 10.1136/jitc-2021-sitc2021.821]
Abstract
Background: An accurate histological characterization of immune cells in the tumor microenvironment is essential for developing novel immune oncology targeted therapies and can assist in guiding patient treatment decisions. However, immune phenotyping is subject to challenges of manual scoring and inter-pathologist scoring variability. To support pathologist-scored immune phenotyping across tumor types, we are developing machine learning (ML)-based models that can identify and quantify CD8+ lymphocytes within the stromal and parenchyma regions of tumors from non-small cell lung cancer, renal cell carcinoma, breast cancer, gastric cancer, head and neck squamous cell carcinoma, urothelial carcinoma, and melanoma. Here, we focus on the ML model for melanoma showing recent results for ML-based identification and quantification of CD8+ lymphocytes and concordance with manual pathologic assessment in data derived from clinical trials.
Methods: ML algorithms were developed to quantify CD8+ lymphocytes in melanoma using 200 samples from a commercial dataset containing both primary and metastatic melanoma cases. Models were trained using the PathAI research platform on digitized whole slide images (WSI) stained for CD8 using clone C8/144b (Dako), and annotations were provided by the PathAI network of expert pathologists. Training included identification of slide artifacts, parenchyma, cancer stroma, and necrosis, as well as CD8+ lymphocytes and other CD8– cell types. Examples of melanin, such as pigmented macrophages, were added to non-CD8+ cell types. To evaluate the performance of the ML model, model-predicted CD8+ counts were compared to a consensus count from five independent pathologists for representative regions (“frames”) using the Pearson correlation. This was done in 112 held-out test frames from 90 WSI baseline samples from three clinical trials of immunotherapy treatment in individuals with metastatic melanoma. Inter-pathologist agreement among the five pathologists was also calculated.
Results: ML-based quantitation of CD8 positivity in lymphocytes showed high concordance with manual pathologist consensus counts. In frames validation of CD8+ counts on the test set of WSI, there was high correlation between the ML model and pathologist consensus counts (r=0.92 [95% CI 0.88–0.94]). This correlation was comparable to the agreement among the five expert pathologists (r=0.88 [95% CI 0.85–0.91]).
Conclusions: ML model-predicted CD8+ cell counts are highly concordant with pathologist scores on WSI samples from melanoma-focused clinical trials. These data demonstrate the capability of AI-powered digital pathology for accurate and reproducible quantitation of CD8+ lymphocytes in clinical trial samples, contributing to improved evaluation of the tumor microenvironment and targeted development of therapeutics.
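Agreement in the abstract above is summarized as a Pearson correlation with a 95% confidence interval. The Python sketch below shows one common way to obtain both: the sample Pearson coefficient, followed by an interval from the Fisher z-transformation. The abstract does not say how its intervals were computed, so the Fisher approach is an assumption here, and the function names and count data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def fisher_confidence_interval(r, n, z_crit=1.96):
    """Approximate CI for r via the Fisher z-transformation (assumed method)."""
    if n <= 3 or not -1 < r < 1:
        raise ValueError("need n > 3 and -1 < r < 1")
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


# Hypothetical per-frame CD8+ counts (illustration only).
model_counts = [1, 2, 3, 4, 5]
consensus_counts = [2, 1, 4, 3, 5]
r = pearson_r(model_counts, consensus_counts)
low, high = fisher_confidence_interval(r, n=len(model_counts))
```

With only a handful of frames the interval is very wide; the narrow intervals in the abstract reflect its 112 test frames.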
|
17
|
Taylor‐Weiner A, Pokkalla H, Han L, Jia C, Huss R, Chung C, Elliott H, Glass B, Pethia K, Carrasco‐Zevallos O, Shukla C, Khettry U, Najarian R, Taliano R, Subramanian GM, Myers RP, Wapinski I, Khosla A, Resnick M, Montalto MC, Anstee QM, Wong VW, Trauner M, Lawitz EJ, Harrison SA, Okanoue T, Romero‐Gomez M, Goodman Z, Loomba R, Beck AH, Younossi ZM. A Machine Learning Approach Enables Quantitative Measurement of Liver Histology and Disease Monitoring in NASH. Hepatology 2021; 74:133-147. [PMID: 33570776 PMCID: PMC8361999 DOI: 10.1002/hep.31750] [Received: 08/24/2020] [Revised: 12/23/2020] [Accepted: 01/05/2021]
Abstract
BACKGROUND AND AIMS Manual histological assessment is currently the accepted standard for diagnosing and monitoring disease progression in NASH, but is limited by variability in interpretation and insensitivity to change. Thus, there is a critical need for improved tools to assess liver pathology in order to risk stratify NASH patients and monitor treatment response. APPROACH AND RESULTS Here, we describe a machine learning (ML)-based approach to liver histology assessment, which accurately characterizes disease severity and heterogeneity, and sensitively quantifies treatment response in NASH. We use samples from three randomized controlled trials to build and then validate deep convolutional neural networks to measure key histological features in NASH, including steatosis, inflammation, hepatocellular ballooning, and fibrosis. The ML-based predictions showed strong correlations with expert pathologists and were prognostic of progression to cirrhosis and liver-related clinical events. We developed a heterogeneity-sensitive metric of fibrosis response, the Deep Learning Treatment Assessment Liver Fibrosis score, which measured antifibrotic treatment effects that went undetected by manual pathological staging and was concordant with histological disease progression. CONCLUSIONS Our ML method has shown reproducibility and sensitivity and was prognostic for disease progression, demonstrating the power of ML to advance our understanding of disease heterogeneity in NASH, risk stratify affected patients, and facilitate the development of therapies.
Affiliation(s)
- Ling Han
- Gilead Sciences, Inc., Foster City, CA
- Ross Taliano
- Warren Alpert Medical School of Brown University, Providence, RI
- Murray Resnick
- PathAI, Boston, MA; Warren Alpert Medical School of Brown University, Providence, RI
- Quentin M. Anstee
- Translational & Clinical Research Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
- Vincent Wai‐Sun Wong
- Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Hong Kong, Hong Kong
- Michael Trauner
- Division of Gastroenterology and Hepatology, Medical University of Vienna, Vienna, Austria
- Zachary Goodman
- Department of Medicine, Inova Fairfax Medical Campus, Falls Church, VA; Betty and Guy Beatty Center for Integrated Research, Inova Health System, Falls Church, VA
- Rohit Loomba
- NAFLD Research Center, University of California at San Diego, La Jolla, CA
- Zobair M. Younossi
- Department of Medicine, Inova Fairfax Medical Campus, Falls Church, VA; Betty and Guy Beatty Center for Integrated Research, Inova Health System, Falls Church, VA
|
18
|
Glass B, Vandenberghe ME, Chavali ST, Javed SA, Rebelatto M, Sridharan S, Elliott H, Rao S, Montalto M, Resnick M, Wapinski I, Beck A, Barker C. Machine learning models to quantify HER2 for real-time tissue image analysis in prospective clinical trials. J Clin Oncol 2021. [DOI: 10.1200/jco.2021.39.15_suppl.3061] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/20/2022] Open
Abstract
3061 Background: Patient eligibility for HER2-targeting treatments is commonly informed by testing tumor HER2 expression using immunohistochemistry. As HER2 expression is visually assessed by pathologists, inter- and intra-rater variability might affect treatment decisions. Here, we report the development of an automated machine learning (ML)-based algorithm to quantify HER2 cell membrane expression across a diversity of breast cancer phenotypes as a clinical tool for monitoring HER2 testing quality. Methods: A total of 689 breast cancer tissue samples were either procured (Avaden Biosciences) or were anonymized samples from the AstraZeneca biobank comprising tissues from primary and metastatic tumors, core needle biopsies and surgical resections, lobular and ductal carcinomas, across tumor grades and HER2 expression levels. Samples were stained for HER2 detection (Ventana HER2 (4B5) Assay) and digitized (Leica Biosystems) across 5 laboratories in the US. Whole-slide images (WSIs) were stratified into training (n = 407), validation (n = 110), and test sets (n = 172). Multiple convolutional neural network based ML models (PathAI, Boston, MA) were trained using 190,000 manual annotations provided by 30 board-certified pathologists to identify artifacts and invasive tumor, identify individual cancer cells, and measure tumor cell membrane HER2 expression as partial or complete, and negative, weak-or-moderate, or intense. Cell-level scores were validated against a consensus of manual cell counts from 5 independent pathologists in 320 representative regions of test set WSIs. HER2 scores were generated by automatically applying rules derived from 2018 ASCO/CAP guidelines and then compared in the test set with consensus scores from 3 independent pathologists.
Results: Cell counts provided by the ML model were strongly consistent with cell counts obtained by pathologist consensus in all cell-types except for faintly positive HER2 cells where ML-based quantification identified more cells on average. Automatically generated ML-ASCO/CAP HER2 scores using WSI showed substantial consistency across IHC categories with the consensus of pathologists (ICC 0.88, 95%CI 0.82-0.92) in the test set and improved further when ML models were trained to agree with pathologists by adjusting cut offs (ICC 0.91, 95%CI 0.89-0.94). The ML-based model was deployed through the PathAI cloud platform to calculate HER2 testing quality control metrics in real-time in multicentric clinical trials. Conclusions: Automated image analysis of HER2-stained breast cancer tissues using ML-based models is consistent with pathologist consensus across breast cancer tissue types. The results support evidence that ML-based algorithms can help pathologists assess HER2 testing reproducibility in clinical trials.
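The step of turning cell-level predictions into a slide-level HER2 score can be illustrated with a minimal sketch. It assumes a simplified reading of the 10% cutoffs in the 2018 ASCO/CAP immunohistochemistry criteria and ignores unusual staining patterns; the `CellCounts` fields and the function name are hypothetical and are not taken from the abstract, whose actual rules and adjusted cutoffs are not described.

```python
from dataclasses import dataclass


@dataclass
class CellCounts:
    """Hypothetical per-slide counts of invasive tumor cells by staining pattern."""
    complete_intense: int        # complete, intense membrane staining
    complete_weak_moderate: int  # complete, weak-to-moderate membrane staining
    incomplete_faint: int        # incomplete, faint membrane staining
    negative: int                # no membrane staining


def her2_ihc_score(c: CellCounts) -> str:
    """Map cell counts to an IHC category using simplified >10% cutoffs."""
    total = (c.complete_intense + c.complete_weak_moderate
             + c.incomplete_faint + c.negative)
    if total == 0:
        raise ValueError("no tumor cells counted")
    if c.complete_intense / total > 0.10:
        return "3+"
    if c.complete_weak_moderate / total > 0.10:
        return "2+"
    if c.incomplete_faint / total > 0.10:
        return "1+"
    return "0"


print(her2_ihc_score(CellCounts(5, 30, 10, 55)))
```

Keeping the cutoffs as explicit comparisons makes it straightforward to adjust them, which is the kind of tuning the abstract reports as improving agreement with pathologists.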
|
19
|
Taylor-Weiner A, Pedawi A, Chui WF, Diao J, Wang J, Mountain V, Glass B, Elliott H, Wapinski I, Montalto M, Khosla A, Beck AH. Abstract PD6-04: Deep-learning based prediction of homologous recombination deficiency (hrd) status from histological features in breast cancer; a research study. Cancer Res 2021. [DOI: 10.1158/1538-7445.sabcs20-pd6-04] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022]
Abstract
Background
Homologous recombination deficiency (HRD), originally described in tumors from patients with germline mutations in BRCA1/2 genes, renders cells sensitive to poly-ADP ribose polymerase inhibitors (PARPi) (1), but can be caused by mutations in other genes and is prevalent across multiple cancer types (2). HRD status is of clinical interest because it can indicate patient eligibility for treatment with PARPi. Currently, HRD status is determined by sequencing to identify BRCA mutations or genomic instability, but this has a high rate of failure (3). In this research study, we apply a deep-learning based computational approach to directly infer HRD status from digitized images of hematoxylin and eosin (H&E) stained histology samples in breast cancer tumors.
Methods
Digitized whole slide images (WSI) of 931 H&E stained, formalin-fixed and paraffin-embedded (FFPE) breast adenocarcinoma (BRCA) tumor biopsies from The Cancer Genome Atlas (TCGA) were used to train machine learning (ML) models to identify patients that are HRD based on human-interpretable features (HIFs) and end-to-end (E2E) modeling. To train the models, samples were split into training and validation sets designated either HRD or homologous recombination proficient (HRP) based on a previously generated aggregate HRD score (calculated from regions of loss of heterozygosity, large scale genomic instability, and telomeric allelic imbalance) by genomic analysis of the PanCancerAtlas (2). We applied an untuned HRD score threshold of 45 to assign class labels resulting in 142/931 (15.3%) HRD cases.
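The labeling step described above can be sketched as follows. This is an illustration under stated assumptions: the aggregate score is taken to be the sum of the three component scores, and a sample is called HRD when its score is at or above the threshold. The abstract states neither detail, and the function names are hypothetical.

```python
def aggregate_hrd_score(loh: int, lst: int, tai: int) -> int:
    """Assumed aggregate: sum of loss-of-heterozygosity, large-scale
    instability, and telomeric allelic imbalance component scores."""
    return loh + lst + tai


def label_sample(score: int, threshold: int = 45) -> str:
    """Assign a class label; the >= comparison is an assumption."""
    return "HRD" if score >= threshold else "HRP"


# Prevalence reported in the abstract: 142 HRD cases out of 931 samples.
prevalence_pct = round(142 / 931 * 100, 1)
print(label_sample(aggregate_hrd_score(20, 18, 10)), prevalence_pct)
```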
Board certified pathologists (N=93) annotated tissue regions and cellular foci on the PathAI research platform yielding 65,477 annotations. ML models based on convolutional neural networks were trained to recognize breast cancer cells, lymphocytes, macrophages, plasma cells, fibroblasts, and tissue compartments including cancer epithelium, cancer stroma and necrosis within the H&E stained breast cancer samples.
Two pipelines constructed H&E histology-based classifiers of HRD status. A weakly-supervised “end-to-end” model using ResNets extracted features from small image patches with an attention module to aggregate across patches and directly predict HRD status. The HIF-based approach used the tissue segmentation and cell identification classifiers to quantify histological features in the WSI. From the labeled images, we extracted 600 HIFs that capture complex relationships between cell and tissue types. HIFs and patient clinical covariates were applied as input to a Sparse Group Lasso model to predict the HRD status of the associated patients.
Results
ML models predicted HRD status from H&E stained WSI. The area under the receiver operating characteristics curve (AUROC) was 0.87 for the HIF model and 0.80 for the E2E model. Both classifiers achieved high sensitivity for HRD status (0.86) with more moderate precision (F1 score HIF: 0.80 and E2E: 0.72). Our HIF with clinical covariates model revealed morphological features that were significantly associated with HRD compared with HRP. HRD samples were enriched for areas of necrosis, stromal fibroblasts, and tumor infiltrating lymphocytes (p< 0.001, Mann-Whitney U test).
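The idea of an attention module that aggregates patch features into one slide-level representation can be sketched in a few lines. The abstract does not specify the form of the attention module; the version below assumes a common formulation (scores of the form w · tanh(V h), normalized with a softmax), and the parameter names `V` and `w` are illustrative only.

```python
import math


def attention_pool(patch_features, V, w):
    """Pool per-patch feature vectors into one slide-level vector.

    patch_features: list of feature vectors, one per image patch
    V: hidden projection matrix (list of rows); w: scoring vector
    Returns (pooled_vector, attention_weights).
    """
    scores = []
    for h in patch_features:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * ui for wi, ui in zip(w, hidden)))
    # Numerically stable softmax over patches.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    weights = [e / norm for e in exps]
    dim = len(patch_features[0])
    pooled = [sum(a * h[j] for a, h in zip(weights, patch_features))
              for j in range(dim)]
    return pooled, weights


feats = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
pooled, weights = attention_pool(feats, V=[[0.5, -0.2]], w=[1.0])
print(pooled, weights)
```

In a trained model the pooled vector would feed a classifier for HRD status, and the weights indicate which patches contributed most to the prediction.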
Conclusions
Computational models built with the PathAI research platform identified HRD positive patients directly from routinely collected H&E stained WSIs and identified a histological basis for how mutational signatures impact the tumor microenvironment. Disclaimer: The PathAI platform and HRD model are not intended for diagnostic purposes.
1. Farmer et al., 2005. Nature 14;434(7035):917-21
2. Knijnenburg et al., 2018. Cell Rep 23, 239–254
3. Hoppe et al., 2018. JNCI 110(7): djy085
4. Coudray et al., 2018. Nat Med 24:1559-1567
5. Kather et al., 2019. Nat Med 25: 1054–1056
Citation Format: Amaro Taylor-Weiner, Aryan Pedawi, Wan Fung Chui, James Diao, Jason Wang, Victoria Mountain, Benjamin Glass, Hunter Elliott, Ilan Wapinski, Michael Montalto, Aditya Khosla, Andrew H. Beck. Deep-learning based prediction of homologous recombination deficiency (hrd) status from histological features in breast cancer; a research study [abstract]. In: Proceedings of the 2020 San Antonio Breast Cancer Virtual Symposium; 2020 Dec 8-11; San Antonio, TX. Philadelphia (PA): AACR; Cancer Res 2021;81(4 Suppl):Abstract nr PD6-04.
|
20
|
Loomba R, Noureddin M, Kowdley KV, Kohli A, Sheikh A, Neff G, Bhandari BR, Gunn N, Caldwell SH, Goodman Z, Wapinski I, Resnick M, Beck AH, Ding D, Jia C, Chuang JC, Huss RS, Chung C, Subramanian GM, Myers RP, Patel K, Borg BB, Ghalib R, Kabler H, Poulos J, Younes Z, Elkhashab M, Hassanein T, Iyer R, Ruane P, Shiffman ML, Strasser S, Wong VWS, Alkhouri N. Combination Therapies Including Cilofexor and Firsocostat for Bridging Fibrosis and Cirrhosis Attributable to NASH. Hepatology 2021; 73:625-643. [PMID: 33169409 DOI: 10.1002/hep.31622] [Citation(s) in RCA: 133] [Impact Index Per Article: 44.3] [Received: 04/21/2020] [Revised: 10/19/2020] [Accepted: 10/26/2020] [Indexed: 12/11/2022]
Abstract
BACKGROUND AND AIMS Advanced fibrosis attributable to NASH is a leading cause of end-stage liver disease. APPROACH AND RESULTS In this phase 2b trial, 392 patients with bridging fibrosis or compensated cirrhosis (F3-F4) were randomized to receive placebo, selonsertib 18 mg, cilofexor 30 mg, or firsocostat 20 mg, alone or in two-drug combinations, once-daily for 48 weeks. The primary endpoint was a ≥1-stage improvement in fibrosis without worsening of NASH between baseline and 48 weeks based on central pathologist review. Exploratory endpoints included changes in NAFLD Activity Score (NAS), liver histology assessed using a machine learning (ML) approach, liver biochemistry, and noninvasive markers. The majority had cirrhosis (56%) and NAS ≥5 (83%). The primary endpoint was achieved in 11% of placebo-treated patients versus cilofexor/firsocostat (21%; P = 0.17), cilofexor/selonsertib (19%; P = 0.26), firsocostat/selonsertib (15%; P = 0.62), firsocostat (12%; P = 0.94), and cilofexor (12%; P = 0.96). Changes in hepatic collagen by morphometry were not significant, but cilofexor/firsocostat led to a significant decrease in ML NASH CRN fibrosis score (P = 0.040) and a shift in biopsy area from F3-F4 to ≤F2 fibrosis patterns. Compared to placebo, significantly higher proportions of cilofexor/firsocostat patients had a ≥2-point NAS reduction; reductions in steatosis, lobular inflammation, and ballooning; and significant improvements in alanine aminotransferase (ALT), aspartate aminotransferase (AST), bilirubin, bile acids, cytokeratin-18, insulin, estimated glomerular filtration rate, ELF score, and liver stiffness by transient elastography (all P ≤ 0.05). Pruritus occurred in 20%-29% of cilofexor versus 15% of placebo-treated patients. CONCLUSIONS In patients with bridging fibrosis and cirrhosis, 48 weeks of cilofexor/firsocostat was well tolerated, led to improvements in NASH activity, and may have an antifibrotic effect. 
This combination offers potential for fibrosis regression with longer-term therapy in patients with advanced fibrosis attributable to NASH.
Affiliation(s)
- Rohit Loomba
- NAFLD Research Center, University of California at San Diego, La Jolla, CA
- Guy Neff
- Covenant Research, LLC, Sarasota, FL
- Reem Ghalib
- Texas Clinical Research Institute, Arlington, TX
- John Poulos
- Cumberland Research Associates, Fayetteville, NC
- Peter Ruane
- Ruane Medical and Liver Health Institute, Los Angeles, CA
- Simone Strasser
- Royal Prince Alfred Hospital and The University of Sydney, Camperdown, New South Wales, Australia
- Vincent Wai-Sun Wong
- Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Hong Kong, Hong Kong
- Naim Alkhouri
- Texas Liver Institute, UT Health San Antonio, San Antonio, TX
|
21
|
Duan C, Montalto M, Lee G, Pandya D, Cohen D, Chang H, Tang H, Agrawal N, Elliott H, Glass B, Wapinski I, Edwards R, Beck AH, Baxi V. Abstract 2017: Association of digital and manual quantification of tumor PD-L1 expression with outcomes in nivolumab-treated patients. Cancer Res 2020. [DOI: 10.1158/1538-7445.am2020-2017] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022]
Abstract
Background: Programmed death ligand 1 (PD-L1) expression on tumor cells (TC), detected by immunohistochemistry (IHC), is associated with response to programmed death-1 (PD-1)/PD-L1 inhibitors in some tumor types. Manual review of PD-L1–positive (PD-L1+) tumors can be subjective, with the potential for misclassification of PD-L1–low tumors as PD-L1–negative due to weak positivity. We compared artificial-intelligence (digital) and manual scoring methods and assessed the association of PD-L1 expression with clinical outcomes in nivolumab (NIVO)-treated patients with urothelial carcinoma (UC) and melanoma (MEL).
Methods: PD-L1 expression was determined in baseline samples from NIVO monotherapy-treated patients with UC (CM275, NCT02387996) and MEL (CM067, NCT01844505; CM238, NCT02388906) using the Dako PD-L1 IHC 28-8 pharmDx assay. PD-L1+ TC were scored using digital (PathAI research platform) and manual (LabCorp) methods. Prevalence of tumors with PD-L1+ TC ≥ 1% and ≥ 5% and associations between PD-L1 expression and outcomes with NIVO were evaluated.
Results: Prevalence of UC and MEL tumors with ≥ 1% and ≥ 5% PD-L1+ TC was higher for digital vs manual scoring (Table). For all samples, digital and manual scoring was associated with response to NIVO for PD-L1 ≥ 1% and ≥ 5%, and associations were similar between digital and manual scoring (Table). Digital and manual PD-L1 scoring correlated across samples from all trials (Kendall's tau range: 0.57–0.62).
Table

Prevalence PD-L1+ TC ≥ 1%, n (%)
Trial    Evaluable samples, n    Digital     Manual      P value         Samples ≥ 1% by digital only
CM275    241                     166 (69)    113 (47)    1.61 × 10−6     58 (24)
CM067    264                     173 (66)    160 (61)    0.279           36 (14)
CM238    377                     307 (81)    259 (69)    7.61 × 10−5     66 (18)

PD-L1+ TC ≥ 1% vs < 1%
                                   Digital              Manual
ORR, odds ratio (95% CI)
CM275 (a)                          2.15 (0.98–4.70)     1.60 (0.82–3.14)
CM067 (b)                          1.99 (1.19–3.35)     1.89 (1.12–3.18)
Survival, hazard ratio (95% CI)
CM275 (OS) (a)                     0.67 (0.48–0.92)     0.66 (0.48–0.90)
CM067 (OS) (b)                     0.57 (0.41–0.80)     0.71 (0.50–1.00)
CM238 (RFS) (c)                    0.53 (0.36–0.77)     0.83 (0.57–1.21)

Prevalence PD-L1+ TC ≥ 5%, n (%)
Trial    Evaluable samples, n    Digital     Manual      P value         Samples ≥ 5% by digital only
CM275    241                     90 (37)     74 (31)     0.149           28 (12)
CM067    264                     103 (39)    76 (29)     0.017           36 (14)
CM238    377                     234 (62)    139 (37)    7.54 × 10−12    104 (28)

PD-L1+ TC ≥ 5% vs < 5%
                                   Digital              Manual
ORR, odds ratio (95% CI)
CM275 (a)                          3.50 (1.76–6.98)     2.37 (1.18–4.73)
CM067 (b)                          2.33 (1.40–3.86)     1.77 (1.01–3.09)
Survival, hazard ratio (95% CI)
CM275 (OS) (a)                     0.50 (0.36–0.71)     0.58 (0.41–0.83)
CM067 (OS) (b)                     0.67 (0.47–0.96)     0.74 (0.51–1.09)
CM238 (RFS) (c)                    0.50 (0.35–0.70)     0.52 (0.36–0.76)

Database lock 2019: CM275, June 14; CM067, January 18; CM238, April 3.
(a) Adjusted for ECOG performance status, liver metastatic status, and hemoglobin.
(b) Adjusted for ECOG performance status, liver metastatic status, lactate dehydrogenase, and BRAF mutation.
(c) Adjusted for ECOG performance status, AJCC stage, lactate dehydrogenase, and BRAF mutation.
AJCC, American Joint Committee on Cancer; CI, confidence interval; ECOG, Eastern Cooperative Oncology Group; ORR, objective response rate; OS, overall survival; PD-L1, programmed death ligand 1; RFS, recurrence-free survival; TC, tumor cells.
Conclusion: In post-hoc exploratory analyses, digital scoring of PD-L1 expression identified higher prevalence of PD-L1+ tumors and shows good association with response to NIVO in UC and MEL samples compared with manual scoring. Digital quantification demonstrated higher sensitivity at low levels of PD-L1 expression and may identify patients who could benefit from NIVO. Further study of the association with clinical outcomes is warranted and exploratory studies are ongoing to assess the performance of digital scoring in additional tumor types.
Citation Format: Chunzhe Duan, Michael Montalto, George Lee, Dimple Pandya, Daniel Cohen, Han Chang, Hao Tang, Nishant Agrawal, Hunter Elliott, Benjamin Glass, Ilan Wapinski, Robin Edwards, Andrew H. Beck, Vipul Baxi. Association of digital and manual quantification of tumor PD-L1 expression with outcomes in nivolumab-treated patients [abstract]. In: Proceedings of the Annual Meeting of the American Association for Cancer Research 2020; 2020 Apr 27-28 and Jun 22-24. Philadelphia (PA): AACR; Cancer Res 2020;80(16 Suppl):Abstract nr 2017.
Affiliation(s)
- Michael Montalto
- PathAI, Boston, MA (BMS employee at the time the analysis was conducted)
- Han Chang
- Bristol-Myers Squibb, Princeton, NJ
- Hao Tang
- Bristol-Myers Squibb, Princeton, NJ
|
22
|
Taylor-Weiner A, Beck A, Cowan JD, Elliott H, Fridlyand J, Glass B, Guardino E, Hegde P, Kerner JK, Khosla A, Lee M, Liu Y, McCleland M, Montalto M, Schulze K, Shames DS, Srinivasan R, Zou W, Wapinski I, Giltnane JM. Machine learning-based identification of predictive features of the tumor micro-environment and vasculature in NSCLC patients using the IMpower150 study. J Clin Oncol 2020. [DOI: 10.1200/jco.2020.38.15_suppl.3130] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/20/2022] Open
Abstract
3130 Background: IMpower150 is a phase 3 study measuring the effect of carboplatin and paclitaxel (CP) combined with atezolizumab (A) and/or bevacizumab (B) in patients with advanced nonsquamous NSCLC, testing the hypothesis that anti-PD-L1 therapy may be enhanced by the blockade of VEGF. Here, we apply a machine-learning based approach to quantify the tumor micro-environment (TME) and vasculature and identify associations with clinical outcome in IMpower150. Methods: Digitized H&E images were registered onto the PathAI research platform (n=1027). Over 200K annotations from 90 pathologists were used to train convolutional neural networks (CNNs) that classify human-interpretable features (HIFs) of cells and tissue structures from images. Blood vessel compression (BVC) indices were calculated using the long versus short axes for each predicted blood vessel. HIFs were clustered to reduce redundancy, and selected features were associated with progression free survival (PFS) within each arm (ABCP, ACP, and BCP) using Cox proportional hazard models. Results: We used the trained CNNs to generate 4,534 features summarizing each patient’s histopathology and TME. After association with survival and correction for multiple comparisons we identified clusters that were significantly associated with survival in at least one arm. Among patients receiving treatments that target PD-L1 (ABCP and ACP), high lymphocyte to fibroblast ratio (LFR) was associated with improved PFS (HR=0.64 (0.51, 0.81), p < 0.001) and showed no significant association with PFS among patients treated with BCP alone (HR=1.13 (0.85, 1.51), p=0.4). Among BCP treated patients, a higher average BVC within the tumor tissue was associated with improved PFS (HR=0.67 (0.50,0.90), p=0.01) and worse PFS among patients treated with ACP (HR=1.50 (1.10,2.06), p=0.009). Conclusions: We developed a deep learning-based assay for quantifying pathology features of the TME and vasculature from H&E images. 
Application of this system to IMpower150 identified an association between high LFR and improved PFS among patients receiving PD-L1 targeting therapy, and between low BVC and improved PFS among patients receiving BCP. These findings support the importance of the TME and vasculature in determining response to PD-L1 and VEGF-targeting therapies.
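The two slide-level features highlighted in this abstract can be illustrated with a short sketch. The abstract says only that BVC indices were calculated "using the long versus short axes" of each predicted vessel; the ratio form below, the averaging over vessels, and all function names are assumptions made for illustration.

```python
def bvc_index(long_axis: float, short_axis: float) -> float:
    """Assumed compression index: long axis divided by short axis.
    A round vessel gives 1.0; a flattened vessel gives a larger value."""
    if short_axis <= 0:
        raise ValueError("short axis must be positive")
    return long_axis / short_axis


def mean_bvc(vessels) -> float:
    """Average the index over (long_axis, short_axis) pairs for one slide."""
    indices = [bvc_index(long_ax, short_ax) for long_ax, short_ax in vessels]
    return sum(indices) / len(indices)


def lymphocyte_fibroblast_ratio(n_lymphocytes: int, n_fibroblasts: int) -> float:
    """Ratio of predicted lymphocyte count to predicted fibroblast count."""
    if n_fibroblasts == 0:
        raise ValueError("no fibroblasts predicted")
    return n_lymphocytes / n_fibroblasts


print(mean_bvc([(4.0, 2.0), (3.0, 3.0)]), lymphocyte_fibroblast_ratio(30, 10))
```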
Affiliation(s)
- Mark Lee
- Genentech, Roche, Menlo Park, CA
- Wei Zou
- Genentech, Inc., South San Francisco, CA
|
23
|
Kerner JK, Cleary A, Jain S, Pokkalla H, Glass B, Grossmith S, Harary M, Mittendorf E, Beck AH, Khosla A, Schnitt SJ, Wapinski I, King T. Abstract P5-02-02: Artificial intelligence powered predictive analysis of atypical ductal hyperplasia from digitized pathology images. Cancer Res 2020. [DOI: 10.1158/1538-7445.sabcs19-p5-02-02] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Background: Approximately 15-25% of patients with atypical ductal hyperplasia (ADH) diagnosed on breast core needle biopsy (CNB) are upgraded to ductal carcinoma in situ (DCIS) or invasive carcinoma (IC) on surgical excision. The reproducible identification of patients with ADH on CNB who are more likely to have upgrades at excision remains elusive. We hypothesized that a machine learning approach could be utilized to train models to recognize ADH on digitized pathology images and to identify cases of ADH more likely to be upgraded to DCIS or IC at excision. The purpose of this study was to determine the accuracy of the machine learning approach to identify ADH.
Methods: 726 digitized images of CNB slides derived from 306 cases with a diagnosis of ADH between 11/2004-3/2018 were included in this study. Independent histologic review by two breast pathologists identified slides with and without ADH from each case. 39 board certified pathologists with experience in evaluation of breast biopsies were employed for tissue region annotation on the PathAI research platform (not intended for diagnostic purposes), yielding 14,118 tissue region annotations. Region annotations included ADH, ADH stroma, flat epithelial atypia (FEA), lobular neoplasia (LN), calcifications (Ca), columnar cell change/hyperplasia, sclerosing adenosis, papilloma, normal terminal duct lobular units and other non-atypical breast tissue regions. These annotations were used to train a convolutional neural network (CNN) with 35 layers and approximately 9 million parameters to identify ADH. The data were split into training and testing sets, representing 61.1% and 38.9% of the data respectively. The distribution of cases, images with ADH and cases with upgrade were balanced between the training and testing sets.
Results: CNB specimens were assigned labels of “ADH” or “No ADH” based on histologic assessment. AI models were able to predict the diagnosis of ADH with 85% sensitivity (144 of 168 images within the test set) and 69% specificity (78 of 113 images within the test set). The slide-level area under the receiver operating characteristic (ROC) curve for this model was 0.84.
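The sensitivity and specificity figures follow directly from the image counts given in the Results. A minimal sketch of that arithmetic is shown below; the variable names are illustrative. Note that 144 of 168 works out to about 85.7%, which the abstract reports as 85%.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of ADH images that the model labeled as ADH."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-ADH images that the model labeled as No ADH."""
    return true_neg / (true_neg + false_pos)


# Counts from the abstract: 144 of 168 ADH images and 78 of 113 non-ADH images.
sens = sensitivity(true_pos=144, false_neg=168 - 144)
spec = specificity(true_neg=78, false_pos=113 - 78)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```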
Conclusions: A deep learning-based classifier showed strong performance for the identification of ADH from whole slide images of H&E stained breast CNBs. With further development, this approach may improve the reproducibility and standardization of the diagnosis of ADH. Future analyses will focus on determining if morphologic features of ADH extracted by the deep learning system can be used to predict upgrade to DCIS and IC. This approach may help stratify patients with ADH on CNB into those who require surgical excision and those who can be followed with active surveillance.
Citation Format: Jennifer K. Kerner, Allison Cleary, Suyog Jain, Harsha Pokkalla, Benjamin Glass, Sam Grossmith, Maya Harary, Elizabeth Mittendorf, Andrew H. Beck, Aditya Khosla, Stuart J. Schnitt, Ilan Wapinski, Tari King. Artificial intelligence powered predictive analysis of atypical ductal hyperplasia from digitized pathology images [abstract]. In: Proceedings of the 2019 San Antonio Breast Cancer Symposium; 2019 Dec 10-14; San Antonio, TX. Philadelphia (PA): AACR; Cancer Res 2020;80(4 Suppl):Abstract nr P5-02-02.
Affiliation(s)
- Allison Cleary
- Dana Farber / Brigham and Women’s Cancer Center, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
- Sam Grossmith
- Dana Farber / Brigham and Women’s Cancer Center, Boston, MA
- Maya Harary
- Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
- Elizabeth Mittendorf
- Dana Farber / Brigham and Women’s Cancer Center, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
- Stuart J. Schnitt
- Dana Farber / Brigham and Women’s Cancer Center, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
- Tari King
- Dana Farber / Brigham and Women’s Cancer Center, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
|
24
|
Szabo PM, Lee G, Ely S, Baxi V, Pokkalla H, Elliott H, Wang D, Glass B, Kerner JK, Wapinski I, Hedvat C, Locke D, Pandya D, Adya N, Qi Z, Greenfield A, Edwards R, Montalto M. CD8+ T cells in tumor parenchyma and stroma by image analysis (IA) and gene expression profiling (GEP): Potential biomarkers for immuno-oncology (I-O) therapy. J Clin Oncol 2019. [DOI: 10.1200/jco.2019.37.15_suppl.2594] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/20/2022] Open
Abstract
2594 Background: Distribution patterns of CD8+ T cells within the tumor microenvironment (TME) can be assessed by IA, which may reflect underlying tumor biology and serve as a potential biomarker to assess the utility of I-O therapy. These patterns are variable and may be classified as immune desert (minimal infiltrate), excluded (T cells confined to tumor stroma or to the invasive margin), or inflamed (T cells diffusely infiltrating tumor parenchyma and stroma). We hypothesized that association of a GEP signature with abundance of parenchymal and stromal T-cell infiltrates may identify biomarkers of response or resistance to I-O therapy. To test this, we applied an AI-powered IA platform to quantify CD8+ T cells by geographical location and used GEP to define both CD8 abundance and associated geographic localization to tumor parenchyma and stroma. Methods: We performed an analysis using a tumor inflammatory GEP assay and CD8 immunohistochemistry on procured specimens (335 melanoma, 391 SCCHN). Digitized slides were used to train a convolutional neural network to quantify the number of CD8+ T cells in stroma, tumor parenchyma, parenchyma-stromal interface, and invasive margin. Generalized constrained regression models were used to predict GEP signatures specifically for stromal and parenchymal CD8+ T cells. Results: Parenchymal and stromal GEP scores were highly concordant with CD8+ infiltrate geography (adjusted r²: 0.67, 0.65, respectively; P ≤ 0.01). Little overlap existed between gene sets associated with parenchymal and stromal CD8 T-cell geographies. CSF1R and NECTIN2 gene expression was observed to correlate inversely with parenchymal localization and directly with stromal CD8+ T-cell abundance. Conclusions: GEP signatures can be identified that are concordant with various CD8+ T-cell localization patterns in melanoma and SCCHN, demonstrating that GEP-IA can be developed to identify the immune status of interest in the TME.
The specific genes identified have potential to elucidate mechanisms of resistance and/or inform I-O targets that can be further evaluated in relation to clinical significance in future studies.
|
25
|
Georgescu CH, Manson AL, Griggs AD, Desjardins CA, Pironti A, Wapinski I, Abeel T, Haas BJ, Earl AM. SynerClust: a highly scalable, synteny-aware orthologue clustering tool. Microb Genom 2018; 4. [PMID: 30418868 PMCID: PMC6321874 DOI: 10.1099/mgen.0.000231] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 12/18/2022] Open
Abstract
Accurate orthologue identification is a vital component of bacterial comparative genomic studies, but many popular sequence-similarity-based approaches do not scale well to the large numbers of genomes that are now generated routinely. Furthermore, most approaches do not take gene synteny into account, which is useful information for disentangling paralogues. Here, we present SynerClust, a user-friendly synteny-aware tool based on synergy that can process thousands of genomes. SynerClust was designed to analyse genomes with high levels of local synteny, particularly prokaryotes, which have operon structure. SynerClust’s run-time is optimized by selecting cluster representatives at each node in the phylogeny, thus avoiding the need for exhaustive pairwise similarity searches. In benchmarking against Roary, Hieranoid2, PanX and Reciprocal Best Hit, SynerClust was able to more completely identify sets of core genes for datasets that included diverse strains, while using substantially less memory, and with scalability comparable to the fastest tools. Due to its scalability, ease of installation and use, and suitability for a variety of computing environments, orthogroup clustering using SynerClust will enable many large-scale prokaryotic comparative genomics efforts.
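For readers unfamiliar with the Reciprocal Best Hit baseline named in the benchmark, a minimal sketch of that general technique follows. It is not SynerClust's algorithm, which additionally uses synteny and phylogeny-guided cluster representatives; the input format (a dictionary of similarity scores keyed by gene pairs) and the function names are assumptions for illustration.

```python
def best_hits(scores):
    """For each query gene, keep the target with the highest similarity score.

    scores: dict mapping (query_gene, target_gene) -> similarity score
    """
    best = {}
    for (query, target), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return gene pairs that are each other's best hit across two genomes."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


a_vs_b = {("a1", "b1"): 90, ("a1", "b2"): 50, ("a2", "b2"): 80, ("a3", "b1"): 70}
b_vs_a = {("b1", "a1"): 88, ("b1", "a3"): 60, ("b2", "a2"): 79, ("b2", "a1"): 40}
print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

Because this approach compares every genome pair exhaustively and considers only sequence similarity, gene `a3` above is left unpaired even though it resembles `b1`; disentangling such paralogue-like cases is where synteny information helps.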
Affiliation(s)
- Thomas Abeel
- Broad Institute, Cambridge, MA, USA; Delft University of Technology, Delft, The Netherlands
|
26
|
Koo BM, Kritikos G, Farelli JD, Todor H, Tong K, Kimsey H, Wapinski I, Galardini M, Cabal A, Peters JM, Hachmann AB, Rudner DZ, Allen KN, Typas A, Gross CA. Construction and Analysis of Two Genome-Scale Deletion Libraries for Bacillus subtilis. Cell Syst 2017; 4:291-305.e7. [PMID: 28189581 DOI: 10.1016/j.cels.2016.12.013] [Citation(s) in RCA: 329] [Impact Index Per Article: 47.0] [Received: 07/02/2016] [Revised: 11/19/2016] [Accepted: 12/21/2016] [Indexed: 12/16/2022]
Abstract
A systems-level understanding of Gram-positive bacteria is important from both an environmental and health perspective and is most easily obtained when high-quality, validated genomic resources are available. To this end, we constructed two ordered, barcoded, erythromycin-resistance- and kanamycin-resistance-marked single-gene deletion libraries of the Gram-positive model organism, Bacillus subtilis. The libraries comprise 3,968 and 3,970 genes, respectively, and overlap in all but four genes. Using these libraries, we update the set of essential genes known for this organism, provide a comprehensive compendium of B. subtilis auxotrophic genes, and identify genes required for utilizing specific carbon and nitrogen sources, as well as those required for growth at low temperature. We report the identification of enzymes catalyzing several missing steps in amino acid biosynthesis. Finally, we describe a suite of high-throughput phenotyping methodologies and apply them to provide a genome-wide analysis of competence and sporulation. Altogether, we provide versatile resources for studying gene function and pathway and network architecture in Gram-positive bacteria.
Affiliation(s)
- Byoung-Mo Koo
- Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, CA 94158, USA
- George Kritikos
- European Molecular Biology Laboratory, Genome Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany
- Horia Todor
- Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, CA 94158, USA
- Kenneth Tong
- Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, CA 94158, USA
- Harvey Kimsey
- Department of Microbiology and Immunobiology, Harvard Medical School, Boston, MA 02115, USA
- Ilan Wapinski
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Marco Galardini
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Angelo Cabal
- Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, CA 94158, USA
- Jason M Peters
- Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, CA 94158, USA
- Anna-Barbara Hachmann
- Department of Microbiology and Immunobiology, Harvard Medical School, Boston, MA 02115, USA
- David Z Rudner
- Department of Microbiology and Immunobiology, Harvard Medical School, Boston, MA 02115, USA
- Karen N Allen
- Department of Chemistry, Boston University, Boston, MA 02215, USA
- Athanasios Typas
- European Molecular Biology Laboratory, Genome Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany
- Carol A Gross
- Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, CA 94158, USA; Department of Cell and Tissue Biology, University of California, San Francisco, San Francisco, CA 94158, USA; California Institute of Quantitative Biology, University of California, San Francisco, San Francisco, CA 94158, USA
|
27
|
Muntel J, Boswell SA, Tang S, Ahmed S, Wapinski I, Foley G, Steen H, Springer M. Abundance-based classifier for the prediction of mass spectrometric peptide detectability upon enrichment (PPA). Mol Cell Proteomics 2014; 14:430-40. [PMID: 25473088 DOI: 10.1074/mcp.m114.044321] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022] Open
Abstract
The function of a large percentage of proteins is modulated by post-translational modifications (PTMs). Currently, mass spectrometry (MS) is the only proteome-wide technology that can identify PTMs. Unfortunately, the inability to detect a PTM by MS is not proof that the modification is not present. The detectability of peptides varies significantly, making MS potentially blind to a large fraction of peptides. Learning from published algorithms that generally focus on predicting the most detectable peptides, we developed a tool that incorporates protein abundance into the peptide prediction algorithm with the aim of determining the detectability of every peptide within a protein. We tested our tool, "Peptide Prediction with Abundance" (PPA), on in-house acquired as well as published data sets from other groups acquired on different instrument platforms. Incorporation of protein abundance into the prediction allows us to assess not only the detectability of all peptides but also whether a peptide of interest is likely to become detectable upon enrichment. We validated the ability of our tool to predict changes in protein detectability with a dilution series of 31 purified proteins at several different concentrations. PPA correctly predicted the concentration-dependent peptide detectability in 78% of cases, demonstrating its utility for predicting the protein enrichment needed to observe a peptide of interest in targeted experiments. This is especially important in the analysis of PTMs. PPA is available as a web-based or executable package that can work with generally applicable defaults or be retrained from a pilot MS data set.
Affiliation(s)
- Jan Muntel
- Departments of Pathology, Boston Children's Hospital and Harvard Medical School, Boston, MA
- Sarah A Boswell
- Department of Systems Biology, Harvard Medical School, Boston, MA
- Shaojun Tang
- Departments of Pathology, Boston Children's Hospital and Harvard Medical School, Boston, MA
- Saima Ahmed
- Departments of Pathology, Boston Children's Hospital and Harvard Medical School, Boston, MA
- Ilan Wapinski
- Department of Systems Biology, Harvard Medical School, Boston, MA
- Greg Foley
- Department of Systems Biology, Harvard Medical School, Boston, MA
- Hanno Steen
- Departments of Pathology, Boston Children's Hospital and Harvard Medical School, Boston, MA
- Michael Springer
- Department of Systems Biology, Harvard Medical School, Boston, MA
|
28
|
Thompson DA, Roy S, Chan M, Styczynski MP, Pfiffner J, French C, Socha A, Thielke A, Napolitano S, Muller P, Kellis M, Konieczka JH, Wapinski I, Regev A. Correction: Evolutionary principles of modular gene regulation in yeasts. eLife 2013; 2:e01114. [PMID: 23840936 PMCID: PMC3699816 DOI: 10.7554/elife.01114] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
|
29
|
Thompson DA, Roy S, Chan M, Styczynsky MP, Pfiffner J, French C, Socha A, Thielke A, Napolitano S, Muller P, Kellis M, Konieczka JH, Wapinski I, Regev A. Evolutionary principles of modular gene regulation in yeasts. eLife 2013; 2:e00603. [PMID: 23795289 PMCID: PMC3687341 DOI: 10.7554/elife.00603] [Citation(s) in RCA: 57] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2013] [Accepted: 05/02/2013] [Indexed: 12/20/2022] Open
Abstract
Divergence in gene regulation can play a major role in evolution. Here, we used a phylogenetic framework to measure mRNA profiles in 15 yeast species from the phylum Ascomycota and reconstruct the evolution of their modular regulatory programs along a time course of growth on glucose over 300 million years. We found that modules have diverged proportionally to phylogenetic distance, with prominent changes in gene regulation accompanying changes in lifestyle and ploidy, especially in carbon metabolism. Paralogs have significantly contributed to regulatory divergence, typically within a very short window from their duplication. Paralogs from a whole genome duplication (WGD) event have a uniquely substantial contribution that extends over a longer span. Similar patterns occur when considering the evolution of the heat shock regulatory program measured in eight of the species, suggesting that these are general evolutionary principles. DOI:http://dx.doi.org/10.7554/eLife.00603.001 The incredible diversity of living creatures belies the fact that their genes are quite similar. In the 1970s Mary-Claire King and Allan Wilson proposed that a process called gene regulation—which determines when, where and how genes are expressed as proteins—is responsible for this diversity. Four decades later, the central role of gene regulation in evolution has been confirmed in a wide range of species including bacteria, fungi, flies and mammals, although the details remain poorly understood. In recent years it has been suggested that the duplication of genes—and sometimes the duplication of whole genomes—has had a crucial influence on the part played by gene regulation in the evolution of many different species. Ascomycota fungi are uniquely suited to the study of genetics and evolution because of their diversity—they include C. albicans, a fungus that is found in the human mouth and gut, and various species of yeast—and because many of their genomes have already been sequenced. 
Moreover, their genomes are relatively small, which simplifies the task of working out how they have changed over the course of evolution. It is also known that species in this branch of the tree of life diverged before and after an event in which a whole genome was duplicated. Ascomycota fungi use glucose as a source of carbon in different ways during aerobic growth. Most, including C. albicans, are respiratory and rely on oxidative phosphorylation processes to produce energy. However, a small number (including S. cerevisiae and S. pombe, two types of yeast that are widely used as model organisms) prefer to ferment glucose, even when oxygen is available. Species that favor the latter respiro-fermentative lifestyle have evolved independently at least twice: once after the whole genome duplication event that led to S. cerevisiae, and once when S. pombe and the other fission yeasts evolved. Thompson et al. have measured mRNA profiles in 15 different species of yeast and reconstructed how the regulation of groups of genes (modules) has evolved over a period of more than 300 million years. They found that modules have diverged proportionally to evolutionary time, with prominent changes in gene regulation being associated with changes in lifestyle (especially changes in carbon metabolism) and a whole genome duplication event. Gene duplication events result in gene paralogs (identical genes at different places in the genome), and these have made significant contributions to the evolution of different forms of gene regulation, especially just after the duplication event. Moreover, the paralogs produced in whole genome duplication events have resulted in bigger changes over longer periods of time. Similar patterns were observed in the regulation of the genes involved in the response to heat shock in eight of the species, which suggests that these are general evolutionary principles. 
The changes in gene expression associated with the respiro-fermentative lifestyle may also have implications for our understanding of cancer: healthy cells rely on oxidative phosphorylation to produce energy whereas, similar to yeast cells, most cancerous cells rely on respiro-fermentation. Furthermore, yeast cells and cancer cells both support their rapid growth and proliferation by using glucose for biosynthesis to support cell division, although this process is not fully understood. Normal cells, on the other hand, use glucose primarily for energy and tend not to divide rapidly. Thompson et al. found that the genes encoding enzymes in two biosynthetic pathways—one that produces the nucleotides necessary for DNA replication, and one that synthesizes glycine—are induced in respiro-fermentative yeasts but repressed in respiratory yeast cells. The fact that similar changes are observed in the same two pathways when normal cells become cancer cells suggests that these pathways have an important role in the development of cancer. The framework developed by Thompson et al. could also be used to explore the evolution of gene regulation in other species and biological processes. DOI:http://dx.doi.org/10.7554/eLife.00603.002
Affiliation(s)
- Dawn A Thompson
- Broad Institute of MIT and Harvard , Cambridge , United States
|
30
|
Roy S, Wapinski I, Pfiffner J, French C, Socha A, Konieczka J, Habib N, Kellis M, Thompson D, Regev A. Arboretum: reconstruction and analysis of the evolutionary history of condition-specific transcriptional modules. Genome Res 2013; 23:1039-50. [PMID: 23640720 PMCID: PMC3668358 DOI: 10.1101/gr.146233.112] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/01/2023]
Abstract
Comparative functional genomics studies the evolution of biological processes by analyzing functional data, such as gene expression profiles, across species. A major challenge is to compare profiles collected in a complex phylogeny. Here, we present Arboretum, a novel scalable computational algorithm that integrates expression data from multiple species with species and gene phylogenies to infer modules of coexpressed genes in extant species and their evolutionary histories. We also develop new, generally applicable measures of conservation and divergence in gene regulatory modules to assess the impact of changes in gene content and expression on module evolution. We used Arboretum to study the evolution of the transcriptional response to heat shock in eight species of Ascomycota fungi and to reconstruct modules of the ancestral environmental stress response (ESR). We found substantial conservation in the stress response across species and in the reconstructed components of the ancestral ESR modules. The greatest divergence was in the most induced stress, primarily through module expansion. The divergence of the heat stress response exceeds that observed in the response to glucose depletion in the same species. Arboretum and its associated analyses provide a comprehensive framework to systematically study regulatory evolution of condition-specific responses.
Affiliation(s)
- Sushmita Roy
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA.
|
31
|
McGuire AM, Weiner B, Park ST, Wapinski I, Raman S, Dolganov G, Peterson M, Riley R, Zucker J, Abeel T, White J, Sisk P, Stolte C, Koehrsen M, Yamamoto RT, Iacobelli-Martinez M, Kidd MJ, Maer AM, Schoolnik GK, Regev A, Galagan J. Comparative analysis of Mycobacterium and related Actinomycetes yields insight into the evolution of Mycobacterium tuberculosis pathogenesis. BMC Genomics 2012; 13:120. [PMID: 22452820 PMCID: PMC3388012 DOI: 10.1186/1471-2164-13-120] [Citation(s) in RCA: 69] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2011] [Accepted: 03/28/2012] [Indexed: 01/24/2023] Open
Abstract
BACKGROUND The sequence of the pathogen Mycobacterium tuberculosis (Mtb) strain H37Rv has been available for over a decade, but the biology of the pathogen remains poorly understood. Genome sequences from other Mtb strains and closely related bacteria present an opportunity to apply the power of comparative genomics to understand the evolution of Mtb pathogenesis. We conducted a comparative analysis using 31 genomes from the Tuberculosis Database (TBDB.org), including 8 strains of Mtb and M. bovis, 11 additional Mycobacteria, 4 Corynebacteria, 2 Streptomyces, Rhodococcus jostii RHA1, Nocardia farcinica, Acidothermus cellulolyticus, Rhodobacter sphaeroides, Propionibacterium acnes, and Bifidobacterium longum. RESULTS Our results highlight the functional importance of lipid metabolism and its regulation, and reveal variation between the evolutionary profiles of genes implicated in saturated and unsaturated fatty acid metabolism. They also suggest that DNA repair and molybdopterin cofactors are important in pathogenic Mycobacteria. By analyzing sequence conservation and gene expression data, we identify nearly 400 conserved noncoding regions. These include 37 predicted promoter regulatory motifs, of which 14 correspond to previously validated motifs, as well as 50 potential noncoding RNAs, of which we experimentally confirm the expression of four. CONCLUSIONS Our analysis of protein evolution highlights gene families that are associated with the adaptation of environmental Mycobacteria to obligate pathogenesis. These families include fatty acid metabolism, DNA repair, and molybdopterin biosynthesis. Our analysis reinforces recent findings suggesting that small noncoding RNAs are more common in Mycobacteria than previously expected. Our data provide a foundation for understanding the genome and biology of Mtb in a comparative context, and are available online and through TBDB.org.
|
32
|
Habib N, Wapinski I, Margalit H, Regev A, Friedman N. A functional selection model explains evolutionary robustness despite plasticity in regulatory networks. Mol Syst Biol 2012; 8:619. [PMID: 23089682 PMCID: PMC3501536 DOI: 10.1038/msb.2012.50] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/20/2012] [Accepted: 08/29/2012] [Indexed: 11/09/2022] Open
Abstract
Evolutionary rewiring of regulatory networks is an important source of diversity among species. Previous evidence suggested substantial divergence of regulatory networks across species. However, systematically assessing the extent of this plasticity and its functional implications has been challenging due to limited experimental data and the noisy nature of computational predictions. Here, we introduce a novel approach to study cis-regulatory evolution, and use it to trace the regulatory history of 88 DNA motifs of transcription factors across 23 Ascomycota fungi. While motifs are conserved, we find a pervasive gain and loss in the regulation of their target genes. Despite this turnover, the biological processes associated with a motif are generally conserved. We explain these trends using a model with a strong selection to conserve the overall function of a transcription factor, and a much weaker selection over the specific genes it targets. The model also accounts for the turnover of bound targets measured experimentally across species in yeasts and mammals. Thus, selective pressures on regulatory networks mostly tolerate local rewiring, and may allow for subtle fine-tuning of gene regulation during evolution.
Affiliation(s)
- Naomi Habib
- School of Computer Science and Engineering, Hebrew University, Jerusalem, Israel
- Department of Microbiology and Molecular Genetics, IMRIC, Faculty of Medicine, Hebrew University, Jerusalem, Israel
- Alexander Silberman Institute of Life Sciences, Hebrew University, Jerusalem, Israel
- Ilan Wapinski
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Broad Institute, 7 Cambridge Center, Cambridge, MA, USA
- Hanah Margalit
- Department of Microbiology and Molecular Genetics, IMRIC, Faculty of Medicine, Hebrew University, Jerusalem, Israel
- Aviv Regev
- Broad Institute, 7 Cambridge Center, Cambridge, MA, USA
- Howard Hughes Medical Institute, Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
- Nir Friedman
- School of Computer Science and Engineering, Hebrew University, Jerusalem, Israel
- Alexander Silberman Institute of Life Sciences, Hebrew University, Jerusalem, Israel
|
33
|
Rhind N, Chen Z, Yassour M, Thompson DA, Haas BJ, Habib N, Wapinski I, Roy S, Lin MF, Heiman DI, Young SK, Furuya K, Guo Y, Pidoux A, Chen HM, Robbertse B, Goldberg JM, Aoki K, Bayne EH, Berlin AM, Desjardins CA, Dobbs E, Dukaj L, Fan L, FitzGerald MG, French C, Gujja S, Hansen K, Keifenheim D, Levin JZ, Mosher RA, Müller CA, Pfiffner J, Priest M, Russ C, Smialowska A, Swoboda P, Sykes SM, Vaughn M, Vengrova S, Yoder R, Zeng Q, Allshire R, Baulcombe D, Birren BW, Brown W, Ekwall K, Kellis M, Leatherwood J, Levin H, Margalit H, Martienssen R, Nieduszynski CA, Spatafora JW, Friedman N, Dalgaard JZ, Baumann P, Niki H, Regev A, Nusbaum C. Comparative functional genomics of the fission yeasts. Science 2011; 332:930-6. [PMID: 21511999 DOI: 10.1126/science.1203357] [Citation(s) in RCA: 370] [Impact Index Per Article: 28.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/10/2023]
Abstract
The fission yeast clade--comprising Schizosaccharomyces pombe, S. octosporus, S. cryophilus, and S. japonicus--occupies the basal branch of Ascomycete fungi and is an important model of eukaryote biology. A comparative annotation of these genomes identified a near extinction of transposons and the associated innovation of transposon-free centromeres. Expression analysis established that meiotic genes are subject to antisense transcription during vegetative growth, which suggests a mechanism for their tight regulation. In addition, trans-acting regulators control new genes within the context of expanded functional modules for meiosis and stress response. Differences in gene content and regulation also explain why, unlike the budding yeast of Saccharomycotina, fission yeasts cannot use ethanol as a primary carbon source. These analyses elucidate the genome structure and gene regulation of fission yeast and provide tools for investigation across the Schizosaccharomyces clade.
Affiliation(s)
- Nicholas Rhind
- Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, 364 Plantation Street, Worcester, MA 01605, USA.
|
34
|
Ma LJ, van der Does HC, Borkovich KA, Coleman JJ, Daboussi MJ, Di Pietro A, Dufresne M, Freitag M, Grabherr M, Henrissat B, Houterman PM, Kang S, Shim WB, Woloshuk C, Xie X, Xu JR, Antoniw J, Baker SE, Bluhm BH, Breakspear A, Brown DW, Butchko RAE, Chapman S, Coulson R, Coutinho PM, Danchin EGJ, Diener A, Gale LR, Gardiner DM, Goff S, Hammond-Kosack KE, Hilburn K, Hua-Van A, Jonkers W, Kazan K, Kodira CD, Koehrsen M, Kumar L, Lee YH, Li L, Manners JM, Miranda-Saavedra D, Mukherjee M, Park G, Park J, Park SY, Proctor RH, Regev A, Ruiz-Roldan MC, Sain D, Sakthikumar S, Sykes S, Schwartz DC, Turgeon BG, Wapinski I, Yoder O, Young S, Zeng Q, Zhou S, Galagan J, Cuomo CA, Kistler HC, Rep M. Comparative genomics reveals mobile pathogenicity chromosomes in Fusarium. Nature 2010; 464:367-73. [PMID: 20237561 PMCID: PMC3048781 DOI: 10.1038/nature08850] [Citation(s) in RCA: 998] [Impact Index Per Article: 71.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/26/2009] [Accepted: 01/20/2010] [Indexed: 11/09/2022]
Abstract
Fusarium species are among the most important phytopathogenic and toxigenic fungi. To understand the molecular underpinnings of pathogenicity in the genus Fusarium, we compared the genomes of three phenotypically diverse species: Fusarium graminearum, Fusarium verticillioides and Fusarium oxysporum f. sp. lycopersici. Our analysis revealed lineage-specific (LS) genomic regions in F. oxysporum that include four entire chromosomes and account for more than one-quarter of the genome. LS regions are rich in transposons and genes with distinct evolutionary profiles but related to pathogenicity, indicative of horizontal acquisition. Experimentally, we demonstrate the transfer of two LS chromosomes between strains of F. oxysporum, converting a non-pathogenic strain into a pathogen. Transfer of LS chromosomes between otherwise genetically isolated strains explains the polyphyletic origin of host specificity and the emergence of new pathogenic lineages in F. oxysporum. These findings put the evolution of fungal pathogenicity into a new perspective.
Affiliation(s)
- Li-Jun Ma
- The Broad Institute, Cambridge, Massachusetts 02141, USA
|
35
|
Tetievsky A, Cohen O, Eli-Berchoer L, Gerstenblith G, Stern MD, Wapinski I, Friedman N, Horowitz M. Physiological and molecular evidence of heat acclimation memory: a lesson from thermal responses and ischemic cross-tolerance in the heart. Physiol Genomics 2008; 34:78-87. [PMID: 18430807 PMCID: PMC10585612 DOI: 10.1152/physiolgenomics.00215.2007] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
Sporadic findings in humans suggest that reinduction of heat acclimation (AC) after its loss occurs markedly faster than that during the initial AC session. Animal studies substantiated that the underlying acclimatory processes are molecular. Here we test the hypothesis that faster reinduction of AC (ReAC) implicates "molecular memory." In vivo measurements of colonic temperature profiles during heat stress and ex vivo assessment of cross-tolerance to ischemia-reperfusion or anoxia insults in the heart demonstrated that ReAC only needs 2 days vs. the 30 days required for the initial development of AC. Stress gene profiling in the experimental groups highlighted clusters of transcriptionally activated genes (37%), which included heat shock protein (HSP) genes, antiapoptotic genes, and chromatin remodeling genes. Despite a return of the physiological phenotype to its preacclimation state after a 1-month deacclimation (DeAC) period, the gene transcripts did not resume their preacclimation levels, suggesting a dichotomy between genotype and phenotype in this system. Individual detection of hsp70 and hsf1 transcripts agreed with these findings. HSP72, HSF1/P-HSF1, and Bcl-xL protein profiles followed the observed dichotomized genomic response. In contrast, HSP90, an essential cytoprotective component, mismatched transcriptional activation upon DeAC. The uniform activation of the similarly responding gene clusters upon De-/ReAC implies that reacclimatory phenotypic plasticity is associated with upstream denominators. During AC, DeAC, and ReAC, the maintenance of elevated/phosphorylated HSF1 protein levels and transcriptionally active chromatin remodeling genes implies that chromatin remodeling plays a pivotal role in the transcriptome profile and in preconditioning to rapid cytoprotective acclimatory memory.
Affiliation(s)
- Anna Tetievsky
- Laboratory of Environmental Physiology, Faculty of Dental Medicine, The Hebrew University, Jerusalem, Israel
|
36
|
Li QR, Carvunis AR, Yu H, Han JDJ, Zhong Q, Simonis N, Tam S, Hao T, Klitgord NJ, Dupuy D, Mou D, Wapinski I, Regev A, Hill DE, Cusick ME, Vidal M. Revisiting the Saccharomyces cerevisiae predicted ORFeome. Genome Res 2008; 18:1294-303. [PMID: 18502943 DOI: 10.1101/gr.076661.108] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
Abstract
Accurately defining the coding potential of an organism, i.e., all protein-encoding open reading frames (ORFs) or "ORFeome," is a prerequisite to fully understand its biology. ORFeome annotation involves iterative computational predictions from genome sequences combined with experimental verifications. Here we reexamine a set of Saccharomyces cerevisiae "orphan" ORFs recently removed from the original ORFeome annotation due to lack of conservation across evolutionarily related yeast species. We show that many orphan ORFs produce detectable transcripts and/or translated products in various functional genomics and proteomics experiments. By combining a naïve Bayes model that predicts the likelihood of an ORF to encode a functional product with experimental verification of strand-specific transcripts, we argue that orphan ORFs should still remain candidates for functional ORFs. In support of this model, interstrain intraspecies genome sequence variation is lower across orphan ORFs than in intergenic regions, indicating that orphan ORFs endure functional constraints and resist deleterious mutations. We conclude that ORFs should be evaluated based on multiple levels of evidence and not be removed from ORFeome annotation solely based on low sequence conservation in other species. Rather, such ORFs might be important for micro-evolutionary divergence between species.
Affiliation(s)
- Qian-Ru Li
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02115, USA
|
37
|
Wapinski I, Pfeffer A, Friedman N, Regev A. Natural history and evolutionary principles of gene duplication in fungi. Nature 2007; 449:54-61. [PMID: 17805289 DOI: 10.1038/nature06107] [Citation(s) in RCA: 474] [Impact Index Per Article: 27.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/21/2007] [Accepted: 07/20/2007] [Indexed: 11/08/2022]
Abstract
Gene duplication and loss is a powerful source of functional innovation. However, the general principles that govern this process are still largely unknown. With the growing number of sequenced genomes, it is now possible to examine these events in a comprehensive and unbiased manner. Here, we develop a procedure that resolves the evolutionary history of all genes in a large group of species. We apply our procedure to seventeen fungal genomes to create a genome-wide catalogue of gene trees that determine precise orthology and paralogy relations across these species. We show that gene duplication and loss is highly constrained by the functional properties and interacting partners of genes. In particular, stress-related genes exhibit many duplications and losses, whereas growth-related genes show selection against such changes. Whole-genome duplication circumvents this constraint and relaxes the dichotomy, resulting in an expanded functional scope of gene duplication. By characterizing the functional fate of duplicate genes we show that duplicated genes rarely diverge with respect to biochemical function, but typically diverge with respect to regulatory control. Surprisingly, paralogous modules of genes rarely arise, even after whole-genome duplication. Rather, gene duplication may drive the modularization of functional networks through specialization, thereby disentangling cellular systems.
Affiliation(s)
- Ilan Wapinski
- Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, Massachusetts 02142, USA
|
38
|
Abstract
Gene duplication and divergence is a major evolutionary force. Despite the growing number of fully sequenced genomes, methods for investigating these events on a genome-wide scale are still in their infancy. Here, we present SYNERGY, a novel and scalable algorithm that uses sequence similarity and a given species phylogeny to reconstruct the underlying evolutionary history of all genes in a large group of species. In doing so, SYNERGY resolves homology relations and accurately distinguishes orthologs from paralogs. We applied our approach to a set of nine fully sequenced fungal genomes spanning 150 million years, generating a genome-wide catalog of orthologous groups and corresponding gene trees. Our results are highly accurate when compared to a manually curated gold standard, and are robust to the quality of input according to a novel jackknife confidence scoring. The reconstructed gene trees provide a comprehensive view of gene evolution on a genomic scale. Our approach can be applied to any set of sequenced eukaryotic species with a known phylogeny, and opens the way to systematic studies of the evolution of individual genes, molecular systems and whole genomes. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ilan Wapinski
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
|
39
|
Horowitz M, Eli-Berchoer L, Wapinski I, Friedman N, Kodesh E. Stress-related genomic responses during the course of heat acclimation and its association with ischemic-reperfusion cross-tolerance. J Appl Physiol (1985) 2004; 97:1496-507. [PMID: 15155711 DOI: 10.1152/japplphysiol.00306.2004] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.1] [Indexed: 11/22/2022]
Abstract
Acclimation to heat is a biphasic process involving a transient perturbed phase followed by a long-lasting period during which acclimatory homeostasis is developed. In this investigation, we used cDNA stress microarray (Clontech Laboratory) to characterize the stress-related genomic response during the course of heat acclimation and to test the hypotheses that 1) heat acclimation influences the threshold of activation of protective molecular signaling, and 2) heat-acclimation-mediated ischemic-reperfusion (I/R) protection is coupled with reprogrammed gene expression leading to altered capacity or responsiveness of protective-signaling pathways shared by heat and I/R cytoprotective systems. Rats were acclimated at 34°C for 0, 2, and 30 days. 32P-labeled RNA samples prepared from the left ventricles of rats before and after subjection to heat stress (HS; 2 h, 41°C) or after I/R insult (ischemia: 75%, 45 min; reperfusion: 30 min) were hybridized onto the array membranes. Confirmatory RT-PCR of selected genes conducted on samples taken at 0, 30, and 60 min after HS or total ischemia was used to assess the promptness of the transcriptional response. Cluster analysis of the expressed genes indicated that acclimation involves a “two-tier” defense strategy: an immediate transient response peaking at the initial acclimating phase to maintain DNA and cellular integrity, and a sustained response, correlated with slowly developed adaptive, long-lasting cytoprotective signaling networks involving genes encoding proteins that are essential for the heat-shock response, antiapoptosis, and antioxidation. Gene activation was stress specific. Faster activation and suppression of signaling pathways shared by HS and I/R stressors probably contribute to heat-acclimation I/R cross-tolerance.
Affiliation(s)
- Michal Horowitz
- Laboratory of Environmental Physiology, Hadassah Medical School, The Hebrew University, POB 12272, Jerusalem 91120, Israel.
|
40
|
Abstract
The support vector machine (SVM) learning algorithm has been widely applied in bioinformatics. We have developed a simple web interface to our implementation of the SVM algorithm, called Gist. This interface allows novice or occasional users to apply a sophisticated machine learning algorithm easily to their data. More advanced users can download the software and source code for local installation. The availability of these tools will permit more widespread application of this powerful learning algorithm in bioinformatics.
Affiliation(s)
- Paul Pavlidis
- Columbia Genome Center and Department of Biomedical Informatics, Columbia University, 1150 St Nicholas Avenue, New York, NY 10032, USA.
|