1
Wang Z, Ying Y, Wang M, Chen Q, Wang Y, Yu X, He W, Li J, Zeng S, Xu C. Comprehensive identification of onco-exaptation events in bladder cancer cell lines revealed L1PA2-SYT1 as a prognosis-relevant event. iScience 2023; 26:108482. [PMID: 38058305 PMCID: PMC10696462 DOI: 10.1016/j.isci.2023.108482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2023] [Revised: 10/17/2023] [Accepted: 11/15/2023] [Indexed: 12/08/2023]
Abstract
Transposable elements (TEs) can provide ectopic promoters to drive the expression of oncogenes in cancer, a mechanism known as onco-exaptation. Onco-exaptation events have been extensively identified in various cancers, with bladder cancer showing a high frequency of onco-exaptation events (77%). However, the effect of most of these events in bladder cancer remains unclear. This study identified 44 onco-exaptation events in 44 bladder cancer cell lines in 137 RNA-seq datasets from six publicly available cohorts, with L1PA2 contributing the most events. L1PA2-SYT1, L1PA2-MET, and L1PA2-XCL1 had the highest frequency not only in cell lines but also in TCGA-BLCA samples. L1PA2-SYT1 showed significant tumor specificity and was found to be activated by CpG island demethylation in its promoter. The upregulation of L1PA2-SYT1 enhances the in vitro invasion of bladder cancer and is an independent risk factor for patients' overall survival, suggesting that L1PA2-SYT1 is an important event that promotes the development of bladder cancer.
Affiliation(s)
- Ziwei Wang
- Department of Urology, Changhai Hospital, Naval Medical University, Shanghai 200433, China
- Yidie Ying
- Department of Urology, Changhai Hospital, Naval Medical University, Shanghai 200433, China
- Maoyu Wang
- Department of Urology, Changhai Hospital, Naval Medical University, Shanghai 200433, China
- Qing Chen
- Department of Urology, Changhai Hospital, Naval Medical University, Shanghai 200433, China
- Yi Wang
- Department of Urology, Changhai Hospital, Naval Medical University, Shanghai 200433, China
- Xufeng Yu
- Department of Urology, Changhai Hospital, Naval Medical University, Shanghai 200433, China
- Wei He
- Department of Urology, Changhai Hospital, Naval Medical University, Shanghai 200433, China
- Jing Li
- Department of Bioinformatics, Center for Translational Medicine, Naval Medical University, Shanghai 200433, China
- Shanghai Key Laboratory of Cell Engineering, Shanghai, China
- Shuxiong Zeng
- Department of Urology, Changhai Hospital, Naval Medical University, Shanghai 200433, China
- Chuanliang Xu
- Department of Urology, Changhai Hospital, Naval Medical University, Shanghai 200433, China
2
Bu C, Zheng X, Mai J, Nie Z, Zeng J, Qian Q, Xu T, Sun Y, Bao Y, Xiao J. CCLHunter: An efficient toolkit for cancer cell line authentication. Comput Struct Biotechnol J 2023; 21:4675-4682. [PMID: 37841327 PMCID: PMC10568302 DOI: 10.1016/j.csbj.2023.09.040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2023] [Revised: 09/28/2023] [Accepted: 09/28/2023] [Indexed: 10/17/2023]
Abstract
Cancer cell lines are essential in cancer research, yet accurate authentication of these cell lines can be challenging, particularly for consanguineous cell lines with close genetic similarities. We introduce a new Cancer Cell Line Hunter (CCLHunter) method to tackle this challenge. This approach utilizes the information of single nucleotide polymorphisms, expression profiles, and kindred topology to authenticate 1389 human cancer cell lines accurately. CCLHunter can precisely and efficiently authenticate cell lines from consanguineous lineages and those derived from other tissues of the same individual. Our evaluation results indicate that CCLHunter has a complete accuracy rate of 93.27%, with an accuracy of 89.28% even for consanguineous cell lines, outperforming existing methods. Additionally, we provide convenient access to CCLHunter through standalone software and a web server at https://ngdc.cncb.ac.cn/cclhunter.
Affiliation(s)
- Congfan Bu
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- Xinchang Zheng
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- Jialin Mai
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Zhi Nie
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Jingyao Zeng
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- Qiheng Qian
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Tianyi Xu
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Yanling Sun
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Yiming Bao
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Jingfa Xiao
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
3
Lang O, Srivastava D, Pugh BF, Lai WK. GenoPipe: identifying the genotype of origin within (epi)genomic datasets. bioRxiv 2023:2023.03.14.532660. [PMID: 36993164 PMCID: PMC10055126 DOI: 10.1101/2023.03.14.532660] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Confidence in experimental results is critical for discovery. As the scale of data generation in genomics has grown exponentially, experimental error has likely kept pace despite the best efforts of many laboratories. Technical mistakes can and do occur at nearly every stage of a genomics assay (e.g., cell line contamination, reagent swapping, tube mislabelling) and are often difficult to identify post-execution. However, the DNA sequenced in genomic experiments contains certain markers (e.g., indels) that can often be ascertained forensically from experimental datasets. We developed the Genotype validation Pipeline (GenoPipe), a suite of heuristic tools that operate together directly on raw and aligned sequencing data from individual high-throughput sequencing experiments to characterize the underlying genome of the source material. We demonstrate how GenoPipe validates and rescues erroneously annotated experiments by identifying unique markers inherent to an organism’s genome (i.e., epitope insertions, gene deletions, and SNPs).
4
Menke J, Eckmann P, Ozyurt IB, Roelandse M, Anderson N, Grethe J, Gamst A, Bandrowski A. Establishing Institutional Scores With the Rigor and Transparency Index: Large-scale Analysis of Scientific Reporting Quality. J Med Internet Res 2022; 24:e37324. [PMID: 35759334 PMCID: PMC9274430 DOI: 10.2196/37324] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 02/15/2022] [Revised: 05/10/2022] [Accepted: 05/23/2022] [Indexed: 12/11/2022]
Abstract
BACKGROUND Improving rigor and transparency measures should lead to improvements in reproducibility across the scientific literature; however, the assessment of measures of transparency tends to be very difficult if performed manually. OBJECTIVE This study addresses the enhancement of the Rigor and Transparency Index (RTI, version 2.0), which attempts to automatically assess the rigor and transparency of journals, institutions, and countries using manuscripts scored on criteria found in reproducibility guidelines (eg, Materials Design, Analysis, and Reporting checklist criteria). METHODS The RTI tracks 27 entity types using natural language processing techniques such as Bidirectional Long Short-term Memory Conditional Random Field-based models and regular expressions; this allowed us to assess over 2 million papers accessed through PubMed Central. RESULTS Between 1997 and 2020 (where data were readily available in our data set), rigor and transparency measures showed general improvement (RTI 2.29 to 4.13), suggesting that authors are taking the need for improved reporting seriously. The top-scoring journals in 2020 were the Journal of Neurochemistry (6.23), British Journal of Pharmacology (6.07), and Nature Neuroscience (5.93). We extracted the institution and country of origin from the author affiliations to expand our analysis beyond journals. Among institutions publishing >1000 papers in 2020 (in the PubMed Central open access set), Capital Medical University (4.75), Yonsei University (4.58), and University of Copenhagen (4.53) were the top performers in terms of RTI. In country-level performance, we found that Ethiopia and Norway consistently topped the RTI charts of countries with 100 or more papers per year. In addition, we tested our assumption that the RTI may serve as a reliable proxy for scientific replicability (ie, a high RTI represents papers containing sufficient information for replication efforts). 
Using work by the Reproducibility Project: Cancer Biology, we determined that replication papers (RTI 7.61, SD 0.78) scored significantly higher (P<.001) than the original papers (RTI 3.39, SD 1.12), which according to the project required additional information from authors to begin replication efforts. CONCLUSIONS These results align with our view that RTI may serve as a reliable proxy for scientific replicability. Unfortunately, RTI measures for journals, institutions, and countries fall short of the replicated paper average. If we consider the RTI of these replication studies as a target for future manuscripts, more work will be needed to ensure that the average manuscript contains sufficient information for replication attempts.
Affiliation(s)
- Joe Menke
- Center for Research in Biological Systems, University of California, San Diego, La Jolla, CA, United States
- SciCrunch Inc., San Diego, CA, United States
- Peter Eckmann
- SciCrunch Inc., San Diego, CA, United States
- Department of Neuroscience, University of California, San Diego, La Jolla, CA, United States
- Ibrahim Burak Ozyurt
- SciCrunch Inc., San Diego, CA, United States
- Department of Neuroscience, University of California, San Diego, La Jolla, CA, United States
- Jeffrey Grethe
- SciCrunch Inc., San Diego, CA, United States
- Department of Neuroscience, University of California, San Diego, La Jolla, CA, United States
- Anthony Gamst
- Department of Mathematics, University of California, San Diego, CA, United States
- Anita Bandrowski
- SciCrunch Inc., San Diego, CA, United States
- Department of Neuroscience, University of California, San Diego, La Jolla, CA, United States
5
Chaves-Urbano B, Hernando B, Garcia MJ, Macintyre G. CNpare: matching DNA copy number profiles. Bioinformatics 2022; 38:3638-3641. [PMID: 35640971 PMCID: PMC9272807 DOI: 10.1093/bioinformatics/btac371] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/21/2021] [Revised: 04/19/2022] [Accepted: 05/27/2022] [Indexed: 11/14/2022]
Abstract
SUMMARY Selecting the optimal cancer cell line for an experiment can be challenging given the diversity of lines available. Here, we present CNpare, which identifies similar cell line models based on genome-wide DNA copy number. AVAILABILITY CNpare is available as an R package at https://github.com/macintyrelab/CNpare. All analysis performed in the manuscript can be reproduced via the code found at https://github.com/macintyrelab/CNpare_analyses. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Blas Chaves-Urbano
- Computational Oncology Group, Spanish National Cancer Research Centre (CNIO), 28029 Madrid, Spain
- Barbara Hernando
- Computational Oncology Group, Spanish National Cancer Research Centre (CNIO), 28029 Madrid, Spain
- Maria J Garcia
- Computational Oncology Group, Spanish National Cancer Research Centre (CNIO), 28029 Madrid, Spain
- Geoff Macintyre
- Computational Oncology Group, Spanish National Cancer Research Centre (CNIO), 28029 Madrid, Spain
6
Kim S, Park JW, Seo H, Kim M, Park J, Kim G, Lee JO, Shin Y, Bae JM, Koo B, Jeong S, Ku J. Multifocal Organoid Capturing of Colon Cancer Reveals Pervasive Intratumoral Heterogenous Drug Responses. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2022; 9:e2103360. [PMID: 34918496 PMCID: PMC8844556 DOI: 10.1002/advs.202103360] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 08/03/2021] [Revised: 11/15/2021] [Indexed: 06/14/2023]
Abstract
Intratumor heterogeneity (ITH) stands as one of the main difficulties in the treatment of colorectal cancer (CRC) as it causes the development of resistant clones and leads to heterogeneous drug responses. Here, 12 sets of patient-derived organoids (PDOs) and cell lines (PDCs) isolated from multiple regions of single tumors from 12 patients, capturing ITH by multiregion sampling of individual tumors, are presented. Whole-exome sequencing and RNA sequencing of the 12 sets are performed. The PDOs and PDCs of the 12 sets are also analyzed with a clinically relevant 24-compound library to assess their drug responses. The results reveal unexpectedly widespread subregional heterogeneity among PDOs and PDCs isolated from a single tumor, which is manifested by genetic and transcriptional heterogeneity and strong variance in drug responses, while each PDO still recapitulates the major histologic, genomic, and transcriptomic characteristics of the primary tumor. The data suggest an imminent drawback of single biopsy-originated PDO-based clinical diagnosis in evaluating CRC patient responses. Instead, the results indicate the importance of targeting common somatic driver mutations positioned in the trunk of all tumor subregional clones in parallel with a comprehensive understanding of the molecular ITH of each tumor.
Affiliation(s)
- Soon‐Chan Kim
- Korean Cell Line Bank, Laboratory of Cell Biology, Cancer Research Institute, Seoul National University College of Medicine, Seoul 03080, South Korea
- Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul 03080, South Korea
- Cancer Research Institute, Seoul National University, Seoul 03080, South Korea
- Ischemic/Hypoxic Disease Institute, Seoul National University College of Medicine, Seoul 03080, South Korea
- Ji Won Park
- Cancer Research Institute, Seoul National University, Seoul 03080, South Korea
- Department of Surgery, Seoul National University College of Medicine, Seoul 03080, South Korea
- Division of Colorectal Surgery, Department of Surgery, Seoul National University Hospital, Seoul 03080, South Korea
- Ha‐Young Seo
- Korean Cell Line Bank, Laboratory of Cell Biology, Cancer Research Institute, Seoul National University College of Medicine, Seoul 03080, South Korea
- Cancer Research Institute, Seoul National University, Seoul 03080, South Korea
- Minjung Kim
- Cancer Research Institute, Seoul National University, Seoul 03080, South Korea
- Department of Surgery, Seoul National University College of Medicine, Seoul 03080, South Korea
- Division of Colorectal Surgery, Department of Surgery, Seoul National University Hospital, Seoul 03080, South Korea
- Jae‐Hyeon Park
- Korean Cell Line Bank, Laboratory of Cell Biology, Cancer Research Institute, Seoul National University College of Medicine, Seoul 03080, South Korea
- Cancer Research Institute, Seoul National University, Seoul 03080, South Korea
- Ga‐Hye Kim
- Korean Cell Line Bank, Laboratory of Cell Biology, Cancer Research Institute, Seoul National University College of Medicine, Seoul 03080, South Korea
- Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul 03080, South Korea
- Cancer Research Institute, Seoul National University, Seoul 03080, South Korea
- Ja Oh Lee
- Korean Cell Line Bank, Laboratory of Cell Biology, Cancer Research Institute, Seoul National University College of Medicine, Seoul 03080, South Korea
- Cancer Research Institute, Seoul National University, Seoul 03080, South Korea
- Young‐Kyoung Shin
- Korean Cell Line Bank, Laboratory of Cell Biology, Cancer Research Institute, Seoul National University College of Medicine, Seoul 03080, South Korea
- Cancer Research Institute, Seoul National University, Seoul 03080, South Korea
- Ischemic/Hypoxic Disease Institute, Seoul National University College of Medicine, Seoul 03080, South Korea
- Jeong Mo Bae
- Department of Pathology, Seoul National University College of Medicine, Seoul 03080, South Korea
- Bon‐Kyoung Koo
- Institute of Molecular Biotechnology of the Austrian Academy of Sciences (IMBA), Vienna Biocenter (VBC), Dr. Bohr‐Gasse 3, Vienna 1030, Austria
- Seung‐Yong Jeong
- Cancer Research Institute, Seoul National University, Seoul 03080, South Korea
- Department of Surgery, Seoul National University College of Medicine, Seoul 03080, South Korea
- Division of Colorectal Surgery, Department of Surgery, Seoul National University Hospital, Seoul 03080, South Korea
- Ja‐Lok Ku
- Korean Cell Line Bank, Laboratory of Cell Biology, Cancer Research Institute, Seoul National University College of Medicine, Seoul 03080, South Korea
- Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul 03080, South Korea
- Cancer Research Institute, Seoul National University, Seoul 03080, South Korea
- Ischemic/Hypoxic Disease Institute, Seoul National University College of Medicine, Seoul 03080, South Korea
7
Han S, Basting PJ, Dias GB, Luhur A, Zelhof AC, Bergman CM. Transposable element profiles reveal cell line identity and loss of heterozygosity in Drosophila cell culture. Genetics 2021; 219:6321957. [PMID: 34849875 PMCID: PMC8633141 DOI: 10.1093/genetics/iyab113] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 05/01/2021] [Accepted: 07/01/2021] [Indexed: 11/28/2022]
Abstract
Cell culture systems allow key insights into biological mechanisms yet suffer from irreproducible outcomes in part because of cross-contamination or mislabeling of cell lines. Cell line misidentification can be mitigated by the use of genotyping protocols, which have been developed for human cell lines but are lacking for many important model species. Here, we leverage the classical observation that transposable elements (TEs) proliferate in cultured Drosophila cells to demonstrate that genome-wide TE insertion profiles can reveal the identity and provenance of Drosophila cell lines. We identify multiple cases where TE profiles clarify the origin of Drosophila cell lines (Sg4, mbn2, and OSS_E) relative to published reports, and also provide evidence that insertions from only a subset of long-terminal repeat retrotransposon families are necessary to mark Drosophila cell line identity. We also develop a new bioinformatics approach to detect TE insertions and estimate intra-sample allele frequencies in legacy whole-genome sequencing data (called ngs_te_mapper2), which revealed loss of heterozygosity as a mechanism shaping the unique TE profiles that identify Drosophila cell lines. Our work contributes to the general understanding of the forces impacting metazoan genomes as they evolve in cell culture and paves the way for high-throughput protocols that use TE insertions to authenticate cell lines in Drosophila and other organisms.
Affiliation(s)
- Shunhua Han
- Department of Genetics and Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
- Preston J Basting
- Department of Genetics and Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
- Guilherme B Dias
- Department of Genetics and Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
- Department of Genetics, University of Georgia, Athens, GA 30602, USA
- Arthur Luhur
- Drosophila Genomics Resource Center, Indiana University, Bloomington, IN 47405, USA
- Department of Biology, Indiana University, Bloomington, IN 47405, USA
- Andrew C Zelhof
- Drosophila Genomics Resource Center, Indiana University, Bloomington, IN 47405, USA
- Department of Biology, Indiana University, Bloomington, IN 47405, USA
- Casey M Bergman
- Department of Genetics and Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
- Department of Genetics, University of Georgia, Athens, GA 30602, USA
8
Tabatabai M, Bailey S, Bursac Z, Tabatabai H, Wilus D, Singh KP. An introduction to new robust linear and monotonic correlation coefficients. BMC Bioinformatics 2021; 22:170. [PMID: 33789571 PMCID: PMC8011137 DOI: 10.1186/s12859-021-04098-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 09/17/2020] [Accepted: 03/22/2021] [Indexed: 01/10/2023]
Abstract
BACKGROUND The most common measure of association between two continuous variables is the Pearson correlation (Maronna et al., Robust Statistics, 2019). When outliers are present, Pearson does not accurately measure association and robust measures are needed. This article introduces three new robust measures of correlation: Taba (T), TabWil (TW), and TabWil rank (TWR). The correlation estimators T and TW measure a linear association between two continuous or ordinal variables, whereas TWR measures a monotonic association. The robustness of these proposed measures in comparison with Pearson (P), Spearman (S), Quadrant (Q), Median (M), and Minimum Covariance Determinant (MCD) is examined through simulation. Taba distance is used to analyze genes, and statistical tests were used to identify those genes most significantly associated with Williams Syndrome (WS). RESULTS Based on the root mean square error (RMSE) and bias, the three proposed correlation measures are highly competitive when compared to classical measures such as P and S as well as robust measures such as Q, M, and MCD. Our findings indicate TBL2 was the most significant gene among patients diagnosed with WS and had the most significant reduction in gene expression level when compared with control (P value = 6.37E-05). CONCLUSIONS Overall, when the distribution is bivariate Log-Normal or bivariate Weibull, TWR performs best in terms of bias and T performs best with respect to RMSE. Under the Normal distribution, MCD performs well with respect to bias and RMSE; but TW, TWR, T, S, and P correlations were in close proximity. The identification of TBL2 may serve as a diagnostic tool for WS patients. A Taba R package has been developed and is available for use to perform all necessary computations for the proposed methods.
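The abstract's starting point, that a single outlier can distort the Pearson correlation while a rank-based measure is affected less, can be illustrated with a minimal sketch. It implements only the classical Pearson and Spearman coefficients; it does not implement T, TW, or TWR, whose definitions are not given here. The data values are made up for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Classical Pearson product-moment correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Illustrative (made-up) data: a perfect line, then the same line
# with its last point replaced by one gross outlier.
x = list(range(1, 11))
y_clean = [2 * v for v in x]
y_outlier = y_clean[:-1] + [-100]

r_clean = pearson(x, y_clean)
r_outlier = pearson(x, y_outlier)
rho_outlier = spearman(x, y_outlier)
print(r_clean, r_outlier, rho_outlier)
```

With these values the single outlier turns the Pearson coefficient negative, whereas the Spearman coefficient stays positive although it is still reduced. This is the kind of sensitivity that motivates robust estimators.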
Affiliation(s)
- Zoran Bursac
- Department of Biostatistics, Florida International University, Miami, FL 33199 USA
- Habib Tabatabai
- Department of Civil and Environmental Engineering, University of Wisconsin Milwaukee, Milwaukee, WI 53211 USA
- Derek Wilus
- Meharry Medical College, Nashville, TN 37208 USA
- Karan P. Singh
- Department of Epidemiology and Biostatistics, University of Texas Health Sciences Center at Tyler, Tyler, TX 75708 USA
9
Khundmiri SJ, Chen L, Lederer ED, Yang CR, Knepper MA. Transcriptomes of Major Proximal Tubule Cell Culture Models. J Am Soc Nephrol 2021; 32:86-97. [PMID: 33122286 PMCID: PMC7894662 DOI: 10.1681/asn.2020010009] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Received: 01/03/2020] [Accepted: 09/16/2020] [Indexed: 02/04/2023]
Abstract
BACKGROUND Cultured cell lines are widely used for research in the physiology, pathophysiology, toxicology, and pharmacology of the renal proximal tubule. The lines that are most appropriate for a given use depend upon the genes expressed. New tools for transcriptomic profiling using RNA sequencing (RNA-Seq) make it possible to catalog expressed genes in each cell line. METHODS Fourteen different proximal tubule cell lines, representing six species, were grown on permeable supports under conditions specific for the respective lines. RNA-Seq followed standard procedures. RESULTS Transcripts expressed in cell lines variably matched transcripts selectively expressed in native proximal tubule. Opossum kidney (OK) cells displayed the highest percentage match (45% of proximal marker genes [TPM threshold = 15]), with pig kidney cells (LLC-PK1) close behind (39%). Lower-percentage matches were seen for various human lines, including HK-2 (26%), and lines from rodent kidneys, such as NRK-52E (23%). Nominally identical OK cells from different sources differed substantially in expression of proximal tubule markers. Mapping cell line transcriptomes to gene sets for various proximal tubule functions (sodium and water transport, protein transport, metabolic functions, endocrine functions) showed that different lines may be optimal for experimentally modeling each function. An online resource (https://esbl.nhlbi.nih.gov/JBrowse/KCT/) has been created to interrogate cell line transcriptome data. Proteomic analysis of NRK-52E cells confirmed low expression of many proximal tubule marker proteins. CONCLUSIONS No cell line fully matched the transcriptome of native proximal tubule cells. However, some of the lines tested are suitable for the study of particular metabolic and transport processes seen in the proximal tubule.
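The "percentage match" reported above can be read as the share of marker genes whose expression in a cell line reaches a TPM threshold. The sketch below shows that calculation under stated assumptions: the function name, the gene labels, and the TPM values are hypothetical, and treating the threshold as inclusive (TPM >= 15) is an assumption, since the abstract does not say whether the comparison is strict.

```python
def percent_marker_match(tpm_by_gene, marker_genes, tpm_threshold=15.0):
    """Percentage of marker genes expressed at or above the TPM threshold.

    Genes absent from tpm_by_gene are treated as not expressed (TPM 0).
    The inclusive comparison (>=) is an assumption of this sketch.
    """
    if not marker_genes:
        raise ValueError("marker_genes must not be empty")
    expressed = [
        gene for gene in marker_genes
        if tpm_by_gene.get(gene, 0.0) >= tpm_threshold
    ]
    return 100.0 * len(expressed) / len(marker_genes)


# Hypothetical marker list and made-up TPM values for one cell line.
markers = ["MARKER_A", "MARKER_B", "MARKER_C", "MARKER_D"]
cell_line_tpm = {"MARKER_A": 120.0, "MARKER_B": 15.0, "MARKER_C": 3.2}

match = percent_marker_match(cell_line_tpm, markers)
print(match)  # 50.0: two of the four markers reach the threshold
```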
Affiliation(s)
- Syed J. Khundmiri
- Department of Physiology and Biophysics, Howard University College of Medicine, Washington, DC
- Epithelial Systems Biology Laboratory, Systems Biology Center, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland
- Lihe Chen
- Epithelial Systems Biology Laboratory, Systems Biology Center, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland
- Eleanor D. Lederer
- Division of Nephrology and Hypertension, School of Medicine, University of Louisville and Robley Rex Veterans Affairs Medical Center, Louisville, Kentucky
- Chin-Rang Yang
- Epithelial Systems Biology Laboratory, Systems Biology Center, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland
- Mark A. Knepper
- Epithelial Systems Biology Laboratory, Systems Biology Center, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland
10
Zhang Q, Luo M, Liu CJ, Guo AY. CCLA: an accurate method and web server for cancer cell line authentication using gene expression profiles. Brief Bioinform 2020; 22:5854406. [PMID: 32510568 DOI: 10.1093/bib/bbaa093] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 02/27/2020] [Revised: 04/26/2020] [Accepted: 04/28/2020] [Indexed: 01/28/2023]
Abstract
Cancer cell lines (CCLs) are important model systems that play critical roles in cancer research. The misidentification and contamination of CCLs are serious problems, leading to unreliable results and waste of resources. Current methods for CCL authentication are mainly based on CCL-specific genetic polymorphisms, whereas no method is available for CCL authentication using gene expression profiles. Here, we developed a novel method and a web server of the same name (CCLA, Cancer Cell Line Authentication, http://bioinfo.life.hust.edu.cn/web/CCLA/) to authenticate 1291 human CCLs of 28 tissues using gene expression profiles. CCLA showed an excellent speed advantage and high accuracy for CCL authentication, with a top-1 accuracy of 96.58% or 92.15% (top-3 accuracy of 100% or 95.11%) for microarray or RNA-Seq validation data (719 samples, 461 CCLs), respectively. To the best of our knowledge, CCLA is the first approach to authenticate CCLs using gene expression data. Users can freely and conveniently authenticate CCLs using gene expression profiles or NCBI GEO accessions on the CCLA website.
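The top-1 and top-3 accuracy figures quoted above follow the usual definition: a sample counts as correct when its true cell line appears among the first k candidates returned by the classifier. A minimal sketch of that metric follows. The function name, cell line labels, and rankings are hypothetical, and the sketch says nothing about how CCLA itself ranks candidates.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Percentage of samples whose true label is among the first k candidates.

    ranked_predictions: one list per sample, best candidate first.
    true_labels: the known identity of each sample, in the same order.
    """
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")
    hits = sum(
        1 for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return 100.0 * hits / len(true_labels)


# Made-up rankings for four validation samples.
predictions = [
    ["line_A", "line_B", "line_C"],  # correct at rank 1
    ["line_B", "line_D", "line_A"],  # correct at rank 3
    ["line_C", "line_A", "line_B"],  # correct at rank 1
    ["line_A", "line_B", "line_C"],  # truth not in the top 3
]
truth = ["line_A", "line_A", "line_C", "line_D"]

top1 = top_k_accuracy(predictions, truth, k=1)
top3 = top_k_accuracy(predictions, truth, k=3)
print(top1, top3)  # 50.0 75.0
```

Top-k accuracy can only stay the same or rise as k grows, which is why the top-3 figures in the abstract are at least as high as the top-1 figures.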
|
11
|
Zhang SC, Wang MY, Feng JR, Chang Y, Ji SR, Wu Y. Reversible promoter methylation determines fluctuating expression of acute phase proteins. eLife 2020; 9:51317. [PMID: 32223889 PMCID: PMC7136028 DOI: 10.7554/elife.51317] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 08/25/2019] [Accepted: 03/27/2020] [Indexed: 12/15/2022]
Abstract
Acute phase reactants (APRs) are secretory proteins exhibiting large expression changes in response to proinflammatory cytokines. Here we show that the expression pattern of a major human APR, C-reactive protein (CRP), is causally determined by DNMT3A and TET2-tuned promoter methylation status. CRP features a CpG-poor promoter with its CpG motifs located in binding sites of STAT3, C/EBP-β and NF-κB. These motifs are highly methylated at the resting state, but undergo STAT3- and NF-κB-dependent demethylation upon cytokine stimulation, leading to markedly enhanced recruitment of C/EBP-β that boosts CRP expression. Withdrawal of cytokines, by contrast, results in a rapid recovery of promoter methylation and termination of CRP induction. Further analysis suggests that reversible methylation also regulates the expression of highly inducible genes carrying CpG-poor promoters, with APRs as representatives. Therefore, these CpG-poor promoters may evolve CpG-containing TF binding sites to harness dynamic methylation for prompt and reversible responses.
Affiliation(s)
- Shi-Chao Zhang
- MOE Key Laboratory of Cell Activities and Stress Adaptations, School of Life Sciences, Lanzhou University, Lanzhou, China
- Ming-Yu Wang
- MOE Key Laboratory of Cell Activities and Stress Adaptations, School of Life Sciences, Lanzhou University, Lanzhou, China
- Jun-Rui Feng
- MOE Key Laboratory of Environment and Genes Related to Diseases, School of Basic Medical Sciences, Xi'an Jiaotong University, Xi'an, China
- Yue Chang
- MOE Key Laboratory of Cell Activities and Stress Adaptations, School of Life Sciences, Lanzhou University, Lanzhou, China
- Shang-Rong Ji
- MOE Key Laboratory of Cell Activities and Stress Adaptations, School of Life Sciences, Lanzhou University, Lanzhou, China
- Yi Wu
- MOE Key Laboratory of Environment and Genes Related to Diseases, School of Basic Medical Sciences, Xi'an Jiaotong University, Xi'an, China
|
12
|
Abstract
Cancer cell lines serve as invaluable model systems for cancer biology research and help in evaluating the efficacy of new therapeutic agents. However, cell line contamination and misidentification have become one of the most pressing problems affecting biomedical research. Available methods of cell line authentication suffer from limited access and are time-consuming and often costly for many researchers; hence a new, cost-effective approach to cell line authentication is needed. To this end, we developed a new method called CeL-ID for cell line authentication using genomic variants derived as a byproduct from RNA-seq data. CeL-ID was trained and tested on more than 900 publicly available RNA-seq datasets from the Cancer Cell Line Encyclopedia (CCLE) project, including the most frequently used adult and pediatric cancer cell lines. We generated cell line-specific variant profiles from RNA-seq data using our in-house pipeline, followed by pairwise variant profile comparison between cell lines using the allele frequencies and depth-of-coverage values of the entire variant set. Comparative analysis of variant profiles revealed that they differ significantly from cell line to cell line, whereas identical, synonymous and derivative cell lines share high variant identity and their allelic fractions are highly correlated, which is the basis of this cell line authentication protocol. Additionally, CeL-ID includes a method to estimate possible cross-contamination using a linear mixture model with any possible CCLE cells in case no perfect match is detected.
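The comparison described in this abstract can be illustrated with a minimal sketch. It is not the CeL-ID implementation: the profile layout (variant key mapped to allele fraction), the function names, the treatment of missing variants as allele fraction 0, and the two-component least-squares mixture estimate are all assumptions made for illustration, and depth-of-coverage filtering is omitted.

```python
from math import sqrt
from typing import Dict, Tuple

# Hypothetical profile layout: variant key (e.g. "chr1:12345:A>G") -> allele fraction in [0, 1].
Profile = Dict[str, float]


def pearson(xs, ys) -> float:
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    n = len(xs)
    if n < 2:
        return 0.0
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


def allele_fraction_correlation(a: Profile, b: Profile) -> float:
    """Correlate allele fractions over the union of variants.

    Assumption: a variant absent from a profile counts as allele fraction 0.
    """
    keys = sorted(set(a) | set(b))
    return pearson([a.get(k, 0.0) for k in keys], [b.get(k, 0.0) for k in keys])


def best_match(query: Profile, references: Dict[str, Profile]) -> Tuple[str, float]:
    """Return the reference name with the highest correlation, and that score."""
    scores = {name: allele_fraction_correlation(query, ref)
              for name, ref in references.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


def mixture_fraction(query: Profile, a: Profile, b: Profile) -> float:
    """Least-squares estimate of p in query ~ p*a + (1-p)*b, clipped to [0, 1].

    A simplified two-component stand-in for a linear mixture model.
    """
    keys = set(query) | set(a) | set(b)
    num = den = 0.0
    for k in keys:
        diff = a.get(k, 0.0) - b.get(k, 0.0)
        num += (query.get(k, 0.0) - b.get(k, 0.0)) * diff
        den += diff * diff
    if den == 0:
        raise ValueError("reference profiles are identical; fraction is undefined")
    return min(1.0, max(0.0, num / den))
```

In this sketch a query sample would be assigned to the reference with the highest correlation, and `mixture_fraction` would only be consulted when no reference correlates strongly; any cutoff for "strongly" would have to be chosen empirically.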
Affiliation(s)
- Tabrez A Mohammad
- Greehey Children's Cancer Research Institute, UT Health San Antonio, San Antonio, Texas, USA
- Yidong Chen
- Greehey Children's Cancer Research Institute, UT Health San Antonio, San Antonio, Texas, USA
- Department of Population Health Sciences, UT Health San Antonio, San Antonio, Texas, USA
|
13
|
Wu Z, Yan J, Wang K, Liu X, Guo Y, Zhi D, Ruan J, Zhao Z. The International Conference on Intelligent Biology and Medicine (ICIBM) 2018: genomics with bigger data and wider applications. BMC Genomics 2019; 20:80. [PMID: 30712512 PMCID: PMC6360715 DOI: 10.1186/s12864-018-5369-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 12/23/2022]
Abstract
The sixth International Conference on Intelligent Biology and Medicine (ICIBM) took place in Los Angeles, California, USA on June 10-12, 2018. The conference featured eleven regular scientific sessions, four tutorials, one poster session, four keynote talks, and four eminent scholar talks. The scientific program covered a wide range of topics from bench to bedside, including 3D genome organization, reconstruction of large-scale evolution of genomes and gene functions, artificial intelligence in biological and biomedical fields, and precision medicine. Both method development and application in genomic research continued to be a main component of the conference, including studies on genetic variants, regulation of transcription, genetic-epigenetic interaction at both the single-cell and tissue level, and artificial intelligence. Here, we summarize the conference and briefly introduce the four high-quality papers selected for publication in BMC Genomics, which cover novel methodology development or innovative data analysis.
Affiliation(s)
- Zhijin Wu
- Department of Biostatistics, Brown University, Providence, RI 02912 USA
- Jingwen Yan
- Department of Biohealth Informatics, Indiana University Purdue University Indianapolis, Indianapolis, IN 46202 USA
- Kai Wang
- Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104 USA
- Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA 19104 USA
- Xiaoming Liu
- College of Public Health, University of South Florida, Tampa, FL 33612 USA
- Yan Guo
- Comprehensive Cancer Center, University of New Mexico, Albuquerque, NM 87131 USA
- Degui Zhi
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030 USA
- Jianhua Ruan
- Department of Computer Science, The University of Texas at San Antonio, San Antonio, TX 78249 USA
- Zhongming Zhao
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030 USA
|