1
Ko G, Lee JH, Sim YM, Song W, Yoon BH, Byeon I, Lee BH, Kim SO, Choi J, Jang I, Kim H, Yang JO, Jang K, Kim S, Kim JH, Jeon J, Jung J, Hwang S, Park JH, Kim PG, Kim SY, Lee B. KoNA: Korean Nucleotide Archive as A New Data Repository for Nucleotide Sequence Data. GENOMICS, PROTEOMICS & BIOINFORMATICS 2024; 22:qzae017. [PMID: 38862433 DOI: 10.1093/gpbjnl/qzae017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2022] [Revised: 11/20/2023] [Accepted: 01/08/2024] [Indexed: 06/13/2024]
Abstract
During the last decade, the generation and accumulation of petabase-scale high-throughput sequencing data have resulted in great challenges, including access to human data, as well as transfer, storage, and sharing of enormous amounts of data. To promote data-driven biological research, the Korean government announced that all biological data generated from government-funded research projects should be deposited at the Korea BioData Station (K-BDS), which consists of multiple databases for individual data types. Here, we introduce the Korean Nucleotide Archive (KoNA), a repository of nucleotide sequence data. As of July 2022, the Korean Read Archive in KoNA has collected over 477 TB of raw next-generation sequencing data from national genome projects. To ensure data quality and prepare for international alignment, a standard operating procedure was adopted, which is similar to that of the International Nucleotide Sequence Database Collaboration. The standard operating procedure includes quality control processes for submitted data and metadata using an automated pipeline, followed by manual examination. To ensure fast and stable data transfer, a high-speed transmission system called GBox is used in KoNA. Furthermore, the data uploaded to or downloaded from KoNA through GBox can be readily processed using a cloud computing service called Bio-Express. This seamless coupling of KoNA, GBox, and Bio-Express enhances the data experience, including submission, access, and analysis of raw nucleotide sequences. KoNA not only satisfies the unmet needs for a national sequence repository in Korea but also provides datasets to researchers globally and contributes to advances in genomics. The KoNA is available at https://www.kobic.re.kr/kona/.
Affiliation(s)
- Gunhwan Ko
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, Daejeon 34141, Republic of Korea
- Jae Ho Lee
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, Daejeon 34141, Republic of Korea
- Young Mi Sim
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, Daejeon 34141, Republic of Korea
- Wangho Song
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, Daejeon 34141, Republic of Korea
- Byung-Ha Yoon
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, Daejeon 34141, Republic of Korea
- Iksu Byeon
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, Daejeon 34141, Republic of Korea
- Bang Hyuck Lee
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, Daejeon 34141, Republic of Korea
- Sang-Ok Kim
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, Daejeon 34141, Republic of Korea
- Jinhyuk Choi
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, Daejeon 34141, Republic of Korea
- Insoo Jang
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, Daejeon 34141, Republic of Korea
- Hyerin Kim
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, Daejeon 34141, Republic of Korea
- Jin Ok Yang
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, Daejeon 34141, Republic of Korea
- Kiwon Jang
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, Daejeon 34141, Republic of Korea
- Sora Kim
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, Daejeon 34141, Republic of Korea
- Jong-Hwan Kim
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, Daejeon 34141, Republic of Korea
- Jongbum Jeon
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, Daejeon 34141, Republic of Korea
- Jaeeun Jung
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, Daejeon 34141, Republic of Korea
- Seungwoo Hwang
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, Daejeon 34141, Republic of Korea
- Ji-Hwan Park
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, Daejeon 34141, Republic of Korea
- Pan-Gyu Kim
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, Daejeon 34141, Republic of Korea
- Seon-Young Kim
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, Daejeon 34141, Republic of Korea
- Byungwook Lee
- Korea Bioinformation Center, Korea Research Institute of Bioscience & Biotechnology, Daejeon 34141, Republic of Korea
2
Koblitz J, Dirks WG, Eberth S, Nagel S, Steenpass L, Pommerenke C. DSMZCellDive: Diving into high-throughput cell line data. F1000Res 2022; 11:420. [PMID: 35949917 PMCID: PMC9334839 DOI: 10.12688/f1000research.111175.2] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Accepted: 07/18/2022] [Indexed: 01/08/2023] Open
Abstract
Human and animal cell lines serve as model systems in a wide range of life sciences such as cancer and infection research or drug screening. Reproducible data are highly dependent on authenticated, contaminant-free cell lines, no better delivered than by the official and certified biorepositories. Offering a web portal to high-throughput information on these model systems will facilitate working with, and comparing against, these references using data otherwise dispersed across different sources. We here provide DSMZCellDive to access a comprehensive data source on human and animal cell lines, freely available at celldive.dsmz.de. A wide variety of data sources are generated such as RNA-seq transcriptome data and STR (short tandem repeats) profiles. Several starting points ease entering the database via browsing, searching or visualising. This web tool is designed for further expansion on meta and high-throughput data to be generated in future. Explicated examples for the power of this novel tool include analysis of B-cell differentiation markers, homeo-oncogene expression, and measurement of genomic loss of heterozygosities by an enlarged STR panel of 17 loci. Sharing the data on cell lines by the biorepository itself will be of benefit to the scientific community since it (1) supports the selection of appropriate model cell lines, (2) ensures reliability, (3) avoids misleading data, (4) saves on additional experiments, and (5) serves as reference for genomic and gene expression data.
Affiliation(s)
- Julia Koblitz
- Leibniz Institute DSMZ - German Collection of Microorganisms and Cell Cultures, Braunschweig, 38124, Germany
- Wilhelm G. Dirks
- Leibniz Institute DSMZ - German Collection of Microorganisms and Cell Cultures, Braunschweig, 38124, Germany
- Sonja Eberth
- Leibniz Institute DSMZ - German Collection of Microorganisms and Cell Cultures, Braunschweig, 38124, Germany
- Stefan Nagel
- Leibniz Institute DSMZ - German Collection of Microorganisms and Cell Cultures, Braunschweig, 38124, Germany
- Laura Steenpass
- Leibniz Institute DSMZ - German Collection of Microorganisms and Cell Cultures, Braunschweig, 38124, Germany
- Claudia Pommerenke
- Leibniz Institute DSMZ - German Collection of Microorganisms and Cell Cultures, Braunschweig, 38124, Germany
3
Tibor Fekete J, Győrffy B. A unified platform enabling biomarker ranking and validation for 1,562 drugs using transcriptomic data of 1,250 cancer cell lines. Comput Struct Biotechnol J 2022; 20:2885-2894. [PMID: 35765648 PMCID: PMC9198269 DOI: 10.1016/j.csbj.2022.06.007] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 04/28/2022] [Revised: 06/01/2022] [Accepted: 06/01/2022] [Indexed: 12/21/2022] Open
Abstract
Intro In vitro cell line models provide a valuable resource to investigate compounds useful in the systemic chemotherapy of cancer. However, due to the dispersal of the data into several different databases, the utilization of these resources is limited. Here, our aim was to establish a platform enabling the validation of chemoresistance-associated genes and the ranking of available cell line models. Methods We processed four independent databases, DepMap, GDSC1, GDSC2, and CTRP. The gene expression data was quantile normalized and HUGO gene names were assigned to have unambiguous identification of the genes. Resistance values were exported for all agents. The correlation between gene expression and therapy resistance is computed using a ROC test. Results We combined four datasets with chemosensitivity data of 1562 agents and transcriptome-level gene expression of 1250 cancer cell lines. We have set up an online tool utilizing this database to correlate available cell line sensitivity data and treatment response in a uniform analysis pipeline (www.rocplot.com/cells). We employed the established pipeline to rank genes related to resistance against afatinib and lapatinib, two inhibitors of the tyrosine-kinase domain of ERBB2. Discussion The computational tool is useful 1) to correlate gene expression with resistance, 2) to identify and rank resistant and sensitive cell lines, and 3) to rank resistance associated genes, cancer hallmarks, and gene ontology pathways. The platform will be an invaluable support to speed up cancer research by validating gene-resistance correlations and by selecting the best cell line models for new experiments.
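The two computational steps named in the Methods above (quantile normalization of expression values, then a ROC-based association between one gene's expression and resistance) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `quantile_normalize` and `roc_auc`, the genes-by-cell-lines matrix layout, the naive tie handling, and the binary resistant/sensitive labels are all assumptions made for illustration.

```python
from typing import List, Sequence


def quantile_normalize(matrix: List[List[float]]) -> List[List[float]]:
    """Quantile-normalize a genes x cell-lines matrix column by column.

    Each column (cell line) is forced onto the same distribution: the
    mean of the sorted values across all columns. Ties are broken by
    row order here, which is a simplification (hypothetical choice).
    """
    n_genes = len(matrix)
    n_samples = len(matrix[0])
    columns = [[matrix[g][s] for g in range(n_genes)] for s in range(n_samples)]
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean value at each rank across columns.
    rank_means = [
        sum(col[r] for col in sorted_cols) / n_samples for r in range(n_genes)
    ]
    result = [[0.0] * n_samples for _ in range(n_genes)]
    for s, col in enumerate(columns):
        order = sorted(range(n_genes), key=lambda g: col[g])
        for rank, g in enumerate(order):
            result[g][s] = rank_means[rank]
    return result


def roc_auc(expression: Sequence[float], resistant: Sequence[bool]) -> float:
    """Area under the ROC curve for one gene.

    Computed as the fraction of (resistant, sensitive) cell-line pairs
    in which the resistant line has the higher expression; ties count
    as half. 0.5 means no association, 1.0 perfect separation.
    """
    pos = [x for x, r in zip(expression, resistant) if r]
    neg = [x for x, r in zip(expression, resistant) if not r]
    if not pos or not neg:
        raise ValueError("need at least one resistant and one sensitive line")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data: 3 genes x 2 cell lines (made-up values).
    print(quantile_normalize([[1.0, 8.0], [2.0, 4.0], [3.0, 6.0]]))
    # Toy data: one gene across 4 cell lines, 2 labeled resistant.
    print(roc_auc([1.0, 2.0, 3.0, 4.0], [False, False, True, True]))
```

Ranking genes, as the abstract describes, would then amount to computing such an AUC per gene for a given agent and sorting; how resistance values are thresholded into resistant and sensitive groups is not stated in the abstract and is left open here.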
Affiliation(s)
- János Tibor Fekete
- Semmelweis University, Department of Bioinformatics and 2nd Department of Pediatrics, Budapest H-1094, Hungary
- Research Center for Natural Sciences, Institute of Enzymology, Momentum Cancer Biomarker Research Group, Magyar tudósok körútja 2., Budapest H-1117, Hungary
- Balázs Győrffy
- Semmelweis University, Department of Bioinformatics and 2nd Department of Pediatrics, Budapest H-1094, Hungary
- Research Center for Natural Sciences, Institute of Enzymology, Momentum Cancer Biomarker Research Group, Magyar tudósok körútja 2., Budapest H-1117, Hungary
4
FAK inhibition radiosensitizes pancreatic ductal adenocarcinoma cells in vitro. Strahlenther Onkol 2020; 197:27-38. [PMID: 32705304 PMCID: PMC7801360 DOI: 10.1007/s00066-020-01666-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 03/01/2020] [Accepted: 06/29/2020] [Indexed: 12/16/2022]
Abstract
Introduction Focal adhesion kinase (FAK) is a nonreceptor tyrosine kinase protein frequently overexpressed in cancer and has been linked to an increase in the stem cell population of tumors, resistance to therapy, and metastatic spread. Pharmacological FAK inhibition in pancreatic cancer has received increased attention over the last few years, either alone or in combination with other therapeutics including chemotherapy and immunotherapy. However, its prognostic value and its role in radioresistance of pancreatic ductal adenocarcinoma (PDAC) is unknown. Methods and materials Using the TCGA and GTEx databases, we investigated the genetic alterations and mRNA expression levels of PTK2 (the encoding-gene for FAK) in normal pancreatic tissue and pancreatic cancer and its impact on patient survival. Furthermore, we evaluated the expression of FAK and its tyrosine domain Ty-397 in three pancreatic cancer cell lines. We went further and evaluated the role of a commercial FAK tyrosine kinase inhibitor VS-4718 on the viability and radiosensitization of the pancreatic cell lines as well as its effect on the extracellular matrix (ECM) production from the pancreatic stellate cells. Furthermore, we tested the effect of combining radiation with VS-4718 in a three-dimensional (3D) multicellular pancreatic tumor spheroid model. Results A database analysis revealed a relevant increase in genetic alterations and mRNA expression of the PTK2 in PDAC, which were associated with lower progression-free survival. In vitro, there was only variation in the basal phosphorylation level of FAK in cell lines. VS-4718 radiosensitized pancreatic cell lines only in the presence of ECM-producing pancreatic stellate cells and markedly reduced the ECM production in the stromal cells. Finally, using a 3D multicellular tumor model, the combination of VS-4718 and radiotherapy significantly reduced the growth of tumor aggregates.
Conclusion Pharmacological inhibition of FAK in pancreatic cancer could be a novel therapeutic strategy as our results show a radiosensitization effect of VS-4718 in vitro in a multicellular 2D- and in a 3D-model of pancreatic cancer. Electronic supplementary material The online version of this article (10.1007/s00066-020-01666-0) contains supplementary material, which is available to authorized users.
5
Ko G, Kim PG, Cho Y, Jeong S, Kim JY, Kim KH, Lee HY, Han J, Yu N, Ham S, Jang I, Kang B, Shin S, Kim L, Lee SW, Nam D, Kim JF, Kim N, Kim SY, Lee S, Roh TY, Lee B. Bioinformatics services for analyzing massive genomic datasets. Genomics Inform 2020; 18:e8. [PMID: 32224841 PMCID: PMC7120352 DOI: 10.5808/gi.2020.18.1.e8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/09/2020] [Accepted: 03/11/2020] [Indexed: 11/25/2022] Open
Abstract
The explosive growth of next-generation sequencing data has resulted in ultra-large-scale datasets and ensuing computational problems. In Korea, the amount of genomic data has been increasing rapidly in recent years. Leveraging these big data requires researchers to use large-scale computational resources and analysis pipelines. A promising solution for addressing this computational challenge is cloud computing, where CPUs, memory, storage, and programs are accessible in the form of virtual machines. Here, we present a cloud computing-based system, Bio-Express, that provides user-friendly, cost-effective analysis of massive genomic datasets. Bio-Express is loaded with predefined multi-omics data analysis pipelines, which are divided into genome, transcriptome, epigenome, and metagenome pipelines. Users can employ predefined pipelines or create a new pipeline for analyzing their own omics data. We also developed several web-based services for facilitating downstream analysis of genome data. The Bio-Express web service is freely available at https://www.bioexpress.re.kr/.
Affiliation(s)
- Gunhwan Ko
- Korea Bioinformation Center (KOBIC), KRIBB, Daejeon 34141, Korea
- Pan-Gyu Kim
- Korea Bioinformation Center (KOBIC), KRIBB, Daejeon 34141, Korea
- Youngbum Cho
- Genome Editing Research Center, KRIBB, Daejeon 34141, Korea
- Seongmun Jeong
- Genome Editing Research Center, KRIBB, Daejeon 34141, Korea
- Jae-Yoon Kim
- Genome Editing Research Center, KRIBB, Daejeon 34141, Korea
- Ho-Yeon Lee
- Genome Editing Research Center, KRIBB, Daejeon 34141, Korea
- Jiyeon Han
- Department of BioInformation Science, Ewha Womans University, Seoul 03760, Korea
- Namhee Yu
- Department of BioInformation Science, Ewha Womans University, Seoul 03760, Korea
- Seokjin Ham
- Department of Life Sciences and Division of Integrative Biosciences & Biotechnology, Pohang University of Science & Technology (POSTECH), Pohang 37673, Korea
- Insoon Jang
- Department of Life Sciences and Division of Integrative Biosciences & Biotechnology, Pohang University of Science & Technology (POSTECH), Pohang 37673, Korea
- Byunghee Kang
- Department of Life Sciences and Division of Integrative Biosciences & Biotechnology, Pohang University of Science & Technology (POSTECH), Pohang 37673, Korea
- Sunguk Shin
- Department of Systems Biology, Division of Life Sciences, and Institute for Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
- Lian Kim
- Bioposh Inc., Daejeon 34016, Korea
- Dougu Nam
- School of Life Sciences, Ulsan National Institute of Science and Technology, Ulsan 44919, Korea
- Jihyun F Kim
- Department of Systems Biology, Division of Life Sciences, and Institute for Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea; Strategic Initiative for Microbiomes in Agriculture and Food, Yonsei University, Seoul 03722, Korea
- Namshin Kim
- Genome Editing Research Center, KRIBB, Daejeon 34141, Korea
- Seon-Young Kim
- Genome Structure Research Center, KRIBB, Daejeon 34141, Korea
- Sanghyuk Lee
- Department of BioInformation Science, Ewha Womans University, Seoul 03760, Korea
- Tae-Young Roh
- Department of Life Sciences and Division of Integrative Biosciences & Biotechnology, Pohang University of Science & Technology (POSTECH), Pohang 37673, Korea; SysGenLab Inc., Pohang 37613, Korea
- Byungwook Lee
- Korea Bioinformation Center (KOBIC), KRIBB, Daejeon 34141, Korea