1
Kodama Y, Mashima J, Kosuge T, Ogasawara O. DDBJ update: the Genomic Expression Archive (GEA) for functional genomics data. Nucleic Acids Res 2020; 47:D69-D73. [PMID: 30357349 PMCID: PMC6323915 DOI: 10.1093/nar/gky1002] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 09/14/2018] [Accepted: 10/09/2018] [Indexed: 12/13/2022] Open
Abstract
The Genomic Expression Archive (GEA) for functional genomics data from microarray and high-throughput sequencing experiments has been established at the DNA Data Bank of Japan (DDBJ) Center (https://www.ddbj.nig.ac.jp), which is a member of the International Nucleotide Sequence Database Collaboration (INSDC) with the US National Center for Biotechnology Information and the European Bioinformatics Institute. The DDBJ Center collects nucleotide sequence data and associated biological information from researchers and also services the Japanese Genotype–phenotype Archive (JGA) with the National Bioscience Database Center for collecting human data. To automate the submission process, we have implemented the DDBJ BioSample validator which checks submitted records, auto-corrects their format, and issues error messages and warnings if necessary. The DDBJ Center also operates the NIG supercomputer, prepared for analyzing large-scale genome sequences. We now offer a secure platform specifically to handle personal human genomes. This report describes database activities for INSDC and JGA over the past year, the newly launched GEA, submission, retrieval, and analysis services available in our supercomputer system and their recent developments.
Affiliation(s)
- Yuichi Kodama
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Jun Mashima
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Takehide Kosuge
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Osamu Ogasawara
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
2
Kodama Y, Mashima J, Kosuge T, Kaminuma E, Ogasawara O, Okubo K, Nakamura Y, Takagi T. DNA Data Bank of Japan: 30th anniversary. Nucleic Acids Res 2019; 46:D30-D35. [PMID: 29040613 PMCID: PMC5753283 DOI: 10.1093/nar/gkx926] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Received: 09/15/2017] [Accepted: 10/02/2017] [Indexed: 11/17/2022] Open
Abstract
The DNA Data Bank of Japan (DDBJ) Center (http://www.ddbj.nig.ac.jp) has been providing public data services for 30 years since 1987. We are collecting nucleotide sequence data and associated biological information from researchers as a member of the International Nucleotide Sequence Database Collaboration (INSDC), in collaboration with the US National Center for Biotechnology Information and the European Bioinformatics Institute. The DDBJ Center also services the Japanese Genotype-phenotype Archive (JGA) with the National Bioscience Database Center to collect genotype and phenotype data of human individuals. Here, we outline our database activities for INSDC and JGA over the past year, and introduce submission, retrieval and analysis services running on our supercomputer system and their recent developments. Furthermore, we highlight our responses to the amended Japanese rules for the protection of personal information and the launch of the DDBJ Group Cloud service for sharing pre-publication data among research groups.
Affiliation(s)
- Yuichi Kodama
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Jun Mashima
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Takehide Kosuge
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Eli Kaminuma
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Osamu Ogasawara
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Kousaku Okubo
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Yasukazu Nakamura
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Toshihisa Takagi
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
  - National Bioscience Database Center, Japan Science and Technology Agency, Tokyo 102-8666, Japan
3
Mashima J, Kodama Y, Fujisawa T, Katayama T, Okuda Y, Kaminuma E, Ogasawara O, Okubo K, Nakamura Y, Takagi T. DNA Data Bank of Japan. Nucleic Acids Res 2016; 45:D25-D31. [PMID: 27924010 PMCID: PMC5210514 DOI: 10.1093/nar/gkw1001] [Citation(s) in RCA: 44] [Impact Index Per Article: 5.5] [Received: 09/16/2016] [Revised: 10/13/2016] [Accepted: 10/15/2016] [Indexed: 12/27/2022] Open
Abstract
The DNA Data Bank of Japan (DDBJ) (http://www.ddbj.nig.ac.jp) has been providing public data services for thirty years (since 1987). We are collecting nucleotide sequence data from researchers as a member of the International Nucleotide Sequence Database Collaboration (INSDC, http://www.insdc.org), in collaboration with the US National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI). The DDBJ Center also services the Japanese Genotype-phenotype Archive (JGA) with the National Bioscience Database Center to collect human subject data from Japanese researchers. Here, we report our database activities for INSDC and JGA over the past year, and introduce retrieval and analytical services running on our supercomputer system and their recent modifications. Furthermore, together with the Database Center for Life Science, the DDBJ Center is improving semantic web technologies for integrating and sharing biological data, and provides an RDF version of the sequence data.
Affiliation(s)
- Jun Mashima
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Yuichi Kodama
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Takatomo Fujisawa
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Yoshihiro Okuda
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Eli Kaminuma
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Osamu Ogasawara
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Kousaku Okubo
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Yasukazu Nakamura
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Toshihisa Takagi
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
  - National Bioscience Database Center, Japan Science and Technology Agency, Tokyo 102-8666, Japan
4
Kim KH, Kim J, Choi JS, Bae S, Kwon D, Park I, Kim DH, Seo TS. Rapid, High-Throughput, and Direct Molecular Beacon Delivery to Human Cancer Cells Using a Nanowire-Incorporated and Pneumatic Pressure-Driven Microdevice. Small (Weinheim an der Bergstrasse, Germany) 2015; 11:6215-6224. [PMID: 26484480 DOI: 10.1002/smll.201502151] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 07/20/2015] [Revised: 09/01/2015] [Indexed: 06/05/2023]
Abstract
Tracking and monitoring the intracellular behavior of mRNA is of paramount importance for understanding real-time gene expression in cell biology. To detect specific mRNA sequences, molecular beacons (MBs) have been widely employed as sensing probes. Although numerous strategies for MB delivery into the target cells have been reported, many issues such as the cytotoxicity of the carriers, dependence on the random probability of MB transfer, and critical cellular damage still need to be overcome. Herein, we have developed a nanowire-incorporated and pneumatic pressure-driven microdevice for rapid, high-throughput, and direct MB delivery to human breast cancer MCF-7 cells to monitor survivin mRNA expression. The proposed microdevice is composed of three layers: a pump-associated glass manifold layer, a monolithic polydimethylsiloxane (PDMS) membrane, and a ZnO nanowire-patterned microchannel layer. The MB is immobilized on the ZnO nanowires by disulfide bonding, and the glass manifold and PDMS membrane serve as a microvalve, so that the cellular attachment and detachment on the MB-coated nanowire array can be manipulated. The combination of the nanowire-mediated MB delivery and the microvalve function enables the controllable transfer of MBs into the cells with high cell viability and the quantitative detection of survivin mRNA expression after docetaxel treatment.
Affiliation(s)
- Kyung Hoon Kim
  - Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology, Daejeon, 305-701, Korea
- Jung Kim
  - Department of Mechanical Engineering, Korea Advanced Institute of Science and Technology, Daejeon, 305-701, Korea
- Jong Seob Choi
  - Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology, Daejeon, 305-701, Korea
- Sunwoong Bae
  - Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology, Daejeon, 305-701, Korea
- Donguk Kwon
  - Department of Mechanical Engineering, Korea Advanced Institute of Science and Technology, Daejeon, 305-701, Korea
- Inkyu Park
  - Department of Mechanical Engineering, Korea Advanced Institute of Science and Technology, Daejeon, 305-701, Korea
- Do Hyun Kim
  - Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology, Daejeon, 305-701, Korea
- Tae Seok Seo
  - Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology, Daejeon, 305-701, Korea
5
Mashima J, Kodama Y, Kosuge T, Fujisawa T, Katayama T, Nagasaki H, Okuda Y, Kaminuma E, Ogasawara O, Okubo K, Nakamura Y, Takagi T. DNA data bank of Japan (DDBJ) progress report. Nucleic Acids Res 2015; 44:D51-7. [PMID: 26578571 PMCID: PMC4702806 DOI: 10.1093/nar/gkv1105] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.1] [Received: 09/18/2015] [Accepted: 10/09/2015] [Indexed: 01/07/2023] Open
Abstract
The DNA Data Bank of Japan Center (DDBJ Center; http://www.ddbj.nig.ac.jp) maintains and provides public archival, retrieval and analytical services for biological information. The contents of the DDBJ databases are shared with the US National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI) within the framework of the International Nucleotide Sequence Database Collaboration (INSDC). Since 2013, the DDBJ Center has been operating the Japanese Genotype-phenotype Archive (JGA) in collaboration with the National Bioscience Database Center (NBDC) in Japan. In addition, the DDBJ Center develops semantic web technologies for data integration and sharing in collaboration with the Database Center for Life Science (DBCLS) in Japan. This paper briefly reports on the activities of the DDBJ Center over the past year including submissions to databases and improvements in our services for data retrieval, analysis, and integration.
Affiliation(s)
- Jun Mashima
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Yuichi Kodama
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Takehide Kosuge
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Takatomo Fujisawa
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Hideki Nagasaki
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Yoshihiro Okuda
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Eli Kaminuma
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Osamu Ogasawara
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Kousaku Okubo
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Yasukazu Nakamura
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Toshihisa Takagi
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
  - National Bioscience Database Center, Japan Science and Technology Agency, Tokyo 102-8666, Japan
6
Kodama Y, Mashima J, Kosuge T, Katayama T, Fujisawa T, Kaminuma E, Ogasawara O, Okubo K, Takagi T, Nakamura Y. The DDBJ Japanese Genotype-phenotype Archive for genetic and phenotypic human data. Nucleic Acids Res 2014; 43:D18-22. [PMID: 25477381 PMCID: PMC4383935 DOI: 10.1093/nar/gku1120] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.8] [Indexed: 01/08/2023] Open
Abstract
The DNA Data Bank of Japan Center (DDBJ Center; http://www.ddbj.nig.ac.jp) maintains and provides public archival, retrieval and analytical services for biological information. Since October 2013, DDBJ Center has operated the Japanese Genotype-phenotype Archive (JGA) in collaboration with our partner institute, the National Bioscience Database Center (NBDC) of the Japan Science and Technology Agency. DDBJ Center provides the JGA database system which securely stores genotype and phenotype data collected from individuals whose consent agreements authorize data release only for specific research use. NBDC has established guidelines and policies for sharing human-derived data and reviews data submission and usage requests from researchers. In addition to the JGA project, DDBJ Center develops Semantic Web technologies for data integration and sharing in collaboration with the Database Center for Life Science. This paper describes the overview of the JGA project, updates to the DDBJ databases, and services for data retrieval, analysis and integration.
Affiliation(s)
- Yuichi Kodama
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Jun Mashima
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Takehide Kosuge
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Toshiaki Katayama
  - National Bioscience Database Center, Japan Science and Technology Agency, Tokyo 102-8666, Japan
- Takatomo Fujisawa
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Eli Kaminuma
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Osamu Ogasawara
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Kousaku Okubo
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
- Toshihisa Takagi
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
  - Database Center for Life Science, Chiba 277-0871, Japan
- Yasukazu Nakamura
  - DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan
7
Torri F, Dinov ID, Zamanyan A, Hobel S, Genco A, Petrosyan P, Clark AP, Liu Z, Eggert P, Pierce J, Knowles JA, Ames J, Kesselman C, Toga AW, Potkin SG, Vawter MP, Macciardi F. Next generation sequence analysis and computational genomics using graphical pipeline workflows. Genes (Basel) 2014; 3:545-75. [PMID: 23139896 PMCID: PMC3490498 DOI: 10.3390/genes3030545] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.3] [Indexed: 12/30/2022] Open
Abstract
Whole-genome and exome sequencing have already proven to be essential and powerful methods to identify genes responsible for simple Mendelian inherited disorders. These methods can be applied to complex disorders as well, and have been adopted as one of the current mainstream approaches in population genetics. These achievements have been made possible by next generation sequencing (NGS) technologies, which require substantial bioinformatics resources to analyze the dense and complex sequence data. The huge analytical burden of data from genome sequencing might be seen as a bottleneck slowing the publication of NGS papers at this time, especially in psychiatric genetics. We review the existing methods for processing NGS data, to place into context the rationale for the design of a computational resource. We describe our method, the Graphical Pipeline for Computational Genomics (GPCG), to perform the computational steps required to analyze NGS data. The GPCG implements flexible workflows for basic sequence alignment, sequence data quality control, single nucleotide polymorphism analysis, copy number variant identification, annotation, and visualization of results. These workflows cover all the analytical steps required for NGS data, from processing the raw reads to variant calling and annotation. The current version of the pipeline is freely available at http://pipeline.loni.ucla.edu. These applications of NGS analysis may gain clinical utility in the near future (e.g., identifying miRNA signatures in diseases) when the bioinformatics approach is made feasible. Taken together, the annotation tools and strategies that have been developed to retrieve information and test hypotheses about the functional role of variants present in the human genome will help to pinpoint the genetic risk factors for psychiatric disorders.
Affiliation(s)
- Federica Torri
  - Department of Psychiatry and Human Behavior, University of California, Irvine, CA 92617, USA
  - Biomedical Informatics Research Network (BIRN), Information Sciences Institute, University of Southern California, Los Angeles, CA 90292, USA
- Ivo D. Dinov
  - Biomedical Informatics Research Network (BIRN), Information Sciences Institute, University of Southern California, Los Angeles, CA 90292, USA
  - Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, CA 90095, USA
- Alen Zamanyan
  - Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, CA 90095, USA
- Sam Hobel
  - Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, CA 90095, USA
- Alex Genco
  - Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, CA 90095, USA
- Petros Petrosyan
  - Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, CA 90095, USA
- Andrew P. Clark
  - Zilkha Neurogenetic Institute, USC Keck School of Medicine, Los Angeles, CA 90033, USA
- Zhizhong Liu
  - Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, CA 90095, USA
- Paul Eggert
  - Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, CA 90095, USA
  - Department of Computer Science, University of California, Los Angeles, CA 90095, USA
- Jonathan Pierce
  - Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, CA 90095, USA
- James A. Knowles
  - Zilkha Neurogenetic Institute, USC Keck School of Medicine, Los Angeles, CA 90033, USA
- Joseph Ames
  - Biomedical Informatics Research Network (BIRN), Information Sciences Institute, University of Southern California, Los Angeles, CA 90292, USA
- Carl Kesselman
  - Biomedical Informatics Research Network (BIRN), Information Sciences Institute, University of Southern California, Los Angeles, CA 90292, USA
- Arthur W. Toga
  - Biomedical Informatics Research Network (BIRN), Information Sciences Institute, University of Southern California, Los Angeles, CA 90292, USA
  - Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, CA 90095, USA
- Steven G. Potkin
  - Department of Psychiatry and Human Behavior, University of California, Irvine, CA 92617, USA
  - Biomedical Informatics Research Network (BIRN), Information Sciences Institute, University of Southern California, Los Angeles, CA 90292, USA
- Marquis P. Vawter
  - Functional Genomics Laboratory, Department of Psychiatry and Human Behavior, School of Medicine, University of California, Irvine, CA 92697, USA
- Fabio Macciardi
  - Department of Psychiatry and Human Behavior, University of California, Irvine, CA 92617, USA
  - Biomedical Informatics Research Network (BIRN), Information Sciences Institute, University of Southern California, Los Angeles, CA 90292, USA
  - Author to whom correspondence should be addressed; Tel.: +1-949-824-4559; Fax: +1-949-824-2072
8
Kosuge T, Mashima J, Kodama Y, Fujisawa T, Kaminuma E, Ogasawara O, Okubo K, Takagi T, Nakamura Y. DDBJ progress report: a new submission system for leading to a correct annotation. Nucleic Acids Res 2013; 42:D44-9. [PMID: 24194602 PMCID: PMC3964987 DOI: 10.1093/nar/gkt1066] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Indexed: 01/17/2023] Open
Abstract
The DNA Data Bank of Japan (DDBJ; http://www.ddbj.nig.ac.jp) maintains and provides archival, retrieval and analytical resources for biological information. This database content is shared with the US National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI) within the framework of the International Nucleotide Sequence Database Collaboration (INSDC). DDBJ launched a new nucleotide sequence submission system for receiving traditional nucleotide sequence. We expect that the new submission system will be useful for many submitters to input accurate annotation and reduce the time needed for data input. In addition, DDBJ has started a new service, the Japanese Genotype–phenotype Archive (JGA), with our partner institute, the National Bioscience Database Center (NBDC). JGA permanently archives and shares all types of individual human genetic and phenotypic data. We also introduce improvements in the DDBJ services and databases made during the past year.
Affiliation(s)
- Takehide Kosuge
  - DDBJ Center, National Institute of Genetics, Yata 1111, Mishima, Shizuoka 411-8540, Japan and National Bioscience Database Center, Japan Science and Technology Agency, Tokyo 102-8666, Japan
9
Nagasaki H, Mochizuki T, Kodama Y, Saruhashi S, Morizaki S, Sugawara H, Ohyanagi H, Kurata N, Okubo K, Takagi T, Kaminuma E, Nakamura Y. DDBJ read annotation pipeline: a cloud computing-based pipeline for high-throughput analysis of next-generation sequencing data. DNA Res 2013; 20:383-90. [PMID: 23657089 PMCID: PMC3738164 DOI: 10.1093/dnares/dst017] [Citation(s) in RCA: 62] [Impact Index Per Article: 5.6] [Indexed: 12/13/2022] Open
Abstract
High-performance next-generation sequencing (NGS) technologies are advancing genomics and molecular biological research. However, the immense amount of sequence data requires computational skills and suitable hardware resources that are a challenge to molecular biologists. The DNA Data Bank of Japan (DDBJ) of the National Institute of Genetics (NIG) has initiated a cloud computing-based analytical pipeline, the DDBJ Read Annotation Pipeline (DDBJ Pipeline), for a high-throughput annotation of NGS reads. The DDBJ Pipeline offers a user-friendly graphical web interface and processes massive NGS datasets using decentralized processing by NIG supercomputers currently free of charge. The proposed pipeline consists of two analysis components: basic analysis for reference genome mapping and de novo assembly and subsequent high-level analysis of structural and functional annotations. Users may smoothly switch between the two components in the pipeline, facilitating web-based operations on a supercomputer for high-throughput data analysis. Moreover, public NGS reads of the DDBJ Sequence Read Archive located on the same supercomputer can be imported into the pipeline through the input of only an accession number. This proposed pipeline will facilitate research by utilizing unified analytical workflows applied to the NGS data. The DDBJ Pipeline is accessible at http://p.ddbj.nig.ac.jp/.
Affiliation(s)
- Hideki Nagasaki
  - Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, 1111 Yata, Mishima, Shizuoka 411-8510, Japan
10
Karlsson J, Trelles O. MAPI: a software framework for distributed biomedical applications. J Biomed Semantics 2013; 4:4. [PMID: 23311574 PMCID: PMC3558448 DOI: 10.1186/2041-1480-4-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 01/11/2012] [Accepted: 12/16/2012] [Indexed: 11/11/2022] Open
Abstract
Background: The amount of web-based resources (databases, tools etc.) in biomedicine has increased, but the integrated usage of those resources is complex due to differences in access protocols and data formats. However, distributed data processing is becoming inevitable in several domains, in particular in biomedicine, where researchers face rapidly increasing data sizes. This big data is difficult to process locally because of the large processing, memory and storage capacity required.
Results: This manuscript describes a framework, called MAPI, which provides a uniform representation of resources available over the Internet, in particular for Web Services. The framework enhances their interoperability and collaborative use by enabling a uniform and remote access. The framework functionality is organized in modules that can be combined and configured in different ways to fulfil concrete development requirements.
Conclusions: The framework has been tested in the biomedical application domain where it has been a base for developing several clients that are able to integrate different web resources. The MAPI binaries and documentation are freely available at http://www.bitlab-es.com/mapi under the Creative Commons Attribution-No Derivative Works 2.5 Spain License. The MAPI source code is available by request (GPL v3 license).
Affiliation(s)
- Johan Karlsson
  - Computer Architecture Department, University of Málaga, Complejo Tecnológico, Campus de Teatinos, Málaga, 29080, Spain.
11
Kodama Y, Mashima J, Kaminuma E, Gojobori T, Ogasawara O, Takagi T, Okubo K, Nakamura Y. The DNA Data Bank of Japan launches a new resource, the DDBJ Omics Archive of functional genomics experiments. Nucleic Acids Res 2011; 40:D38-42. [PMID: 22110025 PMCID: PMC3244990 DOI: 10.1093/nar/gkr994] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Indexed: 02/04/2023] Open
Abstract
The DNA Data Bank of Japan (DDBJ; http://www.ddbj.nig.ac.jp) maintains and provides archival, retrieval and analytical resources for biological information. The central DDBJ resource consists of public, open-access nucleotide sequence databases including raw sequence reads, assembly information and functional annotation. Database content is exchanged with EBI and NCBI within the framework of the International Nucleotide Sequence Database Collaboration (INSDC). In 2011, DDBJ launched two new resources: the ‘DDBJ Omics Archive’ (DOR; http://trace.ddbj.nig.ac.jp/dor) and BioProject (http://trace.ddbj.nig.ac.jp/bioproject). DOR is an archival database of functional genomics data generated by microarray and highly parallel new generation sequencers. Data are exchanged between the ArrayExpress at EBI and DOR in the common MAGE-TAB format. BioProject provides an organizational framework to access metadata about research projects and the data from the projects that are deposited into different databases. In this article, we describe major changes and improvements introduced to the DDBJ services, and the launch of two new resources: DOR and BioProject.
Affiliation(s)
- Yuichi Kodama
  - Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Research Organization for Information and Systems, Yata, Mishima 411-8510, Japan
12
Katayama T, Wilkinson MD, Vos R, Kawashima T, Kawashima S, Nakao M, Yamamoto Y, Chun HW, Yamaguchi A, Kawano S, Aerts J, Aoki-Kinoshita KF, Arakawa K, Aranda B, Bonnal RJ, Fernández JM, Fujisawa T, Gordon PM, Goto N, Haider S, Harris T, Hatakeyama T, Ho I, Itoh M, Kasprzyk A, Kido N, Kim YJ, Kinjo AR, Konishi F, Kovarskaya Y, von Kuster G, Labarga A, Limviphuvadh V, McCarthy L, Nakamura Y, Nam Y, Nishida K, Nishimura K, Nishizawa T, Ogishima S, Oinn T, Okamoto S, Okuda S, Ono K, Oshita K, Park KJ, Putnam N, Senger M, Severin J, Shigemoto Y, Sugawara H, Taylor J, Trelles O, Yamasaki C, Yamashita R, Satoh N, Takagi T. The 2nd DBCLS BioHackathon: interoperable bioinformatics Web services for integrated applications. J Biomed Semantics 2011; 2:4. [PMID: 21806842 PMCID: PMC3170566 DOI: 10.1186/2041-1480-2-4] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Received: 04/01/2011] [Accepted: 08/02/2011] [Indexed: 01/19/2023] Open
Abstract
Background: The interaction between biological researchers and the bioinformatics tools they use is still hampered by incomplete interoperability between such tools. To ensure interoperability initiatives are effectively deployed, end-user applications need to be aware of, and support, best practices and standards. Here, we report on an initiative in which software developers and genome biologists came together to explore and raise awareness of these issues: BioHackathon 2009.
Results: Developers in attendance came from diverse backgrounds, with experts in Web services, workflow tools, text mining and visualization. Genome biologists provided expertise and exemplar data from the domains of sequence and pathway analysis and glyco-informatics. One goal of the meeting was to evaluate the ability to address real world use cases in these domains using the tools that the developers represented. This resulted in i) a workflow to annotate 100,000 sequences from an invertebrate species; ii) an integrated system for analysis of the transcription factor binding sites (TFBSs) enriched based on differential gene expression data obtained from a microarray experiment; iii) a workflow to enumerate putative physical protein interactions among enzymes in a metabolic pathway using protein structure data; iv) a workflow to analyze glyco-gene-related diseases by searching for human homologs of glyco-genes in other species, such as fruit flies, and retrieving their phenotype-annotated SNPs.
Conclusions: Beyond deriving prototype solutions for each use-case, a second major purpose of the BioHackathon was to highlight areas of insufficiency. We discuss the issues raised by our exploration of the problem/solution space, concluding that there are still problems with the way Web services are modeled and annotated, including: i) the absence of several useful data or analysis functions in the Web service "space"; ii) the lack of documentation of methods; iii) lack of compliance with the SOAP/WSDL specification among and between various programming-language libraries; and iv) incompatibility between various bioinformatics data formats. Although it was still difficult to solve real world problems posed to the developers by the biological researchers in attendance because of these problems, we note the promise of addressing these issues within a semantic framework.
Affiliation(s)
- Toshiaki Katayama
- Database Center for Life Science, Research Organization of Information and Systems, 2-11-16 Yayoi, Bunkyo-ku, Tokyo, 113-0032, Japan.
|
13
|
Lamprecht AL, Naujokat S, Margaria T, Steffen B. Semantics-based composition of EMBOSS services. J Biomed Semantics 2011; 2 Suppl 1:S5. [PMID: 21388574 PMCID: PMC3105497 DOI: 10.1186/2041-1480-2-s1-s5] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Indexed: 11/29/2022]
Abstract
Background: More than in other domains, the heterogeneous services world in bioinformatics demands a methodology to classify and relate resources in a manner accessible to both humans and machines. The Semantic Web, which is meant to address exactly this challenge, is currently one of the most ambitious projects in computer science. Collective efforts within the community have already led to a basis of standards for semantic service descriptions and meta-information. In combination with process synthesis and planning methods, such knowledge about types and services can facilitate the automatic composition of workflows for particular research questions.
Results: In this study we apply the synthesis methodology that is available in the Bio-jETI workflow management framework for the semantics-based composition of EMBOSS services. EMBOSS (European Molecular Biology Open Software Suite) is a collection of 350 tools (March 2010) for various sequence analysis tasks, and thus a rich source of services and types that imply comprehensive domain models for planning and synthesis approaches. We use and compare two different setups of our EMBOSS synthesis domain: 1) a manually defined domain setup where an intuitive, high-level, semantically meaningful nomenclature is applied to describe the input/output behavior of the single EMBOSS tools and their classifications, and 2) a domain setup where this information has been automatically derived from the EMBOSS Ajax Command Definition (ACD) files and the EMBRACE Data and Methods ontology (EDAM). Our experiments demonstrate that these domain models in combination with our synthesis methodology greatly simplify working with the large, heterogeneous, and hence manually intractable EMBOSS collection. However, they also show that with the information that can be derived from the (current) ACD files and EDAM ontology alone, some essential connections between services cannot be recognized.
Conclusions: Our results show that adequate domain modeling requires incorporating as much domain knowledge as possible, far beyond the mere technical aspects of the different types and services. Finding or defining semantically appropriate service and type descriptions is a difficult task, but the bioinformatics community appears to be on the right track towards a Life Science Semantic Web, which will eventually allow automatic service composition methods to unfold their full potential.
Affiliation(s)
- Anna-Lena Lamprecht
- Chair for Programming Systems, Technical University Dortmund, Dortmund, D-44227, Germany.
|
14
|
Kaminuma E, Kosuge T, Kodama Y, Aono H, Mashima J, Gojobori T, Sugawara H, Ogasawara O, Takagi T, Okubo K, Nakamura Y. DDBJ progress report. Nucleic Acids Res 2010; 39:D22-7. [PMID: 21062814 PMCID: PMC3013661 DOI: 10.1093/nar/gkq1041] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Indexed: 11/14/2022]
Abstract
The DNA Data Bank of Japan (DDBJ, http://www.ddbj.nig.ac.jp) provides a nucleotide sequence archive database and accompanying database tools for sequence submission, entry retrieval and annotation analysis. The DDBJ collected and released 3 637 446 entries/2 272 231 889 bases between July 2009 and June 2010. A highlight of the released data was the archive datasets of next-generation sequencing reads from the Japanese rice cultivar Koshihikari, submitted by the National Institute of Agrobiological Sciences. In this period, we started a new archive for quantitative genomics data, the DDBJ Omics aRchive (DOR). The DOR stores quantitative data from both microarray and new high-throughput sequencing platforms. Moreover, we improved the content of the DDBJ patent sequence data, released a new submission tool for the DDBJ Sequence Read Archive (DRA), which archives massive raw sequencing reads, and enhanced the DDBJ Read Annotation Pipeline, a cloud computing-based system for analyzing sequencing reads. In this article, we describe these new functions of the DDBJ databases and support tools.
Affiliation(s)
- Eli Kaminuma
- Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Research Organization for Information and Systems, Yata, Mishima 411-8510, Japan
|
15
|
Soós V, Sebestyén E, Juhász A, Light ME, Kohout L, Szalai G, Tandori J, Van Staden J, Balázs E. Transcriptome analysis of germinating maize kernels exposed to smoke-water and the active compound KAR1. BMC Plant Biology 2010; 10:236. [PMID: 21044315 PMCID: PMC3095319 DOI: 10.1186/1471-2229-10-236] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Received: 04/07/2010] [Accepted: 11/02/2010] [Indexed: 05/24/2023]
Abstract
Background: Smoke released from burning vegetation functions as an important environmental signal promoting the germination of many plant species following a fire. It not only promotes the germination of species from fire-prone habitats, but several species from non-fire-prone areas also respond, including some crops. The germination stimulatory activity can largely be attributed to the presence of a highly active butenolide compound, 3-methyl-2H-furo[2,3-c]pyran-2-one (referred to as karrikin 1 or KAR1), that has previously been isolated from plant-derived smoke. Several hypotheses have arisen regarding the molecular background of smoke and KAR1 action.
Results: In this paper we demonstrate that although smoke-water and KAR1 treatment of maize kernels result in a similar physiological response, the gene expression and the protein ubiquitination patterns are quite different. Treatment with smoke-water enhanced the ubiquitination of proteins and activated protein-degradation-related genes. This effect was completely absent from KAR1-treated kernels, in which a specific aquaporin gene was distinctly upregulated.
Conclusions: Our findings indicate that the array of bioactive compounds present in smoke-water forms an environmental signal whose components may act together in germination stimulation. It is highly possible that the smoke/KAR1 'signal' is perceived by a receptor that is shared with the signal transduction system involved in perceiving environmental cues (especially stresses and light), or that some kind of specialized receptor exists in fire-prone plant species which diverged from a more general one present in a common ancestor and also found in non-fire-prone plants, allowing for a somewhat weaker but still significant response. Besides their obvious use in agricultural practices, smoke and KAR1 can be used in studies to gain further insight into the transcriptional changes during germination.
Affiliation(s)
- Vilmos Soós
- Department of Applied Genomics, Agricultural Research Institute of the Hungarian Academy of Sciences, H-2462 Martonvásár, Brunszvik u. 2, Hungary
- Endre Sebestyén
- Department of Applied Genomics, Agricultural Research Institute of the Hungarian Academy of Sciences, H-2462 Martonvásár, Brunszvik u. 2, Hungary
- Angéla Juhász
- Department of Applied Genomics, Agricultural Research Institute of the Hungarian Academy of Sciences, H-2462 Martonvásár, Brunszvik u. 2, Hungary
- Marnie E Light
- Research Centre for Plant Growth and Development, School of Biological and Conservation Sciences, University of KwaZulu-Natal Pietermaritzburg, Private Bag X01, Scottsville 3209, South Africa
- Ladislav Kohout
- Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic, v.v.i., Flemingovo nám. 2, 166 10 Prague 6, Czech Republic
- Gabriella Szalai
- Department of Plant Physiology, Agricultural Research Institute of the Hungarian Academy of Sciences, H-2462 Martonvásár, Brunszvik u. 2, Hungary
- Júlia Tandori
- Department of Plant Physiology, Agricultural Research Institute of the Hungarian Academy of Sciences, H-2462 Martonvásár, Brunszvik u. 2, Hungary
- Johannes Van Staden
- Research Centre for Plant Growth and Development, School of Biological and Conservation Sciences, University of KwaZulu-Natal Pietermaritzburg, Private Bag X01, Scottsville 3209, South Africa
- Ervin Balázs
- Department of Applied Genomics, Agricultural Research Institute of the Hungarian Academy of Sciences, H-2462 Martonvásár, Brunszvik u. 2, Hungary
|
16
|
Goble CA, Bhagat J, Aleksejevs S, Cruickshank D, Michaelides D, Newman D, Borkum M, Bechhofer S, Roos M, Li P, De Roure D. myExperiment: a repository and social network for the sharing of bioinformatics workflows. Nucleic Acids Res 2010; 38:W677-82. [PMID: 20501605 PMCID: PMC2896080 DOI: 10.1093/nar/gkq429] [Citation(s) in RCA: 214] [Impact Index Per Article: 15.3] [Indexed: 11/16/2022]
Abstract
myExperiment (http://www.myexperiment.org) is an online research environment that supports the social sharing of bioinformatics workflows. These workflows are procedures consisting of a series of computational tasks using web services, which may be performed on data from its retrieval, integration and analysis through to the visualization of the results. As a public repository of workflows, myExperiment allows anybody to discover those that are relevant to their research, which can then be reused and repurposed to their specific requirements. Conversely, developers can submit their workflows to myExperiment and enable them to be shared in a secure manner. Since its release in 2007, myExperiment has grown to over 3500 registered users and more than 1000 workflows. The social aspect of sharing these workflows is facilitated by registered users forming virtual communities bound together by a common interest or research project. Contributors of workflows can build their reputation within these communities by receiving feedback and credit from individuals who reuse their work. Further documentation about myExperiment, including its REST web service, is available from http://wiki.myexperiment.org. Feedback and requests for support can be sent to bugs@myexperiment.org.
Affiliation(s)
- Carole A Goble
- School of Computer Science, The University of Manchester, Manchester M13 9PL, UK
|
17
|
Bak G, Hwang SW, Ko Y, Lee J, Kim Y, Kim K, Hong SK, Lee Y. On-off controllable RNA hybrid expression vector for yeast three-hybrid system. BMB Rep 2010; 43:110-4. [PMID: 20193129 DOI: 10.5483/bmbrep.2010.43.2.110] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 01/13/2023]
Abstract
The yeast three-hybrid system (Y3H), a powerful method for identifying RNA-binding proteins, still suffers from many false positives, due mostly to RNA-independent interactions. In this study, we attempted to efficiently identify false positives by introducing a tetracycline operator (tetO) motif into the RPR1 promoter of an RNA hybrid expression vector. We successfully developed a tight tetracycline-regulatable RPR1 promoter variant containing a single tetO motif between the transcription start site and the A-box sequence of the RPR1 promoter. Expression from this tetracycline-regulatable RPR1 promoter in the presence of tetracycline-response transcription activator (tTA) was positively controlled by doxycycline (Dox), a derivative of tetracycline. This on-off control runs opposite to the general knowledge that Dox negatively regulates tTA. This positively controlled RPR1 promoter system can therefore efficiently eliminate RNA-independent false positives commonly observed in the Y3H system by directly monitoring RNA hybrid expression.
Affiliation(s)
- Geunu Bak
- Department of Chemistry, KAIST, Daejeon 305-701, Korea
|
18
|
Katayama T, Nakao M, Takagi T. TogoWS: integrated SOAP and REST APIs for interoperable bioinformatics Web services. Nucleic Acids Res 2010; 38:W706-11. [PMID: 20472643 PMCID: PMC2896079 DOI: 10.1093/nar/gkq386] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Indexed: 11/14/2022]
Abstract
Web services have become widely used in bioinformatics analysis, but incompatibilities in their interfaces and data types prevent users from making full use of a combination of these services. Therefore, we have developed the TogoWS service to provide an integrated interface with advanced features. In the TogoWS REST (Representational State Transfer) API (application programming interface), we introduce a unified access method for major database resources through intuitive URIs that can be used to search, retrieve, parse and convert the database entries. The TogoWS SOAP API resolves compatibility issues found in server-side and client-side SOAP implementations. The TogoWS service is freely available at: http://togows.dbcls.jp/.
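The "intuitive URIs" idea in this abstract can be illustrated with a minimal sketch that only builds URI strings and makes no network calls. The path layout used here (`/entry/<database>/<id>[/<field>][.<format>]` and `/search/<database>/<query>`) is an assumption for illustration and is not stated in the abstract; the helper names `entry_uri` and `search_uri` and the example database names are likewise hypothetical. Only the base address is taken from the abstract.

```python
from urllib.parse import quote

# Service address as given in the abstract; it may no longer be current.
BASE_URL = "http://togows.dbcls.jp"


def entry_uri(database, entry_id, field=None, fmt=None, base=BASE_URL):
    """Build a retrieval URI (assumed layout: /entry/<db>/<id>[/<field>][.<fmt>])."""
    parts = [base.rstrip("/"), "entry", quote(database, safe=""), quote(entry_id, safe="")]
    if field:
        # Optional: ask for a single parsed field of the entry.
        parts.append(quote(field, safe=""))
    uri = "/".join(parts)
    if fmt:
        # Optional: ask for conversion to another format via a suffix.
        uri += "." + quote(fmt, safe="")
    return uri


def search_uri(database, query, base=BASE_URL):
    """Build a search URI (assumed layout: /search/<db>/<query>)."""
    # Percent-encode the query so spaces and slashes stay inside one path segment.
    return "/".join([base.rstrip("/"), "search", quote(database, safe=""), quote(query, safe="")])


if __name__ == "__main__":
    print(entry_uri("pubmed", "20472643"))
    print(entry_uri("genbank", "AB000001", field="definition", fmt="json"))
    print(search_uri("uniprot", "lung cancer"))
```

Under this assumed layout, one path template covers retrieval, field extraction and format conversion, which is the sense in which a single URI scheme can unify access to several databases.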
Affiliation(s)
- Toshiaki Katayama
- Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokane-dai, Minato-ku, Tokyo 108-8639, Japan.
|