1
Bolduc B, Hodgkins SB, Varner RK, Crill PM, McCalley CK, Chanton JP, Tyson GW, Riley WJ, Palace M, Duhaime MB, Hough MA, Saleska SR, Sullivan MB, Rich VI. The IsoGenie database: an interdisciplinary data management solution for ecosystems biology and environmental research. PeerJ 2020. [DOI: 10.7717/peerj.9467] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022] Open
Abstract
Modern microbial and ecosystem sciences require diverse interdisciplinary teams that are often challenged in “speaking” to one another due to different languages and data product types. Here we introduce the IsoGenie Database (IsoGenieDB; https://isogenie-db.asc.ohio-state.edu/), a de novo developed data management and exploration platform, as a solution to this challenge of accurately representing and integrating heterogeneous environmental and microbial data across ecosystem scales. The IsoGenieDB is a public and private data infrastructure designed to store and query data generated by the IsoGenie Project, a ~10-year DOE-funded project focused on discovering ecosystem climate feedbacks in a thawing permafrost landscape. The IsoGenieDB provides (i) a platform for IsoGenie Project members to explore the project’s interdisciplinary datasets across scales through the inherent relationships among data entities, (ii) a framework to consolidate and harmonize the datasets needed by the team’s modelers, and (iii) a public venue that leverages the same spatially explicit, disciplinarily integrated data structure to share published datasets. The IsoGenieDB is also being expanded to cover the NASA-funded Archaea to Atmosphere (A2A) project, which scales the findings of IsoGenie to a broader suite of Arctic peatlands, via the umbrella A2A Database (A2A-DB). The IsoGenieDB’s expandability and flexible architecture allow it to serve as an example ecosystems database.
Affiliation(s)
- Benjamin Bolduc
- Department of Microbiology, The Ohio State University, Columbus, OH, USA
- Ruth K. Varner
- Earth Systems Research Center, Institute for the Study of Earth, Oceans and Space, University of New Hampshire, Durham, NH, USA
- Department of Earth Sciences, College of Engineering and Physical Sciences, University of New Hampshire, Durham, NH, USA
- Patrick M. Crill
- Department of Geological Sciences and Bolin Centre for Climate Research, Stockholm University, Stockholm, Sweden
- Carmody K. McCalley
- Thomas H. Gosnell School of Life Sciences, Rochester Institute of Technology, Rochester, NY, USA
- Jeffrey P. Chanton
- Department of Earth, Ocean, and Atmospheric Science, Florida State University, Tallahassee, FL, USA
- Gene W. Tyson
- Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, QLD, Australia
- William J. Riley
- Climate and Ecosystem Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Michael Palace
- Earth Systems Research Center, Institute for the Study of Earth, Oceans and Space, University of New Hampshire, Durham, NH, USA
- Department of Earth Sciences, College of Engineering and Physical Sciences, University of New Hampshire, Durham, NH, USA
- Melissa B. Duhaime
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, USA
- Moira A. Hough
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA
- Scott R. Saleska
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA
- Matthew B. Sullivan
- Department of Microbiology, The Ohio State University, Columbus, OH, USA
- Department of Civil, Environmental and Geodetic Engineering, The Ohio State University, Columbus, OH, USA
- Virginia I. Rich
- Department of Microbiology, The Ohio State University, Columbus, OH, USA
2
Riebeling C, Jungnickel H, Luch A, Haase A. Systems Biology to Support Nanomaterial Grouping. Advances in Experimental Medicine and Biology 2017; 947:143-171. [PMID: 28168668 DOI: 10.1007/978-3-319-47754-1_6] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Indexed: 12/24/2022]
Abstract
The assessment of potential health risks of engineered nanomaterials (ENMs) is a challenging task due to the high number and great variety of already existing and newly emerging ENMs. Reliable grouping or categorization of ENMs with respect to hazards could help to facilitate prioritization and decision making for regulatory purposes. The development of grouping criteria, however, requires a broad and comprehensive data basis. A promising platform addressing this challenge is the systems biology approach. The different areas of systems biology, most prominently transcriptomics, proteomics and metabolomics, each provide a wealth of data that can be used to reveal novel biomarkers and biological pathways involved in the mode-of-action of ENMs. Combining such data with classical toxicological data would enable a more comprehensive understanding and hence might lead to more powerful and reliable prediction models. Physico-chemical data provide crucial information on the ENMs and need to be integrated, too. Overall statistical analysis should reveal robust grouping and categorization criteria and may ultimately help to identify meaningful biomarkers and biological pathways that sufficiently characterize the corresponding ENM subgroups. This chapter aims to give an overview of the different systems biology technologies and their current applications in the field of nanotoxicology, as well as to identify the existing challenges.
Affiliation(s)
- Christian Riebeling
- German Federal Institute for Risk Assessment, Department of Chemical and Product Safety, Berlin, Germany
- Harald Jungnickel
- German Federal Institute for Risk Assessment, Department of Chemical and Product Safety, Berlin, Germany
- Andreas Luch
- German Federal Institute for Risk Assessment, Department of Chemical and Product Safety, Berlin, Germany
- Andrea Haase
- German Federal Institute for Risk Assessment, Department of Chemical and Product Safety, Berlin, Germany
3
Izzo M, Mortola F, Arnulfo G, Fato MM, Varesio L. A digital repository with an extensible data model for biobanking and genomic analysis management. BMC Genomics 2014; 15 Suppl 3:S3. [PMID: 25077808 PMCID: PMC4083403 DOI: 10.1186/1471-2164-15-s3-s3] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 11/18/2022] Open
Abstract
Motivation Molecular biology laboratories require extensive metadata to improve data collection and analysis. The heterogeneity of the collected metadata grows as research evolves into international multi-disciplinary collaborations with increasing data sharing among institutions. A single standardization is not feasible, and it becomes crucial to develop digital repositories with flexible and extensible data models, as in the case of modern integrated biobank management. Results We developed a novel data model in JSON format to describe heterogeneous data in a generic biomedical science scenario. The model is built on two hierarchical entities: processes and events, roughly corresponding to research studies and analysis steps within a single study. A number of sequential events can be grouped in a process, building up a hierarchical structure to track patient and sample history. Each event can produce new data. Data is described by a set of user-defined metadata, and may have one or more associated files. We integrated the model in a web-based digital repository with a data grid storage to manage large data sets located in geographically distinct areas. We built a graphical interface that allows authorized users to define new data types dynamically, according to their requirements. Operators compose queries on metadata fields using a flexible search interface and run them on the database and on the grid. We applied the digital repository to the integrated management of samples, patients and medical history in the BIT-Gaslini biobank. The platform currently manages 1800 samples from over 900 patients. Microarray data from 150 analyses are stored on the grid storage and replicated on two physical resources for preservation. The system is equipped with data integration capabilities with other biobanks for worldwide information sharing.
Conclusions Our data model enables users to continuously define flexible, ad hoc, and loosely structured metadata, for information sharing in specific research projects and purposes. This approach can significantly improve interdisciplinary research collaboration and makes it possible to track patients' clinical records, sample management information, and genomic data. The web interface allows the operators to easily manage, query, and annotate the files, without dealing with the technicalities of the data grid.
4
Sturla SJ, Boobis AR, FitzGerald RE, Hoeng J, Kavlock RJ, Schirmer K, Whelan M, Wilks MF, Peitsch MC. Systems toxicology: from basic research to risk assessment. Chem Res Toxicol 2014; 27:314-29. [PMID: 24446777 PMCID: PMC3964730 DOI: 10.1021/tx400410s] [Citation(s) in RCA: 211] [Impact Index Per Article: 21.1] [Indexed: 12/22/2022]
Abstract
Systems Toxicology is the integration of classical toxicology with quantitative analysis of large networks of molecular and functional changes occurring across multiple levels of biological organization. Society demands increasingly close scrutiny of the potential health risks associated with exposure to chemicals present in our everyday life, leading to an increasing need for more predictive and accurate risk-assessment approaches. Developing such approaches requires a detailed mechanistic understanding of the ways in which xenobiotic substances perturb biological systems and lead to adverse outcomes. Thus, Systems Toxicology approaches offer modern strategies for gaining such mechanistic knowledge by combining advanced analytical and computational tools. Furthermore, Systems Toxicology is a means for the identification and application of biomarkers for improved safety assessments. In Systems Toxicology, quantitative systems-wide molecular changes in the context of an exposure are measured, and a causal chain of molecular events linking exposures with adverse outcomes (i.e., functional and apical end points) is deciphered. Mathematical models are then built to describe these processes in a quantitative manner. The integrated data analysis leads to the identification of how biological networks are perturbed by the exposure and enables the development of predictive mathematical models of toxicological processes. This perspective integrates current knowledge regarding bioanalytical approaches, computational analysis, and the potential for improved risk assessment.
Affiliation(s)
- Shana J Sturla
- Department of Health Sciences and Technology, Institute of Food, Nutrition and Health, ETH Zürich, Schmelzbergstrasse 9, 8092 Zürich, Switzerland
5
Hermida L, Poussin C, Stadler MB, Gubian S, Sewer A, Gaidatzis D, Hotz HR, Martin F, Belcastro V, Cano S, Peitsch MC, Hoeng J. Confero: an integrated contrast data and gene set platform for computational analysis and biological interpretation of omics data. BMC Genomics 2013; 14:514. [PMID: 23895370 PMCID: PMC3750322 DOI: 10.1186/1471-2164-14-514] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 10/22/2012] [Accepted: 07/17/2013] [Indexed: 11/18/2022] Open
Abstract
Background High-throughput omics technologies such as microarrays and next-generation sequencing (NGS) have become indispensable tools in biological research. Computational analysis and biological interpretation of omics data can pose significant challenges due to a number of factors, in particular the systems integration required to fully exploit and compare data from different studies and/or technology platforms. In transcriptomics, the identification of differentially expressed genes when studying effect(s) or contrast(s) of interest constitutes the starting point for further downstream computational analysis (e.g. gene over-representation/enrichment analysis, reverse engineering) leading to mechanistic insights. Therefore, it is important to systematically store the full list of genes with their associated statistical analysis results (differential expression, t-statistics, p-value) corresponding to one or more effect(s) or contrast(s) of interest (termed “contrast data” for short) in a comparable manner and extract gene sets in order to efficiently support downstream analyses and further leverage data on a long-term basis. Filling this gap would open new research perspectives for biologists to discover disease-related biomarkers and to support the understanding of molecular mechanisms underlying specific biological perturbation effects (e.g. disease, genetic, environmental, etc.). Results To address these challenges, we developed Confero, a contrast data and gene set platform for downstream analysis and biological interpretation of omics data. The Confero software platform provides storage of contrast data in a simple and standard format, data transformation to enable cross-study and platform data comparison, and automatic extraction and storage of gene sets to build new a priori knowledge which is leveraged by integrated and extensible downstream computational analysis tools.
Gene Set Enrichment Analysis (GSEA) and Over-Representation Analysis (ORA) are currently integrated as an analysis module as well as additional tools to support biological interpretation. Confero is a standalone system that also integrates with Galaxy, an open-source workflow management and data integration system. To illustrate Confero platform functionality we walk through major aspects of the Confero workflow and results using the Bioconductor estrogen package dataset. Conclusion Confero provides a unique and flexible platform to support downstream computational analysis facilitating biological interpretation. The system has been designed in order to provide the researcher with a simple, innovative, and extensible solution to store and exploit analyzed data in a sustainable and reproducible manner thereby accelerating knowledge-driven research. Confero source code is freely available from http://sourceforge.net/projects/confero/.
Affiliation(s)
- Leandro Hermida
- Philip Morris International Research & Development, Quai Jeanrenaud 5, CH-2000 Neuchatel, Switzerland.
6
Wruck W, Peuker M, Regenbrecht CRA. Data management strategies for multinational large-scale systems biology projects. Brief Bioinform 2012; 15:65-78. [PMID: 23047157 PMCID: PMC3896927 DOI: 10.1093/bib/bbs064] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Indexed: 11/26/2022] Open
Abstract
Good accessibility of publicly funded research data is essential to secure an open scientific system and eventually becomes mandatory [Wellcome Trust will Penalise Scientists Who Don’t Embrace Open Access. The Guardian 2012]. By the use of high-throughput methods in many research areas from physics to systems biology, large data collections are increasingly important as raw material for research. Here, we present strategies worked out by international and national institutions targeting open access to publicly funded research data via incentives or obligations to share data. Funding organizations such as the British Wellcome Trust therefore have developed data sharing policies and request commitment to data management and sharing in grant applications. Increased citation rates are a profound argument for sharing publication data. Pre-publication sharing might be rewarded by a data citation credit system via digital object identifiers (DOIs) which have initially been in use for data objects. Besides policies and incentives, good practice in data management is indispensable. However, appropriate systems for data management of large-scale projects for example in systems biology are hard to find. Here, we give an overview of a selection of open-source data management systems proved to be employed successfully in large-scale projects.
Affiliation(s)
- Wasco Wruck
- Institute of Pathology, Charite - Universitaetsmedizin Berlin, Chariteplatz 1, 10117 Berlin. Tel.: +49 30 2093 8951; Fax: +49 30 450 536 909;
7
Britto R, Sallou O, Collin O, Michaux G, Primig M, Chalmel F. GPSy: a cross-species gene prioritization system for conserved biological processes--application in male gamete development. Nucleic Acids Res 2012; 40:W458-65. [PMID: 22570409 PMCID: PMC3394256 DOI: 10.1093/nar/gks380] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Indexed: 12/18/2022] Open
Abstract
We present gene prioritization system (GPSy), a cross-species gene prioritization system that facilitates the arduous but critical task of prioritizing genes for follow-up functional analyses. GPSy’s modular design with regard to species, data sets and scoring strategies enables users to formulate queries in a highly flexible manner. Currently, the system encompasses 20 topics related to conserved biological processes including male gamete development discussed in this article. The web server-based tool is freely available at http://gpsy.genouest.org.
8
Bauch A, Adamczyk I, Buczek P, Elmer FJ, Enimanev K, Glyzewski P, Kohler M, Pylak T, Quandt A, Ramakrishnan C, Beisel C, Malmström L, Aebersold R, Rinn B. openBIS: a flexible framework for managing and analyzing complex data in biology research. BMC Bioinformatics 2011; 12:468. [PMID: 22151573 PMCID: PMC3275639 DOI: 10.1186/1471-2105-12-468] [Citation(s) in RCA: 75] [Impact Index Per Article: 5.8] [Received: 09/06/2011] [Accepted: 12/08/2011] [Indexed: 11/10/2022] Open
Abstract
Background Modern data generation techniques used in distributed systems biology research projects often create datasets of enormous size and diversity. We argue that in order to overcome the challenge of managing those large quantitative datasets and maximise the biological information extracted from them, a sound information system is required. Ease of integration with data analysis pipelines and other computational tools is a key requirement for it. Results We have developed openBIS, an open source software framework for constructing user-friendly, scalable and powerful information systems for data and metadata acquired in biological experiments. openBIS enables users to collect, integrate, share, publish data and to connect to data processing pipelines. This framework can be extended and has been customized for different data types acquired by a range of technologies. Conclusions openBIS is currently being used by several SystemsX.ch and EU projects applying mass spectrometric measurements of metabolites and proteins, High Content Screening, or Next Generation Sequencing technologies. The attributes that make it interesting to a large research community involved in systems biology projects include versatility, simplicity in deployment, scalability to very large data, flexibility to handle any biological data type and extensibility to the needs of any research domain.
Affiliation(s)
- Angela Bauch
- Department of Biosystems Science and Engineering, Center for Information Sciences and Databases, Swiss Federal Institute of Technology (ETH) Zurich, Switzerland
9
Abstract
Background In principle, gene expression data can be viewed as providing just the three-valued expression profiles of target biological elements relative to an experiment at hand. Although complicated, gathering expression profiles does not pose much of a challenge from a query language standpoint. What is interesting is how these expression profiles are used to tease out information from the vast array of information repositories that ascribe meaning to the expression profiles. Since such annotations are inherently experiment specific functions, much the same way as queries in databases, developing a querying system for gene expression data appears to be pointless. Instead, developing tools and techniques to support individual assignment has been considered prudent in contemporary research. Results We propose a gene expression data management and querying system that is able to support pre-expression, expression and post-expression level analysis and reduce impedance mismatch between analysis systems. To this end, we propose a new, platform-independent and general purpose query language called Curray, for Custom Microarray query language, to support online expression data analysis using distributed resources. It includes features to design expression analysis pipelines using language constructs at the conceptual level. The ability to include user defined functions as a first-class language feature facilitates unlimited analysis support and removes language limitations. We show that Curray’s declarative and extensible features nimbly allow flexible modeling and room for customization. Conclusions The developments proposed in this article allow users to view their expression data from a conceptual standpoint - experiments, probes, expressions, mapping, etc. at multiple levels of representation and independent of the underlying chip technologies. 
It also allows transparent roll-up and drill-down along representation hierarchies from raw data to standards such as MIAME and MAGE-ML using linguistic constructs. Curray also allows seamless integration with distributed web resources through its LifeDB system of which it is a part.
Affiliation(s)
- Hasan Jamil
- Department of Computer Science, Wayne State University, Michigan, USA.
10
Lardenois A, Gattiker A, Collin O, Chalmel F, Primig M. GermOnline 4.0 is a genomics gateway for germline development, meiosis and the mitotic cell cycle. Database: The Journal of Biological Databases and Curation 2010; 2010:baq030. [PMID: 21149299 PMCID: PMC3004465 DOI: 10.1093/database/baq030] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.7] [Indexed: 12/16/2022]
Abstract
GermOnline 4.0 is a cross-species database portal focusing on high-throughput expression data relevant for germline development, the meiotic cell cycle and mitosis in healthy versus malignant cells. It is thus a source of information for life scientists as well as clinicians who are interested in gene expression and regulatory networks. The GermOnline gateway provides unlimited access to information produced with high-density oligonucleotide microarrays (3'-UTR GeneChips), genome-wide protein-DNA binding assays and protein-protein interaction studies in the context of Ensembl genome annotation. Samples used to produce high-throughput expression data and to carry out genome-wide in vivo DNA binding assays are annotated via the MIAME-compliant Multiomics Information Management and Annotation System (MIMAS 3.0). Furthermore, the Saccharomyces Genomics Viewer (SGV) was developed and integrated into the gateway. SGV is a visualization tool that outputs genome annotation and DNA-strand specific expression data produced with high-density oligonucleotide tiling microarrays (Sc_tlg GeneChips) which cover the complete budding yeast genome on both DNA strands. It facilitates the interpretation of expression levels and transcript structures determined for various cell types cultured under different growth and differentiation conditions. Database URL: www.germonline.org/
Affiliation(s)
- Aurélie Lardenois
- Inserm, U625, GERHM, IFR-140, Université de Rennes 1, F-35042 Rennes, France
11
Vyas J, Nowling RJ, Meusburger T, Sargeant D, Kadaveru K, Gryk MR, Kundeti V, Rajasekaran S, Schiller MR. MimoSA: a system for minimotif annotation. BMC Bioinformatics 2010; 11:328. [PMID: 20565705 PMCID: PMC2905367 DOI: 10.1186/1471-2105-11-328] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 02/19/2010] [Accepted: 06/16/2010] [Indexed: 11/11/2022] Open
Abstract
Background Minimotifs are short peptide sequences within one protein, which are recognized by other proteins or molecules. While there are now several minimotif databases, they are incomplete. There are reports of many minimotifs in the primary literature, which have yet to be annotated, while entirely novel minimotifs continue to be published on a weekly basis. Our recently proposed function and sequence syntax for minimotifs enables us to build a general tool that will facilitate structured annotation and management of minimotif data from the biomedical literature. Results We have built the MimoSA application for minimotif annotation. The application supports management of the Minimotif Miner database, literature tracking, and annotation of new minimotifs. MimoSA enables the visualization, organization, selection and editing functions of minimotifs and their attributes in the MnM database. For the literature components, MimoSA provides paper status tracking and scoring of papers for annotation through a freely available machine learning approach, which is based on word correlation. The paper scoring algorithm is also available as a separate program, TextMine. Form-driven annotation of minimotif attributes enables entry of new minimotifs into the MnM database. Several supporting features increase the efficiency of annotation. The layered architecture of MimoSA allows for extensibility by separating the functions of paper scoring, minimotif visualization, and database management. MimoSA is readily adaptable to other annotation efforts that manually curate literature into a MySQL database. Conclusions MimoSA is an extensible application that facilitates minimotif annotation and integrates with the Minimotif Miner database. We have built MimoSA as an application that integrates dynamic abstract scoring with a high performance relational model of minimotif syntax.
MimoSA's TextMine, an efficient paper-scoring algorithm, can be used to dynamically rank papers with respect to context.
Affiliation(s)
- Jay Vyas
- Department of Molecular, Microbial, and Structural Biology, University of Connecticut Health Center, 263 Farmington Ave. Farmington, CT 06030-3305, USA
12
Langille MGI, Eisen JA. BioTorrents: a file sharing service for scientific data. PLoS One 2010; 5:e10071. [PMID: 20418944 PMCID: PMC2854681 DOI: 10.1371/journal.pone.0010071] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Received: 01/16/2010] [Accepted: 03/17/2010] [Indexed: 11/19/2022] Open
Abstract
The transfer of scientific data has emerged as a significant challenge, as datasets continue to grow in size and demand for open access sharing increases. Current methods for file transfer do not scale well for large files and can cause long transfer times. In this study we present BioTorrents, a website that allows open access sharing of scientific data and uses the popular BitTorrent peer-to-peer file sharing technology. BioTorrents allows files to be transferred rapidly due to the sharing of bandwidth across multiple institutions and provides more reliable file transfers due to the built-in error checking of the file sharing technology. BioTorrents contains multiple features, including keyword searching, category browsing, RSS feeds, torrent comments, and a discussion forum. BioTorrents is available at http://www.biotorrents.net.
Affiliation(s)
- Morgan G I Langille
- Genome Center, University of California Davis, Davis, California, United States of America.
13
Vallon-Christersson J, Nordborg N, Svensson M, Häkkinen J. BASE--2nd generation software for microarray data management and analysis. BMC Bioinformatics 2009; 10:330. [PMID: 19822003 PMCID: PMC2768720 DOI: 10.1186/1471-2105-10-330] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.9] [Received: 06/23/2009] [Accepted: 10/12/2009] [Indexed: 01/08/2023] Open
Abstract
BACKGROUND Microarray experiments are increasing in size and samples are collected asynchronously over long periods of time. Available data are re-analysed as more samples are hybridized. Systematic use of collected data requires tracking of biomaterials, array information, raw data, and assembly of annotations. To meet the information tracking and data analysis challenges in microarray experiments we reimplemented and improved BASE version 1.2. RESULTS The new BASE presented in this report is a comprehensive annotatable local microarray data repository and analysis application providing researchers with an efficient information management and analysis tool. The information management system tracks all material from biosource, via sample and through extraction and labelling to raw data and analysis. All items in BASE can be annotated and the annotations can be used as experimental factors in downstream analysis. BASE stores all microarray experiment related data regardless of whether analysis tools for specific techniques or data formats are readily available. The BASE team is committed to continue improving and extending BASE to make it usable for even more experimental setups and techniques, and we encourage other groups to target their specific needs by leveraging the infrastructure provided by BASE. CONCLUSION BASE is a comprehensive management application for information, data, and analysis of microarray experiments, available as free open source software at http://base.thep.lu.se under the terms of the GPLv3 license.