1
|
Rasche H, Hyde C, Davis J, Gladman S, Coraor N, Bretaudeau A, Cuccuru G, Bacon W, Serrano-Solano B, Hillman-Jackson J, Hiltemann S, Zhou M, Grüning B, Stubbs A. Training Infrastructure as a Service. Gigascience 2022; 12:giad048. [PMID: 37395629 PMCID: PMC10316688 DOI: 10.1093/gigascience/giad048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2023] [Revised: 05/31/2023] [Accepted: 06/08/2023] [Indexed: 07/04/2023]
Abstract
BACKGROUND Hands-on training, whether in bioinformatics or other domains, often requires significant technical resources and knowledge to set up and run. Instructors must have access to powerful compute infrastructure that can support resource-intensive jobs running efficiently. Often this is achieved using a private server where there is no contention for the queue. However, this places a significant prerequisite knowledge or labor barrier for instructors, who must spend time coordinating deployment and management of compute resources. Furthermore, with the increase of virtual and hybrid teaching, where learners are located in separate physical locations, it is difficult to track student progress as efficiently as during in-person courses. FINDINGS Originally developed by Galaxy Europe and the Gallantries project, together with the Galaxy community, we have created Training Infrastructure-as-a-Service (TIaaS), aimed at providing user-friendly training infrastructure to the global training community. TIaaS provides dedicated training resources for Galaxy-based courses and events. Event organizers register their course, after which trainees are transparently placed in a private queue on the compute infrastructure, which ensures jobs complete quickly, even when the main queue is experiencing high wait times. A built-in dashboard allows instructors to monitor student progress. CONCLUSIONS TIaaS provides a significant improvement for instructors and learners, as well as infrastructure administrators. The instructor dashboard makes remote events not only possible but also easy. Students experience continuity of learning, as all training happens on Galaxy, which they can continue to use after the event. In the past 60 months, 504 training events with over 24,000 learners have used this infrastructure for Galaxy training.
Affiliation(s)
- Helena Rasche
- Department of Pathology and Clinical Bioinformatics, Erasmus Medical Center, Dr. Molewaterplein 40, 3015 GD, Rotterdam, the Netherlands
- School of Life Sciences and Technology, Avans University of Applied Sciences, Lovensdijkstraat 63, 4818 AJ Breda, the Netherlands
| | - Cameron Hyde
- Queensland Cyber Infrastructure Foundation Ltd., The University of Queensland, St. Lucia, QLD 4072, Australia
- University of the Sunshine Coast, Maroochydore, QLD 4558, Australia
| | - John Davis
- Department of Biology, Johns Hopkins University, Baltimore, MD 21218, USA
| | - Simon Gladman
- Melbourne Bioinformatics, The University of Melbourne, Melbourne, VIC 3051, Australia
| | - Nate Coraor
- School of Life, Health & Chemical Sciences, The Open University, Milton Keynes MK7 6AA, UK
| | - Anthony Bretaudeau
- IGEPP, INRAE, Institut Agro, University of Rennes, 35000 Rennes, France
- GenOuest Core Facility, University of Rennes, Inria, CNRS, IRISA, 35000 Rennes, France
| | - Gianmauro Cuccuru
- Bioinformatics Group, Department of Computer Science, University of Freiburg, 79110 Freiburg im Breisgau, Germany
| | - Wendi Bacon
- School of Life, Health & Chemical Sciences, The Open University, Milton Keynes MK7 6AA, UK
| | - Beatriz Serrano-Solano
- Euro-Bioimaging ERIC Bio-Hub, EMBL, 69117 Heidelberg, Germany
- Department of Biochemistry and Molecular Biology, Eberly College of Science, The Pennsylvania State University, State College, PA 16802, USA
| | | | - Saskia Hiltemann
- Department of Pathology and Clinical Bioinformatics, Erasmus Medical Center, Dr. Molewaterplein 40, 3015 GD, Rotterdam, the Netherlands
| | - Miaomiao Zhou
- School of Life Sciences and Technology, Avans University of Applied Sciences, Lovensdijkstraat 63, 4818 AJ Breda, the Netherlands
| | - Björn Grüning
- Bioinformatics Group, Department of Computer Science, University of Freiburg, 79110 Freiburg im Breisgau, Germany
| | - Andrew Stubbs
- Department of Pathology and Clinical Bioinformatics, Erasmus Medical Center, Dr. Molewaterplein 40, 3015 GD, Rotterdam, the Netherlands
| |
|
2
|
Abstract
rCASC is a modular workflow providing an integrated environment for single-cell RNA-seq (scRNA-Seq) data analysis exploiting Docker containers to achieve functional and computational reproducibility. It was initially developed as an R package usable also through a Java GUI. However, the Java frontend cannot be employed when running rCASC on a remote server, a typical setup due to the significant computational resources commonly needed to analyze scRNA-Seq data. To allow the use of rCASC through a graphical user interface on the client side and to harness the many advantages provided by the Galaxy platform, we have made rCASC available as a Galaxy set of tools, also providing a dedicated public instance of Galaxy named "Galaxy-rCASC." To integrate rCASC into Galaxy, all its functions, originally implemented as a set of Docker containers to maximize reproducibility, have been extensively reworked to become independent from the R package functions that launch them in the original implementation. Furthermore, suitable Galaxy wrappers have been developed for most functions of rCASC. We provide a detailed reference document to the use of Galaxy-rCASC with insights and explanations on the platform functionalities, parameters, and output while guiding the reader through the typical rCASC analysis workflow of a scRNA-Seq dataset.
|
3
|
Tangaro MA, Mandreoli P, Chiara M, Donvito G, Antonacci M, Parisi A, Bianco A, Romano A, Bianchi DM, Cangelosi D, Uva P, Molineris I, Nosi V, Calogero RA, Alessandri L, Pedrini E, Mordenti M, Bonetti E, Sangiorgi L, Pesole G, Zambelli F. Laniakea@ReCaS: exploring the potential of customisable Galaxy on-demand instances as a cloud-based service. BMC Bioinformatics 2021; 22:544. [PMID: 34749633 PMCID: PMC8574934 DOI: 10.1186/s12859-021-04401-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2021] [Accepted: 09/24/2021] [Indexed: 11/16/2022]
Abstract
BACKGROUND Improving the availability and usability of data and analytical tools is a critical precondition for further advancing modern biological and biomedical research. For instance, one of the many ramifications of the COVID-19 global pandemic has been to make even more evident the importance of having bioinformatics tools and data readily actionable by researchers through convenient access points and supported by adequate IT infrastructures. One of the most successful efforts in improving the availability and usability of bioinformatics tools and data is represented by the Galaxy workflow manager and its thriving community. In 2020 we introduced Laniakea, a software platform conceived to streamline the configuration and deployment of "on-demand" Galaxy instances over the cloud. By facilitating the set-up and configuration of Galaxy web servers, Laniakea provides researchers with a powerful and highly customisable platform for executing complex bioinformatics analyses. The system can be accessed through a dedicated and user-friendly web interface that allows the Galaxy web server's initial configuration and deployment. RESULTS "Laniakea@ReCaS", the first instance of a Laniakea-based service, is managed by ELIXIR-IT and was officially launched in February 2020, after about one year of development and testing that involved several users. Researchers can request access to Laniakea@ReCaS through an open-ended call for use-cases. Ten project proposals have been accepted since then, totalling 18 Galaxy on-demand virtual servers that employ ~ 100 CPUs, ~ 250 GB of RAM and ~ 5 TB of storage and serve several different communities and purposes. Herein, we present eight use cases demonstrating the versatility of the platform. 
CONCLUSIONS During this first year of activity, the Laniakea-based service emerged as a flexible platform that facilitated the rapid development of bioinformatics tools, the efficient delivery of training activities, and the provision of public bioinformatics services in different settings, including food safety and clinical research. Laniakea@ReCaS provides a proof of concept of how enabling access to appropriate, reliable IT resources and ready-to-use bioinformatics tools can considerably streamline researchers' work.
Affiliation(s)
- Marco Antonio Tangaro
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR), Via Giovanni Amendola 122/O, 70126, Bari, Italy
- National Institute for Nuclear Physics (INFN), Section of Bari, Via Orabona 4, 70126, Bari, Italy
| | - Pietro Mandreoli
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR), Via Giovanni Amendola 122/O, 70126, Bari, Italy
- Department of Biosciences, University of Milan, Via Celoria 26, 20133, Milano, Italy
| | - Matteo Chiara
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR), Via Giovanni Amendola 122/O, 70126, Bari, Italy
- Department of Biosciences, University of Milan, Via Celoria 26, 20133, Milano, Italy
| | - Giacinto Donvito
- National Institute for Nuclear Physics (INFN), Section of Bari, Via Orabona 4, 70126, Bari, Italy
| | - Marica Antonacci
- National Institute for Nuclear Physics (INFN), Section of Bari, Via Orabona 4, 70126, Bari, Italy
| | - Antonio Parisi
- Istituto Zooprofilattico Sperimentale Della Puglia e Della Basilicata, Via Manfredonia 20, 71121, Foggia, Italy
| | - Angelica Bianco
- Istituto Zooprofilattico Sperimentale Della Puglia e Della Basilicata, Via Manfredonia 20, 71121, Foggia, Italy
| | - Angelo Romano
- National Reference Laboratory for Coagulase-Positive Staphylococci Including Staphylococcus Aureus, Istituto Zooprofilattico Sperimentale del Piemonte, Liguria e Valle d'Aosta, Via Bologna 148, 10154, Turin, Italy
| | - Daniela Manila Bianchi
- National Reference Laboratory for Coagulase-Positive Staphylococci Including Staphylococcus Aureus, Istituto Zooprofilattico Sperimentale del Piemonte, Liguria e Valle d'Aosta, Via Bologna 148, 10154, Turin, Italy
| | - Davide Cangelosi
- Clinical Bioinformatics Unit, Scientific Direction, IRCCS Istituto Giannina Gaslini, Via Gerolamo Gaslini 5, 16147, Genova, Italy
| | - Paolo Uva
- Clinical Bioinformatics Unit, Scientific Direction, IRCCS Istituto Giannina Gaslini, Via Gerolamo Gaslini 5, 16147, Genova, Italy
- Italian Institute of Technology, Via Morego 30, 16163, Genova, Italy
| | - Ivan Molineris
- Department of Life Science and System Biology, University of Turin, Via Accademia Albertina, 13-1023, Turin, Italy
| | - Vladimir Nosi
- Department of Computer Science, University of Turin, Via Pessinetto 12, 10049, Turin, Italy
| | - Raffaele A Calogero
- Department of Molecular Biotechnology and Health Sciences, Via Nizza 52, 10126, Turin, Italy
| | - Luca Alessandri
- Department of Molecular Biotechnology and Health Sciences, Via Nizza 52, 10126, Turin, Italy
| | - Elena Pedrini
- Department of Rare Skeletal Disorders, IRCCS Istituto Ortopedico Rizzoli, Via di Barbiano 1/10, 40136, Bologna, Italy
| | - Marina Mordenti
- Department of Rare Skeletal Disorders, IRCCS Istituto Ortopedico Rizzoli, Via di Barbiano 1/10, 40136, Bologna, Italy
| | - Emanuele Bonetti
- Department of Rare Skeletal Disorders, IRCCS Istituto Ortopedico Rizzoli, Via di Barbiano 1/10, 40136, Bologna, Italy
- Department of Experimental Oncology, European Institute of Oncology, Via Adamello 16, 20139, Milan, Italy
| | - Luca Sangiorgi
- Department of Rare Skeletal Disorders, IRCCS Istituto Ortopedico Rizzoli, Via di Barbiano 1/10, 40136, Bologna, Italy
| | - Graziano Pesole
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR), Via Giovanni Amendola 122/O, 70126, Bari, Italy.
- Department of Biosciences, Biotechnologies and Biopharmaceutics, University of Bari, Via Orabona 4, 70126, Bari, Italy.
| | - Federico Zambelli
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR), Via Giovanni Amendola 122/O, 70126, Bari, Italy.
- Department of Biosciences, University of Milan, Via Celoria 26, 20133, Milano, Italy.
| |
|
4
|
Goonasekera N, Mahmoud A, Chilton J, Afgan E. GalaxyCloudRunner: enhancing scalable computing for Galaxy. Bioinformatics 2021; 37:1763-1765. [PMID: 33104194 DOI: 10.1093/bioinformatics/btaa860] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2020] [Revised: 08/18/2020] [Accepted: 10/11/2020] [Indexed: 11/13/2022]
Abstract
SUMMARY The existence of more than 100 public Galaxy servers with service quotas is indicative of the need for an increased availability of compute resources for Galaxy to use. The GalaxyCloudRunner enables a Galaxy server to easily expand its available compute capacity by sending user jobs to cloud resources. User jobs are routed to the acquired resources based on a set of configurable rules and the resources can be dynamically acquired from any of four popular cloud providers (AWS, Azure, GCP or OpenStack) in an automated fashion. AVAILABILITY AND IMPLEMENTATION GalaxyCloudRunner is implemented in Python and leverages Docker containers. The source code is MIT licensed and available at https://github.com/cloudve/galaxycloudrunner. The documentation is available at http://gcr.cloudve.org/.
Affiliation(s)
- Nuwan Goonasekera
- Melbourne Bioinformatics, Faculty of Medicine, Dentistry & Health Sciences, University of Melbourne, Melbourne, VIC 3010, Australia
| | - Alexandru Mahmoud
- Department of Biology, Johns Hopkins University, Baltimore, MD 21218, USA
| | - John Chilton
- Department of Biochemistry and Molecular Biology, Penn State University, State College, PA 16801, USA
| | - Enis Afgan
- Department of Biology, Johns Hopkins University, Baltimore, MD 21218, USA
| |
|
5
|
Koppad S, B A, Gkoutos GV, Acharjee A. Cloud Computing Enabled Big Multi-Omics Data Analytics. Bioinform Biol Insights 2021; 15:11779322211035921. [PMID: 34376975 PMCID: PMC8323418 DOI: 10.1177/11779322211035921] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 05/01/2021] [Accepted: 07/12/2021] [Indexed: 12/27/2022]
Abstract
High-throughput experiments enable researchers to explore complex multifactorial diseases through large-scale analysis of omics data. Challenges for such high-dimensional data sets include storage, analyses, and sharing. Recent innovations in computational technologies and approaches, especially in cloud computing, offer a promising, low-cost, and highly flexible solution in the bioinformatics domain. Cloud computing is rapidly proving increasingly useful in molecular modeling, omics data analytics (eg, RNA sequencing, metabolomics, or proteomics data sets), and for the integration, analysis, and interpretation of phenotypic data. We review the adoption of advanced cloud-based and big data technologies for processing and analyzing omics data and provide insights into state-of-the-art cloud bioinformatics applications.
Affiliation(s)
- Saraswati Koppad
- Department of Computer Science and Engineering, National Institute of Technology Karnataka, Surathkal, India
| | - Annappa B
- Department of Computer Science and Engineering, National Institute of Technology Karnataka, Surathkal, India
| | - Georgios V Gkoutos
- Institute of Cancer and Genomic Sciences and Centre for Computational Biology, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK.,Institute of Translational Medicine, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK.,NIHR Surgical Reconstruction and Microbiology Research Centre, University Hospitals Birmingham, Birmingham, UK.,MRC Health Data Research UK (HDR UK), London, UK.,NIHR Experimental Cancer Medicine Centre, Birmingham, UK.,NIHR Biomedical Research Centre, University Hospitals Birmingham, Birmingham, UK
| | - Animesh Acharjee
- Institute of Cancer and Genomic Sciences and Centre for Computational Biology, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK.,Institute of Translational Medicine, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK.,NIHR Surgical Reconstruction and Microbiology Research Centre, University Hospitals Birmingham, Birmingham, UK
| |
|
6
|
Tangaro M, Defazio G, Fosso B, Licciulli VF, Grillo G, Donvito G, Lavezzo E, Baruzzo G, Pesole G, Santamaria M. ITSoneWB: profiling global taxonomic diversity of eukaryotic communities on Galaxy. Bioinformatics 2021; 37:4253-4254. [PMID: 34117876 PMCID: PMC9502156 DOI: 10.1093/bioinformatics/btab431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2021] [Revised: 06/03/2021] [Accepted: 06/11/2021] [Indexed: 12/05/2022]
Abstract
Summary ITSoneWB (ITSone WorkBench) is a Galaxy-based bioinformatic environment where comprehensive and high-quality reference data are connected with established pipelines and new tools in an automated and easy-to-use service targeted at global taxonomic analysis of eukaryotic communities based on Internal Transcribed Spacer 1 variants high-throughput sequencing. Availability and implementation ITSoneWB has been deployed on the INFN-Bari ReCaS cloud facility and is freely available on the web at http://itsonewb.cloud.ba.infn.it/galaxy. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Marco Tangaro
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council, Bari 70126, Italy
| | - Giuseppe Defazio
- Department of Biosciences, Biotechnology and Biopharmaceutics, University of Bari 'A. Moro', Bari 70126, Italy
| | - Bruno Fosso
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council, Bari 70126, Italy
| | - Vito Flavio Licciulli
- Institute of Biomedical Technologies, National Research Council, Bari Unit, 70126 Bari, Italy
| | - Giorgio Grillo
- Institute of Biomedical Technologies, National Research Council, Bari Unit, 70126 Bari, Italy
| | - Giacinto Donvito
- National Institute for Nuclear Physics (INFN), Section of Bari, Bari 70126, Italy
| | - Enrico Lavezzo
- Department of Molecular Medicine, University of Padova, Padova 35131, Italy
| | - Giacomo Baruzzo
- Department of Information Engineering, University of Padova, Padova, 35131, Italy
| | - Graziano Pesole
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council, Bari 70126, Italy.,Department of Biosciences, Biotechnology and Biopharmaceutics, University of Bari 'A. Moro', Bari 70126, Italy
| | - Monica Santamaria
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council, Bari 70126, Italy
| |
|
7
|
Chiara M, Mandreoli P, Tangaro MA, D'Erchia AM, Sorrentino S, Forleo C, Horner DS, Zambelli F, Pesole G. VINYL: Variant prIoritizatioN by survivaL analysis. Bioinformatics 2020; 36:5590-5599. [PMID: 33367501 DOI: 10.1093/bioinformatics/btaa1067] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/28/2020] [Revised: 10/31/2020] [Accepted: 12/14/2020] [Indexed: 11/12/2022]
Abstract
MOTIVATION Clinical applications of genome re-sequencing technologies typically generate large amounts of data that need to be carefully annotated and interpreted to identify genetic variants potentially associated with pathological conditions. In this context, accurate and reproducible methods for the functional annotation and prioritization of genetic variants are of fundamental importance. RESULTS In this paper, we present VINYL, a flexible and fully automated system for the functional annotation and prioritization of genetic variants. Extensive analyses of both real and simulated datasets suggest that VINYL can identify clinically relevant genetic variants in a more accurate manner compared to equivalent state of the art methods, allowing a more rapid and effective prioritization of genetic variants in different experimental settings. As such we believe that VINYL can establish itself as a valuable tool to assist healthcare operators and researchers in clinical genomics investigations. AVAILABILITY VINYL is available at http://beaconlab.it/VINYL and https://github.com/matteo14c/VINYL. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Matteo Chiara
- Department of Biosciences, University of Milan, Milan, Italy.,Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council, Bari, Italy
| | | | - Marco Antonio Tangaro
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council, Bari, Italy
| | - Anna Maria D'Erchia
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council, Bari, Italy.,Department of Biosciences, Biotechnology and Biopharmaceutics, University of Bari "Aldo Moro", Bari, Italy
| | - Sandro Sorrentino
- Cardiology Unit, Department of Emergency and Organ Transplantation, University of Bari "Aldo Moro", Bari, Italy
| | - Cinzia Forleo
- Cardiology Unit, Department of Emergency and Organ Transplantation, University of Bari "Aldo Moro", Bari, Italy
| | - David S Horner
- Department of Biosciences, University of Milan, Milan, Italy.,Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council, Bari, Italy
| | - Federico Zambelli
- Department of Biosciences, University of Milan, Milan, Italy.,Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council, Bari, Italy
| | - Graziano Pesole
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council, Bari, Italy.,Department of Biosciences, Biotechnology and Biopharmaceutics, University of Bari "Aldo Moro", Bari, Italy
| |
|
8
|
Chiara M, Zambelli F, Tangaro MA, Mandreoli P, Horner DS, Pesole G. CorGAT: a tool for the functional annotation of SARS-CoV-2 genomes. Bioinformatics 2020; 36:5522-5523. [PMID: 33346830 PMCID: PMC7799324 DOI: 10.1093/bioinformatics/btaa1047] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 08/03/2020] [Revised: 10/30/2020] [Accepted: 12/09/2020] [Indexed: 12/24/2022]
Abstract
Summary While over 200 000 genomic sequences are currently available through dedicated repositories, ad hoc methods for the functional annotation of SARS-CoV-2 genomes do not harness all currently available resources for the annotation of functionally relevant genomic sites. Here, we present CorGAT, a novel tool for the functional annotation of SARS-CoV-2 genomic variants. By comparisons with other state of the art methods we demonstrate that, by providing a more comprehensive and rich annotation, our method can facilitate the identification of evolutionary patterns in the genome of SARS-CoV-2. Availability and implementation Galaxy http://corgat.cloud.ba.infn.it/galaxy; software: https://github.com/matteo14c/CorGAT/tree/Revision_V1; docker: https://hub.docker.com/r/laniakeacloud/galaxy_corgat. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Matteo Chiara
- Department of Biosciences, University of Milan, Via Celoria 26, Milan, Italy.,Institute of Biomembranes, Bioenergetics and Molecular Biotechnology, National Research Council, Via Giovanni Amendola, 122/O 70126, Bari, Italy
| | - Federico Zambelli
- Department of Biosciences, University of Milan, Via Celoria 26, Milan, Italy.,Institute of Biomembranes, Bioenergetics and Molecular Biotechnology, National Research Council, Via Giovanni Amendola, 122/O 70126, Bari, Italy
| | - Marco Antonio Tangaro
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnology, National Research Council, Via Giovanni Amendola, 122/O 70126, Bari, Italy
| | - Pietro Mandreoli
- Department of Biosciences, University of Milan, Via Celoria 26, Milan, Italy.,Institute of Biomembranes, Bioenergetics and Molecular Biotechnology, National Research Council, Via Giovanni Amendola, 122/O 70126, Bari, Italy
| | - David S Horner
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnology, National Research Council, Via Giovanni Amendola, 122/O 70126, Bari, Italy.,Department of Biosciences, Biotechnology and Biopharmaceutics, University of Bari "A. Moro", Via Edoardo Orabona 4, Bari, Italy
| | - Graziano Pesole
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnology, National Research Council, Via Giovanni Amendola, 122/O 70126, Bari, Italy.,Department of Biosciences, Biotechnology and Biopharmaceutics, University of Bari "A. Moro", Via Edoardo Orabona 4, Bari, Italy
| |
|
9
|
Tangaro MA, Donvito G, Antonacci M, Chiara M, Mandreoli P, Pesole G, Zambelli F. Laniakea: an open solution to provide Galaxy "on-demand" instances over heterogeneous cloud infrastructures. Gigascience 2020; 9:giaa033. [PMID: 32252069 PMCID: PMC7136032 DOI: 10.1093/gigascience/giaa033] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 01/10/2020] [Revised: 03/13/2020] [Accepted: 03/17/2020] [Indexed: 12/26/2022]
Abstract
BACKGROUND While the popular workflow manager Galaxy is currently made available through several publicly accessible servers, there are scenarios where users can be better served by full administrative control over a private Galaxy instance, including, but not limited to, concerns about data privacy, customisation needs, prioritisation of particular job types, tools development, and training activities. In such cases, a cloud-based Galaxy virtual instance represents an alternative that equips the user with complete control over the Galaxy instance itself without the burden of the hardware and software infrastructure involved in running and maintaining a Galaxy server. RESULTS We present Laniakea, a complete software solution to set up a "Galaxy on-demand" platform as a service. Building on the INDIGO-DataCloud software stack, Laniakea can be deployed over common cloud architectures usually supported both by public and private e-infrastructures. The user interacts with a Laniakea-based service through a simple front-end that allows a general setup of a Galaxy instance, and then Laniakea takes care of the automatic deployment of the virtual hardware and the software components. At the end of the process, the user gains access with full administrative privileges to a private, production-grade, fully customisable, Galaxy virtual instance and to the underlying virtual machine (VM). Laniakea features deployment of single-server or cluster-backed Galaxy instances, sharing of reference data across multiple instances, data volume encryption, and support for VM image-based, Docker-based, and Ansible recipe-based Galaxy deployments. A Laniakea-based Galaxy on-demand service, named Laniakea@ReCaS, is currently hosted at the ELIXIR-IT ReCaS cloud facility. CONCLUSIONS Laniakea offers to scientific e-infrastructures a complete and easy-to-use software solution to provide a Galaxy on-demand service to their users. 
Laniakea-based cloud services will help in making Galaxy more accessible to a broader user base by removing most of the burdens involved in deploying and running a Galaxy service. In turn, this will facilitate the adoption of Galaxy in scenarios where classic public instances do not represent an optimal solution. Finally, the implementation of Laniakea can be easily adapted and expanded to support different services and platforms beyond Galaxy.
Affiliation(s)
- Marco Antonio Tangaro
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR), Via Giovanni Amendola 122/O, 70126 Bari, Italy
| | - Giacinto Donvito
- National Institute for Nuclear Physics (INFN), Section of Bari, Via Orabona 4, 70126 Bari, Italy
| | - Marica Antonacci
- National Institute for Nuclear Physics (INFN), Section of Bari, Via Orabona 4, 70126 Bari, Italy
| | - Matteo Chiara
- Department of Biosciences, University of Milan, via Celoria 26, 20133 Milano, Italy
| | - Pietro Mandreoli
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR), Via Giovanni Amendola 122/O, 70126 Bari, Italy
- Department of Biosciences, University of Milan, via Celoria 26, 20133 Milano, Italy
| | - Graziano Pesole
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR), Via Giovanni Amendola 122/O, 70126 Bari, Italy
- Department of Biosciences, Biotechnologies and Biopharmaceutics, University of Bari, Via Orabona 4, 70126 Bari, Italy
| | - Federico Zambelli
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR), Via Giovanni Amendola 122/O, 70126 Bari, Italy
- Department of Biosciences, University of Milan, via Celoria 26, 20133 Milano, Italy
| |
|