1
Vazquez P, Hirayama-Shoji K, Novik S, Krauss S, Rayner S. Globally Accessible Distributed Data Sharing (GADDS): a decentralized FAIR platform to facilitate data sharing in the life sciences. Bioinformatics 2022; 38:3812-3817. [PMID: 35639939 PMCID: PMC9344842 DOI: 10.1093/bioinformatics/btac362] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2021] [Revised: 04/12/2022] [Accepted: 05/24/2022] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION: Technical advances have revolutionized the life sciences and researchers commonly face challenges associated with handling large amounts of heterogeneous digital data. The Findable, Accessible, Interoperable and Reusable (FAIR) principles provide a framework to support effective data management. However, implementing this framework is beyond the means of most researchers in terms of resources and expertise, requiring awareness of metadata, policies, community agreements, and other factors such as vocabularies and ontologies.
RESULTS: We have developed the Globally Accessible Distributed Data Sharing (GADDS) platform to facilitate FAIR-like data-sharing in cross-disciplinary research collaborations. The platform consists of (i) a blockchain based metadata quality control system, (ii) a private cloud-like storage system and (iii) a version control system. GADDS is built with containerized technologies, providing minimal hardware standards and easing scalability, and offers decentralized trust via transparency of metadata, facilitating data exchange and collaboration. As a use case, we provide an example implementation in engineered living material technology within the Hybrid Technology Hub at the University of Oslo.
AVAILABILITY: Demo version available at https://github.com/pavelvazquez/GADDS.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Pavel Vazquez
  - Hybrid Technology Hub - Centre of Excellence, Institute of Basic Medical Sciences, University of Oslo, P.O. Box 1110 Blindern 0317, Oslo, Norway
- Kayoko Hirayama-Shoji
  - Hybrid Technology Hub - Centre of Excellence, Institute of Basic Medical Sciences, University of Oslo, P.O. Box 1110 Blindern 0317, Oslo, Norway
- Steffen Novik
  - Department of Informatics, Faculty of Mathematics and Natural Sciences, University of Oslo, P.O. Box 1032 Blindern N-0315, Oslo, Norway
- Stefan Krauss
  - Hybrid Technology Hub - Centre of Excellence, Institute of Basic Medical Sciences, University of Oslo, P.O. Box 1110 Blindern 0317, Oslo, Norway
  - Department of Immunology and Transfusion Medicine, Oslo University Hospital, P.O. Box 4950 Nydalen, 0424, Oslo, Norway
- Simon Rayner
  - Hybrid Technology Hub - Centre of Excellence, Institute of Basic Medical Sciences, University of Oslo, P.O. Box 1110 Blindern 0317, Oslo, Norway
  - Department of Medical Genetics, Oslo University Hospital and University of Oslo, Oslo, Norway
2
Hamdi Y, Zass L, Othman H, Radouani F, Allali I, Hanachi M, Okeke CJ, Chaouch M, Tendwa MB, Samtal C, Mohamed Sallam R, Alsayed N, Turkson M, Ahmed S, Benkahla A, Romdhane L, Souiai O, Tastan Bishop Ö, Ghedira K, Mohamed Fadlelmola F, Mulder N, Kamal Kassim S. Human OMICs and Computational Biology Research in Africa: Current Challenges and Prospects. OMICS: A Journal of Integrative Biology 2021; 25:213-233. [PMID: 33794662 PMCID: PMC8060717 DOI: 10.1089/omi.2021.0004] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 02/06/2023]
Abstract
Following the publication of the first human genome, OMICs research, including genomics, transcriptomics, proteomics, and metagenomics, has been on the rise. OMICs studies revealed the complex genetic diversity among human populations and challenged our understandings of genotype-phenotype correlations. Africa, being the cradle of the first modern humans, is distinguished by a large genetic diversity within its populations and rich ethnolinguistic history. However, the available human OMICs tools and databases are not representative of this diversity, therefore creating significant gaps in biomedical research. African scientists, students, and publics are among the key contributors to OMICs systems science. This expert review examines the pressing issues in human OMICs research, education, and development in Africa, as seen through a lens of computational biology, public health relevant technology innovation, critically-informed science governance, and how best to harness OMICs data to benefit health and societies in Africa and beyond. We underscore the disparities between North and Sub-Saharan Africa at different levels. A harmonized African ethnolinguistic classification would help address annotation challenges associated with population diversity. Finally, building on the existing strategic research initiatives, such as the H3Africa and H3ABioNet Consortia, we highly recommend addressing large-scale multidisciplinary research challenges, strengthening research collaborations and knowledge transfer, and enhancing the ability of African researchers to influence and shape national and international research, policy, and funding agendas. This article and analysis contribute to a deeper understanding of past and current challenges in the African OMICs innovation ecosystem, while also offering foresight on future innovation trajectories.
Affiliation(s)
- Yosr Hamdi
  - Laboratory of Biomedical Genomics and Oncogenetics, Institut Pasteur de Tunis, Université Tunis El Manar, Tunis, Tunisia
  - Laboratory of Human and Experimental Pathology, Institut Pasteur de Tunis, Tunis, Tunisia
- Lyndon Zass
  - Computational Biology Division, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, CIDRI Africa Wellcome Trust Centre, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- Houcemeddine Othman
  - Sydney Brenner Institute for Molecular Bioscience, University of the Witwatersrand, Johannesburg, South Africa
- Fouzia Radouani
  - Chlamydiae and Mycoplasmas Laboratory, Institut Pasteur du Maroc, Casablanca, Morocco
- Imane Allali
  - Computational Biology Division, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, CIDRI Africa Wellcome Trust Centre, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
  - Laboratory of Human Pathologies Biology, Department of Biology, Faculty of Sciences, and Genomic Center of Human Pathologies, Faculty of Medicine and Pharmacy, Mohammed V University in Rabat, Rabat, Morocco
- Mariem Hanachi
  - Laboratory of Bioinformatics, Biomathematics and Biostatistics, Institut Pasteur de Tunis, Université Tunis El Manar, Tunis, Tunisia
  - Faculty of Science of Bizerte, Zarzouna, University of Carthage, Tunis, Tunisia
- Chiamaka Jessica Okeke
  - Research Unit in Bioinformatics (RUBi), Department of Biochemistry and Microbiology, Rhodes University, Makhanda, South Africa
- Melek Chaouch
  - Laboratory of Bioinformatics, Biomathematics and Biostatistics, Institut Pasteur de Tunis, Université Tunis El Manar, Tunis, Tunisia
- Maureen Bilinga Tendwa
  - Research Unit in Bioinformatics (RUBi), Department of Biochemistry and Microbiology, Rhodes University, Makhanda, South Africa
- Chaimae Samtal
  - Laboratory of Biotechnology, Environment, Agri-food and Health, Faculty of Sciences Dhar El Mahraz–Sidi Mohammed Ben Abdellah University, Fez, Morocco
  - University of Mohamed Premier, Oujda, Morocco
- Reem Mohamed Sallam
  - Department of Medical Biochemistry and Molecular Biology, Faculty of Medicine, Ain Shams University, Cairo, Egypt
  - Department of Basic Medical Sciences, Faculty of Medicine, Galala University, Suez, Egypt
- Nihad Alsayed
  - Centre for Bioinformatics and Systems Biology, Faculty of Science, University of Khartoum, Khartoum, Sudan
- Michael Turkson
  - The National Institute for Mathematical Sciences, Kwame Nkrumah University of Science and Technology, Kumasi, Ghana
- Samah Ahmed
  - Centre for Bioinformatics and Systems Biology, Faculty of Science, University of Khartoum, Khartoum, Sudan
- Alia Benkahla
  - Laboratory of Bioinformatics, Biomathematics and Biostatistics, Institut Pasteur de Tunis, Université Tunis El Manar, Tunis, Tunisia
- Lilia Romdhane
  - Laboratory of Biomedical Genomics and Oncogenetics, Institut Pasteur de Tunis, Université Tunis El Manar, Tunis, Tunisia
  - Faculty of Science of Bizerte, Zarzouna, University of Carthage, Tunis, Tunisia
- Oussema Souiai
  - Laboratory of Bioinformatics, Biomathematics and Biostatistics, Institut Pasteur de Tunis, Université Tunis El Manar, Tunis, Tunisia
- Özlem Tastan Bishop
  - Research Unit in Bioinformatics (RUBi), Department of Biochemistry and Microbiology, Rhodes University, Makhanda, South Africa
- Kais Ghedira
  - Laboratory of Bioinformatics, Biomathematics and Biostatistics, Institut Pasteur de Tunis, Université Tunis El Manar, Tunis, Tunisia
- Faisal Mohamed Fadlelmola
  - Centre for Bioinformatics and Systems Biology, Faculty of Science, University of Khartoum, Khartoum, Sudan
- Nicola Mulder
  - Computational Biology Division, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, CIDRI Africa Wellcome Trust Centre, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- Samar Kamal Kassim
  - Department of Medical Biochemistry and Molecular Biology, Faculty of Medicine, Ain Shams University, Cairo, Egypt
3
Colella JP, Stephens RB, Campbell ML, Kohli BA, Parsons DJ, Mclean BS. The Open-Specimen Movement. Bioscience 2020. [DOI: 10.1093/biosci/biaa146] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 12/20/2022] Open
Abstract
The open-science movement seeks to increase transparency, reproducibility, and access to scientific data. As primary data, preserved biological specimens represent records of global biodiversity critical to research, conservation, national security, and public health. However, a recent decrease in specimen preservation in public biorepositories is a major barrier to open biological science. As such, there is an urgent need for a cultural shift in the life sciences that normalizes specimen deposition in museum collections. Museums embody an open-science ethos and provide long-term research infrastructure through curation, data management and security, and community-wide access to samples and data, thereby ensuring scientific reproducibility and extension. We propose that a paradigm shift from specimen ownership to specimen stewardship can be achieved through increased open-data requirements among scientific journals and institutional requirements for specimen deposition by funding and permitting agencies, and through explicit integration of specimens into existing data management plan guidelines and annual reporting.
Affiliation(s)
- Ryan B Stephens
  - Department of Natural Resources and the Environment, University of New Hampshire, Durham
- Mariel L Campbell
  - Museum of Southwestern Biology, Division of Genomic Resources, University of New Mexico, Albuquerque
- Brooks A Kohli
  - Department of Evolution, Ecology, and Organismal Biology, Ohio State University, Columbus
- Danielle J Parsons
  - Department of Evolution, Ecology, and Organismal Biology, Ohio State University, Columbus
- Bryan S Mclean
  - Department of Biology, University of North Carolina, Greensboro
4
Heacock ML, Amolegbe SM, Skalla LA, Trottier BA, Carlin DJ, Henry HF, Lopez AR, Duncan CG, Lawler CP, Balshaw DM, Suk WA. Sharing SRP data to reduce environmentally associated disease and promote transdisciplinary research. Reviews on Environmental Health 2020; 35:111-122. [PMID: 32126018 DOI: 10.1515/reveh-2019-0089] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 11/06/2019] [Accepted: 01/06/2020] [Indexed: 05/25/2023]
Abstract
The National Institute of Environmental Health Sciences (NIEHS) Superfund Basic Research and Training Program (SRP) funds a wide range of projects that span biomedical, environmental sciences, and engineering research and generate a wealth of data resulting from hypothesis-driven research projects. Combining or integrating these diverse data offers an opportunity to uncover new scientific connections that can be used to gain a more comprehensive understanding of the interplay between exposures and health. Integrating and reusing data generated from individual research projects within the program requires harmonization of data workflows, ensuring consistent and robust practices in data stewardship, and embracing data sharing from the onset of data collection and analysis. We describe opportunities to leverage data within the SRP and current SRP efforts to advance data sharing and reuse, including by developing an SRP dataset library and fostering data integration through Data Management and Analysis Cores. We also discuss opportunities to improve public health by identifying parallels in the data captured from health and engineering research, layering data streams for a more comprehensive picture of exposures and disease, and using existing SRP research infrastructure to facilitate and foster data sharing. Importantly, we point out that while the SRP is in a unique position to exploit these opportunities, they can be employed across environmental health research. SRP research teams, which comprise cross-disciplinary scientists focused on similar research questions, are well positioned to use data to leverage previous findings and accelerate the pace of research. Incorporating data streams from different disciplines addressing similar questions can provide a broader understanding and uncover the answers to complex and discrete research questions.
Affiliation(s)
- Michelle L Heacock
  - Superfund Research Program, National Institute of Environmental Health Sciences (NIEHS), National Institutes of Health (NIH), Department of Health and Human Services (DHHS), Research Triangle Park, NC, USA
- Brittany A Trottier
  - Superfund Research Program, National Institute of Environmental Health Sciences (NIEHS), National Institutes of Health (NIH), Department of Health and Human Services (DHHS), Research Triangle Park, NC, USA
- Danielle J Carlin
  - Superfund Research Program, National Institute of Environmental Health Sciences (NIEHS), National Institutes of Health (NIH), Department of Health and Human Services (DHHS), Research Triangle Park, NC, USA
- Heather F Henry
  - Superfund Research Program, National Institute of Environmental Health Sciences (NIEHS), National Institutes of Health (NIH), Department of Health and Human Services (DHHS), Research Triangle Park, NC, USA
- Christopher G Duncan
  - National Institute of Environmental Health Sciences (NIEHS), National Institutes of Health (NIH), Department of Health and Human Services (DHHS), Research Triangle Park, NC, USA
- Cindy P Lawler
  - National Institute of Environmental Health Sciences (NIEHS), National Institutes of Health (NIH), Department of Health and Human Services (DHHS), Research Triangle Park, NC, USA
- David M Balshaw
  - National Institute of Environmental Health Sciences (NIEHS), National Institutes of Health (NIH), Department of Health and Human Services (DHHS), Research Triangle Park, NC, USA
- William A Suk
  - Superfund Research Program, National Institute of Environmental Health Sciences (NIEHS), National Institutes of Health (NIH), Department of Health and Human Services (DHHS), Research Triangle Park, NC, USA
5
Creydt M, Fischer M. Food authentication in real life: How to link nontargeted approaches with routine analytics? Electrophoresis 2020; 41:1665-1679. [PMID: 32249434 DOI: 10.1002/elps.202000030] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 01/31/2020] [Revised: 03/19/2020] [Accepted: 03/23/2020] [Indexed: 12/20/2022]
Abstract
In times of increasing globalization and the resulting complexity of trade flows, securing food quality is an increasing challenge. The development of analytical methods for checking the integrity and, thus, the safety of food is one of the central questions for actors from science, politics, and industry. Targeted methods, for the detection of a few selected analytes, still play the most important role in routine analysis. In the past 5 years, nontargeted methods that do not aim at individual analytes but on analyte profiles that are as comprehensive as possible have increasingly come into focus. Instead of investigating individual chemical structures, data patterns are collected, evaluated and, depending on the problem, fed into databases that can be used for further nontargeted approaches. Alternatively, individual markers can be extracted and transferred to targeted methods. Such an approach requires (i) the availability of authentic reference material, (ii) the corresponding high-resolution laboratory infrastructure, and (iii) extensive expertise in processing and storing very large amounts of data. Probably due to the requirements mentioned above, only a few methods have really established themselves in routine analysis. This review article focuses on the establishment of nontargeted methods in routine laboratories. Challenges are summarized and possible solutions are presented.
Affiliation(s)
- Marina Creydt
  - Hamburg School of Food Science, Institute of Food Chemistry, University of Hamburg, Hamburg, Germany
- Markus Fischer
  - Hamburg School of Food Science, Institute of Food Chemistry, University of Hamburg, Hamburg, Germany
6
Hulsen T. The ten commandments of translational research informatics. Data Science 2019. [DOI: 10.3233/ds-190020] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 02/05/2023]
Affiliation(s)
- Tim Hulsen
  - Department of Professional Health Solutions & Services, Philips Research, Eindhoven, The Netherlands
7
NetR and AttR, Two New Bioinformatic Tools to Integrate Diverse Datasets into Cytoscape Network and Attribute Files. Genes (Basel) 2019; 10:genes10060423. [PMID: 31159440 PMCID: PMC6628208 DOI: 10.3390/genes10060423] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2019] [Revised: 05/25/2019] [Accepted: 05/27/2019] [Indexed: 11/17/2022] Open
Abstract
High-throughput technologies have allowed researchers to obtain genome-wide data from a wide array of experimental model systems. Unfortunately, however, new data generation tends to significantly outpace data re-utilization, and most high throughput datasets are only rarely used in subsequent studies or to generate new hypotheses to be tested experimentally. The reasons behind such data underutilization include a widespread lack of programming expertise among experimentalist biologists to carry out the necessary file reformatting that is often necessary to integrate published data from disparate sources. We have developed two programs (NetR and AttR), which allow experimental biologists with little to no programming background to integrate publicly available datasets into files that can be later visualized with Cytoscape to display hypothetical networks that result from combining individual datasets, as well as a series of published attributes related to the genes or proteins in the network. NetR also allows users to import protein and genetic interaction data from InterMine, which can further enrich a network model based on curated information. We expect that NetR/AttR will allow experimental biologists to mine a largely unexploited wealth of data in their fields and facilitate their integration into hypothetical models to be tested experimentally.
8
The Gene Ontology Resource: 20 years and still GOing strong. Nucleic Acids Res 2019; 47:D330-D338. [PMID: 30395331 PMCID: PMC6323945 DOI: 10.1093/nar/gky1055] [Citation(s) in RCA: 2677] [Impact Index Per Article: 535.4] [Received: 09/22/2018] [Accepted: 10/17/2018] [Indexed: 02/06/2023] Open
Abstract
The Gene Ontology resource (GO; http://geneontology.org) provides structured, computable knowledge regarding the functions of genes and gene products. Founded in 1998, GO has become widely adopted in the life sciences, and its contents are under continual improvement, both in quantity and in quality. Here, we report the major developments of the GO resource during the past two years. Each monthly release of the GO resource is now packaged and given a unique identifier (DOI), enabling GO-based analyses on a specific release to be reproduced in the future. The molecular function ontology has been refactored to better represent the overall activities of gene products, with a focus on transcription regulator activities. Quality assurance efforts have been ramped up to address potentially out-of-date or inaccurate annotations. New evidence codes for high-throughput experiments now enable users to filter out annotations obtained from these sources. GO-CAM, a new framework for representing gene function that is more expressive than standard GO annotations, has been released, and users can now explore the growing repository of these models. We also provide the 'GO ribbon' widget for visualizing GO annotations to a gene; the widget can be easily embedded in any web page.
9
Watson-Haigh NS, Suchecki R, Kalashyan E, Garcia M, Baumann U. DAWN: a resource for yielding insights into the diversity among wheat genomes. BMC Genomics 2018; 19:941. [PMID: 30558550 PMCID: PMC6296097 DOI: 10.1186/s12864-018-5228-2] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 06/30/2018] [Accepted: 11/06/2018] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND: Democratising the growing body of whole genome sequencing data available for Triticum aestivum (bread wheat) has been impeded by the lack of a genome reference and the large computational requirements for analysing these data sets.
RESULTS: DAWN (Diversity Among Wheat geNomes) integrates data from the T. aestivum Chinese Spring (CS) IWGSC RefSeq v1.0 genome with public WGS and exome data from 17 and 62 accessions respectively, enabling researchers and breeders alike to investigate genotypic differences between wheat accessions at the level of whole chromosomes down to individual genes.
CONCLUSIONS: Using DAWN we show that it is possible to visualise small and large chromosomal deletions, identify haplotypes at a glance and spot the consequences of selective breeding. DAWN allows us to detect the break points of alien introgression segments brought into an accession when transferring desired genes. Furthermore, we can find possible explanations for reduced recombination in parts of a chromosome, we can predict regions with linkage drag, and also look at diversity in centromeric regions.
Affiliation(s)
- Nathan S. Watson-Haigh
  - School of Agriculture, Food and Wine, University of Adelaide, PMB 1, Glen Osmond, 5064 SA Australia
  - Bioinformatics Hub, School of Biological Sciences, University of Adelaide, Adelaide, SA 5005 Australia
- Radosław Suchecki
  - School of Agriculture, Food and Wine, University of Adelaide, PMB 1, Glen Osmond, 5064 SA Australia
  - CSIRO Agriculture and Food, Glen Osmond, Locked Bag 2, Adelaide, SA 5064 Australia
- Elena Kalashyan
  - School of Agriculture, Food and Wine, University of Adelaide, PMB 1, Glen Osmond, 5064 SA Australia
- Melissa Garcia
  - School of Agriculture, Food and Wine, University of Adelaide, PMB 1, Glen Osmond, 5064 SA Australia
- Ute Baumann
  - School of Agriculture, Food and Wine, University of Adelaide, PMB 1, Glen Osmond, 5064 SA Australia
10
Pascar J, Chandler CH. A bioinformatics approach to identifying Wolbachia infections in arthropods. PeerJ 2018; 6:e5486. [PMID: 30202647 PMCID: PMC6126470 DOI: 10.7717/peerj.5486] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Received: 09/14/2017] [Accepted: 07/30/2018] [Indexed: 11/20/2022] Open
Abstract
Wolbachia is the most widespread endosymbiont, infecting >20% of arthropod species, and capable of drastically manipulating the host's reproductive mechanisms. Conventionally, diagnosis has relied on PCR amplification; however, PCR is not always a reliable diagnostic technique due to primer specificity, strain diversity, degree of infection and/or tissue sampled. Here, we look for evidence of Wolbachia infection across a wide array of arthropod species using a bioinformatic approach to detect the Wolbachia genes ftsZ, wsp, and the groE operon in next-generation sequencing samples available through the NCBI Sequence Read Archive. For samples showing signs of infection, we attempted to assemble entire Wolbachia genomes, and in order to better understand the relationships between hosts and symbionts, phylogenies were constructed using the assembled gene sequences. Out of the 34 species with positively identified infections, eight species of arthropod had not previously been recorded to harbor Wolbachia infection. All putative infections cluster with known representative strains belonging to supergroup A or B, which are known to only infect arthropods. This study presents an efficient bioinformatic approach for post-sequencing diagnosis and analysis of Wolbachia infection in arthropods.
Affiliation(s)
- Jane Pascar
  - Department of Biological Sciences, State University of New York at Oswego, Oswego, NY, United States of America
  - Department of Biology, Syracuse University, Syracuse, NY, United States of America
- Christopher H. Chandler
  - Department of Biological Sciences, State University of New York at Oswego, Oswego, NY, United States of America
11
Wagholikar KB, Dessai P, Sanz J, Mendis ME, Bell DS, Murphy SN. Implementation of informatics for integrating biology and the bedside (i2b2) platform as Docker containers. BMC Med Inform Decis Mak 2018; 18:66. [PMID: 30012140 PMCID: PMC6048900 DOI: 10.1186/s12911-018-0646-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 10/01/2017] [Accepted: 06/27/2018] [Indexed: 12/03/2022] Open
Abstract
BACKGROUND: Informatics for Integrating Biology and the Bedside (i2b2) is an open source clinical data analytics platform used at over 200 healthcare institutions for querying patient data. The i2b2 platform has several components with numerous dependencies and configuration parameters, which renders the task of installing or upgrading i2b2 a challenging one. Even with the availability of extensive documentation and tutorials, new users often require several weeks to correctly install a functional i2b2 platform. The goal of this work is to simplify the installation and upgrade process for i2b2. Specifically, we have containerized the core components of the platform, and evaluated the containers for ease of installation.
RESULTS: We developed three Docker container images: WildFly, database, and web, to encapsulate the three major deployment components of i2b2. These containers isolate the core functionalities of the i2b2 platform, and work in unison to provide its functionalities. Our evaluations indicate that i2b2 containers function successfully on the Linux platform. Our results demonstrate that the containerized components work out-of-the-box, with minimal configuration.
CONCLUSIONS: Containerization offers the potential to package the i2b2 platform components into standalone executable packages that are agnostic to the underlying host operating system. By releasing i2b2 as a Docker container, we anticipate that users will be able to create a working i2b2 hive installation without the need to download, compile, and configure individual components that constitute the i2b2 cells, thus making this platform accessible to a greater number of institutions.
Affiliation(s)
- Pralav Dessai
  - University of California Los Angeles, Los Angeles, CA USA
- Javier Sanz
  - University of California Los Angeles, Los Angeles, CA USA
- Shawn N. Murphy
  - Massachusetts General Hospital, Boston, MA USA
  - Harvard Medical School, Boston, MA USA