1
Vallat B, Webb BM, Westbrook JD, Goddard TD, Hanke CA, Graziadei A, Peisach E, Zalevsky A, Sagendorf J, Tangmunarunkit H, Voinea S, Sekharan M, Yu J, Bonvin AAMJJ, DiMaio F, Hummer G, Meiler J, Tajkhorshid E, Ferrin TE, Lawson CL, Leitner A, Rappsilber J, Seidel CAM, Jeffries CM, Burley SK, Hoch JC, Kurisu G, Morris K, Patwardhan A, Velankar S, Schwede T, Trewhella J, Kesselman C, Berman HM, Sali A. IHMCIF: An Extension of the PDBx/mmCIF Data Standard for Integrative Structure Determination Methods. J Mol Biol 2024:168546. [PMID: 38508301 DOI: 10.1016/j.jmb.2024.168546] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Revised: 03/11/2024] [Accepted: 03/14/2024] [Indexed: 03/22/2024]
Abstract
IHMCIF (github.com/ihmwg/IHMCIF) is a data information framework that supports archiving and disseminating macromolecular structures determined by integrative or hybrid modeling (IHM), and making them Findable, Accessible, Interoperable, and Reusable (FAIR). IHMCIF is an extension of the Protein Data Bank Exchange/macromolecular Crystallographic Information Framework (PDBx/mmCIF) that serves as the framework for the Protein Data Bank (PDB) to archive experimentally determined atomic structures of biological macromolecules and their complexes with one another and small molecule ligands (e.g., enzyme cofactors and drugs). IHMCIF serves as the foundational data standard for the PDB-Dev prototype system, developed for archiving and disseminating integrative structures. It utilizes a flexible data representation to describe integrative structures that span multiple spatiotemporal scales and structural states with definitions for restraints from a variety of experimental methods contributing to integrative structural biology. The IHMCIF extension was created with the benefit of considerable community input and recommendations gathered by the Worldwide Protein Data Bank (wwPDB) Task Force for Integrative or Hybrid Methods (wwpdb.org/task/hybrid). Herein, we describe the development of IHMCIF to support evolving methodologies and ongoing advancements in integrative structural biology. Ultimately, IHMCIF will facilitate the unification of PDB-Dev data and tools with the PDB archive so that integrative structures can be archived and disseminated through PDB.
Affiliation(s)
- Brinda Vallat
- Research Collaboratory for Structural Bioinformatics Protein Data Bank and the Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA.
- Benjamin M Webb
- Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, the Quantitative Biosciences Institute (QBI), and the Research Collaboratory for Structural Bioinformatics Protein Data Bank, University of California, San Francisco, San Francisco, CA 94157, USA
- John D Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank and the Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
- Thomas D Goddard
- Department of Pharmaceutical Chemistry, University of California, San Francisco, CA 94158, USA
- Christian A Hanke
- Molecular Physical Chemistry, Heinrich Heine University Düsseldorf, 40225 Düsseldorf, Germany
- Andrea Graziadei
- Bioanalytics, Institute of Biotechnology, Technische Universität Berlin, 10623 Berlin, Germany; Human Technopole, 20157 Milan, Italy
- Ezra Peisach
- Research Collaboratory for Structural Bioinformatics Protein Data Bank and the Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Arthur Zalevsky
- Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, the Quantitative Biosciences Institute (QBI), and the Research Collaboratory for Structural Bioinformatics Protein Data Bank, University of California, San Francisco, San Francisco, CA 94157, USA
- Jared Sagendorf
- Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, the Quantitative Biosciences Institute (QBI), and the Research Collaboratory for Structural Bioinformatics Protein Data Bank, University of California, San Francisco, San Francisco, CA 94157, USA
- Hongsuda Tangmunarunkit
- Information Sciences Institute, Viterbi School of Engineering, University of Southern California, Los Angeles, CA, USA
- Serban Voinea
- Information Sciences Institute, Viterbi School of Engineering, University of Southern California, Los Angeles, CA, USA
- Monica Sekharan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank and the Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Jian Yu
- Protein Data Bank Japan, Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan
- Alexander A M J J Bonvin
- Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, 3584 CH Utrecht, the Netherlands
- Frank DiMaio
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
- Gerhard Hummer
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, 60438 Frankfurt am Main, Germany; Institute for Biophysics, Goethe University Frankfurt, 60438 Frankfurt am Main, Germany
- Jens Meiler
- Center for Structural Biology, Vanderbilt University, 465 21st Avenue South, Nashville, TN 37221, USA; Institute for Drug Discovery, Leipzig University Medical School, 04103 Leipzig, Germany
- Emad Tajkhorshid
- NIH Resource for Macromolecular Modeling and Visualization, Beckman Institute for Advanced Science and Technology, Department of Biochemistry, and Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Thomas E Ferrin
- Department of Pharmaceutical Chemistry, University of California, San Francisco, CA 94158, USA
- Catherine L Lawson
- Research Collaboratory for Structural Bioinformatics Protein Data Bank and the Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Alexander Leitner
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, 8093 Zurich, Switzerland
- Juri Rappsilber
- Bioanalytics, Institute of Biotechnology, Technische Universität Berlin, 10623 Berlin, Germany; Wellcome Centre for Cell Biology, University of Edinburgh, Max Born Crescent, Edinburgh EH9 3BF, UK
- Claus A M Seidel
- Molecular Physical Chemistry, Heinrich Heine University Düsseldorf, 40225 Düsseldorf, Germany
- Cy M Jeffries
- European Molecular Biology Laboratory (EMBL), Hamburg Unit, c/o Deutsches Elektronen-Synchrotron (DESY), Notkestrasse 85, 22607 Hamburg, Germany
- Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank and the Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Jeffrey C Hoch
- Biological Magnetic Resonance Data Bank, Department of Molecular Biology and Biophysics, University of Connecticut, Farmington, CT 06030-3305, USA
- Genji Kurisu
- Protein Data Bank Japan, Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan
- Kyle Morris
- Electron Microscopy Data Bank, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
- Ardan Patwardhan
- Electron Microscopy Data Bank, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
- Sameer Velankar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK
- Torsten Schwede
- Biozentrum, University of Basel, Basel, Switzerland; Computational Structural Biology & SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Jill Trewhella
- School of Life and Environmental Sciences, The University of Sydney, Sydney, NSW 2006, Australia; Department of Chemistry, University of Utah, Salt Lake City, UT 84112, USA
- Carl Kesselman
- Information Sciences Institute, Viterbi School of Engineering, University of Southern California, Los Angeles, CA, USA
- Helen M Berman
- Research Collaboratory for Structural Bioinformatics Protein Data Bank and the Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Quantitative and Computational Biology, University of Southern California, Los Angeles CA 90089, USA
- Andrej Sali
- Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, the Quantitative Biosciences Institute (QBI), and the Research Collaboratory for Structural Bioinformatics Protein Data Bank, University of California, San Francisco, San Francisco, CA 94157, USA
2
Charbonneau AL, Brady A, Czajkowski K, Aluvathingal J, Canchi S, Carter R, Chard K, Clarke DJB, Crabtree J, Creasy HH, D'Arcy M, Felix V, Giglio M, Gingrich A, Harris RM, Hodges TK, Ifeonu O, Jeon M, Kropiwnicki E, Lim MCW, Liming RL, Lumian J, Mahurkar AA, Mandal M, Munro JB, Nadendla S, Richter R, Romano C, Rocca-Serra P, Schor M, Schuler RE, Tangmunarunkit H, Waldrop A, Williams C, Word K, Sansone SA, Ma'ayan A, Wagner R, Foster I, Kesselman C, Brown CT, White O. Making Common Fund data more findable: catalyzing a data ecosystem. Gigascience 2022; 11:6835135. [PMID: 36409836 PMCID: PMC9677336 DOI: 10.1093/gigascience/giac105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2022] [Revised: 09/16/2022] [Accepted: 10/10/2022] [Indexed: 11/22/2022]
Abstract
The Common Fund Data Ecosystem (CFDE) has created a flexible system of data federation that enables researchers to discover datasets from across the US National Institutes of Health Common Fund without requiring that data owners move, reformat, or rehost those data. This system is centered on a catalog that integrates detailed descriptions of biomedical datasets from individual Common Fund Programs' Data Coordination Centers (DCCs) into a uniform metadata model that can then be indexed and searched from a centralized portal. This Crosscut Metadata Model (C2M2) supports the wide variety of data types and metadata terms used by individual DCCs and can readily describe nearly all forms of biomedical research data. We detail its use to ingest and index data from 11 DCCs.
Affiliation(s)
- Arthur Brady
- University of Maryland Institute for Genome Sciences, University of Maryland School of Medicine, MD 21201, USA
- Karl Czajkowski
- University of Southern California Information Sciences Institute, CA 90292, USA
- Jain Aluvathingal
- University of Maryland Institute for Genome Sciences, University of Maryland School of Medicine, MD 21201, USA
- Saranya Canchi
- Population Health and Reproduction, UC Davis, Davis, CA 95616, USA
- Robert Carter
- University of Maryland Institute for Genome Sciences, University of Maryland School of Medicine, MD 21201, USA
- Kyle Chard
- Division of Decision and Information Sciences, University of Chicago and Argonne National Laboratory, Chicago, IL 60637, USA
- Daniel J B Clarke
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Jonathan Crabtree
- University of Maryland Institute for Genome Sciences, University of Maryland School of Medicine, MD 21201, USA
- Heather H Creasy
- University of Maryland Institute for Genome Sciences, University of Maryland School of Medicine, MD 21201, USA
- Mike D'Arcy
- University of Southern California Information Sciences Institute, CA 90292, USA
- Victor Felix
- University of Maryland Institute for Genome Sciences, University of Maryland School of Medicine, MD 21201, USA
- Michelle Giglio
- University of Maryland Institute for Genome Sciences, University of Maryland School of Medicine, MD 21201, USA
- Rayna M Harris
- Population Health and Reproduction, UC Davis, Davis, CA 95616, USA
- Theresa K Hodges
- University of Maryland Institute for Genome Sciences, University of Maryland School of Medicine, MD 21201, USA
- Olukemi Ifeonu
- University of Maryland Institute for Genome Sciences, University of Maryland School of Medicine, MD 21201, USA
- Minji Jeon
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Eryk Kropiwnicki
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Marisa C W Lim
- Population Health and Reproduction, UC Davis, Davis, CA 95616, USA
- R Lee Liming
- Division of Decision and Information Sciences, University of Chicago and Argonne National Laboratory, Chicago, IL 60637, USA
- Jessica Lumian
- Population Health and Reproduction, UC Davis, Davis, CA 95616, USA
- Anup A Mahurkar
- University of Maryland Institute for Genome Sciences, University of Maryland School of Medicine, MD 21201, USA
- Meisha Mandal
- RTI International, Research Triangle Park 27709-2194, USA
- James B Munro
- University of Maryland Institute for Genome Sciences, University of Maryland School of Medicine, MD 21201, USA
- Suvarna Nadendla
- University of Maryland Institute for Genome Sciences, University of Maryland School of Medicine, MD 21201, USA
- Rudyard Richter
- Division of Decision and Information Sciences, University of Chicago and Argonne National Laboratory, Chicago, IL 60637, USA
- Cia Romano
- University of Southern California Information Sciences Institute, CA 90292, USA; Interface Guru, Tucson 85701, USA
- Philippe Rocca-Serra
- Oxford e-Research Centre, Department of Engineering Science, University of Oxford, Oxford OX1 3QG, UK
- Michael Schor
- University of Maryland Institute for Genome Sciences, University of Maryland School of Medicine, MD 21201, USA
- Robert E Schuler
- University of Southern California Information Sciences Institute, CA 90292, USA
- Alex Waldrop
- RTI International, Research Triangle Park 27709-2194, USA
- Cris Williams
- University of Southern California Information Sciences Institute, CA 90292, USA
- Susanna-Assunta Sansone
- Oxford e-Research Centre, Department of Engineering Science, University of Oxford, Oxford OX1 3QG, UK
- Avi Ma'ayan
- Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Rick Wagner
- University of California San Diego, San Diego, CA 92093, USA
- Ian Foster
- Division of Decision and Information Sciences, University of Chicago and Argonne National Laboratory, Chicago, IL 60637, USA
- Carl Kesselman
- University of Southern California Information Sciences Institute, CA 90292, USA
- C Titus Brown
- Population Health and Reproduction, UC Davis, Davis, CA 95616, USA
- Owen White
- University of Maryland Institute for Genome Sciences, University of Maryland School of Medicine, MD 21201, USA
3
Vallat B, Webb B, Fayazi M, Voinea S, Tangmunarunkit H, Ganesan SJ, Lawson CL, Westbrook JD, Kesselman C, Sali A, Berman HM. New system for archiving integrative structures. Acta Crystallogr D Struct Biol 2021; 77:1486-1496. [PMID: 34866606 PMCID: PMC8647179 DOI: 10.1107/s2059798321010871] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 07/23/2021] [Accepted: 10/19/2021] [Indexed: 11/30/2022]
Abstract
Structures of many complex biological assemblies are increasingly determined using integrative approaches, in which data from multiple experimental methods are combined. A standalone system, called PDB-Dev, has been developed for archiving integrative structures and making them publicly available. Here, the data standards and software tools that support PDB-Dev are described along with the new and updated components of the PDB-Dev data-collection, processing and archiving infrastructure. Following the FAIR (Findable, Accessible, Interoperable and Reusable) principles, PDB-Dev ensures that the results of integrative structure determinations are freely accessible to everyone.
Affiliation(s)
- Brinda Vallat
- RCSB PDB, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Benjamin Webb
- Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, and California Institute for Quantitative Biosciences, University of California at San Francisco, San Francisco, California, USA
- Maryam Fayazi
- RCSB PDB, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Serban Voinea
- Information Sciences Institute, Viterbi School of Engineering, University of Southern California, Los Angeles, California, USA
- Hongsuda Tangmunarunkit
- Information Sciences Institute, Viterbi School of Engineering, University of Southern California, Los Angeles, California, USA
- Sai J. Ganesan
- Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, and California Institute for Quantitative Biosciences, University of California at San Francisco, San Francisco, California, USA
- Catherine L. Lawson
- RCSB PDB, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- John D. Westbrook
- RCSB PDB, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Carl Kesselman
- RCSB PDB, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Andrej Sali
- Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, and California Institute for Quantitative Biosciences, University of California at San Francisco, San Francisco, California, USA
- Helen M. Berman
- Department of Chemistry and Chemical Biology and Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
4
Vallat B, Webb B, Westbrook J, Tangmunarunkit H, Voinea S, Kesselman C, Sali A, Berman HM. Archiving Integrative Structural Models. Biophys J 2021. [DOI: 10.1016/j.bpj.2020.11.1702] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
5
Abstract
Database evolution is a notoriously difficult task, and it is exacerbated by the necessity to evolve database-dependent applications. As science becomes increasingly dependent on sophisticated data management, the need to evolve an array of database-driven systems will only intensify. In this paper, we present an architecture for data-centric ecosystems that allows the components to seamlessly co-evolve by centralizing the models and mappings at the data service and pushing model-adaptive interactions to the database clients. Boundary objects fill the gap where applications are unable to adapt and need a stable interface to interact with the components of the ecosystem. Finally, evolution of the ecosystem is enabled via integrated schema modification and model management operations. We present use cases from actual experiences that demonstrate the utility of our approach.
Affiliation(s)
- Robert Schuler
- USC Information Sciences Institute, Marina del Rey, California
- Karl Czajkowski
- USC Information Sciences Institute, Marina del Rey, California
- Mike D'Arcy
- USC Information Sciences Institute, Marina del Rey, California
- Carl Kesselman
- USC Information Sciences Institute, Marina del Rey, California
6
Bugacov A, Czajkowski K, Kesselman C, Kumar A, Schuler RE, Tangmunarunkit H. Experiences with Deriva: An Asset Management Platform for Accelerating eScience. Proc IEEE Int Conf Escience 2017; 2017:79-88. [PMID: 29756001 DOI: 10.1109/escience.2017.20] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 11/08/2022]
Abstract
The pace of discovery in eScience is increasingly dependent on a scientist's ability to acquire, curate, integrate, analyze, and share large and diverse collections of data. It is all too common for investigators to spend inordinate amounts of time developing ad hoc procedures to manage their data. In previous work, we presented Deriva, a Scientific Asset Management System, designed to accelerate data driven discovery. In this paper, we report on the use of Deriva in a number of substantial and diverse eScience applications. We describe the lessons we have learned, both from the perspective of the Deriva technology, as well as the ability and willingness of scientists to incorporate Scientific Asset Management into their daily workflows.
Affiliation(s)
- Alejandro Bugacov
- Information Sciences Institute, Viterbi School of Engineering, University of Southern California, Marina del Rey, CA 90292
- Karl Czajkowski
- Information Sciences Institute, Viterbi School of Engineering, University of Southern California, Marina del Rey, CA 90292
- Carl Kesselman
- Information Sciences Institute, Viterbi School of Engineering, University of Southern California, Marina del Rey, CA 90292
- Anoop Kumar
- Information Sciences Institute, Viterbi School of Engineering, University of Southern California, Marina del Rey, CA 90292
- Robert E Schuler
- Information Sciences Institute, Viterbi School of Engineering, University of Southern California, Marina del Rey, CA 90292
- Hongsuda Tangmunarunkit
- Information Sciences Institute, Viterbi School of Engineering, University of Southern California, Marina del Rey, CA 90292
7
Tangmunarunkit H, Hsieh CK, Longstaff B, Nolen S, Jenkins J, Ketcham C, Selsky J, Alquaddoomi F, George D, Kang J, Khalapyan Z, Ooms J, Ramanathan N, Estrin D. Ohmage. ACM T INTEL SYST TEC 2015. [DOI: 10.1145/2717318] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Indexed: 10/23/2022]
Abstract
Participatory sensing (PS) is a distributed data collection and analysis approach where individuals, acting alone or in groups, use their personal mobile devices to systematically explore interesting aspects of their lives and communities [Burke et al. 2006]. These mobile devices can be used to capture diverse spatiotemporal data through both intermittent self-report and continuous recording from on-board sensors and applications.
Ohmage (http://ohmage.org) is a modular and extensible open-source, mobile to Web PS platform that records, stores, analyzes, and visualizes data from both prompted self-report and continuous data streams. These data streams are authorable and can dynamically be deployed in diverse settings. Feedback from hundreds of behavioral and technology researchers, focus group participants, and end users has been integrated into ohmage through an iterative participatory design process. Ohmage has been used as an enabling platform in more than 20 independent projects in many disciplines. We summarize the PS requirements, challenges and key design objectives learned through our design process, and ohmage system architecture to achieve those objectives. The flexibility, modularity, and extensibility of ohmage in supporting diverse deployment settings are presented through three distinct case studies in education, health, and clinical research.
8
Abstract
Following the long-held belief that the Internet is hierarchical, the network topology generators most widely used by the Internet research community, Transit-Stub and Tiers, create networks with a deliberately hierarchical structure. However, in 1999 a seminal paper by Faloutsos et al. revealed that the Internet's degree distribution is a power-law. Because the degree distributions produced by the Transit-Stub and Tiers generators are not power-laws, the research community has largely dismissed them as inadequate and proposed new network generators that attempt to generate graphs with power-law degree distributions.
Contrary to much of the current literature on network topology generators, this paper starts with the assumption that it is more important for network generators to accurately model the large-scale structure of the Internet (such as its hierarchical structure) than to faithfully imitate its local properties (such as the degree distribution). The purpose of this paper is to determine, using various topology metrics, which network generators better represent this large-scale structure. We find, much to our surprise, that network generators based on the degree distribution more accurately capture the large-scale structure of measured topologies. We then seek an explanation for this result by examining the nature of hierarchy in the Internet more closely; we find that degree-based generators produce a form of hierarchy that closely resembles the loosely hierarchical nature of the Internet.
9
Tangmunarunkit H, Govindan R, Jamin S, Shenker S, Willinger W. Network topologies, power laws, and hierarchy. SIGCOMM Comput Commun Rev 2002. [DOI: 10.1145/510726.510750] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.4] [Indexed: 10/26/2022]
10
Tangmunarunkit H, Doyle J, Govindan R, Willinger W, Jamin S, Shenker S. Does AS size determine degree in AS topology? SIGCOMM Comput Commun Rev 2001. [DOI: 10.1145/1037107.1037108] [Citation(s) in RCA: 21] [Impact Index Per Article: 0.9] [Indexed: 10/25/2022]
Abstract
In a recent and much celebrated paper, Faloutsos et al. [6] found that the inter Autonomous System (AS) topology exhibits a power-law degree distribution. This result was quite unexpected in the networking community, and stirred significant interest in exploring the possible causes of this phenomenon. The work of Barabasi et al. [2], and its application to network topology generation in the work of Medina et al. [9], have explored a promising class of models that yield strict power-law degree distributions. These models, which we will refer to collectively as the B-A model, describe the detailed dynamics of the network growth process, modeling the way in which connections are made between ASs. There are two simple connectivity rules that define the evolution of AS connectivity over time: incremental growth where a new AS connects to existing ASs, and preferential connectivity where the likelihood of connecting to an AS is proportional to the vertex outdegree of the target AS. These simple rules, which are similar to the classical "rich get richer" model originally proposed by Simon [12], lead to power-law degree distributions.
While the B-A model provably yields power-law vertex degree distributions, recent empirical evidence indicates that the model may not be consistent with the dynamics underlying the evolution of the actual AS topology. First, there is strong evidence [3, 4] that the degree distribution of the actual AS topology does not conform to a strict power law. However, the distribution is certainly heavy-tailed or highly-variable in the sense that the observed vertex degrees typically range over three or four orders of magnitude; in some cases, the tail of the degree distribution may fit a power law. These observations were gleaned from more complete pictures of AS-level connectivity (obtained by augmenting BGP route tables with peering relationships from other sources) than those used by earlier work [2, 6, 9]. Second, the B-A model's AS connectivity evolution rules can be shown to be inconsistent with empirical AS growth measurements [16]. As such, while the B-A model appears to produce topologies whose degree distribution characteristics exhibit power-law behavior, it cannot be a valid explanation for the connectivity evolution in the AS topology.
Clearly, some of these empirical observations don't corroborate the claim that the B-A model explains the phenomenon of highly variable vertex degrees in the Internet's AS topology [2]. However, the B-A model was originally proposed as a simple illustration of how some elementary mechanisms or rules can give rise to power law vertex degree distributions. As such, it is likely that the B-A model can be modified to accommodate these more recent findings [1], but we will neither discuss here such modifications nor comment on their possibility for success. Instead, we merely note that any such resulting model would seek, as does the original B-A model, to explain the highly variable degree distribution of the AS topology through the detailed dynamics of how connections between ASs are established.
The purpose of this note is to raise the question, motivated by the B-A approach, of whether the underlying cause of the high variability phenomenon of the vertex degree distribution lies in the detailed dynamics of network growth, or if there are alternative explanations. To that end, we briefly outline an alternative explanation for the AS topology degree distribution. We do not claim to have proven that this explanation holds; our purpose here is merely to expand the dialog to a larger class of explanations for the variability of the AS topology degree distribution.
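As an illustration only, the two B-A connectivity rules named in this abstract (incremental growth and preferential connectivity) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the procedure of any cited paper: the function names `grow_preferential` and `degree_counts`, the fully connected starting core, the fixed number of links per new node, and the use of undirected total degree in place of outdegree are all hypothetical choices made for the example.

```python
import random
from collections import Counter


def grow_preferential(n_nodes, links_per_node=2, seed=0):
    """Grow an undirected graph; return its edge list.

    Assumption: total degree stands in for the "vertex outdegree"
    mentioned in the abstract.
    """
    rng = random.Random(seed)
    # Hypothetical starting point: a fully connected core.
    core = links_per_node + 1
    edges = [(i, j) for i in range(core) for j in range(i)]
    # Every node appears here once per incident edge, so a uniform draw
    # selects a node with probability proportional to its degree.
    endpoints = [v for edge in edges for v in edge]
    for new in range(core, n_nodes):          # incremental growth
        targets = set()
        while len(targets) < links_per_node:  # preferential connectivity
            targets.add(rng.choice(endpoints))
        for target in targets:
            edges.append((new, target))
            endpoints.extend((new, target))
    return edges


def degree_counts(edges):
    """Map each node to its degree."""
    return Counter(v for edge in edges for v in edge)
```

Calling `degree_counts(grow_preferential(1000))` and tabulating how many nodes share each degree gives a rough feel for how skewed the resulting distribution is; the sketch makes no claim about how well it matches measured AS topologies.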
11
Abstract
One of the many benefits of multicast, when compared to traditional unicast, is that multicast reduces the overall network load. While the importance of multicast is beyond dispute, there have been surprisingly few attempts to quantify multicast's reduction in overall network load. The only substantial and quantitative effort we are aware of is that of Chuang and Sirbu [3]. They calculate the number of links L in a multicast delivery tree connecting a random source to m random and distinct network sites; extensive simulations over a range of networks suggest that L(m) ∝ m^0.8. In this paper we examine the function L(m) in more detail and derive the asymptotic form for L(m) in k-ary trees. These results suggest one possible explanation for the universality of the Chuang-Sirbu scaling behavior.
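To make the quantity L(m) concrete, the following minimal Python sketch counts the links of a multicast delivery tree in a complete k-ary tree. It rests on simplifying assumptions that are not taken from the paper: the source sits at the root, the m receivers are distinct leaves chosen uniformly at random, and the names `multicast_links` and `mean_links` are hypothetical.

```python
import random


def multicast_links(k, depth, m, rng):
    """Count distinct links joining the root to m distinct random leaves."""
    leaves = rng.sample(range(k ** depth), m)
    used = set()
    for node in leaves:
        # Walk from the leaf up to the root; a link is identified by the
        # (level, index) of its lower endpoint.
        for level in range(depth, 0, -1):
            used.add((level, node))
            node //= k
    return len(used)


def mean_links(k, depth, m, trials=200, seed=0):
    """Average L(m) over independent random receiver sets."""
    rng = random.Random(seed)
    return sum(multicast_links(k, depth, m, rng) for _ in range(trials)) / trials
```

Plotting `mean_links(k, depth, m)` against m on log-log axes is one way to compare the growth of L(m) in such a tree with the unicast cost of m times depth; the sketch does not assert any particular exponent.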
Affiliation(s)
- Graham Phillips
- USC/Information Sciences Institute, 4676 Admiralty Way, Suite 1001, Marina del Rey, CA
- Scott Shenker
- International Computer Science Institute, 1947 Center Street, Berkeley, CA