1
Rokem A, Benson NC. Hands-On Neuroinformatics Education at the Crossroads of Online and In-Person: Lessons Learned from NeuroHackademy. Neuroinformatics 2024:10.1007/s12021-024-09666-6. [PMID: 38763989 DOI: 10.1007/s12021-024-09666-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/27/2024] [Indexed: 05/21/2024]
Abstract
NeuroHackademy ( https://neurohackademy.org ) is a two-week event designed to train early-career neuroscience researchers in data science methods and their application to neuroimaging. The event seeks to bridge the big data skills gap by introducing participants to data science methods and skills that are often ignored in traditional curricula. Such skills are needed for the analysis and interpretation of the kinds of large and complex datasets that have become increasingly important to neuroimaging research due to concerted data collection efforts. In 2020, the event rapidly pivoted from an in-person event to an online event that included hundreds of participants from all over the world. This experience and those of the participants substantially changed our valuation of large online-accessible events. In subsequent events held in 2022 and 2023, we have developed a "hybrid" format that includes both online and in-person participants. We discuss the technical and sociotechnical elements of hybrid events and discuss some of the lessons we have learned while organizing them. We emphasize in particular the role that these events can play in creating a global and inclusive community of practice in the intersection of neuroimaging and data science.
Affiliation(s)
- Ariel Rokem
- Department of Psychology, University of Washington, 119 Guthrie Hall, Seattle, 98195, Washington, USA.
- eScience Institute, University of Washington, 3910 15th Ave NE, Seattle, 98195, Washington, USA.
- Noah C Benson
- eScience Institute, University of Washington, 3910 15th Ave NE, Seattle, 98195, Washington, USA.
2
Heller B, Amir A, Waxman R, Maaravi Y. Hack your organizational innovation: literature review and integrative model for running hackathons. Journal of Innovation and Entrepreneurship 2023; 12:6. [PMID: 36883168 PMCID: PMC9983543 DOI: 10.1186/s13731-023-00269-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2021] [Accepted: 02/17/2023] [Indexed: 06/18/2023]
Abstract
This article aims to offer a comprehensive overview of the existing literature on the hackathon phenomenon to offer scholars a common ground for future research and managers and practitioners research-based guidelines on best planning and running a hackathon. A review of the most relevant literature on hackathons was conducted to serve as the research basis for our integrative model and guidelines. This article synthesizes the research on hackathons to offer comprehensible guidelines for practitioners while also providing questions for future hackathon researchers. We differentiate between the different design characteristics of hackathons while noting their advantages and disadvantages, discuss tools and methodologies for successful hackathon setup and execution step-by-step, and provide recommendations to encourage project continuity.
Affiliation(s)
- Ben Heller
- Baruch Ivcher School of Psychology, Reichman University (IDC), Herzliya, Israel
- Atar Amir
- The Adelson School of Entrepreneurship, Reichman University (IDC), Herzliya, Israel
- Roy Waxman
- The Adelson School of Entrepreneurship, Reichman University (IDC), Herzliya, Israel
- Yossi Maaravi
- The Adelson School of Entrepreneurship, Reichman University (IDC), Herzliya, Israel
3
Mahmoud ASI, Dey T, Nolte A, Mockus A, Herbsleb JD. One-off events? An empirical study of hackathon code creation and reuse. Empirical Software Engineering 2022; 27:167. [PMID: 36159898 PMCID: PMC9489595 DOI: 10.1007/s10664-022-10201-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/02/2022] [Indexed: 06/16/2023]
Abstract
CONTEXT: Hackathons have become popular events for teams to collaborate on projects and develop software prototypes. Most existing research focuses on activities during an event, with limited attention to the evolution of the hackathon code. OBJECTIVE: We aim to understand the evolution of code used in and created during hackathon events, with a particular focus on the code blobs: specifically, how frequently hackathon teams reuse pre-existing code, how much new code they develop, whether that code gets reused afterwards, and what factors affect reuse. METHOD: We collected information about 22,183 hackathon projects from Devpost and obtained related code blobs, authors, project characteristics, original author, code creation time, language, and size information from World of Code. We tracked the reuse of code blobs by identifying all commits containing blobs created during hackathons and identifying all projects that contain those commits. To gain a deeper understanding of hackathon code evolution, we also conducted a series of surveys that we sent to hackathon participants whose code was reused, participants whose code was not reused, and developers who reused some hackathon code. RESULT: 9.14% of the code blobs in hackathon repositories and 8% of the lines of code (LOC) are created during hackathons, and around a third of the hackathon code gets reused in other projects by both blob count and LOC. The number of associated technologies and the number of participants in hackathons increase reuse probability. CONCLUSION: The results of our study demonstrate that hackathons are not always "one-off" events, as common knowledge dictates, and can serve as a starting point for further studies in this area.
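The reuse-tracking step described in the METHOD above (blobs to commits, commits to projects) can be sketched as follows. This is a minimal illustration on toy data, not the authors' pipeline: the function name, the dictionary layout, and the identifiers `b1`, `c1`, `hack`, and `other` are all hypothetical, and no World of Code or Devpost API is used.

```python
from collections import defaultdict

def find_reused_blobs(hackathon_blobs, commit_blobs, project_commits):
    """Map each hackathon blob to the other projects that contain it.

    hackathon_blobs: blob id -> project where the blob was created
    commit_blobs:    commit id -> set of blob ids in that commit
    project_commits: project -> set of commit ids in that project
    (All three structures are assumptions made for this sketch.)
    """
    # Step 1: find every project that contains each blob, via its commits.
    blob_projects = defaultdict(set)
    for project, commits in project_commits.items():
        for commit in commits:
            for blob in commit_blobs.get(commit, ()):
                blob_projects[blob].add(project)

    # Step 2: a blob counts as reused if it appears outside its origin project.
    reused = {}
    for blob, origin in hackathon_blobs.items():
        others = blob_projects[blob] - {origin}
        if others:
            reused[blob] = others
    return reused

# Toy data: b1 is copied into another project, b2 is not.
hackathon_blobs = {"b1": "hack", "b2": "hack"}
commit_blobs = {"c1": {"b1", "b2"}, "c2": {"b1"}}
project_commits = {"hack": {"c1"}, "other": {"c2"}}

reused = find_reused_blobs(hackathon_blobs, commit_blobs, project_commits)
reuse_rate = len(reused) / len(hackathon_blobs)
print(reused, reuse_rate)
```

Dividing the number of reused blobs by the number of hackathon blobs gives a blob-count reuse rate of the kind the RESULT section reports; a LOC-based rate would additionally need each blob's size.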
Affiliation(s)
- Tapajit Dey
- Lero—the Irish Software Research Centre, University of Limerick, Limerick, Ireland
- Alexander Nolte
- University of Tartu, Tartu, Estonia
- Carnegie Mellon University, Pittsburgh, PA, USA
4
Schulten C, Nolte A, Spikol D, Chounta IA. How do participants collaborate during an online hackathon? An empirical, quantitative study of communication traces. Frontiers in Computer Science 2022. [DOI: 10.3389/fcomp.2022.983164] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
Abstract
Starting as niche programming events, hackathons have since become a popular form of collaboration. Events are organized in various domains across the globe, aiming to foster innovation and learning, create and expand communities and tackle civic and environmental issues. While research around such events has grown in recent years, most studies are based on observations of a few individuals during an event and on post-hoc interviews during which participants report their experiences. Such studies are helpful but somewhat limited in that they do not allow us to study how individuals communicate at scale using technology. To address this gap, we conducted an archival analysis of communication traces of teams during a 48-h event. Our findings indicate that teams scaffold their communication around the design of an event, influenced by milestones set by the organizers. The officially selected communication platform's main use was to organize the event and the teams and to facilitate contact between participants and hackathon officials. We further investigated the balance of intra-team communication on the given platform and the potential use of additional communication tools.
5
Roche DG, Raby GD, Norin T, Ern R, Scheuffele H, Skeeles M, Morgan R, Andreassen AH, Clements JC, Louissaint S, Jutfelt F, Clark TD, Binning SA. Paths towards greater consensus building in experimental biology. J Exp Biol 2022; 225:274263. [PMID: 35258604 DOI: 10.1242/jeb.243559] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Indexed: 01/10/2023]
Abstract
In a recent editorial, the Editors-in-Chief of Journal of Experimental Biology argued that consensus building, data sharing, and better integration across disciplines are needed to address the urgent scientific challenges posed by climate change. We agree and expand on the importance of cross-disciplinary integration and transparency to improve consensus building and advance climate change research in experimental biology. We investigated reproducible research practices in experimental biology through a review of open data and analysis code associated with empirical studies on three debated paradigms and for unrelated studies published in leading journals in comparative physiology and behavioural ecology over the last 10 years. Nineteen per cent of studies on the three paradigms had open data, and 3.2% had open code. Similarly, 12.1% of studies in the journals we examined had open data, and 3.1% had open code. Previous research indicates that only 50% of shared datasets are complete and re-usable, suggesting that fewer than 10% of studies in experimental biology have usable open data. Encouragingly, our results indicate that reproducible research practices are increasing over time, with data sharing rates in some journals reaching 75% in recent years. Rigorous empirical research in experimental biology is key to understanding the mechanisms by which climate change affects organisms, and ultimately promotes evidence-based conservation policy and practice. We argue that a greater adoption of open science practices, with a particular focus on FAIR (Findable, Accessible, Interoperable, Re-usable) data and code, represents a much-needed paradigm shift towards improved transparency, cross-disciplinary integration, and consensus building to maximize the contributions of experimental biologists in addressing the impacts of environmental change on living organisms.
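The "fewer than 10%" estimate in the abstract above follows from multiplying each reported open-data rate by the roughly 50% of shared datasets said to be complete and re-usable. A minimal sketch of that arithmetic, with function and variable names chosen here purely for illustration:

```python
def usable_open_data_rate(open_data_rate: float, usable_fraction: float = 0.5) -> float:
    """Estimated share of studies whose open data is complete and re-usable."""
    return open_data_rate * usable_fraction

# Rates quoted in the abstract: 19% (three paradigms), 12.1% (journal sample).
paradigm_rate = usable_open_data_rate(0.19)
journal_rate = usable_open_data_rate(0.121)

print(f"three paradigms: {paradigm_rate:.2%}")
print(f"journal sample:  {journal_rate:.2%}")
```

Both products fall below 10%, consistent with the abstract's claim, under the assumption that the 50% completeness figure applies equally to both samples.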
Affiliation(s)
- Dominique G Roche
- Canadian Centre for Evidence-Based Conservation, Department of Biology and Institute of Environmental and Interdisciplinary Science, Carleton University, Ottawa, ON, Canada, K1S 5B6; Institut de Biologie, Université de Neuchâtel, 2000 Neuchâtel, Switzerland
- Graham D Raby
- Department of Biology, Trent University, Peterborough, ON, Canada, K9L 0G2
- Tommy Norin
- DTU Aqua: National Institute of Aquatic Resources, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark
- Rasmus Ern
- Department of Biology, Norwegian University of Science and Technology, 7491 Trondheim, Norway
- Hanna Scheuffele
- School of Life and Environmental Sciences, Deakin University, Geelong, VIC 3216, Australia
- Michael Skeeles
- School of Life and Environmental Sciences, Deakin University, Geelong, VIC 3216, Australia
- Rachael Morgan
- Institute of Biodiversity, Animal Health & Comparative Medicine, University of Glasgow, Glasgow G12 8QQ, UK; Department of Biological Sciences, University of Bergen, 5020 Bergen, Norway
- Anna H Andreassen
- Department of Biology, Norwegian University of Science and Technology, 7491 Trondheim, Norway
- Jeff C Clements
- Aquaculture and Coastal Ecosystems, Fisheries and Oceans Canada Gulf Region, Moncton, NB, Canada, E1C 9B6
- Sarahdghyn Louissaint
- Département de Sciences Biologiques, Université de Montréal, Montréal, QC, Canada, H2V 0B3
- Fredrik Jutfelt
- Department of Biology, Norwegian University of Science and Technology, 7491 Trondheim, Norway
- Timothy D Clark
- School of Life and Environmental Sciences, Deakin University, Geelong, VIC 3216, Australia
- Sandra A Binning
- Département de Sciences Biologiques, Université de Montréal, Montréal, QC, Canada, H2V 0B3
6
Vizcarra JC, Burlingame EA, Hug CB, Goltsev Y, White BS, Tyson DR, Sokolov A. A community-based approach to image analysis of cells, tissues and tumors. Comput Med Imaging Graph 2022; 95:102013. [PMID: 34864359 PMCID: PMC8761177 DOI: 10.1016/j.compmedimag.2021.102013] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/26/2021] [Revised: 11/09/2021] [Accepted: 11/09/2021] [Indexed: 01/03/2023]
Abstract
Emerging multiplexed imaging platforms provide an unprecedented view of an increasing number of molecular markers at subcellular resolution and the dynamic evolution of tumor cellular composition. As such, they are capable of elucidating cell-to-cell interactions within the tumor microenvironment that impact clinical outcome and therapeutic response. However, the rapid development of these platforms has far outpaced the computational methods for processing and analyzing the data they generate. While being technologically disparate, all imaging assays share many computational requirements for post-collection data processing. As such, our Image Analysis Working Group (IAWG), composed of researchers in the Cancer Systems Biology Consortium (CSBC) and the Physical Sciences - Oncology Network (PS-ON), convened a workshop on "Computational Challenges Shared by Diverse Imaging Platforms" to characterize these common issues and a follow-up hackathon to implement solutions for a selected subset of them. Here, we delineate these areas that reflect major axes of research within the field, including image registration, segmentation of cells and subcellular structures, and identification of cell types from their morphology. We further describe the logistical organization of these events, believing our lessons learned can aid others in uniting the imaging community around self-identified topics of mutual interest, in designing and implementing operational procedures to address those topics and in mitigating issues inherent in image analysis (e.g., sharing exemplar images of large datasets and disseminating baseline solutions to hackathon challenges through open-source code repositories).
Affiliation(s)
- Juan Carlos Vizcarra
- Department of Biomedical Engineering, Georgia Institute of Technology & Emory University, Atlanta, GA, USA
- Erik A Burlingame
- Computational Biology Program, Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA
- Clemens B Hug
- Laboratory of Systems Pharmacology, Harvard Program in Therapeutic Science, Boston, MA, USA
- Yury Goltsev
- Department of Microbiology & Immunology, Stanford University School of Medicine, Stanford, CA, USA
- Brian S White
- Computational Oncology, Sage Bionetworks, Seattle, WA, USA
- Darren R Tyson
- Department of Biochemistry, Vanderbilt University School of Medicine, Nashville, TN, USA
- Artem Sokolov
- Laboratory of Systems Pharmacology, Harvard Program in Therapeutic Science, Boston, MA, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
7
Sahneh F, Balk MA, Kisley M, Chan CK, Fox M, Nord B, Lyons E, Swetnam T, Huppenkothen D, Sutherland W, Walls RL, Quinn DP, Tarin T, LeBauer D, Ribes D, Birnie DP, Lushbough C, Carr E, Nearing G, Fischer J, Tyle K, Carrasco L, Lang M, Rose PW, Rushforth RR, Roy S, Matheson T, Lee T, Brown CT, Teal TK, Papeș M, Kobourov S, Merchant N. Ten simple rules to cultivate transdisciplinary collaboration in data science. PLoS Comput Biol 2021; 17:e1008879. [PMID: 33983959 PMCID: PMC8118297 DOI: 10.1371/journal.pcbi.1008879] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/02/2022] Open
Affiliation(s)
- Faryad Sahneh
- Data Science Institute, University of Arizona, Tucson, Arizona, United States of America
- Computer Science Department, University of Arizona, Tucson, Arizona, United States of America
- Meghan A. Balk
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
- National Museum of Natural History, Department of Paleontology, Washington, District of Columbia, United States of America
- Marina Kisley
- Computer Science Department, University of Arizona, Tucson, Arizona, United States of America
- Chi-kwan Chan
- Data Science Institute, University of Arizona, Tucson, Arizona, United States of America
- Steward Observatory and Department of Astronomy, University of Arizona, Tucson, Arizona, United States of America
- Mercury Fox
- Data Science Institute, University of Arizona, Tucson, Arizona, United States of America
- CODATA Center of Excellence in Data for Society, Washington, District of Columbia, United States of America
- School of Information, University of Arizona, Tucson, Arizona, United States of America
- Native Nations Institute, University of Arizona, Tucson, Arizona, United States of America
- Center for Digital Society and Data Studies, University of Arizona, Tucson, Arizona, United States of America
- Brian Nord
- Fermi National Accelerator Laboratory, Batavia, Illinois, United States of America
- Kavli Institute for Cosmological Physics, University of Chicago, Chicago, Illinois, United States of America
- Department of Astronomy and Astrophysics, University of Chicago, Illinois, United States of America
- Eric Lyons
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
- School of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- CyVerse, University of Arizona, Tucson, Arizona, United States of America
- Tyson Swetnam
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
- Daniela Huppenkothen
- DIRAC Institute, Department of Astronomy, University of Washington, Seattle, Washington, United States of America
- eScience Institute, University of Washington, Seattle, Washington, United States of America
- Will Sutherland
- Department of Human Centered Design and Engineering, University of Washington, Seattle, Washington, United States of America
- Ramona L. Walls
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
- Daven P. Quinn
- Department of Geoscience, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Tonantzin Tarin
- Instituto de Ecología, Universidad Nacional Autónoma de México, Mexico City, Mexico
- David LeBauer
- College of Agriculture and Life Sciences, University of Arizona, Tucson, Arizona, United States of America
- David Ribes
- Department of Human Centered Design and Engineering, University of Washington, Seattle, Washington, United States of America
- Dunbar P. Birnie
- Department of Materials Science and Engineering, Rutgers University, Piscataway, New Jersey, United States of America
- Carol Lushbough
- Biomedical Engineering Department, University of South Dakota, Sioux Falls, South Dakota, United States of America
- BioSNTR, Brookings, South Dakota, United States of America
- Eric Carr
- National Institute for Mathematical and Biological Synthesis, University of Tennessee, Knoxville, Tennessee, United States of America
- Grey Nearing
- Google Research, Mountain View, California, United States of America
- Jeremy Fischer
- Pervasive Technology Institute, Indiana University Bloomington, Bloomington, Indiana, United States of America
- JetStream Cloud, Indiana University Bloomington, Bloomington, Indiana, United States of America
- Kevin Tyle
- Atmospheric & Environmental Sciences, University at Albany, Albany, New York, United States of America
- Luis Carrasco
- National Institute for Mathematical and Biological Synthesis, University of Tennessee, Knoxville, Tennessee, United States of America
- Meagan Lang
- National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
- Peter W. Rose
- San Diego Supercomputer Center, University of California, San Diego, La Jolla, California, United States of America
- Richard R. Rushforth
- School of Informatics, Computing, and Cyber Systems, Northern Arizona University, Flagstaff, Arizona, United States of America
- Samapriya Roy
- Planet Labs, San Francisco, California, United States of America
- Thomas Matheson
- NSF’s National Optical-Infrared Astronomy Research Laboratory, Tucson, Arizona, United States of America
- Tina Lee
- CyVerse, University of Arizona, Tucson, Arizona, United States of America
- C. Titus Brown
- Department of Population Health and Reproduction, University of California, Davis, Davis, California, United States of America
- Tracy K. Teal
- Dryad, Durham, North Carolina, United States of America
- Monica Papeș
- National Institute for Mathematical and Biological Synthesis, University of Tennessee, Knoxville, Tennessee, United States of America
- Ecology & Evolutionary Biology, University of Tennessee, Knoxville, Tennessee, United States of America
- Stephen Kobourov
- Computer Science Department, University of Arizona, Tucson, Arizona, United States of America
- Nirav Merchant
- Data Science Institute, University of Arizona, Tucson, Arizona, United States of America
- CyVerse, University of Arizona, Tucson, Arizona, United States of America
8
Affiliation(s)
- Brian E. Granger
- Amazon Web Services and California Polytechnic State University, San Luis Obispo, CA, USA
- Fernando Perez
- UC Berkeley and Lawrence Berkeley National Laboratory, Berkeley, CA, USA
9
Exploring potential roles of academic libraries in undergraduate data science education curriculum development. Journal of Academic Librarianship 2021. [DOI: 10.1016/j.acalib.2021.102320] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/24/2022]
10
Parker MS, Burgess AE, Bourne PE. Ten simple rules for starting (and sustaining) an academic data science initiative. PLoS Comput Biol 2021; 17:e1008628. [PMID: 33600414 PMCID: PMC7891724 DOI: 10.1371/journal.pcbi.1008628] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/18/2022] Open
Affiliation(s)
- Micaela S. Parker
- Academic Data Science Alliance, Seattle, Washington, United States of America
- Arlyn E. Burgess
- School of Data Science, University of Virginia, Charlottesville, Virginia, United States of America
- Philip E. Bourne
- School of Data Science, University of Virginia, Charlottesville, Virginia, United States of America
11
Kuter K, Wedrychowicz C. Hosting a data science hackathon with limited resources. Stat (Int Stat Inst) 2021. [DOI: 10.1002/sta4.338] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/07/2022]
12
Mitsuhashi T. Evaluation of epidemiological lectures using peer instruction: focusing on the importance of ConcepTests. PeerJ 2020. [DOI: 10.7717/peerj.9640] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] Open
Abstract
Background
In clinical research, the ability to properly analyze data is a necessary skill that cannot be learned simply by listening to lectures. Interactive classes, such as Peer Instruction (PI), are required to help medical students understand the concept of epidemiology for future valid research. In PI lectures, ConcepTests are conducted to confirm and deepen students’ understanding of the lecture material. Although it is important to evaluate PI lectures, there have been no studies conducted on PI lectures in epidemiology. This study employed the ConcepTest to evaluate PI lectures in a medical school epidemiology class to measure the efficiency of active learning techniques and the usefulness of ConcepTests in determining effective active learning approaches.
Methods
The PI lecture was conducted as part of an existing epidemiology class for fourth-year medical students at Okayama University on October 17, 2019. The lecture was conducted as follows. The lecturer taught the fundamental concepts of epidemiology and presented the ConcepTest to students. After answering the test, students were provided with the answer distribution, followed by peer discussion. After the discussion, students answered the ConcepTest again, and a new answer distribution was presented. Subsequently, the lecturer announced the correct answers and delivered a commentary. The ConcepTest comprised five questions, each related to fundamental concepts of epidemiology. Students’ responses to five ConcepTests were collected and analyzed by calculating the proportion of correct answers before and after the discussion, as well as PI efficiency to evaluate the PI lecture.
Results
Overall, 121 students attended the epidemiology lecture. The proportion of correct answers before the discussion ranged from 0.217 to 0.458, and after the peer discussion they ranged from 0.178 to 0.767. The PI efficiency ranged from −0.051 to 0.657, and was higher than the theoretical value in three ConcepTests. The efficiency was about the same as the theoretical value in one ConcepTest, and lower than the theoretical value in another.
Conclusion
In this study, the efficiency of a PI lecture was determined by calculating the PI efficiency of each ConcepTest. The results showed that the educational efficiency of a ConcepTest in epidemiology lectures can be widely distributed, ranging from efficient to inefficient. Particularly in three ConcepTests, the proportion of correct answers after the discussion and the PI efficiency were higher than the theoretical value. This suggests that PI lectures can be useful in epidemiology education with the efficient use of ConcepTests.
Affiliation(s)
- Toshiharu Mitsuhashi
- Center for Innovative Clinical Medicine, Okayama University Hospital, Okayama, Japan
13
Huppenkothen D, McFee B, Norén L. Entrofy your cohort: A transparent method for diverse cohort selection. PLoS One 2020; 15:e0231939. [PMID: 32716929 PMCID: PMC7384611 DOI: 10.1371/journal.pone.0231939] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2019] [Accepted: 04/04/2020] [Indexed: 11/23/2022] Open
Abstract
Selecting a cohort from a set of candidates is a common task within and beyond academia. Admitting students, awarding grants, and choosing speakers for a conference are situations where human biases may affect the selection of any particular candidate, and, thereby the composition of the final cohort. In this paper, we propose a new algorithm, entrofy, designed to be part of a human-in-the-loop decision making strategy aimed at making cohort selection as just, transparent, and accountable as possible. We suggest embedding entrofy in a two-step selection procedure. During a merit review, the committee selects all applicants, submissions, or other entities that meet their merit-based criteria. This often yields a cohort larger than the admissible number. In the second stage, the target cohort can be chosen from this meritorious pool via a new algorithm and software tool called entrofy. entrofy optimizes differences across an assignable set of categories selected by the human committee. Criteria could include academic discipline, home country, experience with certain technologies, or other quantifiable characteristics. The entrofy algorithm then yields the approximation of pre-defined target proportions for each category by solving the tie-breaking problem with provable performance guarantees. We show how entrofy selects cohorts according to pre-determined characteristics in simulated sets of applications and demonstrate its use in a case study of Astro Hack Week. This two stage candidate and cohort selection process allows human judgment and debate to guide the assessment of candidates’ merit in step 1. Then the human committee defines relevant diversity criteria which will be used as computational parameters in entrofy. 
Once the parameters are defined, the set of candidates who meet the minimum threshold for merit are passed through the entrofy cohort selection procedure in step 2 which yields a cohort of a composition as close as possible to the computational parameters defined by the committee. This process has the benefit of separating the meritorious assessment of candidates from certain elements of their diversity and from some considerations around cohort composition. It also increases the transparency and auditability of the process, which enables, but does not guarantee, fairness. Splitting merit and diversity considerations into their own assessment stages makes it easier to explain why a given candidate was selected or rejected, though it does not eliminate the possibility of objectionable bias.
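The second selection stage described above (choosing a cohort from the meritorious pool so that category shares approach committee-defined targets) can be illustrated with a simple greedy sketch. This is not the entrofy algorithm or its software interface: the saturating objective, the function names, and the toy candidates below are assumptions made only to show the general idea.

```python
from typing import Dict, List, Set

def objective(counts: Dict[str, int], targets: Dict[str, float], k: int) -> float:
    """Stand-in objective: each category's share counts only up to its target."""
    return sum(min(counts[c] / k, t) for c, t in targets.items())

def select_cohort(candidates: Dict[str, Set[str]],
                  targets: Dict[str, float], k: int) -> List[str]:
    """Greedily add the candidate that most improves the objective."""
    counts = {c: 0 for c in targets}
    remaining = sorted(candidates)  # sorted names give deterministic tie-breaking
    chosen: List[str] = []
    while remaining and len(chosen) < k:
        def score_if_added(name: str) -> float:
            trial = dict(counts)
            for category in candidates[name]:
                if category in trial:
                    trial[category] += 1
            return objective(trial, targets, k)
        best = max(remaining, key=score_if_added)  # first maximum wins ties
        chosen.append(best)
        remaining.remove(best)
        for category in candidates[best]:
            if category in counts:
                counts[category] += 1
    return chosen

# Hypothetical pool that already passed merit review (stage 1).
pool = {
    "a": {"astro", "senior"},
    "b": {"astro", "junior"},
    "c": {"cs", "junior"},
    "d": {"cs", "senior"},
    "e": {"astro", "senior"},
}
targets = {"astro": 0.5, "cs": 0.5, "junior": 0.5, "senior": 0.5}
cohort = select_cohort(pool, targets, k=2)
print(cohort)
```

Because the objective stops rewarding a category once its target share is reached, later picks are steered toward categories that are still under-represented, which is the behaviour the abstract attributes to the tie-breaking stage.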
Affiliation(s)
- Daniela Huppenkothen
- Department of Astronomy, DIRAC Institute, University of Washington, Seattle, WA, United States of America
- The Washington Research Foundation Data Science Studio, The University of Washington eScience Institute, University of Washington, Seattle, WA, United States of America
- Brian McFee
- Center for Data Science, New York University, New York, NY, United States of America
- Music and Audio Research Lab, New York University, New York, NY, United States of America
- Laura Norén
- Obsidian Security, Newport Beach, CA, United States of America
14
Abstract
With increasing demand for training in data science, extracurricular or "ad hoc" education efforts have emerged to help individuals acquire relevant skills and expertise. Although extracurricular efforts already exist for many computationally intensive disciplines, their support of data science education has significantly helped in coping with the speed of innovation in data science practice and formal curricula. While the proliferation of ad hoc efforts is an indication of their popularity, less has been documented about the needs that they are designed to meet, the limitations that they face, and practical suggestions for holding successful efforts. To holistically understand the role of different ad hoc formats for data science, we surveyed organizers of ad hoc data science education efforts to understand how organizers perceived the events to have gone, including areas of strength and areas requiring growth. We also gathered recommendations from these past events for future organizers. Our results suggest that the perceived benefits of ad hoc efforts go beyond developing technical skills and may provide continued benefit in conjunction with formal curricula, which warrants further investigation. As increasing numbers of researchers from computational fields with a history of complex data become involved with ad hoc efforts to share their skills, the lessons learned that we extract from the surveys will provide concrete suggestions for the practitioner-leaders interested in creating, improving, and sustaining future efforts.
Collapse
Affiliation(s)
- Orianna DeMasi
- Department of Computer Science, University of California, Davis, California, United States of America
| | - Alexandra Paxton
- Department of Psychological Sciences, University of Connecticut, Storrs, Connecticut, United States of America
- Center for the Ecological Study of Perception and Action, University of Connecticut, Storrs, Connecticut, United States of America
| | - Kevin Koy
- IDEO, San Francisco, California, United States of America
| |
|
15
|
Sholler D, Steinmacher I, Ford D, Averick M, Hoye M, Wilson G. Ten simple rules for helping newcomers become contributors to open projects. PLoS Comput Biol 2019; 15:e1007296. [PMID: 31513567 PMCID: PMC6742214 DOI: 10.1371/journal.pcbi.1007296] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 11/19/2022] Open
Affiliation(s)
- Dan Sholler
- Berkeley Institute for Data Science, University of California Berkeley, Berkeley, California, United States of America
| | - Igor Steinmacher
- School of Informatics, Computing, and Cyber Systems, Northern Arizona University, Flagstaff, Arizona, United States of America
| | - Denae Ford
- Department of Computer Science, North Carolina State University, Raleigh, North Carolina, United States of America
| | - Mara Averick
- RStudio, Inc., Boston, Massachusetts, United States of America
| | - Mike Hoye
- Mozilla Corporation, Toronto, Ontario, Canada
| | - Greg Wilson
- RStudio, Inc., Toronto, Ontario, Canada
| |
|
16
|
Oliver JC, Kollen C, Hickson B, Rios F. Data Science Support at the Academic Library. Journal of Library Administration 2019. [DOI: 10.1080/01930826.2019.1583015] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 10/27/2022]
Affiliation(s)
- Jeffrey C. Oliver
- Data Science Specialist, Office of Digital Innovation and Stewardship, University Libraries, University of Arizona, Tucson, AZ, USA
| | - Christine Kollen
- Data Curation Librarian, Office of Digital Innovation and Stewardship, University Libraries, University of Arizona, Tucson, AZ, USA
| | - Benjamin Hickson
- Geospatial Specialist, Office of Digital Innovation and Stewardship, University Libraries, University of Arizona, Tucson, AZ, USA
| | - Fernando Rios
- Research Data Management Specialist, Office of Digital Innovation and Stewardship, University Libraries, University of Arizona, Tucson, AZ, USA
| |
|
17
|
Keshavan A, Poline JB. From the Wet Lab to the Web Lab: A Paradigm Shift in Brain Imaging Research. Front Neuroinform 2019; 13:3. [PMID: 30881299 PMCID: PMC6405692 DOI: 10.3389/fninf.2019.00003] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 08/17/2018] [Accepted: 01/22/2019] [Indexed: 01/08/2023] Open
Abstract
Web technology has transformed our lives, and has led to a paradigm shift in the computational sciences. As the neuroimaging informatics research community amasses large datasets to answer complex neuroscience questions, we find that the web is the best medium to facilitate novel insights by way of improved collaboration and communication. Here, we review the landscape of web technologies used in neuroimaging research, and discuss future applications, areas for improvement, and the limitations of using web technology in research. Fully incorporating web technology in our research lifecycle requires not only technical skill, but a widespread culture change; a shift from the small, focused "wet lab" to a multidisciplinary and largely collaborative "web lab."
Affiliation(s)
- Anisha Keshavan
- Department of Speech and Hearing, Institute for Neuroengineering, eScience Institute, University of Washington, Seattle, WA, United States
| | - Jean-Baptiste Poline
- Faculty of Medicine, McConnell Brain Imaging Centre, Ludmer Centre for Neuroinformatics and Mental Health, Montreal Neurological Institute and Hospital, McGill University, Montreal, QC, Canada
- Henry H. Wheeler Jr. Brain Imaging Center, Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, United States
| |
|