9
|
Salganik MJ, Lundberg I, Kindel AT, Ahearn CE, Al-Ghoneim K, Almaatouq A, Altschul DM, Brand JE, Carnegie NB, Compton RJ, Datta D, Davidson T, Filippova A, Gilroy C, Goode BJ, Jahani E, Kashyap R, Kirchner A, McKay S, Morgan AC, Pentland A, Polimis K, Raes L, Rigobon DE, Roberts CV, Stanescu DM, Suhara Y, Usmani A, Wang EH, Adem M, Alhajri A, AlShebli B, Amin R, Amos RB, Argyle LP, Baer-Bositis L, Büchi M, Chung BR, Eggert W, Faletto G, Fan Z, Freese J, Gadgil T, Gagné J, Gao Y, Halpern-Manners A, Hashim SP, Hausen S, He G, Higuera K, Hogan B, Horwitz IM, Hummel LM, Jain N, Jin K, Jurgens D, Kaminski P, Karapetyan A, Kim EH, Leizman B, Liu N, Möser M, Mack AE, Mahajan M, Mandell N, Marahrens H, Mercado-Garcia D, Mocz V, Mueller-Gastell K, Musse A, Niu Q, Nowak W, Omidvar H, Or A, Ouyang K, Pinto KM, Porter E, Porter KE, Qian C, Rauf T, Sargsyan A, Schaffner T, Schnabel L, Schonfeld B, Sender B, Tang JD, Tsurkov E, van Loon A, Varol O, Wang X, Wang Z, Wang J, Wang F, Weissman S, Whitaker K, Wolters MK, Woon WL, Wu J, Wu C, Yang K, Yin J, Zhao B, Zhu C, Brooks-Gunn J, Engelhardt BE, Hardt M, Knox D, Levy K, Narayanan A, Stewart BM, Watts DJ, McLanahan S. Measuring the predictability of life outcomes with a scientific mass collaboration. Proc Natl Acad Sci U S A 2020; 117:8398-8403. [PMID: 32229555 PMCID: PMC7165437 DOI: 10.1073/pnas.1915006117]
Abstract
How predictable are life trajectories? We investigated this question with a scientific mass collaboration using the common task method; 160 teams built predictive models for six life outcomes using data from the Fragile Families and Child Wellbeing Study, a high-quality birth cohort study. Despite using a rich dataset and applying machine-learning methods optimized for prediction, the best predictions were not very accurate and were only slightly better than those from a simple benchmark model. Within each outcome, prediction error was strongly associated with the family being predicted and weakly associated with the technique used to generate the prediction. Overall, these results suggest practical limits to the predictability of life outcomes in some settings and illustrate the value of mass collaborations in the social sciences.
Affiliation(s)
| | - Ian Lundberg
- Department of Sociology, Princeton University, Princeton, NJ 08544
| | | | - Caitlin E Ahearn
- Department of Sociology, University of California, Los Angeles, CA 90095
| | | | - Abdullah Almaatouq
- Sloan School of Management, Massachusetts Institute of Technology, Cambridge, MA 02142
- Media Lab, Massachusetts Institute of Technology, Cambridge, MA 02139
| | - Drew M Altschul
- Mental Health Data Science Scotland, Department of Psychology, The University of Edinburgh, Edinburgh EH8 9JZ, United Kingdom
| | - Jennie E Brand
- Department of Sociology, University of California, Los Angeles, CA 90095
- Department of Statistics, University of California, Los Angeles, CA 90095
| | | | - Ryan James Compton
- Human Computer Interaction Lab, University of California, Santa Cruz, CA 95064
| | - Debanjan Datta
- Discovery Analytics Center, Virginia Polytechnic Institute and State University, Arlington, VA 22203
| | - Thomas Davidson
- Department of Sociology, Cornell University, Ithaca, NY 14853
| | | | - Connor Gilroy
- Department of Sociology, University of Washington, Seattle, WA 98105
| | - Brian J Goode
- Social and Decision Analytics Laboratory, Fralin Life Sciences Institute, Virginia Polytechnic Institute and State University, Arlington, VA 22203
| | - Eaman Jahani
- Institute for Data, Systems and Society, Massachusetts Institute of Technology, Cambridge, MA 02139
| | - Ridhi Kashyap
- Department of Sociology, University of Oxford, Oxford OX1 1JD, United Kingdom
- Nuffield College, University of Oxford, Oxford OX1 1NF, United Kingdom
- School of Anthropology and Museum Ethnography, University of Oxford, Oxford OX2 6PE, United Kingdom
| | - Antje Kirchner
- Program for Research in Survey Methodology, Survey Research Division, RTI International, Research Triangle Park, NC 27709
| | - Stephen McKay
- School of Social and Political Sciences, University of Lincoln, Brayford Pool, Lincoln LN6 7TS, United Kingdom
| | - Allison C Morgan
- Department of Computer Science, University of Colorado, Boulder, CO 80309
| | - Alex Pentland
- Media Lab, Massachusetts Institute of Technology, Cambridge, MA 02139
| | - Kivan Polimis
- Center for the Study of Demography and Ecology, University of Washington, Seattle, WA 98105
| | - Louis Raes
- Department of Economics, Tilburg School of Economics and Management, Tilburg University, 5037 AB Tilburg, The Netherlands
| | - Daniel E Rigobon
- Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544
| | - Claudia V Roberts
- Department of Computer Science, Princeton University, Princeton, NJ 08544
| | - Diana M Stanescu
- Department of Politics, Princeton University, Princeton, NJ 08544
| | - Yoshihiko Suhara
- Media Lab, Massachusetts Institute of Technology, Cambridge, MA 02139
| | - Adaner Usmani
- Department of Sociology, Harvard University, Cambridge, MA 02138
| | - Erik H Wang
- Department of Politics, Princeton University, Princeton, NJ 08544
| | - Muna Adem
- Department of Sociology, Indiana University, Bloomington, IN 47405
| | - Abdulla Alhajri
- Department of Nuclear Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139
| | - Bedoor AlShebli
- Computational Social Science Lab, Social Science Division, New York University Abu Dhabi, 129188 Abu Dhabi, United Arab Emirates
| | - Redwane Amin
- Bendheim Center for Finance, Princeton University, Princeton, NJ 08544
| | - Ryan B Amos
- Department of Computer Science, Princeton University, Princeton, NJ 08544
| | - Lisa P Argyle
- Department of Political Science, Brigham Young University, Provo, UT 84602
| | | | - Moritz Büchi
- Department of Communication and Media Research, University of Zurich, Zurich, Switzerland, ZH-8050
| | - Bo-Ryehn Chung
- Center for Statistics & Machine Learning, Princeton University, Princeton, NJ 08544
| | - William Eggert
- Department of Mechanical and Aerospace Engineering, Princeton University, Princeton, NJ 08544
| | - Gregory Faletto
- Statistics Group, Department of Data Sciences and Operations, Marshall School of Business, University of Southern California, Los Angeles, CA 90089
| | - Zhilin Fan
- Department of Statistics, Columbia University, New York, NY 10027
| | - Jeremy Freese
- Department of Sociology, Stanford University, Stanford, CA 94305
| | - Tejomay Gadgil
- Center for Data Science, New York University, New York, NY 10011
| | - Josh Gagné
- Department of Sociology, Stanford University, Stanford, CA 94305
| | - Yue Gao
- Department of Industrial Engineering and Operations Research, Columbia University, New York, NY 10027
| | | | - Sonia P Hashim
- Department of Computer Science, Princeton University, Princeton, NJ 08544
| | - Sonia Hausen
- Department of Sociology, Stanford University, Stanford, CA 94305
| | - Guanhua He
- Department of Molecular Biology, Princeton University, Princeton, NJ 08544
| | - Kimberly Higuera
- Department of Sociology, Stanford University, Stanford, CA 94305
| | - Bernie Hogan
- Oxford Internet Institute, University of Oxford, Oxford OX1 3JS, United Kingdom
| | - Ilana M Horwitz
- Graduate School of Education, Stanford University, Stanford, CA 94305
| | - Lisa M Hummel
- Department of Sociology, Stanford University, Stanford, CA 94305
| | - Naman Jain
- Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544
| | - Kun Jin
- Department of Computer Science, Ohio State University, Columbus, OH 43210
| | - David Jurgens
- School of Information, University of Michigan, Ann Arbor, MI 48104
| | - Patrick Kaminski
- Department of Sociology, Indiana University, Bloomington, IN 47405
- Center for Complex Networks and Systems Research, Indiana University, Bloomington, IN 47405
| | - Areg Karapetyan
- Department of Computer Science, Masdar Institute, Khalifa University, 127788 Abu Dhabi, United Arab Emirates
- Research Institute for Mathematical Sciences, Kyoto University, Kyoto 606-8502, Japan
| | - E H Kim
- Department of Sociology, Stanford University, Stanford, CA 94305
| | - Ben Leizman
- Department of Computer Science, Princeton University, Princeton, NJ 08544
| | - Naijia Liu
- Department of Politics, Princeton University, Princeton, NJ 08544
| | - Malte Möser
- Department of Computer Science, Princeton University, Princeton, NJ 08544
| | - Andrew E Mack
- Department of Politics, Princeton University, Princeton, NJ 08544
| | - Mayank Mahajan
- Department of Computer Science, Princeton University, Princeton, NJ 08544
| | - Noah Mandell
- Department of Astrophysical Sciences, Princeton University, Princeton, NJ 08544
| | - Helge Marahrens
- Department of Sociology, Indiana University, Bloomington, IN 47405
| | | | - Viola Mocz
- Department of Neuroscience, Princeton University, Princeton, NJ 08544
| | | | - Ahmed Musse
- Department of Electrical Engineering, Princeton University, Princeton, NJ 08544
| | - Qiankun Niu
- Bendheim Center for Finance, Princeton University, Princeton, NJ 08544
| | | | - Hamidreza Omidvar
- Department of Civil and Environmental Engineering, Princeton University, Princeton, NJ 08544
| | - Andrew Or
- Department of Computer Science, Princeton University, Princeton, NJ 08544
| | - Karen Ouyang
- Department of Computer Science, Princeton University, Princeton, NJ 08544
| | - Katy M Pinto
- Department of Sociology, California State University, Dominguez Hills, Carson, CA 90747
| | - Ethan Porter
- School of Media and Public Affairs, George Washington University, Washington, DC 20052
| | | | - Crystal Qian
- Department of Computer Science, Princeton University, Princeton, NJ 08544
| | - Tamkinat Rauf
- Department of Sociology, Stanford University, Stanford, CA 94305
| | - Anahit Sargsyan
- Social Science Division, New York University Abu Dhabi, 129188 Abu Dhabi, United Arab Emirates
| | - Thomas Schaffner
- Department of Computer Science, Princeton University, Princeton, NJ 08544
| | - Landon Schnabel
- Department of Sociology, Stanford University, Stanford, CA 94305
| | - Bryan Schonfeld
- Department of Politics, Princeton University, Princeton, NJ 08544
| | - Ben Sender
- Department of Economics, Princeton University, Princeton, NJ 08544
| | - Jonathan D Tang
- Department of Computer Science, Princeton University, Princeton, NJ 08544
| | - Emma Tsurkov
- Department of Sociology, Stanford University, Stanford, CA 94305
| | - Austin van Loon
- Department of Sociology, Stanford University, Stanford, CA 94305
| | - Onur Varol
- Center for Complex Network Research, Northeastern University Networks Science Institute, Boston, MA 02115
- Luddy School of Informatics, Computing, & Engineering, Indiana University, Bloomington, IN 47408
| | - Xiafei Wang
- School of Social Work, David B. Falk College of Sport and Human Dynamics, Syracuse University, NY 13244
| | - Zhi Wang
- Luddy School of Informatics, Computing, & Engineering, Indiana University, Bloomington, IN 47408
- School of Public Health, Indiana University, Bloomington, IN 47408
| | - Julia Wang
- Department of Computer Science, Princeton University, Princeton, NJ 08544
| | - Flora Wang
- Department of Economics, Princeton University, Princeton, NJ 08544
| | - Samantha Weissman
- Department of Computer Science, Princeton University, Princeton, NJ 08544
| | - Kirstie Whitaker
- The Alan Turing Institute, London NW1 2DB, United Kingdom
- Department of Psychiatry, University of Cambridge, Cambridge CB2 0SZ, United Kingdom
| | - Maria K Wolters
- School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, United Kingdom
| | - Wei Lee Woon
- Department of Marketplaces & Yield Data Science, Expedia Group, Seattle, WA 98119
| | - James Wu
- Department of Applied Statistics, Social Science, and Humanities, New York University, New York, NY 10003
| | - Catherine Wu
- Department of Computer Science, Princeton University, Princeton, NJ 08544
| | - Kengran Yang
- Department of Civil and Environmental Engineering, Princeton University, Princeton, NJ 08544
| | - Jingwen Yin
- Department of Statistics, Columbia University, New York, NY 10027
| | - Bingyu Zhao
- Department of Engineering, University of Cambridge, Cambridge CB2 1PZ, United Kingdom
| | - Chenyun Zhu
- Department of Statistics, Columbia University, New York, NY 10027
| | - Jeanne Brooks-Gunn
- Department of Human Development, Teachers College, Columbia University, New York, NY 10027
- Department of Pediatrics, Vagelos College of Physicians and Surgeons, Columbia University, New York, NY 10032
| | - Barbara E Engelhardt
- Department of Computer Science, Princeton University, Princeton, NJ 08544
- Center for Statistics & Machine Learning, Princeton University, Princeton, NJ 08544
| | - Moritz Hardt
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA 94720
| | - Dean Knox
- Department of Politics, Princeton University, Princeton, NJ 08544
| | - Karen Levy
- Department of Information Science, Cornell University, Ithaca, NY 14853
| | - Arvind Narayanan
- Department of Computer Science, Princeton University, Princeton, NJ 08544
| | | | - Duncan J Watts
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104
- Annenberg School of Communication, University of Pennsylvania, Philadelphia, PA 19104
- Operations, Information and Decisions Department, University of Pennsylvania, Philadelphia, PA 19104
| | - Sara McLanahan
- Department of Sociology, Princeton University, Princeton, NJ 08544
| |
|