Reference Citation Analysis

For: Fowler DM, Araya CL, Fleishman SJ, Kellogg EH, Stephany JJ, Baker D, Fields S. High-resolution mapping of protein sequence-function relationships. Nat Methods 2010;7:741-6. [PMID: 20711194 PMCID: PMC2938879 DOI: 10.1038/nmeth.1492] [Citation(s) in RCA: 378] [Impact Index Per Article: 27.0] [Received: 03/25/2010] [Accepted: 07/13/2010] [Indexed: 12/30/2022]

Number

Cited by Other Article(s)

201

Wheeler LC, Anderson JA, Morrison AJ, Wong CE, Harms MJ. Conservation of Specificity in Two Low-Specificity Proteins. Biochemistry 2017;57:684-695. [PMID: 29240404 DOI: 10.1021/acs.biochem.7b01086] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 01/25/2023]

202

Weile J, Sun S, Cote AG, Knapp J, Verby M, Mellor JC, Wu Y, Pons C, Wong C, van Lieshout N, Yang F, Tasan M, Tan G, Yang S, Fowler DM, Nussbaum R, Bloom JD, Vidal M, Hill DE, Aloy P, Roth FP. A framework for exhaustively mapping functional missense variants. Mol Syst Biol 2017;13:957. [PMID: 29269382 PMCID: PMC5740498 DOI: 10.15252/msb.20177908] [Citation(s) in RCA: 102] [Impact Index Per Article: 14.6] [Indexed: 01/10/2023] Open

Affiliation(s)

Jochen Weile: Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, ON, Canada; The Donnelly Centre, University of Toronto, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; Department of Computer Science, University of Toronto, Toronto, ON, Canada
Song Sun: Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, ON, Canada; The Donnelly Centre, University of Toronto, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; Department of Computer Science, University of Toronto, Toronto, ON, Canada; Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
Atina G Cote: Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, ON, Canada; The Donnelly Centre, University of Toronto, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
Jennifer Knapp: Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, ON, Canada; The Donnelly Centre, University of Toronto, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
Marta Verby: Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, ON, Canada; The Donnelly Centre, University of Toronto, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
Joseph C Mellor: The Donnelly Centre, University of Toronto, Toronto, ON, Canada; SeqWell Inc, Boston, MA, USA
Yingzhou Wu: Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, ON, Canada; The Donnelly Centre, University of Toronto, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; Department of Computer Science, University of Toronto, Toronto, ON, Canada
Carles Pons: Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute for Science and Technology, Barcelona, Catalonia, Spain
Cassandra Wong: Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, ON, Canada; The Donnelly Centre, University of Toronto, Toronto, ON, Canada
Natascha van Lieshout: Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, ON, Canada
Fan Yang: Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, ON, Canada; The Donnelly Centre, University of Toronto, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; Department of Computer Science, University of Toronto, Toronto, ON, Canada
Murat Tasan: Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, ON, Canada; The Donnelly Centre, University of Toronto, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; Department of Computer Science, University of Toronto, Toronto, ON, Canada
Guihong Tan: The Donnelly Centre, University of Toronto, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
Shan Yang: Invitae Corp., San Francisco, CA, USA
Douglas M Fowler: Department of Genome Sciences, University of Washington, Seattle, WA, USA
Robert Nussbaum: Invitae Corp., San Francisco, CA, USA
Jesse D Bloom: Fred Hutchinson Research Center, Seattle, WA, USA
Marc Vidal: Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA; Department of Genetics, Harvard Medical School, Boston, MA, USA
David E Hill: Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA
Patrick Aloy: Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute for Science and Technology, Barcelona, Catalonia, Spain; Institució Catalana de Recerca I Estudis Avançats (ICREA), Barcelona, Catalonia, Spain
Frederick P Roth: Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, ON, Canada; The Donnelly Centre, University of Toronto, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; Department of Computer Science, University of Toronto, Toronto, ON, Canada; Canadian Institute for Advanced Research, Toronto, ON, Canada

203

Arai R. Hierarchical design of artificial proteins and complexes toward synthetic structural biology. Biophys Rev 2017;10:391-410. [PMID: 29243094 DOI: 10.1007/s12551-017-0376-1] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 10/25/2017] [Accepted: 11/23/2017] [Indexed: 12/14/2022] Open

204

Sharma P, Kranz DM. Subtle changes at the variable domain interface of the T-cell receptor can strongly increase affinity. J Biol Chem 2017;293:1820-1834. [PMID: 29229779 DOI: 10.1074/jbc.m117.814152] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 08/25/2017] [Revised: 12/03/2017] [Indexed: 11/06/2022] Open

205

Higgins SA, Savage DF. Protein Science by DNA Sequencing: How Advances in Molecular Biology Are Accelerating Biochemistry. Biochemistry 2017;57:38-46. [DOI: 10.1021/acs.biochem.7b00886] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 12/18/2022]

206

Molecular ensembles make evolution unpredictable. Proc Natl Acad Sci U S A 2017;114:11938-11943. [PMID: 29078365 PMCID: PMC5691298 DOI: 10.1073/pnas.1711927114] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Indexed: 11/25/2022] Open

Abstract

A long-standing goal in evolutionary biology is predicting evolution. Here, we show that the architecture of macromolecules fundamentally limits evolutionary predictability. Under physiological conditions, macromolecules, like proteins, flip between multiple structures, forming an ensemble of structures. A mutation affects all of these structures in slightly different ways, redistributing the relative probabilities of structures in the ensemble. As a result, mutations that follow the first mutation have a different effect than they would if introduced before. This implies that knowing the effects of every mutation in an ancestor would be insufficient to predict evolutionary trajectories past the first few steps, leading to profound unpredictability in evolution. We, therefore, conclude that detailed evolutionary predictions are not possible given the chemistry of macromolecules.

Evolutionary prediction is of deep practical and philosophical importance. Here we show, using a simple computational protein model, that protein evolution remains unpredictable, even if one knows the effects of all mutations in an ancestral protein background. We performed a virtual deep mutational scan—revealing the individual and pairwise epistatic effects of every mutation to our model protein—and then used this information to predict evolutionary trajectories. Our predictions were poor. This is a consequence of statistical thermodynamics. Proteins exist as ensembles of similar conformations. The effect of a mutation depends on the relative probabilities of conformations in the ensemble, which in turn, depend on the exact amino acid sequence of the protein. Accumulating substitutions alter the relative probabilities of conformations, thereby changing the effects of future mutations. This manifests itself as subtle but pervasive high-order epistasis. Uncertainty in the effect of each mutation accumulates and undermines prediction. Because conformational ensembles are an inevitable feature of proteins, this is likely universal.

207

Starr TN, Picton LK, Thornton JW. Alternative evolutionary histories in the sequence space of an ancient protein. Nature 2017;549:409-413. [PMID: 28902834 PMCID: PMC6214350 DOI: 10.1038/nature23902] [Citation(s) in RCA: 112] [Impact Index Per Article: 16.0] [Received: 03/23/2017] [Accepted: 08/08/2017] [Indexed: 12/28/2022]

208

Rubin AF, Gelman H, Lucas N, Bajjalieh SM, Papenfuss AT, Speed TP, Fowler DM. A statistical framework for analyzing deep mutational scanning data. Genome Biol 2017;18:150. [PMID: 28784151 PMCID: PMC5547491 DOI: 10.1186/s13059-017-1272-5] [Citation(s) in RCA: 119] [Impact Index Per Article: 17.0] [Received: 04/18/2017] [Accepted: 07/06/2017] [Indexed: 11/10/2022] Open

209

Wrenbeck EE, Faber MS, Whitehead TA. Deep sequencing methods for protein engineering and design. Curr Opin Struct Biol 2017;45:36-44. [PMID: 27886568 PMCID: PMC5440218 DOI: 10.1016/j.sbi.2016.11.001] [Citation(s) in RCA: 69] [Impact Index Per Article: 9.9] [Received: 08/02/2016] [Accepted: 11/01/2016] [Indexed: 11/27/2022]

210

Analysis of Large-Scale Mutagenesis Data To Assess the Impact of Single Amino Acid Substitutions. Genetics 2017;207:53-61. [PMID: 28751422 PMCID: PMC5586385 DOI: 10.1534/genetics.117.300064] [Citation(s) in RCA: 66] [Impact Index Per Article: 9.4] [Received: 05/20/2017] [Accepted: 07/24/2017] [Indexed: 11/18/2022] Open

211

Koenig P, Sanowar S, Lee CV, Fuh G. Tuning the specificity of a Two-in-One Fab against three angiogenic antigens by fully utilizing the information of deep mutational scanning. MAbs 2017;9:959-967. [PMID: 28585908 PMCID: PMC5540083 DOI: 10.1080/19420862.2017.1337618] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 02/03/2017] [Revised: 05/24/2017] [Accepted: 05/27/2017] [Indexed: 10/19/2022] Open

212

Massively Parallel Genetics. Genetics 2017;203:617-9. [PMID: 27270695 DOI: 10.1534/genetics.115.180562] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Indexed: 11/18/2022] Open

213

Sosa-Pagán JO, Iversen ES, Grandl J. TRPV1 temperature activation is specifically sensitive to strong decreases in amino acid hydrophobicity. Sci Rep 2017;7:549. [PMID: 28373693 PMCID: PMC5428820 DOI: 10.1038/s41598-017-00636-4] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 06/17/2016] [Accepted: 03/07/2017] [Indexed: 12/15/2022] Open

214

Rinaldi S, Gori A, Annovazzi C, Ferrandi EE, Monti D, Colombo G. Unraveling Energy and Dynamics Determinants to Interpret Protein Functional Plasticity: The Limonene-1,2-epoxide-hydrolase Case Study. J Chem Inf Model 2017;57:717-725. [DOI: 10.1021/acs.jcim.6b00504] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/30/2022]

215

Chan YH, Venev SV, Zeldovich KB, Matthews CR. Correlation of fitness landscapes from three orthologous TIM barrels originates from sequence and structure constraints. Nat Commun 2017;8:14614. [PMID: 28262665 PMCID: PMC5343507 DOI: 10.1038/ncomms14614] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 09/06/2016] [Accepted: 01/11/2017] [Indexed: 02/07/2023] Open

216

Inference of the Distribution of Selection Coefficients for New Nonsynonymous Mutations Using Large Samples. Genetics 2017;206:345-361. [PMID: 28249985 PMCID: PMC5419480 DOI: 10.1534/genetics.116.197145] [Citation(s) in RCA: 113] [Impact Index Per Article: 16.1] [Received: 10/23/2016] [Accepted: 02/14/2017] [Indexed: 12/23/2022] Open

217

Najar TA, Khare S, Pandey R, Gupta SK, Varadarajan R. Mapping Protein Binding Sites and Conformational Epitopes Using Cysteine Labeling and Yeast Surface Display. Structure 2017;25:395-406. [PMID: 28132782 DOI: 10.1016/j.str.2016.12.016] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 09/22/2016] [Revised: 12/10/2016] [Accepted: 12/28/2016] [Indexed: 11/16/2022]

218

Mutational landscape of antibody variable domains reveals a switch modulating the interdomain conformational dynamics and antigen binding. Proc Natl Acad Sci U S A 2017;114:E486-E495. [PMID: 28057863 DOI: 10.1073/pnas.1613231114] [Citation(s) in RCA: 56] [Impact Index Per Article: 8.0] [Indexed: 11/18/2022] Open

219

Adams RM, Mora T, Walczak AM, Kinney JB. Measuring the sequence-affinity landscape of antibodies with massively parallel titration curves. eLife 2016;5. [PMID: 28035901 PMCID: PMC5268739 DOI: 10.7554/elife.23156] [Citation(s) in RCA: 72] [Impact Index Per Article: 9.0] [Received: 11/10/2016] [Accepted: 12/27/2016] [Indexed: 11/30/2022] Open

Abstract

Despite the central role that antibodies play in the adaptive immune system and in biotechnology, much remains unknown about the quantitative relationship between an antibody’s amino acid sequence and its antigen binding affinity. Here we describe a new experimental approach, called Tite-Seq, that is capable of measuring binding titration curves and corresponding affinities for thousands of variant antibodies in parallel. The measurement of titration curves eliminates the confounding effects of antibody expression and stability that arise in standard deep mutational scanning assays. We demonstrate Tite-Seq on the CDR1H and CDR3H regions of a well-studied scFv antibody. Our data shed light on the structural basis for antigen binding affinity and suggest a role for secondary CDR loops in establishing antibody stability. Tite-Seq fills a large gap in the ability to measure critical aspects of the adaptive immune system, and can be readily used for studying sequence-affinity landscapes in other protein systems.

DOI:http://dx.doi.org/10.7554/eLife.23156.001

Antibodies are proteins produced by cells of the immune system to tag or neutralize potential threats to the body, such as foreign substances and disease-causing microbes. Antibodies do this by binding to target molecules called antigens. An antibody’s ability to bind to an antigen depends on the sequence of amino acids – the building blocks of proteins – that make up the antibody. Through a process that randomizes this sequence of amino acids, the immune system generates a vast pool of antibodies that are able to target almost any foreign antigen that exists in nature.

Currently, little is understood about how the sequence of amino acids in an antibody determines how strongly that antibody binds to its antigen target – a property referred to as the antibody’s binding affinity. Answering this fundamental question requires techniques that can measure the affinities of many different antibodies at the same time. However, previous high-throughput methods have been unable to provide quantitative measurements of binding affinities. These kinds of measurements are difficult because an antibody’s amino acid sequence governs more than just binding affinity: it also affects how easy it is to produce that antibody, and what fraction of antibody molecules work properly.

Adams et al. now describe a new method, named “Tite-Seq”, that overcomes these issues. First, thousands of different antibodies are displayed on the surface of yeast cells, with each cell carrying a single kind of antibody. These cells are then incubated with fluorescently labeled antigen at a wide range of different concentrations. Next, the yeast cells are sorted based on how brightly they glow; brighter cells have more antigen bound to them, and so it is possible to calculate how much of the antigen is bound to each kind of antibody at each concentration. Plotting these data provides a “binding curve” for each antibody, which is then used to read off the antibody’s binding affinity in a way that is not affected by the factors that have plagued other high-throughput methods.

Tite-Seq is thus able to measure the binding affinities for thousands of different antibodies at the same time. This will potentially allow researchers to address many fundamental and yet unanswered questions about how the immune system works. Tite-Seq can also be used to measure how amino acid sequence affects the binding affinity of proteins other than antibodies.
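The step of reading an affinity off a binding curve, as described in the summary above, can be sketched in a few lines of Python. This is a simplified illustration and not the authors' analysis pipeline: it assumes that a mean fluorescence value per antigen concentration has already been estimated for a variant, that binding follows a simple one-site isotherm, and it uses a plain grid search over candidate dissociation constants. The function names, the grid, and the synthetic noise-free data are all hypothetical.

```python
import math

def bound_fraction(conc, kd):
    """One-site binding isotherm: fraction of antibody occupied at antigen concentration conc."""
    return conc / (conc + kd)

def fit_binding_curve(concs, signals, kd_grid):
    """Return (kd, amplitude, background) minimizing the squared error.

    For a fixed candidate K_D the model signal = amplitude * occupancy + background
    is linear in amplitude and background, so both are obtained in closed form
    by simple linear regression of the signal on the occupancy.
    """
    best = None
    n = len(concs)
    my = sum(signals) / n
    for kd in kd_grid:
        x = [bound_fraction(c, kd) for c in concs]
        mx = sum(x) / n
        sxx = sum((xi - mx) ** 2 for xi in x)
        if sxx == 0:
            continue  # occupancy does not vary; this K_D cannot be evaluated
        sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, signals))
        amp = sxy / sxx
        bg = my - amp * mx
        sse = sum((amp * xi + bg - yi) ** 2 for xi, yi in zip(x, signals))
        if best is None or sse < best[0]:
            best = (sse, kd, amp, bg)
    _, kd, amp, bg = best
    return kd, amp, bg

# Hypothetical, noise-free titration data for a single variant (concentrations in molar)
true_kd, true_amp, true_bg = 1e-8, 1000.0, 50.0
concs = [10 ** (e / 2) for e in range(-20, -11)]          # 1e-10 M to 1e-6 M
signals = [true_amp * bound_fraction(c, true_kd) + true_bg for c in concs]

# Candidate K_D values on a log-spaced grid from 1e-11 M to 1e-5 M
kd_grid = [10 ** ((-220 + i) / 20) for i in range(121)]
kd_hat, amp_hat, bg_hat = fit_binding_curve(concs, signals, kd_grid)
print(f"estimated K_D = {kd_hat:.2e} M")
```

Because amplitude and background are fitted separately from the dissociation constant, a variant that is poorly expressed (lower amplitude) can still yield the same estimated affinity, which is the sense in which a full titration curve separates affinity from expression effects.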

DOI:http://dx.doi.org/10.7554/eLife.23156.002

220

Stolz A, Putyrski M, Kutle I, Huber J, Wang C, Major V, Sidhu SS, Youle RJ, Rogov VV, Dötsch V, Ernst A, Dikic I. Fluorescence-based ATG8 sensors monitor localization and function of LC3/GABARAP proteins. EMBO J 2016;36:549-564. [PMID: 28028054 DOI: 10.15252/embj.201695063] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Received: 06/20/2016] [Revised: 11/23/2016] [Accepted: 11/27/2016] [Indexed: 12/25/2022] Open

221

Press O, Zvagelsky T, Vyazmensky M, Kleinau G, Engel S. Construction of Structural Mimetics of the Thyrotropin Receptor Intracellular Domain. Biophys J 2016;111:2620-2628. [PMID: 28002738 PMCID: PMC5192603 DOI: 10.1016/j.bpj.2016.11.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2016] [Revised: 10/26/2016] [Accepted: 11/02/2016] [Indexed: 10/20/2022] Open

222

Kowalsky CA, Whitehead TA. Determination of binding affinity upon mutation for type I dockerin-cohesin complexes from Clostridium thermocellum and Clostridium cellulolyticum using deep sequencing. Proteins 2016;84:1914-1928. [PMID: 27699856 DOI: 10.1002/prot.25175] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 04/18/2016] [Revised: 09/05/2016] [Accepted: 09/27/2016] [Indexed: 12/27/2022]

223

Majithia AR, Tsuda B, Agostini M, Gnanapradeepan K, Rice R, Peloso G, Patel KA, Zhang X, Broekema MF, Patterson N, Duby M, Sharpe T, Kalkhoven E, Rosen ED, Barroso I, Ellard S, Kathiresan S, O’Rahilly S, Chatterjee K, Florez JC, Mikkelsen T, Savage DB, Altshuler D. Prospective functional classification of all possible missense variants in PPARG. Nat Genet 2016;48:1570-1575. [PMID: 27749844 PMCID: PMC5131844 DOI: 10.1038/ng.3700] [Citation(s) in RCA: 158] [Impact Index Per Article: 19.8] [Received: 03/11/2016] [Accepted: 09/23/2016] [Indexed: 12/13/2022]

Affiliation(s)

Amit R. Majithia: Program in Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Diabetes Research Center, Diabetes Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA; Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA, USA; Department of Medicine, Harvard Medical School, Boston, MA, USA
Ben Tsuda: Program in Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
Maura Agostini: University of Cambridge Metabolic Research Laboratories, Wellcome Trust-Medical Research Council Institute of Metabolic Science, Cambridge CB2 0QQ, United Kingdom
Keerthana Gnanapradeepan: Program in Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
Robert Rice: Program in Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
Gina Peloso: Program in Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
Kashyap A. Patel: Institute of Biomedical and Clinical Science, University of Exeter Medical School, Exeter, UK
Xiaolan Zhang: Program in Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
Marjoleine F. Broekema: Molecular Cancer Research and Center for Molecular Medicine, University Medical Centre Utrecht, Universiteitsweg 100, 3584 CG, Utrecht, The Netherlands
Nick Patterson: Program in Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
Marc Duby: Program in Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
Ted Sharpe: Program in Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
Eric Kalkhoven: Molecular Cancer Research and Center for Molecular Medicine, University Medical Centre Utrecht, Universiteitsweg 100, 3584 CG, Utrecht, The Netherlands
Evan D. Rosen: Department of Medicine, Harvard Medical School, Boston, MA, USA; Division of Endocrinology and Metabolism, Beth Israel Deaconess Medical Center, 330 Brookline Avenue, Boston, MA 02115, USA
Inês Barroso: University of Cambridge Metabolic Research Laboratories, Wellcome Trust-Medical Research Council Institute of Metabolic Science, Cambridge CB2 0QQ, United Kingdom
Sian Ellard: Institute of Biomedical and Clinical Science, University of Exeter Medical School, Exeter, UK; Department of Molecular Genetics, Royal Devon and Exeter National Health Service Foundation Trust, Exeter, UK
UK Monogenic Diabetes Consortium
Sekar Kathiresan: Program in Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA, USA; Department of Medicine, Harvard Medical School, Boston, MA, USA; Cardiovascular Research Center, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
Myocardial Infarction Genetics Consortium: Cardiovascular Research Center, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
Stephen O’Rahilly: University of Cambridge Metabolic Research Laboratories, Wellcome Trust-Medical Research Council Institute of Metabolic Science, Cambridge CB2 0QQ, United Kingdom
UK Congenital Lipodystrophy Consortium: Cardiovascular Research Center, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
Krishna Chatterjee: University of Cambridge Metabolic Research Laboratories, Wellcome Trust-Medical Research Council Institute of Metabolic Science, Cambridge CB2 0QQ, United Kingdom
Jose C. Florez: Program in Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Diabetes Research Center, Diabetes Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA; Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA, USA; Department of Medicine, Harvard Medical School, Boston, MA, USA
Tarjei Mikkelsen: Program in Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
David B. Savage: University of Cambridge Metabolic Research Laboratories, Wellcome Trust-Medical Research Council Institute of Metabolic Science, Cambridge CB2 0QQ, United Kingdom
David Altshuler: Program in Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Diabetes Research Center, Diabetes Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA; Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA, USA; Department of Medicine, Harvard Medical School, Boston, MA, USA

224

Haddox HK, Dingens AS, Bloom JD. Experimental Estimation of the Effects of All Amino-Acid Mutations to HIV's Envelope Protein on Viral Replication in Cell Culture. PLoS Pathog 2016;12:e1006114. [PMID: 27959955 PMCID: PMC5189966 DOI: 10.1371/journal.ppat.1006114] [Citation(s) in RCA: 69] [Impact Index Per Article: 8.6] [Received: 10/06/2016] [Revised: 12/27/2016] [Accepted: 12/07/2016] [Indexed: 11/18/2022] Open

225

Payen C, Sunshine AB, Ong GT, Pogachar JL, Zhao W, Dunham MJ. High-Throughput Identification of Adaptive Mutations in Experimentally Evolved Yeast Populations. PLoS Genet 2016;12:e1006339. [PMID: 27727276 PMCID: PMC5065121 DOI: 10.1371/journal.pgen.1006339] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.9] [Received: 03/14/2016] [Accepted: 09/05/2016] [Indexed: 11/19/2022] Open

Abstract

High-throughput sequencing has enabled genetic screens that can rapidly identify mutations that occur during experimental evolution. The presence of a mutation in an evolved lineage does not, however, constitute proof that the mutation is adaptive, given the well-known and widespread phenomenon of genetic hitchhiking, in which a non-adaptive or even detrimental mutation can co-occur in a genome with a beneficial mutation and the combined genotype is carried to high frequency by selection. We approximated the spectrum of possible beneficial mutations in Saccharomyces cerevisiae using sets of single-gene deletions and amplifications of almost all the genes in the S. cerevisiae genome. We determined the fitness effects of each mutation in three different nutrient-limited conditions using pooled competitions followed by barcode sequencing. Although most of the mutations were neutral or deleterious, ~500 of them increased fitness. We then compared those results to the mutations that actually occurred during experimental evolution in the same three nutrient-limited conditions. On average, ~35% of the mutations that occurred during experimental evolution were predicted by the systematic screen to be beneficial. We found that the distribution of fitness effects depended on the selective conditions. In the phosphate-limited and glucose-limited conditions, a large number of beneficial mutations of nearly equivalent, small effects drove the fitness increases. In the sulfate-limited condition, one type of mutation, the amplification of the high-affinity sulfate transporter, dominated. In the absence of that mutation, evolution in the sulfate-limited condition involved mutations in other genes that were not observed previously—but were predicted by the systematic screen. Thus, gross functional screens have the potential to predict and identify adaptive mutations that occur during experimental evolution.

Experimental evolution allows us to observe evolution in real time. New advances in genome sequencing make it trivial to discover the mutations that have arisen in evolved cultures; however, linking those mutations to particular adaptive traits remains difficult. We evaluated the fitness impacts of thousands of single-gene losses and amplifications in yeast. We discovered that only a fraction of the hundreds of possible beneficial mutations were actually detected in evolution experiments performed previously. Our results provide evidence that 35% of the mutations identified in experimentally evolved populations are advantageous and that the distribution of beneficial fitness effects depends on the genetic background and the selective conditions. Furthermore, we show that it is possible to select for alternative mutations that improve fitness by blocking particularly high-fitness routes to adaptation.

226

Plasmid-based one-pot saturation mutagenesis. Nat Methods 2016;13:928-930. [PMID: 27723752 DOI: 10.1038/nmeth.4029] [Citation(s) in RCA: 107] [Impact Index Per Article: 13.4] [Received: 03/04/2016] [Accepted: 09/09/2016] [Indexed: 01/12/2023]

227

Au L, Green DF. Direct Calculation of Protein Fitness Landscapes through Computational Protein Design. Biophys J 2016;110:75-84. [PMID: 26745411 DOI: 10.1016/j.bpj.2015.11.029] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 06/25/2015] [Revised: 11/03/2015] [Accepted: 11/16/2015] [Indexed: 11/24/2022] Open

228

Harris DT, Wang N, Riley TP, Anderson SD, Singh NK, Procko E, Baker BM, Kranz DM. Deep Mutational Scans as a Guide to Engineering High Affinity T Cell Receptor Interactions with Peptide-bound Major Histocompatibility Complex. J Biol Chem 2016;291:24566-24578. [PMID: 27681597 DOI: 10.1074/jbc.m116.748681] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Received: 07/15/2016] [Revised: 09/15/2016] [Indexed: 11/06/2022] Open

229

Sun Z, Mehta SC, Adamski CJ, Gibbs RA, Palzkill T. Deep Sequencing of Random Mutant Libraries Reveals the Active Site of the Narrow Specificity CphA Metallo-β-Lactamase is Fragile to Mutations. Sci Rep 2016;6:33195. [PMID: 27616327 PMCID: PMC5018959 DOI: 10.1038/srep33195] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/09/2016] [Accepted: 08/23/2016] [Indexed: 11/17/2022] Open

230

Protein stability: computation, sequence statistics, and new experimental methods. Curr Opin Struct Biol 2016;33:161-8. [PMID: 26497286 DOI: 10.1016/j.sbi.2015.09.002] [Citation(s) in RCA: 112] [Impact Index Per Article: 14.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2015] [Revised: 09/22/2015] [Accepted: 09/24/2015] [Indexed: 11/22/2022]

231

Wong LH, Sinha S, Bergeron JR, Mellor JC, Giaever G, Flaherty P, Nislow C. Reverse Chemical Genetics: Comprehensive Fitness Profiling Reveals the Spectrum of Drug Target Interactions. PLoS Genet 2016;12:e1006275. [PMID: 27588687 PMCID: PMC5010250 DOI: 10.1371/journal.pgen.1006275] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/12/2016] [Accepted: 08/03/2016] [Indexed: 01/22/2023] Open

Abstract

The emergence and prevalence of drug resistance demands streamlined strategies to identify drug-resistant variants in a fast, systematic and cost-effective way. Methods commonly used to understand and predict drug resistance rely on limited clinical studies from patients who are refractory to drugs or on laborious evolution experiments with poor coverage of the gene variants. Here, we report an integrative functional variomics methodology combining deep sequencing and a Bayesian statistical model to provide a comprehensive list of drug resistance alleles from complex variant populations. Dihydrofolate reductase, the target of the chemotherapy drug methotrexate, was used as a model to identify functional mutant alleles correlated with methotrexate resistance. This systematic approach identified previously reported resistance mutations, as well as novel point mutations that were validated in vivo. Use of this systematic strategy as a routine diagnostics tool widens the scope of successful drug research and development.

One of the most profound outcomes of fast, reliable genome sequencing is the ability to tailor drug therapy to an individual’s genotype. This ‘personalized’ or ‘precision medicine’ is the realization of a decades-long effort to maximize drug effect and limit unwanted side effects. An undesirable consequence of such targeted therapies, however, is the emergence of drug resistance. This outcome is the result of an evolutionary process where mutations in the drug target render the drug ineffective and allow such mutant cells to proliferate. Because of the unbiased and stochastic nature of the emergence of drug resistance, it is impossible to predict. We developed a test where hundreds of thousands of mutant cells are exposed to a drug simultaneously and those cells that modulate resistance survive. This method is innovative because it partners a high-throughput experimental protocol with a tailored statistical model to identify all mutations that modulate resistance. Finally, we used synthetic biology to re-create these mutations and demonstrate that they were, in fact, bona fide drug-resistant variants. These mutations were further extended and confirmed to also be resistant in the human orthologue. This combined biological-computational approach allows one to identify the degree of resistance to a drug to guide both treatments and future drug discovery.

232

The power of multiplexed functional analysis of genetic variants. Nat Protoc 2016;11:1782-7. [PMID: 27583640 DOI: 10.1038/nprot.2016.135] [Citation(s) in RCA: 92] [Impact Index Per Article: 11.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2016] [Accepted: 07/13/2016] [Indexed: 12/30/2022]

233

Tripathi A, Gupta K, Khare S, Jain PC, Patel S, Kumar P, Pulianmackal AJ, Aghera N, Varadarajan R. Molecular Determinants of Mutant Phenotypes, Inferred from Saturation Mutagenesis Data. Mol Biol Evol 2016;33:2960-2975. [PMID: 27563054 PMCID: PMC5062330 DOI: 10.1093/molbev/msw182] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022] Open

234

Rapid construction of metabolite biosensors using domain-insertion profiling. Nat Commun 2016;7:12266. [PMID: 27470466 PMCID: PMC4974565 DOI: 10.1038/ncomms12266] [Citation(s) in RCA: 82] [Impact Index Per Article: 10.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/18/2015] [Accepted: 06/15/2016] [Indexed: 12/15/2022] Open

235

A Statistical Guide to the Design of Deep Mutational Scanning Experiments. Genetics 2016;204:77-87. [PMID: 27412710 DOI: 10.1534/genetics.116.190462] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2016] [Accepted: 06/29/2016] [Indexed: 12/21/2022] Open

236

Cooper GM. Parlez-vous VUS? Genome Res 2016;25:1423-6. [PMID: 26430151 PMCID: PMC4579326 DOI: 10.1101/gr.190116.115] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]

237

Wu NC, Dai L, Olson CA, Lloyd-Smith JO, Sun R. Adaptation in protein fitness landscapes is facilitated by indirect paths. eLife 2016;5. [PMID: 27391790 PMCID: PMC4985287 DOI: 10.7554/elife.16965] [Citation(s) in RCA: 123] [Impact Index Per Article: 15.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2016] [Accepted: 07/07/2016] [Indexed: 12/11/2022] Open

Abstract

The structure of fitness landscapes is critical for understanding adaptive protein evolution. Previous empirical studies on fitness landscapes were confined to either the neighborhood around the wild type sequence, involving mostly single and double mutants, or a combinatorially complete subgraph involving only two amino acids at each site. In reality, the dimensionality of protein sequence space is higher (20^L) and there may be higher-order interactions among more than two sites. Here we experimentally characterized the fitness landscape of four sites in protein GB1, containing 20⁴ = 160,000 variants. We found that while reciprocal sign epistasis blocked many direct paths of adaptation, such evolutionary traps could be circumvented by indirect paths through genotype space involving gain and subsequent loss of mutations. These indirect paths alleviate the constraint on adaptive protein evolution, suggesting that the heretofore neglected dimensions of sequence space may change our views on how proteins evolve.

DOI:http://dx.doi.org/10.7554/eLife.16965.001

Proteins can evolve over time by changing their component parts, which are called amino acids. These changes usually happen one at a time and natural selection tends to preserve those changes that make the protein more efficient at its specific tasks, while discarding those that impair the protein’s activity. However, the effect of each change depends on the protein as a whole, and so two changes that separately make the protein worse can make it much better if they occur together. This phenomenon is called epistasis and in some cases it can trap proteins in a sub-optimal form and prevent them from improving further.

Proteins are made from twenty different kinds of amino acid, and there are millions of different combinations of amino acids that could, in theory, make a protein of a given length. Studying protein evolution involves making variants of the same protein, each with just a few changes, and comparing how efficient, or “fit”, they are. Previous studies only measured the fitness of a few variants and showed that epistasis could block protein evolution by requiring the protein to lose some fitness before it could improve further. However, new techniques have now made it easier to study protein evolution by testing many more protein variants.

Wu, Dai et al. focused on four amino acids in part of a protein called GB1 and tested the efficiency of every possible combination of these four amino acids, a total of 160,000 (20⁴) variants. Contrary to expectations, the results suggested that the protein could evolve quickly to maximise fitness despite there being epistasis between the four amino acids. Overcoming epistasis typically involved making a change to one amino acid that paved the way for further changes while avoiding the need to lose fitness. The original change could then be reversed once the epistasis was overcome. The complexity of this solution means it can only be seen by studying a large number of protein variants that represent many alternative sequences of protein changes.

Wu, Dai et al. conclude that proteins are able to achieve a higher level of fitness through evolution by exploring a large number of changes. There are many possible changes for each protein and it is this variety that, despite epistasis, allows proteins to become naturally optimised for the tasks that they perform. While the full complexity of protein evolution cannot be explored at the moment, as technology advances it will become possible to study more protein variants. Such advances would therefore hopefully allow researchers to discover even more about the natural mechanisms of protein evolution.

DOI:http://dx.doi.org/10.7554/eLife.16965.002

238

Mannakee BK, Gutenkunst RN. Selection on Network Dynamics Drives Differential Rates of Protein Domain Evolution. PLoS Genet 2016;12:e1006132. [PMID: 27380265 PMCID: PMC4933380 DOI: 10.1371/journal.pgen.1006132] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/05/2016] [Accepted: 05/27/2016] [Indexed: 11/19/2022] Open

239

Stiffler MA, Subramanian SK, Salinas VH, Ranganathan R. A Protocol for Functional Assessment of Whole-Protein Saturation Mutagenesis Libraries Utilizing High-Throughput Sequencing. J Vis Exp 2016. [PMID: 27403811 DOI: 10.3791/54119] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/31/2022] Open

240

Starr TN, Thornton JW. Epistasis in protein evolution. Protein Sci 2016;25:1204-18. [PMID: 26833806 PMCID: PMC4918427 DOI: 10.1002/pro.2897] [Citation(s) in RCA: 296] [Impact Index Per Article: 37.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/02/2015] [Revised: 01/25/2016] [Accepted: 01/27/2016] [Indexed: 01/18/2023]

241

Abriata LA, Bovigny C, Dal Peraro M. Detection and sequence/structure mapping of biophysical constraints to protein variation in saturated mutational libraries and protein sequence alignments with a dedicated server. BMC Bioinformatics 2016;17:242. [PMID: 27315797 PMCID: PMC4912743 DOI: 10.1186/s12859-016-1124-4] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/01/2016] [Accepted: 06/07/2016] [Indexed: 11/21/2022] Open

Abstract

Background

Protein variability can now be studied by measuring high-resolution tolerance-to-substitution maps and fitness landscapes in saturated mutational libraries. But these rich and expensive datasets are typically interpreted coarsely, restricting detailed analyses to positions of extremely high or low variability or to positions dubbed important beforehand based on existing knowledge about active sites, interaction surfaces, (de)stabilizing mutations, etc.

Results

Our new webserver PsychoProt (freely available without registration at http://psychoprot.epfl.ch or at http://lucianoabriata.altervista.org/psychoprot/index.html) helps to detect, quantify, and sequence/structure map the biophysical and biochemical traits that shape amino acid preferences throughout a protein as determined by deep-sequencing of saturated mutational libraries or from large alignments of naturally occurring variants.

Discussion

We exemplify how PsychoProt helps to (i) unveil protein structure-function relationships from experiments and from alignments that are consistent with structures according to coevolution analysis, (ii) recall global information about structural and functional features and identify hitherto unknown constraints to variation in alignments, and (iii) point at different sources of variation among related experimental datasets or between experimental and alignment-based data. Remarkably, metabolic costs of the amino acids pose strong constraints to variability at protein surfaces in nature but not in the laboratory. This and other differences call for caution when extrapolating results from in vitro experiments to natural scenarios in, for example, studies of protein evolution.

Conclusion

We show through examples how PsychoProt can be a useful tool for the broad communities of structural biology and molecular evolution, particularly for studies about protein modeling, evolution and design.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-016-1124-4) contains supplementary material, which is available to authorized users.

242

Local fitness landscape of the green fluorescent protein. Nature 2016;533:397-401. [PMID: 27193686 PMCID: PMC4968632 DOI: 10.1038/nature17995] [Citation(s) in RCA: 282] [Impact Index Per Article: 35.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2015] [Accepted: 04/07/2016] [Indexed: 01/16/2023]

243

Boyer S, Biswas D, Kumar Soshee A, Scaramozzino N, Nizak C, Rivoire O. Hierarchy and extremes in selections from pools of randomized proteins. Proc Natl Acad Sci U S A 2016;113:3482-7. [PMID: 26969726 PMCID: PMC4822605 DOI: 10.1073/pnas.1517813113] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

244

Peterman N, Levine E. Sort-seq under the hood: implications of design choices on large-scale characterization of sequence-function relations. BMC Genomics 2016;17:206. [PMID: 26956374 PMCID: PMC4784318 DOI: 10.1186/s12864-016-2533-5] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/06/2015] [Accepted: 02/25/2016] [Indexed: 12/22/2022] Open

Abstract

Background

Sort-seq is an effective approach for simultaneous activity measurements in a large-scale library, combining flow cytometry, deep sequencing, and statistical inference. Such assays enable the characterization of functional landscapes at unprecedented scale for a wide-reaching array of biological molecules and functionalities in vivo. Applications of sort-seq range from footprinting to establishing quantitative models of biological systems and rational design of synthetic genetic elements. Nearly as diverse are implementations of this technique, reflecting key design choices with extensive impact on the scope and accuracy of the results. Yet how to make these choices remains unclear. Here we investigate the effects of alternative sort-seq designs and inference methods on the information output using mathematical formulation and simulations.

Results

We identify key intrinsic properties of any system of interest with practical implications for sort-seq assays, depending on the experimental goals. The fluorescence range and cell-to-cell variability specify the number of sorted populations needed for quantitative measurements that are precise and unbiased. These factors also indicate cases where an enrichment-based approach that uses a single sorted population can offer satisfactory results. These predictions of our model are corroborated using re-analysis of published data. We explore implications of these results for quantitative modeling and library design.

Conclusions

Sort-seq assays can be streamlined by reducing the number of sorted populations, saving considerable resources. Simple preliminary experiments can guide optimal experiment design, minimizing cost while maintaining the maximal information output and avoiding latent biases. These insights can facilitate future applications of this highly adaptable technique.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-016-2533-5) contains supplementary material, which is available to authorized users.

245

Phillips AM, Shoulders MD. The Path of Least Resistance: Mechanisms to Reduce Influenza's Sensitivity to Oseltamivir. J Mol Biol 2016;428:533-537. [PMID: 26748011 DOI: 10.1016/j.jmb.2015.12.019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]

246

A benchmark study on error-correction by read-pairing and tag-clustering in amplicon-based deep sequencing. BMC Genomics 2016;17:108. [PMID: 26868371 PMCID: PMC4751728 DOI: 10.1186/s12864-016-2388-9] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/17/2015] [Accepted: 01/08/2016] [Indexed: 11/10/2022] Open

247

Echave J, Spielman SJ, Wilke CO. Causes of evolutionary rate variation among protein sites. Nat Rev Genet 2016;17:109-21. [PMID: 26781812 DOI: 10.1038/nrg.2015.18] [Citation(s) in RCA: 176] [Impact Index Per Article: 22.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]

248

Tinberg CE, Khare SD. Improving Binding Affinity and Selectivity of Computationally Designed Ligand-Binding Proteins Using Experiments. Methods Mol Biol 2016;1414:155-171. [PMID: 27094290 DOI: 10.1007/978-1-4939-3569-7_9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]

249

Generating High-Accuracy Peptide-Binding Data in High Throughput with Yeast Surface Display and SORTCERY. Methods Mol Biol 2016;1414:233-47. [PMID: 27094295 DOI: 10.1007/978-1-4939-3569-7_14] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]

250

Sahoo A, Khare S, Devanarayanan S, Jain PC, Varadarajan R. Residue proximity information and protein model discrimination using saturation-suppressor mutagenesis. eLife 2015;4. [PMID: 26716404 PMCID: PMC4758949 DOI: 10.7554/elife.09532] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/18/2015] [Accepted: 12/29/2015] [Indexed: 12/16/2022] Open

Abstract

Identification of residue-residue contacts from primary sequence can be used to guide protein structure prediction. Using Escherichia coli CcdB as the test case, we describe an experimental method termed saturation-suppressor mutagenesis to acquire residue contact information. In this methodology, for each of five inactive CcdB mutants, exhaustive screens for suppressors were performed. Proximal suppressors were accurately discriminated from distal suppressors based on their phenotypes when present as single mutants. Experimentally identified putative proximal pairs formed spatial constraints to recover >98% of native-like models of CcdB from a decoy dataset. Suppressor methodology was also applied to the integral membrane protein, diacylglycerol kinase A where the structures determined by X-ray crystallography and NMR were significantly different. Suppressor as well as sequence co-variation data clearly point to the X-ray structure being the functional one adopted in vivo. The methodology is applicable to any macromolecular system for which a convenient phenotypic assay exists.

DOI:http://dx.doi.org/10.7554/eLife.09532.001

Common techniques to determine the three-dimensional structures of proteins can help researchers to understand these molecules’ activities, but are often time-consuming and do not work for all proteins. Proteins are made of chains of amino acids. When a protein chain folds, some of these amino acids interact with other amino acids and these contacts dictate the overall shape of the protein. This means that identifying the pairs of contacting amino acids could make it possible to predict the protein’s structure.

Interactions between pairs of contacting amino acids tend to remain conserved throughout evolution, and if a mutation alters one of the amino acids in a pair then a 'compensatory' change often occurs to alter the second amino acid as well. Compensatory mutations can suggest that two amino acids are close to each other in the three-dimensional shape of a protein, but the computational methods used to identify such amino acid pairs can sometimes be inaccurate.

In 2012, researchers generated mutants of a bacterial protein called CcdB with changes to single amino acids that caused the protein to fail to fold correctly. Now, Sahoo et al. – who include two of the researchers involved in the 2012 work – have developed an experimental method to identify contacting amino acids and use the CcdB protein as a test case. The approach involved searching for additional mutations that could restore the activity of five of the original mutant proteins when the proteins were produced in yeast cells. The rationale was that any secondary mutations that restored the activity must have corrected the folding defect caused by the original mutation. Sahoo et al. then predicted how close the amino acids affected by the secondary mutations were to the amino acids altered by the original mutations. This information was used to select reliable three-dimensional models of CcdB from a large set of possible structures that had been generated previously using computer models.

Next, the technique was applied to a protein called diacylglycerol kinase A. The structure of this protein had previously been inferred using techniques such as X-ray crystallography and nuclear magnetic resonance, but there was a mismatch between the two methods. Sahoo et al. found that the amino acid contacts derived from their experimental method matched those found in the crystal structure, suggesting that the functional protein structure in living cells is similar to the crystal structure. In the future, the experimental approach developed in this work could be combined with existing methods to reliably guide protein structure prediction.

DOI:http://dx.doi.org/10.7554/eLife.09532.002
