Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Sailer ZR, Harms MJ. Detecting High-Order Epistasis in Nonlinear Genotype-Phenotype Maps. Genetics 2017;205:1079-88. [PMID: 28100592 DOI: 10.1534/genetics.116.195214] [Citation(s) in RCA: 84] [Impact Index Per Article: 12.0] [Received: 08/29/2016] [Accepted: 01/09/2017] [Indexed: 11/18/2022] Open

Cited by Other Article(s)

Faure AJ, Martí-Aranda A, Hidalgo-Carcedo C, Beltran A, Schmiedel JM, Lehner B. The genetic architecture of protein stability. Nature 2024:10.1038/s41586-024-07966-0. [PMID: 39322666 DOI: 10.1038/s41586-024-07966-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Accepted: 08/20/2024] [Indexed: 09/27/2024]

Park Y, Metzger BPH, Thornton JW. The simplicity of protein sequence-function relationships. Nat Commun 2024;15:7953. [PMID: 39261454 PMCID: PMC11390738 DOI: 10.1038/s41467-024-51895-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2024] [Accepted: 08/20/2024] [Indexed: 09/13/2024] Open

Welsh FC, Eguia RT, Lee JM, Haddox HK, Galloway J, Van Vinh Chau N, Loes AN, Huddleston J, Yu TC, Quynh Le M, Nhat NTD, Thi Le Thanh N, Greninger AL, Chu HY, Englund JA, Bedford T, Matsen FA, Boni MF, Bloom JD. Age-dependent heterogeneity in the antigenic effects of mutations to influenza hemagglutinin. Cell Host Microbe 2024;32:1397-1411.e11. [PMID: 39032493 PMCID: PMC11329357 DOI: 10.1016/j.chom.2024.06.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Revised: 04/19/2024] [Accepted: 06/25/2024] [Indexed: 07/23/2024]

Affiliation(s)

Frances C Welsh Molecular and Cellular Biology Graduate Program, University of Washington, and Basic Sciences Division, Fred Hutch Cancer Center, Seattle, WA 98109, USA; Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA
Rachel T Eguia Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA; Howard Hughes Medical Institute, Seattle, WA 98109, USA
Juhye M Lee Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA; Howard Hughes Medical Institute, Seattle, WA 98109, USA
Hugh K Haddox Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA
Jared Galloway Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA
Nguyen Van Vinh Chau Wellcome Trust Major Overseas Programme, Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam
Andrea N Loes Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA; Howard Hughes Medical Institute, Seattle, WA 98109, USA
John Huddleston Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA
Timothy C Yu Molecular and Cellular Biology Graduate Program, University of Washington, and Basic Sciences Division, Fred Hutch Cancer Center, Seattle, WA 98109, USA; Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA
Mai Quynh Le National Institutes for Hygiene and Epidemiology, Hanoi, Vietnam
Nguyen T D Nhat Wellcome Trust Major Overseas Programme, Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam; Centre for Tropical Medicine, Nuffield Department of Clinical Medicine, University of Oxford, Oxford, UK
Nguyen Thi Le Thanh Wellcome Trust Major Overseas Programme, Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam
Alexander L Greninger Department of Laboratory Medicine and Pathology, University of Washington School of Medicine, Seattle, WA 98195, USA; Division of Allergy and Infectious Diseases, University of Washington School of Medicine, Seattle, WA 98195, USA
Helen Y Chu Division of Allergy and Infectious Diseases, University of Washington School of Medicine, Seattle, WA 98195, USA
Janet A Englund Seattle Children's Research Institute, Seattle, WA 98109, USA
Trevor Bedford Howard Hughes Medical Institute, Seattle, WA 98109, USA; Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA
Frederick A Matsen Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA; Howard Hughes Medical Institute, Seattle, WA 98109, USA
Maciej F Boni Wellcome Trust Major Overseas Programme, Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam; Centre for Tropical Medicine, Nuffield Department of Clinical Medicine, University of Oxford, Oxford, UK; Center for Infectious Disease Dynamics, Department of Biology, Pennsylvania State University, University Park, PA 16802, USA
Jesse D Bloom Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA; Howard Hughes Medical Institute, Seattle, WA 98109, USA.

Chamness LM, Kuntz CP, McKee AG, Penn WD, Hemmerich CM, Rusch DB, Woods H, Dyotima, Meiler J, Schlebach JP. Divergent folding-mediated epistasis among unstable membrane protein variants. eLife 2024;12:RP92406. [PMID: 39078397 PMCID: PMC11288631 DOI: 10.7554/elife.92406] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/31/2024] Open

Posfai A, Zhou J, McCandlish DM, Kinney JB. Gauge fixing for sequence-function relationships. bioRxiv 2024:2024.05.12.593772. [PMID: 38798671 PMCID: PMC11118547 DOI: 10.1101/2024.05.12.593772] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/29/2024]

Diaz-Colunga J, Skwara A, Vila JCC, Bajic D, Sanchez A. Global epistasis and the emergence of function in microbial consortia. Cell 2024;187:3108-3119.e30. [PMID: 38776921 DOI: 10.1016/j.cell.2024.04.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Revised: 12/06/2023] [Accepted: 04/16/2024] [Indexed: 05/25/2024]

Hulse SV, Bruns EL. The Emergence of Non-Linear Evolutionary Trade-offs and the Maintenance of Genetic Polymorphisms. bioRxiv 2024:2024.05.29.595890. [PMID: 38853830 PMCID: PMC11160725 DOI: 10.1101/2024.05.29.595890] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2024]

Metzger BPH, Park Y, Starr TN, Thornton JW. Epistasis facilitates functional evolution in an ancient transcription factor. eLife 2024;12:RP88737. [PMID: 38767330 PMCID: PMC11105156 DOI: 10.7554/elife.88737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024] Open

Abstract

A protein's genetic architecture - the set of causal rules by which its sequence produces its functions - also determines its possible evolutionary trajectories. Prior research has proposed that the genetic architecture of proteins is very complex, with pervasive epistatic interactions that constrain evolution and make function difficult to predict from sequence. Most of this work has analyzed only the direct paths between two proteins of interest - excluding the vast majority of possible genotypes and evolutionary trajectories - and has considered only a single protein function, leaving unaddressed the genetic architecture of functional specificity and its impact on the evolution of new functions. Here, we develop a new method based on ordinal logistic regression to directly characterize the global genetic determinants of multiple protein functions from 20-state combinatorial deep mutational scanning (DMS) experiments. We use it to dissect the genetic architecture and evolution of a transcription factor's specificity for DNA, using data from a combinatorial DMS of an ancient steroid hormone receptor's capacity to activate transcription from two biologically relevant DNA elements. We show that the genetic architecture of DNA recognition consists of a dense set of main and pairwise effects that involve virtually every possible amino acid state in the protein-DNA interface, but higher-order epistasis plays only a tiny role. Pairwise interactions enlarge the set of functional sequences and are the primary determinants of specificity for different DNA elements. They also massively expand the number of opportunities for single-residue mutations to switch specificity from one DNA target to another. By bringing variants with different functions close together in sequence space, pairwise epistasis therefore facilitates rather than constrains the evolution of new functions.

Behr M, Kumbier K, Cordova-Palomera A, Aguirre M, Ronen O, Ye C, Ashley E, Butte AJ, Arnaout R, Brown B, Priest J, Yu B. Learning epistatic polygenic phenotypes with Boolean interactions. PLoS One 2024;19:e0298906. [PMID: 38625909 PMCID: PMC11020961 DOI: 10.1371/journal.pone.0298906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Accepted: 01/31/2024] [Indexed: 04/18/2024] Open

Abstract

Detecting epistatic drivers of human phenotypes is a considerable challenge. Traditional approaches use regression to sequentially test multiplicative interaction terms involving pairs of genetic variants. For higher-order interactions and genome-wide large-scale data, this strategy is computationally intractable. Moreover, multiplicative terms used in regression modeling may not capture the form of biological interactions. Building on the Predictability, Computability, Stability (PCS) framework, we introduce the epiTree pipeline to extract higher-order interactions from genomic data using tree-based models. The epiTree pipeline first selects a set of variants derived from tissue-specific estimates of gene expression. Next, it uses iterative random forests (iRF) to search training data for candidate Boolean interactions (pairwise and higher-order). We derive significance tests for interactions, based on a stabilized likelihood ratio test, by simulating Boolean tree-structured null (no epistasis) and alternative (epistasis) distributions on hold-out test data. Finally, our pipeline computes PCS epistasis p-values that probabilistically quantify improvement in prediction accuracy via bootstrap sampling on the test set. We validate the epiTree pipeline in two case studies using data from the UK Biobank: predicting red hair and multiple sclerosis (MS). In the case of predicting red hair, epiTree recovers known epistatic interactions surrounding MC1R and novel interactions, representing non-linearities not captured by logistic regression models. In the case of predicting MS, a more complex phenotype than red hair, epiTree rankings prioritize novel interactions surrounding HLA-DRB1, a variant previously associated with MS in several populations. Taken together, these results highlight the potential for epiTree rankings to help reduce the design space for follow-up experiments.

Affiliation(s)

Merle Behr Faculty of Informatics and Data Science, University of Regensburg, Regensburg, Germany
Karl Kumbier Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA, United States of America
Aldo Cordova-Palomera Department of Pediatrics, Stanford Medicine, Stanford, CA, United States of America
Matthew Aguirre Department of Pediatrics, Stanford Medicine, Stanford, CA, United States of America; Department of Biomedical Data Science, Stanford Medicine, Stanford, CA, United States of America
Omer Ronen Department of Statistics, University of California at Berkeley, Berkeley, CA, United States of America
Chengzhong Ye Department of Statistics, University of California at Berkeley, Berkeley, CA, United States of America
Euan Ashley Division of Cardiovascular Medicine, Stanford Medicine, Stanford, CA, United States of America
Atul J. Butte Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, United States of America
Rima Arnaout Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, United States of America; Division of Cardiology, Department of Medicine, University of California, San Francisco, San Francisco, CA, United States of America
Ben Brown Department of Statistics, University of California at Berkeley, Berkeley, CA, United States of America; Biosciences Area, Lawrence Berkeley National Laboratory, Berkeley, CA, United States of America
James Priest Department of Pediatrics, Stanford Medicine, Stanford, CA, United States of America
Bin Yu Department of Statistics, University of California at Berkeley, Berkeley, CA, United States of America; Department of Electrical Engineering and Computer Sciences and Center for Computational Biology, University of California at Berkeley, Berkeley, CA, United States of America

Park Y, Metzger BP, Thornton JW. The simplicity of protein sequence-function relationships. bioRxiv 2024:2023.09.02.556057. [PMID: 37732229 PMCID: PMC10508729 DOI: 10.1101/2023.09.02.556057] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Indexed: 09/22/2023]

Fannjiang C, Listgarten J. Is Novelty Predictable? Cold Spring Harb Perspect Biol 2024;16:a041469. [PMID: 38052497 PMCID: PMC10835614 DOI: 10.1101/cshperspect.a041469] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/07/2023]

Dupic T, Phillips AM, Desai MM. Protein sequence landscapes are not so simple: on reference-free versus reference-based inference. bioRxiv 2024:2024.01.29.577800. [PMID: 38352387 PMCID: PMC10862727 DOI: 10.1101/2024.01.29.577800] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2024]

Buda K, Miton CM, Tokuriki N. Pervasive epistasis exposes intramolecular networks in adaptive enzyme evolution. Nat Commun 2023;14:8508. [PMID: 38129396 PMCID: PMC10739712 DOI: 10.1038/s41467-023-44333-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2023] [Accepted: 12/08/2023] [Indexed: 12/23/2023] Open

Eble H, Joswig M, Lamberti L, Ludington WB. Master regulators of biological systems in higher dimensions. Proc Natl Acad Sci U S A 2023;120:e2300634120. [PMID: 38096409 PMCID: PMC10743376 DOI: 10.1073/pnas.2300634120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2023] [Accepted: 10/23/2023] [Indexed: 12/18/2023] Open

Welsh FC, Eguia RT, Lee JM, Haddox HK, Galloway J, Chau NVV, Loes AN, Huddleston J, Yu TC, Le MQ, Nhat NTD, Thanh NTL, Greninger AL, Chu HY, Englund JA, Bedford T, Matsen FA, Boni MF, Bloom JD. Age-dependent heterogeneity in the antigenic effects of mutations to influenza hemagglutinin. bioRxiv 2023:2023.12.12.571235. [PMID: 38168237 PMCID: PMC10760046 DOI: 10.1101/2023.12.12.571235] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 01/05/2024]

Affiliation(s)

Frances C Welsh Molecular and Cellular Biology Graduate Program, University of Washington, and Basic Sciences Division, Fred Hutch Cancer Center, Seattle, WA, 98109, USA; Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA, 98109, USA
Rachel T Eguia Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA, 98109, USA; Howard Hughes Medical Institute, Seattle, WA, 98109, USA
Juhye M Lee Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA, 98109, USA; Howard Hughes Medical Institute, Seattle, WA, 98109, USA
Hugh K Haddox Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA, 98109, USA
Jared Galloway Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA, 98109, USA
Nguyen Van Vinh Chau Wellcome Trust Major Overseas Programme, Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam
Andrea N Loes Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA, 98109, USA; Howard Hughes Medical Institute, Seattle, WA, 98109, USA
John Huddleston Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center, Seattle, WA, 98109, USA
Timothy C Yu Molecular and Cellular Biology Graduate Program, University of Washington, and Basic Sciences Division, Fred Hutch Cancer Center, Seattle, WA, 98109, USA; Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA, 98109, USA
Mai Quynh Le National Institutes for Hygiene and Epidemiology, Hanoi, Vietnam
Nguyen T D Nhat Wellcome Trust Major Overseas Programme, Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam; Centre for Tropical Medicine, Nuffield Department of Clinical Medicine, University of Oxford, Oxford, United Kingdom
Nguyen Thi Le Thanh Wellcome Trust Major Overseas Programme, Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam
Alexander L Greninger Department of Laboratory Medicine and Pathology, University of Washington School of Medicine, Seattle, WA, 98195, USA; Division of Allergy and Infectious Diseases, University of Washington School of Medicine, Seattle, WA, 98195, USA
Helen Y Chu Division of Allergy and Infectious Diseases, University of Washington School of Medicine, Seattle, WA, 98195, USA
Janet A Englund Seattle Children's Research Institute, Seattle, WA, 98109, USA
Trevor Bedford Howard Hughes Medical Institute, Seattle, WA, 98109, USA; Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center, Seattle, WA, 98109, USA
Frederick A Matsen Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA, 98109, USA; Howard Hughes Medical Institute, Seattle, WA, 98109, USA
Maciej F Boni Wellcome Trust Major Overseas Programme, Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam; Centre for Tropical Medicine, Nuffield Department of Clinical Medicine, University of Oxford, Oxford, United Kingdom; Center for Infectious Disease Dynamics, Department of Biology, Pennsylvania State University, University Park, PA, 16802, USA
Jesse D Bloom Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA, 98109, USA; Howard Hughes Medical Institute, Seattle, WA, 98109, USA

Arya S, George AB, O’Dwyer JP. Sparsity of higher-order landscape interactions enables learning and prediction for microbiomes. Proc Natl Acad Sci U S A 2023;120:e2307313120. [PMID: 37991947 PMCID: PMC10691334 DOI: 10.1073/pnas.2307313120] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/02/2023] [Accepted: 10/16/2023] [Indexed: 11/24/2023] Open

Santorsola M, Lescai F. The promise of explainable deep learning for omics data analysis: Adding new discovery tools to AI. N Biotechnol 2023;77:1-11. [PMID: 37329982 DOI: 10.1016/j.nbt.2023.06.002] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/27/2023] [Revised: 06/01/2023] [Accepted: 06/14/2023] [Indexed: 06/19/2023]

Charest N, Shen Y, Lai YC, Chen IA, Shea JE. Discovering pathways through ribozyme fitness landscapes using information theoretic quantification of epistasis. RNA 2023;29:1644-1657. [PMID: 37580126 PMCID: PMC10578471 DOI: 10.1261/rna.079541.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2022] [Accepted: 07/29/2023] [Indexed: 08/16/2023]

Yitbarek S, Guittar J, Knutie SA, Ogbunugafor CB. Deconstructing taxa x taxa x environment interactions in the microbiota: A theoretical examination. iScience 2023;26:107875. [PMID: 37860776 PMCID: PMC10583047 DOI: 10.1016/j.isci.2023.107875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2022] [Revised: 03/21/2023] [Accepted: 09/07/2023] [Indexed: 10/21/2023] Open

Haddox HK, Galloway JG, Dadonaite B, Bloom JD, Matsen IV FA, DeWitt WS. Jointly modeling deep mutational scans identifies shifted mutational effects among SARS-CoV-2 spike homologs. bioRxiv 2023:2023.07.31.551037. [PMID: 37577604 PMCID: PMC10418112 DOI: 10.1101/2023.07.31.551037] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 08/15/2023]

Radford CE, Schommers P, Gieselmann L, Crawford KHD, Dadonaite B, Yu TC, Dingens AS, Overbaugh J, Klein F, Bloom JD. Mapping the neutralizing specificity of human anti-HIV serum by deep mutational scanning. Cell Host Microbe 2023;31:1200-1215.e9. [PMID: 37327779 PMCID: PMC10351223 DOI: 10.1016/j.chom.2023.05.025] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 03/23/2023] [Revised: 05/15/2023] [Accepted: 05/23/2023] [Indexed: 06/18/2023]

Affiliation(s)

Caelan E Radford Molecular and Cellular Biology Graduate Program, University of Washington and Basic Sciences Division, Fred Hutch Cancer Center, Seattle, WA 98109, USA; Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA
Philipp Schommers Laboratory of Experimental Immunology, Institute of Virology, Faculty of Medicine and University Hospital of Cologne, University of Cologne, 50931 Cologne, Germany; German Center for Infection Research, partner site Bonn-Cologne, 50931 Cologne, Germany; Department I of Internal Medicine, Faculty of Medicine and University Hospital of Cologne, University of Cologne, 50931 Cologne, Germany
Lutz Gieselmann Laboratory of Experimental Immunology, Institute of Virology, Faculty of Medicine and University Hospital of Cologne, University of Cologne, 50931 Cologne, Germany; German Center for Infection Research, partner site Bonn-Cologne, 50931 Cologne, Germany; Department I of Internal Medicine, Faculty of Medicine and University Hospital of Cologne, University of Cologne, 50931 Cologne, Germany
Katharine H D Crawford Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA; Department of Genome Sciences & Medical Scientist Training Program, University of Washington, Seattle, WA 98109, USA
Bernadeta Dadonaite Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA
Timothy C Yu Molecular and Cellular Biology Graduate Program, University of Washington and Basic Sciences Division, Fred Hutch Cancer Center, Seattle, WA 98109, USA; Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA
Adam S Dingens Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA
Julie Overbaugh Division of Human Biology, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA
Florian Klein Laboratory of Experimental Immunology, Institute of Virology, Faculty of Medicine and University Hospital of Cologne, University of Cologne, 50931 Cologne, Germany; German Center for Infection Research, partner site Bonn-Cologne, 50931 Cologne, Germany; Department I of Internal Medicine, Faculty of Medicine and University Hospital of Cologne, University of Cologne, 50931 Cologne, Germany
Jesse D Bloom Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA; Howard Hughes Medical Institute, Seattle, WA 98109, USA.

Diaz-Colunga J, Skwara A, Gowda K, Diaz-Uriarte R, Tikhonov M, Bajic D, Sanchez A. Global epistasis on fitness landscapes. Philos Trans R Soc Lond B Biol Sci 2023;378:20220053. [PMID: 37004717 PMCID: PMC10067270 DOI: 10.1098/rstb.2022.0053] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 07/30/2022] [Accepted: 11/23/2022] [Indexed: 04/04/2023] Open

Chen Y, Hu R, Li K, Zhang Y, Fu L, Zhang J, Si T. Deep Mutational Scanning of an Oxygen-Independent Fluorescent Protein CreiLOV for Comprehensive Profiling of Mutational and Epistatic Effects. ACS Synth Biol 2023;12:1461-1473. [PMID: 37066862 PMCID: PMC10204710 DOI: 10.1021/acssynbio.2c00662] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/08/2022] [Indexed: 04/18/2023]

Radford CE, Schommers P, Gieselmann L, Crawford KHD, Dadonaite B, Yu TC, Dingens AS, Overbaugh J, Klein F, Bloom JD. Mapping the neutralizing specificity of human anti-HIV serum by deep mutational scanning. bioRxiv 2023:2023.03.23.533993. [PMID: 36993197 PMCID: PMC10055425 DOI: 10.1101/2023.03.23.533993] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]

Affiliation(s)

Caelan E. Radford Molecular and Cellular Biology Graduate Program, University of Washington, and Basic Sciences Division, Fred Hutch Cancer Center, Seattle, Washington, 98109, USA; Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, Washington, 98109, USA
Philipp Schommers Laboratory of Experimental Immunology, Institute of Virology, Faculty of Medicine and University Hospital of Cologne, University of Cologne, 50931 Cologne, Germany; German Center for Infection Research, partner site Bonn–Cologne, 50931 Cologne, Germany; Department I of Internal Medicine, Faculty of Medicine and University Hospital of Cologne, University of Cologne, 50931 Cologne, Germany
Lutz Gieselmann Laboratory of Experimental Immunology, Institute of Virology, Faculty of Medicine and University Hospital of Cologne, University of Cologne, 50931 Cologne, Germany; German Center for Infection Research, partner site Bonn–Cologne, 50931 Cologne, Germany; Department I of Internal Medicine, Faculty of Medicine and University Hospital of Cologne, University of Cologne, 50931 Cologne, Germany
Katharine H. D. Crawford Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, Washington, 98109, USA; Department of Genome Sciences & Medical Scientist Training Program, University of Washington, Seattle, Washington, 98109, USA
Bernadeta Dadonaite Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, Washington, 98109, USA
Timothy C. Yu Molecular and Cellular Biology Graduate Program, University of Washington, and Basic Sciences Division, Fred Hutch Cancer Center, Seattle, Washington, 98109, USA; Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, Washington, 98109, USA
Adam S. Dingens Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, Washington, 98109, USA
Julie Overbaugh Division of Human Biology, Fred Hutchinson Cancer Center, Seattle, Washington, 98109, USA
Florian Klein Laboratory of Experimental Immunology, Institute of Virology, Faculty of Medicine and University Hospital of Cologne, University of Cologne, 50931 Cologne, Germany; German Center for Infection Research, partner site Bonn–Cologne, 50931 Cologne, Germany; Department I of Internal Medicine, Faculty of Medicine and University Hospital of Cologne, University of Cologne, 50931 Cologne, Germany
Jesse D. Bloom Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, Washington, 98109, USA; Howard Hughes Medical Institute, Seattle, WA, 98109, USA

Dadonaite B, Crawford KHD, Radford CE, Farrell AG, Yu TC, Hannon WW, Zhou P, Andrabi R, Burton DR, Liu L, Ho DD, Chu HY, Neher RA, Bloom JD. A pseudovirus system enables deep mutational scanning of the full SARS-CoV-2 spike. Cell 2023;186:1263-1278.e20. [PMID: 36868218 PMCID: PMC9922669 DOI: 10.1016/j.cell.2023.02.001] [Citation(s) in RCA: 61] [Impact Index Per Article: 61.0] [Received: 10/13/2022] [Revised: 01/11/2023] [Accepted: 01/31/2023] [Indexed: 02/15/2023]

Affiliation(s)

Bernadeta Dadonaite Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA
Katharine H D Crawford Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA; Department of Genome Sciences & Medical Scientist Training Program, University of Washington, Seattle, WA 98109, USA
Caelan E Radford Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA; Molecular and Cellular Biology Graduate Program, University of Washington, Seattle, WA 98109, USA
Ariana G Farrell Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA
Timothy C Yu Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA; Molecular and Cellular Biology Graduate Program, University of Washington, Seattle, WA 98109, USA
William W Hannon Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA; Molecular and Cellular Biology Graduate Program, University of Washington, Seattle, WA 98109, USA
Panpan Zhou Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA; IAVI Neutralizing Antibody Center, The Scripps Research Institute, La Jolla, CA 92037, USA; Consortium for HIV/AIDS Vaccine Development (CHAVD), The Scripps Research Institute, La Jolla, CA 92037, USA
Raiees Andrabi Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA; IAVI Neutralizing Antibody Center, The Scripps Research Institute, La Jolla, CA 92037, USA; Consortium for HIV/AIDS Vaccine Development (CHAVD), The Scripps Research Institute, La Jolla, CA 92037, USA
Dennis R Burton Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA; IAVI Neutralizing Antibody Center, The Scripps Research Institute, La Jolla, CA 92037, USA; Consortium for HIV/AIDS Vaccine Development (CHAVD), The Scripps Research Institute, La Jolla, CA 92037, USA; Ragon Institute of Massachusetts General Hospital, MIT & Harvard, Cambridge, MA 02139, USA
Lihong Liu Aaron Diamond AIDS Research Center, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, USA
David D Ho Aaron Diamond AIDS Research Center, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, USA; Department of Microbiology and Immunology, Columbia University Vagelos College of Physicians and Surgeons, New York, NY 10032, USA; Division of Infectious Diseases, Department of Medicine, Columbia University Vagelos College of Physicians and Surgeons, New York, NY 10032, USA
Helen Y Chu University of Washington, Department of Medicine, Division of Allergy and Infectious Diseases, Seattle, WA, USA
Richard A Neher Biozentrum, University of Basel, Basel, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
Jesse D Bloom Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA; Howard Hughes Medical Institute, Seattle, WA 98195, USA.

Morin MA, Morrison AJ, Harms MJ, Dutton RJ. Higher-order interactions shape microbial interactions as microbial community complexity increases. Sci Rep 2022;12:22640. [PMID: 36587027 PMCID: PMC9805437 DOI: 10.1038/s41598-022-25303-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/21/2022] [Accepted: 11/28/2022] [Indexed: 01/01/2023] Open

Moulana A, Dupic T, Phillips AM, Chang J, Nieves S, Roffler AA, Greaney AJ, Starr TN, Bloom JD, Desai MM. Compensatory epistasis maintains ACE2 affinity in SARS-CoV-2 Omicron BA.1. Nat Commun 2022;13:7011. [PMID: 36384919 PMCID: PMC9668218 DOI: 10.1038/s41467-022-34506-z] [Citation(s) in RCA: 61] [Impact Index Per Article: 30.5] [Received: 09/08/2022] [Accepted: 10/26/2022] [Indexed: 11/17/2022] Open

Dadonaite B, Crawford KHD, Radford CE, Farrell AG, Yu TC, Hannon WW, Zhou P, Andrabi R, Burton DR, Liu L, Ho DD, Neher RA, Bloom JD. A pseudovirus system enables deep mutational scanning of the full SARS-CoV-2 spike. bioRxiv 2022:2022.10.13.512056. [PMID: 36263061 PMCID: PMC9580381 DOI: 10.1101/2022.10.13.512056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]

Affiliation(s)

Bernadeta Dadonaite Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, Washington, 98109, USA
Katharine H D Crawford Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, Washington, 98109, USA; Department of Genome Sciences & Medical Scientist Training Program, University of Washington, Seattle, Washington, 98109, USA
Caelan E Radford Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, Washington, 98109, USA; Molecular and Cellular Biology Graduate Program, University of Washington, and Basic Sciences Division, Fred Hutch Cancer Center, Seattle, Washington, 98109, USA
Ariana G Farrell Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, Washington, 98109, USA
Timothy C Yu Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, Washington, 98109, USA; Molecular and Cellular Biology Graduate Program, University of Washington, and Basic Sciences Division, Fred Hutch Cancer Center, Seattle, Washington, 98109, USA
William W Hannon Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, Washington, 98109, USA; Molecular and Cellular Biology Graduate Program, University of Washington, and Basic Sciences Division, Fred Hutch Cancer Center, Seattle, Washington, 98109, USA
Panpan Zhou Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA; IAVI Neutralizing Antibody Center, The Scripps Research Institute, La Jolla, CA 92037, USA; Consortium for HIV/AIDS Vaccine Development (CHAVD), The Scripps Research Institute, La Jolla, CA 92037, USA
Raiees Andrabi Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA; IAVI Neutralizing Antibody Center, The Scripps Research Institute, La Jolla, CA 92037, USA; Consortium for HIV/AIDS Vaccine Development (CHAVD), The Scripps Research Institute, La Jolla, CA 92037, USA
Dennis R Burton Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037, USA; IAVI Neutralizing Antibody Center, The Scripps Research Institute, La Jolla, CA 92037, USA; Consortium for HIV/AIDS Vaccine Development (CHAVD), The Scripps Research Institute, La Jolla, CA 92037, USA; Ragon Institute of MGH, MIT & Harvard, Cambridge, MA 02139, USA
Lihong Liu Aaron Diamond AIDS Research Center, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, USA
David D. Ho Aaron Diamond AIDS Research Center, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, USA; Department of Microbiology and Immunology, Columbia University Vagelos College of Physicians and Surgeons, New York, NY 10032, USA; Division of Infectious Diseases, Department of Medicine, Columbia University Vagelos College of Physicians and Surgeons, New York, NY 10032, USA
Richard A. Neher Biozentrum, University of Basel, Basel, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
Jesse D Bloom Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, Washington, 98109, USA; Howard Hughes Medical Institute, Seattle, WA, 98195, USA

Higher-order epistasis and phenotypic prediction. Proc Natl Acad Sci U S A 2022;119:e2204233119. [PMID: 36129941 PMCID: PMC9522415 DOI: 10.1073/pnas.2204233119] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Indexed: 01/05/2023] Open

Abstract

One core goal of genetics is to systematically understand the mapping between the DNA sequence of an organism (genotype) and its measurable characteristics (phenotype). Understanding this mapping is often challenging because of interactions between mutations, where the result of combining several different mutations can be very different than the sum of their individual effects. Here we provide a statistical framework for modeling complex genetic interactions of this type. The key idea is to ask how fast the effects of mutations change when introducing the same mutation in increasingly distant genetic backgrounds. We then propose a model for phenotypic prediction that takes into account this tendency for the effects of mutations to be more similar in nearby genetic backgrounds.

Contemporary high-throughput mutagenesis experiments are providing an increasingly detailed view of the complex patterns of genetic interaction that occur between multiple mutations within a single protein or regulatory element. By simultaneously measuring the effects of thousands of combinations of mutations, these experiments have revealed that the genotype–phenotype relationship typically reflects not only genetic interactions between pairs of sites but also higher-order interactions among larger numbers of sites. However, modeling and understanding these higher-order interactions remains challenging. Here we present a method for reconstructing sequence-to-function mappings from partially observed data that can accommodate all orders of genetic interaction. The main idea is to make predictions for unobserved genotypes that match the type and extent of epistasis found in the observed data. This information on the type and extent of epistasis can be extracted by considering how phenotypic correlations change as a function of mutational distance, which is equivalent to estimating the fraction of phenotypic variance due to each order of genetic interaction (additive, pairwise, three-way, etc.). Using these estimated variance components, we then define an empirical Bayes prior that in expectation matches the observed pattern of epistasis and reconstruct the genotype–phenotype mapping by conducting Gaussian process regression under this prior. To demonstrate the power of this approach, we present an application to the antibody-binding domain GB1 and also provide a detailed exploration of a dataset consisting of high-throughput measurements for the splicing efficiency of human pre-mRNA 5′ splice sites, for which we also validate our model predictions via additional low-throughput experiments.

Tareen A, Kooshkbaghi M, Posfai A, Ireland WT, McCandlish DM, Kinney JB. MAVE-NN: learning genotype-phenotype maps from multiplex assays of variant effect. Genome Biol 2022;23:98. [PMID: 35428271 PMCID: PMC9011994 DOI: 10.1186/s13059-022-02661-7] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 07/16/2021] [Revised: 03/21/2022] [Accepted: 03/24/2022] [Indexed: 12/17/2022] Open

Ogbunugafor CB. The mutation effect reaction norm (mu-rn) highlights environmentally dependent mutation effects and epistatic interactions. Evolution 2022;76:37-48. [PMID: 34989399 DOI: 10.1111/evo.14428] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 09/23/2021] [Accepted: 12/23/2021] [Indexed: 11/27/2022]

On the sparsity of fitness functions and implications for learning. Proc Natl Acad Sci U S A 2022;119:2109649118. [PMID: 34937698 PMCID: PMC8740588 DOI: 10.1073/pnas.2109649118] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Accepted: 11/11/2021] [Indexed: 01/05/2023] Open

Abstract

The properties of proteins and other biological molecules are encoded in large part in the sequence of amino acids or nucleotides that defines them. Increasingly, researchers estimate functions that map sequences to a particular property using machine learning and related statistical approaches. However, an important question remains unanswered: How many experimental measurements are needed in order to accurately learn these “fitness” functions? We leverage perspectives from the fields of biophysics, evolutionary biology, and signal processing to develop a theoretical framework that enables us to make progress on answering this question. We demonstrate that this framework can be used to make useful calculations on real-world data and suggest how these calculations may be used to guide experiments.

Fitness functions map biological sequences to a scalar property of interest. Accurate estimation of these functions yields biological insight and sets the foundation for model-based sequence design. However, the fitness datasets available to learn these functions are typically small relative to the large combinatorial space of sequences; characterizing how much data are needed for accurate estimation remains an open problem. There is a growing body of evidence demonstrating that empirical fitness functions display substantial sparsity when represented in terms of epistatic interactions. Moreover, the theory of Compressed Sensing provides scaling laws for the number of samples required to exactly recover a sparse function. Motivated by these results, we develop a framework to study the sparsity of fitness functions sampled from a generalization of the NK model, a widely used random field model of fitness functions. In particular, we present results that allow us to test the effect of the Generalized NK (GNK) model’s interpretable parameters—sequence length, alphabet size, and assumed interactions between sequence positions—on the sparsity of fitness functions sampled from the model and, consequently, the number of measurements required to exactly recover these functions. We validate our framework by demonstrating that GNK models with parameters set according to structural considerations can be used to accurately approximate the number of samples required to recover two empirical protein fitness functions and an RNA fitness function. In addition, we show that these GNK models identify important higher-order epistatic interactions in the empirical fitness functions using only structural information.

Shaw D, Miravet‐Verde S, Piñero‐Lambea C, Serrano L, Lluch‐Senar M. LoxTnSeq: random transposon insertions combined with cre/lox recombination and counterselection to generate large random genome reductions. Microb Biotechnol 2021;14:2403-2419. [PMID: 33325626 PMCID: PMC8601177 DOI: 10.1111/1751-7915.13714] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 07/10/2020] [Revised: 11/04/2020] [Accepted: 11/04/2020] [Indexed: 12/13/2022] Open

Phillips AM, Lawrence KR, Moulana A, Dupic T, Chang J, Johnson MS, Cvijovic I, Mora T, Walczak AM, Desai MM. Binding affinity landscapes constrain the evolution of broadly neutralizing anti-influenza antibodies. eLife 2021;10:71393. [PMID: 34491198 PMCID: PMC8476123 DOI: 10.7554/elife.71393] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 06/17/2021] [Accepted: 09/05/2021] [Indexed: 12/12/2022] Open

Aghazadeh A, Nisonoff H, Ocal O, Brookes DH, Huang Y, Koyluoglu OO, Listgarten J, Ramchandran K. Epistatic Net allows the sparse spectral regularization of deep neural networks for inferring fitness functions. Nat Commun 2021;12:5225. [PMID: 34471113 PMCID: PMC8410946 DOI: 10.1038/s41467-021-25371-3] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 12/28/2020] [Accepted: 07/27/2021] [Indexed: 11/18/2022] Open

Morrison AJ, Wonderlick DR, Harms MJ. Ensemble epistasis: thermodynamic origins of nonadditivity between mutations. Genetics 2021;219:iyab105. [PMID: 34849909 PMCID: PMC8633102 DOI: 10.1093/genetics/iyab105] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 04/06/2021] [Accepted: 06/19/2021] [Indexed: 01/02/2023] Open

Correlational selection in the age of genomics. Nat Ecol Evol 2021;5:562-573. [PMID: 33859374 DOI: 10.1038/s41559-021-01413-3] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 11/01/2019] [Accepted: 02/11/2021] [Indexed: 02/01/2023]

Reddy G, Desai MM. Global epistasis emerges from a generic model of a complex trait. eLife 2021;10:64740. [PMID: 33779543 PMCID: PMC8057814 DOI: 10.7554/elife.64740] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Received: 11/09/2020] [Accepted: 03/26/2021] [Indexed: 11/20/2022] Open

Gualtieri CT. Genomic Variation, Evolvability, and the Paradox of Mental Illness. Front Psychiatry 2021;11:593233. [PMID: 33551865 PMCID: PMC7859268 DOI: 10.3389/fpsyt.2020.593233] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/10/2020] [Accepted: 11/27/2020] [Indexed: 12/30/2022] Open

Abstract

Twentieth-century genetics was hard put to explain the irregular behavior of neuropsychiatric disorders. Autism and schizophrenia defy a principle of natural selection; they are highly heritable but associated with low reproductive success. Nevertheless, they persist. The genetic origins of such conditions are confounded by the problem of variable expression, that is, when a given genetic aberration can lead to any one of several distinct disorders. Also, autism and schizophrenia occur on a spectrum of severity, from mild and subclinical cases to the overt and disabling. Such irregularities reflect the problem of missing heritability; although hundreds of genes may be associated with autism or schizophrenia, together they account for only a small proportion of cases. Techniques for higher resolution, genomewide analysis have begun to illuminate the irregular and unpredictable behavior of the human genome. Thus, the origins of neuropsychiatric disorders in particular and complex disease in general have been illuminated. The human genome is characterized by a high degree of structural and behavioral variability: DNA content variation, epistasis, stochasticity in gene expression, and epigenetic changes. These elements have grown more complex as evolution scaled the phylogenetic tree. They are especially pertinent to brain development and function. Genomic variability is a window on the origins of complex disease, neuropsychiatric disorders, and neurodevelopmental disorders in particular. Genomic variability, as it happens, is also the fuel of evolvability. The genomic events that presided over the evolution of the primate and hominid lineages are over-represented in patients with autism and schizophrenia, as well as intellectual disability and epilepsy. That the special qualities of the human genome that drove evolution might, in some way, contribute to neuropsychiatric disorders is a matter of no little interest.

Stouffer DB, Novak M. Hidden layers of density dependence in consumer feeding rates. Ecol Lett 2021;24:520-532. [PMID: 33404158 DOI: 10.1111/ele.13670] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 08/26/2020] [Revised: 11/26/2020] [Accepted: 12/07/2020] [Indexed: 01/16/2023]

Chen J, Wong KC. Analyzing High-Order Epistasis from Genotype-Phenotype Maps Using 'Epistasis' Package. Methods Mol Biol 2021;2212:265-275. [PMID: 33733361 DOI: 10.1007/978-1-0716-0947-7_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]

Rigato E, Fusco G. A heuristic model of the effects of phenotypic robustness in adaptive evolution. Theor Popul Biol 2020;136:22-30. [PMID: 33221334 DOI: 10.1016/j.tpb.2020.11.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2020] [Revised: 10/15/2020] [Accepted: 11/09/2020] [Indexed: 10/23/2022]

Sailer ZR, Shafik SH, Summers RL, Joule A, Patterson-Robert A, Martin RE, Harms MJ. Inferring a complete genotype-phenotype map from a small number of measured phenotypes. PLoS Comput Biol 2020;16:e1008243. [PMID: 32991585 PMCID: PMC7546491 DOI: 10.1371/journal.pcbi.1008243] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/05/2019] [Revised: 10/09/2020] [Accepted: 08/13/2020] [Indexed: 01/02/2023] Open

Abstract

Understanding evolution requires detailed knowledge of genotype-phenotype maps; however, it can be a herculean task to measure every phenotype in a combinatorial map. We have developed a computational strategy to predict the missing phenotypes from an incomplete, combinatorial genotype-phenotype map. As a test case, we used an incomplete genotype-phenotype dataset previously generated for the malaria parasite’s ‘chloroquine resistance transporter’ (PfCRT). Wild-type PfCRT (PfCRT^3D7) lacks significant chloroquine (CQ) transport activity, but the introduction of the eight mutations present in the ‘Dd2’ isoform of PfCRT (PfCRT^Dd2) enables the protein to transport CQ away from its site of antimalarial action. This gain of a transport function imparts CQ resistance to the parasite. A combinatorial map between PfCRT^3D7 and PfCRT^Dd2 consists of 256 genotypes, of which only 52 have had their CQ transport activities measured through expression in the Xenopus laevis oocyte. We trained a statistical model with these 52 measurements to infer the CQ transport activity for the remaining 204 combinatorial genotypes between PfCRT^3D7 and PfCRT^Dd2. Our best-performing model incorporated a binary classifier, a nonlinear scale, and additive effects for each mutation. The addition of specific pairwise- and high-order-epistatic coefficients decreased the predictive power of the model. We evaluated our predictions by experimentally measuring the CQ transport activities of 24 additional PfCRT genotypes. The R² value between our predicted and newly-measured phenotypes was 0.90. We then used the model to probe the accessibility of evolutionary trajectories through the map. Approximately 1% of the possible trajectories between PfCRT^3D7 and PfCRT^Dd2 are accessible; however, none of the trajectories entailed eight successive increases in CQ transport activity. These results demonstrate that phenotypes can be inferred with known uncertainty from a partial genotype-phenotype dataset. We also validated our approach against a collection of previously published genotype-phenotype maps. The model therefore appears general and should be applicable to a large number of genotype-phenotype maps.

Biological macromolecules are built from chains of building blocks. The function of a macromolecule depends on the specific chemical properties of the building blocks that make it up. Macromolecules evolve through mutations that swap one building block for another. Understanding how biomolecules work and evolve therefore requires knowledge of the effects of mutations. The effects of mutations can be measured experimentally; however, because there are a vast number of possible combinations of mutations, it is often difficult to make enough measurements to understand biomolecular function and evolution. In this paper, we describe a simple method to predict the effects of mutations on biomolecules from a small number of measurements. This method works by appropriately averaging the effects of mutations seen in different contexts. We test the method by predicting the effects of mutations on a PfCRT—a macromolecule from the malarial parasite that confers drug resistance. We find that our method is fast and effective. Using a small number of measurements, we were able to gain insight into the evolutionary steps by which this macromolecule conferred drug resistance. To make this method accessible to other researchers, we have released it as an open-source software package: https://gpseer.readthedocs.io.

Genotype networks of 80 quantitative Arabidopsis thaliana phenotypes reveal phenotypic evolvability despite pervasive epistasis. PLoS Comput Biol 2020;16:e1008082. [PMID: 32790763 PMCID: PMC7447023 DOI: 10.1371/journal.pcbi.1008082] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/19/2019] [Revised: 08/25/2020] [Accepted: 06/22/2020] [Indexed: 12/23/2022] Open

Ballal A, Laurendon C, Salmon M, Vardakou M, Cheema J, Defernez M, O'Maille PE, Morozov AV. Sparse Epistatic Patterns in the Evolution of Terpene Synthases. Mol Biol Evol 2020;37:1907-1924. [PMID: 32119077 DOI: 10.1093/molbev/msaa052] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 12/22/2022] Open

Levy JJ, O'Malley AJ. Don't dismiss logistic regression: the case for sensible extraction of interactions in the era of machine learning. BMC Med Res Methodol 2020;20:171. [PMID: 32600277 PMCID: PMC7325087 DOI: 10.1186/s12874-020-01046-3] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 12/17/2019] [Accepted: 06/10/2020] [Indexed: 01/08/2023] Open

Abstract

BACKGROUND

Machine learning approaches have become increasingly popular modeling techniques, relying on data-driven heuristics to arrive at its solutions. Recent comparisons between these algorithms and traditional statistical modeling techniques have largely ignored the superiority gained by the former approaches due to involvement of model-building search algorithms. This has led to alignment of statistical and machine learning approaches with different types of problems and the under-development of procedures that combine their attributes. In this context, we hoped to understand the domains of applicability for each approach and to identify areas where a marriage between the two approaches is warranted. We then sought to develop a hybrid statistical-machine learning procedure with the best attributes of each.

METHODS

We present three simple examples to illustrate when to use each modeling approach and posit a general framework for combining them into an enhanced logistic regression model building procedure that aids interpretation. We study 556 benchmark machine learning datasets to uncover when machine learning techniques outperformed rudimentary logistic regression models and so are potentially well-equipped to enhance them. We illustrate a software package, InteractionTransformer, which embeds logistic regression with advanced model building capacity by using machine learning algorithms to extract candidate interaction features from a random forest model for inclusion in the model. Finally, we apply our enhanced logistic regression analysis to two real-world biomedical examples, one where predictors vary linearly with the outcome and another with extensive second-order interactions.

RESULTS

Preliminary statistical analysis demonstrated that across 556 benchmark datasets, the random forest approach significantly outperformed the logistic regression approach. We found a statistically significant increase in predictive performance when using hybrid procedures and greater clarity in the association with the outcome of terms acquired compared to directly interpreting the random forest output.

CONCLUSIONS

When a random forest model is closer to the true model, hybrid statistical-machine learning procedures can substantially enhance the performance of statistical procedures in an automated manner while preserving easy interpretation of the results. Such hybrid methods may help facilitate widespread adoption of machine learning techniques in the biomedical setting.

Zhou J, McCandlish DM. Minimum epistasis interpolation for sequence-function relationships. Nat Commun 2020;11:1782. [PMID: 32286265 PMCID: PMC7156698 DOI: 10.1038/s41467-020-15512-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 08/02/2019] [Accepted: 03/12/2020] [Indexed: 12/17/2022] Open

Crona K, Luo M, Greene D. An uncertainty law for microbial evolution. J Theor Biol 2020;489:110155. [PMID: 31926205 DOI: 10.1016/j.jtbi.2020.110155] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/28/2019] [Revised: 01/05/2020] [Accepted: 01/07/2020] [Indexed: 11/28/2022]

Miton CM, Chen JZ, Ost K, Anderson DW, Tokuriki N. Statistical analysis of mutational epistasis to reveal intramolecular interaction networks in proteins. Methods Enzymol 2020;643:243-280. [DOI: 10.1016/bs.mie.2020.07.012] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/22/2023]

Sanchez-Gorostiaga A, Bajić D, Osborne ML, Poyatos JF, Sanchez A. High-order interactions distort the functional landscape of microbial consortia. PLoS Biol 2019;17:e3000550. [PMID: 31830028 PMCID: PMC6932822 DOI: 10.1371/journal.pbio.3000550] [Citation(s) in RCA: 81] [Impact Index Per Article: 16.2] [Received: 01/30/2019] [Revised: 12/26/2019] [Accepted: 11/15/2019] [Indexed: 12/11/2022] Open