1
Kim H, Kumar A, Lövkvist C, Palma AM, Martin P, Kim J, Bhoopathi P, Trevino J, Fisher P, Madan E, Gogna R, Won KJ. CellNeighborEX: deciphering neighbor-dependent gene expression from spatial transcriptomics data. Mol Syst Biol 2023; 19:e11670. [PMID: 37815040 PMCID: PMC10632736 DOI: 10.15252/msb.202311670] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2023] [Revised: 09/14/2023] [Accepted: 09/18/2023] [Indexed: 10/11/2023]
Abstract
Cells have evolved their communication methods to sense their microenvironments and send biological signals. In addition to communication using ligands and receptors, cells use diverse channels including gap junctions to communicate with their immediate neighbors. Current approaches, however, cannot effectively capture the influence of various microenvironments. Here, we propose a novel approach to investigate cell neighbor-dependent gene expression (CellNeighborEX) in spatial transcriptomics (ST) data. To categorize cells based on their microenvironment, CellNeighborEX uses direct cell location or the mixture of transcriptome from multiple cells depending on ST technologies. For each cell type, CellNeighborEX identifies diverse gene sets associated with partnering cell types, providing further insight. We found that cells express different genes depending on their neighboring cell types in various tissues including mouse embryos, brain, and liver cancer. Those genes are associated with critical biological processes such as development or metastases. We further validated that gene expression is induced by neighboring partners via spatial visualization. The neighbor-dependent gene expression suggests new potential genes involved in cell-cell interactions beyond what ligand-receptor co-expression can discover.
Affiliation(s)
- Hyobin Kim
  - Department of Computational Biomedicine, Cedars‐Sinai Medical Center, Hollywood, CA, USA
  - Biotech Research and Innovation Centre (BRIC), University of Copenhagen, Copenhagen, Denmark
- Amit Kumar
  - Massey Cancer Center, Virginia Commonwealth University, Richmond, VA, USA
  - School of Medicine, Institute of Molecular Medicine, Virginia Commonwealth University, Richmond, VA, USA
  - Department of Human and Molecular Genetics, School of Medicine, Virginia Commonwealth University, Richmond, VA, USA
- Cecilia Lövkvist
  - Novo Nordisk Foundation Center for Stem Cell Medicine, reNEW, University of Copenhagen, Copenhagen, Denmark
- António M Palma
  - Massey Cancer Center, Virginia Commonwealth University, Richmond, VA, USA
  - School of Medicine, Institute of Molecular Medicine, Virginia Commonwealth University, Richmond, VA, USA
  - Instituto Superior Tecnico, Universidade de Lisboa, Lisboa, Portugal
- Patrick Martin
  - Department of Computational Biomedicine, Cedars‐Sinai Medical Center, Hollywood, CA, USA
  - Biotech Research and Innovation Centre (BRIC), University of Copenhagen, Copenhagen, Denmark
- Junil Kim
  - School of Systems Biomedical Science, Soongsil University, Seoul, Korea
- Praveen Bhoopathi
  - Massey Cancer Center, Virginia Commonwealth University, Richmond, VA, USA
  - School of Medicine, Institute of Molecular Medicine, Virginia Commonwealth University, Richmond, VA, USA
  - Department of Human and Molecular Genetics, School of Medicine, Virginia Commonwealth University, Richmond, VA, USA
- Jose Trevino
  - Massey Cancer Center, Virginia Commonwealth University, Richmond, VA, USA
  - Department of Surgery, School of Medicine, Virginia Commonwealth University, Richmond, VA, USA
- Paul Fisher
  - Massey Cancer Center, Virginia Commonwealth University, Richmond, VA, USA
  - School of Medicine, Institute of Molecular Medicine, Virginia Commonwealth University, Richmond, VA, USA
  - Department of Human and Molecular Genetics, School of Medicine, Virginia Commonwealth University, Richmond, VA, USA
- Esha Madan
  - Massey Cancer Center, Virginia Commonwealth University, Richmond, VA, USA
  - School of Medicine, Institute of Molecular Medicine, Virginia Commonwealth University, Richmond, VA, USA
  - Department of Human and Molecular Genetics, School of Medicine, Virginia Commonwealth University, Richmond, VA, USA
  - Department of Surgery, School of Medicine, Virginia Commonwealth University, Richmond, VA, USA
- Rajan Gogna
  - Massey Cancer Center, Virginia Commonwealth University, Richmond, VA, USA
  - School of Medicine, Institute of Molecular Medicine, Virginia Commonwealth University, Richmond, VA, USA
  - Department of Human and Molecular Genetics, School of Medicine, Virginia Commonwealth University, Richmond, VA, USA
  - Department of Surgery, School of Medicine, Virginia Commonwealth University, Richmond, VA, USA
- Kyoung Jae Won
  - Department of Computational Biomedicine, Cedars‐Sinai Medical Center, Hollywood, CA, USA
  - Biotech Research and Innovation Centre (BRIC), University of Copenhagen, Copenhagen, Denmark
2
Martin PCN, Kim H, Lövkvist C, Hong BW, Won KJ. Vesalius: high-resolution in silico anatomization of spatial transcriptomic data using image analysis. Mol Syst Biol 2022; 18:e11080. [PMID: 36065846 PMCID: PMC9446088 DOI: 10.15252/msb.202211080] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/18/2022] [Revised: 08/18/2022] [Accepted: 08/19/2022] [Indexed: 11/25/2022]
Abstract
Characterization of tissue architecture promises to deliver insights into development, cell communication, and disease. In silico spatial domain retrieval methods have been developed for spatial transcriptomics (ST) data assuming transcriptional similarity of neighboring barcodes. However, domain retrieval approaches with this assumption cannot work in complex tissues composed of multiple cell types. This task becomes especially challenging in cellular resolution ST methods. We developed Vesalius to decipher tissue anatomy from ST data by applying image processing technology. Vesalius uniquely detected territories composed of multiple cell types and successfully recovered tissue structures in high‐resolution ST data including in mouse brain, embryo, liver, and colon. Utilizing this tissue architecture, Vesalius identified tissue morphology‐specific gene expression and regional specific gene expression changes for astrocytes, interneuron, oligodendrocytes, and entorhinal cells in the mouse brain.
Affiliation(s)
- Patrick C N Martin
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Hollywood, CA, USA
  - Biotech Research and Innovation Centre (BRIC), University of Copenhagen, Copenhagen, Denmark
- Hyobin Kim
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Hollywood, CA, USA
  - Biotech Research and Innovation Centre (BRIC), University of Copenhagen, Copenhagen, Denmark
- Cecilia Lövkvist
  - Biotech Research and Innovation Centre (BRIC), University of Copenhagen, Copenhagen, Denmark
- Byung-Woo Hong
  - Computer Science Department, Chung-Ang University, Seoul, Korea
- Kyoung Jae Won
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Hollywood, CA, USA
  - Biotech Research and Innovation Centre (BRIC), University of Copenhagen, Copenhagen, Denmark
3
Holoch D, Wassef M, Lövkvist C, Zielinski D, Aflaki S, Lombard B, Héry T, Loew D, Howard M, Margueron R. A cis-acting mechanism mediates transcriptional memory at Polycomb target genes in mammals. Nat Genet 2021; 53:1686-1697. [PMID: 34782763 DOI: 10.1038/s41588-021-00964-2] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 07/21/2020] [Accepted: 10/05/2021] [Indexed: 11/09/2022]
Abstract
Epigenetic inheritance of gene expression states enables a single genome to maintain distinct cellular identities. How histone modifications contribute to this process remains unclear. Using global chromatin perturbations and local, time-controlled modulation of transcription, we establish the existence of epigenetic memory of transcriptional activation for genes that can be silenced by the Polycomb group. This property emerges during cell differentiation and allows genes to be stably switched after a transient transcriptional stimulus. This transcriptional memory state at Polycomb targets operates in cis; however, rather than relying solely on read-and-write propagation of histone modifications, the memory is also linked to the strength of activating inputs opposing Polycomb proteins, and therefore varies with the cellular context. Our data and computational simulations suggest a model whereby transcriptional memory arises from double-negative feedback between Polycomb-mediated silencing and active transcription. Transcriptional memory at Polycomb targets thus depends not only on histone modifications but also on the gene-regulatory network and underlying identity of a cell.
Affiliation(s)
- Daniel Holoch
  - Institut Curie, Paris Sciences et Lettres Research University, Sorbonne University, Paris, France
  - INSERM U934/CNRS UMR 3215, Paris, France
- Michel Wassef
  - Institut Curie, Paris Sciences et Lettres Research University, Sorbonne University, Paris, France
  - INSERM U934/CNRS UMR 3215, Paris, France
- Cecilia Lövkvist
  - John Innes Centre, Norwich Research Park, Norwich, UK
  - Biotech Research and Innovation Centre, University of Copenhagen, Copenhagen, Denmark
- Dina Zielinski
  - Institut Curie, Paris Sciences et Lettres Research University, Sorbonne University, Paris, France
  - INSERM U934/CNRS UMR 3215, Paris, France
  - INSERM U900, Mines ParisTech, Paris, France
- Setareh Aflaki
  - Institut Curie, Paris Sciences et Lettres Research University, Sorbonne University, Paris, France
  - INSERM U934/CNRS UMR 3215, Paris, France
- Bérangère Lombard
  - Institut Curie, Paris Sciences et Lettres Research University, Sorbonne University, Paris, France
  - Proteomics Mass Spectrometry Laboratory, Paris, France
- Tiphaine Héry
  - Institut Curie, Paris Sciences et Lettres Research University, Sorbonne University, Paris, France
  - INSERM U934/CNRS UMR 3215, Paris, France
- Damarys Loew
  - Institut Curie, Paris Sciences et Lettres Research University, Sorbonne University, Paris, France
  - Proteomics Mass Spectrometry Laboratory, Paris, France
- Martin Howard
  - John Innes Centre, Norwich Research Park, Norwich, UK
- Raphaël Margueron
  - Institut Curie, Paris Sciences et Lettres Research University, Sorbonne University, Paris, France
  - INSERM U934/CNRS UMR 3215, Paris, France
4
Lövkvist C, Mikulski P, Reeck S, Hartley M, Dean C, Howard M. Hybrid protein assembly-histone modification mechanism for PRC2-based epigenetic switching and memory. eLife 2021; 10:66454. [PMID: 34473050 PMCID: PMC8412945 DOI: 10.7554/elife.66454] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 03/05/2021] [Accepted: 08/03/2021] [Indexed: 12/31/2022]
Abstract
The histone modification H3K27me3 plays a central role in Polycomb-mediated epigenetic silencing. H3K27me3 recruits and allosterically activates Polycomb Repressive Complex 2 (PRC2), which adds this modification to nearby histones, providing a read/write mechanism for inheritance through DNA replication. However, for some PRC2 targets, a purely histone-based system for epigenetic inheritance may be insufficient. We address this issue at the Polycomb target FLOWERING LOCUS C (FLC) in Arabidopsis thaliana, as a narrow nucleation region of only ~three nucleosomes within FLC mediates epigenetic state switching and subsequent memory over many cell cycles. To explain the memory's unexpected persistence, we introduce a mathematical model incorporating extra protein memory storage elements with positive feedback that persist at the locus through DNA replication, in addition to histone modifications. Our hybrid model explains many features of epigenetic switching/memory at FLC and encapsulates generic mechanisms that may be widely applicable.
Affiliation(s)
- Cecilia Lövkvist
  - Computational and Systems Biology, John Innes Centre, Norwich Research Park, United Kingdom
- Pawel Mikulski
  - Cell and Developmental Biology, John Innes Centre, Norwich Research Park, United Kingdom
- Svenja Reeck
  - Computational and Systems Biology, John Innes Centre, Norwich Research Park, United Kingdom
  - Cell and Developmental Biology, John Innes Centre, Norwich Research Park, United Kingdom
- Matthew Hartley
  - Computational and Systems Biology, John Innes Centre, Norwich Research Park, United Kingdom
- Caroline Dean
  - Cell and Developmental Biology, John Innes Centre, Norwich Research Park, United Kingdom
- Martin Howard
  - Computational and Systems Biology, John Innes Centre, Norwich Research Park, United Kingdom
5
Abstract
Near promoters, both nucleosomes and CpG sites form characteristic spatial patterns. Previously, nucleosome depleted regions were observed upstream of transcription start sites and nucleosome occupancy was reported to correlate both with CpG density and the level of CpG methylation. Several studies imply a causal link where CpG methylation might induce nucleosome formation, whereas others argue the opposite, i.e., that nucleosome occupancy might influence CpG methylation. Correlations are indeed evident between nucleosomes, CpG density and CpG methylation—at least near promoter sites. It is however less established whether there is an immediate causal relation between nucleosome occupancy and the presence of CpG sites—or if nucleosome occupancy could be influenced by other factors. In this work, we test for such causality in human genomes by analyzing the three quantities both near and away from promoter sites. For data from the human genome we compare promoter regions with given CpG densities with genomic regions without promoters but of similar CpG densities. We find the observed correlation between nucleosome occupancy and CpG density, respectively CpG methylation, to be specific to promoter regions. In other regions along the genome nucleosome occupancy is statistically independent of the positioning of CpGs or their methylation levels. Anti-correlation between CpG density and methylation level is however similarly strong in both regions. On promoters, nucleosome occupancy is more strongly affected by the level of gene expression than CpG density or CpG methylation—calling into question any direct causal relation between nucleosome occupancy and CpG organization. Rather, our results suggest that for organisms with cytosine methylation nucleosome occupancy might be primarily linked to gene expression, with no strong impact on methylation.
Affiliation(s)
- Cecilia Lövkvist
  - Center for Models of Life, Niels Bohr Institute, University of Copenhagen, Copenhagen, Denmark
- Kim Sneppen
  - Center for Models of Life, Niels Bohr Institute, University of Copenhagen, Copenhagen, Denmark
- Jan O Haerter
  - Center for Models of Life, Niels Bohr Institute, University of Copenhagen, Copenhagen, Denmark
6
Sormani G, Haerter JO, Lövkvist C, Sneppen K. Stabilization of epigenetic states of CpG islands by local cooperation. Mol Biosyst 2017; 12:2142-6. [PMID: 26923344 DOI: 10.1039/c6mb00044d] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 12/16/2022]
Abstract
DNA methylation of CpG sites is an important epigenetic mark in mammals. Active promoters are often associated with unmethylated CpG sites, whereas methylated CpG sites correlate with silenced promoters. Methylation of CpG sites must be generally described as a dynamical process that is mediated by methylation enzymes, such as DNMT1 and DNMT3a/b. However, there are several models of how CpG sites can be protected from methylation and thereby remain unmethylated. In this paper we examine the combination of both: the positive feedbacks of DNA methylation and a short range counterpart which in turn protects, and thereby maintains, the unmethylated state. The emergent dynamics is provided by collaborative, reinforcing feedbacks in favor of methylated CpG islands and cooperative protection of one CpG site by another in favor of unmethylated CpG sites. Our results suggest that this synthesis of mechanisms provides equally robust maintenance of both the unmethylated and methylated states of CpG islands.
Affiliation(s)
- Giulia Sormani
  - Center for Models of Life, Niels Bohr Institute, University of Copenhagen, 2100 Copenhagen, Denmark
- Jan O Haerter
  - Center for Models of Life, Niels Bohr Institute, University of Copenhagen, 2100 Copenhagen, Denmark
- Cecilia Lövkvist
  - Center for Models of Life, Niels Bohr Institute, University of Copenhagen, 2100 Copenhagen, Denmark
- Kim Sneppen
  - Center for Models of Life, Niels Bohr Institute, University of Copenhagen, 2100 Copenhagen, Denmark
7
Abstract
A few central transcription factors inside mouse embryonic stem (ES) cells and induced pluripotent stem (iPS) cells are believed to control the cells’ pluripotency. Characterizations of the pluripotent state were put forward on both transcription factor and epigenetic levels. Whereas core players have been identified, it is desirable to map out gene regulatory networks which govern the reprogramming of somatic cells as well as the early developmental decisions. Here we propose a multiple level model where the regulatory network of Oct4, Nanog and Tet1 includes positive feedback loops involving DNA-demethylation around the promoters of Oct4 and Tet1. We put forward a mechanistic understanding of the regulatory dynamics which accounts for the following observations: i) Oct4 overexpression is sufficient to induce pluripotency in somatic cell types expressing the other Yamanaka reprogramming factors endogenously; ii) Tet1 can replace Oct4 in the reprogramming cocktail; iii) Nanog is not necessary for reprogramming; however, its over-expression leads to enhanced self-renewal; iv) DNA methylation is the key to the regulation of pluripotency genes; v) Lif withdrawal leads to loss of pluripotency. Overall, our paper proposes a novel framework combining transcription regulation with DNA methylation modifications which takes into account the multi-layer nature of regulatory mechanisms governing pluripotency acquisition through reprogramming.
Affiliation(s)
- Victor Olariu
  - Centre for Models of Life, Niels Bohr Institute, University of Copenhagen, Copenhagen, Denmark
  - Computational Biology and Biological Physics, Department of Astronomy and Theoretical Physics, Lund University, Lund, Sweden
- Cecilia Lövkvist
  - Centre for Models of Life, Niels Bohr Institute, University of Copenhagen, Copenhagen, Denmark
- Kim Sneppen
  - Centre for Models of Life, Niels Bohr Institute, University of Copenhagen, Copenhagen, Denmark
8
Lövkvist C, Dodd IB, Sneppen K, Haerter JO. DNA methylation in human epigenomes depends on local topology of CpG sites. Nucleic Acids Res 2016; 44:5123-32. [PMID: 26932361 PMCID: PMC4914085 DOI: 10.1093/nar/gkw124] [Citation(s) in RCA: 104] [Impact Index Per Article: 13.0] [Received: 08/19/2015] [Accepted: 02/20/2016] [Indexed: 01/01/2023]
Abstract
In vertebrates, methylation of cytosine at CpG sequences is implicated in stable and heritable patterns of gene expression. The classical model for inheritance, in which individual CpG sites are independent, provides no explanation for the observed non-random patterns of methylation. We first investigate the exact topology of CpG clustering in the human genome associated to CpG islands. Then, by pooling genomic CpG clusters on the basis of short distances between CpGs within and long distances outside clusters, we show a strong dependence of methylation on the number and density of CpG organization. CpG clusters with fewer, or less densely spaced, CpGs are predominantly hyper-methylated, while larger clusters are predominantly hypo-methylated. Intermediate clusters, however, are either hyper- or hypo-methylated but are rarely found in intermediate methylation states. We develop a model for spatially-dependent collaboration between CpGs, where methylated CpGs recruit methylation enzymes that can act on CpGs over an extended local region, while unmethylated CpGs recruit demethylation enzymes that act more strongly on nearby CpGs. This model can reproduce the effects of CpG clustering on methylation and produces stable and heritable alternative methylation states of CpG clusters, thus providing a coherent model for methylation inheritance and methylation patterning.
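The pooling step described in this abstract can be pictured with a minimal sketch, assuming a single hypothetical gap threshold `max_gap` (the abstract does not state the distance criteria actually used). The function names and the toy positions are illustrative only and are not taken from the paper.

```python
from typing import List


def cluster_cpgs(positions: List[int], max_gap: int) -> List[List[int]]:
    """Group CpG positions so that consecutive CpGs within a cluster
    are at most max_gap base pairs apart (hypothetical criterion)."""
    clusters: List[List[int]] = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            # Close enough to the previous CpG: extend the current cluster.
            clusters[-1].append(pos)
        else:
            # Gap too large (or first CpG): start a new cluster.
            clusters.append([pos])
    return clusters


def cluster_density(cluster: List[int]) -> float:
    """CpGs per base pair over the span of one cluster.
    A lone CpG is treated as spanning a single base pair."""
    span = cluster[-1] - cluster[0] + 1
    return len(cluster) / span


if __name__ == "__main__":
    toy_positions = [100, 110, 125, 900, 905, 3000]  # made-up coordinates
    for cluster in cluster_cpgs(toy_positions, max_gap=50):
        print(len(cluster), round(cluster_density(cluster), 3))
```

Cluster size and density computed this way are the two quantities the abstract relates to methylation level; how they are binned and compared in the paper is not specified here.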
Affiliation(s)
- Cecilia Lövkvist
  - Center for Models of Life, Niels Bohr Institute, University of Copenhagen, Blegdamsvej 17, DK-2100, Copenhagen, Denmark
- Ian B Dodd
  - Department of Molecular and Cellular Biology, University of Adelaide, SA 5005, Australia
- Kim Sneppen
  - Center for Models of Life, Niels Bohr Institute, University of Copenhagen, Blegdamsvej 17, DK-2100, Copenhagen, Denmark
- Jan O Haerter
  - Center for Models of Life, Niels Bohr Institute, University of Copenhagen, Blegdamsvej 17, DK-2100, Copenhagen, Denmark
9
Haerter JO, Lövkvist C, Dodd IB, Sneppen K. Collaboration between CpG sites is needed for stable somatic inheritance of DNA methylation states. Nucleic Acids Res 2013; 42:2235-44. [PMID: 24288373 PMCID: PMC3936770 DOI: 10.1093/nar/gkt1235] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.6] [Indexed: 11/22/2022]
Abstract
Inheritance of 5-methyl cytosine modification of CpG (CG/CG) DNA sequences is needed to maintain early developmental decisions in vertebrates. The standard inheritance model treats CpGs as independent, with methylated CpGs maintained by efficient methylation of hemimethylated CpGs produced after DNA replication, and unmethylated CpGs maintained by an absence of de novo methylation. By stochastic simulations of CpG islands over multiple cell cycles and systematic sampling of reaction parameters, we show that the standard model is inconsistent with many experimental observations. In contrast, dynamic collaboration between CpGs can provide strong error-tolerant somatic inheritance of both hypermethylated and hypomethylated states of a cluster of CpGs, reproducing observed stable bimodal methylation patterns. Known recruitment of methylating enzymes by methylated CpGs could provide the necessary collaboration, but we predict that recruitment of demethylating enzymes by unmethylated CpGs strengthens inheritance and allows CpG islands to remain hypomethylated within a sea of hypermethylation.
Affiliation(s)
- Jan O Haerter
  - Center for Models of Life, Niels Bohr Institute, University of Copenhagen, Blegdamsvej 17, DK-2100 Copenhagen, Denmark and Department of Molecular and Biomedical Sciences (Biochemistry), University of Adelaide, SA 5005, Australia
10
Ekeberg M, Lövkvist C, Lan Y, Weigt M, Aurell E. Improved contact prediction in proteins: using pseudolikelihoods to infer Potts models. Phys Rev E Stat Nonlin Soft Matter Phys 2013; 87:012707. [PMID: 23410359 DOI: 10.1103/physreve.87.012707] [Citation(s) in RCA: 378] [Impact Index Per Article: 34.4] [Received: 10/23/2012] [Indexed: 05/24/2023]
Abstract
Spatially proximate amino acids in a protein tend to coevolve. A protein's three-dimensional (3D) structure hence leaves an echo of correlations in the evolutionary record. Reverse engineering 3D structures from such correlations is an open problem in structural biology, pursued with increasing vigor as more and more protein sequences continue to fill the data banks. Within this task lies a statistical inference problem, rooted in the following: correlation between two sites in a protein sequence can arise from firsthand interaction but can also be network-propagated via intermediate sites; observed correlation is not enough to guarantee proximity. To separate direct from indirect interactions is an instance of the general problem of inverse statistical mechanics, where the task is to learn model parameters (fields, couplings) from observables (magnetizations, correlations, samples) in large systems. In the context of protein sequences, the approach has been referred to as direct-coupling analysis. Here we show that the pseudolikelihood method, applied to 21-state Potts models describing the statistical properties of families of evolutionarily related proteins, significantly outperforms existing approaches to the direct-coupling analysis, the latter being based on standard mean-field techniques. This improved performance also relies on a modified score for the coupling strength. The results are verified using known crystal structures of specific sequence instances of various protein families. Code implementing the new method can be found at http://plmdca.csc.kth.se/.
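The quantity at the heart of a pseudolikelihood approach can be illustrated with a minimal sketch. It evaluates the log conditional probability of one site given all others under a Potts model with fields and couplings, then sums over sites and sequences. This is a generic textbook form, not the authors' implementation: the nested-list data layout and function names are assumptions, and regularization, sequence reweighting, and the modified coupling score mentioned in the abstract are omitted.

```python
import math
from typing import List, Sequence

Fields = List[List[float]]                   # fields[i][a] = h_i(a)
Couplings = List[List[List[List[float]]]]    # couplings[i][j][a][b] = J_ij(a, b)


def site_log_conditional(seq: Sequence[int], i: int,
                         fields: Fields, couplings: Couplings) -> float:
    """log P(s_i = seq[i] | all other sites) under a Potts model."""
    q = len(fields[i])
    energies = []
    for a in range(q):
        # Local field plus couplings to the observed states of all other sites.
        e = fields[i][a]
        for j, b in enumerate(seq):
            if j != i:
                e += couplings[i][j][a][b]
        energies.append(e)
    # Log-sum-exp normalisation over the q states of site i.
    m = max(energies)
    log_z = m + math.log(sum(math.exp(e - m) for e in energies))
    return energies[seq[i]] - log_z


def log_pseudolikelihood(seqs: Sequence[Sequence[int]],
                         fields: Fields, couplings: Couplings) -> float:
    """Sum of site-wise log conditionals over all sites and sequences."""
    return sum(site_log_conditional(seq, i, fields, couplings)
               for seq in seqs for i in range(len(seq)))
```

Maximizing this objective over the fields and couplings avoids computing the full partition function, since each conditional only needs a normalization over the states of a single site (21 in the setting described above).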
Affiliation(s)
- Magnus Ekeberg
  - Engineering Physics Program, KTH Royal Institute of Technology, 100 44 Stockholm, Sweden