Horlacher M, Wagner N, Moyon L, Kuret K, Goedert N, Salvatore M, Ule J, Gagneur J, Winther O, Marsico A. Towards in silico CLIP-seq: predicting protein-RNA interaction via sequence-to-signal learning. Genome Biol 2023; 24:180. [PMID: 37542318 PMCID: PMC10403857 DOI: 10.1186/s13059-023-03015-7] [Received: 09/23/2022] [Accepted: 07/17/2023] [Indexed: 08/06/2023]
Abstract
We present RBPNet, a novel deep learning method, which predicts CLIP-seq crosslink count distribution from RNA sequence at single-nucleotide resolution. By training on up to a million regions, RBPNet achieves high generalization on eCLIP, iCLIP and miCLIP assays, outperforming state-of-the-art classifiers. RBPNet performs bias correction by modeling the raw signal as a mixture of the protein-specific and background signal. Through model interrogation via Integrated Gradients, RBPNet identifies predictive sub-sequences that correspond to known and novel binding motifs and enables variant-impact scoring via in silico mutagenesis. Together, RBPNet improves imputation of protein-RNA interactions, as well as mechanistic interpretation of predictions.
Affiliation(s)
- Marc Horlacher
  - Computational Health Center, Helmholtz Center Munich, Munich, Germany
  - Department of Biology, University of Copenhagen, Copenhagen, Denmark
  - Department of Informatics, Technical University of Munich, Garching, Germany
  - Helmholtz Association - Munich School for Data Science (MUDS), Munich, Germany
- Nils Wagner
  - Department of Informatics, Technical University of Munich, Garching, Germany
  - Helmholtz Association - Munich School for Data Science (MUDS), Munich, Germany
- Lambert Moyon
  - Computational Health Center, Helmholtz Center Munich, Munich, Germany
- Klara Kuret
  - National Institute of Chemistry, Ljubljana, Slovenia
  - The Francis Crick Institute, London, UK
  - Jozef Stefan International Postgraduate School, Jamova cesta 39, 1000, Ljubljana, Slovenia
- Nicolas Goedert
  - Computational Health Center, Helmholtz Center Munich, Munich, Germany
- Marco Salvatore
  - Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Jernej Ule
  - National Institute of Chemistry, Ljubljana, Slovenia
  - The Francis Crick Institute, London, UK
- Julien Gagneur
  - Computational Health Center, Helmholtz Center Munich, Munich, Germany
  - Department of Informatics, Technical University of Munich, Garching, Germany
  - Helmholtz Association - Munich School for Data Science (MUDS), Munich, Germany
- Ole Winther
  - Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Annalisa Marsico
  - Computational Health Center, Helmholtz Center Munich, Munich, Germany
  - Helmholtz Association - Munich School for Data Science (MUDS), Munich, Germany