1
Heames B, Buchel F, Aubel M, Tretyachenko V, Loginov D, Novák P, Lange A, Bornberg-Bauer E, Hlouchová K. Experimental characterization of de novo proteins and their unevolved random-sequence counterparts. Nat Ecol Evol 2023; 7:570-580. [PMID: 37024625 PMCID: PMC10089919 DOI: 10.1038/s41559-023-02010-2] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 01/14/2022] [Accepted: 02/10/2023] [Indexed: 04/08/2023]
Abstract
De novo gene emergence provides a route for new proteins to be formed from previously non-coding DNA. Proteins born in this way are considered random sequences and typically assumed to lack defined structure. While it remains unclear how likely a de novo protein is to assume a soluble and stable tertiary structure, intersecting evidence from random sequence and de novo-designed proteins suggests that native-like biophysical properties are abundant in sequence space. Taking putative de novo proteins identified in human and fly, we experimentally characterize a library of these sequences to assess their solubility and structure propensity. We compare this library to a set of synthetic random proteins with no evolutionary history. Bioinformatic prediction suggests that de novo proteins may have remarkably similar distributions of biophysical properties to unevolved random sequences of a given length and amino acid composition. However, upon expression in vitro, de novo proteins exhibit moderately higher solubility, which is further induced by the DnaK chaperone system. We suggest that while synthetic random sequences are a useful proxy for de novo proteins in terms of structure propensity, de novo proteins may be better integrated in the cellular system than random expectation, given their higher solubility.
Affiliation(s)
- Brennen Heames
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Filip Buchel
  - Department of Cell Biology, Charles University, BIOCEV, Prague, Czech Republic
  - Department of Biochemistry, Charles University, Prague, Czech Republic
- Margaux Aubel
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Dmitry Loginov
  - Institute of Microbiology, Czech Academy of Sciences, Prague, Czech Republic
- Petr Novák
  - Institute of Microbiology, Czech Academy of Sciences, Prague, Czech Republic
- Andreas Lange
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Erich Bornberg-Bauer
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
  - Department of Protein Evolution, MPI for Developmental Biology, Tübingen, Germany
- Klára Hlouchová
  - Department of Cell Biology, Charles University, BIOCEV, Prague, Czech Republic
  - Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, Prague, Czech Republic
2
Tosi L, Chaikban L, Larman BH, Rosenfield J, Parekkadan B. Massively parallel DNA target capture using long adapter single stranded oligonucleotide (LASSO) probes assembled through a novel DNA recombinase mediated methodology. Biotechnol J 2022; 17:e2100240. [PMID: 34775678 PMCID: PMC8825753 DOI: 10.1002/biot.202100240] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/06/2021] [Revised: 11/05/2021] [Accepted: 11/05/2021] [Indexed: 02/03/2023]
Abstract
In an attempt to bridge the widening gap from DNA sequence to biological function, we developed a novel methodology to assemble Long-Adapter Single-Strand Oligonucleotide (LASSO) probe libraries that enabled the massively multiplexed capture of kilobase-sized DNA fragments for downstream long-read DNA sequencing or expression. This method uses short DNA oligonucleotides (pre-LASSO probes) and a plasmid vector that supplies the linker sequence for the mature LASSO probe through Cre-LoxP intramolecular recombination. This strategy generates high-quality LASSO probe libraries (≈46% of correct probes). We performed NGS analysis of the post-capture PCR amplification of DNA circles obtained from the LASSO capture of 3087 Escherichia coli ORFs ranging from 400 to 5000 bp. The median enrichment of all targeted ORFs versus untargeted ORFs was 30-fold. For ORFs up to 1 kb in size, targeted ORFs were enriched up to a median of 260-fold. Here, we show that LASSO probes obtained in this manner were able to capture full-length open reading frames from total human cDNA. Furthermore, we show that the LASSO capture specificity and sensitivity are sufficient for target capture from total human genomic DNA template. This technology can be used for the preparation of long-read sequencing libraries and for massively multiplexed cloning of human sequences.
Affiliation(s)
- Lorenzo Tosi
  - Department of Biomedical Engineering, Rutgers University, Piscataway, New Jersey 08854, USA
- Lamia Chaikban
  - Department of Biomedical Engineering, Rutgers University, Piscataway, New Jersey 08854, USA
- Benjamin H. Larman
  - Institute of Cell Engineering, Division of Immunology, Department of Pathology, Johns Hopkins University, Baltimore, MD, USA
- Jeffrey Rosenfield
  - Cancer Institute of New Jersey, New Brunswick, New Jersey 08854, USA
  - Department of Pathology, Robert Wood Johnson Medical School, New Brunswick, NJ 08903, USA
- Biju Parekkadan
  - Department of Biomedical Engineering, Rutgers University, Piscataway, New Jersey 08854, USA
  - Cancer Institute of New Jersey, New Brunswick, New Jersey 08854, USA
  - Correspondence and requests for materials should be addressed to B.P. (599 Taylor Road, Piscataway, NJ 08854)
3
Sidore AM, Plesa C, Samson JA, Lubock NB, Kosuri S. DropSynth 2.0: high-fidelity multiplexed gene synthesis in emulsions. Nucleic Acids Res 2020; 48:e95. [PMID: 32692349 PMCID: PMC7498354 DOI: 10.1093/nar/gkaa600] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 11/18/2019] [Revised: 06/13/2020] [Accepted: 07/11/2020] [Indexed: 01/12/2023]
Abstract
Multiplexed assays allow functional testing of large synthetic libraries of genetic elements, but are limited by the designability, length, fidelity and scale of the input DNA. Here, we improve DropSynth, a low-cost, multiplexed method that builds gene libraries by compartmentalizing and assembling microarray-derived oligonucleotides in vortexed emulsions. By optimizing enzyme choice, adding enzymatic error correction and increasing scale, we show that DropSynth can build thousands of gene-length fragments at >20% fidelity.
Affiliation(s)
- Angus M Sidore
  - Department of Chemical and Biomolecular Engineering, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Calin Plesa
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Joyce A Samson
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Nathan B Lubock
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Sriram Kosuri
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA
  - UCLA-DOE Institute for Genomics and Proteomics, Molecular Biology Institute, Quantitative and Computational Biology Institute, Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, Jonsson Comprehensive Cancer Center, University of California, Los Angeles, Los Angeles, CA 90095, USA
4
Plesa C, Sidore AM, Lubock NB, Zhang D, Kosuri S. Multiplexed gene synthesis in emulsions for exploring protein functional landscapes. Science 2018; 359:343-347. [PMID: 29301959 PMCID: PMC6261299 DOI: 10.1126/science.aao5167] [Citation(s) in RCA: 67] [Impact Index Per Article: 11.2] [Received: 07/28/2017] [Accepted: 12/18/2017] [Indexed: 12/14/2022]
Abstract
Improving our ability to construct and functionally characterize DNA sequences would broadly accelerate progress in biology. Here, we introduce DropSynth, a scalable, low-cost method to build thousands of defined gene-length constructs in a pooled (multiplexed) manner. DropSynth uses a library of barcoded beads that pull down the oligonucleotides necessary for a gene's assembly, which are then processed and assembled in water-in-oil emulsions. We used DropSynth to successfully build more than 7000 synthetic genes that encode phylogenetically diverse homologs of two essential genes in Escherichia coli. We tested the ability of phosphopantetheine adenylyltransferase homologs to complement a knockout E. coli strain in multiplex, revealing core functional motifs and reasons underlying homolog incompatibility. DropSynth coupled with multiplexed functional assays allows us to rationally explore sequence-function relationships at an unprecedented scale.
Affiliation(s)
- Calin Plesa
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, California, USA
- Angus M. Sidore
  - Department of Chemical and Biomolecular Engineering, University of California, Los Angeles, Los Angeles, California, USA
- Nathan B. Lubock
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, California, USA
- Di Zhang
  - Genomics and Computational Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Sriram Kosuri
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, California, USA
  - UCLA-DOE Institute for Genomics and Proteomics, Molecular Biology Institute, Quantitative and Computational Biology Institute, Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, Jonsson Comprehensive Cancer Center, University of California, Los Angeles, Los Angeles, California, USA
5
Hsiau THC, Sukovich D, Elms P, Prince RN, Strittmatter T, Ruan P, Curry B, Anderson P, Sampson J, Anderson JC. Correction: A method for multiplex gene synthesis employing error correction based on expression. PLoS One 2015; 10:e0126078. [PMID: 25945930 PMCID: PMC4422662 DOI: 10.1371/journal.pone.0126078] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 12/03/2022]