1
Biondi E, Benner SA. Artificially Expanded Genetic Information Systems for New Aptamer Technologies. Biomedicines 2018; 6:E53. [PMID: 29747381 PMCID: PMC6027400 DOI: 10.3390/biomedicines6020053] [Citation(s) in RCA: 44] [Impact Index Per Article: 7.3] [Received: 04/10/2018] [Revised: 05/04/2018] [Accepted: 05/06/2018] [Indexed: 01/04/2023] [Open Access]
Abstract
Directed evolution was first applied to diverse libraries of DNA and RNA molecules a quarter century ago in the hope of gaining technology that would allow the creation of receptors, ligands, and catalysts on demand. Despite isolated successes, the outputs of this technology have been somewhat disappointing, perhaps because the four building blocks of standard DNA and RNA have too little functionality to have versatile binding properties, and offer too little information density to fold unambiguously. This review covers the recent literature that seeks to create an improved platform to support laboratory Darwinism, one based on an artificially expanded genetic information system (AEGIS) that adds independently replicating nucleotide “letters” to the evolving “alphabet”.
Affiliation(s)
- Elisa Biondi
- Foundation for Applied Molecular Evolution, Alachua, FL 32615, USA.
- Firebird Biomolecular Sciences, LLC, Alachua, FL 32615, USA.
- Steven A Benner
- Foundation for Applied Molecular Evolution, Alachua, FL 32615, USA.
- Firebird Biomolecular Sciences, LLC, Alachua, FL 32615, USA.
2
Benner SA, Sassi SO, Gaucher EA. Molecular paleoscience: systems biology from the past. Adv Enzymol Relat Areas Mol Biol 2007; 75:1-132, xi. [PMID: 17124866 DOI: 10.1002/9780471224464.ch1] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.5] [Indexed: 04/22/2023]
Abstract
Experimental paleomolecular biology, paleobiochemistry, and paleogenetics are closely related emerging fields that infer the sequences of ancient genes and proteins from now-extinct organisms, and then resurrect them for study in the laboratory. The goal of paleogenetics is to use information from natural history to solve the conundrum of modern genomics: How can we understand deeply the function of biomolecular structures uncovered and described by modern chemical biology? Reviewed here are the first 20 cases where biomolecular resurrections have been achieved. These show how paleogenetics can lead to an understanding of the function of biomolecules, analyze changing function, and put meaning to genomic sequences, all in ways that are not possible with traditional molecular biological studies.
Affiliation(s)
- Steven A Benner
- Foundation for Applied Molecular Evolution, 1115 NW 4th Street, Gainesville, FL 32601, USA
3
Schmid DG, Grosche P, Bandel H, Jung G. FTICR-mass spectrometry for high-resolution analysis in combinatorial chemistry. Biotechnol Bioeng 2001. [DOI: 10.1002/1097-0290(2000)71:2<149::aid-bit1005>3.0.co;2-c] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.4] [Indexed: 11/11/2022]
4
Benner SA, Chamberlin SG, Liberles DA, Govindarajan S, Knecht L. Functional inferences from reconstructed evolutionary biology involving rectified databases--an evolutionarily grounded approach to functional genomics. Res Microbiol 2000; 151:97-106. [PMID: 10865954 DOI: 10.1016/s0923-2508(00)00123-6] [Citation(s) in RCA: 46] [Impact Index Per Article: 1.9] [Indexed: 11/27/2022]
Abstract
If bioinformatics tools are constructed to reproduce the natural, evolutionary history of the biosphere, they offer powerful approaches to some of the most difficult tasks in genomics, including the organization and retrieval of sequence data, the updating of massive genomic databases, the detection of database error, the assignment of introns, the prediction of protein conformation from protein sequences, the detection of distant homologs, the assignment of function to open reading frames, the identification of biochemical pathways from genomic data, and the construction of a comprehensive model correlating the history of biomolecules with the history of planet Earth.
Affiliation(s)
- S A Benner
- Department of Chemistry, University of Florida, Gainesville, USA.
5
Gene Trees and Species Trees: The Gene-Duplication Problem is Fixed-Parameter Tractable. Lecture Notes in Computer Science 1999. [DOI: 10.1007/3-540-48447-7_29] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.6] [Indexed: 02/04/2023]
6
Benner SA, Trabesinger N, Schreiber D. Post-genomic science: converting primary structure into physiological function. Advances in Enzyme Regulation 1998; 38:155-80. [PMID: 9762352 DOI: 10.1016/s0065-2571(97)00019-8] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.2] [Indexed: 11/20/2022]
Affiliation(s)
- S A Benner
- Department of Chemistry, University of Florida, Gainesville 32611, USA
7
Fellows M, Hallett M, Stege U. On the Multiple Gene Duplication Problem. Algorithms and Computation 1998. [DOI: 10.1007/3-540-49381-6_37] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.4] [Indexed: 01/11/2023]
8
Gerloff DL, Cohen FE, Korostensky C, Turcotte M, Gonnet GH, Benner SA. A predicted consensus structure for the N-Terminal fragment of the heat shock protein HSP90 family. Proteins 1997. [DOI: 10.1002/(sici)1097-0134(199703)27:3<450::aid-prot12>3.0.co;2-k] [Citation(s) in RCA: 20] [Impact Index Per Article: 0.7] [Indexed: 11/06/2022]
9
10
Tauer A, Benner SA. The B12-dependent ribonucleotide reductase from the archaebacterium Thermoplasma acidophila: an evolutionary solution to the ribonucleotide reductase conundrum. Proc Natl Acad Sci U S A 1997; 94:53-8. [PMID: 8990160 PMCID: PMC19235 DOI: 10.1073/pnas.94.1.53] [Citation(s) in RCA: 51] [Impact Index Per Article: 1.9] [Indexed: 02/03/2023] [Open Access]
Abstract
A coenzyme B12-dependent ribonucleotide reductase was purified from the archaebacterium Thermoplasma acidophila and partially sequenced. Using probes derived from the sequence, the corresponding gene was cloned, completely sequenced, and expressed in Escherichia coli. The deduced amino acid sequence shows that the catalytic domain of the B12-dependent enzyme from T. acidophila, some 400 amino acids, is related by common ancestry to the diferric tyrosine radical iron(III)-dependent ribonucleotide reductase from E. coli, yeast, mammalian viruses, and man. The critical cysteine residues in the catalytic domain that participate in the thiyl radical-dependent reaction have been conserved even though the cofactor that generates the radical is not. Evolutionary bridges created by the T. acidophila sequence and that of a B12-dependent reductase from Mycobacterium tuberculosis establish homology between the Fe-dependent enzymes and the catalytic domain of the Lactobacillus leichmannii B12-dependent enzyme as well. These bridges are confirmed by a predicted secondary structure for the Lactobacillus enzyme. Sequence similarities show that the N-terminal domain of the T. acidophila ribonucleotide reductase is also homologous to the anaerobic ribonucleotide reductase from E. coli, which uses neither B12 nor Fe cofactors. A predicted secondary structure of the N-terminal domain suggests that it is predominantly helical, as is the domain in the aerobic E. coli enzyme depending on Fe, extending the homologous family of proteins to include anaerobic ribonucleotide reductases, B12 ribonucleotide reductases, and Fe-dependent aerobic ribonucleotide reductases. A model for the evolution of the ribonucleotide reductase family is presented; in this model, the thiyl radical-based reaction mechanism is conserved, but the cofactor is chosen to best adapt the host organism to its environment. 
This analysis illustrates how secondary structure predictions can assist evolutionary analyses, each important in "post-genomic" biochemistry.
Affiliation(s)
- A Tauer
- Department of Chemistry, Eidgenössische Technische Hochschule Zürich, Switzerland
11
Trabesinger-Ruef N, Jermann T, Zankel T, Durrant B, Frank G, Benner SA. Pseudogenes in ribonuclease evolution: a source of new biomacromolecular function? FEBS Lett 1996; 382:319-22. [PMID: 8605993 DOI: 10.1016/0014-5793(96)00191-3] [Citation(s) in RCA: 51] [Impact Index Per Article: 1.8] [Indexed: 01/31/2023]
Abstract
Bovine seminal ribonuclease (RNase) diverged from pancreatic RNase after a gene duplication ca. 35 million years ago. Members of the seminal RNase gene family evidently remained unexpressed pseudogenes for much of their evolutionary history. Between 5 and 10 million years ago, however, after the divergence of kudu but before the divergence of ox, evidence suggests that the pseudogene was repaired and expressed. Intriguingly, detailed analysis of the sequences suggests that the repair may have involved gene conversion, transfer of information from the pancreatic gene to the RNase pseudogene. Further, the ratio of non-silent to silent substitutions suggests that the pancreatic RNases are divergently evolving under functional constraints, the seminal RNase pseudogenes are diverging under no functional constraints, while the genes expressed in the seminal plasma are evolving extremely rapidly in their amino acid sequences, as if to fulfil a new physiological role.
12
Gerloff DL, Chelvanayagam G, Benner SA. A predicted consensus structure for the protein kinase C2 homology (C2H) domain, the repeating unit of synaptotagmin. Proteins 1995; 22:299-310. [PMID: 7479705 DOI: 10.1002/prot.340220402] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.3] [Indexed: 01/25/2023]
Abstract
A secondary structure has been predicted for the protein kinase C2 regulatory domain found in homologous form in synaptotagmin, some phospholipases, and some GTP activated proteins. The proposed structure is built from seven consecutive beta strands followed by a terminal alpha helix. Considerations of overall surface exposure of individual secondary structural elements suggest that these are packed into a 2-sheet beta sandwich structure, with one of only three of the many possible folds being preferred.
Affiliation(s)
- D L Gerloff
- Department of Chemistry, Swiss Federal Institute of Technology, Zurich, Switzerland
13
Gerloff DL, Benner SA. A consensus prediction of the secondary structure for the 6-phospho-beta-D-galactosidase superfamily. Proteins 1995; 21:273-81. [PMID: 7567950 DOI: 10.1002/prot.340210402] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.3] [Indexed: 01/26/2023]
Abstract
Two separate unrefined models for the secondary structure of two subfamilies of the 6-phospho-beta-D-galactosidase superfamily were independently constructed by examining patterns of variation and conservation within homologous protein sequences, assigning surface, interior, parsing, and active site residues to positions in the alignment, and identifying periodicities in these. A consensus model for the secondary structure of the entire superfamily was then built. The prediction tests the limits of an unrefined prediction made using this approach in a large protein with substantial functional and sequence divergence within the family. The protein belongs to the alpha-beta class, with the core beta strands aligned parallel. The supersecondary structural elements that are readily identified in this model are a parallel beta sheet built by strands C, D, and E, with helices 2 and 3 connecting strands (C+D) and (D+E), respectively, and an analogous beta-alpha unit (strand G and helix 7) toward the end of the sequence. The resemblance of the supersecondary model to the tertiary structure formed by 8-fold alpha-beta barrel proteins is almost certainly not coincidental.
Affiliation(s)
- D L Gerloff
- Department of Chemistry, Swiss Federal Institute of Technology, Zürich
14
Benner SA. Predicting the conformation of proteins from sequences. Progress and future progress. J Mol Recognit 1995; 8:9-28. [PMID: 7598957 DOI: 10.1002/jmr.300080104] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.5] [Indexed: 01/26/2023]
Abstract
Recent progress in structure prediction has allowed bona fide predictions, those made and published before an experimental structure is determined, to be remarkably accurate. The most successful methods rely on an analysis of patterns of conservation and variation within homologous protein sequences, extract tertiary structural information before secondary structure is predicted, and avoid 'three state per residue scores' as a tool for evaluating a prediction, focusing instead on efforts to understand why a prediction is successful when it is successful, and why it fails when it fails.
Affiliation(s)
- S A Benner
- Laboratory for Organic Chemistry, Zurich, Switzerland
15
Tuckwell DS, Humphries MJ, Brass A. A secondary structure model of the integrin alpha subunit N-terminal domain based on analysis of multiple alignments. Cell Adhesion and Communication 1994; 2:385-402. [PMID: 7842254 DOI: 10.3109/15419069409004450] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.1] [Indexed: 01/27/2023]
Abstract
The integrins are alpha/beta heterodimeric proteins which mediate cell-matrix and cell-cell interactions. Current data indicate that the N-terminal moiety of the alpha subunit is involved in ligand binding. This region of the receptor is made up of a seven-fold repeated sequence of unknown structure which contains EF-hand-like putative divalent cation-binding sites. Recent studies have shown that multiple sequence alignments can be analysed to yield secondary structure predictions. Therefore, to obtain a model structure for the integrin alpha subunit N-terminal domain repeat, a large alignment of the seven repeats from sixteen integrin sequences was generated. Two methods of analysis were used: First, Chou and Fasman and Garnier, Osguthorpe and Robson predictions were carried out for individual sequences and the consensus predictions derived. Consensus hydrophobicity and chain flexibility data were also used to provide additional data. Second, sites of conservation and variation were analysed by a computer program STAMA (STructure After Multiple Alignment) to yield a secondary structure prediction. The two analyses gave essentially the same predicted structure: undefined region, loop, alpha-helix, beta-strand, divalent cation-binding loop, beta-strand, putative turn, loop, beta-strand. This is the first model structure to be presented for an integrin domain. Its implications for integrin function are discussed.
Affiliation(s)
- D S Tuckwell
- School of Biological Sciences, University of Manchester, U.K.
16
Benner SA, Jenny TF, Cohen MA, Gonnet GH. Predicting the conformation of proteins from sequences. Progress and future progress. Advances in Enzyme Regulation 1994; 34:269-353. [PMID: 7942279 DOI: 10.1016/0065-2571(94)90021-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.3] [Indexed: 01/28/2023]
Abstract
A new paradigm for predicting the secondary and tertiary structure of functional proteins from sequence data has emerged from detailed models of how natural selection, conservation, and neutral drift, the three fundamental factors in molecular evolution, leave their mark upon protein sequences. Structural information is extracted from a set of aligned homologous sequences via an analysis of patterns of conservation and variation between proteins with quantitatively defined evolutionary relationships. Tertiary structural information is obtained prior to the assignment of secondary structure, where it plays an important role. Throughout, structural predictions are made with the active involvement of a biochemist whose expertise and insight are critical both for making the prediction and for analyzing its successful and unsuccessful parts. Secondary structure predictions are evaluated based on their ability to sustain an effort to model tertiary structure. Several predictions made using the new paradigm can now be compared with those made under the classical paradigm, including a neural network. The results obtained from the new paradigm are clearly superior to those obtained with the classical paradigm, at least within the protein families that were examined.
Affiliation(s)
- S A Benner
- Institute for Organic Chemistry, E.T.H., Zürich, Switzerland
17
Abstract
Two types of approaches for predicting the conformation of proteins from sequence data have lately received attention: 'black box' tools that generate fully automated predictions of secondary structure from a set of homologous protein sequences, and methods involving the expertise of a human biochemist who is assisted, but not replaced, by computer tools. A friendly controversy has emerged as to which approach offers a brighter future. In fact, both are necessary. Nevertheless, a snapshot of the controversy at this instant offers much insight into the structure prediction problem itself.
Affiliation(s)
- S A Benner
- Laboratory for Organic Chemistry, E.T.H. Zurich, Switzerland
18
Gerloff DL, Jenny TF, Knecht LJ, Gonnet GH, Benner SA. The nitrogenase MoFe protein. A secondary structure prediction. FEBS Lett 1993; 318:118-24. [PMID: 8440368 DOI: 10.1016/0014-5793(93)80004-e] [Citation(s) in RCA: 26] [Impact Index Per Article: 0.8] [Indexed: 01/30/2023]
Abstract
Surface residues, interior residues, and parsing residues, together with a secondary structure derived from these, are predicted for the MoFe nitrogenase protein in advance of a crystal structure of the protein, scheduled shortly to appear in Nature. By publishing this prediction, we test our method for predicting the conformation of proteins from patterns in the divergent evolution of homologous protein sequences in a way that places the method 'at risk'.
Affiliation(s)
- D L Gerloff
- Laboratory for Organic Chemistry, ETH Zürich, Switzerland
19
Benner SA, Gerloff D. Patterns of divergence in homologous proteins as indicators of secondary and tertiary structure: a prediction of the structure of the catalytic domain of protein kinases. Advances in Enzyme Regulation 1991; 31:121-81. [PMID: 1877385 DOI: 10.1016/0065-2571(91)90012-b] [Citation(s) in RCA: 127] [Impact Index Per Article: 3.8] [Indexed: 12/29/2022]
Abstract
The secondary structure and elements of tertiary structure have been predicted for the catalytic domain of protein kinases using a method that extracts structural information from the patterns of conservation and variation in an alignment of homologous proteins. The central features of this structural prediction are: (a) the catalytic domains of protein kinases do not incorporate a Rossmann fold; (b) the core of the structure is founded on beta sheets built from pairs of bent antiparallel beta strands; (c) five helices, including an especially long helix (alignment positions 129-152) that lie on the outside of the folded core. These proteins are important in many aspects of metabolic regulation.
Affiliation(s)
- S A Benner
- Laboratory for Organic Chemistry, E.T.H., Zurich, Switzerland