1
Error Rate Comparison during Polymerase Chain Reaction by DNA Polymerase. Mol Biol Int 2014; 2014:287430. [PMID: 25197572 PMCID: PMC4150459 DOI: 10.1155/2014/287430] [Citation(s) in RCA: 131] [Impact Index Per Article: 13.1] [Received: 05/22/2014] [Accepted: 07/21/2014] [Indexed: 12/20/2022] Open
Abstract
As larger-scale cloning projects become more prevalent, there is an increasing need for comparisons among high fidelity DNA polymerases used for PCR amplification. All polymerases marketed for PCR applications are tested for fidelity properties (i.e., error rate determination) by vendors, and numerous literature reports have addressed PCR enzyme fidelity. Nonetheless, it is often difficult to make direct comparisons among different enzymes due to numerous methodological and analytical differences from study to study. We have measured the error rates for 6 DNA polymerases commonly used in PCR applications, including 3 polymerases typically used for cloning applications requiring high fidelity. Error rate measurement values reported here were obtained by direct sequencing of cloned PCR products. The strategy employed here allows interrogation of error rate across a very large DNA sequence space, since 94 unique DNA targets were used as templates for PCR cloning. Of the six enzymes included in the study (Taq polymerase, AccuPrime-Taq High Fidelity, KOD Hot Start, cloned Pfu polymerase, Phusion Hot Start, and Pwo polymerase), we find the lowest error rates with Pfu, Phusion, and Pwo polymerases. Error rates are comparable for these 3 enzymes and are more than 10-fold lower than the error rate observed with Taq polymerase. Mutation spectra are reported, with the 3 high fidelity enzymes displaying broadly similar types of mutations. For these enzymes, transition mutations predominate, with little bias observed for type of transition.
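The abstract does not give the formula behind its error rates. As a minimal sketch, one commonly used normalization expresses fidelity as mutations per base sequenced per template doubling; whether this study used exactly that normalization is an assumption here. The function names and all numbers below are hypothetical and are not taken from the paper.

```python
import math


def doublings(fold_amplification: float) -> float:
    """Number of template doublings implied by a fold amplification (assumed model)."""
    if fold_amplification <= 1:
        raise ValueError("fold amplification must be greater than 1")
    return math.log2(fold_amplification)


def error_rate(mutations: int, bases_sequenced: int, n_doublings: float) -> float:
    """Errors per base per doubling: mutations / (bases sequenced * doublings)."""
    if bases_sequenced <= 0 or n_doublings <= 0:
        raise ValueError("bases_sequenced and n_doublings must be positive")
    return mutations / (bases_sequenced * n_doublings)


# Hypothetical inputs for illustration only (not data from the study).
d = doublings(1024)                      # 1024-fold amplification
rate = error_rate(20, 1_000_000, d)      # 20 mutations in 1 Mb of sequenced clones
print(f"{d:.1f} doublings, {rate:.2e} errors per base per doubling")
```

Normalizing by doublings rather than by cycle count is what makes enzymes with different amplification efficiencies comparable under this model.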
2
Maruyama Y, Kawamura Y, Nishikawa T, Isogai T, Nomura N, Goshima N. HGPD: Human Gene and Protein Database, 2012 update. Nucleic Acids Res 2011; 40:D924-9. [PMID: 22140100 PMCID: PMC3245012 DOI: 10.1093/nar/gkr1188] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Indexed: 11/14/2022] Open
Abstract
The Human Gene and Protein Database (HGPD; http://www.HGPD.jp/) is a unique database that stores information on a set of human Gateway entry clones in addition to protein expression and protein synthesis data. The HGPD was launched in November 2008, and 33,275 human Gateway entry clones have been constructed from the open reading frames (ORFs) of full-length cDNA, thus representing the largest collection in the world. Recently, research objectives have focused on the development of new medicines and the establishment of novel diagnostic methods and medical treatments. Studies using proteins and protein information, which are closely related to gene function, have also been undertaken. For this update, we constructed an additional 9,974 human Gateway entry clones, giving a total of 43,249. This set of human Gateway entry clones was named the Human Proteome Expression Resource, known as the 'HuPEX'. We also classified the clones into 10 groups according to protein function. Moreover, in vivo cellular localization data of proteins for 32,651 human Gateway entry clones were included for retrieval from the HGPD. In 'Information Overview', which presents the search results, the ORF region of each cDNA is now displayed, allowing the Gateway entry clones to be searched more easily.
Affiliation(s)
- Yukio Maruyama
- National Institute of Advanced Industrial Science and Technology, Japan Biological Informatics Consortium, Aomi, Koto-ku, Tokyo 135-0064, Japan
3
Škalamera D, Ranall MV, Wilson BM, Leo P, Purdon AS, Hyde C, Nourbakhsh E, Grimmond SM, Barry SC, Gabrielli B, Gonda TJ. A high-throughput platform for lentiviral overexpression screening of the human ORFeome. PLoS One 2011; 6:e20057. [PMID: 21629697 PMCID: PMC3101218 DOI: 10.1371/journal.pone.0020057] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.2] [Received: 12/23/2010] [Accepted: 04/24/2011] [Indexed: 11/22/2022] Open
Abstract
In response to the growing need for functional analysis of the human genome, we have developed a platform for high-throughput functional screening of genes overexpressed from lentiviral vectors. Protein-coding human open reading frames (ORFs) from the Mammalian Gene Collection were transferred into a lentiviral expression vector using the highly efficient Gateway recombination cloning. Target ORFs were inserted into the vector downstream of a constitutive promoter and upstream of an IRES-controlled GFP reporter, so that their transfection, transduction and expression could be monitored by fluorescence. The expression plasmids and viral packaging plasmids were combined and transfected into 293T cells to produce virus, which was then used to transduce the screening cell line. We have optimised the transfection and transduction procedures so that they can be performed using robotic liquid handling systems in an arrayed 96-well microplate, one-gene-per-well format, without the need to concentrate the viral supernatant. Since lentiviruses can infect both dividing and non-dividing cells, this system can be used to overexpress human ORFs in a broad spectrum of experimental contexts. We tested the platform in a 1990-gene pilot screen for genes that can increase proliferation of the non-tumorigenic mammary epithelial cell line MCF-10A after removal of growth factors. Transduced cells were labelled with the nucleoside analogue 5-ethynyl-2′-deoxyuridine (EdU) to detect cells progressing through S phase. Hits were identified using high-content imaging and statistical analysis and confirmed with vectors using two different promoters (CMV and EF1α). The screen demonstrates the reliability, versatility and utility of our screening platform, and identifies novel cell cycle/proliferative activities for a number of genes.
Affiliation(s)
- Dubravka Škalamera
- University of Queensland Diamantina Institute, Princess Alexandra Hospital, Brisbane, Queensland, Australia.
4
Suzuki T, Moriya K, Nagatoshi K, Ota Y, Ezure T, Ando E, Tsunasawa S, Utsumi T. Strategy for comprehensive identification of human N-myristoylated proteins using an insect cell-free protein synthesis system. Proteomics 2010; 10:1780-93. [PMID: 20213681 DOI: 10.1002/pmic.200900783] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Indexed: 11/10/2022]
Abstract
To establish a strategy for the comprehensive identification of human N-myristoylated proteins, the susceptibility of human cDNA clones to protein N-myristoylation was evaluated by metabolic labeling and MS analyses of proteins expressed in an insect cell-free protein synthesis system. One-hundred-and-forty-one cDNA clones with N-terminal Met-Gly motifs were selected as potential candidates from approximately 2000 Kazusa ORFeome project human cDNA clones, and their susceptibility to protein N-myristoylation was evaluated using fusion proteins, in which the N-terminal ten amino acid residues were fused to an epitope-tagged model protein. As a result, the products of 29 out of 141 cDNA clones were found to be effectively N-myristoylated. The metabolic labeling experiments both in an insect cell-free protein synthesis system and in the transfected COS-1 cells using full-length cDNA revealed that 27 out of 29 proteins were in fact N-myristoylated. Database searches with these 27 cDNA clones revealed that 18 out of 27 proteins are novel N-myristoylated proteins that have not been reported previously to be N-myristoylated, indicating that this strategy is useful for the comprehensive identification of human N-myristoylated proteins from human cDNA resources.
Affiliation(s)
- Takashi Suzuki
- Clinical and Biotechnology Business Unit, Shimadzu Corporation, Kyoto, Japan
5
Godiska R, Mead D, Dhodda V, Wu C, Hochstein R, Karsi A, Usdin K, Entezam A, Ravin N. Linear plasmid vector for cloning of repetitive or unstable sequences in Escherichia coli. Nucleic Acids Res 2010; 38:e88. [PMID: 20040575 PMCID: PMC2847241 DOI: 10.1093/nar/gkp1181] [Citation(s) in RCA: 72] [Impact Index Per Article: 5.1] [Received: 10/30/2009] [Revised: 12/02/2009] [Accepted: 12/02/2009] [Indexed: 01/26/2023] Open
Abstract
Despite recent advances in sequencing, complete finishing of large genomes and analysis of novel proteins they encode typically require cloning of specific regions. However, many of these fragments are extremely difficult to clone in current vectors. Superhelical stress in circular plasmids can generate secondary structures that are substrates for deletion, particularly in regions that contain numerous tandem or inverted repeats. Common vectors also induce transcription and translation of inserted fragments, which can select against recombinant clones containing open reading frames or repetitive DNA. Conversely, transcription from cloned promoters can interfere with plasmid stability. We have therefore developed a novel Escherichia coli cloning vector (termed 'pJAZZ' vector) that is maintained as a linear plasmid. Further, it contains transcriptional terminators on both sides of the cloning site to minimize transcriptional interference between vector and insert. We show that this vector stably maintains a variety of inserts that were unclonable in conventional plasmids. These targets include short nucleotide repeats, such as those of the expanded Fragile X locus, and large AT-rich inserts, such as 20-kb segments of genomic DNA from Pneumocystis, Plasmodium, Oxytricha or Tetrahymena. The pJAZZ vector shows decreased size bias in cloning, allowing more uniform representation of larger fragments in libraries.
Affiliation(s)
- Ronald Godiska
- Lucigen Corp., 2120 W. Greenview Dr., Middleton, WI 53562, USA.
6
Ohara O. From transcriptome analysis to immunogenomics: current status and future direction. FEBS Lett 2009; 583:1662-7. [PMID: 19379746 DOI: 10.1016/j.febslet.2009.04.021] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 02/16/2009] [Revised: 04/01/2009] [Accepted: 04/14/2009] [Indexed: 10/20/2022]
Abstract
In 1994, we pioneered a complementary DNA (cDNA) sequencing project that aimed to predict the primary structures of unknown human proteins. Although our cDNA project was focused on the sequencing of large cDNAs, subsequent cDNA sequencing projects conducted by other groups have more extensively characterized the mammalian transcriptome. In parallel, many groups have made a tremendous effort to develop various resources for functional human genomics. In this context, to demonstrate the power of functional genomic approaches in practice, we have applied them for a comprehensive understanding of the immune system, which we term 'immunogenomics'. This mini-review first describes the historical background of our cDNA project and then provides perspectives on the present and future of immunogenomics based on our experiences.
Affiliation(s)
- Osamu Ohara
- Department of Human Genome Research, Kazusa DNA Research Institute, 2-6-7 Kazusa-Kamatari, Kisarazu, Chiba 292-0818, Japan.
7
Human protein factory for converting the transcriptome into an in vitro-expressed proteome. Nat Methods 2009; 5:1011-7. [PMID: 19054851 DOI: 10.1038/nmeth.1273] [Citation(s) in RCA: 197] [Impact Index Per Article: 13.1] [Indexed: 01/07/2023]
Abstract
Appropriate resources and expression technology necessary for human proteomics on a whole-proteome scale are being developed. We prepared a foundation for simple and efficient production of human proteins using the versatile Gateway vector system. We generated 33,275 human Gateway entry clones for protein synthesis, developed mRNA expression protocols for them and improved the wheat germ cell-free protein synthesis system. We applied this protein expression system to the in vitro expression of 13,364 human proteins and assessed their biological activity in two functional categories. Of the 75 tested phosphatases, 58 (77%) showed biological activity. Several cytokines containing disulfide bonds were produced in an active form in a nonreducing wheat germ cell-free expression system. We also manufactured protein microarrays by direct printing of unpurified in vitro-synthesized proteins and demonstrated their utility. Our 'human protein factory' infrastructure includes the resources and expression technology for in vitro proteome research.
8
Yamaguchi K, Inoue S, Ohara O, Nagase T. Pulse-chase experiment for the analysis of protein stability in cultured mammalian cells by covalent fluorescent labeling of fusion proteins. Methods Mol Biol 2009; 577:121-131. [PMID: 19718513 DOI: 10.1007/978-1-60761-232-2_10] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.7] [Indexed: 05/28/2023]
Abstract
We used HaloTag labeling technology for the pulse labeling of proteins in cultured mammalian cells. HaloTag technology allows a HaloTag-fusion protein to covalently bind to a specific small-molecule fluorescent ligand. Thus, specifically labeled HaloTag-fusion proteins can be chased in cells and observed in vitro after separation by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE). The fluorescent HaloTag ligand allows quantification of the labeled proteins by fluorescent image analysis. Herein, we demonstrate that the method allows analysis of intracellular protein stability as regulated by protein-degradation signals or an exogenously expressed E3 ubiquitin ligase.
Affiliation(s)
- Kei Yamaguchi
- Laboratory of Human Gene Research, Department of Human Genome Research, Kazusa DNA Research Institute, Chiba, Japan
9
Yamakawa H. High-throughput construction of ORF clones for production of the recombinant proteins. Methods Mol Biol 2009; 577:25-39. [PMID: 19718506 DOI: 10.1007/978-1-60761-232-2_3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 01/07/2023]
Abstract
Expression-ready cDNA clones, where the open reading frame (ORF) of the gene of interest is placed under the control of an appropriate promoter, are critical for functional characterization of the gene products. To create a resource of human gene products, we attempted to systematically convert original cDNA clones to expression-ready forms for recombinant proteins. For this purpose, we adopted a rare-cutting restriction enzyme-based system, the Flexi cloning system, to construct ORF clones. Taking advantage of the fully sequenced cDNA clones we accumulated to date, a number of sets of Flexi ORF clones in a 96-well format have been prepared. In this section, two methods for the preparation of Flexi ORF clones in a 96-well format are described. A protocol for transferring ORFs between Flexi vectors in a 96-well format is also described. We believe that the resultant clone set could be successfully used as a versatile reagent for functional characterization of human proteins.
Affiliation(s)
- Hisashi Yamakawa
- Department of Human Genome Research, Kazusa DNA Research Institute, Laboratory of Human Gene Research, Chiba, Japan
10
Proteome expression moves in vitro: resources and tools for harnessing the human proteome. Nat Methods 2008; 5:1001-2. [DOI: 10.1038/nmeth1208-1001] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/09/2022]
11
Ramachandran N, Srivastava S, Labaer J. Applications of protein microarrays for biomarker discovery. Proteomics Clin Appl 2008; 2:1444-59. [PMID: 21136793 DOI: 10.1002/prca.200800032] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.8] [Received: 01/26/2008] [Indexed: 01/18/2023]
Abstract
The search for new biomarkers for diagnosis, prognosis, and therapeutic monitoring of diseases continues in earnest despite dwindling success at finding novel reliable markers. Some of the current markers in clinical use do not provide optimal sensitivity and specificity, with prostate-specific antigen (PSA) being one of many such examples. The emergence of proteomic techniques and systems approaches to study disease pathophysiology has rekindled the quest for new biomarkers. In particular, the use of protein microarrays has surged as a powerful tool for large-scale testing of biological samples. Approximately half the reports on protein microarrays have been published in the last two years, especially in the area of biomarker discovery. In this review, we discuss the application of protein microarray technologies that offer unique opportunities to find novel biomarkers.
Affiliation(s)
- Niroshan Ramachandran
- Harvard Institute of Proteomics, Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Cambridge, MA, USA
12
Subcellular localization of intracellular human proteins by construction of tagged fusion proteins and transient expression in COS-7 Cells. Methods Mol Biol 2008. [PMID: 18370115 DOI: 10.1007/978-1-59745-188-8_24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
Identifying the subcellular compartment of a protein is an important step toward assigning protein function. Starting with a clone containing the open reading frame (ORF) of interest, it is possible to attach a variety of short amino acid tags or fluorescent proteins and detect the location of the protein, after transfection into a cell line, using fluorescence microscopy. By collecting data from various expression clone constructs, using a range of cell lines and double labeling with cellular compartment markers, a picture of the localization of a gene product can be built up. This chapter describes how to obtain the ORF clone for your gene of interest, how to clone it into your choice of mammalian expression vector or vectors, how to transiently transfect for visualization, and where to get started when interpreting the results.
13
Construction and characterization of a normalized yeast two-hybrid library derived from a human protein-coding clone collection. Biotechniques 2008; 44:265-73. [PMID: 18330356 DOI: 10.2144/000112674] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Indexed: 11/23/2022] Open
Abstract
The nuclear yeast two-hybrid (Y2H) system is the most widely used technology for detecting interactions between proteins. A common approach is to screen specific test proteins (baits) against large compilations of randomly cloned proteins (prey libraries). For eukaryotic organisms, libraries have traditionally been generated using messenger RNA (mRNA) extracted from various tissues and cells. Here we present a library construction strategy made possible by ongoing public efforts to establish collections of full-length protein-encoding clones. Our approach generates libraries that are essentially normalized and contain both randomly fragmented and full-length inserts. We refer to this type of protein-coding clone-derived library as a random and full-length (RAFL) Y2H library. The library described here is based on clones from the Mammalian Gene Collection, but our strategy is compatible with the use of any protein-coding clone collection from any organism in any vector and does not require inserts to be devoid of untranslated regions. We tested our prototype human RAFL library against a set of baits that had previously been searched against multiple cDNA libraries. These Y2H searches yielded a combination of novel and expected interactions, indicating that the RAFL library constitutes a valuable complement to Y2H cDNA libraries.
14
Production and sequence validation of a complete full length ORF collection for the pathogenic bacterium Vibrio cholerae. Proc Natl Acad Sci U S A 2008; 105:4364-9. [PMID: 18337508 DOI: 10.1073/pnas.0712049105] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.8] [Indexed: 01/08/2023] Open
Abstract
Cholera, an infectious disease with global impact, is caused by pathogenic strains of the bacterium Vibrio cholerae. High-throughput functional proteomics technologies now offer the opportunity to investigate all aspects of the proteome, which has led to an increased demand for comprehensive protein expression clone resources. Genome-scale reagents for cholera would encourage comprehensive analyses of immune responses and systems-wide functional studies that could lead to improved vaccine and therapeutic strategies. Here, we report the production of the FLEXGene clone set for V. cholerae O1 biovar eltor str. N16961: a complete-genome collection of ORF clones. This collection includes 3,761 sequence-verified clones from 3,887 targeted ORFs (97%). The ORFs were captured in a recombinational cloning vector to facilitate high-throughput transfer of ORF inserts into suitable expression vectors. To demonstrate its application, approximately 15% of the collection was transferred into the relevant expression vector and used to produce a protein microarray by transcribing, translating, and capturing the proteins in situ on the array surface with 92% success. In a second application, a method to screen for protein triggers of Toll-like receptors (TLRs) was developed. We tested in vitro-synthesized proteins for their ability to stimulate TLR5 in A549 cells. This approach appropriately identified FlaC, as well as previously uncharacterized TLR5 agonist activities. These data suggest that the genome-scale, fully sequenced ORF collection reported here will be useful for high-throughput functional proteomic assays, immune response studies, structural biology, and other applications.
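The coverage figure quoted in this abstract can be checked with a short calculation. The sketch below uses only the two counts stated above (3,761 sequence-verified clones out of 3,887 targeted ORFs); the helper name `percent` is hypothetical and introduced purely for illustration.

```python
def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Counts taken from the abstract: sequence-verified clones vs. targeted ORFs.
coverage = percent(3761, 3887)
print(f"Sequence-verified coverage: {coverage:.1f}% (rounds to {round(coverage)}%)")
```

The unrounded value sits slightly below the reported 97%, which is consistent with the abstract rounding to the nearest whole percent.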
15
Nagase T, Yamakawa H, Tadokoro S, Nakajima D, Inoue S, Yamaguchi K, Itokawa Y, Kikuno RF, Koga H, Ohara O. Exploration of human ORFeome: high-throughput preparation of ORF clones and efficient characterization of their protein products. DNA Res 2008; 15:137-49. [PMID: 18316326 PMCID: PMC2650635 DOI: 10.1093/dnares/dsn004] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.1] [Indexed: 11/13/2022] Open
Abstract
In this study, we established new systematic protocols for the preparation of cDNA clones, conventionally termed open reading frame (ORF) clones, suitable for characterization of their gene products by adopting a restriction-enzyme-assisted cloning method using the Flexi cloning system. The system has the following advantages: (1) preparation of ORF clones and their transfer into other vectors can be achieved efficiently and at lower cost; (2) the system provides a seamless connection to the versatile HaloTag labeling system, in which a single fusion tag can be used for various proteomic analyses; and (3) the resultant ORF clones show higher expression levels both in vitro and in vivo. With this system, we prepared ORF clones encoding 1,929 human genes and characterized the HaloTag-fusion proteins of a subset of them that are expressed in vitro or in mammalian cells. The results demonstrate that our Flexi ORF clones are efficient for the production of HaloTag-fusion proteins that can provide a new versatile set for a variety of functional analyses of human genes.
Affiliation(s)
- Takahiro Nagase
- Department of Human Genome Research, Kazusa DNA Research Institute, 2-6-7 Kazusa-Kamatari, Kisarazu, Chiba 292-0818, Japan.
16
Maercker C, Rogge T, Mathis H, Ridinger H, Bieback K. Development of Live Cell Chips to Monitor Cell Differentiation Processes. Eng Life Sci 2008. [DOI: 10.1002/elsc.200720225] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/09/2022] Open
17
Rolfs A, Hu Y, Ebert L, Hoffmann D, Zuo D, Ramachandran N, Raphael J, Kelley F, McCarron S, Jepson DA, Shen B, Baqui MMA, Pearlberg J, Taycher E, DeLoughery C, Hoerlein A, Korn B, LaBaer J. A biomedically enriched collection of 7000 human ORF clones. PLoS One 2008; 3:e1528. [PMID: 18231609 PMCID: PMC2211400 DOI: 10.1371/journal.pone.0001528] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Received: 10/19/2007] [Accepted: 11/28/2007] [Indexed: 01/21/2023] Open
Abstract
We report the production and availability of over 7000 fully sequence-verified plasmid ORF clones representing over 3400 unique human genes. These ORF clones were derived using the human MGC collection as template and were produced in two formats: with and without stop codons. Thus, this collection supports the production of either native protein or proteins with fusion tags added to either or both ends. The template clones used to generate this collection were enriched in three ways. First, gene redundancy was removed. Second, clones were selected to represent the best available GenBank reference sequence. Finally, a literature-based software tool was used to evaluate the list of target genes to ensure that it broadly reflected biomedical research interests. The target gene list was compared with 4000 human diseases and over 8500 biological and chemical MeSH classes in ~15 million publications recorded in PubMed at the time of analysis. This analysis revealed that, relative to the genome and the MGC collection, this collection is enriched for genes with published associations with a wide range of diseases and biomedical terms, without displaying a particular bias towards any single disease or concept. Thus, this collection is likely to be a powerful resource for researchers who wish to study protein function in a set of genes with documented biomedical significance.
Affiliation(s)
- Andreas Rolfs
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts, United States of America
- Yanhui Hu
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts, United States of America
- Lars Ebert
- Deutsches Ressourcenzentrum fuer Genomforschung (RZPD), Heidelberg, Germany
- Dietmar Hoffmann
- Sanofi-Aventis, Cambridge, Massachusetts, United States of America
- Dongmei Zuo
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts, United States of America
- Niro Ramachandran
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts, United States of America
- Jacob Raphael
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts, United States of America
- Fontina Kelley
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts, United States of America
- Seamus McCarron
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts, United States of America
- Daniel A. Jepson
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts, United States of America
- Binghua Shen
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts, United States of America
- Munira M. A. Baqui
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts, United States of America
- Joseph Pearlberg
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts, United States of America
- Elena Taycher
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts, United States of America
- Craig DeLoughery
- Sanofi-Aventis, Cambridge, Massachusetts, United States of America
- Andreas Hoerlein
- Deutsches Ressourcenzentrum fuer Genomforschung (RZPD), Heidelberg, Germany
- Bernhard Korn
- Deutsches Ressourcenzentrum fuer Genomforschung (RZPD), Heidelberg, Germany
- Joshua LaBaer
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts, United States of America
- * To whom correspondence should be addressed. E-mail:
18
Harbers M. The current status of cDNA cloning. Genomics 2008; 91:232-42. [PMID: 18222633 DOI: 10.1016/j.ygeno.2007.11.004] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.9] [Received: 08/17/2007] [Revised: 11/10/2007] [Accepted: 11/17/2007] [Indexed: 11/19/2022]
Abstract
The cloning of cDNAs, copies of cellular RNA, is one of the classical technologies in molecular biology. Over the past 30 years cDNA cloning technologies have been improved to enable the cloning of large cDNA collections, which are fundamental to today's understanding of the utilization of genetic information. With the discovery of noncoding RNAs, additional new approaches to the cloning of short RNAs have been developed. However, with the realization that much larger portions of genomes are transcribed than anticipated from genome annotations, cDNA cloning faces new challenges to uncover rare transcripts and to make the corresponding cDNAs available for functional studies. This review provides an overview on the current status of cDNA cloning and possibilities for the discovery and characterization of new RNA families.
Affiliation(s)
- Matthias Harbers
- DNAFORM, Inc., Leading Venture Plaza 2, 75-1 Ono-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0046, Japan.
19
Bechtel S, Rosenfelder H, Duda A, Schmidt CP, Ernst U, Wellenreuther R, Mehrle A, Schuster C, Bahr A, Blöcker H, Heubner D, Hoerlein A, Michel G, Wedler H, Köhrer K, Ottenwälder B, Poustka A, Wiemann S, Schupp I. The full-ORF clone resource of the German cDNA Consortium. BMC Genomics 2007; 8:399. [PMID: 17974005 PMCID: PMC2213676 DOI: 10.1186/1471-2164-8-399] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Received: 01/26/2007] [Accepted: 10/31/2007] [Indexed: 11/24/2022] Open
Abstract
Background: With the completion of the human genome sequence, the functional analysis and characterization of the encoded proteins has become the next urgent challenge in the post-genome era. The lack of comprehensive ORFeome resources has thus far hampered systematic applications by protein gain-of-function analysis. Gene and ORF coverage with full-length ORF clones thus needs to be extended. In combination with a unique and versatile cloning system, these will provide the tools for genome-wide systematic functional analyses, to achieve a deeper insight into complex biological processes. Results: Here we describe the generation of a full-ORF clone resource of human genes applying the Gateway cloning technology (Invitrogen). A pipeline for efficient cloning and sequencing was developed and a sample tracking database was implemented to streamline the clone production process targeting more than 2,200 different ORFs. In addition, a robust cloning strategy was established, permitting the simultaneous generation of two clone variants that contain a particular ORF with as well as without a stop codon by the implementation of only one additional working step into the cloning procedure. Up to 92% of the targeted ORFs were successfully amplified by PCR and more than 93% of the amplicons were successfully cloned. Conclusion: The German cDNA Consortium ORFeome resource currently consists of more than 3,800 sequence-verified entry clones representing ORFs, cloned with and without stop codon, for about 1,700 different gene loci. In total, 177 splice variants were cloned, representing 121 of these genes. The entry clones have been used to generate over 5,000 different expression constructs, providing the basis for functional profiling applications. As a member of the recently formed international ORFeome collaboration, we substantially contribute to generating and providing a whole genome human ORFeome collection in a unique cloning system that is made freely available to the community.
Affiliation(s)
- Stephanie Bechtel
- Department of Molecular Genome Analysis, German Cancer Research Center (DKFZ), Heidelberg, Germany.
|
20
|
Khalil AM, Julius JA, Bachant J. One step construction of PCR mutagenized libraries for genetic analysis by recombination cloning. Nucleic Acids Res 2007; 35:e104. [PMID: 17702758 PMCID: PMC2018627 DOI: 10.1093/nar/gkm583] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022] Open
Abstract
Recombination cloning encompasses a set of technologies that transfer gene sequences between vectors through site-specific recombination. Due in part to the instability of linear DNA in bacteria, both the initial capture and subsequent transfer of gene sequences are often performed using purified recombination enzymes. However, we find linear DNAs flanked by loxP sites recombine efficiently in bacteria expressing Cre recombinase and the lambda Gam protein, suggesting Cre/lox recombination of linear substrates can be performed in vivo. As one approach towards exploiting this capability, we describe a method for constructing large (>1 × 10^6 recombinants) libraries of gene mutations in a format compatible with recombination cloning. In this method, gene sequences are cloned into recombination entry plasmids and whole-plasmid PCR is used to produce mutagenized plasmid amplicons flanked by loxP. The PCR products are converted back into circular plasmids by transforming Cre/Gam-expressing bacteria, after which the mutant libraries are transferred to expression vectors and screened for phenotypes of interest. We further show that linear DNA fragments flanked by loxP repeats can be efficiently recombined into loxP-containing vectors through this same one-step transformation procedure. Thus, the approach reported here could be adapted as a general cloning method.
Affiliation(s)
- Jeff Bachant
- *To whom correspondence should be addressed. +1 951 827 6473; +1 951 827 3087
|
21
|
Murthy T, Rolfs A, Hu Y, Shi Z, Raphael J, Moreira D, Kelley F, McCarron S, Jepson D, Taycher E, Zuo D, Mohr SE, Fernandez M, Brizuela L, LaBaer J. A full-genomic sequence-verified protein-coding gene collection for Francisella tularensis. PLoS One 2007; 2:e577. [PMID: 17593976 PMCID: PMC1894649 DOI: 10.1371/journal.pone.0000577] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Received: 04/24/2007] [Accepted: 05/30/2007] [Indexed: 12/14/2022] Open
Abstract
The rapid development of new technologies for the high throughput (HT) study of proteins has increased the demand for comprehensive plasmid clone resources that support protein expression. These clones must be full-length, sequence-verified and in a flexible format. The generation of these resources requires automated pipelines supported by software management systems. Although the availability of clone resources is growing, current collections are either not complete or not fully sequence-verified. We report an automated pipeline, supported by several software applications, that enabled the construction of the first comprehensive sequence-verified plasmid clone resource for more than 96% of protein coding sequences of the genome of F. tularensis, a highly virulent human pathogen and the causative agent of tularemia. This clone resource was applied to an HT protein purification pipeline, successfully producing recombinant proteins for 72% of the genes. These methods and resources represent significant technological steps towards exploiting the genomic information of F. tularensis in discovery applications.
Affiliation(s)
- Tal Murthy
- Harvard Institute of Proteomics, Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Cambridge, Massachusetts, United States of America
- Andreas Rolfs
- Harvard Institute of Proteomics, Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Cambridge, Massachusetts, United States of America
- Yanhui Hu
- Harvard Institute of Proteomics, Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Cambridge, Massachusetts, United States of America
- Zhenwei Shi
- Harvard Institute of Proteomics, Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Cambridge, Massachusetts, United States of America
- Jacob Raphael
- Harvard Institute of Proteomics, Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Cambridge, Massachusetts, United States of America
- Donna Moreira
- Harvard Institute of Proteomics, Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Cambridge, Massachusetts, United States of America
- Fontina Kelley
- Harvard Institute of Proteomics, Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Cambridge, Massachusetts, United States of America
- Seamus McCarron
- Harvard Institute of Proteomics, Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Cambridge, Massachusetts, United States of America
- Daniel Jepson
- Harvard Institute of Proteomics, Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Cambridge, Massachusetts, United States of America
- Elena Taycher
- Harvard Institute of Proteomics, Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Cambridge, Massachusetts, United States of America
- Dongmei Zuo
- Harvard Institute of Proteomics, Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Cambridge, Massachusetts, United States of America
- Stephanie E. Mohr
- Harvard Institute of Proteomics, Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Cambridge, Massachusetts, United States of America
- DF/HCC DNA Resource Core, Harvard Medical School, Cambridge, Massachusetts, United States of America
- Mauricio Fernandez
- Harvard Institute of Proteomics, Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Cambridge, Massachusetts, United States of America
- Leonardo Brizuela
- Harvard Institute of Proteomics, Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Cambridge, Massachusetts, United States of America
- Joshua LaBaer
- Harvard Institute of Proteomics, Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Cambridge, Massachusetts, United States of America
- DF/HCC DNA Resource Core, Harvard Medical School, Cambridge, Massachusetts, United States of America
- * To whom correspondence should be addressed. E-mail:
|
22
|
A novel approach to sequence validating protein expression clones with automated decision making. BMC Bioinformatics 2007; 8:198. [PMID: 17567908 PMCID: PMC1914086 DOI: 10.1186/1471-2105-8-198] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 02/08/2007] [Accepted: 06/13/2007] [Indexed: 02/02/2023] Open
Abstract
Background: Whereas the molecular assembly of protein expression clones is readily automated and routinely accomplished in high throughput, sequence verification of these clones is still largely performed manually, an arduous and time-consuming process. The ultimate goal of validation is to determine if a given plasmid clone matches its reference sequence sufficiently to be "acceptable" for use in protein expression experiments. Given the accelerating increase in availability of tens of thousands of unverified clones, there is a strong demand for rapid, efficient and accurate software that automates clone validation. Results: We have developed an Automated Clone Evaluation (ACE) system, the first comprehensive, multi-platform, web-based plasmid sequence verification software package. ACE automates the clone verification process by defining each clone sequence as a list of multidimensional discrepancy objects, each describing a difference between the clone and its expected sequence including the resulting polypeptide consequences. To evaluate clones automatically, this list can be compared against user acceptance criteria that specify the allowable number of discrepancies of each type. This strategy allows users to re-evaluate the same set of clones against different acceptance criteria as needed for use in other experiments. ACE manages the entire sequence validation process including contig management, identifying and annotating discrepancies, determining if discrepancies correspond to polymorphisms and clone finishing. Designed to manage thousands of clones simultaneously, ACE maintains a relational database to store information about clones at various completion stages, project processing parameters and acceptance criteria. In a direct comparison, the automated analysis by ACE took less time and was more accurate than a manual analysis of a 93-gene clone set. Conclusion: ACE was designed to facilitate high throughput clone sequence verification projects. The software has been used successfully to evaluate more than 55,000 clones at the Harvard Institute of Proteomics. The software dramatically reduced the amount of time and labor required to evaluate clone sequences and decreased the number of missed sequence discrepancies, which commonly occur during manual evaluation. In addition, ACE helped to reduce the number of sequencing reads needed to achieve adequate coverage for making decisions on clones.
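The evaluation idea described in this abstract, a list of discrepancy objects checked against per-type acceptance criteria, can be illustrated with a minimal Python sketch. This is not the ACE implementation: the `Discrepancy` class, its fields, the `evaluate_clone` function, and the consequence labels are hypothetical names chosen for illustration, and treating unlisted discrepancy types as disallowed is an assumption.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Discrepancy:
    """One difference between a clone and its reference (hypothetical fields)."""
    position: int      # 1-based position in the reference sequence
    kind: str          # e.g. "substitution", "insertion", "deletion"
    consequence: str   # polypeptide consequence, e.g. "silent", "missense"


def evaluate_clone(discrepancies, max_allowed):
    """Accept a clone only if no consequence type exceeds its allowed count.

    max_allowed maps a consequence type to the largest tolerated count.
    Assumption: types missing from the mapping are not allowed at all.
    """
    counts = Counter(d.consequence for d in discrepancies)
    violations = {
        kind: n for kind, n in counts.items() if n > max_allowed.get(kind, 0)
    }
    return (not violations), violations


# The same discrepancy list is re-evaluated against two sets of criteria.
clone = [
    Discrepancy(120, "substitution", "silent"),
    Discrepancy(451, "substitution", "missense"),
]
strict = {"silent": 2}
lenient = {"silent": 2, "missense": 1}

strict_ok, strict_violations = evaluate_clone(clone, strict)
lenient_ok, lenient_violations = evaluate_clone(clone, lenient)
print(strict_ok, strict_violations)
print(lenient_ok, lenient_violations)
```

Keeping the discrepancy list separate from the criteria is what makes re-evaluation cheap in this sketch: the clone is analyzed once, and only the comparison step is repeated when the acceptance criteria change.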
|
23
|
Lamesch P, Li N, Milstein S, Fan C, Hao T, Szabo G, Hu Z, Venkatesan K, Bethel G, Martin P, Rogers J, Lawlor S, McLaren S, Dricot A, Borick H, Cusick ME, Vandenhaute J, Dunham I, Hill DE, Vidal M. hORFeome v3.1: a resource of human open reading frames representing over 10,000 human genes. Genomics 2007; 89:307-15. [PMID: 17207965 PMCID: PMC4647941 DOI: 10.1016/j.ygeno.2006.11.012] [Citation(s) in RCA: 216] [Impact Index Per Article: 12.7] [Received: 09/28/2006] [Revised: 10/29/2006] [Accepted: 11/21/2006] [Indexed: 11/24/2022]
Abstract
Complete sets of cloned protein-encoding open reading frames (ORFs), or ORFeomes, are essential tools for large-scale proteomics and systems biology studies. Here we describe human ORFeome version 3.1 (hORFeome v3.1), currently the largest publicly available resource of full-length human ORFs (available at www.openbiosystems.com). Generated by Gateway recombinational cloning, this collection contains 12,212 ORFs, representing 10,214 human genes, and corresponds to a 51% expansion of the original hORFeome v1.1. An online human ORFeome database, hORFDB, was built and serves as the central repository for all cloned human ORFs (http://horfdb.dfci.harvard.edu). This expansion of the original ORFeome resource greatly increases the potential experimental search space for large-scale proteomics studies, which will lead to the generation of more comprehensive datasets.
Affiliation(s)
- Philippe Lamesch
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, and Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
- Unité de Recherche en Biologie Moléculaire, Facultés Universitaires Notre-Dame de la Paix, 5000 Namur, Belgium
- Ning Li
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, and Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
- Stuart Milstein
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, and Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
- Changyu Fan
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, and Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
- Tong Hao
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, and Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
- Gabor Szabo
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, and Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
- Department of Physics and Center for Complex Network Research, University of Notre Dame, Notre Dame, IN 46556, USA
- Zhenjun Hu
- Department of Biomedical Engineering, Boston University, Boston, MA 02115, USA
- Kavitha Venkatesan
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, and Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
- Graeme Bethel
- The Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
- Paul Martin
- The Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
- Jane Rogers
- The Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
- Stephanie Lawlor
- The Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
- Stuart McLaren
- The Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
- Amélie Dricot
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, and Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
- Unité de Recherche en Biologie Moléculaire, Facultés Universitaires Notre-Dame de la Paix, 5000 Namur, Belgium
- Heather Borick
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, and Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
- Michael E. Cusick
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, and Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
- Jean Vandenhaute
- Unité de Recherche en Biologie Moléculaire, Facultés Universitaires Notre-Dame de la Paix, 5000 Namur, Belgium
- Ian Dunham
- The Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
- David E. Hill
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, and Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
- Corresponding authors. Fax: +1 617 632 5739.
- Marc Vidal
- Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, and Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
- Corresponding authors. Fax: +1 617 632 5739.
|