Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Tsoka S, Ouzounis CA. Recent developments and future directions in computational genomics. FEBS Lett 2000;480:42-8. [PMID: 10967327 DOI: 10.1016/s0014-5793(00)01776-2] [Citation(s) in RCA: 39] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]

For:	Tsoka S, Ouzounis CA. Recent developments and future directions in computational genomics. FEBS Lett 2000;480:42-8. [PMID: 10967327 DOI: 10.1016/s0014-5793(00)01776-2] [Citation(s) in RCA: 39] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]

Number

Cited by Other Article(s)

Rallis D, Baltogianni M, Kapetaniou K, Kosmeri C, Giapros V. Bioinformatics in Neonatal/Pediatric Medicine-A Literature Review. J Pers Med 2024;14:767. [PMID: 39064021 PMCID: PMC11277633 DOI: 10.3390/jpm14070767] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/05/2024] [Revised: 07/14/2024] [Accepted: 07/16/2024] [Indexed: 07/28/2024] Open

Vasileiou D, Karapiperis C, Baltsavia I, Chasapi A, Ahrén D, Janssen PJ, Iliopoulos I, Promponas VJ, Enright AJ, Ouzounis CA. CGG toolkit: Software components for computational genomics. PLoS Comput Biol 2023;19:e1011498. [PMID: 37934729 PMCID: PMC10629618 DOI: 10.1371/journal.pcbi.1011498] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/25/2023] [Accepted: 09/07/2023] [Indexed: 11/09/2023] Open

Role of Bioinformatics in Biological Sciences. Adv Bioinformatics 2021. [DOI: 10.1007/978-981-33-6191-1_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022] Open

Noreña – P A, González Muñoz A, Mosquera-Rendón J, Botero K, Cristancho MA. Colombia, an unknown genetic diversity in the era of Big Data. BMC Genomics 2018;19:859. [PMID: 30537922 PMCID: PMC6288850 DOI: 10.1186/s12864-018-5194-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/09/2023] Open

Abstract

BACKGROUND

Latin America harbors some of the most biodiverse countries in the world, including Colombia. Despite the increasing use of cutting-edge technologies in genomics and bioinformatics in several biological science fields around the world, the region has fallen behind in the inclusion of these approaches in biodiversity studies. In this study, we used data mining methods to search in four main public databases of genetic sequences such as: NCBI Nucleotide and BioProject, Pathosystems Resource Integration Center, and Barcode of Life Data Systems databases. We aimed to determine how much of the Colombian biodiversity is contained in genetic data stored in these public databases and how much of this information has been generated by national institutions. Additionally, we compared this data for Colombia with other countries of high biodiversity in Latin America, such as Brazil, Argentina, Costa Rica, Mexico, and Peru.

RESULTS

In Nucleotide, we found that 66.84% of total records for Colombia have been published at the national level, and this data represents less than 5% of the total number of species reported for the country. In BioProject, 70.46% of records were generated by national institutions and the great majority of them is represented by microorganisms. In BOLD Systems, 26% of records have been submitted by national institutions, representing 258 species for Colombia. This number of species reported for Colombia span approximately 0.46% of the total biodiversity reported for the country (56,343 species). Finally, in PATRIC database, 13.25% of the reported sequences were contributed by national institutions. Colombia has a better biodiversity representation in public databases in comparison to other Latin American countries, like Costa Rica and Peru. Mexico and Argentina have the highest representation of species at the national level, despite Brazil and Colombia, which actually hold the first and second places in biodiversity worldwide.

CONCLUSIONS

Our findings show gaps in the representation of the Colombian biodiversity at the molecular and genetic levels in widely consulted public databases. National funding for high-throughput molecular research, NGS technologies costs, and access to genetic resources are limiting factors. This fact should be taken as an opportunity to foster the development of collaborative projects between research groups in the Latin American region to study the vast biodiversity of these countries using 'omics' technologies.

Collapse

Structure-based function analysis of putative conserved proteins with isomerase activity from Haemophilus influenzae. 3 Biotech 2015;5:741-763. [PMID: 28324524 PMCID: PMC4569619 DOI: 10.1007/s13205-014-0274-1] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/26/2014] [Accepted: 12/18/2014] [Indexed: 01/09/2023] Open

Shahbaaz M, Ahmad F, Imtaiyaz Hassan M. Structure-based functional annotation of putative conserved proteins having lyase activity from Haemophilus influenzae. 3 Biotech 2015;5:317-336. [PMID: 28324295 PMCID: PMC4434415 DOI: 10.1007/s13205-014-0231-z] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/13/2014] [Accepted: 05/28/2014] [Indexed: 12/20/2022] Open

Abstract

Haemophilus influenzae is a small pleomorphic Gram-negative bacteria which causes several chronic diseases, including bacteremia, meningitis, cellulitis, epiglottitis, septic arthritis, pneumonia, and empyema. Here we extensively analyzed the sequenced genome of H. influenzae strain Rd KW20 using protein family databases, protein structure prediction, pathways and genome context methods to assign a precise function to proteins whose functions are unknown. These proteins are termed as hypothetical proteins (HPs), for which no experimental information is available. Function prediction of these proteins would surely be supportive to precisely understand the biochemical pathways and mechanism of pathogenesis of Haemophilus influenzae. During the extensive analysis of H. influenzae genome, we found the presence of eight HPs showing lyase activity. Subsequently, we modeled and analyzed three-dimensional structure of all these HPs to determine their functions more precisely. We found these HPs possess cystathionine-β-synthase, cyclase, carboxymuconolactone decarboxylase, pseudouridine synthase A and C, D-tagatose-1,6-bisphosphate aldolase and aminodeoxychorismate lyase-like features, indicating their corresponding functions in the H. influenzae. Lyases are actively involved in the regulation of biosynthesis of various hormones, metabolic pathways, signal transduction, and DNA repair. Lyases are also considered as a key player for various biological processes. These enzymes are critically essential for the survival and pathogenesis of H. influenzae and, therefore, these enzymes may be considered as a potential target for structure-based rational drug design. Our structure–function relationship analysis will be useful to search and design potential lead molecules based on the structure of these lyases, for drug design and discovery.

Collapse

Szilágyi SM, Szilágyi L. A fast hierarchical clustering algorithm for large-scale protein sequence data sets. Comput Biol Med 2014;48:94-101. [PMID: 24657908 DOI: 10.1016/j.compbiomed.2014.02.016] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/13/2013] [Revised: 02/10/2014] [Accepted: 02/25/2014] [Indexed: 10/25/2022]

Promponas VJ, Ouzounis CA, Iliopoulos I. Experimental evidence validating the computational inference of functional associations from gene fusion events: a critical survey. Brief Bioinform 2012;15:443-54. [PMID: 23220349 PMCID: PMC4017328 DOI: 10.1093/bib/bbs072] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open

Midha M, Polavarapu R, Meetei PA, Krishnan H, Mohareer K, Vindal V. OrFin: A web tool for detection of putative orthologs. Bioinformation 2012;8:738-9. [PMID: 23055622 PMCID: PMC3449379 DOI: 10.6026/97320630008738] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/13/2012] [Accepted: 07/16/2012] [Indexed: 11/23/2022] Open

Mazandu GK, Mulder NJ. Function prediction and analysis of mycobacterium tuberculosis hypothetical proteins. Int J Mol Sci 2012;13:7283-7302. [PMID: 22837694 PMCID: PMC3397526 DOI: 10.3390/ijms13067283] [Citation(s) in RCA: 68] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/02/2012] [Revised: 05/28/2012] [Accepted: 06/07/2012] [Indexed: 11/16/2022] Open

Szilágyi L, Medvés L, Szilágyi SM. A modified Markov clustering approach to unsupervised classification of protein sequences. Neurocomputing 2010. [DOI: 10.1016/j.neucom.2010.02.023] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]

Corchado JM, De Paz JF, Rodríguez S, Bajo J. Model of experts for decision support in the diagnosis of leukemia patients. Artif Intell Med 2009;46:179-200. [DOI: 10.1016/j.artmed.2008.12.001] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/09/2008] [Revised: 11/11/2008] [Accepted: 12/01/2008] [Indexed: 11/26/2022]

Tsoka S. Computational methodologies for genome evolution and functional association. Comput Chem Eng 2007. [DOI: 10.1016/j.compchemeng.2006.11.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]

Tarca AL, Romero R, Draghici S. Analysis of microarray experiments of gene expression profiling. Am J Obstet Gynecol 2006;195:373-88. [PMID: 16890548 PMCID: PMC2435252 DOI: 10.1016/j.ajog.2006.07.001] [Citation(s) in RCA: 125] [Impact Index Per Article: 6.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/08/2023]

Yura K, Yamaguchi A, Go M. Coverage of whole proteome by structural genomics observed through protein homology modeling database. JOURNAL OF STRUCTURAL AND FUNCTIONAL GENOMICS 2006;7:65-76. [PMID: 17146617 PMCID: PMC1769342 DOI: 10.1007/s10969-006-9010-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/11/2006] [Accepted: 08/08/2006] [Indexed: 11/07/2022]

Abstract

We have been developing FAMSBASE, a protein homology-modeling database of whole ORFs predicted from genome sequences. The latest update of FAMSBASE ( http://daisy.nagahama-i-bio.ac.jp/Famsbase/ ), which is based on the protein three-dimensional (3D) structures released by November 2003, contains modeled 3D structures for 368,724 open reading frames (ORFs) derived from genomes of 276 species, namely 17 archaebacterial, 130 eubacterial, 18 eukaryotic and 111 phage genomes. Those 276 genomes are predicted to have 734,193 ORFs in total and the current FAMSBASE contains protein 3D structure of approximately 50% of the ORF products. However, cases that a modeled 3D structure covers the whole part of an ORF product are rare. When portion of an ORF with 3D structure is compared in three kingdoms of life, in archaebacteria and eubacteria, approximately 60% of the ORFs have modeled 3D structures covering almost the entire amino acid sequences, however, the percentage falls to about 30% in eukaryotes. When annual differences in the number of ORFs with modeled 3D structure are calculated, the fraction of modeled 3D structures of soluble protein for archaebacteria is increased by 5%, and that for eubacteria by 7% in the last 3 years. Assuming that this rate would be maintained and that determination of 3D structures for predicted disordered regions is unattainable, whole soluble protein model structures of prokaryotes without the putative disordered regions will be in hand within 15 years. For eukaryotic proteins, they will be in hand within 25 years. The 3D structures we will have at those times are not the 3D structure of the entire proteins encoded in single ORFs, but the 3D structures of separate structural domains. Measuring or predicting spatial arrangements of structural domains in an ORF will then be a coming issue of structural genomics.

Collapse

Tsoka S, Simon D, Ouzounis CA. Automated metabolic reconstruction for Methanococcus jannaschii. ARCHAEA-AN INTERNATIONAL MICROBIOLOGICAL JOURNAL 2005;1:223-9. [PMID: 15810431 PMCID: PMC2685575 DOI: 10.1155/2004/324925] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]

Tsoka S, Ouzounis CA. Metabolic database systems for the analysis of genome-wide function. Biotechnol Bioeng 2004;84:750-5. [PMID: 14708115 DOI: 10.1002/bit.10881] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

Veeramachaneni V, Makałowski W. Visualizing sequence similarity of protein families. Genome Res 2004;14:1160-9. [PMID: 15140831 PMCID: PMC419794 DOI: 10.1101/gr.2079204] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]

von Mering C, Zdobnov EM, Tsoka S, Ciccarelli FD, Pereira-Leal JB, Ouzounis CA, Bork P. Genome evolution reveals biochemical networks and functional modules. Proc Natl Acad Sci U S A 2003;100:15428-33. [PMID: 14673105 PMCID: PMC307584 DOI: 10.1073/pnas.2136809100] [Citation(s) in RCA: 120] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/25/2022] Open

Audit B, Ouzounis CA. From genes to genomes: universal scale-invariant properties of microbial chromosome organisation. J Mol Biol 2003;332:617-33. [PMID: 12963371 DOI: 10.1016/s0022-2836(03)00811-8] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]

Abstract

The availability of complete genome sequences for a large variety of organisms is a major advance in understanding genome structure and function. One attribute of genome structure is chromosome organisation in terms of gene localisation and orientation. For example, bacterial operons, i.e. clusters of co-oriented genes that form transcription units, enable functionally related genes to be expressed simultaneously. The description of genome organisation was pioneered with the study of the distribution of genes of the Escherichia coli partial genetic map before the full genome sequence was known. Deploying powerful techniques from circular statistics and signal processing, we revisit the issue of gene localisation and orientation using 89 complete microbial chromosomes from the eubacterial and archaeal domains. We demonstrate that there is no characteristic size pertinent to the description of chromosome structure, e.g. there does not exist any single length appropriate to describe gene clustering. Our results show that, for all 89 chromosomes, gene positions and gene orientations share a common form of scale-invariant correlations known as "long-range correlations" that we can reveal for distances from the gene length, up to the chromosome size. This observation indicates that genes tend to assemble and to co-orient over any scale of observation greater than a few kilobases. This unexpected property of chromosome structure can be portrayed as an operon-like organisation at all scales and implies that a complete scale range extending over more than three orders of magnitudes of chromosome segment lengths is necessary to properly describe prokaryotic genome organisation. We propose that this pattern results from the effects of the superhelical context on gene expression coupled with the structure and dynamics of the nucleoid, possibly accommodating the diverse gene expression profiles needed during the different stages of cellular life.

Collapse

Peregrin-Alvarez JM, Tsoka S, Ouzounis CA. The phylogenetic extent of metabolic enzymes and pathways. Genome Res 2003;13:422-7. [PMID: 12618373 PMCID: PMC430287 DOI: 10.1101/gr.246903] [Citation(s) in RCA: 77] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]

Elkin PL. Primer on medical genomics part V: bioinformatics. Mayo Clin Proc 2003;78:57-64. [PMID: 12528878 DOI: 10.4065/78.1.57] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/23/2022]

Ayoubi P, Jin X, Leite S, Liu X, Martajaja J, Abduraham A, Wan Q, Yan W, Misawa E, Prade RA. PipeOnline 2.0: automated EST processing and functional data sorting. Nucleic Acids Res 2002;30:4761-9. [PMID: 12409467 PMCID: PMC135791 DOI: 10.1093/nar/gkf585] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Rigoutsos I, Huynh T, Floratos A, Parida L, Platt D. Dictionary-driven protein annotation. Nucleic Acids Res 2002;30:3901-16. [PMID: 12202776 PMCID: PMC137405 DOI: 10.1093/nar/gkf464] [Citation(s) in RCA: 20] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/17/2002] [Revised: 06/04/2002] [Accepted: 06/04/2002] [Indexed: 11/14/2022] Open

Abstract

Computational methods seeking to automatically determine the properties (functional, structural, physicochemical, etc.) of a protein directly from the sequence have long been the focus of numerous research groups. With the advent of advanced sequencing methods and systems, the number of amino acid sequences that are being deposited in the public databases has been increasing steadily. This has in turn generated a renewed demand for automated approaches that can annotate individual sequences and complete genomes quickly, exhaustively and objectively. In this paper, we present one such approach that is centered around and exploits the Bio-Dictionary, a collection of amino acid patterns that completely covers the natural sequence space and can capture functional and structural signals that have been reused during evolution, within and across protein families. Our annotation approach also makes use of a weighted, position-specific scoring scheme that is unaffected by the over-representation of well-conserved proteins and protein fragments in the databases used. For a given query sequence, the method permits one to determine, in a single pass, the following: local and global similarities between the query and any protein already present in a public database; the likeness of the query to all available archaeal/ bacterial/eukaryotic/viral sequences in the database as a function of amino acid position within the query; the character of secondary structure of the query as a function of amino acid position within the query; the cytoplasmic, transmembrane or extracellular behavior of the query; the nature and position of binding domains, active sites, post-translationally modified sites, signal peptides, etc. In terms of performance, the proposed method is exhaustive, objective and allows for the rapid annotation of individual sequences and full genomes. Annotation examples are presented and discussed in Results, including individual queries and complete genomes that were released publicly after we built the Bio-Dictionary that is used in our experiments. Finally, we have computed the annotations of more than 70 complete genomes and made them available on the World Wide Web at http://cbcsrv.watson.ibm.com/Annotations/.

Collapse

Bayat A. Science, medicine, and the future: Bioinformatics. BMJ 2002;324:1018-22. [PMID: 11976246 PMCID: PMC1122955 DOI: 10.1136/bmj.324.7344.1018] [Citation(s) in RCA: 56] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/03/2022]

Enright AJ, Van Dongen S, Ouzounis CA. An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 2002;30:1575-84. [PMID: 11917018 PMCID: PMC101833 DOI: 10.1093/nar/30.7.1575] [Citation(s) in RCA: 2344] [Impact Index Per Article: 106.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/18/2023] Open

Ouzounis CA, Karp PD. The past, present and future of genome-wide re-annotation. Genome Biol 2002;3:COMMENT2001. [PMID: 11864365 PMCID: PMC139008 DOI: 10.1186/gb-2002-3-2-comment2001] [Citation(s) in RCA: 70] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022] Open

Dandekar T, Du F, Schirmer RH, Schmidt S. Medical target prediction from genome sequence: combining different sequence analysis algorithms with expert knowledge and input from artificial intelligence approaches. COMPUTERS & CHEMISTRY 2001;26:15-21. [PMID: 11765847 DOI: 10.1016/s0097-8485(01)00095-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]

Lecompte O, Thompson JD, Plewniak F, Thierry J, Poch O. Multiple alignment of complete sequences (MACS) in the post-genomic era. Gene 2001;270:17-30. [PMID: 11403999 DOI: 10.1016/s0378-1119(01)00461-9] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]

Pawlowski K, Rychlewski L, Zhang B, Godzik A. Fold predictions for bacterial genomes. J Struct Biol 2001;134:219-31. [PMID: 11551181 DOI: 10.1006/jsbi.2001.4394] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]

Current Awareness on Comparative and Functional Genomics. Comp Funct Genomics 2001. [PMCID: PMC2447194 DOI: 10.1002/cfg.56] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/04/2022] Open

Iliopoulos I, Tsoka S, Andrade MA, Janssen P, Audit B, Tramontano A, Valencia A, Leroy C, Sander C, Ouzounis CA. Genome sequences and great expectations. Genome Biol 2001;2:INTERACTIONS0001. [PMID: 11178275 PMCID: PMC150431 DOI: 10.1186/gb-2000-2-1-interactions0001] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022] Open

Legrain P, Selig L. Genome-wide protein interaction maps using two-hybrid systems. FEBS Lett 2000;480:32-6. [PMID: 10967325 DOI: 10.1016/s0014-5793(00)01774-9] [Citation(s) in RCA: 112] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/11/2022]

Abstract

Automated sequence technology has rendered functional biology amenable to genomic scale analysis. Among genome-wide exploratory approaches, the two-hybrid system in yeast (Y2H) has outranked other techniques because it is the system of choice to detect protein-protein interactions. Deciphering the cascade of binding events in a whole cell helps define signal transduction and metabolic pathways or enzymatic complexes. The function of proteins is eventually attributed through whole cell protein interaction maps where totally unknown proteins are partnered with fully annotated proteins belonging to the same functional category. Since its first description in the late 1980's, several versions of the Y2H have been developed in order to overcome the major limitations of the system, namely false positives and false negatives. Optimized versions have been recently applied at multi-molecular and genomic scale. These genome-wide surveys can be methodologically divided into two types of approaches: one either tests combinations of predefined polypeptides (the so-called matrix approach) using various short-cuts to speed up the process, or one screens with a given polypeptide (bait) for potential partners (preys) present in complex libraries of genomic or complementary DNA (library screening). In the former strategy, one tests what one knows, for example pair-wise interactions between full-length open reading frames from recently sequenced and annotated genomes. Although based on a one-by-one scheme, this method is reported to be amenable to large-scale genomics thanks to multicloning strategies and to the use of small robotics workstations. In the latter, highly complex cDNA or genomic libraries of protein domains can be screened to saturation with high-throughput screening systems allowing the discovery of yet unidentified proteins. Both approaches have strengths and drawbacks that will be discussed here. None yields a full proteome-wide screening since certain proteins (e.g. some transcription factors) are not usable in Y2H. Novel two-hybrid assays have been recently described in bacteria. Applications of these time- and cost-effective assays to genomic screening will be discussed and compared to the Y2H technology.

Collapse