1
Liang Y, Luo H, Lin Y, Gao F. Recent advances in the characterization of essential genes and development of a database of essential genes. IMETA 2024; 3:e157. [PMID: 38868518 PMCID: PMC10989110 DOI: 10.1002/imt2.157] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Accepted: 10/09/2023] [Indexed: 06/14/2024]
Abstract
Over the past few decades, there has been a significant interest in the study of essential genes, which are crucial for the survival of an organism under specific environmental conditions and thus have practical applications in the fields of synthetic biology and medicine. An increasing amount of experimental data on essential genes has been obtained with the continuous development of technological methods. Meanwhile, various computational prediction methods, related databases and web servers have emerged accordingly. To facilitate the study of essential genes, we have established a database of essential genes (DEG), which has become popular with continuous updates to facilitate essential gene feature analysis and prediction, drug and vaccine development, as well as artificial genome design and construction. In this article, we summarized the studies of essential genes, overviewed the relevant databases, and discussed their practical applications. Furthermore, we provided an overview of the main applications of DEG and conducted comprehensive analyses based on its latest version. However, it should be noted that the essential gene is a dynamic concept instead of a binary one, which presents both opportunities and challenges for their future development.
Affiliation(s)
- Hao Luo
- Department of Physics, Tianjin University, Tianjin, China
- Yan Lin
- Department of Physics, Tianjin University, Tianjin, China
- Feng Gao
- Department of Physics, Tianjin University, Tianjin, China
- Frontiers Science Center for Synthetic Biology and Key Laboratory of Systems Bioengineering (Ministry of Education), Tianjin University, Tianjin, China
- SynBio Research Platform, Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), Tianjin, China
2
Milner DS, Galindo LJ, Irwin NAT, Richards TA. Transporter Proteins as Ecological Assets and Features of Microbial Eukaryotic Pangenomes. Annu Rev Microbiol 2023; 77:45-66. [PMID: 36944262 DOI: 10.1146/annurev-micro-032421-115538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/23/2023]
Abstract
Here we review two connected themes in evolutionary microbiology: (a) the nature of gene repertoire variation within species groups (pangenomes) and (b) the concept of metabolite transporters as accessory proteins capable of providing niche-defining "bolt-on" phenotypes. We discuss the need for improved sampling and understanding of pangenome variation in eukaryotic microbes. We then review the factors that shape the repertoire of accessory genes within pangenomes. As part of this discussion, we outline how gene duplication is a key factor in both eukaryotic pangenome variation and transporter gene family evolution. We go on to outline how, through functional characterization of transporter-encoding genes, in combination with analyses of how transporter genes are gained and lost from accessory genomes, we can reveal much about the niche range, the ecology, and the evolution of virulence of microbes. We advocate for the coordinated systematic study of eukaryotic pangenomes through genome sequencing and the functional analysis of genes found within the accessory gene repertoire.
Affiliation(s)
- David S Milner
- Department of Biology, University of Oxford, Oxford, United Kingdom
- Nicholas A T Irwin
- Department of Biology, University of Oxford, Oxford, United Kingdom
- Merton College, University of Oxford, Oxford, United Kingdom
- Thomas A Richards
- Department of Biology, University of Oxford, Oxford, United Kingdom
3
Loeillet S, Nicolas A. DNA polymerase δ: A single Pol31 polymorphism suppresses the strain background-specific lethality of Pol32 inactivation in Saccharomyces cerevisiae. DNA Repair (Amst) 2023; 127:103514. [PMID: 37244009 DOI: 10.1016/j.dnarep.2023.103514] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2023] [Revised: 05/12/2023] [Accepted: 05/14/2023] [Indexed: 05/29/2023]
Abstract
The evolutionarily conserved DNA polymerase delta (Polδ) plays several essential roles in eukaryotic DNA replication and repair: it synthesizes the lagging strand, lowers replicative mutagenesis via its proofreading exonuclease activity, and synthesizes both strands during break-induced replication. In Saccharomyces cerevisiae, the Polδ protein complex consists of three subunits encoded by the POL3, POL31 and POL32 genes. Surprisingly, in contrast to POL3 and POL31, the POL32 gene deletion was found to be viable in S. cerevisiae but lethal in all other eukaryotes, raising the question of the extent to which the viability of the POL32 deletion in S. cerevisiae is species specific. To address this issue, we inactivated the POL32 gene in 10 evolutionarily close or distant S. cerevisiae strains and found that POL32 was either essential (3 strains including SK1) or non-essential (5 strains including the reference S288C strain), or that its deletion conferred a slow-growth phenotype (2 strains). Whole-genome sequencing of S288C/SK1 pol32∆ meiotic segregants identified the lethal/suppressor effect of the single Pol31-C43Y polymorphism. Consistently, the introduction of the Pol31-43C allele in the SK1 and West African (WA) pol32∆ mutants was sufficient to restore cell viability, and introducing two copies of POL31-43C in the SK1 haploid strain restored wild-type growth. Reciprocally, introduction of the SK1 POL31-43Y allele in the S288C pol32∆ mutant was lethal. Sequence analyses of the POL31 polymorphisms in the 1,011 yeast genomes dataset show that the POL31-43Y allele occurs strictly in the yeast African palm wine clade. By contrast, the single Pol31-E400G polymorphism confers pol32∆ lethality in the Malaysian strain. In the yeast two-hybrid assay, we observed a weakened interaction between Pol3 and Pol31-43Y versus Pol31-43C, suggesting an insufficient level of Polδ holoenzyme stability/activity. Thus, the enigmatic non-essentiality of Pol32 in S. cerevisiae results from a single Pol31 amino acid polymorphism and is clade rather than species specific.
Affiliation(s)
- S Loeillet
- Institut Curie Research Center, CNRS UMR3244, PSL Research University, 26 rue d'Ulm, 75248 Paris Cedex 05, France
- A Nicolas
- Institut Curie Research Center, CNRS UMR3244, PSL Research University, 26 rue d'Ulm, 75248 Paris Cedex 05, France; IRCAN, CNRS UMR7284, INSERM U1081, Université Côte d'Azur, 28 avenue de Valombrose, 06107 Nice, France
4
Natalino M, Fumasoni M. Experimental approaches to study evolutionary cell biology using yeasts. Yeast 2023; 40:123-133. [PMID: 36896914 DOI: 10.1002/yea.3848] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Revised: 02/16/2023] [Accepted: 03/07/2023] [Indexed: 03/11/2023] Open
Abstract
The past century has witnessed tremendous advances in understanding how cells function. Nevertheless, how cellular processes have evolved is still poorly understood. Many studies have highlighted surprising molecular diversity in how cells from diverse species execute the same processes, and advances in comparative genomics are likely to reveal much more molecular diversity than was believed possible until recently. Extant cells therefore remain the product of an evolutionary history of which we are largely ignorant. Evolutionary cell biology has emerged as a discipline aiming to address this knowledge gap by combining evolutionary, molecular, and cellular biology thinking. Recent studies have shown how even essential molecular processes, such as DNA replication, can undergo fast adaptive evolution under certain laboratory conditions. These developments open new lines of research where the evolution of cellular processes can be investigated experimentally. Yeasts naturally find themselves at the forefront of this research line. Not only do they allow the observation of fast evolutionary adaptation, but they also provide numerous genomic, synthetic, and cellular biology tools already developed by a large community. Here we propose that yeasts can serve as an "evolutionary cell lab" to test hypotheses, principles, and ideas in evolutionary cell biology. We discuss various experimental approaches available for this purpose, and how biology at large can benefit from them.
5
Dede M, Hart T. Recovering false negatives in CRISPR fitness screens with JLOE. Nucleic Acids Res 2023; 51:1637-1651. [PMID: 36727483 PMCID: PMC9976895 DOI: 10.1093/nar/gkad046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2022] [Revised: 01/09/2023] [Accepted: 01/16/2023] [Indexed: 02/03/2023] Open
Abstract
It is widely accepted that pooled library CRISPR knockout screens offer greater sensitivity and specificity than prior technologies in detecting genes whose disruption leads to fitness defects, a critical step in identifying candidate cancer targets. However, the assumption that CRISPR screens are saturating has been largely untested. Through integrated analysis of screen data in cancer cell lines generated by the Cancer Dependency Map, we show that a typical CRISPR screen has a ∼20% false negative rate, in addition to library-specific false negatives. Replicability falls sharply as gene expression decreases, while cancer subtype-specific genes within a tissue show distinct profiles compared to false negatives. Cumulative analyses across tissues improve our understanding of core essential genes and suggest only a small number of lineage-specific essential genes, enriched for transcription factors that define pathways of tissue differentiation. To recover false negatives, we introduce a method, Joint Log Odds of Essentiality (JLOE), which builds on our prior work with BAGEL to selectively rescue the false negatives without an increased false discovery rate.
Affiliation(s)
- Merve Dede
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Traver Hart
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA; Department of Cancer Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
6
Rule-Based Pruning and In Silico Identification of Essential Proteins in Yeast PPIN. Cells 2022; 11:cells11172648. [PMID: 36078056 PMCID: PMC9454873 DOI: 10.3390/cells11172648] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2022] [Revised: 08/18/2022] [Accepted: 08/22/2022] [Indexed: 11/25/2022] Open
Abstract
Proteins are vital for the significant cellular activities of living organisms. However, not all of them are essential. Identifying essential proteins through biological experiments is more laborious and time-consuming than the computational approaches used in recent times. However, conventional methods sometimes perform poorly in specific scenarios, which makes them challenging to apply in practice. Thus, more developed and efficient computational prediction models are required for essential protein identification. An effective methodology is proposed in this research, capable of predicting essential proteins in a refined yeast protein–protein interaction network (PPIN). The rule-based refinement is done using protein complex and local interaction density information derived from the neighborhood properties of proteins in the network. Identification and pruning of non-essential proteins are equally crucial here. In the initial phase, careful assessment is performed by applying node and edge weights to identify and discard the non-essential proteins from the interaction network. Three cut-off levels are considered for each node and edge weight for pruning the non-essential proteins. Once the PPIN has been filtered, the second phase starts with two centrality-based approaches: (1) local interaction density (LID) and (2) local interaction density with protein complex (LIDC), which are successively implemented to identify the essential proteins in the yeast PPIN. Our proposed methodology achieves better performance in comparison to the existing state-of-the-art techniques.
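The two-phase idea in this abstract (prune by node and edge weights, then rank the surviving proteins by a local density score) can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the function names, the toy weights and cut-offs, and in particular the density score, which is assumed to be the number of interactions among a protein's neighbours divided by its neighbour count. The paper's actual weighting rules, cut-off levels, and LID/LIDC definitions may differ.

```python
from itertools import combinations


def prune(edges, node_weight, edge_cutoff, node_cutoff):
    """Phase 1 (illustrative): keep an interaction only if its edge weight and
    both endpoint node weights reach the chosen cut-offs."""
    adj = {}
    for (u, v), w in edges.items():
        if w >= edge_cutoff and node_weight[u] >= node_cutoff and node_weight[v] >= node_cutoff:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    return adj


def local_density(adj, node):
    """Assumed score: interactions among the node's neighbours / neighbour count."""
    nbrs = adj.get(node, set())
    if not nbrs:
        return 0.0
    links = sum(1 for a, b in combinations(sorted(nbrs), 2) if b in adj[a])
    return links / len(nbrs)


def rank_proteins(adj):
    """Phase 2 (illustrative): highest density first, ties broken by name."""
    return sorted(adj, key=lambda n: (-local_density(adj, n), n))


# Hypothetical toy network; weights and cut-offs are made up.
edges = {("A", "B"): 0.9, ("A", "C"): 0.8, ("B", "C"): 0.7, ("A", "D"): 0.9, ("D", "E"): 0.1}
node_weight = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0, "E": 0.2}

pruned = prune(edges, node_weight, edge_cutoff=0.5, node_cutoff=0.5)
ranking = rank_proteins(pruned)
```

In this toy network protein E is removed in the pruning phase, and the ranking is then computed only on the refined network, which mirrors the order of operations the abstract describes.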
7
Small Molecule Arranged Thermal Proximity Coaggregation (smarTPCA)-A Novel Approach to Characterize Protein-Protein Interactions in Living Cells by Similar Isothermal Dose-Responses. Int J Mol Sci 2022; 23:ijms23105605. [PMID: 35628420 PMCID: PMC9147192 DOI: 10.3390/ijms23105605] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2022] [Revised: 05/06/2022] [Accepted: 05/10/2022] [Indexed: 11/17/2022] Open
Abstract
Chemical biology and the application of small molecules has proven to be a potent perturbation strategy, especially for the functional elucidation of proteins, their networks, and regulators. In recent years, the cellular thermal shift assay (CETSA) and its proteome-wide extension, thermal proteome profiling (TPP), have proven to be effective tools for identifying interactions of small molecules with their target proteins, as well as off-targets in living cells. Here, we asked the question whether isothermal dose-response (ITDR) CETSA can be exploited to characterize secondary effects downstream of the primary binding event, such as changes in post-translational modifications or protein-protein interactions (PPI). By applying ITDR-CETSA to MAPK14 kinase inhibitor treatment of living HL-60 cells, we found similar dose-responses for the direct inhibitor target and its known interaction partners MAPKAPK2 and MAPKAPK3. Extension of the dose-response similarity comparison to the proteome wide level using TPP with compound concentration range (TPP-CCR) revealed not only the known MAPK14 interaction partners MAPKAPK2 and MAPKAPK3, but also the potentially new intracellular interaction partner MYLK. We are confident that dose-dependent small molecule treatment in combination with ITDR-CETSA or TPP-CCR similarity assessment will not only allow discrimination between primary and secondary effects, but will also provide a novel method to study PPI in living cells without perturbation by protein modification, which we named "small molecule arranged thermal proximity coaggregation" (smarTPCA).
8
Wang Y, Jiang B, Wu Y, He X, Liu L. Rapid intraspecies evolution of fitness effects of yeast genes. Genome Biol Evol 2022; 14:6575331. [PMID: 35482054 PMCID: PMC9113246 DOI: 10.1093/gbe/evac061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/21/2022] [Indexed: 11/14/2022] Open
Abstract
Organisms within species have numerous genetic and phenotypic variations. Growing evidence shows that intraspecies variation of mutant phenotypes may be more complicated than expected. Current studies on intraspecies variations of mutant phenotypes are limited to just a few strains. This study investigated the intraspecies variation of fitness effects of 5,630 gene mutants in ten Saccharomyces cerevisiae strains using CRISPR–Cas9 screening. We found that the variability of fitness effects induced by gene disruptions is very large across different strains. Over 75% of genes affected cell fitness in a strain-specific manner to varying degrees. The strain specificity of the fitness effect of a gene is related to its evolutionary and functional properties. Subsequent analysis revealed that younger genes, especially those newly acquired in S. cerevisiae species, are more likely to be strongly strain-specific. Intriguingly, there seems to exist a ceiling of fitness effect size for strong strain-specific genes, and among them, the newly acquired genes are still evolving and have yet to reach this ceiling. Additionally, for a large proportion of protein complexes, the strain specificity profile is inconsistent among genes encoding the same complex. Taken together, these results offer a genome-wide map of intraspecies variation for fitness effect as a mutant phenotype and provide an updated insight on intraspecies phenotypic evolution.
Affiliation(s)
- Yayu Wang
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, China
- Bei Jiang
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, China
- Yue Wu
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, China
- Xionglei He
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, China
- Li Liu
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, China
9
Abstract
Significance
Mitosis is an essential process in all eukaryotes, but paradoxically, genes required for mitosis vary among species. The essentiality of many mitotic genes was bypassed by activating alternative mechanisms during evolution. However, bypass events have rarely been recapitulated experimentally. Here, using the fission yeast Schizosaccharomyces pombe, the essentiality of a kinase (Plo1) required for bipolar spindle formation was bypassed by other mutations, many of which are associated with glucose metabolism. The Plo1 bypass by the reduction in glucose uptake was dependent on another kinase (casein kinase I), which potentiated spindle microtubule formation. This study illustrates a rare experimental bypass of essentiality for mitotic genes and provides insights into the molecular diversity of mitosis.
10
OUP accepted manuscript. Brief Funct Genomics 2022; 21:243-269. [DOI: 10.1093/bfgp/elac007] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/02/2021] [Revised: 03/17/2022] [Accepted: 03/18/2022] [Indexed: 11/14/2022] Open
11
Ha D, Kim D, Kim I, Oh Y, Kong J, Han S, Kim S. OUP accepted manuscript. Nucleic Acids Res 2022; 50:1849-1863. [PMID: 35137181 PMCID: PMC8887464 DOI: 10.1093/nar/gkac050] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/25/2021] [Revised: 01/14/2022] [Accepted: 01/25/2022] [Indexed: 11/14/2022] Open
Abstract
Mouse models have been engineered to reveal the biological mechanisms of human diseases based on an assumption. The assumption is that orthologous genes underlie conserved phenotypes across species. However, genetically modified mouse orthologs of human genes do not often recapitulate human disease phenotypes which might be due to the molecular evolution of phenotypic differences across species from the time of the last common ancestor. Here, we systematically investigated the evolutionary divergence of regulatory relationships between transcription factors (TFs) and target genes in functional modules, and found that the rewiring of gene regulatory networks (GRNs) contributes to the phenotypic discrepancies that occur between humans and mice. We confirmed that the rewired regulatory networks of orthologous genes contain a higher proportion of species-specific regulatory elements. Additionally, we verified that the divergence of target gene expression levels, which was triggered by network rewiring, could lead to phenotypic differences. Taken together, a careful consideration of evolutionary divergence in regulatory networks could be a novel strategy to understand the failure or success of mouse models to mimic human diseases. To help interpret mouse phenotypes in human disease studies, we provide quantitative comparisons of gene expression profiles on our website (http://sbi.postech.ac.kr/w/RN).
Affiliation(s)
- Doyeon Ha
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, Korea
- Donghyo Kim
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, Korea
- Youngchul Oh
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, Korea
- JungHo Kong
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, Korea
- Seong Kyu Han
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, Korea
- Sanguk Kim
- To whom correspondence should be addressed. Tel: +82 54 279 2348; Fax: +82 54 279 2199
12
Comprehensive prediction of robust synthetic lethality between paralog pairs in cancer cell lines. Cell Syst 2021; 12:1144-1159.e6. [PMID: 34529928 DOI: 10.1016/j.cels.2021.08.006] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 04/19/2021] [Revised: 07/08/2021] [Accepted: 08/18/2021] [Indexed: 12/15/2022]
Abstract
Pairs of paralogs may share common functionality and, hence, display synthetic lethal interactions. As the majority of human genes have an identifiable paralog, exploiting synthetic lethality between paralogs may be a broadly applicable approach for targeting gene loss in cancer. However, only a biased subset of human paralog pairs has been tested for synthetic lethality to date. Here, by analyzing genome-wide CRISPR screens and molecular profiles of over 700 cancer cell lines, we identify features predictive of synthetic lethality between paralogs, including shared protein-protein interactions and evolutionary conservation. We develop a machine-learning classifier based on these features to predict which paralog pairs are most likely to be synthetic lethal and to explain why. We show that our classifier accurately predicts the results of combinatorial CRISPR screens in cancer cell lines and furthermore can distinguish pairs that are synthetic lethal in multiple cell lines from those that are cell-line specific. A record of this paper's transparent peer review process is included in the supplemental information.
13
Rees-Garbutt J, Chalkley O, Landon S, Purcell O, Marucci L, Grierson C. Designing minimal genomes using whole-cell models. Nat Commun 2020; 11:836. [PMID: 32047145 PMCID: PMC7012841 DOI: 10.1038/s41467-020-14545-0] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 07/01/2019] [Accepted: 12/17/2019] [Indexed: 11/29/2022] Open
Abstract
In the future, entire genomes tailored to specific functions and environments could be designed using computational tools. However, computational tools for genome design are currently scarce. Here we present algorithms that enable the use of design-simulate-test cycles for genome design, using genome minimisation as a proof-of-concept. Minimal genomes are ideal for this purpose as they have a simple functional assay: whether the cell replicates or not. We used the first (and currently only published) whole-cell model for the bacterium Mycoplasma genitalium. Our computational design-simulate-test cycles discovered novel in silico minimal genomes which, if biologically correct, predict in vivo genomes smaller than JCVI-Syn3.0, a bacterium with, currently, the smallest genome that can be grown in pure culture. In the process, we identified 10 low essential genes and produced evidence for at least two Mycoplasma genitalium in silico minimal genomes. This work brings combined computational and laboratory genome engineering a step closer. Genome engineering will one day benefit from computational tools that can design genomes with desired functions. Here the authors develop computational design-simulate-test algorithms to design minimal genomes based on the whole-cell model of Mycoplasma genitalium.
Affiliation(s)
- Joshua Rees-Garbutt
- BrisSynBio, University of Bristol, Bristol, BS8 1TQ, UK; School of Biological Sciences, University of Bristol, Bristol Life Sciences Building, 24 Tyndall Avenue, Bristol, BS8 1TQ, UK
- Oliver Chalkley
- BrisSynBio, University of Bristol, Bristol, BS8 1TQ, UK; Department of Engineering Mathematics, University of Bristol, Bristol, BS8 1UB, UK; Bristol Centre for Complexity Science, Department of Engineering Mathematics, University of Bristol, Bristol, BS8 1UB, UK
- Sophie Landon
- BrisSynBio, University of Bristol, Bristol, BS8 1TQ, UK; Department of Engineering Mathematics, University of Bristol, Bristol, BS8 1UB, UK
- Oliver Purcell
- Engine Biosciences, MBC Biolabs, 733 Industrial Road, San Carlos, CA, 94070, USA
- Lucia Marucci
- BrisSynBio, University of Bristol, Bristol, BS8 1TQ, UK; Department of Engineering Mathematics, University of Bristol, Bristol, BS8 1UB, UK; School of Cellular and Molecular Medicine, University of Bristol, Bristol, BS8 1UB, UK
- Claire Grierson
- BrisSynBio, University of Bristol, Bristol, BS8 1TQ, UK; School of Biological Sciences, University of Bristol, Bristol Life Sciences Building, 24 Tyndall Avenue, Bristol, BS8 1TQ, UK
14
De Kegel B, Ryan CJ. Paralog buffering contributes to the variable essentiality of genes in cancer cell lines. PLoS Genet 2019; 15:e1008466. [PMID: 31652272 PMCID: PMC6834290 DOI: 10.1371/journal.pgen.1008466] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Received: 07/26/2019] [Revised: 11/06/2019] [Accepted: 10/08/2019] [Indexed: 12/26/2022] Open
Abstract
What makes a gene essential for cellular survival? In model organisms, such as budding yeast, systematic gene deletion studies have revealed that paralog genes are less likely to be essential than singleton genes and that this can partially be attributed to the ability of paralogs to buffer each other's loss. However, the essentiality of a gene is not a fixed property and can vary significantly across different genetic backgrounds. It is unclear to what extent paralogs contribute to this variation, as most studies have analyzed genes identified as essential in a single genetic background. Here, using gene essentiality profiles of 558 genetically heterogeneous tumor cell lines, we analyze the contribution of paralogy to variable essentiality. We find that, compared to singleton genes, paralogs are less frequently essential and that this is more evident when considering genes with multiple paralogs or with highly sequence-similar paralogs. In addition, we find that paralogs derived from whole genome duplication exhibit more variable essentiality than those derived from small-scale duplications. We provide evidence that in 13–17% of cases the variable essentiality of paralogs can be attributed to buffering relationships between paralog pairs, as evidenced by synthetic lethality. Paralog pairs derived from whole genome duplication and pairs that function in protein complexes are significantly more likely to display such synthetic lethal relationships. Overall we find that many of the observations made using a single strain of budding yeast can be extended to understand patterns of essentiality in genetically heterogeneous cancer cell lines.
Somewhat surprisingly, the majority of human genes can be mutated or deleted in individual cell lines without killing the cells. This observation raises a number of questions: which genes can be lost and why? Here we address these questions by analyzing data on which genes are essential for the growth of over 500 cancer cell lines. In general we find that paralog genes are essential in fewer cell lines than genes that are not paralogs. Paralogs are genes that have been duplicated at some point in evolutionary history, resulting in our genome having two copies of the same gene, a paralog pair. These paralog pairs are a potential source of redundancy, similar to a car having a spare tire. If this is the case, one might expect that losing one gene from a paralog pair could be tolerated by cells, due to the existence of a 'backup gene', but losing both members would cause cells to die. By analyzing the cancer cell lines we estimate this to be the case for 13–17% of paralog pairs, and that this provides an explanation for why some genes are essential in some cell lines but not others.
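The notion of variable essentiality used in this abstract can be made concrete with a small Python sketch: for each gene, take the fraction of cell lines in which it is called essential, then compare paralogs with singletons. The gene names, the boolean profiles, and the paralog labels below are hypothetical toy data, and the helper functions are illustrative rather than the authors' analysis pipeline.

```python
def essential_fraction(profile):
    """Fraction of cell lines in which the gene is called essential."""
    return sum(profile) / len(profile)


def classify(profile):
    """Label a gene by how its essentiality varies across cell lines."""
    frac = essential_fraction(profile)
    if frac == 0:
        return "never essential"
    if frac == 1:
        return "always essential"
    return "variably essential"


# Hypothetical essentiality calls across four cell lines (True = essential).
profiles = {
    "GENE_A": [True, True, True, True],
    "GENE_B": [True, False, False, True],
    "GENE_C": [False, False, False, False],
    "GENE_D": [False, True, False, False],
}
# Hypothetical annotation: does the gene have at least one paralog?
has_paralog = {"GENE_A": False, "GENE_B": True, "GENE_C": True, "GENE_D": True}


def mean_fraction(genes):
    """Average essential fraction over a group of genes."""
    return sum(essential_fraction(profiles[g]) for g in genes) / len(genes)


paralogs = [g for g in profiles if has_paralog[g]]
singletons = [g for g in profiles if not has_paralog[g]]
paralog_mean = mean_fraction(paralogs)
singleton_mean = mean_fraction(singletons)
```

With real screen data the comparison would need a proper statistical test and controls for confounders such as expression level; the sketch only shows the bookkeeping behind "essential in fewer cell lines".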
Affiliation(s)
- Barbara De Kegel
- School of Computer Science and Systems Biology Ireland, University College Dublin, Belfield, Dublin, Ireland
- Colm J. Ryan
- School of Computer Science and Systems Biology Ireland, University College Dublin, Belfield, Dublin, Ireland
15
Link clustering explains non-central and contextually essential genes in protein interaction networks. Sci Rep 2019; 9:11672. [PMID: 31406201 PMCID: PMC6690968 DOI: 10.1038/s41598-019-48273-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 05/01/2019] [Accepted: 08/01/2019] [Indexed: 01/29/2023] Open
Abstract
Recent studies have shown that many essential genes (EGs) change their essentiality across various contexts. Finding contextual EGs in pathogenic conditions may facilitate the identification of therapeutic targets. We propose link clustering as an indicator of contextual EGs that are non-central in protein-protein interaction (PPI) networks. In various human and yeast PPI networks, we found that 29–47% of EGs were better characterized by link clustering than by centrality. Importantly, non-central EGs were prone to change their essentiality across different human cell lines and between species. Compared with central EGs and non-EGs, non-central EGs had intermediate levels of expression and evolutionary conservation. In addition, non-central EGs exhibited a significant impact on communities at lower hierarchical levels, suggesting that link clustering is associated with contextual essentiality, as it depicts locally important nodes in network structures.
16
Minic Z, Dahms TES, Babu M. Chromatographic separation strategies for precision mass spectrometry to study protein-protein interactions and protein phosphorylation. J Chromatogr B Analyt Technol Biomed Life Sci 2018; 1102-1103:96-108. [PMID: 30380468 DOI: 10.1016/j.jchromb.2018.10.022] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 06/01/2018] [Revised: 10/19/2018] [Accepted: 10/22/2018] [Indexed: 11/30/2022]
Abstract
Investigating protein-protein interactions and protein phosphorylation can be of great significance when studying biological processes and human diseases at the molecular level. However, sample complexity, the presence of low-abundance proteins, and the dynamic nature of proteins often impede achieving sufficient analytical depth in proteomics research. In this regard, chromatographic separation methodologies have played a vital role in the identification and quantification of proteins in complex sample mixtures. The combination of peptide and protein fractionation techniques with advanced high-performance mass spectrometry has allowed researchers to successfully study protein-protein interactions and protein phosphorylation. Several new fractionation strategies for large-scale analysis of proteins and peptides have been developed for this purpose. These emerging chromatography methodologies have enabled the identification of several hundred protein complexes and even thousands of phosphorylation sites in a single study. In this review, we focus on current workflow strategies and chromatographic tools, highlighting their advantages and disadvantages, and examining their associated challenges and future potential.
Affiliation(s)
- Zoran Minic
- Department of Chemistry and Biomolecular Science, University of Ottawa, John L. Holmes Mass Spectrometry Facility, 10 Marie-Curie, Marion Hall, Room 02, Ottawa, ON K1N 1A2, Canada
| | - Tanya E S Dahms
- Department of Chemistry and Biochemistry, University of Regina, 3737 Wascana Parkway, Regina, SK S4S 0A2, Canada
| | - Mohan Babu
- Department of Chemistry and Biochemistry, University of Regina, 3737 Wascana Parkway, Regina, SK S4S 0A2, Canada
| |
|
17
|
Giscard PL, Wilson RC. A centrality measure for cycles and subgraphs II. Applied Network Science 2018; 3:9. [PMID: 30839787 PMCID: PMC6214294 DOI: 10.1007/s41109-018-0064-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/16/2018] [Accepted: 05/02/2018] [Indexed: 06/09/2023]
Abstract
In a recent work we introduced a measure of importance for groups of vertices in a complex network. This centrality for groups is always between 0 and 1 and induces the eigenvector centrality over vertices. Furthermore, its value over any group is the fraction of all network flows intercepted by this group. Here we provide the rigorous mathematical constructions underpinning these results via a semi-commutative extension of a number theoretic sieve. We then establish further relations between the eigenvector centrality and the centrality proposed here, showing that the latter is a proper extension of the former to groups of nodes. We finish by comparing the centrality proposed here with the notion of group-centrality introduced by Everett and Borgatti on two real-world networks: Wolfe's dataset and the protein-protein interaction network of the yeast Saccharomyces cerevisiae. In the latter case, we demonstrate that the centrality is able to distinguish protein complexes.
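For readers unfamiliar with the vertex-level eigenvector centrality that the group measure extends, a minimal sketch follows. It assumes a small hypothetical undirected graph and uses shifted power iteration (on I + A, which has the same eigenvectors as A) so that the iteration settles on any connected graph. It does not implement the group centrality of the paper.

```python
def eigenvector_centrality(adj, iterations=200, tol=1e-10):
    """Approximate eigenvector centrality by power iteration on I + A.

    adj maps each node to the set of its neighbours (undirected graph).
    Scores are normalised to sum to 1, so each lies between 0 and 1.
    """
    nodes = sorted(adj)
    score = {v: 1.0 / len(nodes) for v in nodes}
    for _ in range(iterations):
        # One multiplication by (I + A): own score plus neighbours' scores.
        new = {v: score[v] + sum(score[u] for u in adj[v]) for v in nodes}
        norm = sum(new.values())
        new = {v: x / norm for v, x in new.items()}
        if max(abs(new[v] - score[v]) for v in nodes) < tol:
            return new
        score = new
    return score

# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to c.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

centrality = eigenvector_centrality(toy)
for node, value in sorted(centrality.items()):
    print(f"{node}: {value:.3f}")
```

Node c, which touches every other node, receives the largest score, and the pendant node d the smallest.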
Affiliation(s)
- Pierre-Louis Giscard
- Department of Computer Science, University of York, Deramore Lane, Heslington, York, YO10 5GH UK
| | - Richard C. Wilson
- Department of Computer Science, University of York, Deramore Lane, Heslington, York, YO10 5GH UK
| |
|
18
|
Han SK, Kim D, Lee H, Kim I, Kim S. Divergence of Noncoding Regulatory Elements Explains Gene–Phenotype Differences between Human and Mouse Orthologous Genes. Mol Biol Evol 2018; 35:1653-1667. [DOI: 10.1093/molbev/msy056] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 12/11/2022] Open
Affiliation(s)
- Seong Kyu Han
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, Korea
| | - Donghyo Kim
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, Korea
| | - Heetak Lee
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, Korea
| | - Inhae Kim
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, Korea
| | - Sanguk Kim
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, Korea
| |
|
19
|
A High-Resolution Genome-Wide CRISPR/Cas9 Viability Screen Reveals Structural Features and Contextual Diversity of the Human Cell-Essential Proteome. Mol Cell Biol 2017; 38:MCB.00302-17. [PMID: 29038160 DOI: 10.1128/mcb.00302-17] [Citation(s) in RCA: 45] [Impact Index Per Article: 6.4] [Received: 06/02/2017] [Accepted: 09/11/2017] [Indexed: 11/20/2022] Open
Abstract
To interrogate genes essential for cell growth, proliferation and survival in human cells, we carried out a genome-wide clustered regularly interspaced short palindromic repeat (CRISPR)/Cas9 screen in a B-cell lymphoma line using a custom extended-knockout (EKO) library of 278,754 single-guide RNAs (sgRNAs) that targeted 19,084 RefSeq genes, 20,852 alternatively spliced exons, and 3,872 hypothetical genes. A new statistical analysis tool called robust analytics and normalization for knockout screens (RANKS) identified 2,280 essential genes, 234 of which were unique. Individual essential genes were validated experimentally and linked to ribosome biogenesis and stress responses. Essential genes exhibited a bimodal distribution across 10 different cell lines, consistent with a continuous variation in essentiality as a function of cell type. Genes essential in more lines had more severe fitness defects and encoded the evolutionarily conserved structural cores of protein complexes, whereas genes essential in fewer lines formed context-specific modules and encoded subunits at the periphery of essential complexes. The essentiality of individual protein residues across the proteome correlated with evolutionary conservation, structural burial, modular domains, and protein interaction interfaces. Many alternatively spliced exons in essential genes were dispensable and were enriched for disordered regions. Fitness defects were observed for 44 newly evolved hypothetical reading frames. These results illuminate the contextual nature and evolution of essential gene functions in human cells.
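The idea of essentiality varying with cell type can be made concrete by counting, for each gene, the fraction of cell lines in which it scores as essential. The sketch below uses invented data: the gene names, cell-line names, and the 0.75 cutoff for a "core" gene are all hypothetical and are not taken from the study or from RANKS.

```python
# Hypothetical screen calls: gene -> set of cell lines where it scored as essential.
essential_calls = {
    "GENE_A": {"line1", "line2", "line3", "line4"},
    "GENE_B": {"line1"},
    "GENE_C": {"line2", "line3", "line4"},
    "GENE_D": set(),
}
N_LINES = 4  # total number of cell lines screened in this toy example

def essentiality_breadth(calls, n_lines):
    """Fraction of screened cell lines in which each gene is essential."""
    return {gene: len(lines) / n_lines for gene, lines in calls.items()}

def label_gene(fraction, core_cutoff=0.75):
    """Bin a gene by breadth; the cutoff is an illustrative assumption."""
    if fraction == 0:
        return "non-essential"
    if fraction >= core_cutoff:
        return "core"
    return "context-specific"

breadth = essentiality_breadth(essential_calls, N_LINES)
labels = {gene: label_gene(value) for gene, value in breadth.items()}

for gene in sorted(breadth):
    print(f"{gene}: essential in {breadth[gene]:.0%} of lines ({labels[gene]})")
```

Treating breadth as a fraction rather than a yes/no flag mirrors the abstract's point that essentiality behaves as a graded property across cell types.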
|
20
|
Abstract
Gene essentiality is a founding concept of genetics with important implications in both fundamental and applied research. Multiple screens have been performed over the years in bacteria, yeasts, animals and more recently in human cells to identify essential genes. A mounting body of evidence suggests that gene essentiality, rather than being a static and binary property, is both context dependent and evolvable in all kingdoms of life. This concept of a non-absolute nature of gene essentiality changes our fundamental understanding of essential biological processes and could directly affect future treatment strategies for cancer and infectious diseases.
|
21
|
Young JH, Peyton M, Seok Kim H, McMillan E, Minna JD, White MA, Marcotte EM. Computational discovery of pathway-level genetic vulnerabilities in non-small-cell lung cancer. Bioinformatics 2016; 32:1373-9. [PMID: 26755624 PMCID: PMC4848405 DOI: 10.1093/bioinformatics/btw010] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 12/04/2014] [Accepted: 01/07/2016] [Indexed: 01/09/2023] Open
Abstract
Motivation: Novel approaches are needed for discovery of targeted therapies for non-small-cell lung cancer (NSCLC) that are specific to certain patients. Whole genome RNAi screening of lung cancer cell lines provides an ideal source for determining candidate drug targets. Results: Unsupervised learning algorithms uncovered patterns of differential vulnerability across lung cancer cell lines to loss of functionally related genes. Such genetic vulnerabilities represent candidate targets for therapy and are found to be involved in splicing, translation and protein folding. In particular, many NSCLC cell lines were especially sensitive to the loss of components of the LSm2-8 protein complex or the CCT/TRiC chaperonin. Different vulnerabilities were also found for different cell line subgroups. Furthermore, the predicted vulnerability of a single adenocarcinoma cell line to loss of the Wnt pathway was experimentally validated with screening of small-molecule Wnt inhibitors against an extensive cell line panel. Availability and implementation: The clustering algorithm is implemented in Python and is freely available at https://bitbucket.org/youngjh/nsclc_paper. Contact: marcotte@icmb.utexas.edu or jon.young@utexas.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jonathan H Young
- Institute for Computational Engineering and Sciences, University of Texas at Austin, Austin, TX, USA, Center for Systems and Synthetic Biology and Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, USA
| | - Michael Peyton
- Hamon Center for Therapeutic Oncology Research, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Hyun Seok Kim
- Severance Biomedical Science Institute, Yonsei University College of Medicine, Seoul, Korea, and
| | - Elizabeth McMillan
- Department of Cell Biology, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - John D Minna
- Hamon Center for Therapeutic Oncology Research, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Michael A White
- Department of Cell Biology, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Edward M Marcotte
- Institute for Computational Engineering and Sciences, University of Texas at Austin, Austin, TX, USA, Center for Systems and Synthetic Biology and Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, USA
| |
|
22
|
Castrillo JI, Oliver SG. Alzheimer's as a Systems-Level Disease Involving the Interplay of Multiple Cellular Networks. Methods Mol Biol 2016; 1303:3-48. [PMID: 26235058 DOI: 10.1007/978-1-4939-2627-5_1] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Indexed: 12/24/2022]
Abstract
Alzheimer's disease (AD), and many neurodegenerative disorders, are multifactorial in nature. They involve a combination of genomic, epigenomic, interactomic and environmental factors. Progress is being made, and these complex diseases are beginning to be understood as having their origin in altered states of biological networks at the cellular level. In the case of AD, genomic susceptibility and mechanisms leading to (or accompanying) the impairment of the central Amyloid Precursor Protein (APP) processing and tau networks are widely accepted as major contributors to the diseased state. The derangement of these networks may result in both the gain and loss of functions, increased generation of toxic species (e.g., toxic soluble oligomers and aggregates) and imbalances, whose effects can propagate to supra-cellular levels. Although well sustained by empirical data and widely accepted, this global perspective often overlooks the essential roles played by the main counteracting homeostatic networks (e.g., protein quality control/proteostasis, unfolded protein response, protein folding chaperone networks, disaggregases, ER-associated degradation/ubiquitin proteasome system, endolysosomal network, autophagy, and other stress-protective and clearance networks), whose relevance to AD is just beginning to be fully realized. In this chapter, an integrative perspective is presented. Alzheimer's disease is characterized as the result of (a) intrinsic genomic/epigenomic susceptibility and (b) a continued dynamic interplay between the deranged networks and the central homeostatic networks of nerve cells. This interplay of networks will underlie both the onset and rate of progression of the disease in each individual. Integrative Systems Biology approaches are required to effect its elucidation.
Comprehensive Systems Biology experiments at different 'omics levels in simple model organisms engineered to recapitulate the basic features of AD may illuminate the onset and sequence of events underlying AD. Indeed, studies of models of AD in simple organisms, differentiated cells in culture and rodents are beginning to offer hope that the onset and progression of AD, if detected at an early stage, may be stopped, delayed, or even reversed, by activating or modulating networks involved in proteostasis and the clearance of toxic species. In practice, the incorporation of next-generation neuroimaging, high-throughput and computational approaches is opening the way towards early diagnosis well before irreversible cell death. Thus, the presence or co-occurrence of (a) accumulation of toxic Aβ oligomers and tau species; (b) altered splicing and transcriptome patterns; (c) impaired redox, proteostatic, and metabolic networks; together with (d) compromised homeostatic capacities may constitute relevant 'AD hallmarks at the cellular level' towards reliable and early diagnosis. From here, preventive lifestyle changes and tailored therapies may be investigated, such as combined strategies aimed at both lowering the production of toxic species and potentiating homeostatic responses, in order to prevent or delay the onset, and arrest, alleviate, or even reverse the progression of the disease.
Affiliation(s)
- Juan I Castrillo
- Department of Biochemistry & Cambridge Systems Biology Centre, University of Cambridge, Sanger Building, 80 Tennis Court Road, Cambridge, CB2 1GA, UK
| | | |
|
23
|
Liu G, Yong MYJ, Yurieva M, Srinivasan KG, Liu J, Lim JSY, Poidinger M, Wright GD, Zolezzi F, Choi H, Pavelka N, Rancati G. Gene Essentiality Is a Quantitative Property Linked to Cellular Evolvability. Cell 2015; 163:1388-99. [PMID: 26627736 DOI: 10.1016/j.cell.2015.10.069] [Citation(s) in RCA: 102] [Impact Index Per Article: 11.3] [Received: 11/17/2014] [Revised: 08/01/2015] [Accepted: 10/20/2015] [Indexed: 11/24/2022]
Abstract
Gene essentiality is typically determined by assessing the viability of the corresponding mutant cells, but this definition fails to account for the ability of cells to adaptively evolve to genetic perturbations. Here, we performed a stringent screen to assess the degree to which Saccharomyces cerevisiae cells can survive the deletion of ~1,000 individual "essential" genes and found that ~9% of these genetic perturbations could in fact be overcome by adaptive evolution. Our analyses uncovered a genome-wide gradient of gene essentiality, with certain essential cellular functions being more "evolvable" than others. Ploidy changes were prevalent among the evolved mutant strains, and aneuploidy of a specific chromosome was adaptive for a class of evolvable nucleoporin mutants. These data justify a quantitative redefinition of gene essentiality that incorporates both viability and evolvability of the corresponding mutant cells and will enable selection of therapeutic targets associated with lower risk of emergence of drug resistance.
Affiliation(s)
- Gaowen Liu
- Institute of Medical Biology (IMB), Agency for Science, Technology and Research (A*STAR), Singapore 138648, Singapore; School of Biological Sciences, Nanyang Technological University, Singapore 637551, Singapore
| | - Mei Yun Jacy Yong
- Institute of Medical Biology (IMB), Agency for Science, Technology and Research (A*STAR), Singapore 138648, Singapore
| | - Marina Yurieva
- Singapore Immunology Network (SIgN), A*STAR, Singapore 138648, Singapore
| | | | - Jaron Liu
- Institute of Medical Biology (IMB), Agency for Science, Technology and Research (A*STAR), Singapore 138648, Singapore
| | - John Soon Yew Lim
- Institute of Medical Biology (IMB), Agency for Science, Technology and Research (A*STAR), Singapore 138648, Singapore
| | - Michael Poidinger
- Singapore Immunology Network (SIgN), A*STAR, Singapore 138648, Singapore
| | - Graham Daniel Wright
- Institute of Medical Biology (IMB), Agency for Science, Technology and Research (A*STAR), Singapore 138648, Singapore
| | - Francesca Zolezzi
- Singapore Immunology Network (SIgN), A*STAR, Singapore 138648, Singapore
| | - Hyungwon Choi
- Saw Swee Hock School of Public Health, National University of Singapore (NUS) and National University Health System, Singapore 117549, Singapore
| | - Norman Pavelka
- Singapore Immunology Network (SIgN), A*STAR, Singapore 138648, Singapore
| | - Giulia Rancati
- Institute of Medical Biology (IMB), Agency for Science, Technology and Research (A*STAR), Singapore 138648, Singapore; School of Biological Sciences, Nanyang Technological University, Singapore 637551, Singapore
| |
|
24
|
Zhang Z, Ren Q. Why are essential genes essential? - The essentiality of Saccharomyces genes. Microbial Cell 2015; 2:280-287. [PMID: 28357303 PMCID: PMC5349100 DOI: 10.15698/mic2015.08.218] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Indexed: 12/21/2022]
Abstract
Essential genes are defined as those required for the survival of an organism or a cell. They are of particular interest, not only for their essential biological functions, but also for practical applications, such as identifying effective drug targets against pathogenic bacteria and fungi. The budding yeast Saccharomyces cerevisiae has approximately 6,000 open reading frames, 15 to 20% of which are deemed essential. Some of the essential genes, however, appear to perform non-essential functions, such as aging and cell death, while many of the non-essential genes play critical roles in cell survival. In this paper, we reviewed and analyzed the levels of essentiality of the Saccharomyces cerevisiae genes and grouped the genes into four categories: (1) Conditional essential: essential only under certain circumstances or growth conditions; (2) Essential: required for survival under optimal growth conditions; (3) Redundant essential: synthetic lethal due to redundant pathways or gene duplication; and (4) Absolute essential: the minimal genes required for maintaining a cellular life under a stress-free environment. The essential and non-essential functions of the essential genes were further analyzed.
Affiliation(s)
- Zhaojie Zhang
- Department of Zoology and Physiology, University of Wyoming, Laramie, WY 82071, USA
| | - Qun Ren
- Department of Zoology and Physiology, University of Wyoming, Laramie, WY 82071, USA
| |
|
25
|
Luo J, Qi Y. Identification of Essential Proteins Based on a New Combination of Local Interaction Density and Protein Complexes. PLoS One 2015; 10:e0131418. [PMID: 26125187 PMCID: PMC4488326 DOI: 10.1371/journal.pone.0131418] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 01/20/2015] [Accepted: 06/02/2015] [Indexed: 11/18/2022] Open
Abstract
Background: Computational approaches aided by computer science have been used to predict essential proteins and are faster than expensive, time-consuming, laborious experimental approaches. However, the performance of such approaches is still poor, making practical applications of computational approaches difficult in some fields. Hence, the development of more suitable and efficient computing methods is necessary for identification of essential proteins. Method: In this paper, we propose a new method for predicting essential proteins in a protein interaction network, local interaction density combined with protein complexes (LIDC), based on statistical analyses of essential proteins and protein complexes. First, we introduce a new local topological centrality, local interaction density (LID), of the yeast PPI network; second, we discuss a new integration strategy for multiple bioinformatics. The LIDC method was then developed through a combination of LID and protein complex information based on our new integration strategy. The purpose of LIDC is discovery of important features of essential proteins with their neighbors in real protein complexes, thereby improving the efficiency of identification. Results: Experimental results based on three different PPI (protein-protein interaction) networks of Saccharomyces cerevisiae and Escherichia coli showed that LIDC outperformed classical topological centrality measures and some recent combinational methods. Moreover, on the MIPS datasets, LIDC improved performance over all nine reference methods (i.e., DC, BC, NC, LID, PeC, CoEWC, WDC, ION, and UC). Conclusions: LIDC is more effective for the prediction of essential proteins than other recently developed methods.
Affiliation(s)
- Jiawei Luo
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
| | - Yi Qi
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
| |
|
26
|
Caufield JH, Abreu M, Wimble C, Uetz P. Protein complexes in bacteria. PLoS Comput Biol 2015; 11:e1004107. [PMID: 25723151 PMCID: PMC4344305 DOI: 10.1371/journal.pcbi.1004107] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 07/31/2014] [Accepted: 01/02/2015] [Indexed: 01/26/2023] Open
Abstract
Large-scale analyses of protein complexes have recently become available for Escherichia coli and Mycoplasma pneumoniae, yielding 443 and 116 heteromultimeric soluble protein complexes, respectively. We have coupled the results of these mass spectrometry-characterized protein complexes with the 285 “gold standard” protein complexes identified by EcoCyc. A comparison with databases of gene orthology, conservation, and essentiality identified proteins conserved or lost in complexes of other species. For instance, of 285 “gold standard” protein complexes in E. coli, less than 10% are fully conserved among a set of 7 distantly-related bacterial “model” species. Complex conservation follows one of three models: well-conserved complexes, complexes with a conserved core, and complexes with partial conservation but no conserved core. Expanding the comparison to 894 distinct bacterial genomes illustrates fractional conservation and the limits of co-conservation among components of protein complexes: just 14 out of 285 model protein complexes are perfectly conserved across 95% of the genomes used, yet we predict more than 180 may be partially conserved across at least half of the genomes. No clear relationship between gene essentiality and protein complex conservation is observed, as even poorly conserved complexes contain a significant number of essential proteins. Finally, we identify 183 complexes containing well-conserved components and uncharacterized proteins which will be interesting targets for future experimental studies. Though more than 20,000 binary protein-protein interactions have been published for a few well-studied bacterial species, the results rarely capture the full extent to which proteins take part in complexes. Here, we use experimentally-observed protein complexes from E. coli or Mycoplasma pneumoniae, as well as gene orthology, to predict protein complexes across many species of bacteria. 
Surprisingly, the majority of protein complexes is not conserved, demonstrating an unexpected evolutionary flexibility. We also observe broader trends within protein complex conservation, especially in genome-reduced species with minimal sets of protein complexes.
Affiliation(s)
- J. Harry Caufield
- Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, Virginia, United States of America
| | - Marco Abreu
- Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, Virginia, United States of America
| | - Christopher Wimble
- Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, Virginia, United States of America
| | - Peter Uetz
- Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, Virginia, United States of America
| |
|
27
|
Protein-protein Interaction Networks of E. coli and S. cerevisiae are similar. Sci Rep 2014; 4:7187. [PMID: 25431098 PMCID: PMC5384207 DOI: 10.1038/srep07187] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 07/31/2014] [Accepted: 10/29/2014] [Indexed: 12/21/2022] Open
Abstract
Novel high-throughput binary interaction data for E. coli have only recently become available, allowing us to compare experimentally obtained protein-protein interaction networks of prokaryotes and eukaryotes (i.e. E. coli and S. cerevisiae). Utilizing binary-Y2H, co-complex and binary literature-curated interaction sets in both organisms, we found that characteristics of interaction sets that were determined with the same experimental methods were strikingly similar. While essentiality is frequently considered a question of a protein's increasing number of interactions, we found that binary-Y2H interactions failed to show such a trend in both organisms. Furthermore, essential genes are enriched in protein complexes in both organisms. In turn, binary-Y2H interactions hold more bottleneck interactions than co-complex interactions, while both binary-Y2H and co-complex interactions are strongly enriched among co-regulated proteins and transcription factors. We discuss whether such similarities are a consequence of the underlying methodology or rather reflect truly different biological patterns.
|