1
Pan X, Zeng T, Zhang YH, Chen L, Feng K, Huang T, Cai YD. Investigation and Prediction of Human Interactome Based on Quantitative Features. Front Bioeng Biotechnol 2020; 8:730. [PMID: 32766217 PMCID: PMC7379396 DOI: 10.3389/fbioe.2020.00730] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 04/19/2020] [Accepted: 06/09/2020] [Indexed: 01/27/2023]
Abstract
Protein is one of the most significant components of all living creatures. All significant and essential biological structures and functions rely on proteins and their respective biological functions. However, proteins cannot perform their unique biological roles independently. They have to interact with each other to realize the complicated biological processes in all living creatures, including human beings. In other words, proteins depend on protein-protein interactions (PPIs) to realize their significant effects. Thus, the relative significance and quantitative contribution of candidate PPI features must be determined. According to previous studies, 258 physical and chemical characteristics of proteins have been reported and confirmed to affect the interaction efficiency of the related proteins. Among such features, essential physicochemical features of proteins such as stoichiometric balance, protein abundance, molecular weight, and charge distribution have been validated as significant and irreplaceable for PPIs. Therefore, in this study, we, on the one hand, presented a novel computational framework to identify the key factors affecting PPIs with Boruta feature selection (BFS), Monte Carlo feature selection (MCFS), and incremental feature selection (IFS), and, on the other hand, built a quantitative decision-rule system to evaluate the potential PPIs under real conditions with random forest (RF) and RIPPER algorithms, thereby supplying several new insights into the detailed biological mechanisms of complicated PPIs. The main datasets and codes can be downloaded at https://github.com/xypan1232/Mass-PPI.
Affiliation(s)
- Xiaoyong Pan
  - School of Life Sciences, Shanghai University, Shanghai, China
  - Key Laboratory of System Control and Information Processing, Ministry of Education of China, Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, China
- Tao Zeng
  - Key Laboratory of Systems Biology, Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, China
- Yu-Hang Zhang
  - Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
- Lei Chen
  - College of Information Engineering, Shanghai Maritime University, Shanghai, China
- Kaiyan Feng
  - Department of Computer Science, Guangdong AIB Polytechnic, Guangzhou, China
- Tao Huang
  - Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
- Yu-Dong Cai
  - School of Life Sciences, Shanghai University, Shanghai, China
2
An Analysis of the Anti-Neuropathic Effects of Qi She Pill Based on Network Pharmacology. Evidence-Based Complementary and Alternative Medicine 2020; 2020:7193832. [PMID: 32454869 PMCID: PMC7222608 DOI: 10.1155/2020/7193832] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 09/10/2019] [Revised: 01/16/2020] [Accepted: 01/30/2020] [Indexed: 12/12/2022]
Abstract
Background: Qi She Pill (QSP) is a traditional prescription for the treatment of neuropathic pain (NP) that is widely used in China. However, no network pharmacology studies of QSP in the treatment of NP have been conducted to date. Objective: To verify the potential pharmacological effects of QSP on NP, its components were analyzed via target docking and network analysis, and network pharmacology methods were used to study the interactions of its components. Materials and Methods: Information on pharmaceutically active compounds in QSP and gene information related to NP were obtained from public databases, and a compound-target network and a protein-protein interaction network were constructed to study the mechanism of action of QSP in the treatment of NP. This mechanism was analyzed via Gene Ontology (GO) biological process annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment, and the drug-like component-target-pathway network was constructed. Results: The compound-target network contained 60 compounds and 444 corresponding targets. The key active compounds included quercetin and beta-sitosterol. Key targets included PTGS2 and PTGS1. The protein-protein interaction network of the active ingredients of QSP in the treatment of NP featured 48 proteins, including DRD2, CHRM, β2-adrenergic receptor, HTR2A, and calcitonin gene-related peptide. In total, 53 GO entries, including 35 biological process items, 7 molecular function items, and 11 cell-related items, were identified. In addition, eight relevant KEGG pathways were identified, including calcium, neuroactive ligand-receptor interaction, and cAMP signaling pathways. Conclusion: Network pharmacology can help clarify the role and mechanism of QSP in the treatment of NP and provide a foundation for further research.
3
Genes dysregulated in the blood of people with Williams syndrome are enriched in protein-coding genes positively selected in humans. Eur J Med Genet 2020; 63:103828. [DOI: 10.1016/j.ejmg.2019.103828] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/18/2019] [Revised: 11/09/2019] [Accepted: 12/21/2019] [Indexed: 12/29/2022]
4
Dobon B, Montanucci L, Peretó J, Bertranpetit J, Laayouni H. Gene connectivity and enzyme evolution in the human metabolic network. Biol Direct 2019; 14:17. [PMID: 31481097 PMCID: PMC6724310 DOI: 10.1186/s13062-019-0248-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 02/22/2019] [Accepted: 08/21/2019] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Determining the factors involved in the likelihood of a gene being under adaptive selection is still a challenging goal in Evolutionary Biology. Here, we perform an evolutionary analysis of the human metabolic genes to explore the associations between network structure and the presence and strength of natural selection in the genes whose products are involved in metabolism. Purifying and positive selection are estimated at interspecific (among mammals) and intraspecific (among human populations) levels, and the connections between enzymatic reactions are differentiated between incoming (in-degree) and outgoing (out-degree) links. RESULTS: We confirm that purifying selection has been stronger in highly connected genes. Long-term positive selection has targeted poorly connected enzymes, whereas short-term positive selection has targeted different enzymes depending on whether the selective sweep has reached fixation in the population: genes under a complete selective sweep are poorly connected, whereas those under an incomplete selective sweep have high out-degree connectivity. The last steps of pathways are more conserved due to stronger purifying selection, with long-term positive selection targeting preferentially enzymes that catalyze the first steps. However, short-term positive selection has targeted enzymes that catalyze the last steps in the metabolic network. Strong signals of positive selection have been found for metabolic processes involved in lipid transport and membrane fluidity and permeability. CONCLUSIONS: Our analysis highlights the importance of analyzing the same biological system at different evolutionary timescales to understand the evolution of metabolic genes and of distinguishing between incoming and outgoing links in a metabolic network. Short-term positive selection has targeted enzymes with a different connectivity profile depending on the completeness of the selective sweep, while long-term positive selection has targeted genes with fewer connections that code for enzymes that catalyze the first steps in the network. REVIEWERS: This article was reviewed by Diamantis Sellis and Brandon Invergo.
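The distinction this abstract draws between incoming (in-degree) and outgoing (out-degree) links can be illustrated with a minimal Python sketch. It assumes the metabolic network is given as a plain list of directed edges; the function name `degree_profile` and the toy enzymes `E1` to `E4` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def degree_profile(edges):
    """Return {node: (in_degree, out_degree)} for a directed edge list."""
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    nodes = set()
    for src, dst in edges:
        out_deg[src] += 1  # src has one more outgoing link
        in_deg[dst] += 1   # dst has one more incoming link
        nodes.update((src, dst))
    return {n: (in_deg[n], out_deg[n]) for n in sorted(nodes)}


# Hypothetical toy pathway: E1 feeds E2 and E3, which both feed E4.
toy_edges = [("E1", "E2"), ("E1", "E3"), ("E2", "E4"), ("E3", "E4")]
profile = degree_profile(toy_edges)
```

In this toy pathway, `E1` sits at a first step (no incoming links, two outgoing) and `E4` at a last step (two incoming links, none outgoing), which is the kind of positional contrast the abstract relates to selection.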
Affiliation(s)
- Begoña Dobon
  - Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, Dr. Aiguader 88, 08003, Barcelona, Catalonia, Spain
- Ludovica Montanucci
  - Dipartimento di Biomedicina Comparata e Alimentazione, Università degli Studi di Padova, Padua, Italy
- Juli Peretó
  - Institute for Integrative Systems Biology I2SysBio (University of Valencia-CSIC) and Department of Biochemistry and Molecular Biology, University of Valencia, Valencia, Spain
- Jaume Bertranpetit
  - Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, Dr. Aiguader 88, 08003, Barcelona, Catalonia, Spain
- Hafid Laayouni
  - Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, Dr. Aiguader 88, 08003, Barcelona, Catalonia, Spain
  - Bioinformatics Studies, ESCI-UPF, Pg. Pujades 1, 08003, Barcelona, Catalonia, Spain
5
Vicens A, Posada D. Selective Pressures on Human Cancer Genes along the Evolution of Mammals. Genes (Basel) 2018; 9:genes9120582. [PMID: 30487452 PMCID: PMC6316132 DOI: 10.3390/genes9120582] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 10/30/2018] [Revised: 11/21/2018] [Accepted: 11/21/2018] [Indexed: 01/01/2023]
Abstract
Cancer is a disease driven by both somatic mutations that increase survival and proliferation of cell lineages and the evolution of genes associated with cancer risk in populations. Several genes associated with cancer in humans, hereafter cancer genes, show evidence of germline positive selection among species. Taking advantage of a large collection of mammalian genomes, we systematically looked for signatures of germline positive selection in 430 cancer genes available in COSMIC. We identified 40 cancer genes with a robust signal of positive selection in mammals. We found evidence for fewer selective constraints (a higher ratio of non-synonymous substitutions per non-synonymous site to synonymous substitutions per synonymous site, dN/dS) and a higher incidence of positive selection (more positively selected sites) in cancer genes bearing germline and recessive mutations that predispose to cancer. This finding suggests a potential association between relaxed selection, positive selection, and risk of hereditary cancer. On the other hand, we did not find significant differences in terms of tissue or gene type. Human cancer genes under germline positive selection in mammals are significantly enriched in the processes of DNA repair, with high presence of Fanconi anaemia/Breast Cancer A (FA/BRCA) pathway components and T cell proliferation genes. We also show that the inferred positively selected sites in the two genes with the strongest signal of positive selection, i.e., BRCA2 and PTPRC, are in regions of functional relevance, which could be relevant to cancer susceptibility.
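The dN/dS ratio described in this abstract can be made concrete with a small Python sketch. It is a deliberate simplification: it uses raw counts with no correction for multiple substitutions, whereas studies of this kind typically estimate the ratio with codon-substitution models. The function name `dn_ds` and the example counts are hypothetical.

```python
def dn_ds(nonsyn_subs, nonsyn_sites, syn_subs, syn_sites):
    """Uncorrected dN/dS from raw substitution and site counts.

    dN = non-synonymous substitutions per non-synonymous site
    dS = synonymous substitutions per synonymous site
    """
    if nonsyn_sites <= 0 or syn_sites <= 0:
        raise ValueError("site counts must be positive")
    d_n = nonsyn_subs / nonsyn_sites
    d_s = syn_subs / syn_sites
    if d_s == 0:
        raise ValueError("dS is zero; the ratio is undefined")
    return d_n / d_s


# Hypothetical counts: dN = 50/200 = 0.25 and dS = 50/100 = 0.5.
omega = dn_ds(nonsyn_subs=50, nonsyn_sites=200, syn_subs=50, syn_sites=100)
```

By the usual convention, a ratio below 1 is read as purifying selection, a ratio near 1 as neutral evolution, and a ratio above 1 as positive selection, so a higher value for a gene is consistent with the relaxed constraint the abstract reports.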
Affiliation(s)
- Alberto Vicens
  - Department of Biochemistry, Genetics and Immunology, University of Vigo, 36310 Vigo, Spain.
  - Biomedical Research Center (CINBIO), University of Vigo, 36310 Vigo, Spain.
- David Posada
  - Department of Biochemistry, Genetics and Immunology, University of Vigo, 36310 Vigo, Spain.
  - Biomedical Research Center (CINBIO), University of Vigo, 36310 Vigo, Spain.
  - Galicia Sur Health Research Institute, 36310 Vigo, Spain.
6
Zhao ZM, Campbell MC, Li N, Lee DSW, Zhang Z, Townsend JP. Detection of Regional Variation in Selection Intensity within Protein-Coding Genes Using DNA Sequence Polymorphism and Divergence. Mol Biol Evol 2018; 34:3006-3022. [PMID: 28962009 DOI: 10.1093/molbev/msx213] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 02/07/2023]
Abstract
Numerous approaches have been developed to infer natural selection based on the comparison of polymorphism within species and divergence between species. These methods are especially powerful for the detection of uniform selection operating across a gene. However, empirical analyses have demonstrated that regions of protein-coding genes exhibiting clusters of amino acid substitutions are subject to different levels of selection relative to other regions of the same gene. To quantify this heterogeneity of selection within coding sequences, we developed Model Averaged Site Selection via Poisson Random Field (MASS-PRF). MASS-PRF identifies an ensemble of intragenic clustering models for polymorphic and divergent sites. This ensemble of models is used within the Poisson Random Field framework to estimate selection intensity on a site-by-site basis. Using simulations, we demonstrate that MASS-PRF has high power to detect clusters of amino acid variants in small genic regions, can reliably estimate the probability of a variant occurring at each nucleotide site in sequence data, and is robust to historical demographic trends and recombination. We applied MASS-PRF to human gene polymorphism derived from the 1000 Genomes Project and divergence data from the common chimpanzee. On the basis of this analysis, we discovered striking regional variation in selection intensity, indicative of positive or negative selection, in well-defined domains of genes that have previously been associated with neurological processing, immunity, and reproduction. We suggest that amino acid-altering substitutions within these regions likely are or have been selectively advantageous in the human lineage, playing important roles in protein function.
Affiliation(s)
- Zi-Ming Zhao
  - Department of Biostatistics, Yale University, New Haven, CT
- Michael C Campbell
  - Department of Biostatistics, Yale University, New Haven, CT
  - Department of Biology, Howard University, Washington, DC
- Ning Li
  - Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT
- Daniel S W Lee
  - Department of Biostatistics, Yale University, New Haven, CT
- Zhang Zhang
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Jeffrey P Townsend
  - Department of Biostatistics, Yale University, New Haven, CT
  - Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT
7
Hart MW, Stover DA, Guerra V, Mozaffari SV, Ober C, Mugal CF, Kaj I. Positive selection on human gamete-recognition genes. PeerJ 2018; 6:e4259. [PMID: 29340252 PMCID: PMC5767332 DOI: 10.7717/peerj.4259] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 09/19/2017] [Accepted: 12/21/2017] [Indexed: 01/29/2023]
Abstract
Coevolution of genes that encode interacting proteins expressed on the surfaces of sperm and eggs can lead to variation in reproductive compatibility between mates and reproductive isolation between members of different species. Previous studies in mice and other mammals have focused in particular on evidence for positive or diversifying selection that shapes the evolution of genes that encode sperm-binding proteins expressed in the egg coat or zona pellucida (ZP). By fitting phylogenetic models of codon evolution to data from the 1000 Genomes Project, we identified candidate sites evolving under diversifying selection in the human genes ZP3 and ZP2. We also identified one candidate site under positive selection in C4BPA, which encodes a repetitive protein similar to the mouse protein ZP3R that is expressed in the sperm head and binds to the ZP at fertilization. Results from several additional analyses that applied population genetic models to the same data were consistent with the hypothesis of selection on those candidate sites leading to coevolution of sperm- and egg-expressed genes. By contrast, we found no candidate sites under selection in a fourth gene (ZP1) that encodes an egg coat structural protein not directly involved in sperm binding. Finally, we found that two of the candidate sites (in C4BPA and ZP2) were correlated with variation in family size and birth rate among Hutterite couples, and those two candidate sites were also in linkage disequilibrium in the same Hutterite study population. All of these lines of evidence are consistent with predictions from a previously proposed hypothesis of balancing selection on epistatic interactions between C4BPA and ZP3 at fertilization that lead to the evolution of co-adapted allele pairs. Such patterns also suggest specific molecular traits that may be associated with both natural reproductive variation and clinical infertility.
Affiliation(s)
- Michael W Hart
  - Department of Biological Sciences, Simon Fraser University, Burnaby, British Columbia, Canada
- Daryn A Stover
  - School of Mathematical and Natural Sciences, Arizona State University Colleges at Lake Havasu City, Lake Havasu City, AZ, USA
- Vanessa Guerra
  - Department of Biological Sciences, Simon Fraser University, Burnaby, British Columbia, Canada
- Sahar V Mozaffari
  - Department of Human Genetics, University of Chicago, Chicago, IL, USA
- Carole Ober
  - Department of Human Genetics, University of Chicago, Chicago, IL, USA
- Carina F Mugal
  - Department of Ecology and Genetics, Uppsala University, Uppsala, Sweden
- Ingemar Kaj
  - Department of Mathematics, Uppsala University, Uppsala, Sweden
8
Abstract
The degree to which adaptation in recent human evolution shapes genetic variation remains controversial. This is in part due to the limited evidence in humans for classic "hard selective sweeps", wherein a novel beneficial mutation rapidly sweeps through a population to fixation. However, positive selection may often proceed via "soft sweeps" acting on mutations already present within a population. Here, we examine recent positive selection across six human populations using a powerful machine learning approach that is sensitive to both hard and soft sweeps. We found evidence that soft sweeps are widespread and account for the vast majority of recent human adaptation. Surprisingly, our results also suggest that linked positive selection affects patterns of variation across much of the genome, and may increase the frequencies of deleterious mutations. Our results also reveal insights into the role of sexual selection, cancer risk, and central nervous system development in recent human evolution.
Affiliation(s)
- Daniel R. Schrider
  - Department of Genetics, Rutgers University, Piscataway, NJ
  - Human Genetics Institute of New Jersey, Rutgers University, Piscataway, NJ
- Andrew D. Kern
  - Department of Genetics, Rutgers University, Piscataway, NJ
  - Human Genetics Institute of New Jersey, Rutgers University, Piscataway, NJ
9
Taub DR, Page J. Molecular Signatures of Natural Selection for Polymorphic Genes of the Human Dopaminergic and Serotonergic Systems: A Review. Front Psychol 2016; 7:857. [PMID: 27375535 PMCID: PMC4896960 DOI: 10.3389/fpsyg.2016.00857] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 01/16/2016] [Accepted: 05/24/2016] [Indexed: 12/21/2022]
Abstract
A large body of research has examined the behavioral and mental health consequences of polymorphisms in genes of the dopaminergic and serotonergic systems. Along with this, there has been considerable interest in the possibility that these polymorphisms have developed and/or been maintained due to the action of natural selection. Episodes of natural selection on a gene are expected to leave molecular “footprints” in the DNA sequences of the gene and adjacent genomic regions. Here we review the research literature investigating molecular signals of selection for genes of the dopaminergic and serotonergic systems. The gene SLC6A4, which codes for a serotonin transport protein, was the one gene for which there was consistent support from multiple studies for a selective episode. Positive selection on SLC6A4 appears to have been initiated approximately 20,000–25,000 years ago in East Asia and possibly in Europe. There are scattered reports of molecular signals of selection for other neurotransmitter genes, but these have generally failed to replicate across studies. In spite of speculation in the literature about selection on these genes, current evidence from population genomic analyses supports selectively neutral processes, such as genetic drift and population dynamics, as the principal drivers of recent evolution in dopaminergic and serotonergic genes other than SLC6A4.
Affiliation(s)
- Daniel R Taub
  - Department of Biology, Southwestern University, Georgetown, TX, USA
- Joshua Page
  - Department of Biology, Southwestern University, Georgetown, TX, USA
  - School of Medicine, Washington University, St Louis, MO, USA
10
Positive Selection and Centrality in the Yeast and Fly Protein-Protein Interaction Networks. BioMed Research International 2016; 2016:4658506. [PMID: 27119079 PMCID: PMC4826914 DOI: 10.1155/2016/4658506] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 12/29/2015] [Accepted: 03/07/2016] [Indexed: 01/28/2023]
Abstract
Proteins within a molecular network are expected to be subject to different selective pressures depending on their relative hierarchical positions. However, it is not obvious what genes within a network should be more likely to evolve under positive selection. On one hand, only mutations at genes with a relatively high degree of control over adaptive phenotypes (such as those encoding highly connected proteins) are expected to be “seen” by natural selection. On the other hand, a high degree of pleiotropy at these genes is expected to hinder adaptation. Previous analyses of the human protein-protein interaction network have shown that genes under long-term, recurrent positive selection (as inferred from interspecific comparisons) tend to act at the periphery of the network. It is unknown, however, whether these trends apply to other organisms. Here, we show that long-term positive selection has preferentially targeted the periphery of the yeast interactome. Conversely, in flies, genes under positive selection encode significantly more connected and central proteins. These observations are not due to covariation of genes' adaptability and centrality with confounding factors. Therefore, the distribution of proteins encoded by genes under recurrent positive selection across protein-protein interaction networks varies from one species to another.