1. Castanho EN, Aidos H, Madeira SC. Biclustering data analysis: a comprehensive survey. Brief Bioinform 2024; 25:bbae342. [PMID: 39007596 PMCID: PMC11247412 DOI: 10.1093/bib/bbae342] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Revised: 05/16/2024] [Accepted: 07/01/2024] [Indexed: 07/16/2024] Open
Abstract
Biclustering, the simultaneous clustering of rows and columns of a data matrix, has proved its effectiveness in bioinformatics due to its capacity to produce local instead of global models, evolving from a key technique used in gene expression data analysis into one of the most used approaches for pattern discovery and identification of biological modules, used in both descriptive and predictive learning tasks. This survey presents a comprehensive overview of biclustering. It proposes an updated taxonomy for its fundamental components (bicluster, biclustering solution, biclustering algorithms, and evaluation measures) and applications. We unify scattered concepts in the literature with new definitions to accommodate the diversity of data types (such as tabular, network, and time series data) and the specificities of biological and biomedical data domains. We further propose a pipeline for biclustering data analysis and discuss practical aspects of incorporating biclustering in real-world applications. We highlight prominent application domains, particularly in bioinformatics, and identify typical biclusters to illustrate the analysis output. Moreover, we discuss important aspects to consider when choosing, applying, and evaluating a biclustering algorithm. We also relate biclustering with other data mining tasks (clustering, pattern mining, classification, triclustering, N-way clustering, and graph mining). Thus, it provides theoretical and practical guidance on biclustering data analysis, demonstrating its potential to uncover actionable insights from complex datasets.
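As a rough illustration of the "local instead of global" idea in the abstract above, the sketch below treats a bicluster as a pair of row and column index subsets and checks whether the selected submatrix is constant. The function names, the toy matrix, and the constant-value criterion are assumptions chosen for illustration; the survey covers many other bicluster types and does not prescribe this code.

```python
# Illustrative only: a bicluster is modelled here as (row subset, column subset).
# The helper names and the toy data are hypothetical, not taken from the survey.

def extract_bicluster(matrix, rows, cols):
    """Return the submatrix selected by the given row and column indices."""
    return [[matrix[r][c] for c in cols] for r in rows]


def is_constant(submatrix, tol=1e-9):
    """Check whether all values in the submatrix are equal within a tolerance."""
    values = [v for row in submatrix for v in row]
    return max(values) - min(values) <= tol


# Toy matrix: rows 0 and 2 agree with each other on columns 1 and 3 only,
# so the pattern is local to that submatrix and invisible as a global model.
data = [
    [0.1, 5.0, 2.3, 5.0],
    [4.2, 1.1, 0.7, 3.3],
    [9.9, 5.0, 6.1, 5.0],
]

bicluster = extract_bicluster(data, rows=[0, 2], cols=[1, 3])
print(bicluster, is_constant(bicluster), is_constant(data))
```

The constant-value check is the simplest possible coherence criterion; additive, multiplicative, or order-preserving patterns would need a different test.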
Affiliation(s)
- Eduardo N Castanho
  - LASIGE, Faculdade de Ciências, Universidade de Lisboa, Campo Grande 16, P-1749-016 Lisbon, Portugal
- Helena Aidos
  - LASIGE, Faculdade de Ciências, Universidade de Lisboa, Campo Grande 16, P-1749-016 Lisbon, Portugal
- Sara C Madeira
  - LASIGE, Faculdade de Ciências, Universidade de Lisboa, Campo Grande 16, P-1749-016 Lisbon, Portugal
2. Tang MY, Shen X, Yuan RS, Li HY, Li XW, Jing YM, Zhang Y, Shen HH, Wang ZS, Zhou L, Yang YC, Wen HX, Su F. Plexin domain-containing 1 may be a biomarker of poor prognosis in hepatocellular carcinoma patients, may mediate immune evasion. World J Gastrointest Oncol 2024; 16:2091-2112. [PMID: 38764846 PMCID: PMC11099457 DOI: 10.4251/wjgo.v16.i5.2091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Revised: 01/17/2024] [Accepted: 02/22/2024] [Indexed: 05/09/2024] Open
Abstract
BACKGROUND For the first time, we investigated the oncological role of plexin domain-containing 1 (PLXDC1), also known as tumor endothelial marker 7 (TEM7), in hepatocellular carcinoma (HCC). AIM To investigate the oncological profile of PLXDC1 in HCC. METHODS Based on The Cancer Genome Atlas database, we analyzed the expression of PLXDC1 in HCC. Using immunohistochemistry, quantitative real-time polymerase chain reaction (qRT-PCR), and Western blotting, we validated our results. The prognostic value of PLXDC1 in HCC was analyzed by assessing its correlation with clinicopathological features, such as patient survival, methylation level, tumor immune microenvironment features, and immune cell surface checkpoint expression. Finally, to assess the immune evasion potential of PLXDC1 in HCC, we used the tumor immune dysfunction and exclusion (TIDE) website and immunohistochemical staining assays. RESULTS Based on immunohistochemistry, qRT-PCR, and Western blot assays, overexpression of PLXDC1 in HCC was associated with poor prognosis. Univariate and multivariate Cox analyses indicated that PLXDC1 might be an independent prognostic factor. In HCC patients with high methylation levels, the prognosis was worse than in patients with low methylation levels. Pathway enrichment analysis of HCC tissues indicated that genes upregulated in the high-PLXDC1 subgroup were enriched in mesenchymal and immune activation signaling, and TIDE assessment showed that the risk of immune evasion was significantly higher in the high-PLXDC1 subgroup compared to the low-PLXDC1 subgroup. The high-risk group had a significantly lower immune evasion rate as well as a poor prognosis, and PLXDC1-related risk scores were also associated with a poor prognosis. CONCLUSION As a result of this study analyzing PLXDC1 from multiple biological perspectives, it was revealed that it is a biomarker of poor prognosis for HCC patients, and that it plays a role in determining immune evasion status.
Affiliation(s)
- Ming-Yue Tang
  - Department of Medical Oncology, The First Affiliated Hospital of Bengbu Medical College, Bengbu 233000, Anhui Province, China
- Xue Shen
  - Department of Medical Oncology, The First Affiliated Hospital of Bengbu Medical College, Bengbu 233000, Anhui Province, China
- Run-Sheng Yuan
  - Otolaryngology and Head and Neck Surgery, The First Affiliated Hospital of Bengbu Medical College, Bengbu 233000, Anhui Province, China
- Hui-Yuan Li
  - Department of Medical Oncology, The First Affiliated Hospital of Bengbu Medical College, Bengbu 233000, Anhui Province, China
- Xin-Wei Li
  - Department of Medical Oncology, The First Affiliated Hospital of Bengbu Medical College, Bengbu 233000, Anhui Province, China
- Yi-Ming Jing
  - Department of Neurology, The First Affiliated Hospital of Bengbu Medical College, Bengbu 233000, Anhui Province, China
- Yue Zhang
  - Department of Medical Oncology, The First Affiliated Hospital of Bengbu Medical College, Bengbu 233000, Anhui Province, China
- Hong-Hong Shen
  - Department of Medical Oncology, The First Affiliated Hospital of Bengbu Medical College, Bengbu 233000, Anhui Province, China
- Zi-Shu Wang
  - Department of Medical Oncology, The First Affiliated Hospital of Bengbu Medical College, Bengbu 233000, Anhui Province, China
- Lei Zhou
  - Department of Hepatobiliary Surgery, The First Affiliated Hospital of Bengbu Medical College, Bengbu 233000, Anhui Province, China
- Yun-Chuan Yang
  - Department of Hepatobiliary Surgery, The First Affiliated Hospital of Bengbu Medical College, Bengbu 233000, Anhui Province, China
- He-Xin Wen
  - Department of Gastrointestinal Surgery, The First Affiliated Hospital of Bengbu Medical College, Bengbu 233000, Anhui Province, China
- Fang Su
  - Department of Medical Oncology, The First Affiliated Hospital of Bengbu Medical College, Bengbu 233000, Anhui Province, China
3. Selvan TG, Gollapalli P, Kumar SHS, Ghate SD. Early diagnostic and prognostic biomarkers for gastric cancer: systems-level molecular basis of subsequent alterations in gastric mucosa from chronic atrophic gastritis to gastric cancer. J Genet Eng Biotechnol 2023; 21:86. [PMID: 37594635 PMCID: PMC10439097 DOI: 10.1186/s43141-023-00539-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2022] [Accepted: 07/31/2023] [Indexed: 08/19/2023]
Abstract
PURPOSE It is important to understand how molecular mechanisms shift during the early stages of gastric cancer (GC). We employed integrative bioinformatics approaches to locate biological signalling pathways and molecular fingerprints in order to understand the pathophysiology of GC. Discovering candidate biomarkers could enable rapid diagnosis, which in turn would improve the patient's prognosis. METHODS Through protein-protein interaction networks, functional differentially expressed genes (DEGs), and pathway enrichment studies, we examined the gene expression profiles of individuals with chronic atrophic gastritis and GC. RESULTS A total of 17 DEGs, comprising 8 upregulated and 9 downregulated genes, were identified from the microarray dataset of biopsies with chronic atrophic gastritis and GC. These DEGs were primarily enriched for CDK regulation of DNA replication and mitotic M-M/G1 phase pathways, according to KEGG analysis (p > 0.05). We identified two hub genes, MCM7 and CDC6, in the protein-protein interaction network obtained for the 17 DEGs (expanded with increased maximum interaction with 110 nodes and 2103 edges). MCM7 was found to be upregulated in GC tissues following confirmation using the GEPIA and Human Protein Atlas databases. CONCLUSION The elevated expression of MCM7 in both chronic atrophic gastritis and GC, as shown by our comprehensive investigation, suggests that this protein may serve as a promising biomarker for the early detection of GC.
Affiliation(s)
- Tamizh G Selvan
  - Central Research Laboratory, K S Hegde Medical Academy, Nitte (Deemed to be University), Deralakatte, Mangalore, 575018, Karnataka, India
- Pavan Gollapalli
  - Center for Bioinformatics, University Annexe, Nitte (Deemed to be University), Deralakatte, Mangalore, 575018, Karnataka, India
- Santosh H S Kumar
  - Department of Biotechnology, Jnana Sahyadri Campus, Kuvempu University, Shankaraghatta, 577451, Karnataka, India
- Sudeep D Ghate
  - Center for Bioinformatics, University Annexe, Nitte (Deemed to be University), Deralakatte, Mangalore, 575018, Karnataka, India
4. Sahoo A, Mandal AK, Kumar M, Dwivedi K, Singh D. Prospective Challenges for Patenting and Clinical Trials of Anticancer Compounds from Natural Products: Coherent Review. Recent Pat Anticancer Drug Discov 2023; 18:470-494. [PMID: 36336805 DOI: 10.2174/1574892818666221104113703] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 04/22/2022] [Revised: 07/24/2022] [Accepted: 09/14/2022] [Indexed: 11/09/2022]
Abstract
Cancer is a leading cause of morbidity and mortality worldwide. Each year, millions of people worldwide are diagnosed with cancer, and more than half of them die. Various conventional therapies for cancer, including chemotherapy and radiotherapy, have extreme side effects. Therefore, to minimize the global burden of lethal diseases like cancer, effective and novel drugs must be discovered, and patents should be acquired to protect them. The pharmacological potential of different natural products has made them popular in the healthcare and pharmaceutical industries. Various anticancer compounds, including alkaloids, terpenoids, biophenols, enzymes, and glycosides, are obtained from natural sources such as plants, microbes, and marine and terrestrial animals. The term "natural products" (NPs) is defined as the products of secondary or non-essential metabolic processes in living organisms (such as plants, invertebrates, and microorganisms). Although more precise definitions of NPs exist, they do not always meet consensus. Others define NPs as small molecules (excluding biomolecules) that emerge from metabolic reactions. A handful of effective compounds from natural or analog moieties are currently in use, and many more are in clinical studies. There is a pressing need to patent molecules from natural products, as hit and lead molecules are derived, isolated, and synthesized from natural products. However, these naturally occurring products may not be patentable under the law because they come from nature. This review highlights why natural products and compounds are hard to patent, under what patent law criteria they can be patented, sources of patent procedural guidelines, and why researchers prefer publication over patenting. Various patent scenarios of natural products and compounds for cancer are presented.
Affiliation(s)
- Ankit Sahoo
  - Department of Pharmaceutical Science, Shalom Institute of Health and Allied Sciences, Sam Higginbottom University of Agriculture Technology & Sciences, Prayagraj, Uttar Pradesh 211007, India
- Ashok Kumar Mandal
  - Natural Product Research Laboratory, Thapathali, Kathmandu, Nepal, 44600
- Mayank Kumar
  - Department of Pharmaceutical Chemistry, Aryakul College of Pharmacy and Research, Natkur, Lucknow, Uttar Pradesh-226002, India
- Khusbu Dwivedi
  - Department of Pharmaceutics, Shambhunath Institute of Pharmacy Jhalwa, Prayagraj, Uttar Pradesh 211015, India
- Deepika Singh
  - Department of Pharmaceutical Science, Shalom Institute of Health and Allied Sciences, Sam Higginbottom University of Agriculture Technology & Sciences, Prayagraj, Uttar Pradesh 211007, India
5. Arjmand MH, Hashemzehi M, Soleimani A, Asgharzadeh F, Avan A, Mehraban S, Fakhraei M, Ferns GA, Ryzhikov M, Gharib M, Salari R, Sayyed Hoseinian SH, Parizadeh MR, Khazaei M, Hassanian SM. Therapeutic potential of active components of saffron in post-surgical adhesion band formation. J Tradit Complement Med 2021; 11:328-335. [PMID: 34195027 PMCID: PMC8240116 DOI: 10.1016/j.jtcme.2021.01.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/15/2020] [Revised: 10/26/2020] [Accepted: 01/04/2021] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND Abdominal adhesions are common and often develop after abdominal surgery. There are currently no useful targeted pharmacotherapies for adhesive disease. Saffron and its active constituents, crocin and crocetin, are widely used in traditional medicine for alleviating the severity of inflammatory or malignant disease. PURPOSE The aim of this study was to investigate the therapeutic potential of the pharmacologically active components of saffron in attenuating the formation of post-operative adhesion bands using different administration methods in a murine model. MATERIALS AND METHODS Saffron extract (100 mg/kg), crocin (100 mg/kg), and crocetin (100 mg/kg) were administered intraperitoneally and by gavage in various groups of male Wistar rats post-surgery. In addition, three groups were first treated intraperitoneally with saffron extract, crocin, and crocetin (100 mg/kg) for 10 days and then had surgery. At the end of the experiments, the animals were sacrificed for biological assessment. RESULTS A hydro-alcoholic extract of saffron and crocin, but not crocetin, potently reduced the adhesion band frequency in treatment and pre-treatment groups in the mice given intraperitoneal (i.p.) injections. Following saffron or crocin administration, histological evaluation and quantitative analysis showed less inflammatory cell infiltration and less collagen deposition compared to the control group. Moreover, oxidative stress was significantly reduced in the treatment groups. CONCLUSION These findings suggest that a hydro-alcoholic extract of saffron or its active compound, crocin, is a potentially novel therapeutic strategy for the prevention of adhesion formation and might be used as a beneficial anti-inflammatory or anti-fibrosis agent in clinical trials. TAXONOMY Abdominal surgeries/post-surgical adhesions.
Key Words
- APC, activated protein C
- Crocetin
- Crocin
- DSS, dextran sodium sulfate
- Fibrosis
- HE, Hematoxylin & Eosin
- IP, intra-peritoneal
- Inflammation
- MDA, malondialdehyde
- PDGF, platelet-derived growth factor
- PSAB, post-surgical adhesion band
- Post-surgical adhesion band formation
- SOD, superoxide dismutase
- Saffron
- TAA, thioacetamide
- TGF-β, transforming growth factor-beta
- α-SMA, α-smooth muscle actin
Affiliation(s)
- Mohammad-Hassan Arjmand
  - Medical Plants Research Center, Basic Health Sciences Institute, Shahrekord University of Medical Sciences, Shahrekord, Iran
- Atena Soleimani
  - Department of Clinical Biochemistry, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Fereshteh Asgharzadeh
  - Department of Physiology, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Amir Avan
  - Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
  - Student Research Committee and Department of Medical Genetics, Faculty of Medicine, Mashhad University of Medical Science, Mashhad, Iran
- Saeedeh Mehraban
  - Immunology Research Center, Inflammation and Inflammatory Diseases Division, School of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Maryam Fakhraei
  - Department of Physiology, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Gordon A. Ferns
  - Brighton & Sussex Medical School, Division of Medical Education, Falmer, Brighton, BN1 9PH, UK
- Mikhail Ryzhikov
  - Division of Pulmonary and Critical Care Medicine, Washington University, School of Medicine, Saint Louis, MO, USA
- Masoumeh Gharib
  - Department of Pathology, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Roshanak Salari
  - Department of Pharmaceutical Sciences in Persian Medicine, School of Persian and Complementary Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Mohammad Reza Parizadeh
  - Department of Clinical Biochemistry, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Majid Khazaei
  - Department of Physiology, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
  - Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Seyed Mahdi Hassanian
  - Department of Clinical Biochemistry, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
  - Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
6. Khatun MS, Shoombuatong W, Hasan MM, Kurata H. Evolution of Sequence-based Bioinformatics Tools for Protein-protein Interaction Prediction. Curr Genomics 2020; 21:454-463. [PMID: 33093807 PMCID: PMC7536797 DOI: 10.2174/1389202921999200625103936] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 01/17/2020] [Revised: 03/19/2020] [Accepted: 05/27/2020] [Indexed: 12/22/2022] Open
Abstract
Protein-protein interactions (PPIs) are the physical connections between two or more proteins via electrostatic forces or hydrophobic effects. Identifying PPIs is pivotal because they contribute to many biological processes, including protein function, disease incidence, and therapy design. The experimental identification of PPIs via high-throughput technology is time-consuming and expensive. Bioinformatics approaches are expected to overcome these restrictions. In this review, our main goal is to provide an inclusive view of existing sequence-based computational methods for predicting PPIs. Initially, we briefly introduce the currently available PPI databases and then review the state-of-the-art bioinformatics approaches, their working principles, and their performance. Finally, we discuss the caveats and future perspectives of next-generation algorithms for the prediction of PPIs.
Affiliation(s)
- Md. Mehedi Hasan
  - Address correspondence to these authors at the Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Japan Society for the Promotion of Science, 5-3-1 Kojimachi, Chiyoda-ku, Tokyo 102-0083, Japan; Tel: +81-948-297-828; and the Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Biomedical Informatics R&D Center, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Tel: +81-948-297-828
- Hiroyuki Kurata
  - Address correspondence to these authors at the Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Japan Society for the Promotion of Science, 5-3-1 Kojimachi, Chiyoda-ku, Tokyo 102-0083, Japan; Tel: +81-948-297-828; and the Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Biomedical Informatics R&D Center, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan; Tel: +81-948-297-828
7. SabziNezhad A, Jalili S. DPCT: A Dynamic Method for Detecting Protein Complexes From TAP-Aware Weighted PPI Network. Front Genet 2020; 11:567. [PMID: 32676097 PMCID: PMC7333736 DOI: 10.3389/fgene.2020.00567] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 02/25/2020] [Accepted: 05/11/2020] [Indexed: 12/13/2022] Open
Abstract
Detecting protein complexes from the protein-protein interaction (PPI) network is the essence of discovering the rules of the cellular world. A large amount of PPI data is available, generated from high-throughput experiments. The enormous size of the data persuaded us to use computational methods instead of experimental methods to detect protein complexes. In recent years, many researchers have presented algorithms to detect protein complexes. Most of these algorithms use static PPI networks. Recent research has shown that cellular systems are dynamic, so the PPI network is not static over time. In this paper, we introduce DPCT to detect protein complexes from dynamic PPI networks. In the proposed method, TAP and GO data are used to build a weighted PPI network and to reduce the noise of the PPI data. Gene expression data are also used to build dynamic subnetworks from the PPI network. A memetic algorithm is used to bicluster gene expression data and to create a dynamic subnetwork for each bicluster. Experimental results show that DPCT can detect protein complexes with better correctness than state-of-the-art detection algorithms. The source code and datasets used by DPCT can be found at https://github.com/alisn72/DPCT.
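The step "create a dynamic subnetwork for each bicluster" can be pictured with a minimal sketch. It assumes a weighted PPI network stored as a dictionary of edges and takes each subnetwork to be the set of edges whose two proteins both belong to the bicluster's gene set. The data layout, the function name `induced_subnetwork`, and the toy values are hypothetical; this is not the DPCT implementation, which is available at the repository named in the abstract.

```python
# Hypothetical sketch: derive one subnetwork per bicluster from a weighted PPI network.
# Assumption: a subnetwork keeps only edges whose endpoints are both in the bicluster.

def induced_subnetwork(weighted_edges, genes):
    """Keep the weighted edges whose two endpoints both belong to `genes`."""
    gene_set = set(genes)
    return {
        (u, v): w
        for (u, v), w in weighted_edges.items()
        if u in gene_set and v in gene_set
    }


# Toy weighted PPI network: (protein, protein) -> interaction weight.
ppi = {("A", "B"): 0.9, ("B", "C"): 0.4, ("C", "D"): 0.7}

# Toy biclusters: each groups genes that are co-expressed under some conditions.
biclusters = [
    {"genes": ["A", "B", "C"], "conditions": [0, 1]},
    {"genes": ["C", "D"], "conditions": [2]},
]

subnetworks = [induced_subnetwork(ppi, b["genes"]) for b in biclusters]
for bicluster, subnetwork in zip(biclusters, subnetworks):
    print(bicluster["conditions"], subnetwork)
```

Because biclusters may overlap, the same protein (here `C`) can appear in several subnetworks, which is one way a static network can yield condition-specific views.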
Affiliation(s)
- Ali SabziNezhad
  - Computer Engineering Department, Tarbiat Modares University, Tehran, Iran
- Saeed Jalili
  - Computer Engineering Department, Tarbiat Modares University, Tehran, Iran
8. Wu Z, Liao Q, Liu B. A comprehensive review and evaluation of computational methods for identifying protein complexes from protein–protein interaction networks. Brief Bioinform 2019; 21:1531-1548. [DOI: 10.1093/bib/bbz085] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Received: 04/08/2019] [Revised: 06/17/2019] [Accepted: 06/17/2019] [Indexed: 02/04/2023] Open
Abstract
Protein complexes are the fundamental units for many cellular processes. Identifying protein complexes accurately is critical for understanding the functions and organization of cells. With the growth of genome-scale protein–protein interaction (PPI) data for different species, various computational methods focus on identifying protein complexes from PPI networks. In this article, we give a comprehensive and updated review of the state-of-the-art computational methods in the field of protein complex identification, focusing especially on newly developed approaches. The computational methods are organized into three categories: cluster-quality-based methods, node-affinity-based methods and ensemble clustering methods. Furthermore, the advantages and disadvantages of the different methods are discussed, and the performance of 17 state-of-the-art methods is evaluated on two widely used benchmark data sets. Finally, the bottleneck problems and their potential solutions in this important field are discussed.
Affiliation(s)
- Zhourun Wu
  - School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong, China
- Qing Liao
  - School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong, China
- Bin Liu
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
  - Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing, China
9. Maind A, Raut S. COSCEB: Comprehensive search for column-coherent evolution biclusters and its application to hub gene identification. J Biosci 2019; 44:48. [PMID: 31180061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Biclustering is an increasingly used data mining technique for finding groups of genes that are co-expressed across a subset of experimental conditions in gene-expression data. Such a group of co-expressed genes, which can follow various patterns, is called a bicluster. A bicluster provides significant insights into the functionality of genes and plays an important role in various clinical applications such as drug discovery, biomarker discovery, gene network analysis, gene identification, disease diagnosis, and pathway analysis. This paper presents a novel unsupervised approach, 'COmprehensive Search for Column-Coherent Evolution Biclusters (COSCEB)', for a comprehensive search of biologically significant column-coherent evolution biclusters. The concepts of column subspace extraction from each gene pair and the Longest Common Contiguous Subsequence (LCCS) are employed to identify significant biclusters. Experiments have been performed on both synthetic and real datasets. The performance of COSCEB is evaluated on several key issues: comprehensive search, Deep OPSM bicluster, bicluster types, bicluster accuracy, bicluster size, noise, overlapping, output nature, computational complexity and biologically significant biclusters. COSCEB is compared with six well-known biclustering algorithms: SAMBA, OPSM, xMotif, Bimax, Deep OPSM and UniBic. The results show that the proposed approach performs effectively on most of the issues and extracts all possible biologically significant column-coherent evolution biclusters, far more than the other biclustering algorithms. Along with the proposed approach, we also present a case study that shows the application of significant biclusters to hub gene identification.
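The Longest Common Contiguous Subsequence named in the abstract can be illustrated with a standard dynamic-programming sketch. It assumes each gene is represented by a sequence of column indices (for example, conditions ordered by expression level); that encoding, the function name, and the toy sequences are assumptions for illustration, and the code is not the COSCEB implementation.

```python
# Hypothetical sketch of the Longest Common Contiguous Subsequence (LCCS) of two
# sequences, using the classic longest-common-substring dynamic programme.

def longest_common_contiguous_subsequence(a, b):
    """Return the longest run of consecutive items that appears in both a and b."""
    best_len = 0
    best_end = 0  # exclusive end index of the best run within `a`
    prev = [0] * (len(b) + 1)  # run lengths for the previous row of the DP table
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous positions of both sequences.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return list(a[best_end - best_len:best_end])


# Toy column orderings for two genes (column indices are made up).
gene_x = [3, 1, 4, 2, 5]
gene_y = [0, 1, 4, 2, 3]
shared = longest_common_contiguous_subsequence(gene_x, gene_y)
print(shared)
```

The table-based approach takes time proportional to the product of the two sequence lengths and keeps only two rows in memory; when several runs tie for the longest, this version returns the first one found in `a`.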
Affiliation(s)
- Ankush Maind
  - Department of Computer Science and Engineering, Visvesvaraya National Institute of Technology, Nagpur, Maharashtra 440 010, India
10. COSCEB: Comprehensive search for column-coherent evolution biclusters and its application to hub gene identification. J Biosci 2019. [DOI: 10.1007/s12038-019-9862-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
11. Zahiri J, Emamjomeh A, Bagheri S, Ivazeh A, Mahdevar G, Sepasi Tehrani H, Mirzaie M, Fakheri BA, Mohammad-Noori M. Protein complex prediction: A survey. Genomics 2019; 112:174-183. [PMID: 30660789 DOI: 10.1016/j.ygeno.2019.01.011] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 10/10/2018] [Revised: 11/27/2018] [Accepted: 01/15/2019] [Indexed: 02/08/2023]
Abstract
Protein complexes are among the most important functional units driving biological processes within the cell. Experimental methods have provided valuable data for inferring protein complexes. However, these methods have inherent limitations. Considering these limitations, many computational methods have been proposed over the last decade to predict protein complexes. Almost all of these in-silico methods predict protein complexes from the ever-increasing protein-protein interaction (PPI) data. These computational approaches usually take the PPI data as input in the form of a huge protein-protein interaction network (PPIN) and output various sub-networks of the given PPIN as the predicted protein complexes. Some of these methods have already reached a promising efficiency in protein complex detection. Nonetheless, challenges remain in predicting other types of protein complexes, especially sparse and small ones. New methods should further incorporate knowledge of the biological properties of proteins to improve performance. Additionally, several challenges should be considered more effectively when designing new complex prediction algorithms in the future. This article not only reviews the history of computational protein complex prediction but also provides new insight for improving new methodologies. In this article, the most important computational methods for protein complex prediction are evaluated and compared. In addition, some of the challenges in the reconstruction of protein complexes are discussed. Finally, various tools for protein complex prediction and PPIN analysis, as well as the current high-throughput databases, are reviewed.
Affiliation(s)
- Javad Zahiri
  - Bioinformatics and Computational Omics Lab (BioCOOL), Department of Biophysics, Faculty of Biological Sciences, Tarbiat Modares University, Tehran, Iran
- Abbasali Emamjomeh
  - Laboratory of Computational Biotechnology and Bioinformatics (CBB), Department of Plant Breeding and Biotechnology, University of Zabol, Zabol, Iran
- Samaneh Bagheri
  - Department of Plant Breeding and Biotechnology (PBB), Faculty of Agriculture, University of Zabol, Zabol, Iran
- Asma Ivazeh
  - Database Research Group (DBRG), Control and Intelligent Processing Center of Excellence (CIPCE), School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran
- Ghasem Mahdevar
  - Department of Mathematics, Faculty of Sciences, University of Isfahan, Isfahan, Iran
- Hessam Sepasi Tehrani
  - Department of Biology, Science and Research Branch, Islamic Azad University, Tehran, Iran
- Mehdi Mirzaie
  - Department of Applied Mathematics, Faculty of Mathematical Sciences, Tarbiat Modares University, Tehran, Iran
- Barat Ali Fakheri
  - Department of Plant Breeding and Biotechnology (PBB), Faculty of Agriculture, University of Zabol, Zabol, Iran
- Morteza Mohammad-Noori
  - School of Mathematics, Statistics, and Computer Science, College of Science, University of Tehran, Tehran, Iran
12. Zhang W, Xu J, Li Y, Zou X. Integrating network topology, gene expression data and GO annotation information for protein complex prediction. J Bioinform Comput Biol 2018; 17:1950001. [PMID: 30803297 DOI: 10.1142/s021972001950001x] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 01/06/2023]
Abstract
The prediction of protein complexes based on the protein interaction network is a fundamental task for understanding cellular life as well as the mechanisms underlying complex disease. A great number of methods have been developed in recent years to predict protein complexes based on protein-protein interaction (PPI) networks. However, because the high-throughput data obtained from experimental biotechnology are incomplete and usually contain a large number of spurious interactions, most network-based protein complex identification methods are sensitive to the reliability of the PPI network. In this paper, we propose a new method, Identification of Protein Complex based on Refined Protein Interaction Network (IPC-RPIN), which integrates the topology, gene expression profiles and GO functional annotation information to predict protein complexes from the reconstructed networks. To demonstrate the performance of the IPC-RPIN method, we evaluated it on three PPI networks of Saccharomyces cerevisiae and compared it with four state-of-the-art methods. The simulation results show that IPC-RPIN achieved better results than the other methods on most of the measurements and is able to discover small protein complexes, which have traditionally been neglected.
Affiliation(s)
- Wei Zhang: School of Science, East China Jiaotong University, Nanchang 330013, P. R. China
- Jia Xu: School of Mechatronic Engineering, East China Jiaotong University, Nanchang 330013, P. R. China
- Yuanyuan Li: School of Mathematics and Statistics, Wuhan Institute of Technology in Wuhan, Wuhan 430072, P. R. China
- Xiufen Zou: School of Mathematics and Statistics, Wuhan University, Wuhan 430072, P. R. China
|
13
|
Jalili M, Gebhardt T, Wolkenhauer O, Salehzadeh-Yazdi A. Unveiling network-based functional features through integration of gene expression into protein networks. Biochim Biophys Acta Mol Basis Dis 2018; 1864:2349-2359. [PMID: 29466699 DOI: 10.1016/j.bbadis.2018.02.010] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 11/01/2017] [Revised: 01/31/2018] [Accepted: 02/13/2018] [Indexed: 02/02/2023]
Abstract
Decoding health and disease phenotypes is one of the fundamental objectives in biomedicine. Whereas high-throughput omics approaches are available, it is evident that any single omics approach might not be adequate to capture the complexity of phenotypes. Therefore, integrated multi-omics approaches have been used to unravel genotype-phenotype relationships such as global regulatory mechanisms and complex metabolic networks in different eukaryotic organisms. Some of the progress and challenges associated with integrated omics studies have been reviewed previously in comprehensive studies. In this work, we highlight and review the progress, challenges and advantages associated with emerging approaches, integrating gene expression and protein-protein interaction networks to unravel network-based functional features. This includes identifying disease related genes, gene prioritization, clustering protein interactions, developing the modules, extract active subnetworks and static protein complexes or dynamic/temporal protein complexes. We also discuss how these approaches contribute to our understanding of the biology of complex traits and diseases. This article is part of a Special Issue entitled: Cardiac adaptations to obesity, diabetes and insulin resistance, edited by Professors Jan F.C. Glatz, Jason R.B. Dyck and Christine Des Rosiers.
Affiliation(s)
- Mahdi Jalili: Hematology, Oncology and SCT Research Center, Tehran University of Medical Sciences, Tehran, Iran; Hematologic Malignancies Research Center, Tehran University of Medical Sciences, Tehran, Iran
- Tom Gebhardt: Department of Systems Biology and Bioinformatics, University of Rostock, 18051 Rostock, Germany
- Olaf Wolkenhauer: Department of Systems Biology and Bioinformatics, University of Rostock, 18051 Rostock, Germany
- Ali Salehzadeh-Yazdi: Department of Systems Biology and Bioinformatics, University of Rostock, 18051 Rostock, Germany
|
14
|
Zhou H, Liu J, Li J, Duan W. A density-based approach for detecting complexes in weighted PPI networks by semantic similarity. PLoS One 2017; 12:e0180570. [PMID: 28704455 PMCID: PMC5507511 DOI: 10.1371/journal.pone.0180570] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 01/03/2017] [Accepted: 06/16/2017] [Indexed: 11/23/2022] Open
Abstract
Protein complex detection in PPI networks plays an important role in analyzing biological processes. A new algorithm, DBGPWN, is proposed for predicting complexes in PPI networks. Firstly, a method based on gene ontology is used to measure semantic similarities between interacted proteins, and the similarity values are used as their weights. Then, a density-based graph partitioning algorithm is developed to find clusters in the weighted PPI networks, and the identified ones are considered to be dense and similar. Experimental results demonstrate that our approach achieves good performance as compared with such algorithms as MCL, CMC, MCODE, RNSC, CORE, ClusterOne and FGN.
Affiliation(s)
- HongFang Zhou: School of Computer Science and Engineering, Xi'an University of Technology, Xi'an, China
- Jie Liu: School of Computer Science and Engineering, Xi'an University of Technology, Xi'an, China
- JunHuai Li: School of Computer Science and Engineering, Xi'an University of Technology, Xi'an, China
- WenCong Duan: School of Computer Science and Engineering, Xi'an University of Technology, Xi'an, China
|
15
|
|
16
|
Lakizadeh A, Jalili S. BiCAMWI: A Genetic-Based Biclustering Algorithm for Detecting Dynamic Protein Complexes. PLoS One 2016; 11:e0159923. [PMID: 27462706 PMCID: PMC4963120 DOI: 10.1371/journal.pone.0159923] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 03/26/2016] [Accepted: 07/11/2016] [Indexed: 01/08/2023] Open
Abstract
Considering the roles of protein complexes in many biological processes in the cell, detection of protein complexes from available protein-protein interaction (PPI) networks is a key challenge in the post genome era. Despite high dynamicity of cellular systems and dynamic interaction between proteins in a cell, most computational methods have focused on static networks which cannot represent the inherent dynamicity of protein interactions. Recently, some researchers try to exploit the dynamicity of PPI networks by constructing a set of dynamic PPI subnetworks correspondent to each time-point (column) in a gene expression data. However, many genes can participate in multiple biological processes and cellular processes are not necessarily related to every sample, but they might be relevant only for a subset of samples. So, it is more interesting to explore each subnetwork based on a subset of genes and conditions (i.e., biclusters) in a gene expression data. Here, we present a new method, called BiCAMWI to employ dynamicity in detecting protein complexes. The preprocessing phase of the proposed method is based on a novel genetic algorithm that extracts some sets of genes that are co-regulated under some conditions from input gene expression data. Each extracted gene set is called bicluster. In the detection phase of the proposed method, then, based on the biclusters, some dynamic PPI subnetworks are extracted from input static PPI network. Protein complexes are identified by applying a detection method on each dynamic PPI subnetwork and aggregating the results. Experimental results confirm that BiCAMWI effectively models the dynamicity inherent in static PPI networks and achieves significantly better results than state-of-the-art methods. So, we suggest BiCAMWI as a more reliable method for protein complex detection.
Affiliation(s)
- Amir Lakizadeh: Computer Engineering Department, Tarbiat Modares University, Tehran, Iran
- Saeed Jalili: Computer Engineering Department, Tarbiat Modares University, Tehran, Iran
|
17
|
Zhang Y, Lin H, Yang Z, Wang J. Construction of dynamic probabilistic protein interaction networks for protein complex identification. BMC Bioinformatics 2016; 17:186. [PMID: 27117946 PMCID: PMC4847341 DOI: 10.1186/s12859-016-1054-1] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 09/08/2015] [Accepted: 04/14/2016] [Indexed: 11/10/2022] Open
Abstract
Background: Recently, high-throughput experimental techniques have generated a large amount of protein-protein interaction (PPI) data which can construct large complex PPI networks for numerous organisms. System biology attempts to understand cellular organization and function by analyzing these PPI networks. However, most studies still focus on static PPI networks which neglect the dynamic information of PPI.
Results: The gene expression data under different time points and conditions can reveal the dynamic information of proteins. In this study, we used an active probability-based method to distinguish the active level of proteins at different active time points. We constructed dynamic probabilistic protein networks (DPPN) to integrate dynamic information of protein into static PPI networks. Based on DPPN, we subsequently proposed a novel method to identify protein complexes, which could effectively exploit topological structure as well as dynamic information of DPPN. We used three different yeast PPI datasets and gene expression data to construct three DPPNs. When applied to three DPPNs, many well-characterized protein complexes were accurately identified by this method.
Conclusion: The shift from static PPI networks to dynamic PPI networks is essential to accurately identify protein complex. This method not only can be applied to identify protein complex, but also establish a framework to integrate dynamic information into static networks for other applications, such as pathway analysis.
Affiliation(s)
- Yijia Zhang: College of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning, 116023, China
- Hongfei Lin: College of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning, 116023, China
- Zhihao Yang: College of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning, 116023, China
- Jian Wang: College of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning, 116023, China
|