1. Lüleci HB, Yılmaz A. Robust and rigorous identification of tissue-specific genes by statistically extending tau score. BioData Min 2022; 15:31. [PMID: 36494766; PMCID: PMC9733102; DOI: 10.1186/s13040-022-00315-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2022] [Accepted: 11/11/2022] [Indexed: 12/13/2022] Open
Abstract
OBJECTIVES: In this study, we aimed to identify tissue-specific genes for various human tissues and organs more robustly and rigorously by extending the tau score algorithm.
INTRODUCTION: Tissue-specific genes are genes whose function and expression are restricted to, or strongly preferred in, one or a few tissues. Identifying them is essential for understanding multicellular biological processes such as tissue-specific molecular regulation, tissue development, physiology, and the pathogenesis of tissue-associated diseases.
MATERIALS AND METHODS: Gene expression data derived from five large RNA sequencing (RNA-seq) projects, spanning 96 different human tissues, were retrieved from ArrayExpress and Expression Atlas. Genes were first categorized using significant filters, with the tau score as a specificity index. After tau was calculated for each gene in each dataset separately, the statistical distance from the maximum expression level was estimated using a new procedure. Specific expression of a gene in one or several tissues was then determined by integrating tau with the statistical distance estimate; this combination is called the extended tau approach. The tissue-specific genes obtained for the 96 human tissues were functionally annotated, and comparisons were carried out to show the effectiveness of the extended tau method.
RESULTS AND DISCUSSION: Genes were categorized by expression level, and tissue-specific genes were identified for a large number of tissues and organs. Unlike the original tau score, which can assign tissue specificity to a single tissue only, the extended tau approach successfully assigned genes to multiple tissues.
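For orientation, the original tau index that this abstract builds on can be sketched as follows. This is a minimal illustration of tau as it is commonly defined in the literature (the sum of `1 - x_i / x_max` over the `n` tissues, divided by `n - 1`), not the authors' extended procedure, which the abstract does not specify. The function name `tau_index` and the handling of all-zero profiles are assumptions made for this sketch.

```python
from typing import Sequence


def tau_index(expression: Sequence[float]) -> float:
    """Tissue-specificity index tau for one gene.

    `expression` holds the gene's non-negative expression level in each
    tissue. The result is 0.0 for uniform expression and 1.0 for
    expression confined to a single tissue.
    """
    n = len(expression)
    if n < 2:
        raise ValueError("tau needs expression values from at least two tissues")
    if any(x < 0 for x in expression):
        raise ValueError("expression values must be non-negative")

    x_max = max(expression)
    if x_max == 0:
        # Assumption for this sketch: an unexpressed gene is treated as
        # non-specific rather than raising an error.
        return 0.0

    # Sum of (1 - normalized expression) over tissues, scaled by n - 1.
    return sum(1.0 - x / x_max for x in expression) / (n - 1)


if __name__ == "__main__":
    print(tau_index([0.0, 0.0, 10.0]))  # expressed in a single tissue
    print(tau_index([5.0, 5.0, 5.0]))   # uniform expression
    print(tau_index([10.0, 5.0, 0.0]))  # intermediate specificity
```

Because tau collapses an expression profile into a single number, it says how specific a gene is but not which tissues drive that specificity; that is the gap the extended approach described above is meant to address.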
Affiliation(s)
- Hatice Büşra Lüleci
- Department of Bioengineering, Gebze Technical University, Kocaeli, Turkey (GRID: grid.448834.7; ISNI: 0000 0004 0595 7127)
- Alper Yılmaz
- Department of Bioengineering, Yildiz Technical University, Istanbul, Turkey (GRID: grid.38575.3c; ISNI: 0000 0001 2337 3561)
2. Begum T, Ghosh TC, Basak S. Systematic Analyses and Prediction of Human Drug Side Effect Associated Proteins from the Perspective of Protein Evolution. Genome Biol Evol 2017; 9:337-350. [PMID: 28391292; PMCID: PMC5499873; DOI: 10.1093/gbe/evw301] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Accepted: 02/16/2017] [Indexed: 12/20/2022] Open
Abstract
Identifying the factors involved in adverse drug reactions in target proteins is very important for developing therapeutic drugs with minimal or no side effects. In this context, we performed a comparative evolutionary rate analysis between genes exhibiting drug side effect(s) (SET) and genes showing no side effect (NSET), with the aim of increasing the prediction accuracy of SET/NSET proteins using evolutionary rate determinants. We found that SET proteins are more conserved than NSET proteins. The difference in evolutionary rate between SET and NSET proteins depends primarily on their non-complex-forming nature (protein complex association number = 0), phylogenetic age, multifunctionality, membrane localization, and transmembrane helix content, irrespective of their essentiality, total druggability (total number of drugs/target), mRNA expression level, and tissue expression breadth. We also introduced two novel terms, killer druggability (number of drugs with killing side effect(s)/target) and essential druggability (number of drugs targeting essential proteins/target), to explain the evolutionary rate variation between SET and NSET proteins. Interestingly, we noticed that SET proteins are younger than NSET proteins and that multifunctional younger SET proteins are candidates for acquiring killing side effects. We provide evidence that higher killer druggability, multifunctionality, and transmembrane helices support the conservation of SET proteins over NSET proteins in spite of their recent origin. By employing all these entities, our Support Vector Machine model predicts human SET/NSET proteins to a high degree of accuracy (∼86%).
Affiliation(s)
- Tina Begum
- Bioinformatics Centre, Tripura University, Suryamaninagar, Tripura, India
- Surajit Basak
- Bioinformatics Centre, Tripura University, Suryamaninagar, Tripura, India
- Department of Molecular Biology & Bioinformatics, Tripura University, Suryamaninagar, Tripura, India
3. Nguyen TT, Almon RR, DuBois DC, Sukumaran S, Jusko WJ, Androulakis IP. Tissue-specific gene expression and regulation in liver and muscle following chronic corticosteroid administration. Gene Regulation and Systems Biology 2014; 8:75-87. [PMID: 24653645; PMCID: PMC3956809; DOI: 10.4137/grsb.s13134] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 09/04/2013] [Revised: 10/23/2013] [Accepted: 10/24/2013] [Indexed: 12/20/2022]
Abstract
Although corticosteroids (CSs) affect gene expression in multiple tissues, the array of genes that are regulated by these catabolic steroids is diverse, highly tissue specific, and depends on their functions in the tissue. Liver has many important functions in performing and regulating diverse metabolic processes. Muscle, in addition to its mechanical role, is critical in maintaining systemic energy homeostasis and accounts for about 80% of insulin-directed glucose disposal. Consequently, a better understanding of CS pharmacogenomic effects in these tissues would provide valuable information regarding the tissue-specificity of transcriptional dynamics, and would provide insights into the underlying molecular mechanisms of action for both beneficial and detrimental effects. We performed an integrated analysis of transcriptional data from liver and muscle in response to methylprednisolone (MPL) infusion, which included clustering and functional annotation of clustered gene groups, promoter extraction and putative transcription factor (TF) identification, and finally, regulatory closeness (RC) identification. This analysis allowed the identification of critical transcriptional responses and CS-responsive functions in liver and muscle during chronic MPL administration, the prediction of putative transcriptional regulators relevant to transcriptional responses of CS-affected genes which are also potential secondary bio-signals altering expression levels of target-genes, and the exploration of the tissue-specificity and biological significance of gene expression patterns, CS-responsive functions, and transcriptional regulation. The analysis provided an integrated description of the genomic and functional effects of chronic MPL infusion in liver and muscle.
Affiliation(s)
- Tung T Nguyen
- BioMaPS Institute for Quantitative Biology, Rutgers University, Piscataway, NJ, USA
- Richard R Almon
- Department of Biological Sciences, State University of New York at Buffalo, Buffalo, NY, USA
- Department of Pharmaceutical Sciences, State University of New York at Buffalo, Buffalo, NY, USA
- New York State Center of Excellence in Bioinformatics and Life Sciences, Buffalo, NY, USA
- Debra C DuBois
- Department of Biological Sciences, State University of New York at Buffalo, Buffalo, NY, USA
- Department of Pharmaceutical Sciences, State University of New York at Buffalo, Buffalo, NY, USA
- Siddharth Sukumaran
- Department of Pharmaceutical Sciences, State University of New York at Buffalo, Buffalo, NY, USA
- William J Jusko
- Department of Pharmaceutical Sciences, State University of New York at Buffalo, Buffalo, NY, USA
- New York State Center of Excellence in Bioinformatics and Life Sciences, Buffalo, NY, USA
- Ioannis P Androulakis
- Biomedical Engineering Department, Rutgers University, Piscataway, NJ, USA
- Chemical and Biochemical Engineering Department, Rutgers University, Piscataway, NJ, USA
4. Capra JA, Stolzer M, Durand D, Pollard KS. How old is my gene? Trends Genet 2013; 29:659-68. [PMID: 23915718; DOI: 10.1016/j.tig.2013.07.001] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.6] [Received: 04/18/2013] [Revised: 06/13/2013] [Accepted: 07/03/2013] [Indexed: 11/26/2022]
Abstract
Gene functions, interactions, disease associations, and ecological distributions are all correlated with gene age. However, it is challenging to estimate the intricate series of evolutionary events leading to a modern-day gene and then to reduce this history to a single age estimate. Focusing on eukaryotic gene families, we introduce a framework that can be used to compare current strategies for quantifying gene age, discuss key differences between these methods, and highlight several common problems. We argue that genes with complex evolutionary histories do not have a single well-defined age. As a result, care must be taken to articulate the goals and assumptions of any analysis that uses gene age estimates. Recent algorithmic advances offer the promise of gene age estimates that are fast, accurate, and consistent across gene families. This will enable a shift to integrated genome-wide analyses of all events in gene evolutionary histories in the near future.
Affiliation(s)
- John A Capra
- Center for Human Genetics Research and Department of Biomedical Informatics, Vanderbilt University, Nashville, TN 37232, USA
5. Pérez-Montarelo D, Hudson NJ, Fernández AI, Ramayo-Caldas Y, Dalrymple BP, Reverter A. Porcine tissue-specific regulatory networks derived from meta-analysis of the transcriptome. PLoS One 2012; 7:e46159. [PMID: 23049964; PMCID: PMC3458843; DOI: 10.1371/journal.pone.0046159] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 05/08/2012] [Accepted: 08/28/2012] [Indexed: 11/19/2022] Open
Abstract
The processes that drive tissue identity and differentiation remain unclear for most tissue types, as do the gene networks and transcription factors (TF) responsible for the differential structure and function of each particular tissue. This is particularly true for non-model species with incomplete genomic resources. To better understand the regulation of genes responsible for tissue identity in pigs, we inferred regulatory networks from a meta-analysis of 20 gene expression studies spanning 480 Porcine Affymetrix chips for 134 experimental conditions on 27 distinct tissues. We developed a mixed-model normalization approach with a covariance structure that accommodated the disparity in the origin of the individual studies, and obtained the normalized expression of 12,320 genes across the 27 tissues. Using this resource, we constructed a network based on the co-expression patterns of 1,072 TF and 1,232 tissue-specific genes. The resulting network is consistent with the known biology of tissue development. Within the network, genes clustered by tissue and tissues clustered by site of embryonic origin. These clusters were significantly enriched for genes annotated in key relevant biological processes and confirm gene functions and interactions from the literature. We implemented a Regulatory Impact Factor (RIF) metric to identify the key regulators in skeletal muscle and tissues from the central nervous system. The normalization of the meta-analysis, the inference of the gene co-expression network, and the RIF metric operated synergistically towards a successful search for tissue-specific regulators. Novel among these findings is evidence suggesting a key role of ERCC3 as a muscle regulator. Together, our results recapitulate the known biology behind tissue specificity and provide valuable new insights in a less studied but valuable model species.
Affiliation(s)
- Dafne Pérez-Montarelo
- Computational and Systems Biology, Commonwealth Scientific and Industrial Research Organisation (CSIRO) Animal, Food and Health Sciences, Queensland Bioscience Precinct, St. Lucia, Brisbane, Queensland, Australia
- Departamento de Mejora Genética Animal, Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria, Madrid, Spain
- Nicholas J. Hudson
- Computational and Systems Biology, Commonwealth Scientific and Industrial Research Organisation (CSIRO) Animal, Food and Health Sciences, Queensland Bioscience Precinct, St. Lucia, Brisbane, Queensland, Australia
- Ana I. Fernández
- Departamento de Mejora Genética Animal, Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria, Madrid, Spain
- Yuliaxis Ramayo-Caldas
- Departament de Ciència Animal i dels Aliments, Facultat de Veterinària, Universitat Autònoma de Barcelona, Bellaterra, Barcelona, Spain
- Brian P. Dalrymple
- Computational and Systems Biology, Commonwealth Scientific and Industrial Research Organisation (CSIRO) Animal, Food and Health Sciences, Queensland Bioscience Precinct, St. Lucia, Brisbane, Queensland, Australia
- Antonio Reverter
- Computational and Systems Biology, Commonwealth Scientific and Industrial Research Organisation (CSIRO) Animal, Food and Health Sciences, Queensland Bioscience Precinct, St. Lucia, Brisbane, Queensland, Australia
6. Ranganathan S, Schönbach C, Nakai K, Tan TW. Challenges of the next decade for the Asia Pacific region: 2010 International Conference in Bioinformatics (InCoB 2010). BMC Genomics 2010; 11 Suppl 4:S1. [PMID: 21143792; PMCID: PMC3005919; DOI: 10.1186/1471-2164-11-s4-s1] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 12/15/2022] Open
Abstract
The 2010 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia’s oldest bioinformatics organisation, formed in 1998, was organized as the 9th International Conference on Bioinformatics (InCoB), Sept. 26-28, 2010 in Tokyo, Japan. Initially, APBioNet created InCoB as a forum to foster bioinformatics in the Asia Pacific region. Given the growing importance of interdisciplinary research, InCoB2010 included topics targeting scientists in the fields of genomic medicine, immunology and chemoinformatics, supporting translational research. Peer-reviewed manuscripts that were accepted for publication in this supplement represent key areas of research interest that have emerged in our region. We also highlight some of the current challenges bioinformatics is facing in the Asia Pacific region and conclude our report with the announcement of APBioNet’s 100 BioDatabases (BioDB100) initiative. BioDB100 will comply with the database criteria set out earlier in our proposal for Minimum Information about a Bioinformatics investigation (MIABi), setting the standards for biocuration and bioinformatics research, on which we will report at the next InCoB, Nov. 27 – Dec. 2, 2011 in Kuala Lumpur, Malaysia.
Affiliation(s)
- Shoba Ranganathan
- Department of Chemistry and Biomolecular Sciences, Macquarie University, Sydney, NSW, Australia.