1
Identification of Human Global, Tissue and Within-Tissue Cell-Specific Stably Expressed Genes at Single-Cell Resolution. Int J Mol Sci 2022; 23:ijms231810214. [PMID: 36142130 PMCID: PMC9499411 DOI: 10.3390/ijms231810214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2022] [Revised: 08/12/2022] [Accepted: 08/30/2022] [Indexed: 11/17/2022] Open
Abstract
Stably Expressed Genes (SEGs) are a set of genes with invariant expression. Identification of SEGs, especially among both healthy and diseased tissues, is of clinical relevance because it enables more accurate data integration, gene expression comparison and biomarker detection. However, it remains unclear how many global SEGs there are, whether there are development-, tissue- or cell-specific SEGs, and whether diseases can influence their expression. In this research, we systematically investigate human SEGs at the single-cell level and observe their development-, tissue- and cell-specificity, and their expression stability under various diseased states. A hierarchical strategy is proposed to identify a list of 408 spatial-temporal SEGs. Development-specific SEGs are also identified, with adult tissue-specific SEGs enriched in immune processes and fetal tissue-specific SEGs enriched in RNA splicing activities. Cells of the same type within different tissues tend to show similar SEG composition profiles. Diseases or stresses do not appear to influence the expression stability of SEGs in various tissues. In addition to serving as markers and internal references for data normalization and integration, we examine another possible application of SEGs, i.e., cell decomposition. The deconvolution model could accurately predict the fractions of major immune cells in multiple independent testing datasets of peripheral blood samples. The study provides a reliable list of human SEGs at the single-cell level, facilitates understanding of the properties of SEGs, and extends their possible applications.
2
Comparative genomics reveals genus specific encoding of amino acids by tri-nucleotide SSRs in human pathogenic Streptococcus and Staphylococcus bacteria. Biologia (Bratisl) 2022. [DOI: 10.1007/s11756-022-01143-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
3
Kalyanasundaram A, Henry BJ, Henry C, Kendall RJ. Molecular phylogenetic and in silico analysis of glyceraldeyde-3-phosphate dehydrogenase (GAPDH) gene from northern bobwhite quail (Colinus virginianus). Mol Biol Rep 2021; 48:1093-1101. [PMID: 33580461 DOI: 10.1007/s11033-021-06186-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2020] [Accepted: 01/28/2021] [Indexed: 10/22/2022]
Abstract
Many recent studies have focused on the prevalence and impact of two helminth parasites, eyeworm Oxyspirura petrowi and caecal worm Aulonocephalus pennula, in the northern bobwhite quail (Colinus virginianus). However, few studies have attempted to examine the effect of these parasites on the bobwhite immune system. This is likely due to the lack of proper reference genes for relative gene expression studies. Glyceraldehyde 3-phosphate dehydrogenase (GAPDH) is a glycolytic enzyme that is often utilized as a reference gene, and in this preliminary study, we evaluated the similarity of bobwhite GAPDH to GAPDH in other avian species to assess its potential as a reference gene in bobwhite. GAPDH was identified in the bobwhite full genome sequence and multiple sets of PCR primers were designed to generate overlapping PCR products. These products were then sequenced and aligned to generate the sequence for the full-length open reading frame (ORF) of bobwhite GAPDH. Utilizing this sequence, phylogenetic analyses and comparative analysis of the exon-intron pattern were conducted that revealed high similarity of GAPDH encoding sequences among bobwhite and other Galliformes. Additionally, this ORF sequence was used to predict the encoded protein and its three-dimensional structure, which, like the phylogenetic analyses, reveal that bobwhite GAPDH is similar to GAPDH in other Galliformes. Finally, GAPDH qPCR primers were designed, standardized, and tested with bobwhite both uninfected and infected with O. petrowi, and this preliminary test showed no statistical difference in expression of GAPDH between the two groups. These analyses are the first to investigate GAPDH in bobwhite. These efforts in phylogeny, sequence analysis, and protein structure suggest that there is > 97% conservation of GAPDH among Galliformes. Furthermore, the results of these in silico tests and the preliminary qPCR indicate that GAPDH is a prospective candidate for use in gene expression analyses in bobwhite.
Affiliation(s)
- Brett J Henry
  - The Wildlife Toxicology Laboratory, Texas Tech University, Lubbock, TX, 79409-3290, USA
- Cassandra Henry
  - The Wildlife Toxicology Laboratory, Texas Tech University, Lubbock, TX, 79409-3290, USA
- Ronald J Kendall
  - The Wildlife Toxicology Laboratory, Texas Tech University, Lubbock, TX, 79409-3290, USA
4
Wei K, Ma L, Zhang T. Characterization of gene promoters in pig: conservative elements, regulatory motifs and evolutionary trend. PeerJ 2019; 7:e7204. [PMID: 31275764 PMCID: PMC6598670 DOI: 10.7717/peerj.7204] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/18/2019] [Accepted: 05/29/2019] [Indexed: 02/04/2023] Open
Abstract
It is vital to understand the conservation and evolution of gene promoter sequences in order to understand environmental adaptation. The level of promoter conservation varies greatly between housekeeping (HK) and tissue-specific (TS) genes, denoting differences in the strength of the evolutionary constraints. Here, we analyzed promoter conservation and evolution to exploit differential regulation between HK and TS genes. The analysis of conserved elements showed that CpG islands, short tandem repeats and G-quadruplex sequences are highly enriched in HK promoters relative to TS promoters. In addition, the type and density of regulatory motifs in TS promoters are much higher than in HK promoters, indicating that TS genes show more complex regulatory patterns than HK genes. Moreover, the evolutionary dynamics of promoters showed a similar evolutionary trend to coding sequences. HK promoters are subject to more stringent selective pressure in the long-term evolutionary process. HK genes tend to show increased upstream sequence conservation due to stringent selection pressures acting on the promoter regions. The specificity of TS gene expression may be due to complex regulatory motifs acting in different tissues or conditions. The results from this study can be used to deepen our understanding of adaptive evolution.
Affiliation(s)
- Kai Wei
  - College of Life Science, Shihezi University, Shihezi, Xinjiang, China; Center of Life and Food Sciences Weihenstephan, Technische Universität München, Freising, Bayern, Germany
- Lei Ma
  - College of Life Science, Shihezi University, Shihezi, Xinjiang, China
- Tingting Zhang
  - College of Life Science, Shihezi University, Shihezi, Xinjiang, China
5
Zhang L, Xiao M, Zhou J, Yu J. Lineage-associated underrepresented permutations (LAUPs) of mammalian genomic sequences based on a Jellyfish-based LAUPs analysis application (JBLA). Bioinformatics 2018; 34:3624-3630. [DOI: 10.1093/bioinformatics/bty392] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 01/03/2018] [Accepted: 05/09/2018] [Indexed: 12/25/2022] Open
Affiliation(s)
- Le Zhang
  - College of Computer Science, Sichuan University, Chengdu, China
  - School of Computer and Information Science, Southwest University, Chongqing, China
- Ming Xiao
  - School of Computer and Information Science, Southwest University, Chongqing, China
  - College of Mobile Telecommunications, Chongqing University of Posts and Telecommunications, Chongqing, China
- Jingsong Zhou
  - College of Computer Science, Sichuan University, Chengdu, China
- Jun Yu
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
6
Chang YC, Ding Y, Dong L, Zhu LJ, Jensen RV, Hsiao LL. Differential expression patterns of housekeeping genes increase diagnostic and prognostic value in lung cancer. PeerJ 2018; 6:e4719. [PMID: 29761043 PMCID: PMC5949062 DOI: 10.7717/peerj.4719] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 01/08/2018] [Accepted: 04/16/2018] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND: Using DNA microarrays, we previously identified 451 genes expressed in 19 different human tissues. Although ubiquitously expressed, the variable expression patterns of these "housekeeping genes" (HKGs) could separate one normal human tissue type from another. Current focus on identifying "specific disease markers" is problematic as single gene expression in a given sample represents the specific cellular states of the sample at the time of collection. In this study, we examine the diagnostic and prognostic potential of the variable expressions of HKGs in lung cancers. METHODS: Microarray and RNA-seq data for normal lungs, lung adenocarcinomas (AD), squamous cell carcinomas of the lung (SQCLC), and small cell carcinomas of the lung (SCLC) were collected from online databases. Using 374 of 451 HKGs, differentially expressed genes between pairs of sample types were determined via two-sided, homoscedastic t-test. Principal component analysis and hierarchical clustering classified normal lung and lung cancer subtypes according to relative gene expression variations. We used uni- and multivariate Cox regressions to identify significant predictors of overall survival in AD patients. Classifying genes were selected using a set of training samples and then validated using an independent test set. Gene Ontology was examined by PANTHER. RESULTS: This study showed that the differential expression patterns of 242, 245, and 99 HKGs were able to distinguish normal lung from AD, SCLC, and SQCLC, respectively. From these, 70 HKGs were common across the three lung cancer subtypes. These HKGs have low expression variation compared to current lung cancer markers (e.g., EGFR, KRAS) and were involved in the most common biological processes (e.g., metabolism, stress response). In addition, the expression pattern of 106 HKGs alone was a significant classifier of AD versus SQCLC. We further highlighted that a panel of 13 HKGs was an independent predictor of overall survival and cumulative risk in AD patients. DISCUSSION: Here we report that HKG expression patterns may be an effective tool for evaluation of lung cancer states. For example, the differential expression pattern of 70 HKGs alone can separate normal lung tissue from various lung cancers while a panel of 106 HKGs was a capable class predictor of subtypes of non-small cell carcinomas. We also reported that HKGs have significantly lower variance compared to traditional cancer markers across samples, highlighting the robustness of a panel of genes over any one specific biomarker. Using RNA-seq data, we showed that the expression pattern of 13 HKGs is a significant, independent predictor of overall survival for AD patients. This reinforces the predictive power of an HKG panel across different gene expression measurement platforms. Thus, we propose that the expression patterns of HKGs alone may be sufficient for the diagnosis and prognosis of individuals with lung cancer.
Affiliation(s)
- Yu-Chun Chang
  - Division of Renal Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States of America
- Yan Ding
  - Division of Renal Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States of America
- Lingsheng Dong
  - Research Computing, Harvard Medical School, Boston, MA, United States of America
- Lang-Jing Zhu
  - Division of Renal Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States of America
  - Department of Nephrology, The Eighth Affiliated Hospital, Sun Yat-sen University, Shenzhen, China
- Roderick V. Jensen
  - Department of Biological Sciences, Virginia Polytechnic Institute and State University (Virginia Tech), Blacksburg, United States of America
- Li-Li Hsiao
  - Division of Renal Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States of America
7
Characterization of porcine simple sequence repeat variation on a population scale with genome resequencing data. Sci Rep 2017; 7:2376. [PMID: 28539617 PMCID: PMC5443785 DOI: 10.1038/s41598-017-02600-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 01/03/2017] [Accepted: 04/13/2017] [Indexed: 12/23/2022] Open
Abstract
Simple sequence repeats (SSRs) are used as polymorphic molecular markers in many species. They contribute very important functional variations in a range of complex traits; however, little is known about the variation of most SSRs in pig populations. Here, using genome resequencing data, we identified ~0.63 million polymorphic SSR loci from more than 100 individuals. Through intensive analysis of this dataset, we found that the SSR motif composition, motif length, total length of alleles and distribution of alleles all contribute to SSR variability. Furthermore, we found that CG-containing SSRs displayed significantly lower polymorphism and higher cross-species conservation. With a rigorous filter procedure, we provided a catalogue of 16,527 high-quality polymorphic SSRs, which displayed reliable results for the analysis of phylogenetic relationships and provided valuable summary statistics for 30 individuals equally selected from eight local Chinese pig breeds, six commercial lean pig breeds and Chinese wild boars. In addition, from the high-quality polymorphic SSR catalogue, we identified four loci with potential loss-of-function alleles. Overall, these analyses provide a valuable catalogue of polymorphic SSRs to the existing pig genetic variation database, and we believe this catalogue could be used for future genome-wide genetic analysis.
8
Vieira MLC, Santini L, Diniz AL, Munhoz CDF. Microsatellite markers: what they mean and why they are so useful. Genet Mol Biol 2016; 39:312-28. [PMID: 27561112 PMCID: PMC5004837 DOI: 10.1590/1678-4685-gmb-2016-0027] [Citation(s) in RCA: 289] [Impact Index Per Article: 36.1] [Received: 02/29/2016] [Accepted: 05/13/2016] [Indexed: 12/11/2022] Open
Abstract
Microsatellites or Simple Sequence Repeats (SSRs) are extensively employed in plant genetics studies, using both low and high throughput genotyping approaches. Motivated by the importance of these sequences over the last decades, this review aims to address some theoretical aspects of SSRs, including definition, characterization and biological function. The methodologies for the development of SSR loci, genotyping and their applications as molecular markers are also reviewed. Finally, two data surveys are presented. The first was conducted using the main database of Web of Science, prospecting for articles published over the period from 2010 to 2015, resulting in approximately 930 records. The second survey was focused on papers that aimed at SSR marker development, published in the American Journal of Botany's Primer Notes and Protocols in Plant Sciences (from 2013 to 2015), resulting in a total of 87 publications. This scenario confirms the current relevance of SSRs and indicates their continuous utilization in plant science.
Affiliation(s)
- Maria Lucia Carneiro Vieira
  - Departamento de Genética, Escola Superior de Agricultura "Luiz de Queiroz" (ESALQ), Universidade de São Paulo (USP), Piracicaba, SP, Brazil
- Luciane Santini
  - Departamento de Genética, Escola Superior de Agricultura "Luiz de Queiroz" (ESALQ), Universidade de São Paulo (USP), Piracicaba, SP, Brazil
- Augusto Lima Diniz
  - Departamento de Genética, Escola Superior de Agricultura "Luiz de Queiroz" (ESALQ), Universidade de São Paulo (USP), Piracicaba, SP, Brazil
- Carla de Freitas Munhoz
  - Departamento de Genética, Escola Superior de Agricultura "Luiz de Queiroz" (ESALQ), Universidade de São Paulo (USP), Piracicaba, SP, Brazil
9
Bolton KA, Avery-Kiejda KA, Holliday EG, Attia J, Bowden NA, Scott RJ. A polymorphic repeat in the IGF1 promoter influences the risk of endometrial cancer. Endocr Connect 2016; 5:115-22. [PMID: 27090263 PMCID: PMC5002956 DOI: 10.1530/ec-16-0003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/13/2016] [Accepted: 04/18/2016] [Indexed: 01/22/2023]
Abstract
Due to the lack of high-throughput genetic assays for tandem repeats, there is a paucity of knowledge about the role they may play in disease. A polymorphic CA repeat in the promoter region of the insulin-like growth factor 1 gene (IGF1) has been studied extensively over the past 10 years for association with the risk of developing breast cancer, among other cancers, with variable results. The aim of this study was to determine if this CA repeat is associated with the risk of developing breast cancer and endometrial cancer. Using a case-control design, we analysed the length of this CA repeat in a series of breast cancer and endometrial cancer cases and compared this with a control population. Our results showed an association when both alleles were considered in breast and endometrial cancers (P=0.029 and 0.011, respectively), but this did not pass our corrected threshold for significance due to multiple testing. When the allele lengths were analysed categorically against the most common allele length of 19 CA repeats, an association was observed with the risk of endometrial cancer due to a reduction in the number of long alleles (P=0.013). This was confirmed in an analysis of the long alleles separately for endometrial cancer risk (P=0.0012). Our study found no association between the length of this polymorphic CA repeat and breast cancer risk. The significant association observed between the CA repeat length and the risk of developing endometrial cancer has not been previously reported.
Affiliation(s)
- Katherine A Bolton
  - Centre for Bioinformatics, Biomarker Discovery and Information-Based Medicine, Hunter Medical Research Institute, Newcastle, New South Wales, Australia; Priority Research Centre for Cancer, School of Biomedical Sciences and Pharmacy, Faculty of Health and Medicine, University of Newcastle, Newcastle, New South Wales, Australia
- Kelly A Avery-Kiejda
  - Centre for Bioinformatics, Biomarker Discovery and Information-Based Medicine, Hunter Medical Research Institute, Newcastle, New South Wales, Australia; Priority Research Centre for Cancer, School of Biomedical Sciences and Pharmacy, Faculty of Health and Medicine, University of Newcastle, Newcastle, New South Wales, Australia
- Elizabeth G Holliday
  - Centre for Clinical Epidemiology and Biostatistics, School of Medicine and Public Health, Faculty of Health and Medicine, University of Newcastle, Newcastle, New South Wales, Australia; Clinical Research Design, IT and Statistical Support Unit, Hunter Medical Research Institute, Newcastle, New South Wales, Australia
- John Attia
  - Centre for Clinical Epidemiology and Biostatistics, School of Medicine and Public Health, Faculty of Health and Medicine, University of Newcastle, Newcastle, New South Wales, Australia; Clinical Research Design, IT and Statistical Support Unit, Hunter Medical Research Institute, Newcastle, New South Wales, Australia
- Nikola A Bowden
  - Centre for Bioinformatics, Biomarker Discovery and Information-Based Medicine, Hunter Medical Research Institute, Newcastle, New South Wales, Australia; Priority Research Centre for Cancer, School of Biomedical Sciences and Pharmacy, Faculty of Health and Medicine, University of Newcastle, Newcastle, New South Wales, Australia
- Rodney J Scott
  - Centre for Bioinformatics, Biomarker Discovery and Information-Based Medicine, Hunter Medical Research Institute, Newcastle, New South Wales, Australia; Priority Research Centre for Cancer, School of Biomedical Sciences and Pharmacy, Faculty of Health and Medicine, University of Newcastle, Newcastle, New South Wales, Australia; Molecular Medicine, Pathology North, John Hunter Hospital, Newcastle, New South Wales, Australia; Discipline of Medical Genetics, School of Biomedical Sciences and Pharmacy, Faculty of Health and Medicine, University of Newcastle, University Drive, Newcastle, New South Wales, Australia
10
Identification and analysis of house-keeping and tissue-specific genes based on RNA-seq data sets across 15 mouse tissues. Gene 2015; 576:560-70. [PMID: 26551299 DOI: 10.1016/j.gene.2015.11.003] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 06/17/2015] [Revised: 10/27/2015] [Accepted: 11/03/2015] [Indexed: 12/31/2022]
Abstract
Recently, RNA-seq has become a widely used technology for transcriptome profiling due to its single-base accuracy and high throughput. In this study, we applied a computational approach to an integrated RNA-seq dataset across 15 normal mouse tissues, and consequently assigned 8408 house-keeping (HK) genes and 2581 tissue-specific (TS) genes among the UCSC RefGene annotation. Apart from some basic genomic features, we also performed expression, function and pathway analysis with clustering, DAVID and Ingenuity Pathway Analysis, indicating the physiological connections (tissues) and diverse biological roles of HK genes (fundamental processes) and TS genes (tissue-corresponding processes). Moreover, we used RT-PCR to test 18 candidate HK genes and finally identified a novel list of highly stable internal control genes: Ywhae, Ddb1, Eif4h, etc. In summary, this study provides a new HK gene and TS gene resource for further genetic and evolution research and helps us better understand morphogenesis and biological diversity in mouse.
11
Namdar-Aligoodarzi P, Mohammadparast S, Zaker-Kandjani B, Talebi Kakroodi S, Jafari Vesiehsari M, Ohadi M. Exceptionally long 5' UTR short tandem repeats specifically linked to primates. Gene 2015; 569:88-94. [PMID: 26022613 DOI: 10.1016/j.gene.2015.05.053] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 02/01/2015] [Revised: 05/12/2015] [Accepted: 05/13/2015] [Indexed: 12/23/2022]
Abstract
We have previously reported genome-scale short tandem repeats (STRs) in the core promoter interval (i.e. -120 to +1 to the transcription start site) of protein-coding genes that have evolved identically in primates vs. non-primates. Those STRs may function as evolutionary switch codes for primate speciation. In the current study, we used the Ensembl database to analyze the 5' untranslated region (5' UTR) between +1 and +60 of the transcription start site of the entire human protein-coding genes annotated in the GeneCards database, in order to identify "exceptionally long" STRs (≥5-repeats), which may be of selective/adaptive advantage. The importance of this critical interval is its function as core promoter, and its effect on transcription and translation. In order to minimize ascertainment bias, we analyzed the evolutionary status of the human 5' UTR STRs of ≥5-repeats in several species encompassing six major orders and superorders across mammals, including primates, rodents, Scandentia, Laurasiatheria, Afrotheria, and Xenarthra. We introduce primate-specific STRs, and STRs which have expanded from mouse to primates. Identical co-occurrence of the identified STRs of rare average frequency between 0.006 and 0.0001 in primates supports a role for those motifs in processes that diverged primates from other mammals, such as neuronal differentiation (e.g. APOD and FGF4), and craniofacial development (e.g. FILIP1L). A number of the identified STRs of ≥5-repeats may be human-specific (e.g. ZMYM3 and DAZAP1). Future work is warranted to examine the importance of the listed genes in primate/human evolution, development, and disease.
Affiliation(s)
- P Namdar-Aligoodarzi
  - Genetics Research Center, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
- S Mohammadparast
  - Genetics Research Center, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
- B Zaker-Kandjani
  - Genetics Research Center, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
- S Talebi Kakroodi
  - Genetics Research Center, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
- M Jafari Vesiehsari
  - Genetics Research Center, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
- M Ohadi
  - Genetics Research Center, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
12
Xiao Y, Zhou L, Xia W, Mason AS, Yang Y, Ma Z, Peng M. Exploiting transcriptome data for the development and characterization of gene-based SSR markers related to cold tolerance in oil palm (Elaeis guineensis). BMC Plant Biology 2014; 14:384. [PMID: 25522814 PMCID: PMC4279980 DOI: 10.1186/s12870-014-0384-2] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 08/08/2014] [Accepted: 12/12/2014] [Indexed: 05/23/2023]
Abstract
BACKGROUND: The oil palm (Elaeis guineensis, 2n = 32) has the highest oil yield of any crop species and is also the richest dietary source of provitamin A. For this tropical species, the best mean growth temperature is about 27°C, with a minimal growth temperature of 15°C. Hence, the plantation area is limited to the geographical range of 10°N to 10°S. Enhancing cold tolerance capability will increase the total cultivation area and subsequently oil productivity of this tropical species. Developing molecular markers related to cold tolerance would be helpful for molecular breeding of cold tolerant Elaeis guineensis. RESULTS: In total, 5791 gene-based SSRs were identified in 51,452 expressed sequences from Elaeis guineensis transcriptome data: approximately one SSR was detected per 10 expressed sequences. Of these 5791 gene-based SSRs, 916 were derived from expressed sequences up- or down-regulated at least two-fold in response to cold stress. A total of 182 polymorphic markers were developed and characterized from 442 primer pairs flanking these cold-responsive SSR repeats. The polymorphic information content (PIC) of these polymorphic SSR markers across 24 lines of Elaeis guineensis varied from 0.08 to 0.65 (mean = 0.31 ± 0.12). Using in-silico mapping, 137 (75.3%) of the 182 polymorphic SSR markers were located onto the 16 Elaeis guineensis chromosomes. Total coverage of 473 Mbp was achieved, with an average physical distance of 3.4 Mbp between adjacent markers (range 96 bp - 20.8 Mbp). Meanwhile, comparative analysis of the transcriptome under cold stress revealed that one ICE1 putative ortholog, five CBF putative orthologs, 19 NAC transcription factors and four cold-induced orthologs were up-regulated at least two-fold in response to cold stress. Interestingly, the 5' untranslated regions of Unigene21287 (ICE1) and CL2628.Contig1 (NAC) each contained an SSR marker. CONCLUSIONS: In the present study, a series of SSR markers were developed based on sequences differentially expressed in response to cold stress. These EST-SSR markers would be particularly useful for gene mapping and population structure analysis in Elaeis guineensis. Meanwhile, the EST-SSR loci were inducibly expressed in response to low temperature, which may have potential application in identifying trait-associated markers in oil palm in the future.
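As a hedged illustration of how gene-based SSRs such as those counted in this abstract can be located in expressed sequences, the Python sketch below scans a sequence for perfect di-, tri- and tetranucleotide repeats with a regular expression. The function name `find_ssrs` and the minimum repeat thresholds are assumptions chosen for illustration only; the abstract does not state which search tool or criteria the authors used. For scale, 5791 SSRs in 51,452 sequences works out to about one SSR per 8.9 sequences, consistent with the reported "approximately one per 10".

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem copies.
# These are illustrative defaults, not the criteria used in the cited study.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for each perfect SSR found in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-mer followed by at least n - 1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "ATAT"), so each repeat is reported once at most.
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return hits
```

Calling `find_ssrs` on a sequence that contains seven copies of AC and five copies of AAT returns one tuple per repeat, giving the 0-based start position, the motif and the copy number. Mononucleotide runs are not reported because the threshold table has no entry for motif length 1.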
Affiliation(s)
- Yong Xiao
  - Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Wenchang, Hainan 571339 P.R. China
- Lixia Zhou
  - Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Wenchang, Hainan 571339 P.R. China
- Wei Xia
  - Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Wenchang, Hainan 571339 P.R. China
- Annaliese S Mason
  - School of Agriculture and Food Sciences and Centre for Integrative Legume Research, the University of Queensland, 4072 Brisbane, Australia
- Yaodong Yang
  - Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Wenchang, Hainan 571339 P.R. China
- Zilong Ma
  - Institute of Tropical Bioscience and Biotechnology, Chinese Academy of Tropical Agricultural Science, Haikou, Hainan 571101 P.R. China
- Ming Peng
  - Institute of Tropical Bioscience and Biotechnology, Chinese Academy of Tropical Agricultural Science, Haikou, Hainan 571101 P.R. China
13
Abe H, Gemmell NJ. Abundance, arrangement, and function of sequence motifs in the chicken promoters. BMC Genomics 2014; 15:900. [PMID: 25318583 PMCID: PMC4203960 DOI: 10.1186/1471-2164-15-900] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 04/04/2014] [Accepted: 10/08/2014] [Indexed: 01/01/2023] Open
Abstract
Background: Eukaryotic promoters are regions containing various sequence motifs necessary to control gene transcription. Much evidence has emerged showing that structural and/or contextual changes in regulatory elements can critically affect cis-regulatory activity. As sequence motifs can be key factors in maintaining complex promoter architectures, one effective approach to further understand the evolution of promoter regions in vertebrates is to compare the abundance and distribution patterns of sequence motifs in these regions between divergent species. When compared with mammals, the chicken (Gallus gallus) has a very different genome composition and sufficient genomic information to make it a good model for the exploration of promoter structure and evolution. Results: More than 10% of chicken genes contained a short tandem repeat (STR) in the region 2 kb upstream of promoters, but the total number of STRs observed in chicken is approximately half of that detected in human promoters. In terms of the STR motif frequencies, chicken promoter regions were more similar to other avian and mammalian promoters than these were to the entire chicken genome. Unlike other STRs, nearly half of the trinucleotide repeats found in promoters partly or entirely overlapped with CpG islands, indicating potential association with nucleosome positions. Moreover, the chicken promoters are abundant with sequence motifs such as poly-A, poly-G and G-quadruplexes, especially in the core region, that are otherwise rare in the genome. Most sequence motifs showed strong functional enrichment for particular gene ontology (GO) categories, indicating roles in regulation of transcription and gene expression, as well as immune response and cognition. Conclusions: Chicken promoter regions share some, but not all, of the structural features observed in mammalian promoters. The findings presented here provide empirical evidence suggesting that the frequencies and locations of STR motifs have been conserved through promoter evolution in a lineage-specific manner. Correlation analysis between GO categories and sequence motifs suggests motif-specific constraints acting on gene function.
Affiliation(s)
- Hideaki Abe
- Department of Anatomy, University of Otago, Dunedin, New Zealand.
14
Zhang J, Ma W, Song X, Lin Q, Gui JF, Mei J. Characterization and development of EST-SSR markers derived from transcriptome of yellow catfish. Molecules 2014; 19:16402-15. [PMID: 25314602 PMCID: PMC6271634 DOI: 10.3390/molecules191016402] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Received: 08/06/2014] [Revised: 09/28/2014] [Accepted: 09/29/2014] [Indexed: 11/16/2022]
Abstract
Yellow catfish (Pelteobagrus fulvidraco) is one of the most important freshwater fish due to its delicious flesh and high nutritional value. However, the lack of sufficient simple sequence repeat (SSR) markers has hampered the progress of genetic selection breeding and molecular research for yellow catfish. To this end, we aimed to develop and characterize polymorphic expressed sequence tag (EST)–SSRs from the 454 pyrosequencing transcriptome of yellow catfish. In total, 82,794 potential EST-SSR markers were identified, distributed across the coding and non-coding regions. Di-nucleotide (53,933) is the most abundant motif type, and AC/GT, AAT/ATT and AAAT/ATTT are, respectively, the most frequent di-, tri- and tetra-nucleotide repeats. We designed primer pairs for all of the identified EST-SSRs and randomly selected 300 of these pairs for further validation. Of these, 263 primer pairs amplified successfully and 57 primer pairs were found to be consistently polymorphic when four populations of 48 individuals were tested. The number of alleles for the 57 loci ranged from 2 to 17, with an average of 8.23. The observed heterozygosity (HO), expected heterozygosity (HE), polymorphism information content (PIC) and fixation index (FIS) values ranged from 0.04 to 1.00, 0.12 to 0.92, 0.12 to 0.91 and −0.83 to 0.93, respectively. The EST-SSR markers generated in this study could greatly facilitate future studies of genetic diversity and molecular breeding in yellow catfish.
Affiliation(s)
- Jin Zhang
- Key Laboratory of Freshwater Animal Breeding, Ministry of Agriculture, Freshwater Aquaculture Collaborative Innovation Center of Hubei Province, College of Fisheries, Huazhong Agricultural University, Wuhan 430070, China.
- Wenge Ma
- Key Laboratory of Freshwater Animal Breeding, Ministry of Agriculture, Freshwater Aquaculture Collaborative Innovation Center of Hubei Province, College of Fisheries, Huazhong Agricultural University, Wuhan 430070, China.
- Xiaomin Song
- Key Laboratory of Freshwater Animal Breeding, Ministry of Agriculture, Freshwater Aquaculture Collaborative Innovation Center of Hubei Province, College of Fisheries, Huazhong Agricultural University, Wuhan 430070, China.
- Qiaohong Lin
- Key Laboratory of Freshwater Animal Breeding, Ministry of Agriculture, Freshwater Aquaculture Collaborative Innovation Center of Hubei Province, College of Fisheries, Huazhong Agricultural University, Wuhan 430070, China.
- Jian-Fang Gui
- Key Laboratory of Freshwater Animal Breeding, Ministry of Agriculture, Freshwater Aquaculture Collaborative Innovation Center of Hubei Province, College of Fisheries, Huazhong Agricultural University, Wuhan 430070, China.
- Jie Mei
- Key Laboratory of Freshwater Animal Breeding, Ministry of Agriculture, Freshwater Aquaculture Collaborative Innovation Center of Hubei Province, College of Fisheries, Huazhong Agricultural University, Wuhan 430070, China.
15
Bolton KA, Ross JP, Grice DM, Bowden NA, Holliday EG, Avery-Kiejda KA, Scott RJ. STaRRRT: a table of short tandem repeats in regulatory regions of the human genome. BMC Genomics 2013; 14:795. [PMID: 24228761 PMCID: PMC3840602 DOI: 10.1186/1471-2164-14-795] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 05/27/2013] [Accepted: 11/05/2013] [Indexed: 11/22/2022]
Abstract
Background: Tandem repeats (TRs) are unstable regions commonly found within genomes that have consequences for evolution and disease. In humans, polymorphic TRs are known to cause neurodegenerative and neuromuscular disorders as well as being associated with complex diseases such as diabetes and cancer. If present in upstream regulatory regions, TRs can modify chromatin structure and affect transcription, resulting in altered gene expression and protein abundance. The most common TRs are short tandem repeats (STRs), or microsatellites. Promoter-located STRs are considerably more polymorphic than coding-region STRs. As such, they may be a common driver of phenotypic variation. To study STRs located in regulatory regions, we have performed a genome-wide analysis to identify all STRs present in a region that is 2 kilobases upstream and 1 kilobase downstream of the transcription start sites of genes. Results: The Short Tandem Repeats in Regulatory Regions Table, STaRRRT, contains the results of the genome-wide analysis, outlining the characteristics of 5,264 STRs present in the upstream regulatory region of 4,441 human genes. Gene set enrichment analysis has revealed significant enrichment for STRs in cellular, transcriptional and neurological system gene promoters and genes important in ion and calcium homeostasis. The set of enriched terms has broad similarity to that seen in coding regions, suggesting that regulatory-region STRs are subject to similar evolutionary pressures as STRs in coding regions and may, like coding-region STRs, have an important role in controlling gene expression. Conclusions: STaRRRT is a readily searchable resource for investigating potentially polymorphic STRs that could influence the expression of any gene of interest. The processes and genes enriched for regulatory-region STRs provide potential novel targets for diagnosing and treating disease, and support a role for these STRs in the evolution of the human genome.
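The search region described in this abstract (2 kb upstream and 1 kb downstream of each transcription start site) depends on gene orientation, which is easy to get wrong. The Python sketch below illustrates one way to compute such a window. The function name, the 1-based coordinate convention, and the clipping at position 1 are assumptions made for illustration; the abstract does not state how the authors handled strand or coordinates.

```python
def regulatory_window(tss, strand, upstream=2000, downstream=1000):
    """Return (start, end) genomic coordinates of the window around a TSS.

    Assumes 1-based, inclusive coordinates (a hypothetical convention).
    "Upstream" means lower coordinates on the + strand and higher
    coordinates on the - strand.
    """
    if strand == "+":
        start, end = tss - upstream, tss + downstream
    elif strand == "-":
        start, end = tss - downstream, tss + upstream
    else:
        raise ValueError("strand must be '+' or '-'")
    # Clip at the chromosome start so the window never goes below position 1.
    return max(start, 1), end


plus_window = regulatory_window(10_000, "+")
minus_window = regulatory_window(10_000, "-")
```

For a gene on the minus strand the window is mirrored, so the longer 2 kb arm extends toward higher coordinates.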
Affiliation(s)
- Rodney J Scott
- Centre for Information-Based Medicine, Hunter Medical Research Institute, Newcastle, NSW, Australia.
16
Llera-Herrera R, García-Gasca A, Abreu-Goodger C, Huvet A, Ibarra AM. Identification of male gametogenesis expressed genes from the scallop Nodipecten subnodosus by suppressive subtraction hybridization and pyrosequencing. PLoS One 2013; 8:e73176. [PMID: 24066034 PMCID: PMC3774672 DOI: 10.1371/journal.pone.0073176] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Received: 04/01/2013] [Accepted: 07/17/2013] [Indexed: 01/01/2023]
Abstract
Despite the great advances in sequencing technologies, genomic and transcriptomic information for marine non-model species with ecological, evolutionary, and economical interest is still scarce. In this work we aimed to identify genes expressed during spermatogenesis in the functional hermaphrodite scallop Nodipecten subnodosus (Mollusca: Bivalvia: Pectinidae), with the purpose of obtaining a panel of genes that would allow for the study of differentially transcribed genes between diploid and triploid scallops in the context of meiotic arrest and reproductive sterility. Because our aim was to isolate genes involved in meiosis and other testis maturation-related processes, we generated suppressive subtractive hybridization libraries of testis vs. inactive gonad. We obtained 352 and 177 ESTs by clone sequencing, and using pyrosequencing (454-Roche) we maximized the identified ESTs to 34,276 reads. A total of 1,153 genes from the testis library had a blastx hit and GO annotation, including genes specific for meiosis, spermatogenesis, sex-differentiation, and transposable elements. Some of the identified meiosis genes function in chromosome pairing (scp2, scp3), recombination and DNA repair (dmc1, rad51, ccnb1ip1/hei10), and meiotic checkpoints (rad1, hormad1, dtl/cdt2). Gene expression analyses in different gametogenic stages in both sexual regions of the gonad of meiosis genes confirmed that the expression was specific or increased towards the maturing testis. Spermatogenesis genes included known testis-specific ones (kelch-10, shippo1, adad1), with some of these known to be associated to sterility. Sex differentiation genes included one of the most conserved genes at the bottom of the sex-determination cascade (dmrt1). Transcript from transposable elements, reverse transcriptase, and transposases in this library evidenced that transposition is an active process during spermatogenesis in N. subnodosus. 
In relation to the inactive library, we identified 833 transcripts with functional annotation related to activation of the transcription and translation machinery, as well as to germline control and maintenance.
Affiliation(s)
- Raúl Llera-Herrera
- Aquaculture Genetics and Breeding Laboratory, Centro de Investigaciones Biológicas del Noroeste, La Paz, Baja California Sur, Mexico
- Cei Abreu-Goodger
- Laboratorio Nacional de Genómica para la Biodiversidad (Langebio), Centro de Investigación y Estudios Avanzados del Instituto Politécnico Nacional (CINVESTAV-IPN), Irapuato, Guanajuato, Mexico
- Arnaud Huvet
- Laboratoire des Sciences de l'Environnement Marin, Institut Français de Recherche pour l'Exploitation de la Mer (IFREMER), Centre de Bretagne, Plouzané, France
- Ana M. Ibarra
- Aquaculture Genetics and Breeding Laboratory, Centro de Investigaciones Biológicas del Noroeste, La Paz, Baja California Sur, Mexico
17
Eisenberg E, Levanon EY. Human housekeeping genes, revisited. Trends Genet 2013; 29:569-74. [PMID: 23810203 DOI: 10.1016/j.tig.2013.05.010] [Citation(s) in RCA: 831] [Impact Index Per Article: 75.5] [Received: 04/15/2013] [Revised: 05/06/2013] [Accepted: 05/30/2013] [Indexed: 10/26/2022]
Abstract
Housekeeping genes are involved in basic cell maintenance and, therefore, are expected to maintain constant expression levels in all cells and conditions. Identification of these genes facilitates exposure of the underlying cellular infrastructure and increases understanding of various structural genomic features. In addition, housekeeping genes are instrumental for calibration in many biotechnological applications and genomic studies. Advances in our ability to measure RNA expression have resulted in a gradual increase in the number of identified housekeeping genes. Here, we describe housekeeping gene detection in the era of massively parallel sequencing and RNA-seq. We emphasize the importance of expression at a constant level and provide a list of 3804 human genes that are expressed uniformly across a panel of tissues. Several exceptionally uniform genes are singled out for future experimental use, such as RT-PCR control genes. Finally, we discuss both the ways in which current technology can overcome some of the obstacles encountered in the past, and several as yet unmet challenges.
Affiliation(s)
- Eli Eisenberg
- Raymond and Beverly Sackler School of Physics and Astronomy, Tel-Aviv University, Tel Aviv 69978, Israel.
18
Sawaya S, Bagshaw A, Buschiazzo E, Kumar P, Chowdhury S, Black MA, Gemmell N. Microsatellite tandem repeats are abundant in human promoters and are associated with regulatory elements. PLoS One 2013; 8:e54710. [PMID: 23405090 PMCID: PMC3566118 DOI: 10.1371/journal.pone.0054710] [Citation(s) in RCA: 110] [Impact Index Per Article: 10.0] [Received: 10/03/2012] [Accepted: 12/18/2012] [Indexed: 12/13/2022]
Abstract
Tandem repeats are genomic elements that are prone to changes in repeat number and are thus often polymorphic. These sequences are found at a high density at the start of human genes, in the gene’s promoter. Increasing empirical evidence suggests that length variation in these tandem repeats can affect gene regulation. One class of tandem repeats, known as microsatellites, rapidly alter in repeat number. Some of the genetic variation induced by microsatellites is known to result in phenotypic variation. Recently, our group developed a novel method for measuring the evolutionary conservation of microsatellites, and with it we discovered that human microsatellites near transcription start sites are often highly conserved. In this study, we examined the properties of microsatellites found in promoters. We found a high density of microsatellites at the start of genes. We showed that microsatellites are statistically associated with promoters using a wavelet analysis, which allowed us to test for associations on multiple scales and to control for other promoter related elements. Because promoter microsatellites tend to be G/C rich, we hypothesized that G/C rich regulatory elements may drive the association between microsatellites and promoters. Our results indicate that CpG islands, G-quadruplexes (G4) and untranslated regulatory regions have highly significant associations with microsatellites, but controlling for these elements in the analysis does not remove the association between microsatellites and promoters. Due to their intrinsic lability and their overlap with predicted functional elements, these results suggest that many promoter microsatellites have the potential to affect human phenotypes by generating mutations in regulatory elements, which may ultimately result in disease. We discuss the potential functions of human promoter microsatellites in this context.
Affiliation(s)
- Sterling Sawaya
- Centre for Reproduction and Genomics, Department of Anatomy, and Allan Wilson Centre for Molecular Ecology and Evolution, University of Otago, Dunedin, New Zealand.
19
Marum L, Miguel A, Ricardo CP, Miguel C. Reference gene selection for quantitative real-time PCR normalization in Quercus suber. PLoS One 2012; 7:e35113. [PMID: 22529976 PMCID: PMC3329553 DOI: 10.1371/journal.pone.0035113] [Citation(s) in RCA: 69] [Impact Index Per Article: 5.8] [Received: 10/31/2011] [Accepted: 03/12/2012] [Indexed: 12/11/2022]
Abstract
The use of reverse transcription quantitative PCR technology to assess gene expression levels requires an accurate normalization of data in order to avoid misinterpretation of experimental results and erroneous analyses. Despite being the focus of several transcriptomics projects, oaks, and particularly cork oak (Quercus suber), have not been investigated regarding the identification of reference genes suitable for the normalization of real-time quantitative PCR data. In this study, ten candidate reference genes (Act, CACs, EF-1α, GAPDH, His3, PsaH, Sand, PP2A, β-Tub and Ubq) were evaluated to determine the most stable internal reference for quantitative PCR normalization in cork oak. The transcript abundance of these genes was analysed in several tissues of cork oak, including leaves, reproduction cork, and periderm from branches at different developmental stages (1-, 2-, and 3-year old) or collected on different dates (active growth period versus dormancy). The three statistical methods (geNorm, NormFinder, and CV method) used in the evaluation of the most suitable combination of reference genes identified Act and CACs as the most stable candidates when all the samples were analysed together, while β-Tub and PsaH showed the lowest expression stability. However, when different tissues, developmental stages, and collection dates were analysed separately, the reference genes exhibited some variation in their expression levels. In this study, and for the first time, we have identified and validated reference genes in cork oak that can be used for quantification of target gene expression in different tissues and experimental conditions and will be useful as a starting point for gene expression studies in other oaks.
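Of the three stability measures named in this abstract, the CV method is the simplest to illustrate: candidate genes are ranked by the coefficient of variation (standard deviation divided by mean) of their expression across samples, with lower values indicating greater stability. The Python sketch below shows the idea. The expression values are invented for illustration, the function names are hypothetical, and the use of the population standard deviation is an assumption; none of this is the study's data or code.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Return the population standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean expression is zero; CV is undefined")
    return pstdev(values) / m


def rank_by_stability(expression):
    """Sort gene names from most stable (lowest CV) to least stable."""
    return sorted(expression, key=lambda g: coefficient_of_variation(expression[g]))


# Invented relative expression values across four samples (illustration only).
toy_expression = {
    "PsaH": [5.0, 9.0, 2.0, 12.0],
    "Act": [10.0, 10.2, 9.9, 10.1],
    "CACs": [8.0, 8.3, 7.9, 8.1],
}
ranking = rank_by_stability(toy_expression)
```

geNorm and NormFinder use pairwise and model-based criteria respectively, so their rankings can differ from a plain CV ranking on real data.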
Affiliation(s)
- Liliana Marum
- Instituto de Biologia Experimental e Tecnológica (IBET) / Instituto de Tecnologia Química e Biológica-Universidade Nova de Lisboa (ITQB-UNL), Oeiras, Portugal
- Célia Miguel
- Instituto de Biologia Experimental e Tecnológica (IBET) / Instituto de Tecnologia Química e Biológica-Universidade Nova de Lisboa (ITQB-UNL), Oeiras, Portugal
20
Wang H, Huan P, Lu X, Liu B. Mining of EST-SSR markers in clam Meretrix meretrix larvae from 454 shotgun transcriptome. Genes Genet Syst 2012; 86:197-205. [PMID: 21952209 DOI: 10.1266/ggs.86.197] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 11/23/2022]
Abstract
A total of 2,970 EST-SSRs (2.38%) were identified by transcriptome sequencing of the clam Meretrix meretrix (751,970 reads, ~310.82 Mbp), using the 454 Genome Sequencer FLX next-generation sequencing platform. Dinucleotide SSRs were the dominant repeat type (40.2%), followed by trinucleotide (37.8%), tetranucleotide (12.0%) and pentanucleotide (2.0%) SSRs. The dominant repeat motif was AT (71.3%) in the dinucleotide SSR type and AAC (45.6%) in the trinucleotide SSR type. Nearly 79% of all microsatellites had flanking sequences suitable for PCR primer design. Half of PAL were found to be polymorphic in a subset of 40 primer pairs randomly selected. Specifically, the density of dinucleotide, trinucleotide and tetranucleotide repeats showed significant variation among four developmental stages (trochophore, D-veliger, pediveliger and postlarva). The results suggested that dinucleotide, trinucleotide and tetranucleotide SSRs may play an important role in contributing to the different expression profiles in larval stages.
Affiliation(s)
- Hongxia Wang
- Key Laboratory of Experimental Marine Biology, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, China
21
Tranbarger TJ, Kluabmongkol W, Sangsrakru D, Morcillo F, Tregear JW, Tragoonrung S, Billotte N. SSR markers in transcripts of genes linked to post-transcriptional and transcriptional regulatory functions during vegetative and reproductive development of Elaeis guineensis. BMC Plant Biology 2012; 12:1. [PMID: 22214433 PMCID: PMC3282652 DOI: 10.1186/1471-2229-12-1] [Citation(s) in RCA: 81] [Impact Index Per Article: 6.8] [Received: 07/12/2011] [Accepted: 01/03/2012] [Indexed: 05/18/2023]
Abstract
Background: The oil palm (Elaeis guineensis Jacq.) is a perennial monocotyledonous tropical crop species that is now the world's number one source of edible vegetable oil, and the richest dietary source of provitamin A. While new elite genotypes from traditional breeding programs provide steady yield increases, the long selection cycle (10-12 years) and the large areas required to cultivate oil palm make genetic improvement slow and labor intensive. Molecular breeding programs have the potential to make significant impacts on the rate of genetic improvement but the limited molecular resources, in particular the lack of molecular markers for agronomic traits of interest, restrict the application of molecular breeding schemes for oil palm. Results: In the current study, 6,103 non-redundant ESTs derived from cDNA libraries of developing vegetative and reproductive tissues were annotated and searched for simple sequence repeats (SSRs). Primer pairs from sequences flanking 289 EST-SSRs were tested to detect polymorphisms in elite breeding parents and their crosses. 230 of these amplified PCR products, 88 of which were polymorphic within the breeding material tested. A detailed analysis and annotation of the EST-SSRs revealed the locations of the polymorphisms within the transcripts, and that the main functional category was related to transcription and post-transcriptional regulation. Indeed, SSR polymorphisms were found in sequences encoding AP2-like, bZIP, zinc finger, MADS-box, and NAC-like transcription factors in addition to other transcriptional regulatory proteins and several RNA interacting proteins. Conclusions: The identification of new EST-SSRs that detect polymorphisms in elite breeding material provides tools for molecular breeding strategies. The identification of SSRs within transcripts, in particular those that encode proteins involved in transcriptional and post-transcriptional regulation, will allow insight into the functional roles of these proteins by studying the phenotypic traits that cosegregate with these markers. Finally, the oil palm EST-SSRs derived from vegetative and reproductive development will be useful for studies on the evolution of the functional diversity within the palm family.
Affiliation(s)
- Timothy John Tranbarger
- IRD, UMR DIADE (IRD, UM2), 911 Avenue Agropolis BP 64501, 34394, Montpellier cedex 5, France
- Wanwisa Kluabmongkol
- Genome Institute, National Center for Genetic Engineering and Biotechnology (BIOTEC), 113 Thailand Science Park, Phahonyothin Road, Klong 1, Klong Luang, Pathumthani 12120, Thailand
- Duangjai Sangsrakru
- Genome Institute, National Center for Genetic Engineering and Biotechnology (BIOTEC), 113 Thailand Science Park, Phahonyothin Road, Klong 1, Klong Luang, Pathumthani 12120, Thailand
- James W Tregear
- IRD, UMR DIADE (IRD, UM2), 911 Avenue Agropolis BP 64501, 34394, Montpellier cedex 5, France
- Somvong Tragoonrung
- Genome Institute, National Center for Genetic Engineering and Biotechnology (BIOTEC), 113 Thailand Science Park, Phahonyothin Road, Klong 1, Klong Luang, Pathumthani 12120, Thailand
22
Promoter microsatellites as modulators of human gene expression. Advances in Experimental Medicine and Biology 2012; 769:41-54. [PMID: 23560304 DOI: 10.1007/978-1-4614-5434-2_4] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Indexed: 01/06/2023]
Abstract
Microsatellites in and around genes have been shown to modulate levels of gene expression in multiple organisms, ranging from bacteria to humans. Here we will discuss promoter microsatellites known to modulate gene expression, with a few key examples related to the human brain. Many of the microsatellites we discuss are highly conserved in mammals, indicating that selection may favor their retention as "tuning knobs" of gene expression. We will also discuss the mechanisms by which microsatellites in promoters can alter gene expression as they expand and contract, with particular attention to secondary structures like Z-DNA and H-DNA. We suggest that promoter microsatellites, especially those that are highly conserved, may be an important source of human phenotypic variation.
23
Dong B, Zhang P, Chen X, Liu L, Wang Y, He S, Chen R. Predicting housekeeping genes based on Fourier analysis. PLoS One 2011; 6:e21012. [PMID: 21687628 PMCID: PMC3110801 DOI: 10.1371/journal.pone.0021012] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 10/22/2010] [Accepted: 05/18/2011] [Indexed: 11/19/2022]
Abstract
Housekeeping genes (HKGs) generally have fundamental functions in basic biochemical processes in organisms, and usually have relatively steady expression levels across various tissues. They play an important role in the normalization of microarray data. Using Fourier analysis, we transformed gene expression time-series from a HeLa cell cycle gene expression dataset into Fourier spectra. We then designed a computational method for discriminating between HKGs and non-HKGs using the support vector machine (SVM) supervised learning algorithm, which can extract significant features of the spectra and so provides a basis for identifying specific gene expression patterns. Using our method we identified 510 human HKGs, and then validated them by comparison with two independent sets of tissue expression profiles. Results showed that our predicted HKG set is more reliable than three previously identified sets of HKGs.
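The first step described in this abstract, turning an expression time-series into a Fourier spectrum, can be sketched with a plain discrete Fourier transform. The Python example below is an illustration only: the function name, the mean-centering, and the amplitude normalization are assumptions, and the SVM classification step is omitted entirely. A steadily expressed gene yields a flat, near-zero spectrum, while a cell-cycle-periodic gene shows a peak at its cycling frequency.

```python
import cmath
import math


def magnitude_spectrum(series):
    """Return DFT magnitudes (frequencies 0..n/2) of a mean-centered series."""
    n = len(series)
    avg = sum(series) / n
    centered = [x - avg for x in series]  # remove the constant (DC) component
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        spectrum.append(abs(coeff) / n)  # normalize by series length
    return spectrum


# Toy profiles (illustration only): constant expression vs. two cycles in 16 time points.
steady = [5.0] * 16
periodic = [5.0 + math.sin(2 * math.pi * 2 * t / 16) for t in range(16)]

steady_spectrum = magnitude_spectrum(steady)
periodic_spectrum = magnitude_spectrum(periodic)
peak_frequency = max(range(len(periodic_spectrum)), key=periodic_spectrum.__getitem__)
```

Spectra like these could then serve as feature vectors for a classifier; how the authors built and selected their features is not stated in the abstract.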
Affiliation(s)
- Bo Dong
- Bioinformatics Laboratory and National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, People's Republic of China
- Graduate School of the Chinese Academy of Sciences, Beijing, People's Republic of China
- Peng Zhang
- Bioinformatics Laboratory and National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, People's Republic of China
- Graduate School of the Chinese Academy of Sciences, Beijing, People's Republic of China
- Xiaowei Chen
- Bioinformatics Laboratory and National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, People's Republic of China
- Graduate School of the Chinese Academy of Sciences, Beijing, People's Republic of China
- Li Liu
- Key Laboratory of the Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, People's Republic of China
- Graduate School of the Chinese Academy of Sciences, Beijing, People's Republic of China
- Yunfei Wang
- Bioinformatics Laboratory and National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, People's Republic of China
- Graduate School of the Chinese Academy of Sciences, Beijing, People's Republic of China
- Shunmin He
- Key Laboratory of the Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, People's Republic of China
- Runsheng Chen
- Bioinformatics Laboratory and National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, People's Republic of China
24
Amino Acid Biosynthetic Cost and Protein Conservation. J Mol Evol 2011; 72:466-73. [DOI: 10.1007/s00239-011-9445-4] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 12/09/2010] [Accepted: 04/13/2011] [Indexed: 12/24/2022]
25
Kumar RP, Senthilkumar R, Singh V, Mishra RK. Repeat performance: how do genome packaging and regulation depend on simple sequence repeats? Bioessays 2010; 32:165-74. [PMID: 20091758 DOI: 10.1002/bies.200900111] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Indexed: 12/18/2022]
Abstract
Non-coding DNA has consistently increased during evolution of higher eukaryotes. Since the number of genes has remained relatively static during the evolution of complex organisms, it is believed that increased degree of sophisticated regulation of genes has contributed to the increased complexity. A higher proportion of non-coding DNA, including repeats, is likely to provide more complex regulatory potential. Here, we propose that repeats play a regulatory role by contributing to the packaging of the genome during cellular differentiation. Repeats, and in particular the simple sequence repeats, are proposed to serve as landmarks that can target regulatory mechanisms to a large number of genomic sites with the help of very few factors and regulate the linked loci in a coordinated manner. Repeats may, therefore, function as common target sites for regulatory mechanisms involved in the packaging and dynamic compartmentalization of the chromatin into active and inactive regions during cellular differentiation.
Affiliation(s)
- Ram Parikshan Kumar
- Centre for Cellular and Molecular Biology, Uppal Road, Hyderabad 500 007, India
26
Vinogradov AE. Human transcriptome nexuses: basic-eukaryotic and metazoan. Genomics 2010; 95:345-54. [PMID: 20298777 DOI: 10.1016/j.ygeno.2010.03.004] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 09/15/2009] [Revised: 03/01/2010] [Accepted: 03/08/2010] [Indexed: 01/10/2023]
Abstract
Using a new approach, I analysed the human transcriptome coexpression network and revealed two large-scale nexuses. Besides gene coexpression, each nexus is characterized by a combination of gene evolutionary origin, function and among-tissues expression breadth. The first nexus contains mostly genes of pre-metazoan origin, which are widely expressed and have cell-centred functions. The second nexus is enriched in genes of metazoan origin, which are expressed more narrowly and have organism-centred functions. The revealed nexuses are supported by asymmetry in the distribution of transcription factor targets between them. Within the metazoan nexus, there is a subnexus that is more pronounced in the nervous tissues and is enriched in gene regulatory complexity. It mostly contains genes related to the nervous system, cell communication and multicellular organism processes and development. The revealed nexuses indicate a dichotomy in transcriptional regulation and can provide a framework for further functional genomics studies.
27
Kozlowski P, de Mezer M, Krzyzosiak WJ. Trinucleotide repeats in human genome and exome. Nucleic Acids Res 2010; 38:4027-39. [PMID: 20215431 PMCID: PMC2896521 DOI: 10.1093/nar/gkq127] [Citation(s) in RCA: 94] [Impact Index Per Article: 6.7] [Indexed: 11/24/2022]
Abstract
Trinucleotide repeats (TNRs) are of interest in genetics because they are used as markers for tracing genotype–phenotype relations and because they are directly involved in numerous human genetic diseases. In this study, we searched the human genome reference sequence and annotated exons (exome) for the presence of uninterrupted triplet repeat tracts composed of six or more repeated units. A list of 32 448 TNRs and 878 TNR-containing genes was generated and is provided herein. We found that some triplet repeats, specifically CNG, are overrepresented, while CTT, ATC, AAC and AAT are underrepresented in exons. This observation suggests that the occurrence of TNRs in exons is not random, but undergoes positive or negative selective pressure. Additionally, TNR types strongly determine their localization in mRNA sections (ORF, UTRs). Most genes containing exon-overrepresented TNRs are associated with gene ontology-defined functions. Surprisingly, many groups of genes that contain TNR types coding for different homo-amino acid tracts associate with the same transcription-related GO categories. We propose that TNRs have potential to be functional genetic elements and that their variation may be involved in the regulation of many common phenotypes; as such, TNR polymorphisms should be considered a priority in association studies.
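The search criterion in this abstract (uninterrupted triplet repeat tracts of six or more units) maps naturally onto a regular expression with a back-reference. The Python sketch below is a minimal illustration under stated assumptions: the function name is hypothetical, only one strand is scanned, homopolymer runs are skipped, and the motif is reported in the reading frame of the leftmost match. It is not the authors' pipeline.

```python
import re


def find_triplet_tracts(sequence, min_units=6):
    """Return (start, motif, units) for uninterrupted triplet repeat tracts.

    start is a 0-based offset into the sequence; units is the number of
    consecutive copies of the 3-nt motif.
    """
    # One captured triplet followed by at least (min_units - 1) exact copies.
    pattern = re.compile(r"([ACGT]{3})\1{%d,}" % (min_units - 1))
    tracts = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip mononucleotide runs such as AAAAAA...
        tracts.append((match.start(), motif, len(match.group(0)) // 3))
    return tracts


# Toy sequence: a 7-unit CAG tract plus a 3-unit CTG tract that is too short to report.
toy_sequence = "TTA" + "CAG" * 7 + "TTAC" + "CTG" * 3
tracts = find_triplet_tracts(toy_sequence)
```

Because the same tract can be read in three frames (CAG, AGC or GCA), a real analysis would normally collapse motifs to a canonical form and also consider the reverse complement.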
Affiliation(s)
- Piotr Kozlowski
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland.
|
28
|
Abstract
Single nucleotide polymorphisms (SNPs) are widely distributed in the human genome and although most SNPs are the result of independent point-mutations, there are exceptions. When studying distances between SNPs, a periodic pattern in the distance between pairs of identical SNPs has been found to be heavily correlated with periodicity in short tandem repeats (STRs). STRs are short DNA segments, widely distributed in the human genome and mainly found outside known tandem repeats. Because of the biased occurrence of SNPs, special care has to be taken when analyzing SNP-variation in STRs. We present a review of STRs in the human genome and discuss molecular mechanisms related to the biased occurrence of SNPs in STRs, and its implications for genome comparisons and genetic association studies.
Affiliation(s)
- Bo Eskerod Madsen
- AgroTech, Institute for Agri Technology and Food Innovation, Aarhus N, Denmark
|
29
|
Tandem repeats modify the structure of human genes hosted in segmental duplications. Genome Biol 2009; 10:R137. [PMID: 19954527 PMCID: PMC2812944 DOI: 10.1186/gb-2009-10-12-r137] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.0] [Received: 07/23/2009] [Revised: 10/08/2009] [Accepted: 12/02/2009] [Indexed: 12/14/2022]
Abstract
BACKGROUND: Recently duplicated genes are often subject to genomic rearrangements that can lead to the development of novel gene structures. Here we specifically investigated the effect of variations in internal tandem repeats (ITRs) on the gene structure of human paralogs located in segmental duplications. RESULTS: We found that around 7% of the primate-specific genes located within duplicated regions of the genome contain variable tandem repeats. These genes are members of large groups of recently duplicated paralogs that are often polymorphic in the human population. Half of the identified ITRs occur within coding exons and may be either kept or spliced out from the mature transcript. When ITRs reside within exons, they encode variable amino acid repeats. When located at exon-intron boundaries, ITRs can generate alternative splicing patterns through the formation of novel introns. CONCLUSIONS: Our study shows that variation in the number of ITRs impacts on recently duplicated genes by modifying their coding sequence, splicing pattern, and tissue expression. The resulting effect is the production of a variety of primate-specific proteins, which mostly differ in number and sequence of amino acid repeats.
|
30
|
Morgan AA, Dudley JT, Deshpande T, Butte AJ. Dynamism in gene expression across multiple studies. Physiol Genomics 2009; 40:128-40. [PMID: 19920211 DOI: 10.1152/physiolgenomics.90403.2008] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 12/26/2022]
Abstract
In this study we develop methods of examining gene expression dynamics, that is, how and when genes change expression, and demonstrate their application in a meta-analysis involving over 29,000 microarrays. By defining measures across many experimental conditions, we have a new way of characterizing dynamics, complementary to measures looking at changes in absolute variation or breadth of tissues showing expression. We show conservation in overall patterns of dynamism across three species (human, mouse, and rat) and show associations with known disease-related genes. We discuss the enriched functional properties of the sets of genes showing different patterns of dynamics and show that the differences in expression dynamics are associated with the variety of different transcription factor regulatory sites. These results can influence thinking about the selection of genes for microarray design and the analysis of measurements of mRNA expression variation in a global context of expression dynamics across many conditions, as genes that are rarely differentially expressed between experimental conditions may be the subject of increased scrutiny when they significantly vary in expression between experimental subsets.
Affiliation(s)
- Alexander A Morgan
- Stanford Center for Biomedical Informatics Research, Department of Medicine, Stanford University, Stanford, CA 94305, USA
|
31
|
|