51
Mahdi LK, Deihimi T, Zamansani F, Fruzangohar M, Adelson DL, Paton JC, Ogunniyi AD, Ebrahimie E. A functional genomics catalogue of activated transcription factors during pathogenesis of pneumococcal disease. BMC Genomics 2014; 15:769. [PMID: 25196724 PMCID: PMC4171566 DOI: 10.1186/1471-2164-15-769] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 07/25/2014] [Accepted: 09/03/2014] [Indexed: 11/11/2022] Open
Abstract
Background: Streptococcus pneumoniae (the pneumococcus) is the world’s foremost microbial pathogen, killing more people each year than HIV, TB or malaria. The capacity to penetrate deeper host tissues contributes substantially to the ability of this organism to cause disease. Here we investigated, for the first time, functional genomics modulation of 3 pneumococcal strains (serotype 2 [D39], serotype 4 [WCH43] and serotype 6A [WCH16]) during transition from the nasopharynx to lungs to blood and to brain of mice at both promoter and domain activation levels.
Results: We found 7 highly activated transcription factors (TFs) [argR, codY, hup, rpoD, rr02, scrR and smrC] capable of binding to a large number of up-regulated genes, potentially constituting the regulatory backbone of pneumococcal pathogenesis. Strain D39 showed a distinct profile in employing a large number of TFs during blood infection. Interestingly, the same highly activated TFs used by D39 in blood are also used by WCH16 and WCH43 during brain infection. This indicates that different pneumococcal strains might activate a similar set of TFs and regulatory elements depending on the final site of infection. Hierarchical clustering analysis showed that all the highly activated TFs, except rpoD, clustered together with a high level of similarity in all 3 strains, which might suggest redundancy in the regulatory roles of these TFs during infection. Discriminant function analysis of the TFs in various niches highlights differential regulatory backgrounds of the 3 strains, and pathogenesis data confirms codY as the most significant predictor discriminating between these strains in various niches, particularly in the blood. Moreover, the predicted TF and domain activation profiles of the 3 strains correspond with their distinct pathogenicity characteristics.
Conclusions: Our findings suggest that the pneumococcus changes the short binding sites in the promoter regions of genes in a niche-specific manner to enhance its ability to disseminate from one host niche to another. This study provides a framework for an improved understanding of the dynamics of pneumococcal pathogenesis, and opens a new avenue into similar investigations in other pathogenic bacteria. Electronic supplementary material: The online version of this article (doi:10.1186/1471-2164-15-769) contains supplementary material, which is available to authorized users.
52
Gagliardi L, Schreiber AW, Hahn CN, Feng J, Cranston T, Boon H, Hotu C, Oftedal BE, Cutfield R, Adelson DL, Braund WJ, Gordon RD, Rees DA, Grossman AB, Torpy DJ, Scott HS. ARMC5 mutations are common in familial bilateral macronodular adrenal hyperplasia. J Clin Endocrinol Metab 2014; 99:E1784-92. [PMID: 24905064 DOI: 10.1210/jc.2014-1265] [Citation(s) in RCA: 75] [Impact Index Per Article: 7.5] [Indexed: 02/04/2023]
Abstract
CONTEXT Bilateral macronodular adrenal hyperplasia (BMAH) is a rare form of adrenal Cushing's syndrome. Familial cases have been reported, but at the time we conducted this study, the genetic basis of BMAH was unknown. Recently, germline variants of armadillo repeat containing 5 (ARMC5) in patients with isolated BMAH and somatic, second-hit mutations in tumor nodules, were identified. OBJECTIVE Our objective was to identify the genetic basis of familial BMAH. DESIGN We performed whole exome capture and sequencing of 2 affected individuals from each of 4 BMAH families (BMAH-01, BMAH-02, BMAH-03, and BMAH-05). Based on clinical evaluation, there were 7, 3, 3, and 4 affected individuals in these families, respectively. Sanger sequencing of ARMC5 was performed in 1 other BMAH kindred, BMAH-06. RESULTS Exome sequencing identified novel variants Chr16:g.31477540, c.2139delT, p.(Thr715Leufs*1) (BMAH-02) and Chr16:g.31473811, c.943C→T, p.(Arg315Trp) (BMAH-03) in ARMC5 (GRCh37/hg19), validated by Sanger sequencing. BMAH-01 had a recently reported mutation Chr16:g.31476121, c.1777C→T, p.(Arg593Trp). Sanger sequencing of ARMC5 in BMAH-06 identified a previously reported mutation, Chr16:g.31473688, c.799C→T, p.(Arg267*). The genetic basis of BMAH in BMAH-05 was not identified. CONCLUSIONS Our studies have detected ARMC5 mutations in 4 of 5 BMAH families tested, confirming that these mutations are a frequent cause of BMAH. Two of the 4 families had novel mutations, indicating allelic heterogeneity. Preclinical evaluation did not predict mutation status. The ARMC5-negative family had unusual prominent hyperaldosteronism. Further studies are needed to determine the penetrance of BMAH in ARMC5 mutation-positive relatives of affected patients, the practical utility of genetic screening and genotype-phenotype correlations.
53
Jiang Y, Xie M, Chen W, Talbot R, Maddox JF, Faraut T, Wu C, Muzny DM, Li Y, Zhang W, Stanton JA, Brauning R, Barris WC, Hourlier T, Aken BL, Searle SMJ, Adelson DL, Bian C, Cam GR, Chen Y, Cheng S, DeSilva U, Dixen K, Dong Y, Fan G, Franklin IR, Fu S, Guan R, Highland MA, Holder ME, Huang G, Ingham AB, Jhangiani SN, Kalra D, Kovar CL, Lee SL, Liu W, Liu X, Lu C, Lv T, Mathew T, McWilliam S, Menzies M, Pan S, Robelin D, Servin B, Townley D, Wang W, Wei B, White SN, Yang X, Ye C, Yue Y, Zeng P, Zhou Q, Hansen JB, Kristensen K, Gibbs RA, Flicek P, Warkup CC, Jones HE, Oddy VH, Nicholas FW, McEwan JC, Kijas J, Wang J, Worley KC, Archibald AL, Cockett N, Xu X, Wang W, Dalrymple BP. The sheep genome illuminates biology of the rumen and lipid metabolism. Science 2014; 344:1168-1173. [PMID: 24904168 DOI: 10.1126/science.1252806] [Citation(s) in RCA: 312] [Impact Index Per Article: 31.2] [Indexed: 12/23/2022]
Abstract
Sheep (Ovis aries) are a major source of meat, milk, and fiber in the form of wool and represent a distinct class of animals that have a specialized digestive organ, the rumen, that carries out the initial digestion of plant material. We have developed and analyzed a high-quality reference sheep genome and transcriptomes from 40 different tissues. We identified highly expressed genes encoding keratin cross-linking proteins associated with rumen evolution. We also identified genes involved in lipid metabolism that had been amplified and/or had altered tissue expression patterns. This may be in response to changes in the barrier lipids of the skin, an interaction between lipid metabolism and wool synthesis, and an increased role of volatile fatty acids in ruminants compared with nonruminant animals.
54
Lim SL, Kortschak RD, Adelson DL. Discovery of a novel long terminal repeat (LTR2i_SS) in Sus Scrofa. Anim Genet 2014; 45:367-72. [DOI: 10.1111/age.12138] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/22/2014] [Indexed: 11/29/2022]
55
Wang D, Qu Z, Adelson DL, Zhu JK, Timmis JN. Transcription of nuclear organellar DNA in a model plant system. Genome Biol Evol 2014; 6:1327-34. [PMID: 24868015 PMCID: PMC4079196 DOI: 10.1093/gbe/evu111] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Indexed: 01/18/2023] Open
Abstract
Endosymbiotic gene transfer from cytoplasmic organelles (chloroplasts and mitochondria) to the nucleus is an ongoing process in land plants. Although the frequency of organelle DNA migration is high, functional gene transfer is rare because a nuclear promoter is thought necessary for activity in the nucleus. Here we show that a chloroplast promoter, 16S rrn, drives nuclear transcription, suggesting that a transferred organellar gene may become active without obtaining a nuclear promoter. Examining the chromatin status of a known de novo chloroplast integrant indicates that plastid DNA inserts into open chromatin and that this relaxed condition is maintained after integration. Transcription of nuclear organelle DNA integrants was explored at the whole genome level by analyzing RNA-seq data of Oryza sativa subsp. japonica, and utilizing sequence polymorphisms to unequivocally discriminate nuclear organelle DNA transcripts from those of bona fide cytoplasmic organelle DNA. Nuclear copies of organelle DNA that are transcribed show a spectrum of transcriptional activity but at comparatively low levels compared with the majority of other nuclear genes.
56
Alanazi I, Ebrahimie E, Hoffmann P, Adelson DL. Combined gene expression and proteomic analysis of EGF induced apoptosis in A431 cells suggests multiple pathways trigger apoptosis. Apoptosis 2014; 18:1291-1305. [PMID: 23892916 DOI: 10.1007/s10495-013-0887-6] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Indexed: 12/19/2022]
Abstract
A431 cells, derived from epidermoid carcinoma, overexpress the epidermal growth factor receptor (EGFR) and undergo apoptosis when treated with a high dose of EGF. We used microarray and proteomics techniques and network prediction to study the regulatory mechanisms of EGF-induced apoptosis in A431 cells. We observed significant changes in the expression of 162 genes, approximately evenly split between pro-apoptotic and anti-apoptotic genes, and identified 30 proteins from the proteomic data that had either pro- or anti-apoptotic annotation. Our correlation analysis of gene expression and proteome modeled a number of distinct sub-networks that are associated with the onset of apoptosis, allowing us to identify specific pathways and components. These include components of the interferon signalling pathway and downstream components, including cytokines and suppressors of cytokine signalling. A central component of almost all gene expression sub-networks identified was TP53, which is mutated in A431 cells and was down-regulated. This down-regulation of TP53 appeared to be correlated with proteomic sub-networks of cytoskeletal or cell adhesion components that might induce apoptosis by triggering cytochrome C release. Of the only three genes also differentially expressed as proteins, only serpinb1 had a known association with apoptosis. We confirmed that up-regulation and cleavage of serpinb1 into L-DNAaseII was correlated with the induction of apoptosis. It is likely that a combination of pathways, rather than a single pathway, is needed to trigger EGF-induced apoptosis in A431 cells.
57
Ebrahimi M, Aghagolzadeh P, Shamabadi N, Tahmasebi A, Alsharifi M, Adelson DL, Hemmatzadeh F, Ebrahimie E. Understanding the undelaying mechanism of HA-subtyping in the level of physic-chemical characteristics of protein. PLoS One 2014; 9:e96984. [PMID: 24809455 PMCID: PMC4014573 DOI: 10.1371/journal.pone.0096984] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 04/01/2013] [Accepted: 04/07/2014] [Indexed: 01/05/2023] Open
Abstract
The evolution of the influenza A virus to increase its host range is a major concern worldwide. Molecular mechanisms of increasing host range are largely unknown. Influenza surface proteins play determining roles in reorganization of host-sialic acid receptors and host range. In an attempt to uncover the physic-chemical attributes which govern HA subtyping, we performed a large scale functional analysis of over 7000 sequences of 16 different HA subtypes. A large number (896) of physic-chemical protein characteristics were calculated for each HA sequence. Then, 10 different attribute weighting algorithms were used to find the key characteristics distinguishing HA subtypes. Furthermore, to discover machine learning models which can predict HA subtypes, various Decision Tree, Support Vector Machine, Naïve Bayes, and Neural Network models were trained on the calculated protein characteristics dataset as well as 10 trimmed datasets generated by attribute weighting algorithms. The prediction accuracies of the machine learning methods were evaluated by 10-fold cross validation. The results highlighted the frequency of Gln (selected by 80% of attribute weighting algorithms), percentage/frequency of Tyr, percentage of Cys, and frequencies of Try and Glu (selected by 70% of attribute weighting algorithms) as the key features that are associated with HA subtyping. Random Forest tree induction algorithm and RBF kernel function of SVM (scaled by grid search) showed high accuracy of 98% in clustering and predicting HA subtypes based on protein attributes. Decision tree models were successful in monitoring the short mutation/reassortment paths by which influenza virus can gain the key protein structure of another HA subtype and increase its host range in a short period of time with less energy consumption.
Extracting and mining a large number of amino acid attributes of HA subtypes of influenza A virus through supervised algorithms represents a new avenue for understanding and predicting possible future structure of influenza pandemics.
58
Bellone RR, Holl H, Setaluri V, Devi S, Maddodi N, Archer S, Sandmeyer L, Ludwig A, Foerster D, Pruvost M, Reissmann M, Bortfeldt R, Adelson DL, Lim SL, Nelson J, Haase B, Engensteiner M, Leeb T, Forsyth G, Mienaltowski MJ, Mahadevan P, Hofreiter M, Paijmans JLA, Gonzalez-Fortes G, Grahn B, Brooks SA. Evidence for a retroviral insertion in TRPM1 as the cause of congenital stationary night blindness and leopard complex spotting in the horse. PLoS One 2013; 8:e78280. [PMID: 24167615 PMCID: PMC3805535 DOI: 10.1371/journal.pone.0078280] [Citation(s) in RCA: 77] [Impact Index Per Article: 7.0] [Received: 08/30/2013] [Accepted: 09/10/2013] [Indexed: 12/21/2022] Open
Abstract
Leopard complex spotting is a group of white spotting patterns in horses caused by an incompletely dominant gene (LP) where homozygotes (LP/LP) are also affected with congenital stationary night blindness (CSNB). Previous studies implicated Transient Receptor Potential Cation Channel, Subfamily M, Member 1 (TRPM1) as the best candidate gene for both CSNB and LP. RNA-Seq data pinpointed a 1378 bp insertion in intron 1 of TRPM1 as the potential cause. This insertion, a long terminal repeat (LTR) of an endogenous retrovirus, was completely associated with LP, testing 511 horses (χ2=1022.00, p<<0.0005), and CSNB, testing 43 horses (χ2=43, p<<0.0005). The LTR was shown to disrupt TRPM1 transcription by premature poly-adenylation. Furthermore, while deleterious transposable element insertions should be quickly selected against, the identification of this insertion in three ancient DNA samples suggests it has been maintained in the horse gene pool for at least 17,000 years. This study represents the first description of an LTR insertion being associated with both a pigmentation phenotype and an eye disorder.
59
Fruzangohar M, Kroeger TA, Adelson DL. Improved part-of-speech prediction in suffix analysis. PLoS One 2013; 8:e76042. [PMID: 24124532 PMCID: PMC3790802 DOI: 10.1371/journal.pone.0076042] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 06/07/2012] [Accepted: 08/26/2013] [Indexed: 11/21/2022] Open
Abstract
Motivation: Predicting the part of speech (POS) tag of an unknown word in a sentence is a significant challenge. This is particularly difficult in biomedicine, where POS tags serve as an input to training sophisticated literature summarization techniques, such as those based on Hidden Markov Models (HMM). Different approaches have been taken to deal with the POS tagger challenge, but with one exception (the TnT POS tagger), previous publications on POS tagging have omitted details of the suffix analysis used for handling unknown words. The suffix of an English word is a strong predictor of a POS tag for that word. As a pre-requisite for an accurate HMM POS tagger for biomedical publications, we present an efficient suffix prediction method for integration into a POS tagger.
Results: We have implemented a fully functional HMM POS tagger using experimentally optimised suffix based prediction. Our simple suffix analysis method significantly outperformed the probability interpolation based TnT method. We have also shown how important suffix analysis can be for probability estimation of a known word (in the training corpus) with an unseen POS tag; a common scenario with a small training corpus. We then integrated this simple method in our POS tagger and determined an optimised parameter set for both methods, which can help developers to optimise their current algorithm, based on our results. We also introduce the concept of counting methods in maximum likelihood estimation for the first time and show how counting methods can affect the prediction result. Finally, we describe how machine-learning techniques were applied to identify words for which prediction of POS tags was always incorrect, and propose a method to handle words of this type.
Availability and Implementation: Java source code, binaries and setup instructions are freely available at http://genomes.sapac.edu.au/text_mining/pos_tagger.zip.
60
Ivancevic AM, Walsh AM, Kortschak RD, Adelson DL. Jumping the fine LINE between species: horizontal transfer of transposable elements in animals catalyses genome evolution. Bioessays 2013; 35:1071-82. [PMID: 24003001 DOI: 10.1002/bies.201300072] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Indexed: 01/12/2023]
Abstract
Horizontal transfer (HT) is the transmission of genetic material between non-mating species, a phenomenon thought to occur rarely in multicellular eukaryotes. However, many transposable elements (TEs) are not only capable of HT, but have frequently jumped between widely divergent species. Here we review and integrate reported cases of HT in retrotransposons of the BovB family, and DNA transposons, over a broad range of animals spanning all continents. Our conclusions challenge the paradigm that HT in vertebrates is restricted to infective long terminal repeat (LTR) retrotransposons or retroviruses. This raises the possibility that other non-LTR retrotransposons, such as L1 or CR1 elements, believed to be only vertically transmitted, can horizontally transfer between species. Growing evidence indicates that the process of HT is much more general across different TEs and species than previously believed, and that it likely shapes eukaryotic genomes and catalyses genome evolution.
61
Mahdi LK, Ebrahimie E, Adelson DL, Paton JC, Ogunniyi AD. A transcription factor contributes to pathogenesis and virulence in Streptococcus pneumoniae. PLoS One 2013; 8:e70862. [PMID: 23967124 PMCID: PMC3742648 DOI: 10.1371/journal.pone.0070862] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 05/04/2013] [Accepted: 06/24/2013] [Indexed: 11/21/2022] Open
Abstract
To date, the role of transcription factors (TFs) in the progression of disease for many pathogens is yet to be studied in detail. This is probably due to the transient and generally low expression levels of TFs, which are the central components controlling the expression of many genes during the course of infection. However, a small change in the expression or specificity of a TF can radically alter gene expression. In this study, we combined a number of quality-based selection strategies including structural prediction of modulated genes, gene ontology and network analysis, to predict the regulatory mechanisms underlying pathogenesis of Streptococcus pneumoniae (the pneumococcus). We have identified two TFs (SP_0676 and SP_0927 [SmrC]) that might control tissue-specific gene expression during pneumococcal translocation from the nasopharynx to lungs, to blood and then to brain of mice. Targeted mutagenesis and mouse models of infection confirmed the role of SP_0927 in pathogenesis and virulence, and suggest that SP_0676 might be essential to pneumococcal viability. These findings provide fundamental new insights into virulence gene expression and regulation during pathogenesis.
62
Roberts ND, Kortschak RD, Parker WT, Schreiber AW, Branford S, Scott HS, Glonek G, Adelson DL. A comparative analysis of algorithms for somatic SNV detection in cancer. Bioinformatics 2013; 29:2223-30. [PMID: 23842810 PMCID: PMC3753564 DOI: 10.1093/bioinformatics/btt375] [Citation(s) in RCA: 73] [Impact Index Per Article: 6.6] [Indexed: 02/04/2023]
Abstract
Motivation: With the advent of relatively affordable high-throughput technologies, DNA sequencing of cancers is now common practice in cancer research projects and will be increasingly used in clinical practice to inform diagnosis and treatment. Somatic (cancer-only) single nucleotide variants (SNVs) are the simplest class of mutation, yet their identification in DNA sequencing data is confounded by germline polymorphisms, tumour heterogeneity and sequencing and analysis errors. Four recently published algorithms for the detection of somatic SNV sites in matched cancer–normal sequencing datasets are VarScan, SomaticSniper, JointSNVMix and Strelka. In this analysis, we apply these four SNV calling algorithms to cancer–normal Illumina exome sequencing of a chronic myeloid leukaemia (CML) patient. The candidate SNV sites returned by each algorithm are filtered to remove likely false positives, then characterized and compared to investigate the strengths and weaknesses of each SNV calling algorithm.
Results: Comparing the candidate SNV sets returned by VarScan, SomaticSniper, JointSNVMix2 and Strelka revealed substantial differences with respect to the number and character of sites returned; the somatic probability scores assigned to the same sites; their susceptibility to various sources of noise; and their sensitivities to low-allelic-fraction candidates.
Availability: Data accession number SRA081939, code at http://code.google.com/p/snv-caller-review/
Contact: david.adelson@adelaide.edu.au
Supplementary information: Supplementary data are available at Bioinformatics online.
63
Fruzangohar M, Ebrahimie E, Ogunniyi AD, Mahdi LK, Paton JC, Adelson DL. Comparative GO: a web application for comparative gene ontology and gene ontology-based gene selection in bacteria. PLoS One 2013; 8:e58759. [PMID: 23536820 PMCID: PMC3594149 DOI: 10.1371/journal.pone.0058759] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.4] [Received: 01/07/2013] [Accepted: 02/06/2013] [Indexed: 11/18/2022] Open
Abstract
The primary means of classifying new functions for genes and proteins relies on Gene Ontology (GO), which defines genes/proteins using a controlled vocabulary in terms of their Molecular Function, Biological Process and Cellular Component. The challenge is to present this information to researchers to compare and discover patterns in multiple datasets using visually comprehensible and user-friendly statistical reports. Importantly, while there are many GO resources available for eukaryotes, there are none suitable for simultaneous, graphical and statistical comparison between multiple datasets. In addition, none of them supports comprehensive resources for bacteria. By using Streptococcus pneumoniae as a model, we identified and collected GO resources including genes, proteins, taxonomy and GO relationships from NCBI, UniProt and GO organisations. Then, we designed database tables in a PostgreSQL database server and developed a Java application to extract data from source files and load it into the database automatically. We developed a PHP web application based on Model-View-Control architecture, and used a specific data structure as well as current and novel algorithms to estimate GO graph parameters. We designed different navigation and visualization methods on the graphs and integrated these into graphical reports. This tool is particularly significant when comparing GO groups between multiple samples (including those of pathogenic bacteria) from different sources simultaneously. Comparing GO protein distribution among up- or down-regulated genes from different samples can improve understanding of biological pathways and mechanism(s) of infection. It can also aid in the discovery of genes associated with specific function(s) for investigation as novel vaccine or therapeutic targets.
64
Qu Z, Adelson DL. Evolutionary conservation and functional roles of ncRNA. Front Genet 2012; 3:205. [PMID: 23087702 PMCID: PMC3466565 DOI: 10.3389/fgene.2012.00205] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.3] [Received: 08/30/2012] [Accepted: 09/24/2012] [Indexed: 11/24/2022] Open
Abstract
Non-coding RNAs (ncRNAs) are a class of transcribed RNA molecules without protein-coding potential. For a long time they were regarded as transcriptional noise, or as the byproduct of genetic information flow from DNA to protein. However, in recent years, a number of studies have shown that ncRNAs are pervasively transcribed, and most of them show evidence of evolutionary conservation, although less conserved than protein-coding genes. More importantly, many ncRNAs have been confirmed as playing crucial regulatory roles in diverse biological processes and tumorigenesis. Here we summarize the functional significance of this class of “dark matter” in terms of its genomic organization, evolutionary conservation, and broad functional classes.
65
Qu Z, Adelson DL. Bovine ncRNAs are abundant, primarily intergenic, conserved and associated with regulatory genes. PLoS One 2012; 7:e42638. [PMID: 22880061 PMCID: PMC3412814 DOI: 10.1371/journal.pone.0042638] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.3] [Received: 06/09/2012] [Accepted: 07/11/2012] [Indexed: 12/15/2022] Open
Abstract
It is apparent that non-coding transcripts are a common feature of higher organisms and encode uncharacterized layers of genetic regulation and information. We used public bovine EST data from many developmental stages and tissues, and developed a pipeline for the genome wide identification and annotation of non-coding RNAs (ncRNAs). We have predicted 23,060 bovine ncRNAs, 99% of which are un-annotated, based on known ncRNA databases. Intergenic transcripts accounted for the majority (57%) of the predicted ncRNAs and the occurrences of ncRNAs and genes were only moderately correlated (r = 0.55, p-value < 2.2e-16). Many of these intergenic non-coding RNAs mapped close to the 3′ or 5′ end of thousands of genes and many of these were transcribed from the opposite strand with respect to the closest gene, particularly regulatory-related genes. Conservation analyses showed that these ncRNAs were evolutionarily conserved, and many intergenic ncRNAs proximate to genes contained sequence-specific motifs. Correlation analysis of expression between these intergenic ncRNAs and protein-coding genes using RNA-seq data from a variety of tissues showed significant correlations with many transcripts. These results support the hypothesis that ncRNAs are common, transcribed in a regulated fashion and have regulatory functions.
66
Hamernik DL, Adelson DL. USDA Stakeholder Workshop on Animal Bioinformatics: Summary and Recommendations. Comp Funct Genomics 2011; 4:271-4. [PMID: 18629125 PMCID: PMC2447412 DOI: 10.1002/cfg.266] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 01/29/2003] [Revised: 01/31/2003] [Accepted: 01/31/2003] [Indexed: 12/02/2022] Open
Abstract
An electronic workshop was conducted on 4 November–13 December 2002 to discuss current issues and needs in animal bioinformatics. The electronic (e-mail listserver) format was chosen to provide a relatively speedy process that is broad in scope, cost-efficient and easily accessible to all participants. Approximately 40 panelists with diverse species and discipline expertise communicated through the panel e-mail listserver. The panel included scientists from academia, industry and government, in the USA, Australia and the UK. A second ‘stakeholder’ e-mail listserver was used to obtain input from a broad audience with general interests in animal genomics. The objectives of the electronic workshop were: (a) to define priorities for animal genome database development; and (b) to recommend ways in which the USDA could provide leadership in the area of animal genome database development. E-mail messages from panelists and stakeholders are archived at http://genome.cvm.umn.edu/bioinfo/. Priorities defined for animal genome database development included: (a) data repository; (b) tools for genome analysis; (c) annotation; (d) practical application of genomic data; and (e) a biological framework for DNA sequence. A stable source of funding, such as the USDA Agricultural Research Service (ARS), was recommended to support maintenance of data repositories and data curation. Continued support for competitive grants programs within the USDA Cooperative State Research, Education and Extension Service (CSREES) was recommended for tool development and hypothesis-driven research projects in genome analysis. Additional stakeholder input will be required to continuously refine priorities and maximize the use of limited resources for animal bioinformatics within the USDA.
67
Ling KH, Brautigan PJ, Hahn CN, Daish T, Rayner JR, Cheah PS, Raison JM, Piltz S, Mann JR, Mattiske DM, Thomas PQ, Adelson DL, Scott HS. Deep sequencing analysis of the developing mouse brain reveals a novel microRNA. BMC Genomics 2011; 12:176. [PMID: 21466694 PMCID: PMC3088569 DOI: 10.1186/1471-2164-12-176] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.5] [Received: 09/17/2010] [Accepted: 04/05/2011] [Indexed: 12/31/2022] Open
Abstract
Background: MicroRNAs (miRNAs) are small non-coding RNAs that can exert multilevel inhibition/repression at a post-transcriptional or protein synthesis level during disease or development. Characterisation of miRNAs in adult mammalian brains by deep sequencing has been reported previously. However, to date, no small RNA profiling of the developing brain has been undertaken using this method. We have performed deep sequencing and small RNA analysis of a developing (E15.5) mouse brain. Results: We identified the expression of 294 known miRNAs in the E15.5 developing mouse brain, which were mostly represented by the let-7 family and other brain-specific miRNAs such as miR-9 and miR-124. We also discovered 4 putative 22-23 nt miRNAs: mm_br_e15_1181, mm_br_e15_279920, mm_br_e15_96719 and mm_br_e15_294354, each with a 70-76 nt predicted pre-miRNA. We validated the 4 putative miRNAs and further characterised one of them, mm_br_e15_1181, throughout embryogenesis. Mm_br_e15_1181 biogenesis was Dicer1-dependent, and the miRNA was expressed in E3.5 blastocysts and E7 whole embryos. Embryo-wide expression patterns were observed at E9.5 and E11.5, followed by a near-complete loss of expression by E13.5, with expression restricted to a specialised layer of cells within the developing and early postnatal brain. Mm_br_e15_1181 was upregulated during neurodifferentiation of P19 teratocarcinoma cells. This novel miRNA has been identified as miR-3099. Conclusions: We have generated and analysed the first deep sequencing dataset of small RNA sequences of the developing mouse brain. The analysis revealed a novel miRNA, miR-3099, with potential regulatory effects on early embryogenesis, and involvement in neuronal cell differentiation/function in the brain during late embryonic and early neonatal development.
|
68
|
Appels R, Adelson DL, Moolhuijzen P, Webster H, Barrero R, Bellgard M. Genome studies at the PAG 2011 conference. Funct Integr Genomics 2011; 11:1-11. [PMID: 21360134 DOI: 10.1007/s10142-011-0215-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 02/10/2011] [Revised: 02/15/2011] [Accepted: 02/15/2011] [Indexed: 01/15/2023]
Abstract
The contents of the plenary lectures presented at the Plant and Animal Genome (PAG) meeting in January 2011 are summarized in order to provide some insights into the advances in plant, animal and microbe genome studies as they impact on our understanding of complex biological systems. The areas of biology covered include the dynamics of genome change, biological recognition processes and the new processes that underpin investment in science. This overview does not attempt to summarize the diversity of activities that are covered during the PAG through workshops, posters and the suppliers of cutting-edge technologies, but reviews major advances in specific research areas.
|
69
|
Childers CP, Reese JT, Sundaram JP, Vile DC, Dickens CM, Childs KL, Salih H, Bennett AK, Hagen DE, Adelson DL, Elsik CG. Bovine Genome Database: integrated tools for genome annotation and discovery. Nucleic Acids Res 2010; 39:D830-4. [PMID: 21123190 PMCID: PMC3013744 DOI: 10.1093/nar/gkq1235] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Indexed: 11/14/2022] Open
Abstract
The Bovine Genome Database (BGD; http://BovineGenome.org) strives to improve annotation of the bovine genome and to integrate the genome sequence with other genomics data. BGD includes GBrowse genome browsers, the Apollo Annotation Editor, a quantitative trait loci (QTL) viewer, BLAST databases and gene pages. Genome browsers, available for both scaffold and chromosome coordinate systems, display the bovine Official Gene Set (OGS), RefSeq and Ensembl gene models, non-coding RNA, repeats, pseudogenes, single-nucleotide polymorphisms, markers, QTL and alignments to complementary DNAs, ESTs and protein homologs. The Bovine QTL viewer is connected to the BGD Chromosome GBrowse, allowing for the identification of candidate genes underlying QTL. The Apollo Annotation Editor connects directly to the BGD Chado database to provide researchers with remote access to gene evidence in a graphical interface that supports editing existing gene models and creating new ones. Researchers may upload their annotations to the BGD server for review and integration into the subsequent release of the OGS. Gene pages display information for individual OGS gene models, including gene structure, transcript variants, functional descriptions, gene symbols, Gene Ontology terms, annotator comments and links to National Center for Biotechnology Information and Ensembl. Each gene page is linked to a wiki page to allow input from the research community.
|
70
|
Rios JJ, Fleming JGW, Bryant UK, Carter CN, Huber JC, Long MT, Spencer TE, Adelson DL. OAS1 polymorphisms are associated with susceptibility to West Nile encephalitis in horses. PLoS One 2010; 5:e10537. [PMID: 20479874 PMCID: PMC2866329 DOI: 10.1371/journal.pone.0010537] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.9] [Received: 11/11/2009] [Accepted: 04/18/2010] [Indexed: 12/13/2022] Open
Abstract
West Nile virus (WNV), first identified within the United States in 1999, has since spread across the continental states and infected birds, humans and domestic animals, resulting in numerous deaths. Previous studies in mice identified the Oas1b gene, a member of the OAS/RNASEL innate immune system, as a determining factor for resistance to WNV infection. A recent case-control association study described mutations of human OAS1 associated with clinical susceptibility to WNV infection. Similar studies in horses, a particularly susceptible species, have been lacking, in part because of the difficulty in collecting populations sufficiently homogeneous in their infection and disease states. The equine OAS gene cluster most closely resembles the human cluster, with single copies of OAS1, OAS3 and OAS2 in the same orientation. With naturally occurring susceptible and resistant sub-populations to lethal West Nile encephalitis, we undertook a case-control association study to investigate whether, similar to humans (OAS1) and mice (Oas1b), equine OAS1 plays a role in resistance to severe WNV infection. We identified naturally occurring single nucleotide mutations in equine (Equus caballus) OAS1 and RNASEL genes and, using Fisher's exact test, we provide evidence that mutations in equine OAS1 contribute to host susceptibility. Virtually all of the associated OAS1 polymorphisms were located within the interferon-inducible promoter, suggesting that differences in OAS1 gene expression may determine the host's ability to resist clinical manifestations associated with WNV infection.
|
71
|
Wade CM, Giulotto E, Sigurdsson S, Zoli M, Gnerre S, Imsland F, Lear TL, Adelson DL, Bailey E, Bellone RR, Blöcker H, Distl O, Edgar RC, Garber M, Leeb T, Mauceli E, MacLeod JN, Penedo MCT, Raison JM, Sharpe T, Vogel J, Andersson L, Antczak DF, Biagi T, Binns MM, Chowdhary BP, Coleman SJ, Della Valle G, Fryc S, Guérin G, Hasegawa T, Hill EW, Jurka J, Kiialainen A, Lindgren G, Liu J, Magnani E, Mickelson JR, Murray J, Nergadze SG, Onofrio R, Pedroni S, Piras MF, Raudsepp T, Rocchi M, Røed KH, Ryder OA, Searle S, Skow L, Swinburne JE, Syvänen AC, Tozaki T, Valberg SJ, Vaudin M, White JR, Zody MC, Lander ES, Lindblad-Toh K. Genome sequence, comparative analysis, and population genetics of the domestic horse. Science 2009; 326:865-7. [PMID: 19892987 PMCID: PMC3785132 DOI: 10.1126/science.1178158] [Citation(s) in RCA: 555] [Impact Index Per Article: 37.0] [Indexed: 11/02/2022]
Abstract
We report a high-quality draft sequence of the genome of the horse (Equus caballus). The genome is relatively repetitive but has little segmental duplication. Chromosomes appear to have undergone few historical rearrangements: 53% of equine chromosomes show conserved synteny to a single human chromosome. Equine chromosome 11 is shown to have an evolutionarily new centromere devoid of centromeric satellite DNA, suggesting that centromeric function may arise before satellite repeat accumulation. Linkage disequilibrium, showing the influences of early domestication of large herds of female horses, is intermediate in length between dog and human, and there is long-range haplotype sharing among breeds.
|
72
|
Satterfield MC, Song G, Kochan KJ, Riggs PK, Simmons RM, Elsik CG, Adelson DL, Bazer FW, Zhou H, Spencer TE. Discovery of candidate genes and pathways in the endometrium regulating ovine blastocyst growth and conceptus elongation. Physiol Genomics 2009; 39:85-99. [DOI: 10.1152/physiolgenomics.00001.2009] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.6] [Indexed: 02/02/2023] Open
Abstract
Establishment of pregnancy in ruminants requires blastocyst growth to form an elongated conceptus that produces interferon tau, the pregnancy recognition signal, and initiates implantation. Blastocyst growth and development require secretions from the uterine endometrium. An early increase in circulating concentrations of progesterone (P4) stimulates blastocyst growth and elongation in ruminants. This study utilized sheep as a model to identify candidate genes and regulatory networks in the endometrium that govern preimplantation blastocyst growth and development. Ewes were treated daily with either P4 or corn oil vehicle from day 1.5 after mating to either day 9 or day 12 of pregnancy, when endometrium was obtained by hysterectomy. Microarray analyses revealed many differentially expressed genes in the endometria affected by day of pregnancy and early P4 treatment. In situ hybridization analyses revealed that many differentially expressed genes were expressed in a cell-specific manner within the endometrium. The Database for Annotation, Visualization, and Integrated Discovery (DAVID) was used to identify functional groups of genes and biological processes in the endometrium that are associated with growth and development of preimplantation blastocysts. Notably, biological processes affected by day of pregnancy and/or early P4 treatment included lipid biosynthesis and metabolism, angiogenesis, transport, extracellular space, defense and inflammatory response, proteolysis, amino acid transport and metabolism, and hormone metabolism. These transcriptomic data provide novel insights into the biology of endometrial function and preimplantation blastocyst growth and development in sheep.
|
73
|
Adelson DL, Raison JM, Edgar RC. Characterization and distribution of retrotransposons and simple sequence repeats in the bovine genome. Proc Natl Acad Sci U S A 2009; 106:12855-60. [PMID: 19625614 PMCID: PMC2722308 DOI: 10.1073/pnas.0901282106] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.9] [Received: 02/05/2009] [Indexed: 12/11/2022] Open
Abstract
Interspersed repeat composition and distribution in mammals have been best characterized in the human and mouse genomes. The bovine genome contains typical eutherian mammal repeats, but also has a significant number of long interspersed nuclear element RTE (BovB) elements proposed to have been horizontally transferred from Squamata. Our analysis of the BovB repeats has indicated that only a few of them are currently likely to retrotranspose in cattle. However, bovine L1 repeats (L1 BT) have many likely active copies. Comparison of substitution rates for BovB and L1 BT indicates that L1 BT is a younger repeat family than BovB. In contrast to mouse and human, L1 occurrence is not negatively correlated with G+C content. However, BovB, Bov A2, ART2A, and Bov-tA are negatively correlated with G+C, although Bov-tA's correlation is weaker. Also, by performing genome-wide correlation analysis of interspersed and simple sequence repeats, we have identified genome territories by repeat content that appear to define ancestral vs. ruminant-specific genomic regions. These ancestral regions, enriched with L2 and MIR repeats, are largely conserved between bovine and human.
|
74
|
Elsik CG, Tellam RL, Worley KC, Gibbs RA, Muzny DM, Weinstock GM, Adelson DL, Eichler EE, Elnitski L, Guigó R, Hamernik DL, Kappes SM, Lewin HA, Lynn DJ, Nicholas FW, Reymond A, Rijnkels M, Skow LC, Zdobnov EM, Schook L, Womack J, Alioto T, Antonarakis SE, Astashyn A, Chapple CE, Chen HC, Chrast J, Câmara F, Ermolaeva O, Henrichsen CN, Hlavina W, Kapustin Y, Kiryutin B, Kitts P, Kokocinski F, Landrum M, Maglott D, Pruitt K, Sapojnikov V, Searle SM, Solovyev V, Souvorov A, Ucla C, Wyss C, Anzola JM, Gerlach D, Elhaik E, Graur D, Reese JT, Edgar RC, McEwan JC, Payne GM, Raison JM, Junier T, Kriventseva EV, Eyras E, Plass M, Donthu R, Larkin DM, Reecy J, Yang MQ, Chen L, Cheng Z, Chitko-McKown CG, Liu GE, Matukumalli LK, Song J, Zhu B, Bradley DG, Brinkman FSL, Lau LPL, Whiteside MD, Walker A, Wheeler TT, Casey T, German JB, Lemay DG, Maqbool NJ, Molenaar AJ, Seo S, Stothard P, Baldwin CL, Baxter R, Brinkmeyer-Langford CL, Brown WC, Childers CP, Connelley T, Ellis SA, Fritz K, Glass EJ, Herzig CTA, Iivanainen A, Lahmers KK, Bennett AK, Dickens CM, Gilbert JGR, Hagen DE, Salih H, Aerts J, Caetano AR, Dalrymple B, Garcia JF, Gill CA, Hiendleder SG, Memili E, Spurlock D, Williams JL, Alexander L, Brownstein MJ, Guan L, Holt RA, Jones SJM, Marra MA, Moore R, Moore SS, Roberts A, Taniguchi M, Waterman RC, Chacko J, Chandrabose MM, Cree A, Dao MD, Dinh HH, Gabisi RA, Hines S, Hume J, Jhangiani SN, Joshi V, Kovar CL, Lewis LR, Liu YS, Lopez J, Morgan MB, Nguyen NB, Okwuonu GO, Ruiz SJ, Santibanez J, Wright RA, Buhay C, Ding Y, Dugan-Rocha S, Herdandez J, Holder M, Sabo A, Egan A, Goodell J, Wilczek-Boney K, Fowler GR, Hitchens ME, Lozado RJ, Moen C, Steffen D, Warren JT, Zhang J, Chiu R, Schein JE, Durbin KJ, Havlak P, Jiang H, Liu Y, Qin X, Ren Y, Shen Y, Song H, Bell SN, Davis C, Johnson AJ, Lee S, Nazareth LV, Patel BM, Pu LL, Vattathil S, Williams RL, Curry S, Hamilton C, Sodergren E, Wheeler DA, Barris W, Bennett GL, Eggen A, Green RD, Harhay GP, Hobbs M, Jann O, Keele 
JW, Kent MP, Lien S, McKay SD, McWilliam S, Ratnakumar A, Schnabel RD, Smith T, Snelling WM, Sonstegard TS, Stone RT, Sugimoto Y, Takasuga A, Taylor JF, Van Tassell CP, Macneil MD, Abatepaulo ARR, Abbey CA, Ahola V, Almeida IG, Amadio AF, Anatriello E, Bahadue SM, Biase FH, Boldt CR, Carroll JA, Carvalho WA, Cervelatti EP, Chacko E, Chapin JE, Cheng Y, Choi J, Colley AJ, de Campos TA, De Donato M, Santos IKFDM, de Oliveira CJF, Deobald H, Devinoy E, Donohue KE, Dovc P, Eberlein A, Fitzsimmons CJ, Franzin AM, Garcia GR, Genini S, Gladney CJ, Grant JR, Greaser ML, Green JA, Hadsell DL, Hakimov HA, Halgren R, Harrow JL, Hart EA, Hastings N, Hernandez M, Hu ZL, Ingham A, Iso-Touru T, Jamis C, Jensen K, Kapetis D, Kerr T, Khalil SS, Khatib H, Kolbehdari D, Kumar CG, Kumar D, Leach R, Lee JCM, Li C, Logan KM, Malinverni R, Marques E, Martin WF, Martins NF, Maruyama SR, Mazza R, McLean KL, Medrano JF, Moreno BT, Moré DD, Muntean CT, Nandakumar HP, Nogueira MFG, Olsaker I, Pant SD, Panzitta F, Pastor RCP, Poli MA, Poslusny N, Rachagani S, Ranganathan S, Razpet A, Riggs PK, Rincon G, Rodriguez-Osorio N, Rodriguez-Zas SL, Romero NE, Rosenwald A, Sando L, Schmutz SM, Shen L, Sherman L, Southey BR, Lutzow YS, Sweedler JV, Tammen I, Telugu BPVL, Urbanski JM, Utsunomiya YT, Verschoor CP, Waardenberg AJ, Wang Z, Ward R, Weikard R, Welsh TH, White SN, Wilming LG, Wunderlich KR, Yang J, Zhao FQ. The genome sequence of taurine cattle: a window to ruminant biology and evolution. Science 2009; 324:522-8. [PMID: 19390049 DOI: 10.1126/science.1169588] [Citation(s) in RCA: 806] [Impact Index Per Article: 53.7] [Indexed: 01/12/2023]
Abstract
To understand the biology and evolution of ruminants, the cattle genome was sequenced to about sevenfold coverage. The cattle genome contains a minimum of 22,000 genes, with a core set of 14,345 orthologs shared among seven mammalian species of which 1217 are absent or undetected in noneutherian (marsupial or monotreme) genomes. Cattle-specific evolutionary breakpoint regions in chromosomes have a higher density of segmental duplications, enrichment of repetitive elements, and species-specific variations in genes associated with lactation and immune responsiveness. Genes involved in metabolism are generally highly conserved, although five metabolic genes are deleted or extensively diverged from their human orthologs. The cattle genome sequence thus provides a resource for understanding mammalian evolution and accelerating livestock genetic improvement for milk and meat production.
|
75
|
Salih H, Adelson DL. QTL global meta-analysis: are trait determining genes clustered? BMC Genomics 2009; 10:184. [PMID: 19393059 PMCID: PMC2683869 DOI: 10.1186/1471-2164-10-184] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 09/24/2008] [Accepted: 04/24/2009] [Indexed: 12/04/2022] Open
Abstract
Background: A key open question in biology is whether genes are physically clustered with respect to their known functions or phenotypic effects. This is of particular interest for Quantitative Trait Loci (QTL), where a QTL region could contain a number of genes that contribute to the trait being measured. Results: We observed a significant increase in gene density within QTL regions compared to non-QTL regions and/or the entire bovine genome. By grouping QTL from the Bovine QTL Viewer database into 8 categories of non-redundant regions, we have been able to analyze gene density and gene function distribution, based on Gene Ontology (GO), in relation to their location within QTL regions, outside of QTL regions and across the entire bovine genome. We identified a number of GO terms that were significantly over-represented within particular QTL categories. Furthermore, select GO terms expected to be associated with the QTL category based on common biological knowledge have also proved to be significantly over-represented in QTL regions. Conclusion: Our analysis provides evidence of over-represented GO terms in QTL regions. This increased GO term density indicates possible clustering of gene functions within QTL regions of the bovine genome. Genes with similar functions may be grouped in specific locales and could be contributing to QTL traits. Moreover, we have identified over-represented GO terms that make biological sense with respect to QTL category type.
|