1
Backofen R, Gorodkin J, Hofacker IL, Stadler PF. Comparative RNA Genomics. Methods Mol Biol 2024; 2802:347-393. [PMID: 38819565 DOI: 10.1007/978-1-0716-3838-5_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2024]
Abstract
Over the last quarter of a century it has become clear that RNA is much more than just a boring intermediate in protein expression. Ancient RNAs still appear in the core information metabolism and comprise a surprisingly large component in bacterial gene regulation. A common theme with these types of mostly small RNAs is their reliance on conserved secondary structures. Large-scale sequencing projects, on the other hand, have profoundly changed our understanding of eukaryotic genomes. Pervasively transcribed, they give rise to a plethora of large and evolutionarily extremely flexible non-coding RNAs that exert a vastly diverse array of molecular functions. In this chapter we provide a (necessarily incomplete) overview of the current state of comparative analysis of non-coding RNAs, emphasizing computational approaches as a means to gain a global picture of the modern RNA world.
Affiliation(s)
- Rolf Backofen
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
  - Center for Non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark
- Jan Gorodkin
  - Center for Non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Frederiksberg, Denmark
- Ivo L Hofacker
  - Institute for Theoretical Chemistry, University of Vienna, Wien, Austria
  - Bioinformatics and Computational Biology research group, University of Vienna, Vienna, Austria
  - Center for Non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark
- Peter F Stadler
  - Bioinformatics Group, Department of Computer Science, University of Leipzig, Leipzig, Germany
  - Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany
  - Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
  - Universidad Nacional de Colombia, Bogotá, Colombia
  - Institute for Theoretical Chemistry, University of Vienna, Wien, Austria
  - Center for Non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg, Denmark
  - Santa Fe Institute, Santa Fe, NM, USA
2
Gley K, Hadlich F, Trakooljul N, Haack F, Murani E, Gimsa U, Wimmers K, Ponsuksili S. Multi-Transcript Level Profiling Revealed Distinct mRNA, miRNA, and tRNA-Derived Fragment Bio-Signatures for Coping Behavior Linked Haplotypes in HPA Axis and Limbic System. Front Genet 2021; 12:635794. [PMID: 34490028 PMCID: PMC8417057 DOI: 10.3389/fgene.2021.635794] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/30/2020] [Accepted: 08/03/2021] [Indexed: 01/10/2023]
Abstract
The molecular basis of porcine coping behavior (CB) relies on a sophisticated interplay of genetic and epigenetic features. Deep sequencing technologies allowed the identification of a plethora of new regulatory small non-coding RNA (sncRNA). We characterized mRNA and sncRNA profiles of central parts of the physiological stress response system including amygdala, hippocampus, hypothalamus and adrenal gland using systems biology for integration. To this end, ten high-reactive (HR) and ten low-reactive (LR) pigs (n = 20) carrying a CB-associated haplotype in a prominent QTL region on SSC12 were selected for mRNA and sncRNA expression profiling. The molecular markers related to the LR group included ATP1B2, MPDU1, miR-19b-5p, let-7g-5p, and 5′-tiRNALeu in the adrenal gland, miR-194a-5p, miR-125a-5p, miR-7-1-5p, and miR-107-5p in the hippocampus and CBL and PVRL1 in the hypothalamus. Interestingly, amygdalae of the LR group showed 5′-tiRNA and 5′-tRF (5′-tRFLys, 5′-tiRNALys, 5′-tiRNACys, and 5′-tiRNAGln) enrichment. In contrast, molecular markers associated with the HR group encompassed miR-26b-5p, tRNAArg, tRNAGlyiF in the adrenal gland, IGF1 and APOD in the amygdala and PBX1, TOB1, and C18orf1 in the hippocampus and miR-24 in the hypothalamus. In addition, hypothalami of the HR group were characterized by 3′-tiRNA enrichment (3′-tiRNAGln, 3′-tiRNAAsn, 3′-tiRNAVal, 3′-tRFPro, 3′-tiRNACys, and 3′-tiRNAAla) and 3′-tRF enrichment (3′-tRFAsn, 3′-tRFGlu, and 3′-tRFVal). This evidence suggests that tRNA-derived fragments and their cleavage activity are a specific marker for coping behavior. Data integration revealed new bio-signatures of important molecular interactions on a multi-transcript level in HPA axis and limbic system of pigs carrying a CB-associated haplotype.
Affiliation(s)
- Kevin Gley
  - Leibniz Institute for Farm Animal Biology (FBN), Institute of Genome Biology, Dummerstorf, Germany
- Frieder Hadlich
  - Leibniz Institute for Farm Animal Biology (FBN), Institute of Genome Biology, Dummerstorf, Germany
- Nares Trakooljul
  - Leibniz Institute for Farm Animal Biology (FBN), Institute of Genome Biology, Dummerstorf, Germany
- Fiete Haack
  - Leibniz Institute for Farm Animal Biology (FBN), Institute of Genome Biology, Dummerstorf, Germany
- Eduard Murani
  - Leibniz Institute for Farm Animal Biology (FBN), Institute of Genome Biology, Dummerstorf, Germany
- Ulrike Gimsa
  - Leibniz Institute for Farm Animal Biology (FBN), Institute of Behavioral Physiology, Dummerstorf, Germany
- Klaus Wimmers
  - Leibniz Institute for Farm Animal Biology (FBN), Institute of Genome Biology, Dummerstorf, Germany
- Siriluck Ponsuksili
  - Leibniz Institute for Farm Animal Biology (FBN), Institute of Genome Biology, Dummerstorf, Germany
3
Geissler AS, Anthon C, Alkan F, González-Tortuero E, Poulsen LD, Kallehauge TB, Breüner A, Seemann SE, Vinther J, Gorodkin J. BSGatlas: a unified Bacillus subtilis genome and transcriptome annotation atlas with enhanced information access. Microb Genom 2021; 7:000524. [PMID: 33539279 PMCID: PMC8208703 DOI: 10.1099/mgen.0.000524] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 07/29/2020] [Accepted: 01/11/2021] [Indexed: 12/26/2022]
Abstract
A large part of our current understanding of gene regulation in Gram-positive bacteria is based on Bacillus subtilis, as it is one of the most well studied bacterial model systems. The rapid growth in data concerning its molecular and genomic biology is distributed across multiple annotation resources. Consequently, the interpretation of data from further B. subtilis experiments becomes increasingly challenging in both low- and large-scale analyses. Additionally, B. subtilis annotation of structured RNA and non-coding RNA (ncRNA), as well as the operon structure, is still lagging behind the annotation of the coding sequences. To address these challenges, we created the B. subtilis genome atlas, BSGatlas, which integrates and unifies multiple existing annotation resources. Compared to any of the individual resources, the BSGatlas contains twice as many ncRNAs, while improving the positional annotation for 70 % of the ncRNAs. Furthermore, we combined known transcription start and termination sites with lists of known co-transcribed gene sets to create a comprehensive transcript map. The combination with transcription start/termination site annotations resulted in 717 new sets of co-transcribed genes and 5335 untranslated regions (UTRs). In comparison to existing resources, the number of 5' and 3' UTRs increased nearly fivefold, and the number of internal UTRs doubled. The transcript map is organized in 2266 operons, which provides transcriptional annotation for 92 % of all genes in the genome compared to the at most 82 % by previous resources. We predicted an off-target-aware genome-wide library of CRISPR-Cas9 guide RNAs, which we also linked to polycistronic operons. We provide the BSGatlas in multiple forms: as a website (https://rth.dk/resources/bsgatlas/), an annotation hub for display in the UCSC genome browser, supplementary tables and standardized GFF3 format, which can be used in large scale -omics studies. 
By complementing existing resources, the BSGatlas supports analyses of the B. subtilis genome and its molecular biology with respect to not only non-coding genes but also genome-wide transcriptional relationships of all genes.
Affiliation(s)
- Adrian Sven Geissler
  - Center for Non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, 1871 Frederiksberg, Denmark
- Christian Anthon
  - Center for Non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, 1871 Frederiksberg, Denmark
- Ferhat Alkan
  - Center for Non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, 1871 Frederiksberg, Denmark
  - Division of Oncogenomics, Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
- Enrique González-Tortuero
  - Center for Non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, 1871 Frederiksberg, Denmark
  - Present address: School of Science, Engineering and Environment, University of Salford, Salford, UK
- Line Dahl Poulsen
  - Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, 1165 Copenhagen, Denmark
- Stefan Ernst Seemann
  - Center for Non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, 1871 Frederiksberg, Denmark
- Jeppe Vinther
  - Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, 1165 Copenhagen, Denmark
- Jan Gorodkin
  - Center for Non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, 1871 Frederiksberg, Denmark
4
Pandey M, Kumar R, Srivastava P, Agarwal S, Srivastava S, Nagpure NS, Jena JK, Kushwaha B. WGSSAT: A High-Throughput Computational Pipeline for Mining and Annotation of SSR Markers From Whole Genomes. J Hered 2019; 109:339-343. [PMID: 28992259 DOI: 10.1093/jhered/esx075] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 03/06/2017] [Accepted: 09/14/2017] [Indexed: 12/12/2022]
Abstract
Mining and characterization of Simple Sequence Repeat (SSR) markers from whole genomes provide valuable information about the biological significance of SSR distribution and also facilitate development of markers for genetic analysis. Whole genome sequencing (WGS)-SSR Annotation Tool (WGSSAT) is a graphical user interface pipeline developed using Java Netbeans and Perl scripts which simplifies the process of SSR mining and characterization. WGSSAT takes input in FASTA format and automates the prediction of genes, noncoding RNA (ncRNA), core genes, repeats and SSRs from whole genomes followed by mapping of the predicted SSRs onto a genome (classified according to genes, ncRNA, repeats, exonic, intronic, and core gene region) along with primer identification and mining of cross-species markers. The program also generates a detailed statistical report along with visualization of mapped SSRs, genes, core genes, and RNAs. The features of WGSSAT were demonstrated using Takifugu rubripes data. This yielded a total of 139 057 SSRs, out of which 113 703 SSR primer pairs were uniquely amplified in silico onto a T. rubripes (fugu) genome. Out of 113 703 mined SSRs, 81 463 were from coding regions (including 4286 exonic and 77 177 intronic), 7 from RNA, 267 from core genes of fugu, whereas 105 641 SSRs and 601 SSR primer pairs were uniquely mapped onto the medaka genome. WGSSAT is tested under Ubuntu Linux. The source code, documentation, user manual, example dataset and scripts are available online at https://sourceforge.net/projects/wgssat-nbfgr.
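The SSR mining step described above can be illustrated with a minimal Python sketch. This is not WGSSAT's implementation (which the abstract describes as Java and Perl); the function name `find_ssrs`, the thresholds, and the example sequence are hypothetical choices made only to show how perfect tandem repeats might be located with a back-referencing regular expression.

```python
import re
from typing import Iterator, NamedTuple


class SSR(NamedTuple):
    start: int   # 0-based start of the repeat run
    motif: str   # repeated unit, e.g. "CA"
    copies: int  # number of tandem copies of the motif


def find_ssrs(seq: str, min_unit: int = 2, max_unit: int = 6,
              min_copies: int = 4) -> Iterator[SSR]:
    """Yield perfect tandem repeats (hypothetical thresholds, not WGSSAT's)."""
    seq = seq.upper()
    # Lazy unit match so the shortest motif is tried first, followed by
    # at least (min_copies - 1) further copies of the same unit.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    for m in pattern.finditer(seq):
        motif = m.group(1)
        yield SSR(m.start(), motif, len(m.group(0)) // len(motif))


# Made-up example sequence: five CA copies and four ACG copies.
example = "TTT" + "CA" * 5 + "TTT" + "ACG" * 4 + "TT"
hits = list(find_ssrs(example))
```

With the default minimum unit length of 2, a long mononucleotide run would be reported as a dinucleotide repeat; a real pipeline would handle unit classes and imperfect repeats separately.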
Affiliation(s)
- Manmohan Pandey
  - Division of Molecular Biology and Biotechnology, ICAR-National Bureau of Fish Genetic Resources, Lucknow, India
  - AMITY Institute of Biotechnology, AMITY University, Uttar Pradesh, Lucknow Campus, Lucknow, India
- Ravindra Kumar
  - Division of Molecular Biology and Biotechnology, ICAR-National Bureau of Fish Genetic Resources, Lucknow, India
- Prachi Srivastava
  - AMITY Institute of Biotechnology, AMITY University, Uttar Pradesh, Lucknow Campus, Lucknow, India
- Suyash Agarwal
  - Division of Molecular Biology and Biotechnology, ICAR-National Bureau of Fish Genetic Resources, Lucknow, India
- Shreya Srivastava
  - Division of Molecular Biology and Biotechnology, ICAR-National Bureau of Fish Genetic Resources, Lucknow, India
- Naresh S Nagpure
  - Division of Molecular Biology and Biotechnology, ICAR-National Bureau of Fish Genetic Resources, Lucknow, India
- Joy K Jena
  - Division of Molecular Biology and Biotechnology, ICAR-National Bureau of Fish Genetic Resources, Lucknow, India
- Basdeo Kushwaha
  - Division of Molecular Biology and Biotechnology, ICAR-National Bureau of Fish Genetic Resources, Lucknow, India
5
Haack F, Trakooljul N, Gley K, Murani E, Hadlich F, Wimmers K, Ponsuksili S. Deep sequencing of small non-coding RNA highlights brain-specific expression patterns and RNA cleavage. RNA Biol 2019; 16:1764-1774. [PMID: 31432767 DOI: 10.1080/15476286.2019.1657743] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 12/17/2022]
Abstract
With the advance of high-throughput sequencing technology, numerous new regulatory small RNAs have been identified that broaden the variety of processing mechanisms and functions of non-coding RNA. Here we explore small non-coding RNA (sncRNA) expression in central parts of the physiological stress and anxiety response system. To this end, we characterize the sncRNA profile of tissue samples from amygdala, hippocampus, hypothalamus and adrenal gland, obtained from 20 pigs. Our analysis reveals that all tissues but amygdala and hippocampus possess distinct, tissue-specific expression patterns of miRNA that are associated with hypoxia, stress responses as well as memory and fear conditioning. In particular, we observe marked differences in the expression profile of limbic tissues compared to those associated with the HPA/stress axis, with a surprisingly high aggregation of 3′-tRNA halves in amygdala and hippocampus. Since regulation of sncRNA and RNA cleavage plays a pivotal role in the central nervous system, our work provides seminal insights into the involvement of sncRNA in the transcriptional and post-transcriptional regulation of negative emotion, stress and coping behaviour in pigs, and mammals in general.
Affiliation(s)
- Fiete Haack
  - Institute for Genome Biology, Functional Genome Analysis Research Unit, Leibniz Institute for Farm Animal Biology (FBN), Dummerstorf, Germany
- Nares Trakooljul
  - Institute for Genome Biology, Genomics Research Unit, Leibniz Institute for Farm Animal Biology (FBN), Dummerstorf, Germany
- Kevin Gley
  - Institute for Genome Biology, Functional Genome Analysis Research Unit, Leibniz Institute for Farm Animal Biology (FBN), Dummerstorf, Germany
- Eduard Murani
  - Institute for Genome Biology, Genomics Research Unit, Leibniz Institute for Farm Animal Biology (FBN), Dummerstorf, Germany
- Frieder Hadlich
  - Institute for Genome Biology, Functional Genome Analysis Research Unit, Leibniz Institute for Farm Animal Biology (FBN), Dummerstorf, Germany
- Klaus Wimmers
  - Institute for Genome Biology, Genomics Research Unit, Leibniz Institute for Farm Animal Biology (FBN), Dummerstorf, Germany
  - Faculty of Agricultural and Environmental Sciences, University Rostock, Rostock, Germany
- Siriluck Ponsuksili
  - Institute for Genome Biology, Functional Genome Analysis Research Unit, Leibniz Institute for Farm Animal Biology (FBN), Dummerstorf, Germany
6
Giuffra E, Tuggle CK. Functional Annotation of Animal Genomes (FAANG): Current Achievements and Roadmap. Annu Rev Anim Biosci 2018; 7:65-88. [PMID: 30427726 DOI: 10.1146/annurev-animal-020518-114913] [Citation(s) in RCA: 99] [Impact Index Per Article: 16.5] [Indexed: 11/09/2022]
Abstract
Functional annotation of genomes is a prerequisite for contemporary basic and applied genomic research, yet farmed animal genomics is deficient in such annotation. To address this, the FAANG (Functional Annotation of Animal Genomes) Consortium is producing genome-wide data sets on RNA expression, DNA methylation, and chromatin modification, as well as chromatin accessibility and interactions. In addition to informing our understanding of genome function, including comparative approaches to elucidate constrained sequence or epigenetic elements, these annotation maps will improve the precision and sensitivity of genomic selection for animal improvement. A scientific community-driven effort has already created a coordinated data collection and analysis enterprise crucial for the success of this global effort. Although it is early in this continuing process, functional data have already been produced and application to genetic improvement reported. The functional annotation delivered by the FAANG initiative will add value and utility to the greatly improved genome sequences being established for domesticated animal species.
Affiliation(s)
- Elisabetta Giuffra
  - Génétique Animale et Biologie Intégrative (GABI), Institut National de la Recherche Agronomique (INRA), AgroParisTech, Université Paris Saclay, 78350 Jouy-en-Josas, France
7
Pan X, Wenzel A, Jensen LJ, Gorodkin J. Genome-wide identification of clusters of predicted microRNA binding sites as microRNA sponge candidates. PLoS One 2018; 13:e0202369. [PMID: 30142196 PMCID: PMC6108476 DOI: 10.1371/journal.pone.0202369] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 05/30/2018] [Accepted: 08/01/2018] [Indexed: 12/21/2022]
Abstract
The number of discovered natural miRNA sponges in plants, viruses, and mammals is increasing steadily. Some sponges, like ciRS-7 for miR-7, contain multiple nearby miRNA binding sites. We hypothesize that such clusters of miRNA binding sites on the genome can function together as a sponge. No systematic effort has been made to search for clusters of miRNA targets. Here we make what are, to our knowledge, the first genome-wide predictions of target site clusters for mature human miRNAs. For each miRNA, we predict the target sites on a genome-wide scale, build a graph with edge weights based on the pairwise distances between sites, and apply Markov clustering to identify genomic regions with high binding site density. Significant clusters are then extracted based on the cluster size difference between real and shuffled genomes preserving local properties such as the GC content. We then use conservation and binding energy to filter a final set of miRNA target site clusters, or sponge candidates. Our pipeline predicts 3673 sponge candidates for 1250 miRNAs, including the experimentally verified miR-7 sponge ciRS-7. In addition, we point explicitly to 19 high-confidence candidates overlapping annotated genomic sequence. The full list of candidates is freely available at http://rth.dk/resources/mirnasponge, where detailed properties for individual candidates can be explored, such as alignment details, conservation, accessibility and target profiles, which facilitates selection of sponge candidates for further context-specific analysis.
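The idea of grouping nearby predicted binding sites into dense regions can be sketched in a few lines of Python. This sketch deliberately swaps in a much simpler technique than the Markov clustering named in the abstract: it links any two sites closer than a distance threshold and takes connected components, which on a one-dimensional coordinate axis amounts to splitting the sorted positions at large gaps. The function name `cluster_sites`, both thresholds, and the coordinates are hypothetical.

```python
from typing import List


def cluster_sites(positions: List[int], max_gap: int = 50,
                  min_sites: int = 3) -> List[List[int]]:
    """Group site coordinates into clusters (simple stand-in for Markov clustering).

    Sites are linked when they lie within `max_gap` bases of each other;
    connected groups with fewer than `min_sites` members are discarded.
    """
    clusters: List[List[int]] = []
    current: List[int] = []
    for pos in sorted(positions):
        # A gap larger than max_gap closes the cluster being built.
        if current and pos - current[-1] > max_gap:
            if len(current) >= min_sites:
                clusters.append(current)
            current = []
        current.append(pos)
    if len(current) >= min_sites:
        clusters.append(current)
    return clusters


# Made-up site coordinates for one miRNA on one chromosome.
sites = [930, 100, 150, 400, 900, 120, 945, 960]
dense_regions = cluster_sites(sites)
```

Assessing significance against shuffled genomes and filtering by conservation and binding energy, as the abstract describes, would be separate downstream steps.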
Affiliation(s)
- Xiaoyong Pan
  - Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Frederiksberg, Denmark
  - Disease Systems Biology Program, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
- Anne Wenzel
  - Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Frederiksberg, Denmark
- Lars Juhl Jensen
  - Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Frederiksberg, Denmark
  - Disease Systems Biology Program, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
- Jan Gorodkin
  - Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Frederiksberg, Denmark
8
Fleming DS, Miller LC. Identification of small non-coding RNA classes expressed in swine whole blood during HP-PRRSV infection. Virology 2018; 517:56-61. [PMID: 29429554 DOI: 10.1016/j.virol.2018.01.027] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 01/02/2018] [Accepted: 01/30/2018] [Indexed: 02/06/2023]
Abstract
It has been established that reduced susceptibility to porcine reproductive and respiratory syndrome virus (PRRSV) has a genetic component. This genetic component may take the form of small non-coding RNAs (sncRNA), which are molecules that function as regulators of gene expression. Various sncRNAs have emerged as having an important role in the immune system in humans. The study uses transcriptomic read counts to identify and quantify the classes of sncRNA, both well characterized and less characterized (such as microRNAs and small nucleolar RNAs), expressed in whole blood of healthy and highly pathogenic PRRSV-infected pigs. Our results returned evidence of nine classes of sncRNA, four of which differed consistently and statistically significantly based on Fisher's Exact Test, that can be detected and possibly interrogated for their effect on host dysregulation during PRRSV infections.
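For readers unfamiliar with the test named above, the following Python sketch computes a two-sided Fisher's exact test for a 2x2 table from first principles. How the study arranged its read counts is not stated in the abstract; the layout assumed here (reads inside versus outside one sncRNA class, in healthy versus infected samples) and the function name are illustrative assumptions only.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Assumed layout (hypothetical): rows are conditions, columns are
    "reads in the class" and "reads outside the class".
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one;
    # the small relative tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Textbook-sized example table, not data from the study.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

Exact binomial coefficients become slow for realistic read depths, so an analysis at sequencing scale would more likely rely on an established statistics library.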
Affiliation(s)
- Damarius S Fleming
  - ORAU/ORISE, Oak Ridge, TN, USA
  - Virus and Prion Diseases of Livestock Research Unit, National Animal Disease Center, USDA, Agricultural Research Service, P.O. Box 70, Ames, IA 50010-0070, USA
- Laura C Miller
  - Virus and Prion Diseases of Livestock Research Unit, National Animal Disease Center, USDA, Agricultural Research Service, P.O. Box 70, Ames, IA 50010-0070, USA
9
Abstract
Over the last two decades it has become clear that RNA is much more than just a boring intermediate in protein expression. Ancient RNAs still appear in the core information metabolism and comprise a surprisingly large component in bacterial gene regulation. A common theme with these types of mostly small RNAs is their reliance on conserved secondary structures. Large-scale sequencing projects, on the other hand, have profoundly changed our understanding of eukaryotic genomes. Pervasively transcribed, they give rise to a plethora of large and evolutionarily extremely flexible noncoding RNAs that exert a vastly diverse array of molecular functions. In this chapter we provide a (necessarily incomplete) overview of the current state of comparative analysis of noncoding RNAs, emphasizing computational approaches as a means to gain a global picture of the modern RNA world.
Affiliation(s)
- Rolf Backofen
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, D-79110 Freiburg, Germany
  - Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark
- Jan Gorodkin
  - Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark
- Ivo L Hofacker
  - Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark
  - Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Wien, Austria
  - Bioinformatics and Computational Biology Research Group, University of Vienna, Währingerstraße 17, A-1090 Vienna, Austria
- Peter F Stadler
  - Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark
  - Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Wien, Austria
  - Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstraße 16-18, D-04107 Leipzig, Germany
  - Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, D-04103 Leipzig, Germany
  - Fraunhofer Institute for Cell Therapy and Immunology, Perlickstraße 1, D-04103 Leipzig, Germany
  - Santa Fe Institute, 1399 Hyde Park Rd, Santa Fe, NM 87501, USA
10
van Son M, Kent MP, Grove H, Agarwal R, Hamland H, Lien S, Grindflek E. Fine mapping of a QTL affecting levels of skatole on pig chromosome 7. BMC Genet 2017; 18:85. [PMID: 29020941 PMCID: PMC5637327 DOI: 10.1186/s12863-017-0549-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 02/08/2017] [Accepted: 09/11/2017] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Previous studies in the Norwegian pig breeds Landrace and Duroc have revealed a QTL for levels of skatole located in the region 74.7-80.5 Mb on SSC7. Skatole is one of the main components causing boar taint, which gives an undesirable smell and taste to the pig meat when heated. Surgical castration of boars is a common practice to reduce the risk of boar taint. However, selection for boars genetically predisposed to low levels of taint would help eliminate the need for castration and be advantageous for both economic and welfare reasons. In order to identify the causal mutation(s) for the QTL and/or identify genetic markers for selection purposes, we performed a fine mapping of the SSC7 skatole QTL region. RESULTS: A dense set of markers on SSC7 was obtained by whole genome re-sequencing of 24 Norwegian Landrace and 23 Duroc boars. Subsets of 126 and 157 SNPs were used for association analyses in Landrace and Duroc, respectively. Significant single markers associated with skatole spanned a large 4.4 Mb region from 75.9-80.3 Mb in Landrace, with the highest test scores found in a region between the genes NOVA1 and TGM1 (p < 0.001). The same QTL was obtained in Duroc, although less significant, with associated SNPs spanning a 1.2 Mb region from 78.9-80.1 Mb (p < 0.01). The highest test scores in Duroc were found in genes of the granzyme family (GZMB and GZMH-like) and STXBP6. Haplotypes associated with levels of skatole were identified in Landrace but not in Duroc, and a haplotype block was found to explain 2.3% of the phenotypic variation for skatole. The SNPs in this region were not associated with levels of sex steroids. CONCLUSIONS: Fine mapping of a QTL for skatole on SSC7 confirmed associations of this region with skatole levels in pigs. The QTL region was narrowed down to 4.4 Mb in Landrace, and haplotypes explaining 2.3% of the phenotypic variance for skatole levels were identified. Results confirmed that sex steroids are not affected by this QTL region, making these markers attractive for selection against boar taint.
Affiliation(s)
- Maren van Son
  - Topigs Norsvin, Storhamargata 44, 2317, Hamar, Norway
- Matthew P Kent
  - Centre for Integrative Genetics (CIGENE), Department of Animal and Aquacultural Sciences, Faculty of Biosciences, Norwegian University of Life Sciences, P. O. Box 5003, 1432, Ås, Norway
- Harald Grove
  - Centre for Integrative Genetics (CIGENE), Department of Animal and Aquacultural Sciences, Faculty of Biosciences, Norwegian University of Life Sciences, P. O. Box 5003, 1432, Ås, Norway
- Rahul Agarwal
  - Centre for Integrative Genetics (CIGENE), Department of Animal and Aquacultural Sciences, Faculty of Biosciences, Norwegian University of Life Sciences, P. O. Box 5003, 1432, Ås, Norway
- Hanne Hamland
  - Topigs Norsvin, Storhamargata 44, 2317, Hamar, Norway
- Sigbjørn Lien
  - Centre for Integrative Genetics (CIGENE), Department of Animal and Aquacultural Sciences, Faculty of Biosciences, Norwegian University of Life Sciences, P. O. Box 5003, 1432, Ås, Norway
- Eli Grindflek
  - Topigs Norsvin, Storhamargata 44, 2317, Hamar, Norway
11
Kirk IK, Weinhold N, Brunak S, Belling K. The impact of the protein interactome on the syntenic structure of mammalian genomes. PLoS One 2017; 12:e0179112. [PMID: 28910296 PMCID: PMC5598925 DOI: 10.1371/journal.pone.0179112] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 10/26/2016] [Accepted: 05/10/2017] [Indexed: 02/06/2023]
Abstract
Conserved synteny denotes evolutionarily preserved gene order across species. It is not well understood to what degree functional relationships between genes are preserved in syntenic blocks. Here we investigate whether protein-coding genes conserved in mammalian syntenic blocks encode gene products that serve the common functional purpose of interacting at the protein level, i.e. connectivity. High connectivity among protein-protein interactions (PPIs) was only moderately associated with conserved synteny on a genome-wide scale. However, we observed a smaller subset of 3.6% of all syntenic blocks with high-confidence PPIs that had significantly higher connectivity than expected by chance. Additionally, syntenic blocks with high-confidence PPIs contained significantly more chromatin loops than the remaining blocks, indicating functional preservation among these syntenic blocks. Conserved synteny is typically defined by sequence similarity. In this study, we also examined whether a functional relationship, here PPI connectivity, can identify syntenic blocks independently of orthology. While orthology-based syntenic blocks with high-confidence PPIs and the connectivity-based syntenic blocks largely overlapped, the connectivity-based approach identified additional syntenic blocks that were not found by conventional sequence-based methods alone. Additionally, the connectivity-based approach enabled identification of potential orthologous genes between species. Our analyses demonstrate that subsets of syntenic blocks are associated with highly connected proteins, and that PPI connectivity can be used to detect conserved synteny even if sequence conservation drifts beyond what orthology algorithms normally can identify.
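One common way to ask whether a block's connectivity is higher than expected by chance is a permutation test, sketched below in Python. The abstract does not state which null model the authors used, so this is an assumed approach: the number of PPI edges inside a block is compared with the counts obtained for random gene sets of the same size. All names (`count_internal_edges`, `connectivity_p_value`), the gene labels, and the toy network are hypothetical.

```python
import random
from itertools import combinations
from typing import FrozenSet, Sequence, Set, Tuple

Edge = FrozenSet[str]


def count_internal_edges(genes: Sequence[str], ppi: Set[Edge]) -> int:
    """Number of PPI edges whose two endpoints both lie in `genes`."""
    return sum(frozenset(pair) in ppi for pair in combinations(genes, 2))


def connectivity_p_value(block: Sequence[str], universe: Sequence[str],
                         ppi: Set[Edge], n_perm: int = 200,
                         seed: int = 0) -> Tuple[int, float]:
    """Empirical p-value for a block's connectivity (assumed null model).

    The null distribution comes from random gene sets of the same size
    drawn without replacement from `universe`.
    """
    rng = random.Random(seed)
    observed = count_internal_edges(block, ppi)
    hits = sum(
        count_internal_edges(rng.sample(universe, len(block)), ppi) >= observed
        for _ in range(n_perm)
    )
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (1 + hits) / (1 + n_perm)


# Toy data: 20 genes, with interactions only among the first four.
genes = [f"g{i}" for i in range(20)]
toy_block = genes[:4]
toy_ppi = {frozenset(pair) for pair in combinations(toy_block, 2)}
observed_edges, p_empirical = connectivity_p_value(toy_block, genes, toy_ppi)
```

A genome-scale analysis would likely need a null model that also respects gene density and degree distribution, which this sketch ignores.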
Affiliation(s)
- Isa Kristina Kirk
  - Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Nils Weinhold
  - Memorial Sloan Kettering Cancer Center, Computational Biology Program, New York, NY, United States of America
- Søren Brunak
  - Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Kirstine Belling
  - Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
12
|
Dawson HD, Chen C, Gaynor B, Shao J, Urban JF. The porcine translational research database: a manually curated, genomics and proteomics-based research resource. BMC Genomics 2017; 18:643. [PMID: 28830355 PMCID: PMC5568366 DOI: 10.1186/s12864-017-4009-7] [Citation(s) in RCA: 39] [Impact Index Per Article: 5.6] [Received: 10/12/2016] [Accepted: 08/02/2017] [Indexed: 12/20/2022]
Abstract
BACKGROUND The use of swine in biomedical research has increased dramatically in the last decade. Diverse genomic and proteomic databases have been developed to facilitate research using human and rodent models. Current porcine gene databases, however, lack the robust annotation to study pig models that are relevant to human studies and for comparative evaluation with rodent models. Furthermore, they contain a significant number of errors due to their primary reliance on machine-based annotation. To address these deficiencies, a comprehensive literature-based survey was conducted to identify certain selected genes that have demonstrated function in humans, mice or pigs. RESULTS The process identified 13,054 candidate human, bovine, mouse or rat genes/proteins used to select potential porcine homologs by searching multiple online sources of porcine gene information. The data in the Porcine Translational Research Database (http://www.ars.usda.gov/Services/docs.htm?docid=6065) is supported by >5800 references, and contains 65 data fields for each entry, including >9700 full length (5' and 3') unambiguous pig sequences, >2400 real time PCR assays and reactivity information on >1700 antibodies. It also contains gene and/or protein expression data for >2200 genes and identifies and corrects 8187 errors (gene duplication artifacts, mis-assemblies, mis-annotations, and incorrect species assignments) for 5337 porcine genes. CONCLUSIONS This database is the largest manually curated database for any single veterinary species and is unique among porcine gene databases in regard to linking gene expression to gene function, identifying related gene pathways, and connecting data with other porcine gene databases. This database provides the first comprehensive description of three major superfamilies or functionally related groups of proteins (Cluster of Differentiation (CD) Marker genes, Solute Carrier Superfamily, ATP binding Cassette Superfamily), and a comparative description of porcine microRNAs.
Affiliation(s)
- Harry D Dawson
- United States Department of Agriculture, Agricultural Research Service, Beltsville Human Nutrition Research Center, Diet, Genomics and Immunology Laboratory, Beltsville, MD, USA
- Celine Chen
- United States Department of Agriculture, Agricultural Research Service, Beltsville Human Nutrition Research Center, Diet, Genomics and Immunology Laboratory, Beltsville, MD, USA
- Brady Gaynor
- United States Department of Agriculture, Agricultural Research Service, Beltsville Agricultural Research Center, Molecular Plant Pathology Lab, Beltsville, MD, 20705, USA
- Jonathan Shao
- United States Department of Agriculture, Agricultural Research Service, Beltsville Agricultural Research Center, Molecular Plant Pathology Lab, Beltsville, MD, 20705, USA
- Joseph F Urban
- United States Department of Agriculture, Agricultural Research Service, Beltsville Human Nutrition Research Center, Diet, Genomics and Immunology Laboratory, Beltsville, MD, USA
|
13
|
Structural Variant Detection by Large-scale Sequencing Reveals New Evolutionary Evidence on Breed Divergence between Chinese and European Pigs. Sci Rep 2016; 6:18501. [PMID: 26729041 PMCID: PMC4700453 DOI: 10.1038/srep18501] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 06/26/2015] [Accepted: 11/19/2015] [Indexed: 01/28/2023]
Abstract
In this study, we performed genome-wide SV detection among the genomes of thirteen pigs from diverse breeds of Chinese and European origin by next-generation sequencing, and constructed a single-nucleotide resolution map involving 56,930 putative SVs. We first identified an SV hotspot spanning a 35 Mb region on the X chromosome specifically in the genomes of individuals of Chinese origin. Further scrutinizing this region using large-scale sequencing data from an additional 111 individuals, we obtained confirmatory evidence for our initial finding. Moreover, thirty-five SV-related genes within the hotspot region, which are important for reproductive ability, showed significantly different evolutionary rates between breeds of Chinese and European origin. The SV hotspot identified herein offers novel evidence for assessing phylogenetic relationships, and likely explains the genetic difference of corresponding phenotypes and features among Chinese and European pig breeds. Furthermore, we employed various SVs to infer the genetic structure of the individuals surveyed. We found that SVs can clearly detect differences in genetic background among individuals. This suggests that genome-wide SVs can capture the majority of genetic variation and be applied in cladistic analyses. Characterizing whole-genome SVs demonstrated that SVs are significantly enriched or depleted in various genomic features.
|
14
|
Robert C, Kapetanovic R, Beraldi D, Watson M, Archibald AL, Hume DA. Identification and annotation of conserved promoters and macrophage-expressed genes in the pig genome. BMC Genomics 2015; 16:970. [PMID: 26582032 PMCID: PMC4652390 DOI: 10.1186/s12864-015-2111-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 05/05/2015] [Accepted: 10/19/2015] [Indexed: 01/09/2023]
Abstract
BACKGROUND The FANTOM5 consortium used Cap Analysis of Gene Expression (CAGE) tag sequencing to produce a comprehensive atlas of promoters and enhancers within the human and mouse genomes. We reasoned that the mapping of these regulatory elements to the pig genome could provide useful annotation and evidence to support assignment of orthology. RESULTS For human transcription start sites (TSS) associated with annotated human-mouse orthologs, 17% mapped to the pig genome but not to the mouse, 10% mapped only to the mouse, and 55% mapped to both pig and mouse. Around 17% did not map to either species. The mapping percentages were lower where there was no clear orthology relationship, but in every case, mapping to pig was greater than to mouse, and the degree of homology was also greater. Combined mapping of mouse and human CAGE-defined promoters identified at least one putative conserved TSS for >16,000 protein-coding genes. About 54% of the predicted locations of regulatory elements in the pig genome were supported by CAGE and/or RNA-Seq analysis from pig macrophages. CONCLUSIONS Comparative mapping of promoters and enhancers from humans and mice can provide useful preliminary annotation of other animal genomes. The data also confirm extensive gain and loss of regulatory elements between species, and the likelihood that pigs provide a better model than mice for human gene regulation and function.
Affiliation(s)
- Christelle Robert
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush, EH25 9RG, Edinburgh, UK
- Ronan Kapetanovic
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD 4072, Australia
- Dario Beraldi
- Cancer Research UK, Cambridge Research Institute, Li Ka Shing Center, Robinson Way, Cambridge, CB2 0RE, UK
- Mick Watson
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush, EH25 9RG, Edinburgh, UK
- Edinburgh Genomics, University of Edinburgh, Easter Bush, Edinburgh, EH25 9RG, UK
- Alan L Archibald
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush, EH25 9RG, Edinburgh, UK
- David A Hume
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush, EH25 9RG, Edinburgh, UK
|
15
|
Hecker N, Christensen-Dalsgaard M, Seemann SE, Havgaard JH, Stadler PF, Hofacker IL, Nielsen H, Gorodkin J. Optimizing RNA structures by sequence extensions using RNAcop. Nucleic Acids Res 2015; 43:8135-45. [PMID: 26283181 PMCID: PMC4787817 DOI: 10.1093/nar/gkv813] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 05/17/2015] [Revised: 07/28/2015] [Accepted: 07/30/2015] [Indexed: 12/26/2022]
Abstract
A key aspect of RNA secondary structure prediction is the identification of novel functional elements. This is a challenging task because these elements are typically embedded in longer transcripts, where the borders between the element and its flanking regions have to be defined. The flanking sequences impact the folding of the functional elements both at the level of computational analyses and when the element is extracted as a transcript for experimental analysis. Here, we analyze how different flanking region lengths impact folding into a constrained structure by computing probabilities of folding for different sizes of flanking regions. Our method, RNAcop (RNA context optimization by probability), is tested on known and de novo predicted structures. In vitro experiments support the computational analysis and suggest that for a number of structures, choosing proper lengths of flanking regions is critical. RNAcop is available as a web server and as stand-alone software via http://rth.dk/resources/rnacop.
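As an illustration of the idea in this abstract, the probability that a sequence folds into a constrained structure can be written as a ratio of partition functions, P = Z_constrained / Z = exp(-(G_constrained - G) / RT), where G is the ensemble free energy. The sketch below applies that standard relation to pick the flank lengths with the highest probability. It is an assumption that this matches RNAcop's internal computation; the function names and the energy values are hypothetical and only serve the example.

```python
import math

# Gas constant in kcal/(mol*K) and 37 degrees Celsius in kelvin.
GAS_CONSTANT = 0.0019872
TEMPERATURE = 310.15


def constraint_probability(ensemble_energy, constrained_energy):
    """Probability of the constrained structure class.

    Both arguments are ensemble free energies in kcal/mol: one for the
    unconstrained fold, one for the fold restricted to the structure of
    interest. P = exp(-(G_constrained - G) / RT).
    """
    delta = constrained_energy - ensemble_energy
    return math.exp(-delta / (GAS_CONSTANT * TEMPERATURE))


def best_flank(energies):
    """Return the (left, right) flank lengths with the highest probability.

    `energies` maps (left, right) flank lengths to a pair
    (ensemble_energy, constrained_energy).
    """
    return max(energies, key=lambda k: constraint_probability(*energies[k]))


# Hypothetical energies for illustration only (not taken from the paper).
example = {
    (0, 0): (-10.0, -9.0),
    (10, 10): (-18.0, -14.5),
    (20, 20): (-25.0, -24.6),
}

if __name__ == "__main__":
    for flanks, (g, g_c) in example.items():
        print(flanks, round(constraint_probability(g, g_c), 3))
    print("best flanks:", best_flank(example))
```

In practice the two free energies would come from a folding package run with and without the structure constraint for each candidate flank length; the selection step itself stays the same.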
Affiliation(s)
- Nikolai Hecker
- Center for non-coding RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, 1870 Frederiksberg C, Denmark; Department of Veterinary Clinical and Animal Science, University of Copenhagen, Grønnegårdsvej 3, 1870 Frederiksberg C, Denmark
- Mikkel Christensen-Dalsgaard
- Center for non-coding RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, 1870 Frederiksberg C, Denmark; Department of Cellular and Molecular Medicine, Panum Institute, University of Copenhagen, Blegdamsvej 3, 2200 Copenhagen N, Denmark
- Stefan E Seemann
- Center for non-coding RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, 1870 Frederiksberg C, Denmark; Department of Veterinary Clinical and Animal Science, University of Copenhagen, Grønnegårdsvej 3, 1870 Frederiksberg C, Denmark
- Jakob H Havgaard
- Center for non-coding RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, 1870 Frederiksberg C, Denmark; Department of Veterinary Clinical and Animal Science, University of Copenhagen, Grønnegårdsvej 3, 1870 Frederiksberg C, Denmark
- Peter F Stadler
- Center for non-coding RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, 1870 Frederiksberg C, Denmark; Bioinformatics Group, Department of Computer Science & IZBI-Interdisciplinary Center for Bioinformatics & LIFE-Leipzig Research Center for Civilization Diseases, University Leipzig, Härtelstraße 16-18, 04107 Leipzig, Germany; Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, 1090 Wien, Austria; Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, 04103 Leipzig, Germany; Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA
- Ivo L Hofacker
- Center for non-coding RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, 1870 Frederiksberg C, Denmark; Institute for Theoretical Chemistry, University of Vienna, Währingerstraße 17, 1090 Wien, Austria
- Henrik Nielsen
- Center for non-coding RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, 1870 Frederiksberg C, Denmark; Department of Cellular and Molecular Medicine, Panum Institute, University of Copenhagen, Blegdamsvej 3, 2200 Copenhagen N, Denmark
- Jan Gorodkin
- Center for non-coding RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, 1870 Frederiksberg C, Denmark; Department of Veterinary Clinical and Animal Science, University of Copenhagen, Grønnegårdsvej 3, 1870 Frederiksberg C, Denmark
|
16
|
van der Kolk JH, Pacholewska A, Gerber V. The role of microRNAs in equine medicine: a review. Vet Q 2015; 35:88-96. [PMID: 25695624 DOI: 10.1080/01652176.2015.1021186] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 12/16/2022]
Abstract
The search for new markers of diseases in human as well as veterinary medicine is ongoing. Recently, microRNAs (miRNAs or miRs) have emerged as potential new biomarkers. MiRNAs are short sequences of RNA (∼22 nucleotides) that regulate gene expression via their target messenger RNA (mRNA). Circulating miRNAs in blood can be used as novel diagnostic markers for diseases due to their evolutionary conservation and stability. As a consequence of their systemic and manifold effects on gene expression in various target organs, the concept that miRNAs could function as hormones has been suggested. This review summarizes the biogenesis, maturation, and stability of miRNAs and discusses their use as potential biomarkers in equine medicine. To date, over 700 equine miRNAs have been identified, with distinct subsets of miRNAs differentially expressed in a tissue-specific manner. A physiological involvement of various miRNAs in the regulation of cell survival, steroidogenesis, and differentiation during follicle selection and ovulation in the monovular equine ovary has been demonstrated. Furthermore, miRNAs might be used as novel diagnostic markers for myopathies such as polysaccharide storage myopathy and recurrent exertional rhabdomyolysis as well as osteochondrosis. Preliminary data indicate that miRNAs in blood might play important roles in the equine glucose metabolism pathway. Of note, breed differences have been reported regarding the normal equine miRNA signature. For disease prevention, it is of utmost importance to identify disease-associated biomarkers which help detect diseases before symptoms appear. As such, circulating miRNAs represent promising novel diagnostic markers in equine medicine.
Affiliation(s)
- J H van der Kolk
- Department of Clinical Veterinary Medicine, Vetsuisse Faculty, Swiss Institute for Equine Medicine (ISME), University of Bern and Agroscope, Länggassstrasse 124, 3012 Bern, Switzerland
|
17
|
Paczynska P, Grzemski A, Szydlowski M. Distribution of miRNA genes in the pig genome. BMC Genet 2015; 16:6. [PMID: 25632794 PMCID: PMC4318388 DOI: 10.1186/s12863-015-0166-3] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 08/13/2013] [Accepted: 01/16/2015] [Indexed: 12/13/2022]
Abstract
Background Recent completion of the swine genome may simplify the production of swine as a large biomedical model. Here we studied the sequence and location of known swine miRNA genes, key regulators of protein-coding genes at the level of RNA, and compared them to human and mouse data to prioritize future molecular studies. Results The distribution of miRNA genes in the pig genome shows no particular relation to different genomic features, including protein-coding genes: the proportions of miRNA genes in intergenic regions, introns and exons roughly agree with the size of these regions in the pig genome. Our analyses indicate that host genes harbouring intragenic miRNAs are longer than other protein-coding genes; however, no important GO enrichment was found. Swine mature miRNAs show high sequence similarity to their human and mouse orthologues. The location of miRNA genes relative to protein-coding genes is also similar among the studied species; however, there are differences in the precise position in particular intergenic regions and within particular hosts. The most prominent difference between pig and human miRNAs is a large group of pig-specific sequences (53% of swine miRNAs). We found no evidence that this group of evolutionarily new pig miRNAs differs from old miRNA genes with respect to genomic location, except that they are less likely to be clustered. Conclusions There are differences in the precise location of orthologous miRNA genes in particular intergenic regions and within particular hosts, and their meaning for coexpression with protein-coding genes deserves experimental study. Future functional studies of the large group of pig-specific sequences may reveal limits of the pig as a model organism for studying human gene expression.
Affiliation(s)
- Paulina Paczynska
- Department of Genetics and Animal Breeding, Poznan University of Life Sciences, Wolynska 33, 60-637, Poznan, Poland
- Adrian Grzemski
- Department of Genetics and Animal Breeding, Poznan University of Life Sciences, Wolynska 33, 60-637, Poznan, Poland
- Maciej Szydlowski
- Department of Genetics and Animal Breeding, Poznan University of Life Sciences, Wolynska 33, 60-637, Poznan, Poland
|
18
|
Nawrocki EP, Burge SW, Bateman A, Daub J, Eberhardt RY, Eddy SR, Floden EW, Gardner PP, Jones TA, Tate J, Finn RD. Rfam 12.0: updates to the RNA families database. Nucleic Acids Res 2014; 43:D130-7. [PMID: 25392425 PMCID: PMC4383904 DOI: 10.1093/nar/gku1063] [Citation(s) in RCA: 762] [Impact Index Per Article: 76.2] [Indexed: 01/15/2023]
Abstract
The Rfam database (available at http://rfam.xfam.org) is a collection of non-coding RNA families represented by manually curated sequence alignments, consensus secondary structures and annotation gathered from corresponding Wikipedia, taxonomy and ontology resources. In this article, we detail updates and improvements to the Rfam data and website for the Rfam 12.0 release. We describe the upgrade of our search pipeline to use Infernal 1.1 and demonstrate its improved homology detection ability by comparison with the previous version. The new pipeline is easier for users to apply to their own data sets, and we illustrate its ability to annotate RNAs in genomic and metagenomic data sets of various sizes. Rfam has been expanded to include 260 new families, including the well-studied large subunit ribosomal RNA family, and for the first time includes information on short sequence- and structure-based RNA motifs present within families.
Affiliation(s)
- Sarah W Burge
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Alex Bateman
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Jennifer Daub
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Ruth Y Eberhardt
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Sean R Eddy
- HHMI Janelia Farm Research Campus, Ashburn, VA, USA
- Evan W Floden
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Paul P Gardner
- Biomolecular Interaction Centre, School of Biological Sciences, University of Canterbury, Christchurch, New Zealand
- John Tate
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Robert D Finn
- HHMI Janelia Farm Research Campus, Ashburn, VA, USA; European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
|