1
|
Enriched atlas of lncRNA and protein-coding genes for the GRCg7b chicken assembly and its functional annotation across 47 tissues. Sci Rep 2024; 14:6588. [PMID: 38504112 PMCID: PMC10951430 DOI: 10.1038/s41598-024-56705-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Accepted: 03/09/2024] [Indexed: 03/21/2024] Open
Abstract
Gene atlases for livestock are steadily improving thanks to new genome assemblies and new expression data that refine gene annotation. However, gene content varies across databases due to differences in RNA sequencing data and bioinformatics pipelines, especially for long non-coding RNAs (lncRNAs), which have higher tissue and developmental specificity and are harder to identify consistently than protein-coding genes (PCGs). As done previously in 2020 for chicken assemblies galgal5 and GRCg6a, we provide a new lncRNA-enriched gene atlas for the latest GRCg7b chicken assembly, integrating the "NCBI RefSeq" and "EMBL-EBI Ensembl/GENCODE" reference annotations and other resources such as FAANG and NONCODE. As a result, the number of PCGs increases from 18,022 (RefSeq) and 17,007 (Ensembl) to 24,102, and that of lncRNAs from 5789 (RefSeq) and 11,944 (Ensembl) to 44,428. Using 1400 public RNA-seq transcriptomes representing 47 tissues, we provided expression evidence for 35,257 (79%) lncRNAs and 22,468 (93%) PCGs, supporting the relevance of this atlas. Further characterization, including tissue-specificity, sex-differential expression and gene configurations, is provided. We also identified conserved miRNA-hosting genes with human counterparts, suggesting common function. The annotated atlas is available at gega.sigenae.org.
|
2
|
Watch Out for a Second SNP: Focus on Multi-Nucleotide Variants in Coding Regions and Rescued Stop-Gained. Front Genet 2021; 12:659287. [PMID: 34306009 PMCID: PMC8293744 DOI: 10.3389/fgene.2021.659287] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/27/2021] [Accepted: 05/27/2021] [Indexed: 12/30/2022] Open
Abstract
Most single-nucleotide polymorphisms (SNPs) are located in non-coding regions, but the fraction usually studied lies in protein-coding regions because potential impacts on proteins are relatively easy to predict with popular tools such as the Variant Effect Predictor. These tools annotate variants independently, without considering the potential effect of grouped or haplotypic variations, often called "multi-nucleotide variants" (MNVs). Here, we used a large RNA-seq dataset to survey MNVs, comprising 382 chicken samples originating from 11 populations analyzed in the companion paper, in which 9.5M SNPs (including 3.3M SNPs with reliable genotypes) were detected. We focused our study on in-codon MNVs and evaluated their potential mis-annotation. Using GATK HaplotypeCaller read-based phasing results, we identified 2,965 MNVs observed in at least five individuals and located in 1,792 genes. We found that 41.1% of them showed a novel impact when compared to the effect of their constituent SNPs analyzed separately. The largest change in annotation concerns the originally annotated stop-gained consequences, of which around 95% were rescued; it is followed by the missense consequences, of which 37% were reannotated with a different amino acid. We then present in more depth the rescued stop-gained MNVs and give an illustration in the SLC27A4 gene. As previously shown in human datasets, our results in chicken demonstrate the value of haplotype-aware variant annotation and of considering MNVs in the coding region, particularly when searching for severe functional consequences such as stop-gained variants.
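The in-codon MNV effect described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline nor the Variant Effect Predictor's logic: the codon `CAG`, the two substitutions, and the function names `apply_snps` and `consequence` are hypothetical choices made for illustration, and only the standard genetic code is assumed.

```python
# Standard genetic code, built from the conventional T, C, A, G base ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def apply_snps(codon, snps):
    """Return the codon after applying (position, alternate_base) substitutions."""
    bases = list(codon)
    for position, alt in snps:
        bases[position] = alt
    return "".join(bases)


def consequence(ref_codon, alt_codon):
    """Classify the coding consequence of replacing ref_codon by alt_codon."""
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        return "synonymous"
    if alt_aa == "*":
        return "stop_gained"
    if ref_aa == "*":
        return "stop_lost"
    return "missense"


# Hypothetical example: two SNPs falling in the same reference codon CAG (Gln).
ref = "CAG"
snp1 = (0, "T")  # alone: CAG -> TAG (stop)
snp2 = (1, "G")  # alone: CAG -> CGG (Arg)

alone_1 = consequence(ref, apply_snps(ref, [snp1]))
alone_2 = consequence(ref, apply_snps(ref, [snp2]))
together = consequence(ref, apply_snps(ref, [snp1, snp2]))  # CAG -> TGG (Trp)
print(alone_1, alone_2, together)
```

Annotated separately, the first SNP is a stop-gained; annotated on the same haplotype as the second SNP, the codon encodes tryptophan, so the stop is "rescued" and the joint consequence is a missense change.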
|
3
|
RNA-Seq Data for Reliable SNP Detection and Genotype Calling: Interest for Coding Variant Characterization and Cis-Regulation Analysis by Allele-Specific Expression in Livestock Species. Front Genet 2021; 12:655707. [PMID: 34262593 PMCID: PMC8273700 DOI: 10.3389/fgene.2021.655707] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 01/19/2021] [Accepted: 06/01/2021] [Indexed: 12/19/2022] Open
Abstract
In addition to their common use in studying gene expression, RNA-seq data accumulated over the last 10 years are a yet-unexploited resource of SNPs in numerous individuals from different populations. SNP detection by RNA-seq is particularly interesting for livestock species since whole genome sequencing is expensive and exome sequencing tools are unavailable. These SNPs detected in expressed regions can be used to characterize variants affecting protein functions, and to study cis-regulated genes by analyzing allele-specific expression (ASE) in the tissue of interest. However, gene expression can be highly variable, and filters for SNP detection using the popular GATK toolkit are not yet standardized, making SNP detection and genotype calling by RNA-seq a challenging endeavor. We compared SNP calling results using GATK suggested filters on two chicken populations for which both RNA-seq and DNA-seq data were available for the same samples of the same tissue. We showed, in expressed regions, an RNA-seq precision of 91% (SNPs detected by RNA-seq and shared by DNA-seq) and we characterized the remaining 9% of SNPs. We then studied the genotype (GT) obtained by RNA-seq and the impact of two factors (GT call-rate and read number per GT) on the concordance of GT with DNA-seq; we proposed thresholds for them leading to a 95% concordance. Applying these thresholds to 767 multi-tissue RNA-seq samples from 382 birds of 11 chicken populations, we found 9.5 M SNPs in total, of which ∼550,000 SNPs per tissue and population had a reliable GT (call rate ≥ 50%) and, among them, ∼340,000 had a MAF ≥ 10%. 
We showed that such RNA-seq data from one tissue can be used to (i) detect SNPs with a strong predicted impact on proteins, despite their scarcity in each population (16,307 SIFT deleterious missenses and 590 stop-gained), (ii) study, on a large scale, cis-regulations of gene expression, with ∼81% of protein-coding and 68% of long non-coding genes (TPM ≥ 1) that can be analyzed for ASE, and with ∼29% of them that were cis-regulated, and (iii) analyze population genetics using such SNPs located in expressed regions. This work shows that RNA-seq data can be used with good confidence to detect SNPs and associated GT within various populations and to use them for different analyses, as in GTEx studies.
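The quantities named in this abstract (precision of RNA-seq SNP calls against DNA-seq, genotype call rate, and minor allele frequency) can be sketched as follows. This is an illustrative Python sketch and not the authors' GATK-based workflow: the function names and the genotype encoding (alternate-allele count 0, 1 or 2, with `None` for a missing call) are assumptions, and only the two thresholds quoted above (call rate ≥ 50%, MAF ≥ 10%) are taken from the text. The read-number-per-genotype threshold is not stated in the abstract and is therefore left out.

```python
def precision(rna_snps, dna_snps):
    """Fraction of SNPs detected by RNA-seq that are also detected by DNA-seq."""
    rna_snps = set(rna_snps)
    if not rna_snps:
        raise ValueError("no RNA-seq SNPs to evaluate")
    return len(rna_snps & set(dna_snps)) / len(rna_snps)


def call_rate(genotypes):
    """Fraction of individuals with a called (non-missing) genotype."""
    return sum(g is not None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes):
    """MAF computed over called genotypes only (each individual is diploid)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq)


def keep_snp(genotypes, min_call_rate=0.5, min_maf=0.10):
    """Apply the two thresholds quoted in the abstract to one SNP."""
    return (call_rate(genotypes) >= min_call_rate
            and minor_allele_frequency(genotypes) >= min_maf)


# Hypothetical toy data: SNPs are identified by (chromosome, position) pairs.
rna = {("1", 100), ("1", 250), ("2", 40), ("3", 7)}
dna = {("1", 100), ("1", 250), ("2", 40)}
print(precision(rna, dna))

print(keep_snp([0, 1, None, 2]))           # well called and polymorphic
print(keep_snp([0, None, None, None]))     # call rate too low
print(keep_snp([0, 0, 0, 0, 0, 0, 0, 0, 0, 1]))  # MAF too low
```

Computing the MAF over called genotypes only is a design choice of this sketch; a pipeline could equally require the call-rate filter to pass before estimating allele frequencies.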
|
4
|
Author Correction: An integrative atlas of chicken long non-coding genes and their annotations across 25 tissues. Sci Rep 2021; 11:9463. [PMID: 33911173 PMCID: PMC8080728 DOI: 10.1038/s41598-021-89158-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022] Open
|
5
|
Chicken adaptive response to low energy diet: main role of the hypothalamic lipid metabolism revealed by a phenotypic and multi-tissue transcriptomic approach. BMC Genomics 2019; 20:1033. [PMID: 31888468 PMCID: PMC6937963 DOI: 10.1186/s12864-019-6384-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 05/13/2019] [Accepted: 12/11/2019] [Indexed: 02/07/2023] Open
Abstract
Background: Production conditions of layer chickens can vary in terms of temperature or diet energy content compared to the controlled environment where pure-bred selection is undertaken. The aim of this study was to better understand the long-term effects of a 15% energy-depleted diet on egg production, energy homeostasis and metabolism via a multi-tissue transcriptomic analysis. The study was designed to compare the effects of the nutritional intervention in two layer chicken lines divergently selected for residual feed intake. Results: Chickens adapted to the diet in terms of production by significantly increasing their feed intake and decreasing their body weight and body fat composition, while their egg production was unchanged. No significant interaction was observed between diet and line for the production traits. The low-energy diet had no effect on adipose tissue and liver transcriptomes. By contrast, the nutritional challenge affected the blood transcriptome and, more severely, the hypothalamus transcriptome, which displayed 2700 differentially expressed genes. In this tissue, the low-energy diet led to an over-expression of genes related to endocannabinoid signaling (CN1R, NAPE-PLD) and to the complement system, a part of the immune system, both known to regulate feed intake. Both mechanisms are associated with genes related to the synthesis of polyunsaturated fatty acids (FADS1, ELOVL5 and FADS2) such as arachidonic acid, a precursor of anandamide, a key endocannabinoid, and of prostaglandins, which mediate the regulatory effects of the complement system. A possible regulatory role of NR1H3 (alias LXRα) has been associated with these transcriptional changes. The low-energy diet further affected brain plasticity-related genes involved in cholesterol synthesis and synaptic activity, revealing a link between nutrition and brain plasticity. 
It upregulated genes related to protein synthesis, mitochondrial oxidative phosphorylation and fatty acid oxidation in the hypothalamus, suggesting a reorganization of nutrient utilization and biological synthesis in this brain area. Conclusions: We observed a complex transcriptome modulation in the hypothalamus of chickens in response to a low-energy diet, suggesting numerous changes in synaptic plasticity, endocannabinoid regulation, neurotransmission, lipid metabolism, mitochondrial activity and protein synthesis. This global transcriptomic reprogramming could explain the adaptive behavioral response (i.e. increase of feed intake) of the animals to the low-energy content of the diet.
|
6
|
Long noncoding RNAs in lipid metabolism: literature review and conservation analysis across species. BMC Genomics 2019; 20:882. [PMID: 31752679 PMCID: PMC6868825 DOI: 10.1186/s12864-019-6093-3] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Received: 03/11/2019] [Accepted: 09/10/2019] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND: Lipids are important for the life of the cell and the organism since they are major components of membranes, serve as energy reserves and also act as signal molecules. The main organs for energy synthesis and storage are the liver and adipose tissue, both in humans and in more distant species such as chicken. Long noncoding RNAs (lncRNAs) are known to be involved in many biological processes, including lipid metabolism. RESULTS: In this context, this paper provides the most exhaustive list of lncRNAs involved in lipid metabolism, with 60 genes identified after an in-depth analysis of the bibliography, while all "review" type articles list a total of 27 genes. These 60 lncRNAs are mainly described in humans or mice, and only a few of them have a precisely described mode of action. Because these genes are still named in a non-standard way, making such a study tedious, we propose a standard name for each gene in this list according to the rules dictated by the HUGO consortium. Moreover, we found that about 10% of these lncRNAs are conserved between mammals and chicken and 2% between mammals and fishes. Finally, we demonstrated that two lncRNAs were wrongly considered as lncRNAs in the literature since they are 3' extensions of the closest coding gene. CONCLUSIONS: Such an lncRNA catalogue can contribute to the understanding of lipid metabolism regulators; it can be useful to better understand the genetic regulation of some human diseases (obesity, hepatic steatosis) or traits of economic interest in livestock species (meat quality, carcass composition). We have no doubt that this first set will be rapidly enriched in coming years.
|
7
|
Discovery of Human-Similar Gene Fusions in Canine Cancers. Cancer Res 2017; 77:5721-5727. [DOI: 10.1158/0008-5472.can-16-2691] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 10/07/2016] [Revised: 02/27/2017] [Accepted: 08/29/2017] [Indexed: 11/16/2022]
|
8
|
A Point Mutation in a lincRNA Upstream of GDNF Is Associated to a Canine Insensitivity to Pain: A Spontaneous Model for Human Sensory Neuropathies. PLoS Genet 2016; 12:e1006482. [PMID: 28033318 PMCID: PMC5198995 DOI: 10.1371/journal.pgen.1006482] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 05/20/2016] [Accepted: 11/15/2016] [Indexed: 01/06/2023] Open
Abstract
Human Hereditary Sensory Autonomic Neuropathies (HSANs) are characterized by insensitivity to pain, sometimes combined with self-mutilation. Strikingly, several sporting dog breeds are particularly affected by such neuropathies. Clinical signs appear in young puppies and consist of acral analgesia, with or without sudden intense licking, biting and severe self-mutilation of the feet, whereas proprioception, motor abilities and spinal reflexes remain intact. Through a Genome Wide Association Study (GWAS) with 24 affected and 30 unaffected sporting dogs using the Canine HD 170K SNP array (Illumina), we identified a 1.8 Mb homozygous locus on canine chromosome 4 (adj. p-val = 2.5 × 10⁻⁶). Targeted high-throughput sequencing of this locus in 4 affected and 4 unaffected dogs identified 478 variants. Only one variant perfectly segregated with the expected recessive inheritance in 300 sporting dogs of known clinical status, while it was never present in 900 unaffected dogs from 130 other breeds. This variant, located 90 kb upstream of the GDNF gene, a highly relevant neurotrophic factor candidate gene, lies in a long intergenic non-coding RNA (lincRNA), GDNF-AS. Using human comparative genomic analysis, we observed that the canine variant maps onto an enhancer element. Quantitative RT-PCR of dorsal root ganglia RNAs of affected dogs showed a significant decrease of both GDNF mRNA and GDNF-AS expression levels (respectively 60% and 80%), as compared to unaffected dogs. We thus performed gel shift assays (EMSA), which revealed that the canine variant significantly alters the binding of regulatory elements. Altogether, these results allowed the identification in dogs of GDNF as a relevant candidate for human HSAN and insensitivity to pain, but also shed light on the regulation of GDNF transcription. Finally, such results allow proposing these sporting dog breeds as natural models for clinical trials with a double benefit for human and veterinary medicine. 
In this study, we present a canine neuropathy characterized by insensitivity to pain in the feet, sometimes combined with self-mutilation, described in four sporting breeds. This particular phenotype has the clinical hallmarks of human Hereditary Sensory Autonomic Neuropathies (HSAN). As we hypothesized that a monogenic recessive disorder was shared between these breeds, we performed a Genome Wide Association Study (GWAS) to search for the genetic causes and found one homozygous chromosomal region in affected dogs. High-throughput sequencing of this region allowed the identification of a point mutation upstream of the GDNF gene and located in the last exon of a long non-coding RNA, GDNF-AS. We confirmed the perfect association of this variant with the disease using more than 900 unaffected dogs that do not carry this mutation. Functional analyses (qRT-PCR, EMSA) confirmed that the mutation alters the binding of a regulatory complex, leading to a significant decrease of both GDNF and GDNF-AS mRNA expression levels. This work in canine spontaneous forms of human neuropathies allowed the identification of a novel gene, GDNF, and its regulation mechanism, not yet described in human HSAN, opening the field of clinical trials to benefit both canine and human medicine.
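The criterion "perfectly segregated with the expected recessive inheritance" can be made concrete with a short Python sketch. It assumes a fully penetrant recessive model and a hypothetical encoding (number of variant alleles per dog plus a clinical status flag); the function name `segregates_recessively` and the toy data are illustrative and do not reproduce the authors' analysis.

```python
def segregates_recessively(dogs):
    """Check perfect segregation under a fully penetrant recessive model.

    `dogs` is an iterable of (variant_allele_count, is_affected) pairs.
    Every affected dog must be homozygous for the variant (count == 2), and
    every unaffected dog must carry at most one copy.
    """
    return all((count == 2) == is_affected for count, is_affected in dogs)


# Hypothetical candidates from a sequenced locus, each with toy genotypes.
candidate_a = [(2, True), (2, True), (1, False), (0, False)]
candidate_b = [(2, True), (1, True), (0, False)]   # one affected heterozygote
candidate_c = [(2, True), (2, False), (0, False)]  # one unaffected homozygote

for name, dogs in [("a", candidate_a), ("b", candidate_b), ("c", candidate_c)]:
    print(name, segregates_recessively(dogs))
```

Only a variant passing this test in every dog of known status would remain a candidate; a single affected heterozygote or unaffected homozygote is enough to reject it under these assumptions.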
|
9
|
Amy2B copy number variation reveals starch diet adaptations in ancient European dogs. R Soc Open Sci 2016; 3:160449. [PMID: 28018628 PMCID: PMC5180126 DOI: 10.1098/rsos.160449] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 06/24/2016] [Accepted: 10/13/2016] [Indexed: 05/26/2023]
Abstract
Extant dog and wolf DNA indicates that dog domestication was accompanied by the selection of a series of duplications on the Amy2B gene coding for pancreatic amylase. In this study, we used a palaeogenetic approach to investigate the timing and expansion of the Amy2B gene in the ancient dog populations of Western and Eastern Europe and Southwest Asia. Quantitative polymerase chain reaction was used to estimate the copy numbers of this gene for 13 ancient dog samples, dated to between 15 000 and 4000 years before present (cal. BP). This revealed an increase of Amy2B copies in ancient dogs from as early as the 7th millennium cal. BP in Southeastern Europe. We found that the gene expansion was not fixed across all dogs within this early farming context, with ancient dogs bearing between 2 and 20 diploid copies of the gene. The results also suggested that selection for the increased Amy2B copy number started 7000 years cal. BP, at the latest. This expansion reflects a local adaptation that allowed dogs to thrive on a starch-rich diet, especially within early farming societies, and suggests a biocultural coevolution of dog genes and human culture.
|
10
|
A spontaneous KRT16 mutation in a dog breed: a model for human focal non-epidermolytic palmoplantar keratoderma (FNEPPK). J Invest Dermatol 2014; 135:1187-1190. [PMID: 25521457 DOI: 10.1038/jid.2014.526] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Indexed: 01/03/2023]
|
11
|
Comparison of buccal and blood-derived canine DNA, either native or whole genome amplified, for array-based genome-wide association studies. BMC Res Notes 2011; 4:226. [PMID: 21718521 PMCID: PMC3145587 DOI: 10.1186/1756-0500-4-226] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 02/14/2011] [Accepted: 06/30/2011] [Indexed: 12/02/2022] Open
Abstract
Background: The availability of array-based genotyping platforms for single nucleotide polymorphisms (SNPs) for the canine genome has expanded the opportunities to undertake genome-wide association (GWA) studies to identify the genetic basis for Mendelian and complex traits. Whole blood as the source of high quality DNA is undisputed but often proves impractical for collection of the large numbers of samples necessary to discover the loci underlying complex traits. Further, many countries prohibit the collection of blood from dogs unless medically necessary, thereby restricting access to critical control samples from healthy dogs. Alternate sources of DNA, typically from buccal cytobrush extractions, while convenient, have been suggested to have low yield and perform poorly in GWA studies. Yet buccal cytobrushes provide a cost-effective means of collecting DNA, are readily accepted by dog owners, and represent a large resource base in many canine genetics laboratories. To increase the DNA quantities, whole genome amplification (WGA) can be performed. Thus, the present study assessed the utility of buccal-derived DNA as well as whole genome amplification in comparison to blood samples for use on the most recent iteration of the canine HD SNP array (Illumina). Findings: In both buccal and blood samples, whether whole genome amplified or not, 97% of the samples had SNP call rates in excess of 80%, indicating that the vast majority of the SNPs would be suitable to perform association studies regardless of the DNA source. Similarly, there were no significant differences in marker intensity measurements between buccal and blood samples for copy number variation (CNV) analysis. Conclusions: All DNA samples assayed, buccal or blood, native or whole genome amplified, are appropriate for use in array-based genome-wide association studies. For the subsets of dogs with both buccal and blood samples, or with whole genome amplified samples, genotype concordance was shown to average >99%. 
Thus, the two DNA sources were comparable in the generation of SNP genotypes and of intensity values to estimate structural variation, indicating the utility of buccal cytobrush samples and the reliability of whole genome amplification for genome-wide association and CNV studies.
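The genotype concordance reported in this abstract can be computed as in the following Python sketch. The array-style genotype labels (`"AA"`, `"AB"`, `"BB"`, with `None` for a no-call), the function name `genotype_concordance`, and the toy data are assumptions made for illustration; the abstract does not describe the exact procedure used.

```python
def genotype_concordance(sample_a, sample_b):
    """Fraction of SNPs with identical genotypes in two samples of one dog.

    Only SNPs called in both samples are compared; no-calls (None) are skipped.
    """
    if len(sample_a) != len(sample_b):
        raise ValueError("both samples must cover the same SNP list")
    pairs = [(a, b) for a, b in zip(sample_a, sample_b)
             if a is not None and b is not None]
    if not pairs:
        raise ValueError("no SNP was called in both samples")
    return sum(a == b for a, b in pairs) / len(pairs)


# Hypothetical genotypes for the same dog from two DNA sources.
blood = ["AA", "AB", "BB", "AA", None]
buccal = ["AA", "AB", "BB", "AB", "AA"]
print(genotype_concordance(blood, buccal))
```

Restricting the comparison to SNPs called in both samples keeps concordance separate from call rate, which the abstract reports as a distinct quality measure.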
|
12
|
Identification of a gene expression profile associated with operational tolerance among a selected group of stable kidney transplant patients. Transpl Int 2011; 24:536-47. [PMID: 21457359 DOI: 10.1111/j.1432-2277.2011.01251.x] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Indexed: 11/29/2022]
Abstract
Despite their utility, immunosuppressive treatments have numerous side effects, including infectious complications, malignancies and metabolic disorders, all of which contribute to long-term graft loss. In addition to the development of new pharmaceutical products with reduced toxicity and more comfortable modes of administration, tailoring immunosuppression according to the immune status of each patient would represent a significant breakthrough. Gene expression profiling has been shown to be a clinically relevant monitoring tool. In this paper, we have assessed the overall long-term kidney transplant outcome and attempted to identify operationally tolerant-like patients among recipients with stable clinical status at least 5 years post-transplantation. We thus measured a combination of noninvasive blood biomarkers of operational tolerance in a cohort of 144 stable patients and showed that only 3.5% exhibited a gene expression profile of operational tolerance, suggesting that such a profile can be detected under immunosuppressive therapy but that its frequency is low in kidney transplant recipients when compared with liver transplant recipients. We suggest that a rational approach to patient selection, based on a combination of clinical and biological characteristics, may help to provide a safer method for identification of patients potentially suitable for immunosuppressive drug weaning procedures.
|
13
|
mtDNA controls expression of the Death Associated Protein 3. Exp Cell Res 2006; 312:737-45. [PMID: 16413536 DOI: 10.1016/j.yexcr.2005.11.027] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 08/24/2005] [Revised: 11/13/2005] [Accepted: 11/17/2005] [Indexed: 10/25/2022]
Abstract
The Death Associated Protein 3 (DAP3), a GTP-binding constituent of the small subunit of the mitochondrial ribosome, is implicated in the TNFα and IFNγ apoptotic pathways of the cell and is involved in the maintenance of the mitochondrial network. We have investigated the mitochondrial role of DAP3 by analyzing its mRNA and protein expression in transformed and non-transformed cell lines presenting various levels of mtDNA. The 3 mtDNA-less (ρ⁰) cell lines showed a complete absence of DAP3, whereas the mRNA expression was conserved. In HepG2 cells treated with increasing doses of ddCTP, the depletion of mtDNA was accompanied by the reduced expression of DAP3. However, the expression of the corresponding mRNA was maintained, suggesting the existence of a post-transcriptional mechanism responsible for the depletion of DAP3. Compared to the parental cells, the 3 ρ⁰ cell lines displayed partial resistance to staurosporine-induced cell death. The absence of pro-apoptotic DAP3 in these mtDNA-less cells could explain their reduced apoptotic capacity. Our results suggest that the mtDNA content plays a role in cell apoptosis by mediating the expression of DAP3.
|