1
ChromoEnhancer: An Artificial-Intelligence-Based Tool to Enhance Neoplastic Karyograms as an Aid for Effective Analysis. Cells 2022; 11:cells11142244. [PMID: 35883687 PMCID: PMC9324748 DOI: 10.3390/cells11142244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2022] [Revised: 04/21/2022] [Accepted: 04/28/2022] [Indexed: 12/10/2022] Open
Abstract
Cytogenetics laboratory tests are among the most important procedures for the diagnosis of genetic diseases, especially in the area of hematological malignancies. Manual chromosomal karyotyping methods are time-consuming and labor-intensive and, hence, expensive. Therefore, to ease the process of analysis, several attempts have been made to enhance karyograms. Current chromosomal image enhancement is based on classical image processing. This approach has limitations, one of which is that the enhancement is necessarily applied uniformly to all chromosomes, whereas customized application to each chromosome would be ideal. Moreover, each chromosome needs a different level of enhancement, depending on whether a given area is from the chromosome itself or is just an artifact from staining. The analysis of poor-quality karyograms, a difficulty often faced in preparations from cancer samples, is time-consuming and might result in a missed abnormality or in difficulty reporting the exact breakpoint within the chromosome. We developed ChromoEnhancer, a novel artificial-intelligence-based method to enhance neoplastic karyogram images. The method is based on Generative Adversarial Networks (GANs) with a data-centric approach. GANs are known for the conversion of one image domain to another. We used GANs to convert poor-quality karyograms into good-quality images. Our method of karyogram enhancement led to robust routine cytogenetic analysis and, therefore, to accurate detection of cryptic chromosomal abnormalities. To evaluate ChromoEnhancer, we randomly assigned a subset of the enhanced images and their corresponding original (unenhanced) images to two independent cytogeneticists to measure the karyogram quality and the elapsed time to complete the analysis, using four rating criteria, each scaled from 1 to 5. Furthermore, we compared the images enhanced with our method to the original ones, using quantitative measures (PSNR and SSIM metrics).
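The abstract names PSNR as one of its quantitative measures. As a rough illustration of what that metric computes, the sketch below implements the standard peak signal-to-noise ratio for two equally sized grayscale images. It is not the authors' evaluation code: the function names `mse` and `psnr`, the list-of-rows image representation, and the 8-bit peak value of 255 are all assumptions made for this example, and SSIM is not covered.

```python
import math


def mse(a, b):
    """Mean squared error between two equally sized grayscale images.

    Images are given as lists of rows of pixel intensities (an assumed,
    illustrative representation).
    """
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have the same shape")
    n = sum(len(row) for row in a)
    if n == 0:
        raise ValueError("images must not be empty")
    total = sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / n


def psnr(a, b, max_value=255):
    """PSNR in decibels: 10 * log10(max_value^2 / MSE).

    max_value=255 assumes 8-bit images; identical images give infinity.
    """
    err = mse(a, b)
    if err == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / err)
```

A higher PSNR means the two images differ less pixel by pixel, which is why it is usually reported alongside a structural measure such as SSIM rather than on its own.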
2
Guk JY, Jang MJ, Kim S. Identification of novel PHD-finger genes in pepper by genomic re-annotation and comparative analyses. BMC PLANT BIOLOGY 2022; 22:206. [PMID: 35443608 PMCID: PMC9020097 DOI: 10.1186/s12870-022-03580-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/17/2021] [Accepted: 04/06/2022] [Indexed: 06/01/2023]
Abstract
BACKGROUND: The plant homeodomain (PHD)-finger gene family, which belongs to the zinc-finger genes, plays an important role in epigenetics by regulating gene expression in eukaryotes. However, inaccurate annotation of PHD-finger genes hinders further downstream comparative, evolutionary, and functional studies. RESULTS: We performed genome-wide re-annotation in Arabidopsis thaliana (Arabidopsis), Oryza sativa (rice), Capsicum annuum (pepper), Solanum tuberosum (potato), and Solanum lycopersicum (tomato) to better understand the role of PHD-finger genes in these species. Our investigation identified 875 PHD-finger genes, of which 225 (26% of total) were newly identified, including 57 (54%) novel PHD-finger genes in pepper. The PHD-finger genes of the five plant species have various integrated domains that may be responsible for the diversification of structures and functions of these genes. Evolutionary analyses suggest that PHD-finger genes were expanded recently by lineage-specific duplication, especially in pepper and potato, resulting in diverse repertoires of PHD-finger genes among the species. We validated the expression of six newly identified PHD-finger genes in pepper with qRT-PCR. Transcriptome analyses suggest potential functions of PHD-finger genes in response to various abiotic stresses in pepper. CONCLUSIONS: Our data, including the updated annotation of PHD-finger genes, provide useful information for further evolutionary and functional analyses to better understand the roles of the PHD-finger gene family in pepper.
Affiliation(s)
- Ji-Yoon Guk
- Department of Environmental Horticulture, University of Seoul, Seoul, 02504, Republic of Korea
- Min-Jeong Jang
- Department of Environmental Horticulture, University of Seoul, Seoul, 02504, Republic of Korea
- Seungill Kim
- Department of Environmental Horticulture, University of Seoul, Seoul, 02504, Republic of Korea
3
Kim S, Cheong K, Park J, Kim M, Kim J, Seo M, Chae GY, Jang MJ, Mang H, Kwon S, Kim Y, Koo N, Min CW, Kim K, Oh N, Kim K, Jeon J, Kim H, Lee Y, Sohn KH, McCann HC, Ye S, Kim ST, Park K, Lee Y, Choi D. TGFam-Finder: a novel solution for target-gene family annotation in plants. THE NEW PHYTOLOGIST 2020; 227:1568-1581. [PMID: 32392385 PMCID: PMC7496378 DOI: 10.1111/nph.16645] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 01/30/2020] [Accepted: 04/21/2020] [Indexed: 05/26/2023]
Abstract
Whole-genome annotation errors that omit essential protein-coding genes hinder further research. We developed Target Gene Family Finder (TGFam-Finder), an alternative tool for the structural annotation of protein-coding genes containing target domain(s) of interest in plant genomes. TGFam-Finder considerably reduced annotation run-time and improved accuracy compared to conventional annotation tools. Large-scale re-annotation of 50 plant genomes identified an average of 150, 166 and 86 additional far-red-impaired response 1, nucleotide-binding and leucine-rich-repeat, and cytochrome P450 genes, respectively, that were missed in previous annotations. We detected a significantly higher number of translated genes in the new annotations than in previous annotations, using mass spectrometry data from seven plant species. TGFam-Finder along with the new gene models can provide an optimized platform for comprehensive functional, comparative, and evolutionary studies in plants.
Affiliation(s)
- Seungill Kim
- Department of Plant Science, Plant Immunity Research Center, Plant Genomics and Breeding Institute, Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul, 08826, Korea
- Department of Environmental Horticulture, University of Seoul, Seoul, 02504, Korea
- Kyeongchae Cheong
- Interdisciplinary Program in Agricultural Genomics, Seoul National University, Seoul, 08826, Korea
- Jieun Park
- Department of Plant Science, Plant Immunity Research Center, Plant Genomics and Breeding Institute, Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul, 08826, Korea
- Myung‐Shin Kim
- Department of Plant Science, Plant Immunity Research Center, Plant Genomics and Breeding Institute, Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul, 08826, Korea
- Interdisciplinary Program in Agricultural Genomics, Seoul National University, Seoul, 08826, Korea
- Jihyun Kim
- Department of Plant Science, Plant Immunity Research Center, Plant Genomics and Breeding Institute, Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul, 08826, Korea
- Min‐Ki Seo
- Department of Plant Science, Plant Immunity Research Center, Plant Genomics and Breeding Institute, Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul, 08826, Korea
- Geun Young Chae
- Department of Environmental Horticulture, University of Seoul, Seoul, 02504, Korea
- Min Jeong Jang
- Department of Environmental Horticulture, University of Seoul, Seoul, 02504, Korea
- Hyunggon Mang
- Department of Plant Science, Plant Immunity Research Center, Plant Genomics and Breeding Institute, Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul, 08826, Korea
- Sun‐Ho Kwon
- Department of Pharmacology, Seoul National University College of Medicine, Seoul, 03080, Korea
- Yong‐Min Kim
- Korean Bioinformation Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, 34141, Korea
- Namjin Koo
- Korean Bioinformation Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, 34141, Korea
- Cheol Woo Min
- Department of Plant Bioscience, Life and Energy Convergence Research Institute, Pusan National University, Miryang, 627‐706, Korea
- Kwang‐Soo Kim
- Department of Biomedical Science, College of Life Science, CHA University, Seongnam, 13488, Korea
- Nuri Oh
- Department of Biomedical Science, College of Life Science, CHA University, Seongnam, 13488, Korea
- Ki‐Tae Kim
- Department of Agricultural Biotechnology, Seoul National University, Seoul, 08826, Korea
- Jongbum Jeon
- Interdisciplinary Program in Agricultural Genomics, Seoul National University, Seoul, 08826, Korea
- Hyunbin Kim
- Interdisciplinary Program in Agricultural Genomics, Seoul National University, Seoul, 08826, Korea
- Yoon‐Young Lee
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, Gyeongbuk, 37673, Korea
- Kee Hoon Sohn
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, Gyeongbuk, 37673, Korea
- School of Interdisciplinary Bioscience and Bioengineering, Pohang University of Science and Technology, Pohang, Gyeongbuk, 37673, Korea
- Honour C. McCann
- New Zealand Institute for Advanced Study, Massey University Auckland, Auckland, 0632, New Zealand
- Sang‐Kyu Ye
- Department of Pharmacology, Seoul National University College of Medicine, Seoul, 03080, Korea
- Sun Tae Kim
- Department of Plant Bioscience, Life and Energy Convergence Research Institute, Pusan National University, Miryang, 627‐706, Korea
- Kyung‐Soon Park
- Department of Biomedical Science, College of Life Science, CHA University, Seongnam, 13488, Korea
- Yong‐Hwan Lee
- Interdisciplinary Program in Agricultural Genomics, Seoul National University, Seoul, 08826, Korea
- Department of Agricultural Biotechnology, Seoul National University, Seoul, 08826, Korea
- Doil Choi
- Department of Plant Science, Plant Immunity Research Center, Plant Genomics and Breeding Institute, Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul, 08826, Korea
- Interdisciplinary Program in Agricultural Genomics, Seoul National University, Seoul, 08826, Korea
4
Emamjomeh A, Zahiri J, Asadian M, Behmanesh M, Fakheri BA, Mahdevar G. Identification, Prediction and Data Analysis of Noncoding RNAs: A Review. Med Chem 2019; 15:216-230. [PMID: 30484409 DOI: 10.2174/1573406414666181015151610] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 11/12/2017] [Revised: 06/03/2018] [Accepted: 09/30/2018] [Indexed: 12/13/2022]
Abstract
BACKGROUND: Noncoding RNAs (ncRNAs), which play an important role in various cellular processes, are important in medicine as well as in drug design strategies. Different studies have shown that ncRNAs are dysregulated in cancer cells and play an important role in human tumorigenesis. Therefore, it is important to identify and predict such molecules by experimental and computational methods, respectively. However, to avoid expensive experimental methods, computational algorithms have been developed for accurate and fast prediction of ncRNAs. OBJECTIVE: The aim of this review was to introduce the experimental and computational methods used to identify ncRNAs and predict their structure. We also briefly explain the roles of ncRNAs in cellular processes and drug design. METHOD: In this survey, we introduce ncRNAs and their roles in biological and medicinal processes. Then, some important laboratory techniques for identifying ncRNAs are examined. Finally, the state-of-the-art models and algorithms are introduced along with important tools and databases. RESULTS: The results showed that integrating experimental and computational approaches improves the identification of ncRNAs. Moreover, highly accurate databases, algorithms and tools for predicting ncRNAs were compared. CONCLUSION: ncRNA prediction is an exciting research field, but it presents various difficulties. It requires accurate and reliable algorithms and tools. The computational costs of such algorithms, including running time and memory usage, are also very important. Finally, some suggestions are presented to improve computational methods for ncRNA gene and structure prediction.
Affiliation(s)
- Abbasali Emamjomeh
- Laboratory of Computational Biotechnology and Bioinformatics (CBB), Department of Plant Breeding and Biotechnology (PBB), University of Zabol, Zabol, Iran
- Javad Zahiri
- Bioinformatics and Computational Omics Lab (BioCOOL), Department of Biophysics, Faculty of Biological Sciences, Tarbiat Modares University, Tehran, Iran
- Mehrdad Asadian
- Department of Plant Breeding and Biotechnology (PBB), Faculty of Agriculture, University of Zabol, Zabol, Iran
- Mehrdad Behmanesh
- Department of Genetics, Faculty of Biological Sciences, Tarbiat Modares University, Tehran, Iran
- Barat A Fakheri
- Department of Plant Breeding and Biotechnology (PBB), Faculty of Agriculture, University of Zabol, Zabol, Iran
- Ghasem Mahdevar
- Department of Mathematics, Faculty of Sciences, University of Isfahan, Isfahan, Iran
5
Lee NK, Li X, Wang D. A comprehensive survey on genetic algorithms for DNA motif prediction. Inf Sci (N Y) 2018. [DOI: 10.1016/j.ins.2018.07.004] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 12/25/2022]
6
The creatine kinase pathway is a metabolic vulnerability in EVI1-positive acute myeloid leukemia. Nat Med 2017; 23:301-313. [PMID: 28191887 DOI: 10.1038/nm.4283] [Citation(s) in RCA: 68] [Impact Index Per Article: 9.7] [Received: 09/15/2016] [Accepted: 01/12/2017] [Indexed: 12/16/2022]
Abstract
Expression of the MECOM (also known as EVI1) proto-oncogene is deregulated by chromosomal translocations in some cases of acute myeloid leukemia (AML) and is associated with poor clinical outcome. Here, through transcriptomic and metabolomic profiling of hematopoietic cells, we reveal that EVI1 overexpression alters cellular metabolism. A screen using pooled short hairpin RNAs (shRNAs) identified the ATP-buffering, mitochondrial creatine kinase CKMT1 as necessary for survival of EVI1-expressing cells in subjects with EVI1-positive AML. EVI1 promotes CKMT1 expression by repressing the myeloid differentiation regulator RUNX1. Suppression of arginine-creatine metabolism by CKMT1-directed shRNAs or by the small molecule cyclocreatine selectively decreased the viability, promoted the cell cycle arrest and apoptosis of human EVI1-positive cell lines, and prolonged survival in both orthotopic xenograft models and mouse models of primary AML. CKMT1 inhibition altered mitochondrial respiration and ATP production, an effect that was abrogated by phosphocreatine-mediated reactivation of the arginine-creatine pathway. Targeting CKMT1 is thus a promising therapeutic strategy for this EVI1-driven AML subtype that is highly resistant to current treatment regimens.
7
WISCOD: a statistical web-enabled tool for the identification of significant protein coding regions. BIOMED RESEARCH INTERNATIONAL 2014; 2014:282343. [PMID: 25313355 PMCID: PMC4181902 DOI: 10.1155/2014/282343] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2013] [Revised: 12/18/2013] [Accepted: 02/11/2014] [Indexed: 11/17/2022]
Abstract
Classically, gene prediction programs are based on detecting signals such as boundary sites (splice sites, starts, and stops) and coding regions in the DNA sequence in order to build potential exons and join them into a gene structure. Although nowadays it is possible to improve their performance with additional information from related species and/or cDNA databases, further improvement at any step could help to obtain better predictions. Here, we present WISCOD, a web-enabled tool for the identification of significant protein coding regions, a novel software tool that tackles the exon prediction problem in eukaryotic genomes. WISCOD has the capacity to detect real exons from large lists of potential exons, and it provides an easy-to-use global P value, called the expected probability of being a false exon (EPFE), that is useful for ranking potential exons in a probabilistic framework, without additional computational costs. The advantage of our approach is that it significantly increases the specificity and sensitivity (both between 80% and 90%) in comparison to other ab initio methods (where they are in the range of 70–75%). WISCOD is written in JAVA and R and is available to download and to run in a local mode on Linux and Windows platforms.
8
Huang YH, Wu HY, Wu KM, Liu TT, Liou RF, Tsai SF, Shiao MS, Ho LT, Tzean SS, Yang UC. Generation and analysis of the expressed sequence tags from the mycelium of Ganoderma lucidum. PLoS One 2013; 8:e61127. [PMID: 23658685 PMCID: PMC3642047 DOI: 10.1371/journal.pone.0061127] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 08/09/2011] [Accepted: 03/07/2013] [Indexed: 12/24/2022] Open
Abstract
Ganoderma lucidum (G. lucidum) is a medicinal mushroom renowned in East Asia for its potential biological effects. To enable a systematic exploration of the genes associated with the various phenotypes of the fungus, the genome consortium of G. lucidum has carried out an expressed sequence tag (EST) sequencing project. Using a Sanger sequencing based approach, 47,285 ESTs were obtained from in vitro cultures of G. lucidum mycelium of various durations. These ESTs were further clustered and merged into 7,774 non-redundant expressed loci. The features of these expressed contigs were explored in terms of over-representation, alternative splicing, and natural antisense transcripts. Our results provide an invaluable information resource for exploring the G. lucidum transcriptome and its regulation. Many cases of the genes over-represented in fast-growing dikaryotic mycelium are closely related to growth, such as cell wall and bioactive compound synthesis. In addition, the EST-genome alignments containing putative cassette exons and retained introns were manually curated and then used to make inferences about the predominating splice-site recognition mechanism of G. lucidum. Moreover, a number of putative antisense transcripts have been pinpointed, from which we noticed that two cases are likely to reveal hitherto undiscovered biological pathways. To allow users to access the data and the initial analysis of the results of this project, a dedicated web site has been created at http://csb2.ym.edu.tw/est/.
Affiliation(s)
- Yen-Hua Huang
- Department of Biochemistry, Faculty of Medicine, School of Medicine, National Yang-Ming University, Taipei City, Taiwan, R.O.C.
- Center for Systems and Synthetic Biology, National Yang-Ming University, Taipei City, Taiwan, R.O.C.
- Hung-Yi Wu
- Department of Plant Pathology and Microbiology, National Taiwan University, Taipei City, Taiwan, R.O.C.
- Keh-Ming Wu
- VYM Genome Research Center, National Yang-Ming University, Taipei City, Taiwan, R.O.C.
- Tze-Tze Liu
- VYM Genome Research Center, National Yang-Ming University, Taipei City, Taiwan, R.O.C.
- Ruey-Fen Liou
- Department of Plant Pathology and Microbiology, National Taiwan University, Taipei City, Taiwan, R.O.C.
- Shih-Feng Tsai
- VYM Genome Research Center, National Yang-Ming University, Taipei City, Taiwan, R.O.C.
- Ming-Shi Shiao
- Medical Research and Education Department, Taipei Veterans General Hospital, Taipei City, Taiwan, R.O.C.
- Low-Tone Ho
- Medical Research and Education Department, Taipei Veterans General Hospital, Taipei City, Taiwan, R.O.C.
- Shean-Shong Tzean
- Department of Plant Pathology and Microbiology, National Taiwan University, Taipei City, Taiwan, R.O.C.
- Ueng-Cheng Yang
- Institute of Biomedical Informatics, College of Life Science, National Yang-Ming University, Taipei City, Taiwan, R.O.C.
- Center for Systems and Synthetic Biology, National Yang-Ming University, Taipei City, Taiwan, R.O.C.
9
Blais EM, Chavali AK, Papin JA. Linking genome-scale metabolic modeling and genome annotation. Methods Mol Biol 2013; 985:61-83. [PMID: 23417799 DOI: 10.1007/978-1-62703-299-5_4] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 12/13/2022]
Abstract
Genome-scale metabolic network reconstructions, assembled from annotated genomes, serve as a platform for integrating data from heterogeneous sources and generating hypotheses for further experimental validation. Implementing constraint-based modeling techniques such as flux balance analysis (FBA) on network reconstructions allows for interrogating metabolism at a systems level, which aids in identifying and rectifying gaps in knowledge. With genome sequences for various organisms from prokaryotes to eukaryotes becoming increasingly available, a significant bottleneck lies in the structural and functional annotation of these sequences. Using topologically based and biologically inspired metabolic network refinement, we can better characterize enzymatic functions present in an organism and link annotation of these functions to candidate transcripts; both steps can be experimentally validated.
Affiliation(s)
- Edik M Blais
- Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, USA
10
Wang D, Tapan S. MISCORE: a new scoring function for characterizing DNA regulatory motifs in promoter sequences. BMC SYSTEMS BIOLOGY 2012; 6 Suppl 2:S4. [PMID: 23282090 PMCID: PMC3521183 DOI: 10.1186/1752-0509-6-s2-s4] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 12/31/2022]
Abstract
Background: Computational approaches for finding DNA regulatory motifs in promoter sequences are useful to biologists in terms of reducing experimental costs and speeding up the discovery of de novo binding sites. It is important for rule-based or clustering-based motif searching schemes to effectively and efficiently evaluate the similarity between a k-mer (a k-length subsequence) and a motif model, without assuming the independence of nucleotides in motif models and without employing computationally expensive Markov chain models to estimate the background probabilities of k-mers. Also, it is interesting and beneficial to use a priori knowledge in developing advanced searching tools. Results: This paper presents a new scoring function, termed MISCORE, for functional motif characterization and evaluation. Our MISCORE is free from: (i) any assumption on model dependency; and (ii) the use of a Markov chain model for background modeling. It integrates the compositional complexity of motif instances into the function. Performance evaluations with comparison to the well-known Maximum a Posteriori (MAP) score and Information Content (IC) have shown that MISCORE has promising capabilities to separate and recognize functional DNA motifs and their instances from non-functional ones. Conclusions: MISCORE is a fast computational tool for candidate motif characterization, evaluation and selection. It allows a priori known motif models to be embedded for computing motif-to-motif similarity, which is an advantage over the IC and MAP scores. In addition to these merits, MISCORE can automatically filter out some repetitive k-mers from a motif model due to the introduction of the compositional complexity in the function. Consequently, the merits of our proposed MISCORE in terms of both motif signal modeling power and computational efficiency will make it more applicable in the development of computational motif discovery tools.
Affiliation(s)
- Dianhui Wang
- Department of Computer Science and Computer Engineering, La Trobe University, Melbourne, Victoria 3086, Australia.
11
Mittendorf KF, Deatherage CL, Ohi MD, Sanders CR. Tailoring of membrane proteins by alternative splicing of pre-mRNA. Biochemistry 2012; 51:5541-56. [PMID: 22708632 DOI: 10.1021/bi3007065] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Indexed: 11/29/2022]
Abstract
Alternative splicing (AS) of RNA is a key mechanism for diversification of the eukaryotic proteome. In this process, different mRNA transcripts can be produced through altered excision and/or inclusion of exons during processing of the pre-mRNA molecule. Since its discovery, AS has been shown to play roles in protein structure, function, and localization. Dysregulation of this process can result in disease phenotypes. Moreover, AS pathways are promising therapeutic targets for a number of diseases. Integral membrane proteins (MPs) represent a class of proteins that may be particularly amenable to regulation by alternative splicing because of the distinctive topological restraints associated with their folding, structure, trafficking, and function. Here, we review the impact of AS on MP form and function and the roles of AS in MP-related disorders such as Alzheimer's disease.
Affiliation(s)
- Kathleen F Mittendorf
- Department of Biochemistry, Vanderbilt University School of Medicine, Nashville, TN 37232, USA
12
Screening for PPAR Responsive Regulatory Modules in Cancer. PPAR Res 2008; 2008:749073. [PMID: 18551184 PMCID: PMC2422871 DOI: 10.1155/2008/749073] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 03/25/2008] [Accepted: 05/05/2008] [Indexed: 12/31/2022] Open
Abstract
Peroxisome proliferator-activated receptors (PPARs) have, via their large set of target genes, a critical impact on numerous diseases including cancer. Cancer development involves numerous regulatory cascades that drive the progression of the malignancy of the cells. On a genomic level, these pathways converge on regulatory modules, some of which contain colocalizing PPAR binding sites (PPREs). We developed an in silico screening method that incorporates experiment- and informatics-derived evidence for a more reliable prediction of PPREs and PPAR target genes. This method is based on DNA-binding data of PPAR subtypes to a panel of DR1-type PPREs and on tracking the enrichment of binding sites from multiple species. The ability of PPARγ to induce cellular differentiation and the existence of FDA-approved PPARγ agonists encourage the exploration of possibilities to activate or inactivate PPRE-containing modules to arrest cancer progression. Recent advances in genomic techniques combined with computational analysis of binding modules are discussed in this review, with the example of our recent screen for PPREs on human chromosome 19.
13
Misawa K, Kikuno RF. GeneWaltz--A new method for reducing the false positives of gene finding. BioData Min 2010; 3:6. [PMID: 20875138 PMCID: PMC2955682 DOI: 10.1186/1756-0381-3-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 02/28/2010] [Accepted: 09/28/2010] [Indexed: 11/28/2022] Open
Abstract
Background: Identifying protein-coding regions in genomic sequences is an essential step in genome analysis. It is well known that the proportion of false positives among genes predicted by current methods is high, especially when the exons are short. These false positives are problematic because they waste the time and resources of experimental studies. Methods: We developed GeneWaltz, a new filtering method that reduces the risk of false positives in gene finding. GeneWaltz utilizes a codon-to-codon substitution matrix that was constructed by comparing protein-coding regions from orthologous gene pairs between mouse and human genomes. Using this matrix, a scoring scheme was developed; it assigned higher scores to coding regions and lower scores to non-coding regions. The regions with high scores were considered candidate coding regions. One-dimensional Karlin-Altschul statistics were used to test the significance of the coding regions identified by GeneWaltz. Results: The proportion of false positives among genes predicted by GENSCAN and Twinscan was high, especially when the exons were short. GeneWaltz significantly reduced the ratio of false positives to all positives predicted by GENSCAN and Twinscan, especially when the exons were short. Conclusions: GeneWaltz will be helpful in experimental genomic studies. GeneWaltz binaries and the matrix are available online at http://en.sourceforge.jp/projects/genewaltz/.
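The Methods portion describes scoring positions so that coding regions score high and then treating high-scoring regions as candidates. One common way to pick out such a region is a maximal-scoring-segment search, sketched below. This is only an illustration of that general idea, not GeneWaltz's actual procedure: the function name `best_segment` and the input scores are hypothetical, and the codon substitution matrix and the Karlin-Altschul significance test are not modelled here.

```python
def best_segment(scores):
    """Return (best_sum, start, end) for the highest-scoring contiguous run.

    `scores` is a non-empty list of per-position scores (hypothetical values:
    positive where a position looks coding-like, negative otherwise).
    `end` is exclusive, so the segment is scores[start:end].
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    best_sum, best_start, best_end = scores[0], 0, 1
    run_sum, run_start = 0, 0
    for i, s in enumerate(scores):
        if run_sum <= 0:
            # A non-positive running total cannot help; restart the run here.
            run_sum, run_start = s, i
        else:
            run_sum += s
        if run_sum > best_sum:
            best_sum, best_start, best_end = run_sum, run_start, i + 1
    return best_sum, best_start, best_end
```

For example, `best_segment([-1, 2, 3, -4, 5, -2])` selects positions 1 through 4 with a total score of 6; a significance test would then decide whether a segment with that score is likely to arise by chance.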
Affiliation(s)
- Kazuharu Misawa
- Research Program for Computational Science, Research and Development Group for Next-Generation Integrated Living Matter Simulation, Fusion of Data and Analysis Research and Development Team, RIKEN, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639, Japan.
14
Abstract
Many people expected the question 'How many genes in the human genome?' to be resolved with the publication of the genome sequence in 2001, but estimates continue to fluctuate.
Affiliation(s)
- Mihaela Pertea
- Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD 20742, USA
- Steven L Salzberg
- Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD 20742, USA
15
Ellington R, Wachira J, Nkwanta A. RNA secondary structure prediction by using discrete mathematics: an interdisciplinary research experience for undergraduate students. CBE LIFE SCIENCES EDUCATION 2010; 9:348-356. [PMID: 20810968 PMCID: PMC2931683 DOI: 10.1187/cbe.10-03-0036] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/16/2010] [Revised: 06/18/2010] [Accepted: 06/23/2010] [Indexed: 05/29/2023]
Abstract
The focus of this Research Experience for Undergraduates (REU) project was on RNA secondary structure prediction by using a lattice walk approach. The lattice walk approach is a combinatorial and computational biology method used to enumerate possible secondary structures and predict RNA secondary structure from RNA sequences. The method uses discrete mathematical techniques and identifies specified base pairs as parameters. The goal of the REU was to introduce upper-level undergraduate students to the principles and challenges of interdisciplinary research in molecular biology and discrete mathematics. At the beginning of the project, students from the biology and mathematics departments of a mid-sized university received instruction on the role of secondary structure in the function of eukaryotic RNAs and RNA viruses, RNA related to combinatorics, and the National Center for Biotechnology Information resources. The student research projects focused on RNA secondary structure prediction on a regulatory region of the yellow fever virus RNA genome and on an untranslated region of an mRNA of a gene associated with the neurological disorder epilepsy. At the end of the project, the REU students gave poster and oral presentations, and they submitted written final project reports to the program director. The outcome of the REU was that the students gained transferable knowledge and skills in bioinformatics and an awareness of the applications of discrete mathematics to biological research problems.
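The abstract describes enumerating possible secondary structures from an RNA sequence. To make that counting problem concrete, the sketch below uses a standard dynamic-programming recurrence over nested (non-crossing) base pairs. It is not the lattice walk approach used in the project; the function name `count_structures`, the allowed pairs (Watson-Crick plus G-U wobble), and the minimum hairpin loop of three unpaired bases are assumptions chosen for illustration.

```python
from functools import lru_cache

# Assumed pairing rules: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def count_structures(seq, min_loop=3):
    """Count nested secondary structures of an RNA sequence.

    The unpaired (empty) structure is included in the count. `min_loop` is
    the assumed minimum number of unpaired bases enclosed by a hairpin.
    """
    seq = seq.upper()
    n = len(seq)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Structures on seq[i..j] inclusive; too short to hold a pair -> 1.
        if j - i < min_loop + 1:
            return 1
        total = count(i + 1, j)  # case: base i stays unpaired
        for k in range(i + min_loop + 1, j + 1):
            if (seq[i], seq[k]) in PAIRS:
                # case: i pairs with k, splitting the problem in two
                total += count(i + 1, k - 1) * count(k + 1, j)
        return total

    return count(0, n - 1) if n else 1
```

For the short sequence `GGAAACC` this counts six structures: the unpaired one, four with a single G-C pair, and one with two stacked G-C pairs, which shows how quickly the search space grows even before any energy model is applied.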
Affiliation(s)
- Roni Ellington
- Departments of Advanced Studies, Leadership, and Policy
|
16
|
Feschotte C, Keswani U, Ranganathan N, Guibotsy ML, Levine D. Exploring repetitive DNA landscapes using REPCLASS, a tool that automates the classification of transposable elements in eukaryotic genomes. Genome Biol Evol 2009; 1:205-20. [PMID: 20333191 PMCID: PMC2817418 DOI: 10.1093/gbe/evp023] [Citation(s) in RCA: 78] [Impact Index Per Article: 5.2] [Accepted: 07/21/2009] [Indexed: 12/24/2022] Open
Abstract
Eukaryotic genomes contain large amounts of repetitive DNA, most of which is derived from transposable elements (TEs). Progress has been made in developing computational tools for ab initio identification of repeat families, but there is an urgent need to develop tools to automate the annotation of TEs in genome sequences. Here we introduce REPCLASS, a tool that automates the classification of TE sequences. Using control repeat libraries, we show that the program can accurately classify virtually any known TE type. Combining REPCLASS with ab initio repeat finding in the genomes of Caenorhabditis elegans and Drosophila melanogaster allowed us to recover the contrasting TE landscapes characteristic of these species. Unexpectedly, REPCLASS also uncovered several novel TE families in both genomes, augmenting the TE repertoire of these model species. When applied to the genomes of distant Caenorhabditis and Drosophila species, the approach revealed a remarkable conservation of TE composition profile within each genus, despite substantial interspecific variations in genome size and in the number of TEs and TE families. Lastly, we applied REPCLASS to analyze 10 fungal genomes from a wide taxonomic range, most of which have not been analyzed for TE content previously. The results showed that TE diversity varies widely across the fungi “kingdom” and appears to positively correlate with genome size, in particular for DNA transposons. Together, these data validate REPCLASS as a powerful tool to explore the repetitive DNA landscapes of eukaryotes and to shed light onto the evolutionary forces shaping TE diversity and genome architecture.
|
17
|
Deplancke B. Experimental advances in the characterization of metazoan gene regulatory networks. Briefings in Functional Genomics and Proteomics 2009; 8:12-27. [PMID: 19324929 DOI: 10.1093/bfgp/elp001] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Indexed: 12/26/2022]
Abstract
Gene regulatory networks (GRNs) play a vital role in metazoan development and function, and deregulation of these networks is often implicated in disease. GRNs depict the dynamic interactions between genomic and regulatory state components. The genomic components comprise genes and their associated cis-regulatory elements. The regulatory state components consist primarily of transcriptional complexes that bind the latter elements. With the availability of complete genome sequences, several approaches have recently been developed which promise to significantly enhance our ability to identify either the genomic or regulatory state components, or the interactions between these two. In this review, I provide an in-depth overview of these approaches and detail how each contributes to a more comprehensive understanding of GRN composition and function.
Affiliation(s)
- Bart Deplancke
- Ecole Polytechnique Fédérale de Lausanne, School of Life Sciences, Institute of Bioengineering, Lausanne, Switzerland.
|
18
|
Manichaikul A, Ghamsari L, Hom EFY, Lin C, Murray RR, Chang RL, Balaji S, Hao T, Shen Y, Chavali AK, Thiele I, Yang X, Fan C, Mello E, Hill DE, Vidal M, Salehi-Ashtiani K, Papin JA. Metabolic network analysis integrated with transcript verification for sequenced genomes. Nat Methods 2009; 6:589-92. [PMID: 19597503 DOI: 10.1038/nmeth.1348] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.9] [Received: 04/29/2009] [Accepted: 06/17/2009] [Indexed: 01/02/2023]
Abstract
With the sequencing of thousands of organisms completed or in progress, there is a growing need to integrate gene prediction with metabolic network analysis. Using Chlamydomonas reinhardtii as a model, we describe a systems-level methodology bridging metabolic network reconstruction with experimental verification of enzyme-encoding open reading frames. Our quantitative and predictive metabolic model and its associated cloned open reading frames provide useful resources for metabolic engineering.
Affiliation(s)
- Ani Manichaikul
- Department of Biomedical Engineering, University of Virginia, Charlottesville, Virginia, USA
|
19
|
Prevalence of transcription promoters within archaeal operons and coding sequences. Mol Syst Biol 2009; 5:285. [PMID: 19536208 PMCID: PMC2710873 DOI: 10.1038/msb.2009.42] [Citation(s) in RCA: 96] [Impact Index Per Article: 6.4] [Received: 11/20/2008] [Accepted: 05/13/2009] [Indexed: 01/21/2023] Open
Abstract
Despite knowledge of complex prokaryotic transcription mechanisms, generalized rules, such as the simplified organization of genes into operons with well-defined promoters and terminators, have had a significant role in systems analysis of regulatory logic in both bacteria and archaea. Here, we have investigated the prevalence of alternate regulatory mechanisms through genome-wide characterization of transcript structures of approximately 64% of all genes, including putative non-coding RNAs, in Halobacterium salinarum NRC-1. Our integrative analysis of transcriptome dynamics and protein-DNA interaction data sets showed widespread environment-dependent modulation of operon architectures, transcription initiation and termination inside coding sequences, and extensive overlap in 3' ends of transcripts for many convergently transcribed genes. A significant fraction of these alternate transcriptional events correlate with binding locations of 11 transcription factors and regulators (TFs) inside operons and annotated genes, events usually considered spurious or non-functional. Using experimental validation, we illustrate the prevalence of overlapping genomic signals in archaeal transcription, casting doubt on the general perception of rigid boundaries between coding sequences and regulatory elements.
|
20
|
Reeves GA, Talavera D, Thornton JM. Genome and proteome annotation: organization, interpretation and integration. J R Soc Interface 2009; 6:129-47. [PMID: 19019817 DOI: 10.1098/rsif.2008.0341] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.5] [Indexed: 11/12/2022] Open
Abstract
Recent years have seen a huge increase in the generation of genomic and proteomic data. This has been due to improvements in current biological methodologies, the development of new experimental techniques and the use of computers as support tools. All these raw data are useless if they cannot be properly analysed, annotated, stored and displayed. Consequently, a vast number of resources have been created to present the data to the wider community. Annotation tools and databases provide the means to disseminate these data and to comprehend their biological importance. This review examines the various aspects of annotation: type, methodology and availability. Moreover, it places special emphasis on novel annotation fields, such as that of phenotypes, and highlights recent efforts focused on integrating annotations.
Affiliation(s)
- Gabrielle A Reeves
- EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
|
21
|
Harrow J, Nagy A, Reymond A, Alioto T, Patthy L, Antonarakis SE, Guigó R. Identifying protein-coding genes in genomic sequences. Genome Biol 2009; 10:201. [PMID: 19226436 PMCID: PMC2687780 DOI: 10.1186/gb-2009-10-1-201] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.6] [Indexed: 11/26/2022] Open
Abstract
The vast majority of the biology of a newly sequenced genome is inferred from the set of encoded proteins. Predicting this set is therefore invariably the first step after the completion of the genome DNA sequence. Here we review the main computational pipelines used to generate the human reference protein-coding gene sets.
Affiliation(s)
- Jennifer Harrow
- Wellcome Trust Sanger Institute, Wellcome Trust Campus, Hinxton, Cambridge, UK
|
22
|
Kavanaugh LA, Dietrich FS. Non-coding RNA prediction and verification in Saccharomyces cerevisiae. PLoS Genet 2009; 5:e1000321. [PMID: 19119416 PMCID: PMC2603021 DOI: 10.1371/journal.pgen.1000321] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.9] [Received: 06/25/2008] [Accepted: 12/01/2008] [Indexed: 11/18/2022] Open
Abstract
Non-coding RNAs (ncRNAs) play an important and varied role in cellular function. A significant amount of research has been devoted to computational prediction of these genes from genomic sequence, but the ability to do so has remained elusive due to a lack of apparent genomic features. In this work, thermodynamic stability of ncRNA structural elements, as summarized in a Z-score, is used to predict ncRNA in the yeast Saccharomyces cerevisiae. This analysis was coupled with comparative genomics to search for ncRNA genes on chromosome six of S. cerevisiae and S. bayanus. Sets of positive and negative control genes were evaluated to determine the efficacy of thermodynamic stability for discriminating ncRNA from background sequence. The effect of window sizes and step sizes on the sensitivity of ncRNA identification was also explored. Non-coding RNA gene candidates, common to both S. cerevisiae and S. bayanus, were verified using northern blot analysis, rapid amplification of cDNA ends (RACE), and publicly available cDNA library data. Four ncRNA transcripts are well supported by experimental data (RUF10, RUF11, RUF12, RUF13), while one additional putative ncRNA transcript is well supported but the data are not entirely conclusive. Six candidates appear to be structural elements in 5′ or 3′ untranslated regions of annotated protein-coding genes. This work shows that thermodynamic stability, coupled with comparative genomics, can be used to predict ncRNA with significant structural elements.
Recent advances in DNA sequencing technology have made it possible to sequence entire genomes. Once a genome is sequenced, it becomes necessary to identify the set of genes and other functional elements within the genome. This is particularly challenging as much of the genomic sequence does not appear to perform any function and is loosely referred to as “junk.” Identifying functional elements among the “junk” is difficult. Experimental methods have been developed for this purpose but they are time-consuming, expensive, and often provide an incomplete picture. Thus, it is important to develop the ability to identify these functional elements using computational methods. Protein-coding genes are relatively easy to identify computationally, but other categories of functional elements present a significantly greater challenge. In this work, we used a computational approach to identify genes that do not encode a protein but rather function as an RNA molecule. We then used experimental methods to verify our predictions and thereby validate the computational method.
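The abstract mentions a Z-score for thermodynamic stability and a scan parameterized by window size and step size, without giving the details. A minimal sketch of that general idea follows, under loud assumptions: `toy_energy` is a hypothetical placeholder that merely counts complementary bases when a window is folded in half (a real analysis would use a folding-energy program), the shuffling is a plain mononucleotide shuffle, and the names `z_score` and `scan` are illustrative, not taken from the paper.

```python
import random
import statistics

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def toy_energy(seq):
    """Hypothetical stand-in for a folding energy (NOT real thermodynamics).

    Folds the sequence in half and returns minus the number of
    complementary positions, so lower values mean 'more stable'.
    """
    n = len(seq)
    return -sum(1 for i in range(n // 2) if COMPLEMENT.get(seq[i]) == seq[n - 1 - i])


def z_score(seq, energy=toy_energy, n_shuffles=200, rng=None):
    """Z-score of the native energy against shuffled versions of seq."""
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    native = energy(seq)
    shuffled = []
    for _ in range(n_shuffles):
        chars = list(seq)
        rng.shuffle(chars)  # mononucleotide shuffle (an assumption)
        shuffled.append(energy("".join(chars)))
    sd = statistics.pstdev(shuffled)
    if sd == 0:
        return 0.0  # no spread among shuffles: the score is uninformative
    return (native - statistics.mean(shuffled)) / sd


def scan(seq, window=20, step=5, **kwargs):
    """Slide a window along seq and return (start, z_score) pairs."""
    return [
        (start, z_score(seq[start:start + window], **kwargs))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    hairpin_like = "GGGGGAAAAACCCCC"
    print(z_score(hairpin_like))  # negative: more 'stable' than its shuffles
    print(scan(hairpin_like * 2, window=20, step=5))
```

A strongly negative Z-score marks a window whose native order is more stable than shuffled sequences of the same composition; smaller steps give finer resolution at the cost of more windows to evaluate.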
Affiliation(s)
- Laura A. Kavanaugh
- Department of Molecular Genetics and Microbiology, Institute for Genome Sciences and Policy, Duke University Medical Center, Durham, North Carolina, United States of America
- Fred S. Dietrich
- Department of Molecular Genetics and Microbiology, Institute for Genome Sciences and Policy, Duke University Medical Center, Durham, North Carolina, United States of America
|
23
|
|
24
|
Lesne A, Benecke A. Probability landscapes for integrative genomics. Theor Biol Med Model 2008; 5:9. [PMID: 18492240 PMCID: PMC2409305 DOI: 10.1186/1742-4682-5-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 02/28/2008] [Accepted: 05/20/2008] [Indexed: 11/16/2022] Open
Abstract
Background: The comprehension of the gene regulatory code in eukaryotes is one of the major challenges of systems biology, and is a requirement for the development of novel therapeutic strategies for multifactorial diseases. Its bi-fold degeneration precludes brute force and statistical approaches based on the genomic sequence alone. Rather, recursive integration of systematic, whole-genome experimental data with advanced statistical regulatory sequence predictions needs to be developed. Such experimental approaches as well as the prediction tools are only starting to become available and increasing numbers of genome sequences and empirical sequence annotations are under continual discovery-driven change. Furthermore, given the complexity of the question, a decade(s) long multi-laboratory effort needs to be envisioned. These constraints need to be considered in the creation of a framework that can pave a road to successful comprehension of the gene regulatory code.
Results: We introduce here a concept for such a framework, based entirely on systematic annotation in terms of probability profiles of genomic sequence using any type of relevant experimental and theoretical information and subsequent cross-correlation analysis in hypothesis-driven model building and testing.
Conclusion: Probability landscapes, which include as reference set the probabilistic representation of the genomic sequence, can be used efficiently to discover and analyze correlations amongst initially heterogeneous and un-relatable descriptions and genome-wide measurements. Furthermore, this structure is usable as a support for automatically generating and testing hypotheses for alternative gene regulatory grammars and the evaluation of those through statistical analysis of the high-dimensional correlations between genomic sequence, sequence annotations, and experimental data. Finally, this structure provides a concrete and tangible basis for attempting to formulate a mathematical description of gene regulation in eukaryotes on a genome-wide scale.
Affiliation(s)
- Annick Lesne
- Institut des Hautes Etudes Scientifiques, Bures sur Yvette, France.
|
25
|
Levasseur A, Pontarotti P, Poch O, Thompson JD. Strategies for reliable exploitation of evolutionary concepts in high throughput biology. Evol Bioinform Online 2008; 4:121-37. [PMID: 19204813 PMCID: PMC2614184 DOI: 10.4137/ebo.s597] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 11/27/2022] Open
Abstract
The recent availability of the complete genome sequences of a large number of model organisms, together with the immense amount of data being produced by the new high-throughput technologies, means that we can now begin comparative analyses to understand the mechanisms involved in the evolution of the genome and their consequences in the study of biological systems. Phylogenetic approaches provide a unique conceptual framework for performing comparative analyses of all this data, for propagating information between different systems and for predicting or inferring new knowledge. As a result, phylogeny-based inference systems are now playing an increasingly important role in most areas of high throughput genomics, including studies of promoters (phylogenetic footprinting), interactomes (based on the presence and degree of conservation of interacting proteins), and in comparisons of transcriptomes or proteomes (phylogenetic proximity and co-regulation/co-expression). Here we review the recent developments aimed at making automatic, reliable phylogeny-based inference feasible in large-scale projects. We also discuss how evolutionary concepts and phylogeny-based inference strategies are now being exploited in order to understand the evolution and function of biological systems. Such advances will be fundamental for the success of the emerging disciplines of systems biology and synthetic biology, and will have wide-reaching effects in applied fields such as biotechnology, medicine and pharmacology.
Affiliation(s)
- Anthony Levasseur
- Phylogenomics Laboratory, EA 3781 Evolution Biologique, Université de Provence, 13331 Marseille, France
|
26
|
Abstract
Pharmacological treatment in Alzheimer's disease (AD) accounts for 10-20% of direct costs, and fewer than 20% of AD patients are moderate responders to conventional drugs (donepezil, rivastigmine, galantamine, memantine), with doubtful cost-effectiveness. Both AD pathogenesis and drug metabolism are genetically regulated complex traits in which hundreds of genes cooperatively participate. Structural genomics studies demonstrated that more than 200 genes might be involved in AD pathogenesis regulating dysfunctional genetic networks leading to premature neuronal death. The AD population exhibits a higher genetic variation rate than the control population, with absolute and relative genetic variations of 40-60% and 0.85-1.89%, respectively. AD patients also differ in their genomic architecture from patients with other forms of dementia. Functional genomics studies in AD revealed that age of onset, brain atrophy, cerebrovascular hemodynamics, brain bioelectrical activity, cognitive decline, apoptosis, immune function, lipid metabolism dyshomeostasis, and amyloid deposition are associated with AD-related genes. Pioneering pharmacogenomics studies also demonstrated that the therapeutic response in AD is genotype-specific, with apolipoprotein E (APOE) 4/4 carriers being the worst responders to conventional treatments. About 10-20% of Caucasians are carriers of defective cytochrome P450 (CYP) 2D6 polymorphic variants that alter the metabolism and effects of AD drugs and many psychotropic agents currently administered to patients with dementia. There is a moderate accumulation of AD-related genetic risk variants in CYP2D6 poor metabolizers (PMs) and ultrarapid metabolizers (UMs), who are the worst responders to conventional drugs. The association of the APOE-4 allele with specific genetic variants of other genes (e.g., CYP2D6, angiotensin-converting enzyme [ACE]) negatively modulates the therapeutic response to multifactorial treatments affecting cognition, mood, and behavior. Pharmacogenetic and pharmacogenomic factors may account for 60-90% of the variability in drug disposition and pharmacodynamics. The incorporation of pharmacogenetic/pharmacogenomic protocols into AD research and clinical practice can foster therapeutics optimization by helping to develop cost-effective pharmaceuticals and improving drug efficacy and safety.
Affiliation(s)
- Ramón Cacabelos
- EuroEspes Biomedical Research Center, Institute for CNS Disorders, Bergondo, Coruña, Spain
|
27
|
Cogburn LA, Porter TE, Duclos MJ, Simon J, Burgess SC, Zhu JJ, Cheng HH, Dodgson JB, Burnside J. Functional genomics of the chicken--a model organism. Poult Sci 2007; 86:2059-94. [PMID: 17878436 DOI: 10.1093/ps/86.10.2059] [Citation(s) in RCA: 86] [Impact Index Per Article: 5.1] [Indexed: 11/13/2022] Open
Abstract
Since the sequencing of the genome and the development of high-throughput tools for the exploration of functional elements of the genome, the chicken has reached model organism status. Functional genomics focuses on understanding the function and regulation of genes and gene products on a global or genome-wide scale. Systems biology attempts to integrate functional information derived from multiple high-content data sets into a holistic view of all biological processes within a cell or organism. Generation of a large collection (approximately 600K) of chicken expressed sequence tags, representing most tissues and developmental stages, has enabled the construction of high-density microarrays for transcriptional profiling. Comprehensive analysis of this large expressed sequence tag collection and a set of approximately 20K full-length cDNA sequences indicates that the transcriptome of the chicken represents approximately 20,000 genes. Furthermore, comparative analyses of these sequences have facilitated functional annotation of the genome and the creation of several bioinformatic resources for the chicken. Recently, about 20 papers have been published on transcriptional profiling with DNA microarrays in chicken tissues under various conditions. Proteomics is another powerful high-throughput tool currently used for examining the dynamics of protein expression in chicken tissues and fluids. Computational analyses of the chicken genome are providing new insight into the evolution of gene families in birds and other organisms. Abundant functional genomic resources now support large-scale analyses in the chicken and will facilitate identification of transcriptional mechanisms, gene networks, and metabolic or regulatory pathways that will ultimately determine the phenotype of the bird. New technologies such as marker-assisted selection, transgenics, and RNA interference offer the opportunity to modify the phenotype of the chicken to fit defined production goals. This review focuses on functional genomics in the chicken and provides a road map for large-scale exploration of the chicken genome.
Affiliation(s)
- L A Cogburn
- Department of Animal and Food Sciences, University of Delaware, Newark 19717, USA.
|
28
|
Bechtel S, Rosenfelder H, Duda A, Schmidt CP, Ernst U, Wellenreuther R, Mehrle A, Schuster C, Bahr A, Blöcker H, Heubner D, Hoerlein A, Michel G, Wedler H, Köhrer K, Ottenwälder B, Poustka A, Wiemann S, Schupp I. The full-ORF clone resource of the German cDNA Consortium. BMC Genomics 2007; 8:399. [PMID: 17974005 PMCID: PMC2213676 DOI: 10.1186/1471-2164-8-399] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Received: 01/26/2007] [Accepted: 10/31/2007] [Indexed: 11/24/2022] Open
Abstract
Background: With the completion of the human genome sequence, the functional analysis and characterization of the encoded proteins has become the next urgent challenge in the post-genome era. The lack of comprehensive ORFeome resources has thus far hampered systematic applications by protein gain-of-function analysis. Gene and ORF coverage with full-length ORF clones thus needs to be extended. In combination with a unique and versatile cloning system, these will provide the tools for genome-wide systematic functional analyses, to achieve a deeper insight into complex biological processes.
Results: Here we describe the generation of a full-ORF clone resource of human genes applying the Gateway cloning technology (Invitrogen). A pipeline for efficient cloning and sequencing was developed and a sample tracking database was implemented to streamline the clone production process targeting more than 2,200 different ORFs. In addition, a robust cloning strategy was established, permitting the simultaneous generation of two clone variants that contain a particular ORF with as well as without a stop codon by the implementation of only one additional working step into the cloning procedure. Up to 92% of the targeted ORFs were successfully amplified by PCR and more than 93% of the amplicons successfully cloned.
Conclusion: The German cDNA Consortium ORFeome resource currently consists of more than 3,800 sequence-verified entry clones representing ORFs, cloned with and without stop codon, for about 1,700 different gene loci. 177 splice variants were cloned representing 121 of these genes. The entry clones have been used to generate over 5,000 different expression constructs, providing the basis for functional profiling applications. As a member of the recently formed international ORFeome collaboration, we substantially contribute to generating and providing a whole genome human ORFeome collection in a unique cloning system that is made freely available to the community.
Affiliation(s)
- Stephanie Bechtel
- Department of Molecular Genome Analysis, German Cancer Research Center (DKFZ), Heidelberg, Germany.
|