1
Andronis CE, Bringans S, Tan KC. Application of Proteomic Methods in Oomycete Biology. Methods Mol Biol 2025; 2892:211-231. [PMID: 39729279 DOI: 10.1007/978-1-0716-4330-3_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2024]
Abstract
The biochemical makeup of any organism provides insight into key factors regarding its biological functions. These factors can be explored using proteomics, which allows us to obtain a snapshot of the protein content and abundance in an organism, cell type or sub-cellular compartment. Here, we describe proteomic methodologies that can be used to dissect the biochemical mechanism of phytopathogenicity in oomycetes. These methodologies include protein extraction, purification, subsequent processing, mass spectrometry analysis, and qualitative and quantitative data processing of oomycete proteomes for comparative studies. Additionally, the use of mass spectra to assist in gene validation and modelling in unfinished oomycete genomes is also described.
Affiliation(s)
- Christina E Andronis
  - Proteomics International, Nedlands, WA, Australia
  - The Centre for Crop and Disease Management, Curtin University, Bentley, WA, Australia
- Kar-Chun Tan
  - The Centre for Crop and Disease Management, Curtin University, Bentley, WA, Australia

2
Yaşar P, Kars G, Yavuz K, Ayaz G, Oğuztüzün Ç, Bilgen E, Suvacı Z, Çetinkol ÖP, Can T, Muyan M. A CpG island promoter drives the CXXC5 gene expression. Sci Rep 2021; 11:15655. [PMID: 34341443 PMCID: PMC8329181 DOI: 10.1038/s41598-021-95165-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/29/2021] [Accepted: 07/16/2021] [Indexed: 02/06/2023]
Abstract
CXXC5 is a member of the zinc-finger CXXC family that binds to unmethylated CpG dinucleotides. CXXC5 modulates gene expressions resulting in diverse cellular events mediated by distinct signaling pathways. However, the mechanism responsible for CXXC5 expression remains largely unknown. We found here that of the 14 annotated CXXC5 transcripts with distinct 5' untranslated regions encoding the same protein, transcript variant 2 with the highest expression level among variants represents the main transcript in cell models. The DNA segment in and at the immediate 5'-sequences of the first exon of variant 2 contains a core promoter within which multiple transcription start sites are present. Residing in a region with high G-C nucleotide content and CpG repeats, the core promoter is unmethylated, deficient in nucleosomes, and associated with active RNA polymerase-II. These findings suggest that a CpG island promoter drives CXXC5 expression. Promoter pull-down revealed the association of various transcription factors (TFs) and transcription co-regulatory proteins, as well as proteins involved in histone/chromatin, DNA, and RNA processing with the core promoter. Of the TFs, we verified that ELF1 and MAZ contribute to CXXC5 expression. Moreover, the first exon of variant 2 may contain a G-quadruplex forming region that could modulate CXXC5 expression.
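The two sequence properties the abstract names for the core promoter, high G-C content and frequent CpG dinucleotides, can be computed directly from a DNA string. The sketch below is illustrative only and is not the authors' analysis: the function names are hypothetical, and the default thresholds (length of at least 200 bp, GC fraction above 0.5, observed/expected CpG ratio above 0.6) are the commonly cited Gardiner-Garden and Frommer criteria, assumed here rather than taken from the paper.

```python
def cpg_stats(seq: str) -> tuple[float, float]:
    """Return (GC fraction, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n
    # Observed/expected CpG: (#CpG * N) / (#C * #G); defined as 0 if C or G is absent
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(seq: str, min_len: int = 200,
                          min_gc: float = 0.5, min_oe: float = 0.6) -> bool:
    """Apply assumed length, GC-content and CpG-ratio thresholds to one window."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction > min_gc and obs_exp > min_oe


if __name__ == "__main__":
    window = "CG" * 100  # synthetic 200 bp window, not real CXXC5 sequence
    print(cpg_stats(window), looks_like_cpg_island(window))
```

In practice such a test is usually run over sliding windows along a genomic region; a single-window check is shown here to keep the example short.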
Affiliation(s)
- Pelin Yaşar
  - Department of Biological Sciences, Middle East Technical University, Ankara, 06800, Turkey
  - Epigenetics and Stem Cell Biology Laboratory, Single Cell Dynamics Group, National Institute of Environmental Health Sciences, Research Triangle Park, NC, 27709, USA
- Gizem Kars
  - Department of Biological Sciences, Middle East Technical University, Ankara, 06800, Turkey
- Kerim Yavuz
  - Department of Biological Sciences, Middle East Technical University, Ankara, 06800, Turkey
- Gamze Ayaz
  - Department of Biological Sciences, Middle East Technical University, Ankara, 06800, Turkey
  - Cancer and Stem Cell Epigenetics Section, Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA
- Çerağ Oğuztüzün
  - Department of Computer Engineering, Bilkent University, Ankara, 06800, Turkey
- Ecenaz Bilgen
  - Department of Chemistry, Middle East Technical University, Ankara, 06800, Turkey
- Zeynep Suvacı
  - Department of Chemistry, Middle East Technical University, Ankara, 06800, Turkey
- Tolga Can
  - Department of Computer Engineering, Middle East Technical University, Ankara, 06800, Turkey
- Mesut Muyan
  - Department of Biological Sciences, Middle East Technical University, Ankara, 06800, Turkey
  - Cansyl Laboratories, Middle East Technical University, Ankara, 06800, Turkey

3
Foerster H, Battey JND, Sierro N, Ivanov NV, Mueller LA. Metabolic networks of the Nicotiana genus in the spotlight: content, progress and outlook. Brief Bioinform 2021; 22:bbaa136. [PMID: 32662816 PMCID: PMC8138835 DOI: 10.1093/bib/bbaa136] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/25/2020] [Revised: 05/19/2020] [Accepted: 06/04/2020] [Indexed: 01/09/2023]
Abstract
Manually curated metabolic databases residing at the Sol Genomics Network comprise two taxon-specific databases, SolanaCyc for the Solanaceae family and NicotianaCyc for the genus Nicotiana, as well as six species-specific databases for Nicotiana tabacum TN90, N. tabacum K326, Nicotiana benthamiana, N. sylvestris, N. tomentosiformis and N. attenuata. New pathways were created through the extraction, examination and verification of related data from the literature, with the aid of external databases, guided by an expert-led curation process. Here we describe the curation progress that has been achieved in these databases since the first release (version 1.0) in 2016, the curation flow, and the curation process, using the metabolic pathway for cholesterol in plants as an example. The current content of our databases comprises 266 pathways and 36 superpathways in SolanaCyc and 143 pathways plus 21 superpathways in NicotianaCyc, manually curated and validated specifically for the Solanaceae family and Nicotiana genus, respectively. The curated data have been propagated to the respective Nicotiana-specific databases, which resulted in the enrichment and more accurate presentation of their metabolic networks. The quality and coverage in those databases have been compared with related external databases and discussed in terms of literature support and metabolic content.

4
Abstract
Every microarray experiment is based on a common format. First, a large number of nucleotide "spots" are arrayed onto a substrate, typically a glass slide, a silicon chip, or microbeads. Second, a complex population of nucleic acids (isolated from cells, selected from in vitro-synthesized libraries, or obtained from another source) is labeled, typically with fluorescent dyes. Third, the labeled nucleic acids are allowed to hybridize to their complementary spot(s) on the microarray. Fourth, the hybridized microarray is washed, allowing the amount of hybridized label to then be quantified. Analysis of the raw data generates a readout of the levels of each species of RNA in the original complex population. This introduction includes several examples of microarray applications and provides a discussion of the basic steps of most microarray experiments.

5
Visser M, Weber K, Rincon G, Merritt D. Use of RNA-seq to determine variation in canine cytochrome P450 mRNA expression between blood, liver, lung, kidney and duodenum in healthy beagles. J Vet Pharmacol Ther 2017; 40:583-590. [DOI: 10.1111/jvp.12400] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 08/09/2016] [Accepted: 02/03/2017] [Indexed: 12/22/2022]
Affiliation(s)
- M. Visser
  - Veterinary Medicine Research and Development, Metabolism & Safety; Zoetis; Kalamazoo MI USA
  - Department of Anatomy, Physiology and Pharmacology; College of Veterinary Medicine; Auburn University; Auburn AL USA
- K. Weber
  - Veterinary Medicine Research and Development, Genetics; Zoetis; Kalamazoo MI USA
- G. Rincon
  - Veterinary Medicine Research and Development, Genetics; Zoetis; Kalamazoo MI USA
- D. Merritt
  - Veterinary Medicine Research and Development, Metabolism & Safety; Zoetis; Kalamazoo MI USA

6
Leelananda SP, Kloczkowski A, Jernigan RL. Fold-specific sequence scoring improves protein sequence matching. BMC Bioinformatics 2016; 17:328. [PMID: 27578239 PMCID: PMC5006591 DOI: 10.1186/s12859-016-1198-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 03/21/2016] [Accepted: 08/24/2016] [Indexed: 11/10/2022]
Abstract
Background: Sequence matching is extremely important for applications throughout biology, particularly for discovering information such as functional and evolutionary relationships, and also for discriminating between unimportant and disease mutants. At present the functions of a large fraction of genes are unknown; improvements in sequence matching will improve gene annotations. Universal amino acid substitution matrices such as Blosum62 are used to measure sequence similarities and to identify distant homologues, regardless of the structure class. However, such single matrices do not take into account important structural information evident within the different topologies of proteins, and they treat substitutions within all protein folds identically. Others have suggested that the use of structural information can lead to significant improvements in sequence matching, but this has not yet been very effective. Here we develop novel substitution matrices that include not only general sequence information but also a topology-specific component that is unique for each CATH topology. This novel feature of using a combination of sequence and structure information for each protein topology significantly improves the sequence matching scores for the sequence pairs tested. We have used a novel multi-structure alignment method for each homology level of CATH in order to extract topological information.
Results: We obtain statistically significant improved sequence matching scores for 73% of the alpha helical test cases. On average, 61% of the test cases showed improvements in homology detection when structure information was incorporated into the substitution matrices. On average, z-scores for homology detection are improved by more than 54% for all cases, and some individual cases have z-scores more than twice those obtained using generic matrices. Our topology-specific similarity matrices also outperform other traditional similarity matrices and single-matrix-based structure methods. When the default amino acid substitution matrix in the Psi-blast algorithm is replaced by our structure-based matrices, the structure matching is significantly improved over conventional Psi-blast. It also outperforms results obtained for the corresponding HMM profiles generated for each topology.
Conclusions: We show that by incorporating topology-specific structure information in addition to sequence information into specific amino acid substitution matrices, the sequence matching scores and homology detection are significantly improved. Our topology-specific similarity matrices outperform other traditional similarity matrices and single-matrix-based structure methods, and also show improvement over conventional Psi-blast and HMM-profile-based methods in sequence matching. The results support the discriminatory ability of the new amino acid similarity matrices to distinguish between distant homologs and structurally dissimilar pairs.
Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1198-z) contains supplementary material, which is available to authorized users.
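As a rough illustration of the quantities this abstract discusses, the sketch below scores an ungapped pairwise alignment with a substitution matrix and converts a score into a z-score against a set of background scores. It is a minimal sketch under stated assumptions, not the authors' method: the function names, the toy matrix values, and the choice of background scores are all hypothetical, and the paper's topology-specific matrices are not reproduced here.

```python
from statistics import mean, pstdev


def alignment_score(a: str, b: str, matrix: dict, default: int = -1) -> int:
    """Sum substitution scores over an ungapped, equal-length alignment."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    total = 0
    for x, y in zip(a, b):
        # The matrix is treated as symmetric: look up (x, y), then (y, x)
        total += matrix.get((x, y), matrix.get((y, x), default))
    return total


def z_score(score: float, background: list) -> float:
    """Standardize a score against background scores (for example, shuffled pairs)."""
    sd = pstdev(background)
    if sd == 0:
        raise ValueError("background scores have zero variance")
    return (score - mean(background)) / sd


if __name__ == "__main__":
    # Toy matrix with made-up values; a real run would load Blosum62
    # or a topology-specific matrix instead.
    toy = {("A", "A"): 4, ("A", "G"): 0, ("G", "G"): 6}
    s = alignment_score("AG", "AG", toy)
    print(s, z_score(s, [2, 4, 6]))
```

Swapping one matrix for another changes only the `matrix` argument, which is the sense in which a fold-specific matrix can be compared against a generic one on the same sequence pairs.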
Affiliation(s)
- Sumudu P Leelananda
  - Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, 112 Office and Lab Building, Ames, IA, 50011-3020, USA
  - Laurence H. Baker Center for Bioinformatics and Biological Statistics, Iowa State University, 112 Office and Lab Building, Ames, IA, 50011-3020, USA
  - Present Address: 2120 Newman and Wolfrom Laboratory, The Ohio State University, 100 W 18th Ave, Columbus, OH, 43210, USA
  - Present Address: Battelle Center for Mathematical Medicine, The Research Institute at Nationwide Children's Hospital, Columbus, OH, 43205, USA
- Andrzej Kloczkowski
  - Present Address: Battelle Center for Mathematical Medicine, The Research Institute at Nationwide Children's Hospital, Columbus, OH, 43205, USA
  - Present Address: Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, 43205, USA
- Robert L Jernigan
  - Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, 112 Office and Lab Building, Ames, IA, 50011-3020, USA
  - Laurence H. Baker Center for Bioinformatics and Biological Statistics, Iowa State University, 112 Office and Lab Building, Ames, IA, 50011-3020, USA

7
Ascensao JA, Dolan ME, Hill DP, Blake JA. Methodology for the inference of gene function from phenotype data. BMC Bioinformatics 2014; 15:405. [PMID: 25495798 PMCID: PMC4302099 DOI: 10.1186/s12859-014-0405-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 03/10/2014] [Accepted: 12/02/2014] [Indexed: 12/14/2022]
Abstract
Background: Biomedical ontologies are increasingly instrumental in the advancement of biological research, primarily through their use to efficiently consolidate large amounts of data into structured, accessible sets. However, ontology development and usage can be hampered by the segregation of knowledge by domain that occurs due to independent development and use of the ontologies. The ability to infer data associated with one ontology from data associated with another ontology would prove useful in expanding information content and scope. We here focus on relating two ontologies: the Gene Ontology (GO), which encodes canonical gene function, and the Mammalian Phenotype Ontology (MP), which describes non-canonical phenotypes, using statistical methods to suggest GO functional annotations from existing MP phenotype annotations. This work is in contrast to previous studies that have focused on inferring gene function from phenotype primarily through lexical or semantic similarity measures.
Results: We have designed and tested a set of algorithms that represents a novel methodology to define rules for predicting gene function by examining the emergent structure and relationships between the gene functions and phenotypes rather than inspecting the terms semantically. The algorithms inspect relationships among multiple phenotype terms to deduce if there are cases where they all arise from a single gene function. We apply this methodology to data about genes in the laboratory mouse that are formally represented in the Mouse Genome Informatics (MGI) resource. From the data, 7444 rule instances were generated from five generalized rules, resulting in 4818 unique GO functional predictions for 1796 genes.
Conclusions: We show that our method is capable of inferring high-quality functional annotations from curated phenotype data. As well as creating inferred annotations, our method has the potential to allow for the elucidation of unforeseen, biologically significant associations between gene function and phenotypes that would be overlooked by a semantics-based approach. Future work will include the implementation of the described algorithms for a variety of other model organism databases, taking full advantage of the abundance of available high-quality curated data.
Electronic supplementary material: The online version of this article (doi:10.1186/s12859-014-0405-z) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Joao A Ascensao
  - The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, USA
  - Rice University, 6100 Main Street, Houston, TX, USA
- Mary E Dolan
  - The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, USA
- David P Hill
  - The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, USA
- Judith A Blake
  - The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, USA

8
Dutt M, Dhekney SA, Soriano L, Kandel R, Grosser JW. Temporal and spatial control of gene expression in horticultural crops. Horticulture Research 2014; 1:14047. [PMID: 26504550 PMCID: PMC4596326 DOI: 10.1038/hortres.2014.47] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Received: 06/25/2014] [Revised: 07/19/2014] [Accepted: 08/06/2014] [Indexed: 05/05/2023]
Abstract
Biotechnology provides plant breeders an additional tool to improve various traits desired by growers and consumers of horticultural crops. It also provides genetic solutions to major problems affecting horticultural crops and can be a means for rapid improvement of a cultivar. With the availability of a number of horticultural genome sequences, it has become relatively easier to utilize these resources to identify DNA sequences for both basic and applied research. Promoters play a key role in plant gene expression and the regulation of gene expression. In recent years, rapid progress has been made on the isolation and evaluation of plant-derived promoters and their use in horticultural crops, as more and more species become amenable to genetic transformation. Our understanding of the tools and techniques of horticultural plant biotechnology has now evolved from a discovery phase to an implementation phase. The availability of a large number of promoters derived from horticultural plants opens up the field for utilization of native sequences and improving crops using precision breeding. In this review, we look at the temporal and spatial control of gene expression in horticultural crops and the usage of a variety of promoters either isolated from horticultural crops or used in horticultural crop improvement.
Affiliation(s)
- Manjul Dutt
  - Citrus Research and Education Center, University of Florida, 700 Experiment Station Road, Lake Alfred, FL 33850, USA
- Sadanand A Dhekney
  - Department of Plant Sciences, Sheridan Research and Extension Center, University of Wyoming, Sheridan, WY 82801, USA
- Leonardo Soriano
  - Citrus Research and Education Center, University of Florida, 700 Experiment Station Road, Lake Alfred, FL 33850, USA
  - Universidade de Sao Paulo, Centro de Energia Nuclear na Agricultura, Piracicaba, Brazil
- Raju Kandel
  - Department of Plant Sciences, Sheridan Research and Extension Center, University of Wyoming, Sheridan, WY 82801, USA
- Jude W Grosser
  - Citrus Research and Education Center, University of Florida, 700 Experiment Station Road, Lake Alfred, FL 33850, USA

9
Rauch HB, Patrick TL, Klusman KM, Battistuzzi FU, Mei W, Brendel VP, Lal SK. Discovery and expression analysis of alternative splicing events conserved among plant SR proteins. Mol Biol Evol 2013; 31:605-13. [PMID: 24356560 DOI: 10.1093/molbev/mst238] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Indexed: 02/05/2023]
Abstract
The high frequency of alternative splicing among the serine/arginine-rich (SR) family of proteins in plants has been linked to important roles in gene regulation during development and in response to environmental stress. In this article, we have searched and manually annotated all the SR proteins in the genomes of maize and sorghum. The experimental validation of gene structure by reverse transcription-polymerase chain reaction (RT-PCR) analysis revealed, with few exceptions, that SR genes produced multiple isoforms of transcripts by alternative splicing. Despite sharing high structural similarity and conserved positions of the introns, the profile of alternative splicing diverged significantly between maize and sorghum for the vast majority of SR genes. These include many transcript isoforms discovered by RT-PCR and not represented in extant expressed sequence tag (EST) collection. However, we report the occurrence of various maize and sorghum SR mRNA isoforms that display evolutionary conservation of splicing events with their homologous SR genes in Arabidopsis and moss. Our data also indicate an important role of both 5' and 3' untranslated regions in the regulation of SR gene expression. These observations have potentially important implications for the processes of evolution and adaptation of plants to land.

10
Wang G. Chromosome 10q26 locus and age-related macular degeneration: a progress update. Exp Eye Res 2013; 119:1-7. [PMID: 24291204 DOI: 10.1016/j.exer.2013.11.009] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 09/11/2013] [Revised: 11/12/2013] [Accepted: 11/18/2013] [Indexed: 12/18/2022]
Abstract
Age-related macular degeneration (AMD) is the leading cause of late-onset central vision loss in developed countries. Both genetic and environmental factors contribute to the onset of AMD. Variation at a locus on chromosome 10q26 has been consistently associated with this disease and represents one of the two strongest genetic effects being identified in AMD. At least three genes are located within the bounds of the locus: pleckstrin homology domain containing family A member 1 (PLEKHA1), age-related maculopathy susceptibility 2 (ARMS2) and high-temperature requirement A serine peptidase 1 (HTRA1), all of which are associated with AMD. Due to the strong linkage disequilibrium (LD) across this region, statistical genetic analysis alone is incapable of distinguishing the effect of an individual gene in the locus. Uncertainty remains, however, in regards to which gene is responsible for the linkage and association of the locus with AMD. Investigating functional consequences of the associated variants and related genes tends to be essential to identifying the biologically responsible gene(s) underlying AMD. This review examines the recent progress and current uncertainty on the genetic and functional analyses of the 10q26 locus in AMD with a focus on ARMS2 and HTRA1. A discussion, which entails the possible multi-faceted approaches for pinpointing the gene(s) in the locus underlying the pathogenesis of AMD, is also included.
Affiliation(s)
- Gaofeng Wang
  - John P. Hussman Institute for Human Genomics, Dr. John T. Macdonald Foundation Department of Human Genetics, University of Miami Miller School of Medicine, 1501 N.W. 10th Avenue, BRB 525, M860, Miami, FL 33136, United States

11
Wang G, Scott WK, Whitehead P, Court BL, Kovach JL, Schwartz SG, Agarwal A, Dubovy S, Haines JL, Pericak-Vance MA. A novel ARMS2 splice variant is identified in human retina. Exp Eye Res 2011; 94:187-91. [PMID: 22138417 DOI: 10.1016/j.exer.2011.11.005] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 08/04/2011] [Revised: 10/29/2011] [Accepted: 11/14/2011] [Indexed: 11/18/2022]

12
Cole CG, McCann OT, Collins JE, Oliver K, Willey D, Gribble SM, Yang F, McLaren K, Rogers J, Ning Z, Beare DM, Dunham I. Finishing the finished human chromosome 22 sequence. Genome Biol 2008; 9:R78. [PMID: 18477386 PMCID: PMC2441464 DOI: 10.1186/gb-2008-9-5-r78] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 02/19/2008] [Revised: 04/10/2008] [Accepted: 05/13/2008] [Indexed: 11/25/2022]
Abstract
A combination of approaches was used to close 8 of the 11 gaps in the original sequence of human chromosome 22, and to generate a total of 1.018 Mb of new sequence.
Background: Although the human genome sequence was declared complete in 2004, the sequence was interrupted by 341 gaps, of which 308 lay in an estimated 28 Mb of euchromatin. While these gaps constitute only approximately 1% of the sequence, knowledge of the full complement of human genes and regulatory elements is incomplete without their sequences.
Results: We have used a combination of conventional chromosome walking (aided by the availability of end sequences) in fosmid and bacterial artificial chromosome (BAC) libraries, whole chromosome shotgun sequencing, comparative genome analysis and long PCR to finish 8 of the 11 gaps in the initial chromosome 22 sequence. In addition, we have patched four regions of the initial sequence where the original clones were found to be deleted, or contained a deletion allele of a known gene, with a further 126 kb of new sequence. Over 1.018 Mb of new sequence has been generated to extend into and close the gaps, and we have annotated 16 new or extended gene structures and one pseudogene.
Conclusion: Thus, we have made significant progress towards completing the sequence of the euchromatic regions of human chromosome 22 using a combination of detailed approaches. Our experience suggests that substantial work remains to close the outstanding gaps in the human genome sequence.
Affiliation(s)
- Charlotte G Cole
  - The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK

13
Maston GA, Evans SK, Green MR. Transcriptional regulatory elements in the human genome. Annu Rev Genomics Hum Genet 2008; 7:29-59. [PMID: 16719718 DOI: 10.1146/annurev.genom.7.080505.115623] [Citation(s) in RCA: 567] [Impact Index Per Article: 33.4] [Indexed: 11/09/2022]
Abstract
The faithful execution of biological processes requires a precise and carefully orchestrated set of steps that depend on the proper spatial and temporal expression of genes. Here we review the various classes of transcriptional regulatory elements (core promoters, proximal promoters, distal enhancers, silencers, insulators/boundary elements, and locus control regions) and the molecular machinery (general transcription factors, activators, and coactivators) that interacts with the regulatory elements to mediate precisely controlled patterns of gene expression. The biological importance of transcriptional regulation is highlighted by examples of how alterations in these transcriptional components can lead to disease. Finally, we discuss the methods currently used to identify transcriptional regulatory elements, and the ability of these methods to be scaled up for the purpose of annotating the entire human genome.
Affiliation(s)
- Glenn A Maston
  - Howard Hughes Medical Institute, Programs in Gene Function and Expression and Molecular Medicine, University of Massachusetts Medical School, Worcester, Massachusetts 01605, USA

14
Bechtel S, Rosenfelder H, Duda A, Schmidt CP, Ernst U, Wellenreuther R, Mehrle A, Schuster C, Bahr A, Blöcker H, Heubner D, Hoerlein A, Michel G, Wedler H, Köhrer K, Ottenwälder B, Poustka A, Wiemann S, Schupp I. The full-ORF clone resource of the German cDNA Consortium. BMC Genomics 2007; 8:399. [PMID: 17974005 PMCID: PMC2213676 DOI: 10.1186/1471-2164-8-399] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.3] [Received: 01/26/2007] [Accepted: 10/31/2007] [Indexed: 11/24/2022]
Abstract
Background: With the completion of the human genome sequence, the functional analysis and characterization of the encoded proteins has become the next urgent challenge in the post-genome era. The lack of comprehensive ORFeome resources has thus far hampered systematic applications by protein gain-of-function analysis. Gene and ORF coverage with full-length ORF clones thus needs to be extended. In combination with a unique and versatile cloning system, these will provide the tools for genome-wide systematic functional analyses, to achieve a deeper insight into complex biological processes.
Results: Here we describe the generation of a full-ORF clone resource of human genes applying the Gateway cloning technology (Invitrogen). A pipeline for efficient cloning and sequencing was developed and a sample tracking database was implemented to streamline the clone production process targeting more than 2,200 different ORFs. In addition, a robust cloning strategy was established, permitting the simultaneous generation of two clone variants that contain a particular ORF with as well as without a stop codon by the implementation of only one additional working step into the cloning procedure. Up to 92% of the targeted ORFs were successfully amplified by PCR and more than 93% of the amplicons successfully cloned.
Conclusion: The German cDNA Consortium ORFeome resource currently consists of more than 3,800 sequence-verified entry clones representing ORFs, cloned with and without stop codon, for about 1,700 different gene loci. 177 splice variants were cloned representing 121 of these genes. The entry clones have been used to generate over 5,000 different expression constructs, providing the basis for functional profiling applications. As a member of the recently formed international ORFeome collaboration, we substantially contribute to generating and providing a whole genome human ORFeome collection in a unique cloning system that is made freely available to the community.
Affiliation(s)
- Stephanie Bechtel
  - Department of Molecular Genome Analysis, German Cancer Research Center (DKFZ), Heidelberg, Germany

15
Wilkerson MD, Schlueter SD, Brendel V. yrGATE: a web-based gene-structure annotation tool for the identification and dissemination of eukaryotic genes. Genome Biol 2007; 7:R58. [PMID: 16859520 PMCID: PMC1779557 DOI: 10.1186/gb-2006-7-7-r58] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Received: 04/24/2006] [Revised: 06/08/2006] [Accepted: 07/05/2006] [Indexed: 11/10/2022]
Abstract
Your Gene structure Annotation Tool for Eukaryotes (yrGATE) provides an Annotation Tool and Community Utilities for worldwide web-based community genome and gene annotation. Annotators can evaluate gene structure evidence derived from multiple sources to create gene structure annotations. Administrators regulate the acceptance of annotations into published gene sets. yrGATE is designed to facilitate rapid and accurate annotation of emerging genomes as well as to confirm, refine, or correct currently published annotations. yrGATE is highly portable and supports different standard input and output formats. The yrGATE software and usage cases are available at http://www.plantgdb.org/prj/yrGATE.
Affiliation(s)
- Matthew D Wilkerson
- Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA 50011-3260, USA
- Shannon D Schlueter
- Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA 50011-3260, USA
- Volker Brendel
- Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA 50011-3260, USA
- Department of Statistics, Iowa State University, Ames, IA 50011-3260, USA
16
Yoon K, Kwek S. A data reduction approach for resolving the imbalanced data issue in functional genomics. Neural Comput Appl 2007. [DOI: 10.1007/s00521-007-0089-7] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.6] [Indexed: 11/28/2022]
17
Schlueter SD, Wilkerson MD, Dong Q, Brendel V. xGDB: open-source computational infrastructure for the integrated evaluation and analysis of genome features. Genome Biol 2007; 7:R111. [PMID: 17116260 PMCID: PMC1794590 DOI: 10.1186/gb-2006-7-11-r111] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Received: 07/17/2006] [Revised: 08/02/2006] [Accepted: 11/20/2006] [Indexed: 11/28/2022]
Abstract
The eXtensible Genome Data Broker (xGDB) provides a software infrastructure consisting of integrated tools for the storage, display, and analysis of genome features in their genomic context. Common features include gene structure annotations, spliced alignments, mapping of repetitive sequence, and microarray probes, but the software supports inclusion of any property that can be associated with a genomic location. The xGDB distribution and user support utilities are available online at the xGDB project website, http://xgdb.sourceforge.net/.
Affiliation(s)
- Shannon D Schlueter
- Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, Iowa 50011-3260, USA
- Department of Agronomy, Purdue University, West Lafayette, Indiana 47907, USA
- Matthew D Wilkerson
- Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, Iowa 50011-3260, USA
- Qunfeng Dong
- Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, Iowa 50011-3260, USA
- Center for Genomics and Bioinformatics, Indiana University, Bloomington, Indiana 47405-3700, USA
- Volker Brendel
- Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, Iowa 50011-3260, USA
- Department of Statistics, Iowa State University, Ames, Iowa 50011-3260, USA
18
Machado J, Abdulla P, Hanna WJB, Hilliker AJ, Coe IR. Genomic analysis of nucleoside transporters in Diptera and functional characterization of DmENT2, a Drosophila equilibrative nucleoside transporter. Physiol Genomics 2007; 28:337-47. [PMID: 17090699 DOI: 10.1152/physiolgenomics.00087.2006] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 11/22/2022]
Abstract
The recent completion of genome sequencing projects in a number of eukaryotes allows comparative analysis of orthologs, which can aid in identifying evolutionary constraints on protein structure and function. Nucleoside transporters (NTs) are present in a diverse array of organisms and previous studies have suggested that there is low protein sequence similarity but conserved structure in invertebrate and vertebrate NT orthologs. In addition, most taxa possess multiple NT isoforms but their respective roles in the physiology of the organism are not clear. To investigate the evolution of the structure and function of NTs, we have extended our previous studies by identifying NT orthologs in the Dipteran Anopheles gambiae and comparing these proteins to human and Drosophila melanogaster (Dm) NTs. In addition, we have functionally characterized DmENT2, one of three putative D. melanogaster ENTs that we have previously described. DmENT2 has broad substrate specificity, is insensitive to standard nucleoside transport inhibitors and is expressed in the digestive tract of late stage embryos based on in situ hybridization. DmENT1 and DmENT2 are expressed in most stages during development with the exception of early embryogenesis suggesting specific physiological roles for each isoform. These data represent the first complete genomic analysis of Dipteran NTs and the first report of the functional characterization of any Dipteran NT.
Affiliation(s)
- Jerry Machado
- Department of Biology, York University, Toronto, Ontario, Canada
19
20
Brzoska PM, Brown C, Cassel M, Ceccardi T, Di Francisco V, Dubman A, Evans J, Fang R, Harris M, Hoover J, Hu F, Larry C, Li P, Malicdem M, Maltchenko S, Shannon M, Perkins S, Poulter K, Webster-Laig M, Xiao C, Young S, Spier G, Guegler K, Gilbert D, Samaha RR. An efficient and high-throughput approach for experimental validation of novel human gene predictions. Genomics 2006; 87:437-45. [PMID: 16406193 DOI: 10.1016/j.ygeno.2005.11.016] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 08/02/2005] [Revised: 10/26/2005] [Accepted: 11/24/2005] [Indexed: 11/29/2022]
Abstract
A highly automated RT-PCR-based approach has been established to validate novel human gene predictions with no prior experimental evidence of mRNA splicing (ab initio predictions). Ab initio gene predictions were selected for high-throughput validation using predicted protein classification, sequence similarity to other genomes, colocalization with an MPSS tag, or microarray expression. Initial microarray prioritization followed by RT-PCR validation was the most efficient combination, resulting in approximately 35% of the ab initio predictions being validated by RT-PCR. Of the 7252 novel genes that were prioritized and processed, 796 constituted real transcripts. In addition, high-throughput RACE successfully extended the 5' and/or 3' ends of >60% of RT-PCR-validated genes. Reevaluation of these transcripts produced 574 novel transcripts using RefSeq as a reference. RT-PCR sequencing in combination with RACE on ab initio gene predictions could be used to define the transcriptome across all species.
Affiliation(s)
- Pius M Brzoska
- Applied Biosystems, 850 Lincoln Center Drive, Foster City, CA 94404, USA
21
Shafer P, Lin DM, Yona G. EST2Prot: mapping EST sequences to proteins. BMC Genomics 2006; 7:41. [PMID: 16515706 PMCID: PMC1456965 DOI: 10.1186/1471-2164-7-41] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 09/26/2005] [Accepted: 03/04/2006] [Indexed: 11/12/2022]
Abstract
Background: EST libraries are used in various biological studies, from microarray experiments to proteomic and genetic screens. These libraries usually contain many uncharacterized ESTs that are typically ignored since they cannot be mapped to known genes. Consequently, new discoveries are possibly overlooked. Results: We describe a system (EST2Prot) that uses multiple elements to map EST sequences to their corresponding protein products. EST2Prot uses UniGene clusters, substring analysis, information about protein coding regions in existing DNA sequences and protein database searches to detect protein products related to a query EST sequence. Gene Ontology terms, Swiss-Prot keywords, and protein similarity data are used to map the ESTs to functional descriptors. Conclusion: EST2Prot extends and significantly enriches the popular UniGene mapping by utilizing multiple relations between known biological entities. It produces a mapping between ESTs and proteins in real time through a simple web interface. The system is part of the Biozon database and is accessible at .
Affiliation(s)
- Paul Shafer
- Department of Computer Science, Cornell University, Ithaca, NY, USA
- David M Lin
- Department of Biomedical Sciences, Cornell University, Ithaca, NY, USA
- Golan Yona
- Department of Computer Science, Cornell University, Ithaca, NY, USA
22
Birney E, Andrews D, Caccamo M, Chen Y, Clarke L, Coates G, Cox T, Cunningham F, Curwen V, Cutts T, Down T, Durbin R, Fernandez-Suarez XM, Flicek P, Gräf S, Hammond M, Herrero J, Howe K, Iyer V, Jekosch K, Kähäri A, Kasprzyk A, Keefe D, Kokocinski F, Kulesha E, London D, Longden I, Melsopp C, Meidl P, Overduin B, Parker A, Proctor G, Prlic A, Rae M, Rios D, Redmond S, Schuster M, Sealy I, Searle S, Severin J, Slater G, Smedley D, Smith J, Stabenau A, Stalker J, Trevanion S, Ureta-Vidal A, Vogel J, White S, Woodwark C, Hubbard TJP. Ensembl 2006. Nucleic Acids Res 2006; 34:D556-61. [PMID: 16381931 PMCID: PMC1347495 DOI: 10.1093/nar/gkj133] [Citation(s) in RCA: 323] [Impact Index Per Article: 17.0] [Indexed: 11/24/2022]
Abstract
The Ensembl project provides a comprehensive and integrated source of annotation of large genome sequences. Over the last year the number of genomes available from the Ensembl site has increased from 4 to 19, with the addition of the mammalian genomes of Rhesus macaque and Opossum, the chordate genome of Ciona intestinalis and the import and integration of the yeast genome. The year has also seen extensive improvements to both data analysis and presentation, with the introduction of a redesigned website, the addition of RNA gene and regulatory annotation and substantial improvements to the integration of human genome variation data.
Affiliation(s)
- E Birney
- European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK.
23
Teber ET, Crawford E, Bolton KB, Van Dyk D, Schofield PR, Kapoor V, Church WB. Djinn Lite: a tool for customised gene transcript modelling, annotation-data enrichment and exploration. BMC Bioinformatics 2006; 7:33. [PMID: 16426464 PMCID: PMC1397871 DOI: 10.1186/1471-2105-7-33] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 08/10/2005] [Accepted: 01/23/2006] [Indexed: 11/10/2022]
Abstract
BACKGROUND: There is an ever-increasing rate of data made available on genetic variation, transcriptomes and proteomes. Similarly, a growing variety of bioinformatic programs are becoming available from many diverse sources, designed to identify a myriad of sequence patterns considered to have potential biological importance within inter-genic regions, genes, transcripts, and proteins. However, biologists require easy-to-use, uncomplicated tools to integrate this information and to visualise and print gene annotations. Integrating this information usually requires considerable informatics skills and comprehensive knowledge of the data format. Tools are needed to explore gene model variants by allowing users to create alternative transcript models using novel combinations of exons not necessarily represented in current database deposits of mRNA/cDNA sequences. RESULTS: Djinn Lite is designed to be an intuitive program for storing and visually exploring custom annotations relating to a eukaryotic gene sequence and its modelled gene products. In particular, it is helpful in developing hypotheses regarding alternative splicing of transcripts by allowing the construction of model transcripts and inspection of their resulting translations. It allows a gene and its gene products to be viewed in one synchronised graphical view, allowing one to drill down into sequence-related data. Colour highlighting of selected sequences and added annotations further supports exploration and visualisation of sequence regions and motifs known or predicted to be biologically significant. CONCLUSION: Gene annotation remains an ongoing and challenging task that will continue as gene structures, gene transcription repertoires, disease loci, protein products and their interactions become more precisely defined. Djinn Lite offers an accessible interface to help accumulate, enrich, and individualize sequence annotations relating to a gene, its transcripts and translations. The mechanism of transcript definition and creation, and subsequent navigation and exploration of features, are very intuitive and demand only a short learning curve. Ultimately, Djinn Lite can form the basis for providing valuable clues to plan new experiments, providing storage of sequences and annotations dedicated to customised projects. The application is appropriate for Windows 98-ME-2000-XP-2003 operating systems.
Affiliation(s)
- Erdahl T Teber
- School of Medical Sciences, University of New South Wales NSW 2052, Australia
- Neurobiology Division, Garvan Institute of Medical Research, Sydney NSW 2010, Australia
- Faculty of Pharmacy, University of Sydney NSW 2006, Australia
- Edward Crawford
- School of Medical Sciences, University of New South Wales NSW 2052, Australia
- Kent B Bolton
- EBM Pty Ltd, Level 6, 110 Sussex Street, Sydney, NSW 2000, Australia
- Derek Van Dyk
- NSW Ministry for Science and Medical Research, GPO Box 5341, Sydney NSW 2001, Australia
- Peter R Schofield
- Neurobiology Division, Garvan Institute of Medical Research, Sydney NSW 2010, Australia
- Prince of Wales Medical Research Institute, Sydney NSW 2031, Australia
- Vimal Kapoor
- School of Medical Sciences, University of New South Wales NSW 2052, Australia
- Department of Medicine and Pharmacology, University of Western Australia, Crawley WA 6009, Australia
- W Bret Church
- School of Medical Sciences, University of New South Wales NSW 2052, Australia
- Neurobiology Division, Garvan Institute of Medical Research, Sydney NSW 2010, Australia
- Faculty of Pharmacy, University of Sydney NSW 2006, Australia
24
Win J, Kanneganti TD, Torto-Alalibo T, Kamoun S. Computational and comparative analyses of 150 full-length cDNA sequences from the oomycete plant pathogen Phytophthora infestans. Fungal Genet Biol 2006; 43:20-33. [PMID: 16380277 DOI: 10.1016/j.fgb.2005.10.003] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.1] [Received: 03/28/2005] [Revised: 10/05/2005] [Accepted: 10/05/2005] [Indexed: 11/16/2022]
Abstract
Phytophthora infestans is a devastating phytopathogenic oomycete that causes late blight on tomato and potato. Recent genome sequencing efforts of P. infestans and other Phytophthora species are generating vast amounts of sequence data providing opportunities to unlock the complex nature of pathogenesis. However, accurate annotation of Phytophthora genomes will be a significant challenge. Most of the information about gene structure in these species was gathered from a handful of genes resulting in significant limitations for development of ab initio gene-calling programs. In this study, we collected a total of 150 bioinformatically determined near full-length cDNA (FLcDNA) sequences of P. infestans that were predicted to contain full open reading frame sequences. We performed detailed computational analyses of these FLcDNA sequences to obtain a snapshot of P. infestans gene structure, gauge the degree of sequence conservation between P. infestans genes and those of Phytophthora sojae and Phytophthora ramorum, and identify patterns of gene conservation between P. infestans and various eukaryotes, particularly fungi, for which genome-wide translated protein sequences are available. These analyses helped us to define the structural characteristics of P. infestans genes using a validated data set. We also determined the degree of sequence conservation within the genus Phytophthora and identified a set of fast evolving genes. Finally, we identified a set of genes that are shared between Phytophthora and fungal phytopathogens but absent in animal fungal pathogens. These results confirm that plant pathogenic oomycetes and fungi share virulence components, and suggest that eukaryotic microbial pathogens that share similar lifestyles also share a similar set of genes independently of their phylogenetic relatedness.
Affiliation(s)
- Joe Win
- Department of Plant Pathology, The Ohio State University, Ohio Agricultural Research and Development Center, Wooster, OH 44691, USA
25
Zhang J, Zhang L, Coombes KR. Gene sequence signatures revealed by mining the UniGene affiliation network. Bioinformatics 2005; 22:385-91. [PMID: 16339286 DOI: 10.1093/bioinformatics/bti796] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/12/2022]
Abstract
BACKGROUND: In the post-genomic era, developing tools to decode biological information from genomic sequences is important. Inspired by affiliation network theory, we investigated gene sequences of two kinds of UniGene clusters (UCs): narrowly expressed transcripts (NETs), whose expression is confined to a few tissues; and prevalently expressed transcripts (PETs) that are expressed in many tissues. RESULTS: We explored the human and the mouse UniGene databases to compare NETs and PETs from different perspectives. We found that NETs were associated with smaller cluster size, shorter sequence length, a lower likelihood of having LocusLink annotations, and lower and more sporadic levels of expression. Significantly, the dinucleotide frequencies of NETs are similar to those of intergenic sequences in the genome, and they differ from those of PETs. We used these differences in dinucleotide frequencies to develop a discriminant analysis model to distinguish PETs from intergenic sequences. CONCLUSIONS: Our results show that most NETs resemble intergenic sequences, casting doubts on the quality of such UniGene clusters. However, we also noted that a fraction of NETs resemble PETs in terms of dinucleotide frequencies and other features. Such NETs may have fewer quality problems. This work may be helpful in the studies of non-coding RNAs and in the validation of gene sequence databases.
Affiliation(s)
- Jiexin Zhang
- Department of Biostatistics and Applied Mathematics, The University of Texas M.D. Anderson Cancer Center, 1515 Holcombe Boulevard, Box 447, Houston, TX 77030-4009, USA
26
Mirus O, Schleiff E. Prediction of beta-barrel membrane proteins by searching for restricted domains. BMC Bioinformatics 2005; 6:254. [PMID: 16225682 PMCID: PMC1280923 DOI: 10.1186/1471-2105-6-254] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 06/20/2005] [Accepted: 10/14/2005] [Indexed: 11/10/2022]
Abstract
BACKGROUND: The identification of beta-barrel membrane proteins out of a genomic/proteomic background is one of the rapidly developing fields in bioinformatics. Our main goal is the prediction of such proteins in genome/proteome-wide analyses. RESULTS: For the prediction of beta-barrel membrane proteins within prokaryotic proteomes, a set of parameters was developed. We focused on two procedures, one with a low false positive rate and one with the lowest false prediction rate, to obtain a high certainty for the predicted sequences. We demonstrate that the discrimination between beta-barrel membrane proteins and other proteins is improved by analyzing a length-limited region. The developed set of parameters is applied to the proteome of E. coli and the results are compared to four other described procedures. CONCLUSION: Analyzing the beta-barrel membrane proteins revealed the presence of a defined membrane-inserted beta-barrel region. This information can now be used to refine other prediction programs as well. So far, all tested programs fail to predict outer membrane proteins in the proteome of the prokaryote E. coli with high reliability. However, the reliability of the prediction is improved significantly by a combinatory approach of several programs. The consequences and usability of the developed scores are discussed.
Affiliation(s)
- Oliver Mirus
- Botanisches Institut der Ludwig-Maximilians-Universität München, Menzinger Str. 67, 80638 München, Germany
- Enrico Schleiff
- Botanisches Institut der Ludwig-Maximilians-Universität München, Menzinger Str. 67, 80638 München, Germany
27
Gevaert K, Van Damme P, Martens L, Vandekerckhove J. Diagonal reverse-phase chromatography applications in peptide-centric proteomics: Ahead of catalogue-omics? Anal Biochem 2005; 345:18-29. [PMID: 16181830 DOI: 10.1016/j.ab.2005.01.038] [Citation(s) in RCA: 57] [Impact Index Per Article: 2.9] [Received: 11/29/2004] [Revised: 11/29/2004] [Accepted: 01/04/2005] [Indexed: 10/25/2022]
Abstract
Diagonal electrophoresis/chromatography was described 40 years ago and was used to isolate specific sets of peptides from simple peptide mixtures such as protease digests of purified proteins. Recently, we have adapted the core technology of diagonal chromatography so that the technique can be used in so-called gel-free, peptide-centric proteome studies. Here we review the different procedures we have developed over the past few years for sorting methionyl, cysteinyl, amino-terminal, and phosphorylated peptides. We illustrate the power of the technique, termed COFRADIC (combined fractional diagonal chromatography), in the case of a peptide-centric analysis of a sputum sol phase sample of a patient suffering from chronic obstructive pulmonary disease (COPD). We were able to identify an unexpectedly high number of intracellular proteins next to known biomarkers.
Affiliation(s)
- Kris Gevaert
- Department of Medical Protein Research, Flanders Interuniversity Institute for Biotechnology, Department of Biochemistry, Ghent University, A. Baertsoenkaai 3, B-9000 Ghent, Belgium.
28
Abstract
According to the most recent estimates, the number of human genes is possibly, but not certainly, between 20,000 and 25,000. To contribute strategies to reduce this uncertainty, several groups working on computational gene prediction met recently at the Wellcome Trust Sanger Institute with the goal of testing and comparing predictive methods of genome annotation.
Affiliation(s)
- Roderic Guigó
- Municipal Institute of Medical Research and Center for Genomic Regulation, University Pompeu Fabra, C/ Dr. Aiguader 80, 08003 Barcelona, Catalonia, Spain.
29
Ma L, Chen C, Liu X, Jiao Y, Su N, Li L, Wang X, Cao M, Sun N, Zhang X, Bao J, Li J, Pedersen S, Bolund L, Zhao H, Yuan L, Wong GKS, Wang J, Deng XW, Wang J. A microarray analysis of the rice transcriptome and its comparison to Arabidopsis. Genome Res 2005; 15:1274-83. [PMID: 16140994 PMCID: PMC1199542 DOI: 10.1101/gr.3657405] [Citation(s) in RCA: 108] [Impact Index Per Article: 5.4] [Received: 01/10/2005] [Accepted: 05/18/2005] [Indexed: 11/25/2022]
Abstract
Arabidopsis and rice are the only two model plants whose finished-phase genome sequences have been completed. Here we report the construction of an oligomer microarray based on the presently known and predicted gene models in the rice genome. This microarray was used to analyze the transcriptional activity of the gene models in representative rice organ types. Expression of 86% of the 41,754 known and predicted gene models was detected. A significant fraction of these expressed gene models are organized into chromosomal regions, about 100 kb in length, that exhibit a coexpression pattern. Compared with similar genome-wide surveys of the Arabidopsis transcriptome, our results indicate that similar proportions of the two genomes are expressed in their corresponding organ types. A large percentage of the rice gene models that lack significant Arabidopsis homologs are expressed. Furthermore, the expression patterns of rice and Arabidopsis best-matched homologous genes in distinct functional groups indicate dramatic differences in their degree of conservation between the two species. Thus, this initial comparative analysis reveals some basic similarities and differences between the Arabidopsis and rice transcriptomes.
Affiliation(s)
- Ligeng Ma
- Peking-Yale Joint Center of Plant Molecular Genetics and Agrobiotechnology, College of Life Sciences, Peking University, Beijing 100871
30
Hodges E, Redelius JS, Wu W, Höög C. Accelerated discovery of novel protein function in cultured human cells. Mol Cell Proteomics 2005; 4:1319-27. [PMID: 15965266 DOI: 10.1074/mcp.m500117-mcp200] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.2] [Indexed: 11/06/2022]
Abstract
Experimental approaches that enable direct investigation of human protein function are necessary for comprehensive annotation of the human proteome. We introduce a cell-based platform for rapid and unbiased functional annotation of undercharacterized human proteins. Utilizing a library of antibody biomarkers, the full-length proteins are investigated by tracking phenotypic changes caused by overexpression in human cell lines. We combine reverse transfection and immunodetection by fluorescence microscopy to facilitate this procedure at high resolution. Demonstrating the advantage of this approach, new annotations are provided for two novel proteins: 1) a membrane-bound O-acyltransferase protein (C3F) that, when overexpressed, disrupts Golgi and endosome integrity, likely due to an endoplasmic reticulum-Golgi transport block, and 2) a tumor marker (BC-2) that prompts a redistribution of a transcriptional silencing protein (BMI1) and a mitogen-activated protein kinase mediator (Rac1) to distinct nuclear regions that undergo chromatin compaction. Our strategy is an immediate application for directly addressing those proteins whose molecular function remains unknown.
Affiliation(s)
- Emily Hodges
- Center for Genomics and Bioinformatics, Karolinska Institute, SE-171 77 Stockholm, Sweden
31
Kim TH, Barrera LO, Qu C, Van Calcar S, Trinklein ND, Cooper SJ, Luna RM, Glass CK, Rosenfeld MG, Myers RM, Ren B. Direct isolation and identification of promoters in the human genome. Genome Res 2005; 15:830-9. [PMID: 15899964 PMCID: PMC1142473 DOI: 10.1101/gr.3430605] [Citation(s) in RCA: 71] [Impact Index Per Article: 3.6] [Received: 11/03/2004] [Accepted: 03/28/2005] [Indexed: 12/15/2022]
Abstract
Transcriptional regulatory elements play essential roles in gene expression during animal development and cellular response to environmental signals, but our knowledge of these regions in the human genome is limited despite the availability of the complete genome sequence. Promoters mark the start of every transcript and are an important class of regulatory elements. A large, complex protein structure known as the pre-initiation complex (PIC) is assembled on all active promoters, and the presence of these proteins distinguishes promoters from other sequences in the genome. Using components of the PIC as tags, we isolated promoters directly from human cells as protein-DNA complexes and identified the resulting DNA sequences using genomic tiling microarrays. Our experiments in four human cell lines uncovered 252 PIC-binding sites in 44 semirandomly selected human genomic regions comprising 1% (30 megabase pairs) of the human genome. Nearly 72% of the identified fragments overlap or immediately flank 5' ends of known cDNA sequences, while the remainder is found in other genomic regions that likely harbor putative promoters of unannotated transcripts. Indeed, molecular analysis of the RNA isolated from one cell line uncovered transcripts initiated from over half of the putative promoter fragments, and transient transfection assays revealed promoter activity for a significant proportion of fragments when they were fused to a luciferase reporter gene. These results demonstrate the specificity of a genome-wide analysis method for mapping transcriptional regulatory elements and also indicate that a small, yet significant number of human genes remains to be discovered.
Affiliation(s)
- Tae Hoon Kim
- Ludwig Institute for Cancer Research, University of California, San Diego, La Jolla, California 92093, USA
32
Ashurst JL, Chen CK, Gilbert JGR, Jekosch K, Keenan S, Meidl P, Searle SM, Stalker J, Storey R, Trevanion S, Wilming L, Hubbard T. The Vertebrate Genome Annotation (Vega) database. Nucleic Acids Res 2005; 33:D459-65. [PMID: 15608237 PMCID: PMC540089 DOI: 10.1093/nar/gki135] [Citation(s) in RCA: 119] [Impact Index Per Article: 6.0] [Indexed: 01/11/2023]
Abstract
The Vertebrate Genome Annotation (Vega) database (http://vega.sanger.ac.uk) has been designed to be a community resource for browsing manual annotation of finished sequences from a variety of vertebrate genomes. Its core database is based on an Ensembl-style schema, extended to incorporate curation-specific metadata. In collaboration with the genome sequencing centres, Vega attempts to present consistent high-quality annotation of the published human chromosome sequences. In addition, it is also possible to view various finished regions from other vertebrates, including mouse and zebrafish. Vega displays only manually annotated gene structures built using transcriptional evidence, which can be examined in the browser. Attempts have been made to standardize the annotation procedure across each vertebrate genome, which should aid comparative analysis of orthologues across the different finished regions.
Affiliation(s)
- J L Ashurst
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK.
33
Martin RE, Henry RI, Abbey JL, Clements JD, Kirk K. The 'permeome' of the malaria parasite: an overview of the membrane transport proteins of Plasmodium falciparum. Genome Biol 2005; 6:R26. [PMID: 15774027 PMCID: PMC1088945 DOI: 10.1186/gb-2005-6-3-r26] [Citation(s) in RCA: 129] [Impact Index Per Article: 6.5] [Received: 11/11/2004] [Revised: 12/31/2004] [Accepted: 01/28/2005] [Indexed: 11/24/2022] Open
Abstract
Bioinformatic and expression analyses attribute putative functions to transporters and channels encoded by the Plasmodium falciparum genome. The malaria parasite has substantially more membrane transport proteins than previously thought.
Background: The uptake of nutrients, expulsion of metabolic wastes and maintenance of ion homeostasis by the intraerythrocytic malaria parasite are mediated by membrane transport proteins. Proteins of this type are also implicated in the phenomenon of antimalarial drug resistance. However, the initial annotation of the genome of the human malaria parasite Plasmodium falciparum identified only a limited number of transporters, and no channels. In this study we have used a combination of bioinformatic approaches to identify and attribute putative functions to transporters and channels encoded by the malaria parasite, as well as comparing expression patterns for a subset of these.
Results: A computer program that searches a genome database on the basis of the hydropathy plots of the corresponding proteins was used to identify more than 100 transport proteins encoded by P. falciparum. These include all the transporters previously annotated as such, as well as a similar number of candidate transport proteins that had escaped detection. Detailed sequence analysis enabled the assignment of putative substrate specificities and/or transport mechanisms to all those putative transport proteins that previously lacked them. The newly identified transport proteins include candidate transporters for a range of organic and inorganic nutrients (including sugars, amino acids, nucleosides and vitamins), and several putative ion channels. The stage-dependent expression of RNAs for 34 candidate transport proteins of particular interest is compared.
Conclusion: The malaria parasite possesses substantially more membrane transport proteins than was originally thought, and the analyses presented here provide a range of novel insights into the physiology of this important human pathogen.
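The hydropathy-based search described in the Results can be illustrated with a minimal sketch. This is not the authors' program: the abstract names neither a hydropathy scale nor any thresholds, so the Kyte-Doolittle scale, the 19-residue window, the 1.6 cutoff, and the function names below are all assumptions chosen for illustration.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the abstract does not name one).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Return the mean hydropathy of every full window along the sequence."""
    # Unknown residues (e.g. X) are scored as neutral; this is an assumption.
    scores = [KYTE_DOOLITTLE.get(aa, 0.0) for aa in seq.upper()]
    if len(scores) < window:
        return []
    total = sum(scores[:window])
    profile = [total / window]
    # Slide the window one residue at a time, updating the running sum.
    for i in range(window, len(scores)):
        total += scores[i] - scores[i - window]
        profile.append(total / window)
    return profile


def count_hydrophobic_segments(seq, window=19, threshold=1.6):
    """Count runs of consecutive windows whose mean hydropathy exceeds threshold."""
    count = 0
    in_segment = False
    for value in hydropathy_profile(seq, window):
        above = value > threshold
        if above and not in_segment:
            count += 1  # a new hydrophobic stretch begins here
        in_segment = above
    return count
```

A protein with several such hydrophobic stretches would be a candidate membrane protein under these assumptions; a real screen would need further filtering and manual curation, as the abstract's "detailed sequence analysis" suggests.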
Affiliation(s)
- Rowena E Martin
- School of Biochemistry and Molecular Biology, Faculty of Science, The Australian National University, Canberra, ACT 0200, Australia
- Roselani I Henry
- School of Biochemistry and Molecular Biology, Faculty of Science, The Australian National University, Canberra, ACT 0200, Australia
- Janice L Abbey
- School of Biochemistry and Molecular Biology, Faculty of Science, The Australian National University, Canberra, ACT 0200, Australia
- John D Clements
- School of Biochemistry and Molecular Biology, Faculty of Science, The Australian National University, Canberra, ACT 0200, Australia
- Division of Neuroscience, The John Curtin School of Medical Research, The Australian National University, Canberra, ACT 0200, Australia
- Kiaran Kirk
- School of Biochemistry and Molecular Biology, Faculty of Science, The Australian National University, Canberra, ACT 0200, Australia
|
34
|
Hubbard T, Andrews D, Caccamo M, Cameron G, Chen Y, Clamp M, Clarke L, Coates G, Cox T, Cunningham F, Curwen V, Cutts T, Down T, Durbin R, Fernandez-Suarez XM, Gilbert J, Hammond M, Herrero J, Hotz H, Howe K, Iyer V, Jekosch K, Kahari A, Kasprzyk A, Keefe D, Keenan S, Kokocinsci F, London D, Longden I, McVicker G, Melsopp C, Meidl P, Potter S, Proctor G, Rae M, Rios D, Schuster M, Searle S, Severin J, Slater G, Smedley D, Smith J, Spooner W, Stabenau A, Stalker J, Storey R, Trevanion S, Ureta-Vidal A, Vogel J, White S, Woodwark C, Birney E. Ensembl 2005. Nucleic Acids Res 2005; 33:D447-53. [PMID: 15608235 PMCID: PMC540092 DOI: 10.1093/nar/gki138] [Citation(s) in RCA: 341] [Impact Index Per Article: 17.1] [Received: 10/06/2004] [Revised: 11/01/2004] [Accepted: 11/01/2004] [Indexed: 11/17/2022] Open
Abstract
The Ensembl (http://www.ensembl.org/) project provides a comprehensive and integrated source of annotation of large genome sequences. Over the last year the number of genomes available from the Ensembl site has increased by 7 to 16, with the addition of the six vertebrate genomes of chimpanzee, dog, cow, chicken, tetraodon and frog and the insect genome of honeybee. The majority have been annotated automatically using the Ensembl gene build system, showing its flexibility to reliably annotate a wide variety of genomes. With the increased number of vertebrate genomes, the comparative analysis provided to users has been greatly improved, with new website interfaces allowing annotation of different genomes to be directly compared. The Ensembl software system is being increasingly widely reused in different projects showing the benefits of a completely open approach to software development and distribution.
Affiliation(s)
- T Hubbard
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK
|
35
|
Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature 2004; 432:695-716. [PMID: 15592404 DOI: 10.1038/nature03154] [Citation(s) in RCA: 1988] [Impact Index Per Article: 94.7] [Received: 07/19/2004] [Accepted: 11/01/2004] [Indexed: 12/28/2022]
Abstract
We present here a draft genome sequence of the red jungle fowl, Gallus gallus. Because the chicken is a modern descendant of the dinosaurs and the first non-mammalian amniote to have its genome sequenced, the draft sequence of its genome, composed of approximately one billion base pairs of sequence and an estimated 20,000-23,000 genes, provides a new perspective on vertebrate genome evolution, while also improving the annotation of mammalian genomes. For example, the evolutionary distance between chicken and human provides high specificity in detecting functional elements, both non-coding and coding. Notably, many conserved non-coding sequences are far from genes and cannot be assigned to defined functional classes. In coding regions the evolutionary dynamics of protein domains and orthologous groups illustrate processes that distinguish the lineages leading to birds and mammals. The distinctive properties of avian microchromosomes, together with the inferred patterns of conserved synteny, provide additional insights into vertebrate chromosome architecture.
|
36
|
Blakesley RW, Hansen NF, Mullikin JC, Thomas PJ, McDowell JC, Maskeri B, Young AC, Benjamin B, Brooks SY, Coleman BI, Gupta J, Ho SL, Karlins EM, Maduro QL, Stantripop S, Tsurgeon C, Vogt JL, Walker MA, Masiello CA, Guan X, Bouffard GG, Green ED. An intermediate grade of finished genomic sequence suitable for comparative analyses. Genome Res 2004; 14:2235-44. [PMID: 15479945 PMCID: PMC525681 DOI: 10.1101/gr.2648404] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.1] [Received: 04/01/2004] [Accepted: 08/16/2004] [Indexed: 11/25/2022]
Abstract
Although the cost of generating draft-quality genomic sequence continues to decline, refining that sequence by the process of "sequence finishing" remains expensive. Near-perfect finished sequence is an appropriate goal for the human genome and a small set of reference genomes; however, such a high-quality product cannot be cost-justified for large numbers of additional genomes, at least for the foreseeable future. Here we describe the generation and quality of an intermediate grade of finished genomic sequence (termed comparative-grade finished sequence), which is tailored for use in multispecies sequence comparisons. Our analyses indicate that this sequence is very high quality (with the residual gaps and errors mostly falling within repetitive elements) and reflects 99% of the total sequence. Importantly, comparative-grade sequence finishing requires approximately 40-fold less reagents and approximately 10-fold less personnel effort compared to the generation of near-perfect finished sequence, such as that produced for the human genome. Although applied here to finishing sequence derived from individual bacterial artificial chromosome (BAC) clones, one could envision establishing routines for refining sequences emanating from whole-genome shotgun sequencing projects to a similar quality level. Our experience to date demonstrates that comparative-grade sequence finishing represents a practical and affordable option for sequence refinement en route to comparative analyses.
Affiliation(s)
- Robert W Blakesley
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
|
37
|
Issac B, Raghava GPS. EGPred: prediction of eukaryotic genes using ab initio methods after combining with sequence similarity approaches. Genome Res 2004; 14:1756-66. [PMID: 15342559 PMCID: PMC515322 DOI: 10.1101/gr.2524704] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.6] [Received: 02/27/2004] [Accepted: 07/07/2004] [Indexed: 11/24/2022]
Abstract
EGPred is a Web-based server that combines ab initio methods and similarity searches to predict genes, particularly exon regions, with high accuracy. The EGPred program proceeds in the following steps: (1) an initial BLASTX search of genomic sequence against the RefSeq database is used to identify protein hits with an E-value <1; (2) a second BLASTX search of genomic sequence against the hits from the previous run with relaxed parameters (E-values <10) helps to retrieve all probable coding exon regions; (3) a BLASTN search of genomic sequence against the intron database is then used to detect probable intron regions; (4) the probable intron and exon regions are compared to filter/remove wrong exons; (5) the NNSPLICE program is then used to reassign splicing signal site positions in the remaining probable coding exons; and (6) finally ab initio predictions are combined with exons derived from the fifth step based on the relative strength of start/stop and splice signal sites as obtained from ab initio and similarity search. The combination method increases the exon-level performance of five different ab initio programs by 4%-10% when evaluated on the HMR195 data set. Similar improvement is observed when ab initio programs are evaluated on the Burset/Guigo data set. Finally, EGPred is demonstrated on an approximately 95-Mbp fragment of human chromosome 13. The list of predicted genes from this analysis is available in the supplementary material. The EGPred program is computationally intensive due to multiple BLAST runs during each analysis. The EGPred server is available at http://www.imtech.res.in/raghava/egpred/.
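Step (4) of the pipeline above, comparing probable intron and exon regions to remove wrong exons, can be pictured as an interval comparison. The sketch below is only one plausible reading and is not EGPred's actual rule: the abstract does not state the filtering criterion, so the overlap-fraction test, the 0.5 cutoff, and the names `overlap` and `filter_exons` are assumptions. Intervals are half-open (start, end) coordinates, and intron regions are assumed not to overlap each other.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open genomic interval [start, end)


def overlap(a: Interval, b: Interval) -> int:
    """Number of positions shared by two half-open intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def filter_exons(
    exons: List[Interval],
    introns: List[Interval],
    max_intron_fraction: float = 0.5,  # hypothetical cutoff, not from the source
) -> List[Interval]:
    """Keep exon candidates that are not mostly covered by probable introns."""
    kept = []
    for exon in exons:
        length = exon[1] - exon[0]
        if length <= 0:
            continue  # skip empty or malformed candidates
        # Assumes intron regions do not overlap one another,
        # otherwise shared positions would be counted twice.
        covered = sum(overlap(exon, intron) for intron in introns)
        if covered / length <= max_intron_fraction:
            kept.append(exon)
    return kept
```

For example, with candidate exons (100, 200), (300, 400) and (500, 600) and probable introns (190, 310) and (480, 620), the third candidate lies entirely inside an intron region and is dropped, while the first two are kept.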
Affiliation(s)
- Biju Issac
- Institute of Microbial Technology, Sector 39A, Chandigarh-160036. India
|
38
|
Stewart CA, Horton R, Allcock RJN, Ashurst JL, Atrazhev AM, Coggill P, Dunham I, Forbes S, Halls K, Howson JMM, Humphray SJ, Hunt S, Mungall AJ, Osoegawa K, Palmer S, Roberts AN, Rogers J, Sims S, Wang Y, Wilming LG, Elliott JF, de Jong PJ, Sawcer S, Todd JA, Trowsdale J, Beck S. Complete MHC haplotype sequencing for common disease gene mapping. Genome Res 2004; 14:1176-87. [PMID: 15140828 PMCID: PMC419796 DOI: 10.1101/gr.2188104] [Citation(s) in RCA: 247] [Impact Index Per Article: 11.8] [Received: 11/21/2003] [Accepted: 02/13/2004] [Indexed: 11/24/2022]
Abstract
The future systematic mapping of variants that confer susceptibility to common diseases requires the construction of a fully informative polymorphism map. Ideally, every base pair of the genome would be sequenced in many individuals. Here, we report 4.75 Mb of contiguous sequence for each of two common haplotypes of the major histocompatibility complex (MHC), to which susceptibility to >100 diseases has been mapped. The autoimmune disease-associated haplotypes HLA-A3-B7-Cw7-DR15 and HLA-A1-B8-Cw7-DR3 were sequenced in their entirety through a bacterial artificial chromosome (BAC) cloning strategy using the consanguineous cell lines PGF and COX, respectively. The two sequences were annotated to encompass all described splice variants of expressed genes. We defined the complete variation content of the two haplotypes, revealing >18,000 variations between them. Average SNP densities ranged from less than one SNP per kilobase to >60. Acquisition of complete and accurate sequence data over polymorphic regions such as the MHC from large-insert cloned DNA provides a definitive resource for the construction of informative genetic maps, and avoids the limitation of chromosome regions that are refractory to PCR amplification.
Affiliation(s)
- C Andrew Stewart
- Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
|
39
|
Affiliation(s)
- Kerstin Jekosch
- Wellcome Trust Sanger Institute, Cambridge CB10 1SA, United Kingdom
|
40
|
|
41
|
Rogers J. The Finished Genome Sequence of Homo sapiens. Cold Spring Harbor Symposia on Quantitative Biology 2003; 68:1-11. [PMID: 15338597 DOI: 10.1101/sqb.2003.68.1] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.4] [Indexed: 11/25/2022]
Affiliation(s)
- J Rogers
- The Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
|