1
|
CORRELATIONS BETWEEN SOME MEASURES OF GENETIC DISTANCE. Evolution 2017; 30:851-853. [DOI: 10.1111/j.1558-5646.1976.tb00970.x] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 11/13/1975] [Revised: 03/23/1976] [Indexed: 11/28/2022]
|
2
|
Abstract
In vivo deuterium MR imaging (2H MR) was investigated in rats after intraperitoneal administration of deuterated saline, and a dynamic study of the water movement in rat eyes was performed. Deuterium MR imaging was carried out by means of a gradient-echo (GRE) and a spin-echo (SE) pulse sequence. The rat eye was imaged in 2H MR more selectively by SE than by GRE, but a lower signal-to-noise ratio was obtained in 2H MR imaging using the SE sequence. The MR signal intensity of the rat eye was followed by a 3-compartment model, which enabled determination of the flow rate constant of the water in the eye (0.359/min). Deuterium MR imaging is useful to visualize the dynamic change of water in rat eyes using 2H MR at the same magnetic field (2 T) that can also be used for conventional MR imaging in humans.
|
3
|
Abstract
Orthologs are widely used for phylogenetic analysis of species; however, identifying genuine orthologs among distantly related species is challenging, because genes obtained through horizontal gene transfer (HGT) and out-paralogs derived from gene duplication before speciation are often present among the predicted orthologs. We developed a program, “Ortholog-Finder,” to obtain ortholog data sets for performing phylogenetic analysis by using all open-reading frame data of species. The program includes five processes for minimizing the effects of HGT and out-paralogs in phylogeny construction: 1) HGT filtering: Genes derived from HGT could be detected and deleted from the initial sequence data set by examining their base compositions. 2) Out-paralog filtering: Out-paralogs are detected and deleted from the data set based on sequence similarity. 3) Classification of phylogenetic trees: Phylogenetic trees generated for ortholog candidates are classified as monophyletic or polyphyletic trees. 4) Tree splitting: Polyphyletic trees are bisected to obtain monophyletic trees and remove HGT genes and out-paralogs. 5) Threshold changing: Out-paralogs are further excluded from the data set based on the difference in the similarity scores of genuine orthologs and out-paralogs. We examined how out-paralogs and HGTs affected phylogenetic trees constructed for species based on ortholog data sets obtained by Ortholog-Finder with the use of simulation data, and we determined the effects of confounding factors. We then used Ortholog-Finder in phylogeny construction for 12 Gram-positive bacteria from two phyla and validated each node of the constructed tree by comparison with individually constructed ortholog trees.
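The tree classification step (step 3 above) can be illustrated with a minimal sketch. It is not the Ortholog-Finder implementation: it assumes rooted trees written as nested tuples of uniquely named leaves, and it assumes that "monophyletic" means the chosen group of leaves forms exactly one clade with no other members. Anything else is labeled "polyphyletic", following the two-way split in the abstract. The function names are hypothetical.

```python
def leaves(tree):
    """Return the set of leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result


def is_monophyletic(tree, group):
    """True if some subtree's leaf set equals `group` exactly."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


def classify(tree, group):
    """Label a candidate tree for the given group of leaves (assumed binary split)."""
    return "monophyletic" if is_monophyletic(tree, group) else "polyphyletic"


# Hypothetical gene trees: A1/A2 come from one group of species, B1/B2 from another.
clean_tree = (("A1", "A2"), ("B1", "B2"))
mixed_tree = (("A1", "B1"), ("A2", "B2"))
print(classify(clean_tree, {"A1", "A2"}))
print(classify(mixed_tree, {"A1", "A2"}))
```

In this toy setting, a tree labeled "polyphyletic" would be the kind of candidate passed on to the tree-splitting step.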
|
4
|
Dopamine receptor genes and evolutionary differentiation in the domestication of fighting cocks and long-crowing chickens. PLoS One 2014; 9:e101778. [PMID: 25078403 PMCID: PMC4117491 DOI: 10.1371/journal.pone.0101778] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 01/07/2014] [Accepted: 06/11/2014] [Indexed: 11/23/2022] Open
Abstract
The chicken domestication process represents a typical model of artificial selection, and gives significant insight into the general understanding of the influence of artificial selection on recognizable phenotypes. Two Japanese domesticated chicken varieties, the fighting cock (Shamo) and the long-crowing chicken (Naganakidori), have been selectively bred for dramatically different phenotypes. The former has been selected exclusively for aggressiveness and the latter for long crowing with an obedient sitting posture. To understand the particular mechanism behind these genetic changes during domestication, we investigated the degree of genetic differentiation in the aforementioned chickens, focusing on dopamine receptor D2, D3, and D4 genes. We studied other ornamental chickens such as Chabo chickens as a reference for comparison. When genetic differentiation was measured by an index of nucleotide differentiation (NST) newly devised in this study, we found that the NST value of DRD4 for Shamo (0.072) was distinctively larger than those of the other genes among the three populations, suggesting that aggressiveness has been selected for in Shamo by collecting a variety of single nucleotide polymorphisms. In addition, we found that in DRD4 in Naganakidori, there is a deletion variant of one proline at the 24th residue in the repeat of nine prolines of exon 1. We thus conclude that artificial selection has operated on these different kinds of genetic variation in the DRD4 genes of Shamo and Naganakidori so strongly that the two domesticated varieties have differentiated to obtain their present opposite features in a relatively short period of time.
|
5
|
Divergence of East Asians and Europeans estimated using male- and female-specific genetic markers. Genome Biol Evol 2014; 6:466-73. [PMID: 24589501 PMCID: PMC3971580 DOI: 10.1093/gbe/evu027] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 12/29/2022] Open
Abstract
To study the male and female lineages of East Asian and European humans, we have sequenced 25 short tandem repeat markers on 453 Y-chromosomes and collected sequences of 72 complete mitochondrial genomes to construct independent phylogenetic trees for male and female lineages. The results indicate that East Asian individuals fall into two clades, one that includes East Asian individuals only and a second that contains East Asian and European individuals. Surprisingly, the European individuals did not form an independent clade, but branched within the East Asians. We then estimated the divergence time of the root of the European clade as ∼41,000 years ago. These data indicate that, contrary to traditional views, Europeans diverged from East Asians around that time. We also address the origin of the Ainu lineage in northern Japan.
|
6
|
Age-dependent changes in the functions and compositions of photosynthetic complexes in the thylakoid membranes of Arabidopsis thaliana. PHOTOSYNTHESIS RESEARCH 2013; 117:547-56. [PMID: 23975202 DOI: 10.1007/s11120-013-9906-2] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 03/28/2013] [Accepted: 07/30/2013] [Indexed: 05/05/2023]
Abstract
Photosynthetic complexes in the thylakoid membrane of plant leaves primarily function as energy-harvesting machinery during the growth period. However, leaves undergo developmental and functional transitions with aging and, at the senescence stage, these complexes become major sources of nutrients to be remobilized to other organs such as developing seeds. Here, we investigated age-dependent changes in the functions and compositions of photosynthetic complexes during natural leaf senescence in Arabidopsis thaliana. We found that Chl a/b ratios decreased during natural leaf senescence along with a decrease in the total chlorophyll content. The photosynthetic parameters measured by chlorophyll fluorescence, namely the photochemical efficiency (Fv/Fm) of photosystem II, non-photochemical quenching, and the electron transfer rate, showed a differential decline in the senescing part of the leaves. The CO2 assimilation rate and the PSI activity measured from whole senescing leaves remained relatively intact until 28 days of leaf age but declined sharply thereafter. Examination of the behaviors of the individual components in the photosynthetic complex showed that the components on the whole decreased, but again showed differential decline during leaf senescence. Notably, D1, a PSII reaction center protein, was almost absent, whereas PsaA/B, a PSI reaction center protein, still remained at the senescence stage. Taken together, our results indicate that the compositions and structures of the photosynthetic complexes are differentially utilized at different stages of leaf age, but the most dramatic change was observed at the senescence stage, possibly to comply with the physiological states of the senescence process.
|
7
|
Purification and characterization of two phospho-β-galactosidases, LacG1 and LacG2, from Lactobacillus gasseri ATCC33323(T). J GEN APPL MICROBIOL 2012; 58:11-7. [PMID: 22449746 DOI: 10.2323/jgam.58.11] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 11/03/2022]
Abstract
Lactobacillus gasseri ATCC33323(T) expresses four enzymes showing phospho-β-galactosidase activity (LacG1, LacG2, Pbg1 and Pbg2). We previously reported the purification and characterization of two phospho-β-galactosidases (Pbg1 and Pbg2) from Lactobacillus gasseri JCM1031 cultured in lactose medium. Here we aimed to characterize LacG1 and LacG2, and to classify the four enzymes as either 'phospho-β-galactosidase' or 'phospho-β-glucosidase.' LacG1 and recombinant LacG2 (rLacG2), from Lb. gasseri ATCC33323(T), were purified to homogeneity using column chromatography. Kinetic experiments were performed using sugar substrates, o-nitrophenyl-β-D-galactopyranoside 6-phosphate (ONPGal-6P) and o-nitrophenyl-β-D-glucopyranoside 6-phosphate (ONPGlc-6P), synthesized in our laboratory. LacG1 and rLacG2 exhibited high k(cat)/K(m) values for ONPGal-6P as compared with Pbg1 and Pbg2. The V(max) values for ONPGal-6P were higher than those of phospho-β-galactosidases previously purified and characterized from several lactic acid bacteria. A phylogenetic tree analysis showed that LacG1 and LacG2 belong to the phospho-β-galactosidase cluster and Pbg1 and Pbg2 belong to the phospho-β-glucosidase cluster. Our data suggest that two phospho-β-galactosidases, LacG1 and LacG2, are the primary enzymes for lactose utilization in Lb. gasseri ATCC33323(T). We propose a reclassification of Pbg1 and Pbg2 as phospho-β-glucosidases.
|
8
|
HGT-Gen: a tool for generating a phylogenetic tree with horizontal gene transfer. Bioinformation 2011; 7:211-3. [PMID: 22125388 PMCID: PMC3218414 DOI: 10.6026/97320630007211] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 10/06/2011] [Accepted: 10/24/2011] [Indexed: 11/23/2022] Open
Abstract
Horizontal gene transfer (HGT) is a common event in prokaryotic evolution. Therefore, it is very important to consider HGT in the study of molecular evolution of prokaryotes. This is also true when conducting computer simulations of their molecular phylogeny, because HGT is known to be a serious disturbing factor in estimating their correct phylogeny. To the best of our knowledge, no existing computer program has generated a phylogenetic tree with HGT from an original phylogenetic tree. We developed a program called HGT-Gen that generates a phylogenetic tree with HGT on the basis of an original phylogenetic tree of a protein or gene. HGT-Gen moves an operational taxonomic unit or a clade from one place to another in a given phylogenetic tree. We have also devised an algorithm to compute the average length between any pair of branches in the tree. It defines and computes the relative evolutionary time to normalize evolutionary time for each lineage. The algorithm can generate an HGT between a pair of donor and acceptor lineages at the same evolutionary time. HGT-Gen is used with a sequence-generating program to evaluate the influence of HGT on the molecular phylogeny of prokaryotes in a computer simulation study. AVAILABILITY The database is available for free at http://www.grl.shizuoka.ac.jp/~thoriike/HGT-Gen.html.
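The core tree operation described above, taking an OTU or clade out of one place in a tree and attaching it elsewhere, can be sketched as a prune-and-regraft on nested tuples. This is only an illustration under stated assumptions, not the HGT-Gen algorithm: branch lengths and the relative evolutionary time used by HGT-Gen are omitted, leaf names are assumed unique, and the names `prune`, `graft`, and `move_clade` are hypothetical.

```python
def prune(tree, moved):
    """Remove the subtree `moved`; collapse any parent left with a single child."""
    if tree == moved:
        return None
    if isinstance(tree, str):
        return tree
    kept = [p for p in (prune(child, moved) for child in tree) if p is not None]
    if not kept:
        return None
    if len(kept) == 1:
        return kept[0]
    return tuple(kept)


def graft(tree, destination, moved):
    """Attach `moved` as the sister of the `destination` subtree."""
    if tree == destination:
        return (tree, moved)
    if isinstance(tree, str):
        return tree
    return tuple(graft(child, destination, moved) for child in tree)


def move_clade(tree, moved, destination):
    """Prune `moved` and regraft it next to `destination` (topology only)."""
    return graft(prune(tree, moved), destination, moved)


original = (("A", "B"), ("C", "D"))
# Simulate a transfer that places the gene of A next to C.
print(move_clade(original, "A", "C"))  # ('B', (('C', 'A'), 'D'))
```

If `destination` lies inside the pruned clade, nothing is regrafted in this sketch; a real tool would need to reject such a request.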
|
9
|
Evolutionary conserved microRNAs are ubiquitously expressed compared to tick-specific miRNAs in the cattle tick Rhipicephalus (Boophilus) microplus. BMC Genomics 2011; 12:328. [PMID: 21699734 PMCID: PMC3141673 DOI: 10.1186/1471-2164-12-328] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Received: 04/15/2011] [Accepted: 06/24/2011] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND MicroRNAs (miRNAs) are small non-coding RNAs that act as regulators of gene expression in eukaryotes modulating a large diversity of biological processes. The discovery of miRNAs has provided new opportunities to understand the biology of a number of species. The cattle tick, Rhipicephalus (Boophilus) microplus, causes significant economic losses in cattle production worldwide and this drives us to further understand their biology so that effective control measures can be developed. To be able to provide new insights into the biology of cattle ticks and to expand the repertoire of tick miRNAs we utilized Illumina technology to sequence the small RNA transcriptomes derived from various life stages and selected organs of R. microplus. RESULTS To discover and profile cattle tick miRNAs we employed two complementary approaches, one aiming to find evolutionary conserved miRNAs and another focused on the discovery of novel cattle-tick specific miRNAs. We found 51 evolutionary conserved R. microplus miRNA loci, with 36 of these previously found in the tick Ixodes scapularis. The majority of the R. microplus miRNAs are perfectly conserved throughout evolution with 11, 5 and 15 of these conserved since the Nephrozoan (640 MYA), Protostomian (620MYA) and Arthropoda (540 MYA) ancestor, respectively. We then employed a de novo computational screening for novel tick miRNAs using the draft genome of I. scapularis and genomic contigs of R. microplus as templates. This identified 36 novel R. microplus miRNA loci of which 12 were conserved in I. scapularis. Overall we found 87 R. microplus miRNA loci, of these 15 showed the expression of both miRNA and miRNA* sequences. R. microplus miRNAs showed a variety of expression profiles, with the evolutionary-conserved miRNAs mainly expressed in all life stages at various levels, while the expression of novel tick-specific miRNAs was mostly limited to particular life stages and/or tick organs. 
CONCLUSIONS Anciently acquired miRNAs in the R. microplus lineage not only tend to accumulate the fewest nucleotide substitutions compared with recently acquired miRNAs, but also show ubiquitous expression profiles throughout tick life stages and organs, in contrast to the restricted expression profiles of novel tick-specific miRNAs.
|
10
|
Identification of a new adhesin-like protein from Lactobacillus mucosae ME-340 with specific affinity to the human blood group A and B antigens. J Appl Microbiol 2011; 109:927-35. [PMID: 20408914 DOI: 10.1111/j.1365-2672.2010.04719.x] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Indexed: 11/27/2022]
Abstract
AIMS To identify and characterize a new adhesin-like protein of probiotics that show specific adhesion to human blood group A and B antigens. METHODS AND RESULTS Using the BIACORE assay, the adhesion of cell surface components obtained from four lactobacilli strains that adhered to blood group A and B antigens was tested. Their components showed a significant adhesion to A and B antigens when compared to the bovine serum albumin (BSA) control. The 1 mol l(-1) GHCl fraction extracted from Lactobacillus mucosae ME-340 contained a 29-kDa band (Lam29) using SDS-PAGE. The N-terminal amino acid sequence and homology analysis showed that Lam29 was 90% similar to the substrate-binding protein of the ATP-binding cassette (ABC) transporter from Lactobacillus fermentum IFO 3956. The complete nucleotide sequence (858 bp) of Lam29 was determined and encoded a protein of 285 amino acid residues. Phylogenetic analysis and multiple sequence alignments indicated this protein may be related to the cysteine-binding transporter. CONCLUSIONS The adhesion of ME-340 strain to blood group A and B antigens was mediated by Lam29 that is a putative component of ABC transporter as an adhesin-like protein. SIGNIFICANCE AND IMPACT OF THE STUDY Lactobacillus mucosae ME-340 expressing Lam29 may be useful for competitive exclusion of pathogens via blood group antigen receptors in the human gastrointestinal mucosa and in the development of new probiotic foods.
|
11
|
Evolution of protein phosphorylation for distinct functional modules in vertebrate genomes. Mol Biol Evol 2010; 28:1131-40. [PMID: 20956806 DOI: 10.1093/molbev/msq268] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Indexed: 11/14/2022] Open
Abstract
Recent publications have revealed that the evolution of phosphosites is influenced by the local protein structures and whether the phosphosites have characterized functions or not. With knowledge of the wide functional range of phosphorylation, we attempted to clarify whether the evolutionary conservation of phosphosites is different among distinct functional modules. We grouped the phosphosites in the human genome into the modules according to the functional categories of KEGG (Kyoto Encyclopedia of Genes and Genomes) and investigated their evolutionary conservation in vertebrate genomes from mouse to zebrafish. We have found that the phosphosites in the vertebrate-specific functional modules (VFMs), such as cellular signaling processes and responses to stimuli, are evolutionarily more conserved than those in the basic functional modules (BFMs), such as metabolic and genetic processes. The phosphosites in the VFMs are also significantly more conserved than their flanking regions, whereas those in the BFMs are not. These results hold for both serine/threonine and tyrosine residues, although the fraction of phosphorylated tyrosine residues is increased in the VFMs. Moreover, the difference in the evolutionary conservation of the phosphosites between the VFMs and BFMs could not be explained by the difference in the local protein structures. There is also a higher fraction of phosphosites with known functions in the VFMs than BFMs. Based on these findings, we have concluded that protein phosphorylation may play more dominant roles for the VFMs than BFMs during the vertebrate evolution. As phosphorylation is a quite rapid biological reaction, the VFMs that quickly respond to outer stimuli and inner signals might heavily depend on this regulatory mechanism. Our results imply that phosphorylation may have an essential role in the evolution of vertebrates.
|
12
|
Biological Databases at DNA Data Bank of Japan in the Era of Next-Generation Sequencing Technologies. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2010; 680:125-35. [DOI: 10.1007/978-1-4419-5913-3_15] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 12/23/2022]
|
13
|
BioCaster: detecting public health rumors with a Web-based text mining system. Bioinformatics 2008; 24:2940-1. [PMID: 18922806 PMCID: PMC2639299 DOI: 10.1093/bioinformatics/btn534] [Citation(s) in RCA: 155] [Impact Index Per Article: 9.7] [Received: 06/11/2008] [Revised: 10/07/2008] [Accepted: 10/09/2008] [Indexed: 11/12/2022] Open
Abstract
SUMMARY BioCaster is an ontology-based text mining system for detecting and tracking the distribution of infectious disease outbreaks from linguistic signals on the Web. The system continuously analyzes documents reported from over 1700 RSS feeds, classifies them for topical relevance and plots them onto a Google map using geocoded information. The background knowledge for bridging the gap between layman's terms and formal coding systems is contained in the freely available BioCaster ontology, which includes information in eight languages focused on the epidemiological role of pathogens as well as geographical locations with their latitudes/longitudes. The system consists of four main stages: topic classification, named entity recognition (NER), disease/location detection and event recognition. Higher order event analysis is used to detect more precisely specified warning signals that can then be sent to registered users via email alerts. Evaluation of the system for topic recognition and entity identification is conducted on a gold standard corpus of annotated news articles. AVAILABILITY The BioCaster map and ontology are freely available via a web portal at http://www.biocaster.org.
|
14
|
An evolutionary origin and selection process of goldfish. Gene 2008; 430:5-11. [PMID: 19027055 DOI: 10.1016/j.gene.2008.10.019] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.7] [Received: 09/18/2008] [Revised: 10/16/2008] [Accepted: 10/21/2008] [Indexed: 10/21/2022]
Abstract
Many different physical characteristics of goldfish (Carassius auratus auratus), such as celestial and telescopic eyes, fancy but uncontrollable shapes of tail fin, an unfittingly fat body, and loss of dorsal fins, provide us with a unique opportunity of studying artificial selection on phenotypic changes on the basis of molecular evolution. The aim of the present study is to elucidate the evolutionary origin and history of goldfish, taking into account the different characteristics of goldfish and human culture. Collecting 44 samples of a variety of goldfish from Japan and China as well as common and Crucian carps, we determined the nucleotide sequences for a substantial portion of the mitochondrial genome, including eight gene regions (D-loop, 12S rRNA, 16S rRNA, ND1, ND2, COI, ND5 and Cyt b) of approximately 11,180 bp. We then constructed phylogenetic trees for a total of 78 fishes, adding the 19 sequences available in the international DNA database DDBJ/EMBL/GenBank to the 59 sequences we determined. From the phylogenetic trees obtained, we found that Japanese goldfish are not closely related to Japanese Crucian carp (Carassius auratus langsdorfi) and that all the goldfish examined originated from one of the two groups of the Chinese Crucian carp "Gibelio" (Carassius auratus gibelio). Moreover, we found that the process of artificial selection began with the loss of the dorsal fin, followed by diversification of other characters such as eyes. This is supported by our further observations that the improvement of celestial and telescope eyes took place independently at different times, implying that goldfish were subjected to strong artificial selection only to meet the diversified needs of human preferences in an unsystematic way.
|
15
|
The GTOP database in 2009: updated content and novel features to expand and deepen insights into protein structures and functions. Nucleic Acids Res 2008; 37:D333-7. [PMID: 18987007 PMCID: PMC2686575 DOI: 10.1093/nar/gkn855] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.7] [Indexed: 11/14/2022] Open
Abstract
The Genomes TO Protein Structures and Functions (GTOP) database (http://spock.genes.nig.ac.jp/~genome/gtop.html) freely provides an extensive collection of information on protein structures and functions obtained by application of various computational tools to the amino acid sequences of entirely sequenced genomes. GTOP contains annotations of 3D structures, protein families, functions, and other useful data of a protein of interest in user-friendly ways to give a deep insight into the protein structure. From the initial 1999 version, GTOP has been continually updated to reap the fruits of genome projects and augmented to supply novel information, in particular intrinsically disordered regions. As intrinsically disordered regions constitute a considerable fraction of proteins and often play crucial roles especially in eukaryotes, their assignments give important additional clues to the functionality of proteins. Additionally, we have incorporated the following features into GTOP: a platform independent structural viewer, results of HMM searches against SCOP and Pfam, secondary structure predictions, color display of exon boundaries in eukaryotic proteins, assignments of gene ontology terms, search tools, and master files.
|
16
|
Phylogenetic construction of 17 bacterial phyla by new method and carefully selected orthologs. Gene 2008; 429:59-64. [PMID: 19000750 DOI: 10.1016/j.gene.2008.10.006] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Received: 08/30/2008] [Revised: 10/03/2008] [Accepted: 10/06/2008] [Indexed: 11/26/2022]
Abstract
Here, we constructed a phylogenetic tree of 17 bacterial phyla covering eubacteria and archaea by using a new method and 102 carefully selected orthologs from their genomes. One of the serious disturbing factors in phylogeny construction is the existence of out-paralogs that cannot easily be found out and discarded. In our method, out-paralogs are detected and removed by constructing a phylogenetic tree of the genes in question and examining the clustered genes in the tree. We also developed a method for comparing two tree topologies or shapes, ComTree. Applying ComTree to the constructed tree we computed the relative number of orthologs that support a node of the tree. This number is called the Positive Ortholog Ratio (POR), which is conceptually and methodologically different from the frequently used bootstrap value. Our study concretely shows drawbacks of the bootstrap test. Our result of bacterial phylogeny analysis is consistent with previous ones showing that hyperthermophilic bacteria such as Thermotogae and Aquificae diverged earlier than the others in the eubacterial phylogeny studied. It is noted that our results are consistent whether thermophilic archaea or mesophilic archaea is employed for determining the root of the tree. The earliest divergence of hyperthermophilic eubacteria is supported by genes involved in fundamental metabolic processes such as glycolysis, nucleotide and amino acid syntheses.
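The Positive Ortholog Ratio described above can be illustrated with a small sketch. It is not ComTree: it assumes rooted nested-tuple trees with one gene per species, and it assumes that an ortholog tree "supports" a node when it contains a clade with exactly the same set of species. The function names are hypothetical.

```python
def collect_clades(tree, out):
    """Add the leaf set of every subtree to `out`; return this subtree's leaf set."""
    if isinstance(tree, str):
        leafset = frozenset([tree])
    else:
        leafset = frozenset().union(*(collect_clades(child, out) for child in tree))
    out.add(leafset)
    return leafset


def positive_ortholog_ratio(node_species, ortholog_trees):
    """Fraction of ortholog trees containing a clade equal to `node_species`."""
    node = frozenset(node_species)
    supporting = 0
    for tree in ortholog_trees:
        found = set()
        collect_clades(tree, found)
        if node in found:
            supporting += 1
    return supporting / len(ortholog_trees)


# Four hypothetical ortholog trees over species A, B and C.
ortholog_trees = [
    (("A", "B"), "C"),
    (("A", "C"), "B"),
    (("A", "B"), "C"),
    ("A", ("B", "C")),
]
print(positive_ortholog_ratio({"A", "B"}, ortholog_trees))  # 0.5
```

Unlike a bootstrap value, which resamples sites within one alignment, this ratio counts agreement across independent ortholog trees, which is the conceptual difference the abstract points to.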
|
17
|
Abstract
DNA Data Bank of Japan (DDBJ) (http://www.ddbj.nig.ac.jp) collected and released 2 368 110 entries or 1 415 106 598 bases in the period from July 2007 to June 2008. The releases in this period include genome-scale data of Bombyx mori, Oryzias latipes, Drosophila and Lotus japonicus. In addition, from this year we collected and released trace archive data in collaboration with the National Center for Biotechnology Information (NCBI). The first release contains those of O. latipes and bacterial metagenomes in the human gut. To cope with the current progress of sequencing technology, we also accepted and released more than 100 million short reads of parasitic protozoa and their hosts that were produced by using a Solexa sequencer.
|
18
|
Abstract
With the quantity of genomic data increasing at an exponential rate, it is imperative that these data be captured electronically, in a standard format. Standardization activities must proceed within the auspices of open-access and international working bodies. To tackle the issues surrounding the development of better descriptions of genomic investigations, we have formed the Genomic Standards Consortium (GSC). Here, we introduce the minimum information about a genome sequence (MIGS) specification with the intent of promoting participation in its development and discussing the resources that will be required to develop improved mechanisms of metadata capture and exchange. As part of its wider goals, the GSC also supports improving the 'transparency' of the information contained in existing genomic databases.
|
19
|
Abstract
AIM Currently, approximately 44 000 hepatitis C virus (HCV), 11 000 hepatitis B virus (HBV), and 1600 hepatitis E virus (HEV) sequences are available at the International Nucleotide Sequence Database Collaboration (INSDC, previously known as DDBJ/EMBL/GenBank), and the number of these virus sequences is growing rapidly. However, since INSDC is not specialized for hepatitis viruses, it is difficult to retrieve information of virological or clinical interest from it. Thus, it is quite worthwhile to construct a specialized database for the hepatitis virus sequences and to make it accessible to researchers worldwide. METHODS We developed a WWW-based database, the hepatitis virus database (HVDB), which contains all the HCV, HBV, and HEV sequences available at INSDC. In the HVDB, all partial sequences obtained from INSDC are aligned to the genome sequence of each virus. Also given in the database are the phylogenetic relationships of each locus on the genome among variants for each virus. RESULTS Users of the database can easily retrieve entries (sequences with annotations) of a specific genotype by referring to the phylogenetic relationships, or those of specific loci by referring to the genome map information. HVDB provides users with a tool for phylogenetic analysis that can be used in combination with the data retrieval tools. CONCLUSION The latest release is publicly accessible at the HVDB website: http://s2as02.genes.nig.ac.jp.
|
20
|
[International collaboration among DDBJ, EMBL Bank and GenBank]. TANPAKUSHITSU KAKUSAN KOSO. PROTEIN, NUCLEIC ACID, ENZYME 2008; 53:182-189. [PMID: 18240597] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2023]
|
21
|
The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucleic Acids Res 2007; 36:D793-9. [PMID: 18089548 PMCID: PMC2238988 DOI: 10.1093/nar/gkm999] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.6] [Indexed: 11/13/2022] Open
Abstract
Here we report the new features and improvements in our latest release of the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/), a comprehensive annotation resource for human genes and transcripts. H-InvDB, originally developed as an integrated database of the human transcriptome based on extensive annotation of large sets of full-length cDNA (FLcDNA) clones, now provides annotation for 120 558 human mRNAs extracted from the International Nucleotide Sequence Databases (INSD), in addition to 54 978 human FLcDNAs, in the latest release H-InvDB_4.6. We mapped those human transcripts onto the human genome sequences (NCBI build 36.1) and determined 34 699 human gene clusters, which could define 34 057 (98.1%) protein-coding and 642 (1.9%) non-protein-coding loci; 858 (2.5%) transcribed loci overlapped with predicted pseudogenes. For all these transcripts and genes, we provide comprehensive annotation including gene structures, gene functions, alternative splicing variants, functional non-protein-coding RNAs, functional domains, predicted subcellular localizations, metabolic pathways, predictions of protein 3D structure, mapping of SNPs and microsatellite repeat motifs, co-localization with orphan diseases, gene expression profiles, orthologous genes, protein-protein interactions (PPI) and annotation for gene families. The current H-InvDB annotation resources consist of two main views, the Transcript view and the Locus view, and eight sub-databases: the DiseaseInfo Viewer, H-ANGEL, the Clustering Viewer, G-integra, the TOPO Viewer, Evola, the PPI view and the Gene family/group.
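The locus percentages quoted above can be rechecked with a few lines of arithmetic. The counts are taken from the abstract; the variable names and the one-decimal rounding are assumptions made for this check.

```python
# Counts as stated in the abstract (H-InvDB_4.6).
total_clusters = 34699
protein_coding = 34057
non_protein_coding = 642
pseudogene_overlap = 858


def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# The two locus classes should add up to the number of gene clusters.
print(protein_coding + non_protein_coding == total_clusters)
print(percent(protein_coding, total_clusters))
print(percent(non_protein_coding, total_clusters))
print(percent(pseudogene_overlap, total_clusters))
```

The coding and non-coding loci partition the 34 699 clusters, while the pseudogene-overlapping loci are a separate subset, so the three percentages are not expected to sum to 100.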
|
22
|
Abstract
The Rice Annotation Project Database (RAP-DB) was created to provide the genome sequence assembly of the International Rice Genome Sequencing Project (IRGSP), manually curated annotation of the sequence, and other genomics information that could be useful for comprehensive understanding of the rice biology. Since the last publication of the RAP-DB, the IRGSP genome has been revised and reassembled. In addition, a large number of rice-expressed sequence tags have been released, and functional genomics resources have been produced worldwide. Thus, we have thoroughly updated our genome annotation by manual curation of all the functional descriptions of rice genes. The latest version of the RAP-DB contains a variety of annotation data as follows: clone positions, structures and functions of 31 439 genes validated by cDNAs, RNA genes detected by massively parallel signature sequencing (MPSS) technology and sequence similarity, flanking sequences of mutant lines, transposable elements, etc. Other annotation data such as Gnomon can be displayed along with those of RAP for comparison. We have also developed a new keyword search system to allow the user to access useful information. The RAP-DB is available at: http://rapdb.dna.affrc.go.jp/ and http://rapdb.lab.nig.ac.jp/.
|
23
|
Abstract
Fundamental biological processes can now be studied by applying the full range of OMICS technologies (genomics, transcriptomics, proteomics, metabolomics, and beyond) to the same biological sample. Clearly, it would be desirable if the concept of sample were shared among these technologies, especially as up until the time a biological sample is prepared for use in a specific OMICS assay, its description is inherently technology independent. Sharing a common informatic representation would encourage data sharing (rather than data replication), thereby reducing redundant data capture and the potential for error. This would result in a significant degree of harmonization across different OMICS data standardization activities, a task that is critical if we are to integrate data from these different data sources. Here, we review the current concept of sample in OMICS technologies as it is being dealt with by different OMICS standardization initiatives and discuss the special role that the newly formed Genomic Standards Consortium (GSC) might have to play in this domain.
|
24
|
β-Galactosidase, phospho-β-galactosidase and phospho-β-glucosidase activities in lactobacilli strains isolated from human faeces. Lett Appl Microbiol 2007; 45:461-6. [DOI: 10.1111/j.1472-765x.2007.02176.x] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Indexed: 11/26/2022]
|
25
|
Abstract
DDBJ (http://www.ddbj.nig.ac.jp) collected and released 1 880 115 entries or 1 134 086 245 bases in the period from July 2006 to June 2007. The released data contain the high-throughput cDNAs of cricket and a high-quality draft genome of medaka, among others. Our computer system has been upgraded since March 2007. Another new aspect is an efficient data retrieval tool that has recently been put into service at DDBJ. It is called All-round Retrieval for Sequence and Annotation, and it enables the user to search for keywords in the Feature/Qualifier fields of the International Nucleotide Sequence Database Collaboration (http://www.insdc.org/) as well. We will also replace our home page with a more efficient one by the end of 2007.
|
26
|
Abstract
It is desirable to estimate a tree of life, a species tree including all available species in the 3 superkingdoms, Archaea, Bacteria, and Eukaryota, using not a limited number of genes but full-scale genome information. Here, we report a new method for constructing a tree of life based on protein domain organizations, that is, the sequential order of domains in a protein, of all proteins detected in the genome of an organism. The new method does not depend on the identification of orthologous gene sets and therefore avoids that burdensome and error-prone computation. By pairwise comparisons of the repertoires of protein domain organizations of 17 archaeal, 136 bacterial, and 14 eukaryotic organisms, we computed evolutionary distances among them and constructed a tree of life. Our tree shows monophyly in Archaea, Bacteria, and Eukaryota and then monophyly in each of the eukaryotic kingdoms and in most bacterial phyla. In addition, the branching pattern of the bacterial phyla in our tree is consistent with the widely accepted bacterial taxonomy and is very close to other genome-based trees. A few inconsistencies between the traditional trees and the genome-based trees, including ours, may however call for revising the conventional view, particularly on the phylogenetic positions of hyperthermophiles.
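As an illustration of the pairwise comparison described in this abstract, the following minimal sketch treats each genome as a set of domain organizations (ordered tuples of domain names) and computes a distance between two repertoires. The Jaccard distance, the function name `repertoire_distance`, and the toy domain names are assumptions made for illustration only; the abstract does not state which distance measure the authors used.

```python
from typing import FrozenSet, Tuple

# A domain organization is the ordered sequence of domains in one protein.
DomainOrganization = Tuple[str, ...]
Repertoire = FrozenSet[DomainOrganization]


def repertoire_distance(a: Repertoire, b: Repertoire) -> float:
    """Jaccard distance between two repertoires (an assumed measure)."""
    union = a | b
    if not union:
        return 0.0  # two empty repertoires are treated as identical
    return 1.0 - len(a & b) / len(union)


# Hypothetical genomes: the same domains in a different order count as
# different organizations, because the tuples preserve order.
genome_x = frozenset({("kinase",), ("SH3", "SH2", "kinase"), ("helicase", "zinc_finger")})
genome_y = frozenset({("kinase",), ("SH2", "SH3", "kinase"), ("helicase", "zinc_finger")})

print(repertoire_distance(genome_x, genome_y))  # 0.5
```

A matrix of such pairwise distances could then be passed to any distance-based tree-building method; that step is omitted here.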
|
27
|
Exploration and grading of possible genes from 183 bacterial strains by a common protocol to identification of new genes: Gene Trek in Prokaryote Space (GTPS). DNA Res 2006; 13:245-54. [PMID: 17166861 DOI: 10.1093/dnares/dsl014] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.0] [Indexed: 11/13/2022]
Abstract
A large number of complete microorganism genomes has been sequenced and submitted to the public database and then incorporated into our complete genome database, Genome Information Broker (GIB, http://gib.genes.nig.ac.jp/). However, when comparative genomics is carried out, researchers must be aware that there are protein-coding genes not confirmed by homology or motif search and that reliable protein-coding genes are missing. Therefore, we developed a protocol (Gene Trek in Prokaryote Space, GTPS) for finding possible protein-coding genes in bacterial genomes. GTPS assigns a degree of reliability to predicted protein-coding genes. We first systematically applied the protocol to the complete genomes of all 123 bacterial species and strains that were publicly available as of July 2003, and then to those of 183 species and strains available as of September 2004. We found a number of incorrect genes and several new ones in the genome data in question. We also found a way to estimate the total number of orthologous genes in the bacterial world.
|
28
|
Abstract
DNA Data Bank of Japan (DDBJ) (http://www.ddbj.nig.ac.jp) newly collected and released 12,927,184 entries or 13,787,688,598 bases in the period from July 2005 to June 2006. The released data contain honeybee expressed sequence tags (ESTs), re-examined and re-annotated complete genome data of Escherichia coli K-12 W3110, medaka WGS and human MGA. We also systematically evaluated and classified the genes in the complete bacterial genomes submitted to the International Nucleotide Sequence Database Collaboration (INSDC, http://insdc.org) that is composed of DDBJ, EMBL Bank and GenBank. The examination and classification selected 557,000 genes as reliable ones among all the bacterial genes predicted by us.
|
29
|
Evidence Standards in Experimental and Inferential INSDC Third Party Annotation Data. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2006; 10:105-13. [PMID: 16901214 DOI: 10.1089/omi.2006.10.105] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Indexed: 11/13/2022]
Abstract
The Third Party Annotation (TPA) project collects and presents high-quality annotation of nucleotide sequence. Annotation is submitted by researchers who have not themselves generated novel nucleotide sequence. In its first few years, the resource has proven to be popular with submitters from a range of biological research areas. Central to the project is the requirement for high-quality data, resulting from experimental and inferred analysis discussed in peer-reviewed publications. The data are divided into two tiers: those with experimental evidence and those with inferential evidence. Standards for TPA are detailed and illustrated with the aid of case studies.
|
30
|
DDBJ in preparation for overview of research activities behind data submissions. Nucleic Acids Res 2006; 34:D6-9. [PMID: 16381940 PMCID: PMC1347473 DOI: 10.1093/nar/gkj111] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.0] [Received: 09/14/2005] [Revised: 10/18/2005] [Accepted: 10/18/2005] [Indexed: 11/26/2022]
Abstract
In the past year, DDBJ (http://www.ddbj.nig.ac.jp) collected and released 1,956,826 entries or 1,741,313,111 bases. The released data include approximately 90,000 ESTs and cDNAs of Macaca fascicularis, and 280 million bases of mouse GSS. In addition to the data collection, we have indexed the submitted data to the International Nucleotide Sequence Database Collaboration (INSDC, http://www.insdc.org) to classify the entries into research projects behind data submissions. They are expected to be useful to the data submitters and users for enhancing the data submission, retrieval and systematic data analyses at INSDC. The results of indexing also allow one to grasp research projects in life sciences that promoted and produced the DNA sequences submitted to INSDC.
|
31
|
Abstract
To elucidate the origins of the MHC-B-MHC-C pair and the MHC class I chain-related molecule (MIC)A-MICB pair, we sequenced an MHC class I genomic region of humans, chimpanzees, and rhesus monkeys and analyzed the regions from an evolutionary standpoint, focusing first on LINE sequences that are paralogous within each of the first two species and orthologous between them. Because all the long interspersed nuclear element (LINE) sequences were fragmented and nonfunctional, they were suitable for conducting phylogenetic study and, in particular, for estimating evolutionary time. Our study has revealed that MHC-B and MHC-C duplicated 22.3 million years (Myr) ago, and the ape MICA and MICB duplicated 14.1 Myr ago. We then estimated the divergence time of the rhesus monkey by using other orthologous LINE sequences in the class I regions of the three primate species. The result indicates that rhesus monkeys, and possibly the Old World monkeys in general, diverged from humans 27-30 Myr ago. Interestingly, rhesus monkeys were found to have not the pair of MHC-B and MHC-C but many repeated genes similar to MHC-B. These results support our inference that MHC-B and MHC-C duplicated after the divergence between apes and Old World monkeys.
|
32
|
Development of a spot reliability evaluation score for DNA microarrays. Gene 2005; 350:149-60. [PMID: 15788151 DOI: 10.1016/j.gene.2005.02.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/15/2004] [Revised: 01/28/2005] [Accepted: 02/08/2005] [Indexed: 10/25/2022]
Abstract
We developed a reliability index named SRED (Spot Reliability Evaluation Score for DNA microarrays) that represents the probability that the calibrated gene expression level from a DNA microarray would be less than a factor of 2 different from that of quantitative real-time polymerase chain reaction assays whose dynamic quantification range is treated statistically to be similar to that of the DNA microarray. To define the SRED score, two parameters, the reproducibility of measurement value and the relative expression value were selected from nine candidate parameters. The SRED score supplies the probability that the expression level in each spot of a microarray is less than a certain-fold different compared to other expression profiling data, such as QRT-PCR. This score was applied to approximately 1,500,000 points of the expression profile in the RIKEN Expression Array Database.
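To make the "within a factor of 2" criterion concrete, the sketch below checks whether a microarray value and a reference value (for example from QRT-PCR) agree within a given fold, and reports the fraction of agreeing spots. This empirical fraction only illustrates the kind of quantity the SRED score is said to estimate; it is not the SRED calculation itself, and the function names and sample values are hypothetical.

```python
from typing import Iterable, Tuple


def within_fold(array_value: float, reference_value: float, fold: float = 2.0) -> bool:
    """True if the two positive expression levels differ by less than `fold`."""
    if array_value <= 0 or reference_value <= 0:
        raise ValueError("expression levels must be positive")
    ratio = array_value / reference_value
    # Use the larger of ratio and its inverse so the test is symmetric.
    return max(ratio, 1.0 / ratio) < fold


def fraction_within_fold(pairs: Iterable[Tuple[float, float]], fold: float = 2.0) -> float:
    """Fraction of (microarray, reference) pairs that agree within `fold`."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one pair is required")
    return sum(within_fold(a, r, fold) for a, r in pairs) / len(pairs)


# Hypothetical (microarray, QRT-PCR) measurements for four spots.
pairs = [(100.0, 120.0), (50.0, 200.0), (10.0, 6.0), (300.0, 100.0)]
print(fraction_within_fold(pairs))  # 0.5
```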
|
33
|
Abstract
In the past year, we at DDBJ (DNA Data Bank of Japan; http://www.ddbj.nig.ac.jp) collected and released 1 066 084 entries or 718 072 425 bases including the whole chromosome 22 of chimpanzee, the whole-genome shotgun sequences of silkworm and various others. On the other hand, we hosted workshops for human full-length cDNA annotation and participated in jamborees of mouse full-length cDNA annotation. The annotated data are made public at DDBJ. We are also in collaboration with a RIKEN team to accept and release the CAGE (Cap Analysis Gene Expression) data under a new category, MGA (Mass Sequences for Genome Annotation). The data will be useful for studying gene expression control in many aspects.
|
34
|
[International public gene expression database (CIBEX) and data submission]. TANPAKUSHITSU KAKUSAN KOSO. PROTEIN, NUCLEIC ACID, ENZYME 2004; 49:2678-83. [PMID: 15669238] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/01/2023]
|
35
|
Extensive analysis of ORF sequences from two different cichlid species in Lake Victoria provides molecular evidence for a recent radiation event of the Victoria species flock. Gene 2004; 343:263-9. [PMID: 15588581 DOI: 10.1016/j.gene.2004.09.013] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.3] [Received: 05/27/2004] [Revised: 08/05/2004] [Accepted: 09/10/2004] [Indexed: 11/23/2022]
Abstract
The Lake Victoria cichlid fishes have diverged very rapidly. The estimated 500 species inhabiting the lake are believed to have arisen within the last 14,000 years. The fishes' jaws and teeth have diverged markedly to adapt to different feeding behaviors and environments. To examine how the genomes of these fishes differentiated during speciation, we performed comparative analysis of expressed sequence tag (EST) sequences. We constructed cDNA libraries derived only from the jaw portions of two cichlid species endemic to Lake Victoria. We sequenced 17,280 cDNA clones from Haplochromis chilotes and 9600 cDNA clones from Haplochromis sp. "Redtailsheller" and obtained 543 different genes common to both species. Of these genes, 441 were essentially identical between species and 102 contained base replacements in their open reading frame (ORF) or untranslated (UTR) regions. Comparative analysis of 71 selected sequences has revealed that while the degree of polymorphism is 0.0054/site for H. chilotes and 0.0047/site for H. sp. "Redtailsheller", genetic distance between the two species is 0.0031/site. The genetic distance particularly indicates that the two species diverged about 890,000 years ago.
|
36
|
Abstract
Vitamin B6 (VB6) functions as a cofactor of many diverse enzymes in amino acid metabolism. Three metabolic pathways for pyridoxal 5'-phosphate (PLP; the active form of VB6) are known: the de novo pathway, the salvage pathway, and the fungal type pathway. Most unicellular organisms and plants biosynthesize VB6 using one or two of these three biosynthetic pathways. However, animals such as insects and mammals do not possess any of the pathways and thus need to take in VB6 through their diet to survive. It is conceivable that breakdowns of these pathways occurred in the evolutionary lineages of insects and mammals, and one of the major reasons for this would be the loss of pertinent genes. We studied the evolution of VB6 biosynthesis from the viewpoint of the gain and loss of 10 pertinent genes in 122 species whose genome sequences were completely determined. The results revealed that each gene in the pathways was lost more than once in the entire evolutionary lineages of the 122 species. We also found the following three points regarding the evolution of PLP biosynthesis: (1) the breakdown of the PLP biosynthetic pathways occurred independently at least three times in animal lineages, (2) the de novo pathway was formed by the generation of pdxB in gamma-proteobacteria, and (3) the order of the gene loss in VB6 metabolism was conserved among different evolutionary lineages. These results suggest that the evolution of VB6 metabolism was subject to gains and frequent losses of related genes in the 122 species examined. This dynamic nature of the evolutionary changes must have been responsible for the breakdowns of the pathways, resulting in profound differentiation of heterotrophy among the species.
|
37
|
Japanese domesticated chickens have been derived from Shamo traditional fighting cocks. Mol Phylogenet Evol 2004; 33:16-21. [PMID: 15324835 DOI: 10.1016/j.ympev.2004.04.019] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Received: 01/16/2004] [Revised: 04/29/2004] [Indexed: 11/19/2022]
Abstract
With the aim of elucidating the evolutionary origin of Japanese domesticated chickens, this study evolutionarily analyzed 85 chicken mtDNA sequences. Thirty-four various ornamental chickens, 42 fighting cocks (Shamo), and nine long-crowing chickens (Naganakidori) were included. Of the Shamo, 18 were sampled from Okinawa, while the remaining 24 were collected in other islands around Japan. In addition, three Southeast Asian Junglefowls were used as a reference to determine the common ancestor of Japanese domesticated chickens. A phylogenetic tree was constructed for the 88 mtDNA sequences revealing that the Shamo group from Okinawa clearly diverged from the other Japanese domesticated chickens studied. This strongly suggests that all Japanese domesticated chickens, including the ornamental varieties and Naganakidori, derived from the ancestors of the Shamo in Okinawa. To create novel varieties of ornamental chickens, intensive artificial selection is imposed on ancestral Shamo populations, resulting in profoundly differentiated Japanese domesticated chickens.
|
38
|
Abstract
The Microarray Gene Expression Data Society believe that the time is right for journals to require that microarray data be deposited in public repositories, as a condition for publication.
|
39
|
|
40
|
Structural and functional differences in two cyclic bacteriocins with the same sequences produced by lactobacilli. Appl Environ Microbiol 2004; 70:2906-11. [PMID: 15128550 PMCID: PMC404377 DOI: 10.1128/aem.70.5.2906-2911.2004] [Citation(s) in RCA: 71] [Impact Index Per Article: 3.6] [Indexed: 11/20/2022]
Abstract
Lactobacillus gasseri LA39 and L. reuteri LA6 isolated from feces of the same human infant were found to produce similar cyclic bacteriocins (named gassericin A and reutericin 6, respectively) that cannot be distinguished by molecular weights or primary amino acid sequences. However, reutericin 6 has a narrower spectrum than gassericin A. In this study, gassericin A inhibited the growth of L. reuteri LA6, but reutericin 6 did not inhibit the growth of L. gasseri LA39. Both bacteriocins caused potassium ion efflux from indicator cells and liposomes, but the amounts of efflux and patterns of action were different. Although circular dichroism spectra of purified bacteriocins revealed that both antibacterial peptides are composed mainly of alpha-helices, the spectra of the bacteriocins did not coincide. The results of D- and L-amino acid composition analysis showed that two residues and one residue of D-Ala were detected among 18 Ala residues of gassericin A and reutericin 6, respectively. These findings suggest that the different D-alanine contents of the bacteriocins may cause the differences in modes of action, amounts of potassium ion efflux, and secondary structures. This is the first report that characteristics of native bacteriocins produced by wild lactobacillus strains having the same structural genes are influenced by a difference in D-amino acid contents in the molecules.
|
41
|
Abstract
We describe the current status of the gene expression database CIBEX (Center for Information Biology gene EXpression database, http://cibex.nig.ac.jp), with a data retrieval system in compliance with MIAME, a standard that the MGED Society has developed for comparing data produced in microarray experiments at different laboratories worldwide. CIBEX serves as a public repository for a wide range of high-throughput experimental data in gene expression research, including microarray-based experiments measuring mRNA, serial analysis of gene expression (SAGE tags), and mass spectrometry proteomic data.
|
42
|
Integrative annotation of 21,037 human genes validated by full-length cDNA clones. PLoS Biol 2004; 2:e162. [PMID: 15103394 PMCID: PMC393292 DOI: 10.1371/journal.pbio.0020162] [Citation(s) in RCA: 267] [Impact Index Per Article: 13.4] [Received: 12/19/2003] [Accepted: 04/01/2004] [Indexed: 01/08/2023]
Abstract
The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so, gene prediction remains a difficult task, as the varieties of transcripts of a gene may vary to a great extent. We thus performed an exhaustive integrative characterization of 41,118 full-length cDNAs that capture the gene transcripts as complete functional cassettes, providing an unequivocal report of structural and functional diversity at the gene level. Our international collaboration has validated 21,037 human gene candidates by analysis of high-quality full-length cDNA clones through curation using unified criteria. This led to the identification of 5,155 new gene candidates. It also manifested the most reliable way to control the quality of the cDNA clones. We have developed a human gene database, called the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). It provides the following: integrative annotation of human genes, description of gene structures, details of novel alternative splicing isoforms, non-protein-coding RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein three-dimensional structure, mapping of known single nucleotide polymorphisms (SNPs), identification of polymorphic microsatellite repeats within human genes, and comparative results with mouse full-length cDNAs. The H-InvDB analysis has shown that up to 4% of the human genome sequence (National Center for Biotechnology Information build 34 assembly) may contain misassembled or missing regions. We found that 6.5% of the human gene candidates (1,377 loci) did not have a good protein-coding open reading frame, of which 296 loci are strong candidates for non-protein-coding RNA genes. 
In addition, among 72,027 uniquely mapped SNPs and insertions/deletions localized within human genes, 13,215 nonsynonymous SNPs, 315 nonsense SNPs, and 452 indels occurred in coding regions. Together with 25 polymorphic microsatellite repeats present in coding regions, they may alter protein structure, causing phenotypic effects or resulting in disease. The H-InvDB platform represents a substantial contribution to resources needed for the exploration of human biology and pathology.
|
43
|
DDBJ in the stream of various biological data. Nucleic Acids Res 2004; 32:D31-4. [PMID: 14681352 PMCID: PMC308861 DOI: 10.1093/nar/gkh127] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.3] [Received: 09/16/2003] [Revised: 10/03/2003] [Accepted: 10/23/2003] [Indexed: 11/13/2022]
Abstract
In the past year we at DDBJ (http://www.ddbj.nig.ac.jp) have made a steady increase in the number of data submissions with a 50.6% increment in the number of bases or 46.5% increment in the number of entries. Among them, the genome data of man, ascidian and rice hold the top three. Our activity has extended to providing a tool that enables sequence retrieval using regular expressions, and to launching our SOAP server and web services to facilitate the acquisition of proper data and tools from a huge number of biological data resources on websites worldwide. We have also opened our public gene expression database, CIBEX.
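Sequence retrieval by regular expression, as mentioned in this abstract, can be sketched as follows. The example uses Python's standard `re` module on an in-memory dictionary; the record identifiers, the sequences, and the function `find_sequences` are hypothetical and do not reflect the actual DDBJ tool or its interface.

```python
import re
from typing import Dict, List


def find_sequences(records: Dict[str, str], pattern: str) -> List[str]:
    """Return the IDs of records whose sequence contains a match for `pattern`."""
    regex = re.compile(pattern, re.IGNORECASE)  # ignore case of the bases
    return [record_id for record_id, seq in records.items() if regex.search(seq)]


# Hypothetical records keyed by an invented identifier.
records = {
    "entry1": "ATGGCCTATAAAGGC",
    "entry2": "ATGCCCGGGTTTAAA",
    "entry3": "GGGTATATAACCC",
}

# A TATA-box-like motif: TATA, then A or T, then A.
print(find_sequences(records, r"TATA[AT]A"))  # ['entry1', 'entry3']
```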
|
44
|
Highly differentiated and conserved sex chromosome in fish species (Aulopus japonicus: Teleostei, Aulopidae). Gene 2003; 317:187-93. [PMID: 14604807 DOI: 10.1016/s0378-1119(03)00702-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.5] [Indexed: 11/29/2022]
Abstract
While highly differentiated and long-conserved sex chromosomes such as XY and ZW chromosomes are observed, respectively, in mammalian and avian species, no counterparts to such chromosomes were observed in fish until we reported in a previous study that well-conserved and highly differentiated ZW sex chromosomes exist in the family Synodontidae. The question was then whether the evolutionary history of the fish ZW chromosomes was long enough to be comparable to that of the mammalian and avian counterparts. To tackle the problem, we had to extend our finding of the fish sex chromosomes beyond a single family. For this purpose, we chose Aulopus japonicus, which belongs to a family related to Synodontidae. Our cytogenetic and fluorescence in situ hybridization (FISH) analyses have clearly demonstrated that A. japonicus also has ZW chromosomes. We have also found that 5S rDNA clusters are located on the Z and W chromosomes in this species. Using nontranscribed intergenic sequences in the 5S rDNA clusters as PCR primers, we successfully amplified a 6-kb-long female-specific sequence on the W chromosome. The 6-kb-long sequence contained one transposable element and two tRNA sequences. The function of the sequence remains to be studied. Our Southern blot analysis confirmed that the 6-kb sequence was located only on the W chromosome. Therefore, it can now be said that highly differentiated ZW chromosomes have been conserved over two fish families. As these families were reported to have diverged 30-60 million years ago, the fish ZW chromosomes have an evolutionary history corresponding to the history of the families. This is perhaps the first case in which fish sex chromosomes are shown to have such a long evolutionary lineage.
|
45
|
[Standardization of microarray experiment data]. TANPAKUSHITSU KAKUSAN KOSO. PROTEIN, NUCLEIC ACID, ENZYME 2003; 48:280-5. [PMID: 12652749] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/01/2023]
|
46
|
Parallel evolution of ligand specificity between LacI/GalR family repressors and periplasmic sugar-binding proteins. Mol Biol Evol 2003; 20:267-77. [PMID: 12598694 DOI: 10.1093/molbev/msg038] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.3] [Indexed: 11/14/2022]
Abstract
The bacterial LacI/GalR family repressors such as lactose operon repressor (LacI), purine nucleotide synthesis repressor (PurR), and trehalose operon repressor (TreR) consist of not only the N-terminal helix-turn-helix DNA-binding domain but also the C-terminal ligand-binding domain that is structurally homologous to periplasmic sugar-binding proteins. These structural features imply that the repressor family evolved by acquiring the DNA-binding domain in the N-terminal of an ancestral periplasmic binding protein (PBP). Phylogenetic analysis of the LacI/GalR family repressors and their PBP homologues revealed that the acquisition of the DNA-binding domain occurred first in the family, and ligand specificity then evolved. The phylogenetic tree also indicates that the acquisition occurred only once before the divergence of the major lineages of eubacteria, and that the LacI/GalR and the PBP families have since undergone extensive gene duplication/loss independently along the evolutionary lineages. Multiple alignments of the repressors and PBPs furthermore revealed that repressors and PBPs with the same ligand specificity have the same or similar residues in their binding sites. This result, together with the phylogenetic relationship, demonstrates that the repressors and the PBPs individually acquired the same ligand specificity by homoplasious replacement, even though their genes are encoded in the same operon.
|
47
|
Abstract
The DNA Data Bank of Japan (DDBJ, http://www.ddbj.nig.ac.jp) has collected and released more entries and bases than last year. This is mainly due to large-scale submissions from Japanese sequencing teams on mouse, rice, chimpanzee, nematoda and other organisms. The contributions of DDBJ over the past year are 17.3% (entries) and 10.3% (bases) of the combined outputs of the International Nucleotide Sequence Databases (INSD). Our complete genome sequence database, Genome Information Broker (GIB), has been improved by incorporating XML. It is now possible to perform a more sophisticated database search against the new GIB than the ordinary BLAST or FASTA search.
|
48
|
A polymerase chain reaction-based method for cloning novel members of a gene family using a combination of degenerate and inhibitory primers. Gene 2002; 289:177-84. [PMID: 12036596 DOI: 10.1016/s0378-1119(02)00547-4] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.6] [Indexed: 11/25/2022]
Abstract
We have developed a novel method for cloning gene family members by using a polymerase chain reaction technique. The method is based on the amplification of a broad range of homologous genes in combination with the specific inhibition of already cloned genes. To accomplish this, we designed degenerate primers to highly conserved regions among the gene family members, and inhibitory primers to the divergent region at the 3'-margin of each degenerate primer. The 5'-end of the inhibitory primer, the 3'-end of which was aminated, had 3-4 bases overlapping the 3'-end of the degenerate primer. The potential of this method was demonstrated by the successful cloning of a novel member of the yeast MKC7/YAP3 gene family homologue from a filamentous fungus, Aspergillus oryzae, by inhibiting amplification of an already cloned homologue, opsB.
|
49
|
Abstract
The DNA Data Bank of Japan (DDBJ, http://www.ddbj.nig.ac.jp) has made an effort to collect as much data as possible mainly from Japanese researchers. The increase rates of the data we collected, annotated and released to the public in the past year are 43% for the number of entries and 52% for the number of bases. The increase rates are accelerated even after the human genome was sequenced, because sequencing technology has been remarkably advanced and simplified, and research in life science has been shifted from the gene scale to the genome scale. In addition, we have developed the Genome Information Broker (GIB, http://gib.genes.nig.ac.jp) that now includes more than 50 complete microbial genome and Arabidopsis genome data. We have also developed a database of the human genome, the Human Genomics Studio (HGS, http://studio.nig.ac.jp). HGS provides one with a set of sequences being as continuous as possible in any one of the 24 chromosomes. Both GIB and HGS have been updated incorporating newly available data and retrieval tools.
|
50
|
[Muscle involvement of Stormorken's syndrome]. Rinsho Shinkeigaku 2000; 40:915-20. [PMID: 11257789] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/19/2023]
Abstract
We describe two patients, a mother and daughter, with Stormorken's syndrome. The syndrome is characterized clinically by autosomal dominant inheritance, congenital miosis, thrombocytopenia, asplenia and muscle weakness. Both patients had a bleeding tendency, ichthyosis of the arms, and muscle weakness. The daughter additionally had short stature (146 cm), low body weight (32 kg) and muscle cramps. Neurological findings of the patients included migraine-like headache, cognitive dysfunction, limitation of upward and lateral gaze, and amydriasis. Femoral muscle MRI of the daughter demonstrated decreased volume with patchy high intensity areas in the hamstrings. A muscle biopsy from the daughter showed myogenic changes with muscle fiber necrosis and regeneration, variation in fiber size, tubular aggregates in approximately 5% of fibers, and fibrous tissue proliferation. Dystrophin, dystrophin-associated proteins and dysferlin were normally expressed. Although both patients had elevated creatine kinase levels and generalized muscle wasting, muscle weakness was mild with slow progression. A certain membrane defect in the platelet and muscle fiber might be responsible for the pathogenesis of this syndrome.
|