1
Wu X, Han M, Song X, He S, Bo X, Zhu Y. COMMO: a web server for the identification and analysis of consensus gene modules across multiple methods. Bioinformatics 2023; 39:btad708. [PMID: 37995293 PMCID: PMC10713113 DOI: 10.1093/bioinformatics/btad708] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Revised: 11/05/2023] [Accepted: 11/22/2023] [Indexed: 11/25/2023] Open
Abstract
SUMMARY: A variety of computational methods have been developed to identify functionally related gene modules from genome-wide gene expression profiles. Integrating the results of these methods to identify consensus modules is a promising approach to produce more accurate and robust results. In this application note, we introduce COMMO, the first web server to identify and analyze consensus functionally related gene modules from different module detection methods. First, COMMO implements eight state-of-the-art module detection methods and two consensus clustering algorithms. Second, COMMO provides users with mRNA and protein expression data for 33 cancer types from three public databases. Users can also upload their own data for module detection. Third, users can perform functional enrichment and two types of survival analyses on the observed gene modules. Finally, COMMO provides interactive, customizable visualizations and exportable results. With its extensive analysis and interactive capabilities, COMMO offers a user-friendly solution for conducting module-based precision medicine research. AVAILABILITY AND IMPLEMENTATION: The COMMO web server is available at https://commo.ncpsb.org.cn/, with the source code available on GitHub: https://github.com/Song-xinyu/COMMO/tree/master.
Affiliation(s)
- Xiaojing Wu
  - Basic Medical School, Anhui Medical University, Hefei 230022, China
  - National Center for Protein Sciences (Beijing), Beijing Proteome Research Center, Beijing Institute of Lifeomics, Beijing 102206, China
- Mingfei Han
  - National Center for Protein Sciences (Beijing), Beijing Proteome Research Center, Beijing Institute of Lifeomics, Beijing 102206, China
- Xinyu Song
  - Center for Artificial Intelligence in Medicine, Medical Innovation Research Division of Chinese PLA General Hospital, Beijing 100853, China
- Song He
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Xiaochen Bo
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Yunping Zhu
  - Basic Medical School, Anhui Medical University, Hefei 230022, China
  - National Center for Protein Sciences (Beijing), Beijing Proteome Research Center, Beijing Institute of Lifeomics, Beijing 102206, China
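The idea of combining the outputs of several module detection methods into consensus modules, as summarized in the abstract above, can be illustrated with a co-association sketch. This is a minimal illustration under stated assumptions and not COMMO's implementation: the function names `co_association` and `consensus_modules`, the 0.5 threshold, and the connected-components grouping step are all hypothetical choices made for this example.

```python
from itertools import combinations


def co_association(partitions):
    """Fraction of partitions in which each pair of genes shares a module.

    `partitions` is a list of label lists, one per detection method;
    position i in every list refers to the same gene.
    """
    n = len(partitions[0])
    counts = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(partitions)
    return [[c / m for c in row] for row in counts]


def consensus_modules(partitions, threshold=0.5):
    """Group genes whose co-association exceeds `threshold`.

    Assumption for this sketch: consensus modules are the connected
    components of the thresholded co-association graph (union-find).
    """
    matrix = co_association(partitions)
    n = len(matrix)
    parent = list(range(n))

    def find(x):
        # Follow parent links to the component root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if matrix[i][j] > threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Three hypothetical methods assigning five genes to modules.
partitions = [
    [0, 0, 1, 1, 2],
    [0, 0, 0, 1, 1],
    [1, 1, 2, 2, 2],
]
print(consensus_modules(partitions))
```

Genes 0 and 1 co-cluster in every partition, while genes 2, 3 and 4 are linked through pairs that co-cluster in two of the three partitions, so the sketch reports two consensus groups. A hierarchical clustering of the co-association matrix would be an equally reasonable final step.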
2
Manganaro L, Bianco S, Bironzo P, Cipollini F, Colombi D, Corà D, Corti G, Doronzo G, Errico L, Falco P, Gandolfi L, Guerrera F, Monica V, Novello S, Papotti M, Parab S, Pittaro A, Primo L, Righi L, Sabbatini G, Sandri A, Vattakunnel S, Bussolino F, Scagliotti GV. Consensus clustering methodology to improve molecular stratification of non-small cell lung cancer. Sci Rep 2023; 13:7759. [PMID: 37173325 PMCID: PMC10182023 DOI: 10.1038/s41598-023-33954-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/14/2022] [Accepted: 04/21/2023] [Indexed: 05/15/2023] Open
Abstract
Recent advances in machine learning research, combined with the reduced sequencing costs enabled by modern next-generation sequencing, have paved the way for the implementation of precision medicine through routine multi-omics molecular profiling of tumours. Thus, there is an emerging need for reliable models exploiting such data to retrieve clinically useful information. Here, we introduce an original consensus clustering approach, overcoming the intrinsic instability of common clustering methods based on molecular data. This approach is applied to the case of non-small cell lung cancer (NSCLC), integrating data of an ongoing clinical study (PROMOLE) with those made available by The Cancer Genome Atlas, to define a molecular-based stratification of the patients beyond, but still preserving, histological subtyping. The resulting subgroups are biologically characterized by well-defined mutational and gene-expression profiles and are significantly related to disease-free survival (DFS). Interestingly, it was observed that (1) cluster B, characterized by a short DFS, is enriched in KEAP1 and SKP2 mutations, which makes it an ideal candidate for further studies with inhibitors, and (2) over- and under-representation of inflammation and immune system pathways in squamous-cell carcinoma subgroups could potentially be exploited to stratify patients treated with immunotherapy.
Affiliation(s)
- L Manganaro
  - aizoOn Technology Consulting S.R.L, Torino, Italy
- S Bianco
  - aizoOn Technology Consulting S.R.L, Torino, Italy
- P Bironzo
  - Medical Oncology Division at San Luigi Hospital, Department of Oncology, University of Torino, Orbassano (TO), Italy
- F Cipollini
  - aizoOn Technology Consulting S.R.L, Torino, Italy
- D Colombi
  - aizoOn Technology Consulting S.R.L, Torino, Italy
- D Corà
  - Department of Translational Medicine, Piemonte Orientale University, Novara, Italy
  - Center for Translational Research on Autoimmune and Allergic Diseases-CAAD, Novara, Italy
- G Corti
  - Department of Oncology, University of Torino, 10060, Candiolo, Italy
  - Candiolo Cancer Institute-IRCCS-FPO, 10060, Candiolo, Italy
- G Doronzo
  - Department of Oncology, University of Torino, 10060, Candiolo, Italy
  - Candiolo Cancer Institute-IRCCS-FPO, 10060, Candiolo, Italy
- L Errico
  - Division of Thoracic Surgery at AOU San Luigi, Department of Oncology, University of Torino, Orbassano (TO), Italy
- P Falco
  - aizoOn Technology Consulting S.R.L, Torino, Italy
- L Gandolfi
  - Department of Oncology, University of Torino, 10060, Candiolo, Italy
  - Candiolo Cancer Institute-IRCCS-FPO, 10060, Candiolo, Italy
- F Guerrera
  - Division of Thoracic Surgery at AOU Città della Salute e della Scienza, Department of Surgical Sciences, University of Torino, Torino, Italy
- V Monica
  - Department of Oncology, University of Torino, 10060, Candiolo, Italy
  - Candiolo Cancer Institute-IRCCS-FPO, 10060, Candiolo, Italy
- S Novello
  - Medical Oncology Division at San Luigi Hospital, Department of Oncology, University of Torino, Orbassano (TO), Italy
- M Papotti
  - Pathology Division at AOU Città della Salute e della Scienza, Department of Oncology, University of Torino, Torino, Italy
- S Parab
  - Department of Oncology, University of Torino, 10060, Candiolo, Italy
  - Candiolo Cancer Institute-IRCCS-FPO, 10060, Candiolo, Italy
- A Pittaro
  - Pathology Division at AOU Città della Salute e della Scienza, Department of Oncology, University of Torino, Torino, Italy
- L Primo
  - Department of Oncology, University of Torino, 10060, Candiolo, Italy
  - Candiolo Cancer Institute-IRCCS-FPO, 10060, Candiolo, Italy
- L Righi
  - Pathology Division at AOU San Luigi, Department of Oncology, University of Torino, Orbassano (TO), Italy
- G Sabbatini
  - aizoOn Technology Consulting S.R.L, Torino, Italy
- A Sandri
  - Division of Thoracic Surgery at AOU San Luigi, Department of Oncology, University of Torino, Orbassano (TO), Italy
- F Bussolino
  - Department of Oncology, University of Torino, 10060, Candiolo, Italy
  - Candiolo Cancer Institute-IRCCS-FPO, 10060, Candiolo, Italy
- G V Scagliotti
  - Medical Oncology Division at San Luigi Hospital, Department of Oncology, University of Torino, Orbassano (TO), Italy.
3
Banoei MM, Mahé E, Mansoor A, Stewart D, Winston BW, Habibi HR, Shabani-Rad MT. NMR-based metabolomic profiling can differentiate follicular lymphoma from benign lymph node tissues and may be predictive of outcome. Sci Rep 2022; 12:8294. [PMID: 35585165 PMCID: PMC9117304 DOI: 10.1038/s41598-022-12445-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2022] [Accepted: 05/10/2022] [Indexed: 11/10/2022] Open
Abstract
Follicular lymphoma (FL) is a cancer of B-cells, representing the second most common type of non-Hodgkin lymphoma and typically diagnosed at an advanced stage in older adults. In contrast to the wide range of available molecular genetic data, limited data on the metabolomic features of follicular lymphoma are available. Metabolomics is a promising analytical approach employing metabolites (molecules < 1 kDa in size) as potential biomarkers in cancer research. In this pilot study, we performed proton nuclear magnetic resonance spectroscopy (1H-NMR) on 29 cases of FL and 11 control patient specimens. The resulting spectra were assessed by both unsupervised and supervised statistical methods. We report significantly discriminant metabolomic models of common metabolites distinguishing FL from control tissues. Within our FL case series, we also report discriminant metabolomic signatures predictive of progression-free survival.
Affiliation(s)
- Mohammad Mehdi Banoei
  - Department of Critical Care Medicine, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Etienne Mahé
  - Department of Pathology and Laboratory Medicine, Foothills Medical Centre, Cumming School of Medicine, University of Calgary, McCaig Tower, Room MT7523, 1403 29 St NW, Calgary, AB, T2N 2T9, Canada.
- Adnan Mansoor
  - Department of Pathology and Laboratory Medicine, Foothills Medical Centre, Cumming School of Medicine, University of Calgary, McCaig Tower, Room MT7523, 1403 29 St NW, Calgary, AB, T2N 2T9, Canada
- Douglas Stewart
  - Departments of Oncology and Medicine, University of Calgary and Tom Baker Cancer Centre, Calgary, AB, Canada
- Brent W Winston
  - Departments of Critical Care Medicine, Medicine and Biochemistry and Molecular Biology, University of Calgary, Calgary, AB, Canada
- Hamid R Habibi
  - Department of Biological Sciences, University of Calgary, Calgary, AB, Canada
- Meer-Taher Shabani-Rad
  - Department of Pathology and Laboratory Medicine, Foothills Medical Centre, Cumming School of Medicine, University of Calgary, McCaig Tower, Room MT7523, 1403 29 St NW, Calgary, AB, T2N 2T9, Canada
4
Yan Z, Shen Z, Li Z, Chao Q, Kong L, Gao ZF, Li QW, Zheng HY, Zhao CF, Lu CM, Wang YW, Wang BC. Genome-wide transcriptome and proteome profiles indicate an active role of alternative splicing during de-etiolation of maize seedlings. Planta 2020; 252:60. [PMID: 32964359 DOI: 10.1007/s00425-020-03464-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2020] [Accepted: 09/12/2020] [Indexed: 06/11/2023]
Abstract
AS events alter the domain composition of encoded proteins and allow a single gene to produce more protein isoforms, so that a fixed number of genes can meet the demands of establishing photosynthesis during de-etiolation. The drastic switch from skotomorphogenic to photomorphogenic development is an excellent system to elucidate rapid developmental responses to environmental stimuli in plants. To decipher the effects of different light wavelengths on de-etiolation, we illuminated etiolated maize seedlings with blue, red, blue-red mixed and white light, respectively. We found that blue light alone has the strongest effect on photomorphogenesis and that this effect can be attributed to the higher number and expression levels of photosynthesis and chlorosynthesis proteins. Deep sequencing-based transcriptome analysis revealed gene expression changes under different light treatments and a genome-wide alteration in alternative splicing (AS) profiles. We discovered 41,188 novel transcript isoforms for annotated genes, which increases the percentage of multi-exon genes with AS to 63% in maize. We provide peptide support for all defined types of AS, especially retained introns. Further in silico prediction revealed that 58.2% of retained introns have changes in domains compared with their most similar annotated protein isoform. This suggests that AS acts as a protein function switch allowing rapid light response through the addition or removal of functional domains. The richness of novel transcripts and protein isoforms also demonstrates the potential and importance of integrating proteomics into genome annotation in maize.
Affiliation(s)
- Zhen Yan
  - Photosynthesis Research Center, Key Laboratory of Photobiology, Institute of Botany, Chinese Academy of Sciences, No. 20 Nanxincun, Xiangshan, Beijing, 100093, China
  - University of Chinese Academy of Sciences, 100049, Beijing, China
- Zhuo Shen
  - Vegetable Research Institute, Guangdong Academy of Agricultural Sciences, Guangdong Key Laboratory for New Technology Research of Vegetables, Guangzhou, 510640, China
- Zhe Li
  - Precision Scientific (Beijing) Co., Ltd., Beijing, 100085, China
- Qing Chao
  - Photosynthesis Research Center, Key Laboratory of Photobiology, Institute of Botany, Chinese Academy of Sciences, No. 20 Nanxincun, Xiangshan, Beijing, 100093, China
  - University of Chinese Academy of Sciences, 100049, Beijing, China
  - The Innovative Academy of Seed Design, Chinese Academy of Sciences, Beijing, 100039, China
- Lei Kong
  - State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, College of Life Sciences, Peking University, Beijing, 100871, China
- Zhi-Fang Gao
  - Photosynthesis Research Center, Key Laboratory of Photobiology, Institute of Botany, Chinese Academy of Sciences, No. 20 Nanxincun, Xiangshan, Beijing, 100093, China
- Qing-Wei Li
  - Beijing Botanical Garden, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
- Hai-Yan Zheng
  - Center for Advanced Biotechnology and Medicine, Biological Mass Spectrometry Facility, Rutgers University, Piscataway, NJ, 08855, USA
- Cai-Feng Zhao
  - Center for Advanced Biotechnology and Medicine, Biological Mass Spectrometry Facility, Rutgers University, Piscataway, NJ, 08855, USA
- Cong-Ming Lu
  - State Key Laboratory of Crop Biology, College of Life Sciences, Shandong Agricultural University, Taian, 271018, Shandong, China
- Ying-Wei Wang
  - Beijing Botanical Garden, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China.
- Bai-Chen Wang
  - Photosynthesis Research Center, Key Laboratory of Photobiology, Institute of Botany, Chinese Academy of Sciences, No. 20 Nanxincun, Xiangshan, Beijing, 100093, China.
  - University of Chinese Academy of Sciences, 100049, Beijing, China.
  - The Innovative Academy of Seed Design, Chinese Academy of Sciences, Beijing, 100039, China.
5
Li HM, Liu P, Zhang XJ, Li LM, Jiang HY, Yan H, Hou FH, Chen JP. Combined proteomics and transcriptomics reveal the genetic basis underlying the differentiation of skin appendages and immunity in pangolin. Sci Rep 2020; 10:14566. [PMID: 32884035 PMCID: PMC7471334 DOI: 10.1038/s41598-020-71513-w] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/22/2020] [Accepted: 08/17/2020] [Indexed: 11/18/2022] Open
Abstract
Pangolin (Manis javanica) is an interesting endangered mammal with special morphological characteristics. Here, we applied proteomics and transcriptomics to explore the differentiation of pangolin skin appendages at two developmental stages and to compare gene expression profiles between abdomen hair and dorsal scale tissues. We identified 4,311 genes and 91 proteins differentially expressed between scale-type and hair-type tissue, of which 6 genes were shared by the transcriptome and proteome. Differentiation altered the abundance of hundreds of proteins and mRNAs in the two types of skin appendages, many of which are involved in keratinocyte differentiation, epidermal cell differentiation, and multicellular organism development based on GO enrichment analysis, and in FoxO, MAPK, and p53 signalling pathways based on KEGG enrichment analysis. DEGs in scale-type tissues were also significantly enriched in immune-related terms and pathways compared with those in hair-type tissues. Thus, we propose that pangolins have a normal skin innate immune system. Compared with the abdomen, the back skin of pangolins had more genes involved in the regulation of immune function, which may be an adaptive adjustment for the vulnerability of scaly skin to infection and injury. This investigation provides a scientific basis for the study of development and immunity of pangolin skin, which may be helpful in the protection of wild pangolin in China.
Affiliation(s)
- Hui-Ming Li
  - Guangdong Key Laboratory of Animal Conservation and Resource Utilization, Guangdong Public Laboratory of Wild Animal Conservation and Utilization, Institute of Zoology, Guangdong Academy of Science, Guangzhou, 510260, China
- Ping Liu
  - Guangdong Key Laboratory of Animal Conservation and Resource Utilization, Guangdong Public Laboratory of Wild Animal Conservation and Utilization, Institute of Zoology, Guangdong Academy of Science, Guangzhou, 510260, China
- Xiu-Juan Zhang
  - Guangdong Key Laboratory of Animal Conservation and Resource Utilization, Guangdong Public Laboratory of Wild Animal Conservation and Utilization, Institute of Zoology, Guangdong Academy of Science, Guangzhou, 510260, China
- Lin-Miao Li
  - Guangdong Key Laboratory of Animal Conservation and Resource Utilization, Guangdong Public Laboratory of Wild Animal Conservation and Utilization, Institute of Zoology, Guangdong Academy of Science, Guangzhou, 510260, China
- Hai-Ying Jiang
  - Guangdong Key Laboratory of Animal Conservation and Resource Utilization, Guangdong Public Laboratory of Wild Animal Conservation and Utilization, Institute of Zoology, Guangdong Academy of Science, Guangzhou, 510260, China
- Hua Yan
  - Guangdong Provincial Key Laboratory of Silviculture, Protection and Utilization, Guangdong Academy of Forestry, Guangzhou, Guangdong Province, China
- Fang-Hui Hou
  - Guangdong Provincial Wildlife Rescue Centre, Guangzhou, Guangdong Province, China
- Jin-Ping Chen
  - Guangdong Key Laboratory of Animal Conservation and Resource Utilization, Guangdong Public Laboratory of Wild Animal Conservation and Utilization, Institute of Zoology, Guangdong Academy of Science, Guangzhou, 510260, China.
6
Tao J, Han Q, Zhou H, Diao X. Transcriptomic responses of regenerating earthworms (Eisenia foetida) to retinoic acid reveals the role of pluripotency genes. Chemosphere 2019; 226:47-59. [PMID: 30913427 DOI: 10.1016/j.chemosphere.2019.03.111] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 01/14/2019] [Revised: 03/16/2019] [Accepted: 03/16/2019] [Indexed: 06/09/2023]
Abstract
Exogenous retinoic acid (RA) delays and disturbs the regeneration of Eisenia foetida and inhibits the expression of pluripotent gene Sox2. However, studies of E. foetida conducted at the molecular level have been unable to elucidate its regeneration and mechanisms of RA effects on its regeneration. We merged existing transcriptomic data for E. foetida to generate a high-confidence set of transcriptomes. The de novo assembly of transcriptomes was performed by using the Trinity method, and functional annotations were analysed. We performed RNA-seq on four samples of regenerating tail fragments, three across a time-course (0, 3 and 7 days post amputation) and the fourth sample exposed to RA (7 days post amputation). E. foetida regeneration genes underwent significant upregulation and downregulation over the examined time periods, which may have been caused by a shared regulatory programme controlled by multiple gene families. The inhibition of RA against earthworm regeneration is likely related to the expression of these genes. Using annotation data and clustering, we also identified specific transcripts of 6 gene superfamilies enriched among genes exhibiting differential expression during regeneration periods and exhibiting the same expression patterns as those of the Sox2 gene. The regeneration transcriptome of tail fragment regeneration serves as a strong resource for investigating global expression changes that occur during regeneration and the toxicity of RA. This study offers insight for better understanding the regeneration of lower animals and molecular mechanisms of RA toxicity in invertebrates.
Affiliation(s)
- Jing Tao
  - State Key Laboratory of Marine Resource Utilization in South China Sea, Hainan University, Haikou, 570228, China; College of Life Sciences and Pharmacy, Hainan University, Haikou, 570228, China; State Key Laboratory of Virology, College of Life Sciences, Wuhan University, Wuhan, 430072, China.
- Qian Han
  - State Key Laboratory of Marine Resource Utilization in South China Sea, Hainan University, Haikou, 570228, China; College of Life Sciences and Pharmacy, Hainan University, Haikou, 570228, China.
- Hailong Zhou
  - State Key Laboratory of Marine Resource Utilization in South China Sea, Hainan University, Haikou, 570228, China; College of Life Sciences and Pharmacy, Hainan University, Haikou, 570228, China.
- Xiaoping Diao
  - State Key Laboratory of Marine Resource Utilization in South China Sea, Hainan University, Haikou, 570228, China; College of Life Science, Hainan Normal University, Haikou, 571158, China.
7
Manners HN, Roy S, Kalita JK. Intrinsic-overlapping co-expression module detection with application to Alzheimer's Disease. Comput Biol Chem 2018; 77:373-389. [PMID: 30466046 DOI: 10.1016/j.compbiolchem.2018.10.014] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 03/07/2018] [Revised: 10/28/2018] [Accepted: 10/29/2018] [Indexed: 11/18/2022]
Abstract
Genes interact with each other and may cause perturbation in the molecular pathways leading to complex diseases. Often, instead of any single gene, a subset of genes interact, forming a network, to share common biological functions. Such a subnetwork is called a functional module or motif. Identifying such modules and the central key genes in them that may be responsible for a disease may help design patient-specific drugs. In this study, we consider the neurodegenerative Alzheimer's Disease (AD) and identify potentially responsible genes from functional motif analysis. We start from the hypothesis that central genes in genetic modules are more relevant to a disease that is under investigation and identify hub genes from the modules as potential marker genes. Motifs or modules are often non-exclusive or overlapping in nature. Moreover, they sometimes show intrinsic or hierarchical distributions with overlapping functional roles. To the best of our knowledge, no prior work handles both situations in an integrated way. We propose a non-exclusive clustering approach, CluViaN (Clustering Via Network), that can detect intrinsic as well as overlapping modules from gene co-expression networks constructed using microarray expression profiles. We compare our method with existing methods to evaluate the quality of the modules extracted. CluViaN reports the presence of intrinsic and overlapping motifs in different species not reported by any other research. We further apply our method to extract significant AD-specific modules using CluViaN and rank them based on the number of genes from a module involved in the disease pathways. Finally, top central genes are identified by topological analysis of the modules. We use two different AD phenotype data sets for experimentation. We observe that central genes, namely PSEN1, APP, NDUFB2, NDUFA1, UQCR10, PPP3R1 and a few more, play significant roles in AD.
Interestingly, our experiments also find a hub gene, PML, which has recently been reported to play a role in plasticity, circadian rhythms and the response to proteins which can cause neurodegenerative disorders. MUC4, another hub gene that we find experimentally, is yet to be investigated for its potential role in AD. A software implementation of CluViaN in Java is available for download at https://sites.google.com/site/swarupnehu/publications/resources/CluViaN Software.rar.
Affiliation(s)
- Hazel Nicolette Manners
  - Department of Information Technology, North Eastern Hill University, Shillong, Meghalaya, India.
- Swarup Roy
  - Department of Computer Applications, Sikkim University, Gangtok, Sikkim, India; Department of Information Technology, North Eastern Hill University, Shillong, Meghalaya, India.
- Jugal K Kalita
  - Department of Computer Science, University of Colorado, Colorado Springs, USA.
8
Abstract
Codon usage depends on mutation bias, tRNA-mediated selection, and the need for high efficiency and accuracy in translation. One codon in a synonymous codon family is often strongly over-used, especially in highly expressed genes, which often leads to a high dN/dS ratio because dS is very small. Many different codon usage indices have been proposed to measure codon usage and codon adaptation. Sense codons can be misread by release factors and stop codons by tRNAs, which also contributes to codon usage in rare cases. This chapter outlines the conceptual framework of codon evolution, illustrates codon-specific and gene-specific codon usage indices, and presents their applications. A new index for codon adaptation that accounts for background mutation bias (Index of Translation Elongation) is presented and contrasted with the codon adaptation index (CAI), which does not consider background mutation bias. They are used to re-analyze data from a recent paper claiming that translation elongation efficiency matters little in protein production. The reanalysis disproves the claim.
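As a concrete illustration of the gene-specific index mentioned above, the standard CAI (Sharp and Li, 1987) is the geometric mean of each codon's relative adaptiveness, where a codon's weight is its count in a reference set of highly expressed genes divided by the count of the most frequent synonymous codon. The sketch below is a simplified assumption-laden version, not the chapter's own code: the function names are hypothetical, codons absent from the reference set are simply skipped rather than given a pseudocount, and the Index of Translation Elongation is not implemented.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def codons(seq):
    """Split a coding sequence into complete codons (trailing bases dropped)."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_seqs):
    """Weight w = count(codon) / max count within its synonymous family."""
    counts = Counter(c for s in reference_seqs for c in codons(s))
    weights = {}
    for codon, aa in CODON_TABLE.items():
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        if aa == "*" or len(family) == 1:
            continue  # stop codons, Met and Trp carry no usage information
        best = max(counts[c] for c in family)
        if best and counts[codon]:
            # Simplification: codons unseen in the reference are left out.
            weights[codon] = counts[codon] / best
    return weights


def cai(seq, weights):
    """Geometric mean of the weights of the scored codons in `seq`."""
    logs = [math.log(weights[c]) for c in codons(seq) if c in weights]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")


# Hypothetical reference set: alanine is encoded by GCT three times, GCC once.
w = relative_adaptiveness(["GCTGCTGCTGCC"])
print(cai("GCTGCT", w), cai("GCTGCC", w))
```

A gene that uses only the preferred codon scores 1, while each use of a less frequent synonym pulls the geometric mean down. Because the weights come from raw reference counts alone, this version shares the limitation the abstract attributes to CAI: it does not separate selection from background mutation bias.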
9
A Survey of Data Mining and Deep Learning in Bioinformatics. J Med Syst 2018; 42:139. [DOI: 10.1007/s10916-018-1003-9] [Citation(s) in RCA: 81] [Impact Index Per Article: 13.5] [Received: 03/28/2018] [Accepted: 06/21/2018] [Indexed: 12/13/2022]
10
Aschenbrenner AC, Bassler K, Brondolin M, Bonaguro L, Carrera P, Klee K, Ulas T, Schultze JL, Hoch M. A cross-species approach to identify transcriptional regulators exemplified for Dnajc22 and Hnf4a. Sci Rep 2017; 7:4056. [PMID: 28642491 PMCID: PMC5481429 DOI: 10.1038/s41598-017-04370-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/16/2016] [Accepted: 05/05/2017] [Indexed: 12/03/2022] Open
Abstract
There is an enormous need to make better use of the ever-increasing wealth of publicly available genomic information and to utilize the tremendous progress in computational approaches in the life sciences. Transcriptional regulation of protein-coding genes is a major mechanism of controlling cellular functions. However, the myriad of transcription factors potentially controlling transcription of any given gene often makes it difficult to quickly identify the biologically relevant transcription factors. Here, we report on the identification of Hnf4a as a major transcription factor of the so far unstudied DnaJ heat shock protein family (Hsp40) member C22 (Dnajc22). We propose an approach utilizing recent advances in computational biology and the wealth of publicly available genomic information guiding the identification of potential transcription factor candidates together with wet-lab experiments validating computational models. More specifically, the combined use of co-expression analyses based on self-organizing maps with sequence-based transcription factor binding prediction led to the identification of Hnf4a as the potential transcriptional regulator for Dnajc22, which was further corroborated using publicly available datasets on Hnf4a. Following this procedure, we determined its functional binding site in the murine Dnajc22 locus using ChIP-qPCR and luciferase assays and verified this regulatory loop in fruitfly, zebrafish, and humans.
Affiliation(s)
- A C Aschenbrenner
  - Developmental Genetics & Molecular Physiology, Life & Medical Sciences Institute (LIMES), University of Bonn, Bonn, Germany.
- K Bassler
  - Genomics and Immunoregulation, Life & Medical Sciences Institute (LIMES), University of Bonn, Bonn, Germany
- M Brondolin
  - Developmental Genetics & Molecular Physiology, Life & Medical Sciences Institute (LIMES), University of Bonn, Bonn, Germany
  - Department of Craniofacial Development and Stem Cell Biology, Dental Institute, King's College London, SE1 9RT, London, United Kingdom
- L Bonaguro
  - Developmental Genetics & Molecular Physiology, Life & Medical Sciences Institute (LIMES), University of Bonn, Bonn, Germany
- P Carrera
  - Developmental Genetics & Molecular Physiology, Life & Medical Sciences Institute (LIMES), University of Bonn, Bonn, Germany
- K Klee
  - Genomics and Immunoregulation, Life & Medical Sciences Institute (LIMES), University of Bonn, Bonn, Germany
- T Ulas
  - Genomics and Immunoregulation, Life & Medical Sciences Institute (LIMES), University of Bonn, Bonn, Germany
- J L Schultze
  - Genomics and Immunoregulation, Life & Medical Sciences Institute (LIMES), University of Bonn, Bonn, Germany
  - Single Cell Genomics and Epigenomics Unit at the German Center for Neurodegenerative Diseases and the University of Bonn, 53175, Bonn, Germany
- M Hoch
  - Developmental Genetics & Molecular Physiology, Life & Medical Sciences Institute (LIMES), University of Bonn, Bonn, Germany
11
Kanzaki N, Kataoka T, Etani R, Sasaoka K, Kanagawa A, Yamaoka K. Analysis of liver damage from radon, X-ray, or alcohol treatments in mice using a self-organizing map. Journal of Radiation Research 2017; 58:33-40. [PMID: 27614200 PMCID: PMC5321189 DOI: 10.1093/jrr/rrw083] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 04/27/2016] [Revised: 06/24/2016] [Accepted: 06/30/2016] [Indexed: 05/30/2023]
Abstract
In our previous studies, we found that low-dose radiation inhibits oxidative stress-induced diseases due to increased antioxidants. Although these effects of low-dose radiation were demonstrated, further research was needed to clarify the effects. However, the analysis of oxidative stress is challenging, especially that of low levels of oxidative stress, because antioxidative substances are intricately involved. Thus, we proposed an approach for analysing oxidative liver damage using a self-organizing map (SOM), a novel and comprehensive technique for evaluating hepatic and antioxidative function. Mice were treated with radon inhalation, irradiated with X-rays, or subjected to intraperitoneal injection of alcohol. We evaluated the oxidative damage levels in the liver from the SOM results for hepatic function and antioxidative substances. The results showed that the effects of low-dose irradiation (radon inhalation at a concentration of up to 2000 Bq/m3, or X-irradiation at a dose of up to 2.0 Gy) were comparable with the effect of alcohol administration at 0.5 g/kg bodyweight. Analysis using the SOM to discriminate small changes was made possible by its ability to 'learn' to adapt to unexpected changes. Moreover, when using a spherical SOM, the method comprehensively examined liver damage by radon, X-ray, and alcohol. We found that the types of liver damage caused by radon, X-rays, and alcohol have different characteristics. Therefore, our approach would be useful as a method for evaluating oxidative liver damage caused by radon, X-rays and alcohol.
Affiliation(s)
- Norie Kanzaki
- Graduate School of Health Sciences, Okayama University, 5-1 Shikatacho, 2-chome, Kita-ku, Okayama-shi, Okayama 700-8558, Japan
- Takahiro Kataoka
- Graduate School of Health Sciences, Okayama University, 5-1 Shikatacho, 2-chome, Kita-ku, Okayama-shi, Okayama 700-8558, Japan
- Reo Etani
- Graduate School of Health Sciences, Okayama University, 5-1 Shikatacho, 2-chome, Kita-ku, Okayama-shi, Okayama 700-8558, Japan
- Kaori Sasaoka
- Graduate School of Health Sciences, Okayama University, 5-1 Shikatacho, 2-chome, Kita-ku, Okayama-shi, Okayama 700-8558, Japan
- Akihiro Kanagawa
- Faculty of Computer Science and System Engineering, Okayama Prefectural University, 111 Kuboki, Soja-shi, Okayama 719-1197, Japan
- Kiyonori Yamaoka
- Graduate School of Health Sciences, Okayama University, 5-1 Shikatacho, 2-chome, Kita-ku, Okayama-shi, Okayama 700-8558, Japan
12
Seno A, Kasai T, Ikeda M, Vaidyanath A, Masuda J, Mizutani A, Murakami H, Ishikawa T, Seno M. Characterization of Gene Expression Patterns among Artificially Developed Cancer Stem Cells Using Spherical Self-Organizing Map. Cancer Inform 2016; 15:163-78. [PMID: 27559294 PMCID: PMC4988459 DOI: 10.4137/cin.s39839] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 04/04/2016] [Revised: 05/15/2016] [Accepted: 05/30/2016] [Indexed: 12/20/2022] Open
Abstract
We performed gene expression microarray analysis coupled with spherical self-organizing map (sSOM) for artificially developed cancer stem cells (CSCs). The CSCs were developed from human induced pluripotent stem cells (hiPSCs) with the conditioned media of cancer cell lines, whereas the CSCs were induced from primary cell culture of human cancer tissues with defined factors (OCT3/4, SOX2, and KLF4). These cells commonly expressed human embryonic stem cell (hESC)/hiPSC-specific genes (POU5F1, SOX2, NANOG, LIN28, and SALL4) at a level equivalent to those of control hiPSC 201B7. The sSOM with an unsupervised method demonstrated that the CSCs could be divided into three groups based on their culture conditions and original cancer tissues. Furthermore, with a supervised method, sSOM nominated the TMED9, RNASE1, NGFR, ST3GAL1, TNS4, BTG2, SLC16A3, CD177, CES1, GDF15, STMN2, FAM20A, NPPB, CD99, MYL7, PRSS23, AHNAK, and LOC152573 genes as commonly upregulated among the CSCs compared to hiPSC, suggesting the gene signature of the CSCs.
Affiliation(s)
- Akimasa Seno
- Laboratory of Nano-Biotechnology, Department of Medical Bioengineering Science, Graduate School of Natural Science and Technology, Okayama University, Kita-ku, Okayama, Japan
- Tomonari Kasai
- Laboratory of Nano-Biotechnology, Department of Medical Bioengineering Science, Graduate School of Natural Science and Technology, Okayama University, Kita-ku, Okayama, Japan
- Masashi Ikeda
- Laboratory of Nano-Biotechnology, Department of Medical Bioengineering Science, Graduate School of Natural Science and Technology, Okayama University, Kita-ku, Okayama, Japan
- Arun Vaidyanath
- Laboratory of Nano-Biotechnology, Department of Medical Bioengineering Science, Graduate School of Natural Science and Technology, Okayama University, Kita-ku, Okayama, Japan
- Junko Masuda
- Laboratory of Nano-Biotechnology, Department of Medical Bioengineering Science, Graduate School of Natural Science and Technology, Okayama University, Kita-ku, Okayama, Japan
- Akifumi Mizutani
- Laboratory of Nano-Biotechnology, Department of Medical Bioengineering Science, Graduate School of Natural Science and Technology, Okayama University, Kita-ku, Okayama, Japan
- Hiroshi Murakami
- Laboratory of Nano-Biotechnology, Department of Medical Bioengineering Science, Graduate School of Natural Science and Technology, Okayama University, Kita-ku, Okayama, Japan
- Tetsuya Ishikawa
- Cell Biology, Core Facilities for Research and Innovative Medicine, National Cancer Center Research Institute, Chuo-ku, Tokyo, Japan; Central Animal Division, Fundamental Innovative Oncology Core Center, National Cancer Center Research Institute, Chuo-ku, Tokyo, Japan
- Masaharu Seno
- Laboratory of Nano-Biotechnology, Department of Medical Bioengineering Science, Graduate School of Natural Science and Technology, Okayama University, Kita-ku, Okayama, Japan
13
Udatha DBRKG, Topakas E, Salazar M, Olsson L, Andersen MR, Panagiotou G. Deciphering the signaling mechanisms of the plant cell wall degradation machinery in Aspergillus oryzae. BMC Systems Biology 2015; 9:77. [PMID: 26573537 PMCID: PMC4647334 DOI: 10.1186/s12918-015-0224-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 06/29/2015] [Accepted: 10/28/2015] [Indexed: 12/30/2022]
Abstract
Background The gene expression and secretion of fungal lignocellulolytic enzymes are tightly controlled at the transcription level using independent mechanisms to respond to distinct inducers from plant biomass. An advanced systems-level understanding of transcriptional regulatory networks is required to rationally engineer filamentous fungi for more efficient bioconversion of different types of biomass. Results In this study we focused on ten chemically defined inducers to drive expression of cellulases, hemicellulases and accessory enzymes in the model filamentous fungus Aspergillus oryzae and shed light on the complex network of transcriptional activators required. The chemical diversity analysis of the inducers, based on 186 chemical descriptors calculated from the structure, resulted in three clusters; however, the global, metabolic and extracellular protein transcription of the A. oryzae genome were only partially explained by the chemical similarity of the enzyme inducers. Genes encoding enzymes that have attracted considerable interest, such as cellobiose dehydrogenases and copper-dependent polysaccharide mono-oxygenases, presented a substrate-specific induction. Several homology-model structures were derived using ab-initio multiple threading alignment in our effort to elucidate the interplay of transcription factors involved in regulating plant-deconstructing enzymes and metabolites. Systematic investigation of metabolite-protein interactions, using the 814 unique reactants involved in 2360 reactions in the genome scale metabolic network of A. oryzae, was performed through a two-step molecular docking against the binding pockets of the transcription factors AoXlnR and AoAmyR.
A total of six metabolites, viz. sulfite (H2SO3), sulfate (SLF), uroporphyrinogen III (UPGIII), ethanolamine phosphate (PETHM), D-glyceraldehyde 3-phosphate (T3P1) and taurine (TAUR), were found to be strong binders, whereas the genes involved in the metabolic reactions in which these metabolites appear were found to be significantly differentially expressed when comparing the inducers with glucose. Conclusions Based on our observations, we believe that specific binding of sulfite to the regulator of the cellulase gene expression, AoXlnR, may be the molecular basis for the connection of sulfur metabolism and cellulase gene expression in filamentous fungi. Further characterization and manipulation of the regulatory network components identified in this study will enable rational engineering of industrial strains for improved production of the sophisticated set of enzymes necessary to break down chemically divergent plant biomass. Electronic supplementary material The online version of this article (doi:10.1186/s12918-015-0224-5) contains supplementary material, which is available to authorized users.
Affiliation(s)
- D B R K Gupta Udatha
- The Norwegian Structural Biology Centre, Department of Chemistry, Faculty of Science and Technology, University of Tromsø, N-9037, Tromsø, Norway; Section for Genetics and Evolutionary Biology, Department of Biosciences, University of Oslo, Oslo, Norway.
- Evangelos Topakas
- Biotechnology Laboratory, School of Chemical Engineering, National Technical University of Athens, 5 Iroon Polytechniou Str., Zografou Campus, Athens, 15780, Greece.
- Margarita Salazar
- Department of Chemical and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden; Wallenberg Wood Science Center, Chalmers University of Technology, SE-412 96, Gothenburg, Sweden.
- Lisbeth Olsson
- Department of Chemical and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden; Wallenberg Wood Science Center, Chalmers University of Technology, SE-412 96, Gothenburg, Sweden.
- Mikael R Andersen
- Department of Systems Biology, Technical University of Denmark, Søltofts plads 223, DK-2800, Lyngby, Denmark.
- Gianni Panagiotou
- School of Biological Sciences, The University of Hong Kong, Kadoorie Biological Sciences Building, Hong Kong, China.
14
LacSubPred: predicting subtypes of Laccases, an important lignin metabolism-related enzyme class, using in silico approaches. BMC Bioinformatics 2014; 15 Suppl 11:S15. [PMID: 25350584 PMCID: PMC4251044 DOI: 10.1186/1471-2105-15-s11-s15] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 11/18/2022] Open
Abstract
Background Laccases (E.C. 1.10.3.2) are multi-copper oxidases that have gained importance in many industries such as biofuels, pulp production, textile dye bleaching, bioremediation, and food production. Their usefulness stems from the ability to act on a diverse range of phenolic compounds such as o-/p-quinols, aminophenols, polyphenols, polyamines, aryl diamines, and aromatic thiols. Despite acting on a wide range of compounds as a family, individual Laccases often exhibit distinctive and varied substrate ranges. This is likely due to Laccases' involvement in many metabolic roles across diverse taxa. Classification systems for multi-copper oxidases have been developed using multiple sequence alignments; however, these systems seem to largely follow species taxonomy rather than substrate ranges, enzyme properties, or specific function. It has been suggested that the roles and substrates of various Laccases are related to their optimal pH. This is consistent with the observation that fungal Laccases usually prefer acidic conditions, whereas plant and bacterial Laccases prefer basic conditions. Based on these observations, we hypothesize that a descriptor-based unsupervised learning system could generate a homology-independent classification system for better describing the functional properties of Laccases. Results In this study, we first utilized an unsupervised learning approach to develop a novel homology-independent Laccase classification system. From the descriptors considered, physicochemical properties showed the best performance. Physicochemical properties divided the Laccases into twelve subtypes. Analysis of the clusters using a t-test revealed that the majority of the physicochemical descriptors had statistically significant differences between the classes. Feature selection identified the most important features as negatively charged residues, the peptide isoelectric point, and acidic or amidic residues.
Secondly, to allow for classification of new Laccases, a supervised learning system was developed from the clusters. The models showed high performance with an overall accuracy of 99.03%, error of 0.49%, MCC of 0.9367, precision of 94.20%, sensitivity of 94.20%, and specificity of 99.47% in a 5-fold cross-validation test. In an independent test, our models still provide a high accuracy of 97.98%, error rate of 1.02%, MCC of 0.8678, precision of 87.88%, sensitivity of 87.88% and specificity of 98.90%. Conclusion This study provides a useful classification system for better understanding of Laccases from their physicochemical properties perspective. We also developed a publicly available web tool for the characterization of Laccase protein sequences (http://lacsubpred.bioinfo.ucr.edu/). Finally, the programs used in the study are made available for researchers interested in applying the system to other enzyme classes (https://github.com/tweirick/SubClPred).
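The performance measures quoted above (accuracy, precision, sensitivity, specificity, MCC) all follow from the four cells of a confusion matrix. The sketch below computes them for a single one-vs-rest class; how the study averaged over its twelve subtypes is not stated in the abstract, and the counts used here are hypothetical, not taken from the paper.

```python
import math


def _ratio(num, den):
    """Division that returns 0.0 instead of failing on an empty denominator."""
    return num / den if den else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Standard measures from true/false positive and negative counts."""
    total = tp + fp + tn + fn
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _ratio(tp + tn, total),
        "precision": _ratio(tp, tp + fp),
        "sensitivity": _ratio(tp, tp + fn),  # also called recall
        "specificity": _ratio(tn, tn + fp),
        "mcc": _ratio(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical counts for one subtype versus all others.
metrics = binary_metrics(tp=90, fp=10, tn=880, fn=20)
```

With many more negatives than positives, as in a one-vs-rest setting, accuracy and specificity stay high even when sensitivity drops, which is why MCC is usually reported alongside them.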
15
Inano R, Oishi N, Kunieda T, Arakawa Y, Yamao Y, Shibata S, Kikuchi T, Fukuyama H, Miyamoto S. Voxel-based clustered imaging by multiparameter diffusion tensor images for glioma grading. NeuroImage: Clinical 2014; 5:396-407. [PMID: 25180159 PMCID: PMC4145535 DOI: 10.1016/j.nicl.2014.08.001] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Received: 06/02/2014] [Revised: 07/15/2014] [Accepted: 08/05/2014] [Indexed: 11/26/2022]
Abstract
Gliomas are the most common intra-axial primary brain tumour; therefore, predicting glioma grade would influence therapeutic strategies. Although several methods based on single or multiple parameters from diagnostic images exist, a definitive method for pre-operatively determining glioma grade remains unknown. We aimed to develop an unsupervised method using multiple parameters from pre-operative diffusion tensor images for obtaining a clustered image that could enable visual grading of gliomas. Fourteen patients with low-grade gliomas and 19 with high-grade gliomas underwent diffusion tensor imaging and three-dimensional T1-weighted magnetic resonance imaging before tumour resection. Seven features including diffusion-weighted imaging, fractional anisotropy, first eigenvalue, second eigenvalue, third eigenvalue, mean diffusivity and raw T2 signal with no diffusion weighting, were extracted as multiple parameters from diffusion tensor imaging. We developed a two-level clustering approach for a self-organizing map followed by the K-means algorithm to enable unsupervised clustering of a large number of input vectors with the seven features for the whole brain. The vectors were grouped by the self-organizing map as protoclusters, which were classified into the smaller number of clusters by K-means to make a voxel-based diffusion tensor-based clustered image. Furthermore, we also determined if the diffusion tensor-based clustered image was really helpful for predicting pre-operative glioma grade in a supervised manner. The ratio of each class in the diffusion tensor-based clustered images was calculated from the regions of interest manually traced on the diffusion tensor imaging space, and the common logarithmic ratio scales were calculated. We then applied support vector machine as a classifier for distinguishing between low- and high-grade gliomas. 
Consequently, the sensitivity, specificity, accuracy and area under the curve of receiver operating characteristic curves from the 16-class diffusion tensor-based clustered images that showed the best performance for differentiating high- and low-grade gliomas were 0.848, 0.745, 0.804 and 0.912, respectively. Furthermore, the log-ratio value of each class of the 16-class diffusion tensor-based clustered images was compared between low- and high-grade gliomas, and the log-ratio values of classes 14, 15 and 16 in the high-grade gliomas were significantly higher than those in the low-grade gliomas (p < 0.005, p < 0.001 and p < 0.001, respectively). These classes comprised different patterns of the seven diffusion tensor imaging-based parameters. The results suggest that the multiple diffusion tensor imaging-based parameters from the voxel-based diffusion tensor-based clustered images can help differentiate between low- and high-grade gliomas. We have developed a novel unsupervised method for voxel-based clustered imaging. Each class ratio in clustered images differentiated high from low-grade gliomas. The 16-class clustered images showed the best performance for the differentiation. Each class comprised different patterns of the seven diffusion tensor-based features. Multiple parameters from diffusion tensor images are useful for glioma grading.
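The per-class log-ratio features described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the "ratio" is taken to be the fraction of region-of-interest voxels assigned to each class, a small pseudocount (an assumption) keeps empty classes finite, and the function name and toy labels are hypothetical.

```python
import math
from collections import Counter


def log_class_ratios(roi_labels, n_classes, pseudocount=1e-6):
    """Common-log fraction of ROI voxels in each class (classes numbered from 1)."""
    if not roi_labels:
        raise ValueError("ROI contains no voxels")
    counts = Counter(roi_labels)
    total = len(roi_labels)
    return [
        math.log10(counts.get(k, 0) / total + pseudocount)
        for k in range(1, n_classes + 1)
    ]


# Hypothetical ROI with four voxels in a 16-class clustered image.
features = log_class_ratios([1, 1, 2, 16], n_classes=16)
```

A vector like `features` (one value per class, per patient) is the kind of input that could then be passed to a classifier such as a support vector machine.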
Key Words
- ADC, apparent diffusion coefficient
- AUC, area under the curve
- BET, FSL's Brain extraction Tool
- BLSOM, batch-learning self-organizing map
- CI, confidence interval
- CNS, central nervous system
- DTI, diffusion tensor imaging
- DTcI, diffusion tensor-based clustered image
- DWI, diffusion-weighted imaging
- Diffusion tensor imaging
- EPI, echo planar image
- FA, fractional anisotropy
- FDT, FMRIB's diffusion toolbox
- FLAIR, fluid-attenuated inversion-recovery
- FSL, FMRIB Software Library
- Glioma grading
- HGG, high-grade glioma
- K-means
- KM++, K-means++
- KM, K-means
- L1, first eigenvalue
- L2, second eigenvalue
- L3, third eigenvalue
- LGG, low-grade glioma
- LOOCV, leave-one-out cross-validation
- MD, mean diffusivity
- MP-RAGE, magnetization-prepared rapid gradient-echo
- MRI, magnetic resonance imaging
- PET, positron emission tomography
- ROC, receiver operating characteristic
- ROI, region of interest
- S0, raw T2 signal with no diffusion weighting
- SOM, self-organizing map
- SVM, support vector machine
- Self-organizing map
- Support vector machine
- T1WI, T1-weighted image
- T1WIce, contrast-enhanced T1-weighted image
- T2WI, T2-weighted image
- Voxel-based clustering
- WHO, World Health Organization
Affiliation(s)
- Rika Inano
- Department of Neurosurgery, Kyoto University Graduate School of Medicine, Kyoto, Japan; Human Brain Research Center, Kyoto University Graduate School of Medicine, Kyoto, Japan
- Naoya Oishi
- Human Brain Research Center, Kyoto University Graduate School of Medicine, Kyoto, Japan
- Takeharu Kunieda
- Department of Neurosurgery, Kyoto University Graduate School of Medicine, Kyoto, Japan
- Yoshiki Arakawa
- Department of Neurosurgery, Kyoto University Graduate School of Medicine, Kyoto, Japan
- Yukihiro Yamao
- Department of Neurosurgery, Kyoto University Graduate School of Medicine, Kyoto, Japan; Human Brain Research Center, Kyoto University Graduate School of Medicine, Kyoto, Japan
- Sumiya Shibata
- Department of Neurosurgery, Kyoto University Graduate School of Medicine, Kyoto, Japan; Human Brain Research Center, Kyoto University Graduate School of Medicine, Kyoto, Japan
- Takayuki Kikuchi
- Department of Neurosurgery, Kyoto University Graduate School of Medicine, Kyoto, Japan
- Hidenao Fukuyama
- Human Brain Research Center, Kyoto University Graduate School of Medicine, Kyoto, Japan
- Susumu Miyamoto
- Department of Neurosurgery, Kyoto University Graduate School of Medicine, Kyoto, Japan
16
Milone DH, Stegmayer G, López M, Kamenetzky L, Carrari F. Improving clustering with metabolic pathway data. BMC Bioinformatics 2014; 15:101. [PMID: 24717120 PMCID: PMC4002909 DOI: 10.1186/1471-2105-15-101] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 04/30/2013] [Accepted: 03/25/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND It is a common practice in bioinformatics to validate each group returned by a clustering algorithm through manual analysis, according to a-priori biological knowledge. This procedure helps in finding functionally related patterns to propose hypotheses for their behavior and the biological processes involved. Therefore, this knowledge is used only as a second step, after data are just clustered according to their expression patterns. Thus, it could be very useful to be able to improve the clustering of biological data by incorporating prior knowledge into the cluster formation itself, in order to enhance the biological value of the clusters. RESULTS A novel training algorithm for clustering is presented, which evaluates the biological internal connections of the data points while the clusters are being formed. Within this training algorithm, the calculation of distances among data points and neuron centroids includes a new term based on information from well-known metabolic pathways. The standard self-organizing map (SOM) training and the biologically-inspired SOM (bSOM) training were tested with two real data sets of transcripts and metabolites from Solanum lycopersicum and Arabidopsis thaliana species. Classical data mining validation measures were used to evaluate the clustering solutions obtained by both algorithms. Moreover, a new measure that takes into account the biological connectivity of the clusters was applied. The results of bSOM show important improvements in the convergence and performance for the proposed clustering method in comparison to standard SOM training, in particular, from the application point of view. CONCLUSIONS Analyses of the clusters obtained with bSOM indicate that including biological information during training can certainly increase the biological value of the clusters found with the proposed method.
It is worth highlighting that this has effectively improved the results, which can simplify their further analysis. The algorithm is available as a web-demo at http://fich.unl.edu.ar/sinc/web-demo/bsom-lite/. The source code and the data sets supporting the results of this article are available at http://sourceforge.net/projects/sourcesinc/files/bsom.
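To make the idea of a pathway-aware distance concrete, here is a rough sketch of a distance between a data point and a neuron centroid that adds a biological penalty to the usual Euclidean term. This is not the published bSOM formula, which the abstract does not give. The penalty definition, the weighting parameter `alpha`, and all gene and pathway names are assumptions chosen for illustration.

```python
import math


def pathway_term(gene, node_members, pathways):
    """Fraction of a node's current members sharing no annotated pathway with gene."""
    if not node_members:
        return 0.0
    own = pathways.get(gene, set())
    unrelated = sum(1 for m in node_members if not (own & pathways.get(m, set())))
    return unrelated / len(node_members)


def combined_distance(x, centroid, gene, node_members, pathways, alpha=0.5):
    """Euclidean distance plus an assumed pathway-based penalty weighted by alpha."""
    return math.dist(x, centroid) + alpha * pathway_term(gene, node_members, pathways)


# Hypothetical pathway annotations.
pathways = {
    "g1": {"glycolysis"},
    "g2": {"glycolysis", "TCA cycle"},
    "g3": {"photosynthesis"},
}
d = combined_distance([0.0, 0.0], [3.0, 4.0], "g1", ["g2", "g3"], pathways)
```

With `alpha = 0` this reduces to standard SOM matching; larger values make a point prefer neurons whose members it is biologically connected to, even at some cost in expression similarity.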
Affiliation(s)
- Diego H Milone
- Research Center for Signals, Systems and Computational Intelligence, sinc(i), FICH-UNL, CONICET, Ciudad Universitaria UNL, (3000) Santa Fe, Argentina.
17
Maksimov P, Zerweck J, Dubey JP, Pantchev N, Frey CF, Maksimov A, Reimer U, Schutkowski M, Hosseininejad M, Ziller M, Conraths FJ, Schares G. Serotyping of Toxoplasma gondii in cats (Felis domesticus) reveals predominance of type II infections in Germany. PLoS One 2013; 8:e80213. [PMID: 24244652 PMCID: PMC3820565 DOI: 10.1371/journal.pone.0080213] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 02/06/2013] [Accepted: 10/01/2013] [Indexed: 11/19/2022] Open
Abstract
Background Cats are definitive hosts of Toxoplasma gondii and play an essential role in the epidemiology of this parasite. The study aims at clarifying whether cats are able to develop specific antibodies against different clonal types of T. gondii and to determine by serotyping the T. gondii clonal types prevailing in cats as intermediate hosts in Germany. Methodology To establish a peptide-microarray serotyping test, we identified 24 suitable peptides using serological T. gondii positive (n=21) and negative cat sera (n=52). To determine the clonal type-specific antibody response of cats in Germany, 86 field sera from T. gondii seropositive naturally infected cats were tested. In addition, we analyzed the antibody response in cats experimentally infected with non-canonical T. gondii types (n=7). Findings Positive cat reference sera reacted predominantly with peptides harbouring amino acid sequences specific for the clonal T. gondii type the cats were infected with. When the array was applied to field sera from Germany, 98.8% (85/86) of naturally-infected cats recognized similar peptide patterns as T. gondii type II reference sera and showed the strongest reaction intensities with clonal type II-specific peptides. In addition, naturally infected cats recognized type II-specific peptides significantly more frequently than peptides of other type-specificities. Cats infected with non-canonical types showed the strongest reactivity with peptides presenting amino-acid sequences specific for both, type I and type III. Conclusions Cats are able to mount a clonal type-specific antibody response against T. gondii. Serotyping revealed for most seropositive field sera patterns resembling those observed after clonal type II-T. gondii infection. This finding is in accord with our previous results on the occurrence of T. gondii clonal types in oocysts shed by cats in Germany.
Affiliation(s)
- Pavlo Maksimov
- Institute of Epidemiology, Friedrich-Loeffler-Institut, Federal Research Institute for Animal Health, Greifswald - Insel Riems, Germany
- Jitender P. Dubey
- Animal Parasitic Diseases Laboratory, USDA, ARS, ANRI, BARC-East, Beltsville, Maryland, United States of America
- Nikola Pantchev
- Vet Med Labor GmbH, Division of IDEXX Laboratories, Ludwigsburg, Germany
- Caroline F. Frey
- Institute of Parasitology, Vetsuisse Faculty, University of Bern, Bern, Switzerland
- Aline Maksimov
- Institute of Epidemiology, Friedrich-Loeffler-Institut, Federal Research Institute for Animal Health, Greifswald - Insel Riems, Germany
- Ulf Reimer
- JPT, Peptide Technologies GmbH, Berlin, Germany
- Mike Schutkowski
- Institute for Biochemistry & Biotechnology, Department of Enzymology, Martin-Luther-University Halle-Wittenberg, Halle (Saale), Germany
- Mario Ziller
- Workgroup Biomathematics, Friedrich-Loeffler-Institut, Federal Research Institute for Animal Health, Greifswald - Insel Riems, Germany
- Franz J. Conraths
- Institute of Epidemiology, Friedrich-Loeffler-Institut, Federal Research Institute for Animal Health, Greifswald - Insel Riems, Germany
- Gereon Schares
- Institute of Epidemiology, Friedrich-Loeffler-Institut, Federal Research Institute for Animal Health, Greifswald - Insel Riems, Germany
18
Kutsuna N, Higaki T, Matsunaga S, Otsuki T, Yamaguchi M, Fujii H, Hasezawa S. Active learning framework with iterative clustering for bioimage classification. Nat Commun 2013; 3:1032. [PMID: 22929789 PMCID: PMC3432472 DOI: 10.1038/ncomms2030] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Received: 05/24/2012] [Accepted: 07/30/2012] [Indexed: 11/19/2022] Open
Abstract
Advances in imaging systems have yielded a flood of images into the research field. A semi-automated facility can reduce the laborious task of classifying this large number of images. Here we report the development of a novel framework, CARTA (Clustering-Aided Rapid Training Agent), applicable to bioimage classification that facilitates annotation and selection of features. CARTA comprises an active learning algorithm combined with a genetic algorithm and self-organizing map. The framework provides an easy and interactive annotation method and accurate classification. The CARTA framework enables classification of subcellular localization, mitotic phases and discrimination of apoptosis in images of plant and human cells with an accuracy level greater than or equal to annotators. CARTA can be applied to classification of magnetic resonance imaging of cancer cells or multicolour time-course images after surgery. Furthermore, CARTA can support development of customized features for classification, high-throughput phenotyping and application of various classification schemes dependent on the user's purpose. Semi-automated imaging systems help with the task of classifying large numbers of biological images. This study presents a novel framework—CARTA—with an active learning algorithm combined with a genetic algorithm, whose applications include the classification of magnetic resonance imaging of cancer cells.
Affiliation(s)
- Natsumaro Kutsuna
- Department of Integrated Biosciences, Graduate School of Frontier Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Chiba 277-8562, Japan
19
Stegmayer G, Gerard M, Milone D. Data Mining Over Biological Datasets: An Integrated Approach Based on Computational Intelligence. IEEE Computational Intelligence Magazine 2012. [DOI: 10.1109/mci.2012.2215122] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Indexed: 11/09/2022]
20
Song WM, Di Matteo T, Aste T. Hierarchical information clustering by means of topologically embedded graphs. PLoS One 2012; 7:e31929. [PMID: 22427814 PMCID: PMC3302882 DOI: 10.1371/journal.pone.0031929] [Citation(s) in RCA: 70] [Impact Index Per Article: 5.8] [Received: 08/25/2011] [Accepted: 01/17/2012] [Indexed: 11/19/2022] Open
Abstract
We introduce a graph-theoretic approach to extract clusters and hierarchies in complex data-sets in an unsupervised and deterministic manner, without the use of any prior information. This is achieved by building topologically embedded networks containing the subset of most significant links and analyzing the network structure. For a planar embedding, this method provides both the intra-cluster hierarchy, which describes the way clusters are composed, and the inter-cluster hierarchy, which describes how clusters gather together. We discuss performance, robustness and reliability of this method by first investigating several artificial data-sets, finding that it can significantly outperform other established approaches. Then we show that our method can successfully differentiate meaningful clusters and hierarchies in a variety of real data-sets. In particular, we find that the application to gene expression patterns of lymphoma samples uncovers biologically significant groups of genes which play key roles in diagnosis, prognosis and treatment of some of the most relevant human lymphoid malignancies.
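As a loose illustration of keeping only the most significant links while staying planar, the sketch below builds a maximum spanning tree over pairwise similarities with Kruskal's algorithm and then reads off clusters by dropping weak tree edges. A spanning tree is the simplest planar graph of this kind and is used here only as a stand-in. The method in the paper builds a richer embedded network that the abstract does not specify, and the similarity values below are hypothetical.

```python
def _find(parent, i):
    """Union-find root lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def max_spanning_tree(n, edges):
    """Kruskal's algorithm on (weight, u, v) edges, strongest links first."""
    parent = list(range(n))
    tree = []
    for w, u, v in sorted(edges, reverse=True):
        ru, rv = _find(parent, u), _find(parent, v)
        if ru != rv:  # keep the link only if it joins two separate components
            parent[ru] = rv
            tree.append((w, u, v))
    return tree


def clusters(n, edges, min_weight):
    """Connected components after discarding edges weaker than min_weight."""
    parent = list(range(n))
    for w, u, v in edges:
        if w >= min_weight:
            parent[_find(parent, u)] = _find(parent, v)
    groups = {}
    for i in range(n):
        groups.setdefault(_find(parent, i), []).append(i)
    return sorted(groups.values())


# Hypothetical pairwise similarities between four items.
similarities = [(0.9, 0, 1), (0.8, 2, 3), (0.2, 1, 2),
                (0.1, 0, 2), (0.05, 0, 3), (0.15, 1, 3)]
tree = max_spanning_tree(4, similarities)
groups = clusters(4, tree, min_weight=0.5)
```

Lowering `min_weight` merges the groups step by step, which gives a simple picture of how clusters gather into a hierarchy.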
Affiliation(s)
- Won-Min Song
- Applied Mathematics, Research School of Physics and Engineering, The Australian National University, Canberra, Australia
- T. Di Matteo
- Applied Mathematics, Research School of Physics and Engineering, The Australian National University, Canberra, Australia
- Department of Mathematics, King's College London, London, United Kingdom
- Tomaso Aste
- Applied Mathematics, Research School of Physics and Engineering, The Australian National University, Canberra, Australia
- School of Physical Sciences, University of Kent, Kent, United Kingdom
21
MALDI-typing of infectious algae of the genus Prototheca using SOM portraits. J Microbiol Methods 2012; 88:83-97. [DOI: 10.1016/j.mimet.2011.10.013] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Received: 08/08/2011] [Revised: 10/17/2011] [Accepted: 10/20/2011] [Indexed: 01/13/2023]
22
Gillet JP, Wang J, Calcagno AM, Green LJ, Varma S, Elstrand MB, Trope CG, Ambudkar SV, Davidson B, Gottesman MM. Clinical relevance of multidrug resistance gene expression in ovarian serous carcinoma effusions. Mol Pharm 2011; 8:2080-8. [PMID: 21761824 PMCID: PMC3224865 DOI: 10.1021/mp200240a] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Indexed: 12/29/2022]
Abstract
The presence of tumor cells in effusions within serosal cavities is a clinical manifestation of advanced-stage cancer and is generally associated with poor survival. Identifying molecular targets may help to design efficient treatments to eradicate these aggressive cancer cells and improve patient survival. Using a state-of-the-art TaqMan-based qRT-PCR assay, we investigated the multidrug resistance (MDR) transcriptome of 32 unpaired ovarian serous carcinoma effusion samples obtained at diagnosis or at disease recurrence following chemotherapy. MDR genes were selected a priori based on an extensive curation of the literature published during the last three decades. We found three gene signatures with a statistically significant correlation with overall survival (OS), response to treatment [complete response (CR) vs other], and progression free survival (PFS). The median log-rank p-values for the signatures were 0.023, 0.034, and 0.008, respectively. No correlation was found with residual tumor status after cytoreductive surgery, treatment (with or without chemotherapy) and stage defined according to the International Federation of Gynecology and Obstetrics. Further analyses demonstrated that gene expression alone can effectively predict the survival outcome of women with ovarian serous carcinoma (OS, log-rank p = 0.0000; and PFS, log-rank p = 0.002). Interestingly, the signature for overall survival is the same in patients at first presentation and those who had chemotherapy and relapsed. This pilot study highlights two new gene signatures that may help in optimizing the treatment for ovarian carcinoma patients with effusions.
Affiliation(s)
- Jean-Pierre Gillet
- Laboratory of Cell Biology, Center for Cancer Research, National Cancer Institute, NIH
- Junbai Wang
- Division of Pathology, Norwegian Radium Hospital, Oslo University Hospital, N-0310 Oslo, Norway
- Anna Maria Calcagno
- Laboratory of Cell Biology, Center for Cancer Research, National Cancer Institute, NIH
- Lisa J. Green
- Laboratory of Cell Biology, Center for Cancer Research, National Cancer Institute, NIH
- Sudhir Varma
- Bioinformatics and Computational Biosciences Branch, Office of Cyber Infrastructure and Computational Biology, Office of Science Management and Operations, National Institute of Allergy and Infectious Diseases, NIH, Bethesda, MD
- Mari Bunkholt Elstrand
- Department of Gynecologic Oncology, Norwegian Radium Hospital, Oslo University Hospital, N-0310 Oslo, Norway
- Claes G. Trope
- Department of Gynecologic Oncology, Norwegian Radium Hospital, Oslo University Hospital, N-0310 Oslo, Norway
- The Medical Faculty, University of Oslo, N-0316 Oslo, Norway
- Suresh V. Ambudkar
- Laboratory of Cell Biology, Center for Cancer Research, National Cancer Institute, NIH
- Ben Davidson
- Division of Pathology, Norwegian Radium Hospital, Oslo University Hospital, N-0310 Oslo, Norway
- The Medical Faculty, University of Oslo, N-0316 Oslo, Norway
- Michael M. Gottesman
- Laboratory of Cell Biology, Center for Cancer Research, National Cancer Institute, NIH
|
23
|
Zheng CH, Zhang L, Ng VTY, Shiu SCK, Huang DS. Molecular pattern discovery based on penalized matrix decomposition. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2011; 8:1592-1603. [PMID: 21519114 DOI: 10.1109/tcbb.2011.79] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.9] [Indexed: 05/30/2023]
Abstract
A reliable and precise identification of the type of tumors is crucial to the effective treatment of cancer. With the rapid development of microarray technologies, tumor clustering based on gene expression data is becoming a powerful approach to cancer class discovery. In this paper, we apply the penalized matrix decomposition (PMD) to gene expression data to extract metasamples for clustering. The extracted metasamples capture the inherent structures of samples belonging to the same class. At the same time, the PMD factors of a sample over the metasamples can be used as its class indicator in return. Compared with conventional methods such as hierarchical clustering (HC), self-organizing maps (SOM), affinity propagation (AP) and nonnegative matrix factorization (NMF), the proposed method can identify the samples with complex classes. Moreover, the factor of PMD can be used as an index to determine the cluster number. The proposed method provides a reasonable explanation of the inconsistent classifications made by the conventional methods. In addition, it is able to discover the modules in gene expression data of conterminous developmental stages. Experiments on two representative problems show that the proposed PMD-based method is very promising for discovering biological phenotypes.
Affiliation(s)
- Chun-Hou Zheng
- College of Electrical Engineering and Automation, Anhui University, Hefei, Anhui 230039, China.
|
24
|
Lamb PF, Mündermann A, Bartlett RM, Robins A. Visualizing changes in lower body coordination with different types of foot orthoses using self-organizing maps (SOM). Gait Posture 2011; 34:485-9. [PMID: 21821418 DOI: 10.1016/j.gaitpost.2011.06.024] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 09/21/2010] [Revised: 06/06/2011] [Accepted: 06/11/2011] [Indexed: 02/02/2023]
Abstract
Human movement involves the coordination of individual segments controlled by the central nervous system and powered by the muscles. However, visualization of this high-dimensional coordination between kinematic and kinetic parameters is challenging. The purposes of this study were (a) to identify differences in lower extremity coordination between different types of foot orthoses using Kohonen self-organizing maps (SOM) and (b) to demonstrate the SOM visualization of high-dimensional coordination in gait. This study used gait data for twenty subjects while running in four different orthotic conditions (control, posted, molded, and posted-molded) from a previous study. Data for one exemplar participant was used to demonstrate the visualization technique. In this visualization, areas on an output map represent certain characteristics of the gait cycle. By comparing trials of gait in different orthotic conditions a visual analysis of high-dimensional coordination is possible. Posting orthoses were shown to reduce and molded orthoses were shown to increase ankle mobility, respectively. However, when posting and molding were combined, the effects of the molded orthoses over-rode those of the posted orthoses. In fact, trials using posted-molded orthoses enhanced the effects of molded orthoses. SOMs may contribute to a better understanding of changes in the coordination of kinematic and kinetic variables at certain phases of the gait cycle under different conditions.
Affiliation(s)
- Peter F Lamb
- School of Physical Education, University of Otago, Dunedin, New Zealand.
|
25
|
Wirth H, Löffler M, von Bergen M, Binder H. Expression cartography of human tissues using self organizing maps. BMC Bioinformatics 2011; 12:306. [PMID: 21794127 PMCID: PMC3161046 DOI: 10.1186/1471-2105-12-306] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.0] [Received: 03/21/2011] [Accepted: 07/27/2011] [Indexed: 12/02/2022] Open
Abstract
Background Parallel high-throughput microarray and sequencing experiments produce vast quantities of multidimensional data which must be arranged and analyzed in a concerted way. One approach to addressing this challenge is the machine learning technique known as self organizing maps (SOMs). SOMs enable a parallel sample- and gene-centered view of genomic data combined with strong visualization and second-level analysis capabilities. The paper aims at bridging the gap between the potency of SOM-machine learning to reduce the dimension of high-dimensional data on one hand and practical applications with special emphasis on gene expression analysis on the other hand. Results The method was applied to generate a SOM characterizing the whole genome expression profiles of 67 healthy human tissues selected from ten tissue categories (adipose, endocrine, homeostasis, digestion, exocrine, epithelium, sexual reproduction, muscle, immune system and nervous tissues). SOM mapping reduces the dimension of expression data from tens of thousands of genes to a few thousand metagenes, each representing a minicluster of co-regulated single genes. Tissue-specific and common properties shared between groups of tissues emerge as a handful of localized spots in the tissue maps collecting groups of co-regulated and co-expressed metagenes. The functional context of the spots was discovered using overrepresentation analysis with respect to pre-defined gene sets of known functional impact. We found that tissue related spots typically contain enriched populations of genes related to specific molecular processes in the respective tissue. Analysis techniques normally used at the gene-level such as two-way hierarchical clustering are better represented and provide better signal-to-noise ratios if applied to the metagenes. Metagene-based clustering analyses aggregate the tissues broadly into three clusters containing nervous, immune system and the remaining tissues.
Conclusions The SOM technique provides a more intuitive and informative global view of the behavior of a few well-defined modules of correlated and differentially expressed genes than the separate discovery of the expression levels of hundreds or thousands of individual genes. The program is available as R-package 'oposSOM'.
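The SOM mapping described in this abstract, in which many gene profiles are compressed onto a small grid of metagenes, can be illustrated with a minimal sketch. The code below is a generic Kohonen-style SOM written from scratch in plain Python. It is not the oposSOM implementation, and the function names (`train_som`, `best_matching_unit`), the toy gene profiles, the 2 x 1 grid, and the linear learning-rate and neighbourhood schedules are all assumptions chosen for illustration.

```python
import math
import random


def best_matching_unit(nodes, x):
    """Index of the codebook vector closest to x (squared Euclidean distance)."""
    return min(range(len(nodes)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(nodes[k], x)))


def train_som(rows, grid_w, grid_h, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns one codebook vector per grid node."""
    rng = random.Random(seed)
    dim = len(rows[0])
    n_nodes = grid_w * grid_h
    # Each node starts as a copy of a randomly chosen input profile.
    nodes = [list(rng.choice(rows)) for _ in range(n_nodes)]
    coords = [(i % grid_w, i // grid_w) for i in range(n_nodes)]
    sigma0 = max(grid_w, grid_h) / 2.0
    total_steps = epochs * len(rows)
    step = 0
    for _ in range(epochs):
        order = rows[:]
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood radius shrinks
            bx, by = coords[best_matching_unit(nodes, x)]
            for k, (cx, cy) in enumerate(coords):
                d2 = (cx - bx) ** 2 + (cy - by) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))  # Gaussian neighbourhood
                node = nodes[k]
                for j in range(dim):
                    node[j] += lr * h * (x[j] - node[j])
            step += 1
    return nodes


# Hypothetical expression profiles: six genes measured in four samples.
genes = {
    "geneA1": [2.0, 2.1, 0.1, 0.0],
    "geneA2": [1.9, 2.0, 0.0, 0.2],
    "geneA3": [2.1, 1.8, 0.2, 0.1],
    "geneB1": [0.1, 0.0, 2.0, 2.2],
    "geneB2": [0.0, 0.2, 1.9, 2.0],
    "geneB3": [0.2, 0.1, 2.1, 1.9],
}
nodes = train_som(list(genes.values()), grid_w=2, grid_h=1)
# Each trained node plays the role of a "metagene"; genes are assigned to their nearest node.
assignment = {name: best_matching_unit(nodes, profile) for name, profile in genes.items()}
print(assignment)
```

In this toy setting each of the two nodes ends up representing one group of co-varying genes, which is the sense in which a metagene stands for a minicluster. A real analysis would use a much larger grid and many more genes.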
Affiliation(s)
- Henry Wirth
- Interdisciplinary Centre for Bioinformatics of Leipzig University, D-4107 Leipzig, Härtelstr. 16-18, Germany.
|
26
|
Dalton L, Ballarin V, Brun M. Clustering algorithms: on learning, validation, performance, and applications to genomics. Curr Genomics 2011; 10:430-45. [PMID: 20190957 PMCID: PMC2766793 DOI: 10.2174/138920209789177601] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.3] [Received: 03/03/2009] [Revised: 04/20/2009] [Accepted: 05/11/2009] [Indexed: 11/22/2022] Open
Abstract
The development of microarray technology has enabled scientists to measure the expression of thousands of genes simultaneously, resulting in a surge of interest in several disciplines throughout biology and medicine. While data clustering has been used for decades in image processing and pattern recognition, in recent years it has joined this wave of activity as a popular technique to analyze microarrays. To illustrate its application to genomics, clustering applied to genes from a set of microarray data groups together those genes whose expression levels exhibit similar behavior throughout the samples, and when applied to samples it offers the potential to discriminate pathologies based on their differential patterns of gene expression. Although clustering has now been used for many years in the context of gene expression microarrays, it has remained highly problematic. The choice of a clustering algorithm and validation index is not a trivial one, more so when applying them to high throughput biological or medical data. Factors to consider when choosing an algorithm include the nature of the application, the characteristics of the objects to be analyzed, the expected number and shape of the clusters, and the complexity of the problem versus computational power available. In some cases a very simple algorithm may be appropriate to tackle a problem, but many situations may require a more complex and powerful algorithm better suited for the job at hand. In this paper, we will cover the theoretical aspects of clustering, including error and learning, followed by an overview of popular clustering algorithms and classical validation indices. We also discuss the relative performance of these algorithms and indices and conclude with examples of the application of clustering to computational biology.
Affiliation(s)
- Lori Dalton
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas 77843-3128, USA
|
27
|
The interplay of descriptor-based computational analysis with pharmacophore modeling builds the basis for a novel classification scheme for feruloyl esterases. Biotechnol Adv 2011; 29:94-110. [DOI: 10.1016/j.biotechadv.2010.09.003] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.7] [Received: 03/27/2010] [Revised: 08/27/2010] [Accepted: 09/06/2010] [Indexed: 11/18/2022]
|
28
|
Elgaaen BV, Haug KBF, Wang J, Olstad OK, Fortunati D, Onsrud M, Staff AC, Sauer T, Gautvik KM. POLD2 and KSP37 (FGFBP2) correlate strongly with histology, stage and outcome in ovarian carcinomas. PLoS One 2010; 5:e13837. [PMID: 21079801 PMCID: PMC2973954 DOI: 10.1371/journal.pone.0013837] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Received: 07/06/2010] [Accepted: 10/01/2010] [Indexed: 12/16/2022] Open
Abstract
BACKGROUND Epithelial ovarian cancer (EOC) constitutes more than 90% of ovarian cancers and is associated with high mortality. EOC comprises a heterogeneous group of tumours, and the causes and molecular pathology are essentially unknown. Improved insight into the molecular characteristics of the different subgroups of EOC is urgently needed, and should eventually lead to earlier diagnosis as well as more individualized and effective treatments. Previously, we reported a limited number of mRNAs strongly upregulated in human osteosarcomas and other malignancies, and six were selected to be tested for a possible association with three subgroups of ovarian carcinomas and clinical parameters. METHODOLOGY/PRINCIPAL FINDINGS The six selected mRNAs were quantified by RT-qPCR in biopsies from eleven poorly differentiated serous carcinomas (PDSC, stage III-IV), twelve moderately differentiated serous carcinomas (MDSC, stage III-IV) and eight clear cell carcinomas (CCC, stage I-IV) of the ovary. Superficial scrapings from six normal ovaries (SNO), as well as biopsies from three normal ovaries (BNO) and three benign ovarian cysts (BBOC) were analyzed for comparison. The gene expression level was related to the histological and clinical parameters of human ovarian carcinoma samples. One of the mRNAs, DNA polymerase delta 2 small subunit (POLD2), was increased on average 2.5- to almost 20-fold in MDSC and PDSC, respectively, paralleling the degree of dedifferentiation and concordant with a poor prognosis. Except for POLD2, the serous carcinomas showed a similar transcription profile, being clearly different from CCC. Another mRNA, Killer-specific secretory protein of 37 kDa (KSP37), showed six- to eight-fold higher levels in CCC stage I compared with the more advanced staged carcinomas, and correlated positively with an improved clinical outcome.
CONCLUSIONS/SIGNIFICANCE We have identified two biomarkers which are markedly upregulated in two subgroups of ovarian carcinomas and are also associated with stage and outcome. The results suggest that POLD2 and KSP37 might be potential prognostic biomarkers.
|
29
|
Milone DH, Stegmayer GS, Kamenetzky L, López M, Lee JM, Giovannoni JJ, Carrari F. *omeSOM: a software for clustering and visualization of transcriptional and metabolite data mined from interspecific crosses of crop plants. BMC Bioinformatics 2010; 11:438. [PMID: 20796314 PMCID: PMC2942854 DOI: 10.1186/1471-2105-11-438] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Received: 04/23/2010] [Accepted: 08/26/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Modern biology uses experimental systems that involve the exploration of phenotypic variation as a result of the recombination of several genomes. Such systems are useful to investigate the functional evolution of metabolic networks. One such approach is the analysis of transcript and metabolite profiles. These kinds of studies generate a large amount of data, which require dedicated computational tools for their analysis. RESULTS This paper presents a novel software named *omeSOM (transcript/metabol-ome Self Organizing Map) that implements a neural model for biological data clustering and visualization. It allows the discovery of relationships between changes in transcripts and metabolites of crop plants harboring introgressed exotic alleles and, furthermore, its use can be extended to other types of omics data. The software is focused on the easy identification of groups including different molecular entities, independently of the number of clusters formed. The *omeSOM software provides easy-to-visualize interfaces for the identification of coordinated variations in the co-expressed genes and co-accumulated metabolites. Additionally, this information is linked to the most widely used gene annotation and metabolic pathway databases. CONCLUSIONS *omeSOM is a software designed to give support to the data mining task of metabolic and transcriptional datasets derived from different databases. It provides a user-friendly interface and offers several visualization features that are easy for non-expert users to understand. Therefore, *omeSOM provides support for data mining tasks and it is applicable to basic research as well as applied breeding programs. The software and a sample dataset are available free of charge at http://sourcesinc.sourceforge.net/omesom/.
Affiliation(s)
- Diego H Milone
- Research Center for Signals, Systems and Computational Intelligence, FICH-UNL, CONICET, Ciudad Universitaria UNL, Santa Fe, (3000), Argentina
|
30
|
Sun J, Masterman-Smith MD, Graham NA, Jiao J, Mottahedeh J, Laks DR, Ohashi M, DeJesus J, Kamei KI, Lee KB, Wang H, Yu ZTF, Lu YT, Hou S, Li K, Liu M, Zhang N, Wang S, Angenieux B, Panosyan E, Samuels ER, Park J, Williams D, Konkankit V, Nathanson D, van Dam RM, Phelps ME, Wu H, Liau LM, Mischel PS, Lazareff JA, Kornblum HI, Yong WH, Graeber TG, Tseng HR. A microfluidic platform for systems pathology: multiparameter single-cell signaling measurements of clinical brain tumor specimens. Cancer Res 2010; 70:6128-38. [PMID: 20631065 DOI: 10.1158/0008-5472.can-10-0076] [Citation(s) in RCA: 94] [Impact Index Per Article: 6.7] [Indexed: 01/09/2023]
Abstract
The clinical practice of oncology is being transformed by molecular diagnostics that will enable predictive and personalized medicine. Current technologies for quantitation of the cancer proteome are either qualitative (e.g., immunohistochemistry) or require large sample sizes (e.g., flow cytometry). Here, we report a microfluidic platform, microfluidic image cytometry (MIC), capable of quantitative, single-cell proteomic analysis of multiple signaling molecules using only 1,000 to 2,800 cells. Using cultured cell lines, we show simultaneous measurement of four critical signaling proteins (EGFR, PTEN, phospho-Akt, and phospho-S6) within the oncogenic phosphoinositide 3-kinase (PI3K)/Akt/mammalian target of rapamycin (mTOR) signaling pathway. To show the clinical application of the MIC platform to solid tumors, we analyzed a panel of 19 human brain tumor biopsies, including glioblastomas. Our MIC measurements were validated by clinical immunohistochemistry and confirmed the striking intertumoral and intratumoral heterogeneity characteristic of glioblastoma. To interpret the multiparameter, single-cell MIC measurements, we adapted bioinformatic methods including self-organizing maps that stratify patients into clusters that predict tumor progression and patient survival. Together with bioinformatic analysis, the MIC platform represents a robust, enabling in vitro molecular diagnostic technology for systems pathology analysis and personalized medicine.
Affiliation(s)
- Jing Sun
- Crump Institute for Molecular Imaging, University of California at Los Angeles, Los Angeles, California 90095, USA
|
31
|
Guo Y, Eichler GS, Feng Y, Ingber DE, Huang S. Towards a holistic, yet gene-centered analysis of gene expression profiles: a case study of human lung cancers. J Biomed Biotechnol 2010; 2006:69141. [PMID: 17489018 PMCID: PMC1698264 DOI: 10.1155/jbb/2006/69141] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.6] [Indexed: 11/17/2022] Open
Abstract
Genome-wide gene expression profile studies encompass an increasingly large number of samples, posing a challenge to their presentation and interpretation without losing the notion that each transcriptome constitutes a complex biological entity. Much like pathologists who visually analyze information-rich histological sections as a whole, we propose here an integrative approach. We use a self-organizing map-based software, the gene expression dynamics inspector (GEDI), to analyze gene expression profiles of various lung tumors. GEDI allows the comparison of tumor profiles based on direct visual detection of transcriptome patterns. Such intuitive “gestalt” perception promotes the discovery of interesting relationships in the absence of an existing hypothesis. We uncovered qualitative relationships between squamous cell tumors, small-cell tumors, and carcinoid tumors that would have escaped existing algorithmic classifications. These results suggest that GEDI may be a valuable explorative tool that combines global and gene-centered analyses of molecular profiles from large-scale microarray experiments.
Affiliation(s)
- Yuchun Guo
- Vascular Biology Program, Department of Surgery, Children's Hospital, Harvard Medical School, Boston 02115, MA, USA
- Gabriel S. Eichler
- Bioinformatics Program, Boston University, Boston 02215, MA, USA
- Laboratory of Molecular Pharmacology, CCR, NCI, NIH, Bethesda 20892, MD, USA
- Ying Feng
- Vascular Biology Program, Department of Surgery, Children's Hospital, Harvard Medical School, Boston 02115, MA, USA
- Donald E. Ingber
- Vascular Biology Program, Department of Surgery, Children's Hospital, Harvard Medical School, Boston 02115, MA, USA
- Sui Huang
- Vascular Biology Program, Department of Surgery, Children's Hospital, Harvard Medical School, Boston 02115, MA, USA
- *Sui Huang:
|
32
|
Newman AM, Cooper JB. AutoSOME: a clustering method for identifying gene expression modules without prior knowledge of cluster number. BMC Bioinformatics 2010; 11:117. [PMID: 20202218 PMCID: PMC2846907 DOI: 10.1186/1471-2105-11-117] [Citation(s) in RCA: 68] [Impact Index Per Article: 4.9] [Received: 11/27/2009] [Accepted: 03/04/2010] [Indexed: 12/25/2022] Open
Abstract
Background Clustering the information content of large high-dimensional gene expression datasets has widespread application in "omics" biology. Unfortunately, the underlying structure of these natural datasets is often fuzzy, and the computational identification of data clusters generally requires knowledge about cluster number and geometry. Results We integrated strategies from machine learning, cartography, and graph theory into a new informatics method for automatically clustering self-organizing map ensembles of high-dimensional data. Our new method, called AutoSOME, readily identifies discrete and fuzzy data clusters without prior knowledge of cluster number or structure in diverse datasets including whole genome microarray data. Visualization of AutoSOME output using network diagrams and differential heat maps reveals unexpected variation among well-characterized cancer cell lines. Co-expression analysis of data from human embryonic and induced pluripotent stem cells using AutoSOME identifies >3400 up-regulated genes associated with pluripotency, and indicates that a recently identified protein-protein interaction network characterizing pluripotency was underestimated by a factor of four. Conclusions By effectively extracting important information from high-dimensional microarray data without prior knowledge or the need for data filtration, AutoSOME can yield systems-level insights from whole genome microarray expression studies. Due to its generality, this new method should also have practical utility for a variety of data-intensive applications, including the results of deep sequencing experiments. AutoSOME is available for download at http://jimcooperlab.mcdb.ucsb.edu/autosome.
Affiliation(s)
- Aaron M Newman
- Biomolecular Science and Engineering Program, University of California, Santa Barbara, CA 93106, USA
|
33
|
Reexamination of risk criteria in dengue patients using the self-organizing map. Med Biol Eng Comput 2009; 48:293-301. [PMID: 20016950 DOI: 10.1007/s11517-009-0561-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 05/02/2009] [Accepted: 11/17/2009] [Indexed: 10/20/2022]
Abstract
Even though the World Health Organization criteria for classifying dengue infection have been used for a long time, recent studies report that clinicians have faced several difficulties in applying these criteria. Accordingly, many studies have proposed modified criteria to identify the risk in dengue patients based on statistical analysis techniques. None of these studies utilized the power of the self-organizing map (SOM) in visualizing, understanding, and exploring the complexity in multivariable data. Therefore, this study utilized the clustering of the SOM technique to identify the risk criteria in 195 dengue patients. The new risk criteria were defined as: platelet count less than or equal to 40,000 cells per mm³, hematocrit concentration greater than or equal to 25%, and aspartate aminotransferase (AST) raised by fivefold the normal upper limit for AST/alanine aminotransferase (ALT) raised by fivefold the normal upper limit for ALT. The cluster analysis indicated that any dengue patient who fulfills any two of the risk criteria is considered a high-risk dengue patient.
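The "any two of three criteria" rule stated in this abstract can be written down as a short sketch. The code below only restates the thresholds as they appear in the abstract and is not taken from the paper. The function names, the parameter names, the reading of the AST/ALT criterion as "either enzyme at or above five times its normal upper limit", and the choice to pass hematocrit as a plain percentage are assumptions made for illustration. The abstract does not say whether the 25% hematocrit figure is an absolute value or a change from baseline, so the caller decides which measure to supply.

```python
def dengue_risk_flags(platelets_per_mm3, hematocrit_pct, ast_ratio, alt_ratio):
    """Evaluate the three criteria listed in the abstract.

    ast_ratio and alt_ratio are the measured enzyme level divided by the
    normal upper limit for that enzyme (an assumed encoding of "fivefold").
    """
    return {
        "platelets": platelets_per_mm3 <= 40_000,
        "hematocrit": hematocrit_pct >= 25.0,
        # Assumption: the "/" in the abstract is read as "or".
        "liver_enzymes": ast_ratio >= 5.0 or alt_ratio >= 5.0,
    }


def is_high_risk(platelets_per_mm3, hematocrit_pct, ast_ratio, alt_ratio):
    """High risk when at least two of the three criteria are met."""
    flags = dengue_risk_flags(platelets_per_mm3, hematocrit_pct, ast_ratio, alt_ratio)
    return sum(flags.values()) >= 2


# Hypothetical patients, for illustration only.
print(is_high_risk(35_000, 30.0, 1.0, 1.0))   # two criteria met
print(is_high_risk(150_000, 20.0, 6.0, 1.0))  # one criterion met
```

Returning the individual flags alongside the overall decision keeps it visible which criteria triggered the classification.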
|
34
|
Application of Kohonen maps for solving the classification puzzle in AGC kinase protein sequences. Interdiscip Sci 2009; 1:173-8. [DOI: 10.1007/s12539-009-0032-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 12/09/2008] [Revised: 02/25/2009] [Accepted: 03/16/2009] [Indexed: 10/20/2022]
|
35
|
Blazadonakis ME, Zervakis M. The linear neuron as marker selector and clinical predictor in cancer gene analysis. Computer Methods and Programs in Biomedicine 2008; 91:22-35. [PMID: 18423925 DOI: 10.1016/j.cmpb.2008.02.009] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 06/11/2007] [Revised: 02/22/2008] [Accepted: 02/23/2008] [Indexed: 05/26/2023]
Abstract
OBJECTIVE The problem of gene selection has been extensively studied in a number of scientific works using various kinds of methods. However, the application of a linear neuron is a novel approach possessing several advantages. In this work we propose to study the behavior of such a linear neuron, appropriately adapted and trained to the problem of gene selection in the DNA microarray experiment. METHODS AND MATERIALS We explore the proposed approach in terms of an accuracy evaluation criterion, which is used to assess the performance of the proposed methodology, but we also evaluate the produced results in terms of cluster quality and survival prediction. Cluster quality reflects the ability of the method to select differentially expressed genes, which in turn leads to better clustering and survival prediction. RESULTS We directly compare the proposed methodology with RFE-SVM, a well known and broadly accepted method demonstrating remarkable performance on various data sets of clinical interest. CONCLUSIONS Conducted computational experiments show that the proposed approach can be efficiently used within the field of gene selection producing high-quality results in terms of accuracy and robustness.
Affiliation(s)
- Michalis E Blazadonakis
- Technical University of Crete, Department of Electronic and Computer Engineering, University Campus, Chania Crete 73100, Greece.
|
36
|
Sivaraksa M, Lowe D. Predictive gene lists for breast cancer prognosis: a topographic visualisation study. BMC Med Genomics 2008; 1:8. [PMID: 18419801 PMCID: PMC2375896 DOI: 10.1186/1755-8794-1-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 11/26/2007] [Accepted: 04/17/2008] [Indexed: 11/10/2022] Open
Abstract
Background The controversy surrounding the non-uniqueness of predictive gene lists (PGL) of small selected subsets of genes from very large potential candidates as available in DNA microarray experiments is now widely acknowledged [1]. Many of these studies have focused on constructing discriminative semi-parametric models and as such are also subject to the issue of random correlations of sparse model selection in high dimensional spaces. In this work we outline a different approach based around an unsupervised patient-specific nonlinear topographic projection in predictive gene lists. Methods We construct nonlinear topographic projection maps based on inter-patient gene-list relative dissimilarities. The Neuroscale, the Stochastic Neighbor Embedding (SNE) and the Locally Linear Embedding (LLE) techniques have been used to construct two-dimensional projective visualisation plots of 70-dimensional PGLs per patient. Classifiers are also constructed to identify the prognosis indicator of each patient using the resulting projections from those visualisation techniques and to investigate whether, a posteriori, two prognosis groups are separable on the evidence of the gene lists. A literature-proposed predictive gene list for breast cancer is benchmarked against a separate gene list using the above methods. Generalisation ability is investigated by using the mapping capability of Neuroscale to visualise the follow-up study, but based on the projections derived from the original dataset. Results The results indicate that small subsets of patient-specific PGLs have insufficient prognostic dissimilarity to permit a distinction between two prognosis patients. Uncertainty and diversity across multiple gene expressions prevents unambiguous or even confident patient grouping. Comparative projections across different PGLs provide similar results.
Conclusion The random correlation effect to an arbitrary outcome induced by small subset selection from very high dimensional interrelated gene expression profiles leads to an outcome with associated uncertainty. This continuum and uncertainty precludes any attempts at constructing discriminative classifiers. However a patient's gene expression profile could possibly be used in treatment planning, based on knowledge of other patients' responses. We conclude that many of the patients involved in such medical studies are intrinsically unclassifiable on the basis of provided PGL evidence. This additional category of 'unclassifiable' should be accommodated within medical decision support systems if serious errors and unnecessary adjuvant therapy are to be avoided.
|
37
|
Fernandez EA, Balzarini M. Improving cluster visualization in self-organizing maps: Application in gene expression data analysis. Comput Biol Med 2007; 37:1677-89. [PMID: 17544390 DOI: 10.1016/j.compbiomed.2007.04.003] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.2] [Received: 05/22/2006] [Revised: 04/03/2007] [Accepted: 04/10/2007] [Indexed: 11/26/2022]
Abstract
Cluster analysis is one of the crucial steps in gene expression pattern (GEP) analysis. It leads to the discovery or identification of temporal patterns and coexpressed genes. GEP analysis involves high-dimensional multivariate data, which demand appropriate tools. A good alternative for grouping many multidimensional objects is self-organizing maps (SOM), an unsupervised neural network algorithm able to find relationships among data. SOM groups and maps them topologically. However, it may be difficult to identify clusters with the usual visualization tools for SOM. We propose a simple algorithm to identify and visualize clusters in SOM (the RP-Q method). The RP is a new node-adaptive attribute that moves in a two-dimensional virtual space, imitating the movement of the codebook vectors of the SOM net in the input space. The Q statistic evaluates the SOM structure, providing an estimation of the number of clusters underlying the data set. The SOM-RP-Q algorithm permits the visualization of clusters in the SOM and their node patterns. The algorithm was evaluated in several simulated and real GEP data sets. Results show that the proposed algorithm successfully displays the underlying cluster structure directly from the SOM and is robust to different net sizes.
Affiliation(s)
- Elmer A Fernandez
- Faculty of Engineering, Catholic University of Córdoba, Córdoba, Camino Alta Gracia Km 10, Cordoba, Argentina.
|
38
|
Mueller CG, Boix C, Kwan WH, Daussy C, Fournier E, Fridman WH, Molina TJ. Critical role of monocytes to support normal B cell and diffuse large B cell lymphoma survival and proliferation. J Leukoc Biol 2007; 82:567-75. [PMID: 17575267 DOI: 10.1189/jlb.0706481] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.4] [Indexed: 12/31/2022] Open
Abstract
Large B cell lymphomas can comprise numerous CD14+ cells in the tumor stroma, which raises the question of whether monocytes can support B cell survival and proliferation. We show that the coculture of monocytes with B cells from peripheral blood or from diffuse large B cell lymphoma enabled prolonged B cell survival. Under these conditions, diffuse large lymphoma B cells proliferated, and addition of B cell-activating factor of the TNF family (BAFF) and IL-2 enhanced cell division. Monocytes and dendritic cells (DC) had similar antiapoptotic activity on healthy B cells but displayed differences with respect to B cell proliferation. Monocytes and cord blood-derived CD14+ cells promoted B cell proliferation in the presence of an anti-CD40 stimulus, whereas DC supported B cell proliferation when activated through the BCR. DC and CD14+ cells were able to induce plasmocyte differentiation. When B cells were activated via the BCR or CD40, they released the leukocyte attractant CCL5, and this chemokine is one of the main chemokines expressed in diffuse large B cell lymphoma. The data support the notion that large B cell lymphomas recruit monocytes via CCL5 to support B cell survival and proliferation.
Affiliation(s)
- Chris G Mueller
- INSERM, U872, Centre de Recherches Biomédicales des Cordeliers Université Pierre et Marie Curie (Paris VI) et René Descartes (Paris V), UMR S 872, Paris, France.
|
39
|
Kaput J, Dawson K. Complexity of type 2 diabetes mellitus data sets emerging from nutrigenomic research: a case for dimensionality reduction? Mutat Res 2007; 622:19-32. [PMID: 17559889 PMCID: PMC1994901 DOI: 10.1016/j.mrfmmm.2007.02.033] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Received: 11/29/2006] [Accepted: 02/13/2007] [Indexed: 02/07/2023]
Abstract
Nutrigenomics promises personalized nutrition and an improvement in preventing, delaying, and reducing the symptoms of chronic diseases such as diabetes. Nutritional genomics is the study of how foods affect the expression of genetic information in an individual and how an individual's genetic makeup affects the metabolism and response to nutrients and other bioactive components in food. The path to those promises has significant challenges, from experimental designs that include analysis of genetic heterogeneity to the complexities of food and environmental factors. One of the more significant complications in developing the knowledge base and potential applications is how to analyze high-dimensional datasets of genetic, nutrient, metabolomic (clinical), and other variables influencing health and disease processes. Type 2 diabetes mellitus (T2DM) is used as an illustration of the challenges in studying complex phenotypes with nutrigenomics concepts and approaches.
Affiliation(s)
- Jim Kaput
- Center of Excellence in Nutritional Genomics, University of California at Davis, Davis, CA 95616, USA.
|
40
|
Belacel N, Wang Q, Cuperlovic-Culf M. Clustering methods for microarray gene expression data. OMICS: A Journal of Integrative Biology 2007; 10:507-31. [PMID: 17233561 DOI: 10.1089/omi.2006.10.507] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.4] [Indexed: 11/13/2022]
Abstract
Within the field of genomics, microarray technologies have become a powerful technique for simultaneously monitoring the expression patterns of thousands of genes under different sets of conditions. A main task now is to propose analytical methods to identify groups of genes that manifest similar expression patterns and are activated by similar conditions. The corresponding analysis problem is to cluster multi-condition gene expression data. The purpose of this paper is to present a general view of clustering techniques used in microarray gene expression data analysis.
Affiliation(s)
- Nabil Belacel
- National Research Council Canada, Institute for Information Technology, Scientific Park, Moncton, New Brunswick, Canada.
|
41
|
Weeraratna AT, Taub DD. Microarray data analysis: an overview of design, methodology, and analysis. Methods Mol Biol 2007; 377:1-16. [PMID: 17634607 DOI: 10.1007/978-1-59745-390-5_1] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 12/17/2022]
Abstract
Microarray analysis results in the gathering of massive amounts of information concerning gene expression profiles of different cells and experimental conditions. Analyzing these data can often be a quagmire, with endless discussion as to what the appropriate statistical analyses for any given experiment might be. As a result many different methods of data analysis have evolved, the basics of which are outlined in this chapter.
Affiliation(s)
- Ashani T Weeraratna
- Laboratory of Immunology, National Institutes of Health, National Institute on Aging, Gerontology Research Center, Baltimore, MD, USA
|
42
|
Microarray expression technology in clinical research of non-Hodgkin lymphoma. Archive of Oncology 2007. [DOI: 10.2298/aoo0702028b] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 12/30/2022] Open
Abstract
In the current genome-centric era, accelerated research on the human genome, coupled with technological advances, is enabling the comprehensive molecular profiling of human tissue. In particular, DNA microarrays are powerful tools for obtaining a global view of gene expression in human non-Hodgkin lymphomas. Complex information from lymphoma expression profiling studies can, in turn, be used to create molecular markers that have diagnostic or prognostic implications. Gene expression profiling is not part of routine clinical oncology practice, but it is used in the genomic classification of clinically relevant subgroups of non-Hodgkin lymphoma. Genomic biomarkers have been incorporated into current prognostic models, which are based on the IPI, R-IPI, and FLIPI. Molecular or pharmacogenomic profiling can be used to identify new therapeutic targets for patients who are refractory to current therapy. We discuss the utility of DNA microarray-based lymphoma profiling in clinical oncology research and identify future research directions in evolving areas of the lymphoma field.
|
43
|
Sjödin A, Bylesjö M, Skogström O, Eriksson D, Nilsson P, Rydén P, Jansson S, Karlsson J. UPSC-BASE--Populus transcriptomics online. The Plant Journal: for Cell and Molecular Biology 2006; 48:806-17. [PMID: 17092314 DOI: 10.1111/j.1365-313x.2006.02920.x] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Indexed: 05/12/2023]
Abstract
The increasing accessibility and use of microarrays in transcriptomics has accentuated the need for purpose-designed storage and analysis tools. Here we present UPSC-BASE, a database for analysis and storage of Populus DNA microarray data. A microarray analysis pipeline has also been established to allow consistent and efficient analysis (from small to large scale) of samples in various experimental designs. A range of optimized experimental protocols is provided for each step in generating the data. Within UPSC-BASE, researchers can perform standard and advanced microarray analysis procedures in a user-friendly environment. Background corrections, normalizations, quality-control tools, visualizations, hypothesis tests and export tools are provided without requirements for expert-level knowledge. Although the database has been developed primarily for handling Populus DNA microarrays, most of the tools are generic and can be used for other types of microarray. UPSC-BASE is also a repository of Populus microarray information, providing data from 21 experiments on a total of 407 microarray hybridizations in the public domain of the database. There are also an additional 10 experiments containing 347 hybridizations, where the automatically analysed data are searchable.
Affiliation(s)
- Andreas Sjödin
- Umeå Plant Science Centre, Department of Plant Physiology, Umeå University, SE-901 87 Umeå, Sweden
|
44
|
Moorman C, Sun LV, Wang J, de Wit E, Talhout W, Ward LD, Greil F, Lu XJ, White KP, Bussemaker HJ, van Steensel B. Hotspots of transcription factor colocalization in the genome of Drosophila melanogaster. Proc Natl Acad Sci U S A 2006; 103:12027-32. [PMID: 16880385 PMCID: PMC1567692 DOI: 10.1073/pnas.0605003103] [Citation(s) in RCA: 158] [Impact Index Per Article: 8.8] [Received: 05/31/2006] [Indexed: 11/18/2022] Open
Abstract
Regulation of gene expression is a highly complex process that requires the concerted action of many proteins, including sequence-specific transcription factors, cofactors, and chromatin proteins. In higher eukaryotes, the interplay between these proteins and their interactions with the genome is still poorly understood. We systematically mapped the in vivo binding sites of seven transcription factors with diverse physiological functions, five cofactors, and two heterochromatin proteins at approximately 1-kb resolution in a 2.9-Mb region of the Drosophila melanogaster genome. Surprisingly, all tested transcription factors and cofactors show strongly overlapping localization patterns, and the genome contains many "hotspots" that are targeted by all of these proteins. Several control experiments show that the strong overlap is not an artifact of the techniques used. Colocalization hotspots are 1-5 kb in size, spaced on average by approximately 50 kb, and preferentially located in regions of active transcription. We provide evidence that protein-protein interactions play a role in the hotspot association of some transcription factors. Colocalization hotspots constitute a previously uncharacterized type of feature in the genome of Drosophila, and our results provide insights into the general targeting mechanisms of transcription regulators in a higher eukaryote.
Affiliation(s)
- Celine Moorman
- Department of Molecular Biology, Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX, Amsterdam, The Netherlands
- Ling V. Sun
- Department of Genetics, Yale University School of Medicine, New Haven, CT 06520
- Junbai Wang
- Department of Biological Sciences and Center for Computational Biology and Bioinformatics, Columbia University, New York, NY 10027
- Elzo de Wit
- Department of Molecular Biology, Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX, Amsterdam, The Netherlands
- Wendy Talhout
- Department of Molecular Biology, Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX, Amsterdam, The Netherlands
- Lucas D. Ward
- Department of Biological Sciences and Center for Computational Biology and Bioinformatics, Columbia University, New York, NY 10027
- Frauke Greil
- Department of Molecular Biology, Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX, Amsterdam, The Netherlands
- Xiang-Jun Lu
- Department of Biological Sciences and Center for Computational Biology and Bioinformatics, Columbia University, New York, NY 10027
- Kevin P. White
- Department of Genetics, Yale University School of Medicine, New Haven, CT 06520
- Harmen J. Bussemaker
- Department of Biological Sciences and Center for Computational Biology and Bioinformatics, Columbia University, New York, NY 10027
- Bas van Steensel
- Department of Molecular Biology, Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX, Amsterdam, The Netherlands
|
45
|
Dunphy CH. Gene expression profiling data in lymphoma and leukemia: review of the literature and extrapolation of pertinent clinical applications. Arch Pathol Lab Med 2006; 130:483-520. [PMID: 16594743 DOI: 10.5858/2006-130-483-gepdil] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Indexed: 11/06/2022]
Abstract
CONTEXT Gene expression (GE) analyses using microarrays have become an important part of biomedical and clinical research in hematolymphoid malignancies. However, the methods are time-consuming and costly for routine clinical practice. OBJECTIVES To review the literature regarding GE data that may provide important information regarding pathogenesis and that may be extrapolated for use in diagnosing and prognosticating lymphomas and leukemias; to present GE findings in Hodgkin and non-Hodgkin lymphomas, acute leukemias, and chronic myeloid leukemia in detail; and to summarize the practical clinical applications in tables that are referenced throughout the text. DATA SOURCE PubMed was searched for pertinent literature from 1993 to 2005. CONCLUSIONS Gene expression profiling of lymphomas and leukemias aids in the diagnosis and prognostication of these diseases. The extrapolation of these findings to more timely, efficient, and cost-effective methods, such as flow cytometry and immunohistochemistry, results in better diagnostic tools to manage the diseases. Flow cytometric and immunohistochemical applications of the information gained from GE profiling assist in the management of chronic lymphocytic leukemia, other low-grade B-cell non-Hodgkin lymphomas and leukemias, diffuse large B-cell lymphoma, nodular lymphocyte-predominant Hodgkin lymphoma, and classic Hodgkin lymphoma. For practical clinical use, GE profiling of precursor B acute lymphoblastic leukemia, precursor T acute lymphoblastic leukemia, and acute myeloid leukemia has supported most of the information that has been obtained by cytogenetic and molecular studies (except for the identification of FLT3 mutations for molecular analysis), but extrapolation of the analyses leaves much to be gained based on the GE profiling data.
Affiliation(s)
- Cherie H Dunphy
- Department of Pathology and Laboratory Medicine, The University of North Carolina, Chapel Hill, NC 27599-7525, USA.
|
46
|
Kaderali L, Zander T, Faigle U, Wolf J, Schultze JL, Schrader R. CASPAR: a hierarchical bayesian approach to predict survival times in cancer from gene expression data. Bioinformatics 2006; 22:1495-502. [PMID: 16554338 DOI: 10.1093/bioinformatics/btl103] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.8] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION DNA microarrays allow the simultaneous measurement of thousands of gene expression levels in any given patient sample. Gene expression data have been shown to correlate with survival in several cancers; however, analysis of the data is difficult, since typically at most a few hundred patients are available, resulting in severely underdetermined regression or classification models. Several approaches exist to classify patients into different risk classes; however, relatively little has been done with respect to the prediction of actual survival times. We introduce CASPAR, a novel method to predict true survival times for the individual patient based on microarray measurements. CASPAR is based on a multivariate Cox regression model that is embedded in a Bayesian framework. A hierarchical prior distribution on the regression parameters is specifically designed to deal with the high-dimensionality (large number of genes) and low-sample-size settings that are typical for microarray measurements. This enables CASPAR to automatically select small, most informative subsets of genes for prediction. RESULTS Validity of the method is demonstrated on two publicly available datasets on diffuse large B-cell lymphoma (DLBCL) and on adenocarcinoma of the lung. The method successfully identifies long and short survivors, with high sensitivity and specificity. We compare our method with two alternative methods from the literature, demonstrating superior results of our approach. In addition, we show that CASPAR can further refine predictions made using clinical scoring systems such as the International Prognostic Index (IPI) for DLBCL and clinical staging for lung cancer, thus providing an additional tool for the clinician. An analysis of the genes identified confirms previously published results, and furthermore, new candidate genes correlated with survival are identified.
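The Cox regression model mentioned in this abstract can be illustrated with a minimal sketch of the ordinary Cox partial log-likelihood. This is not CASPAR itself: the hierarchical Bayesian prior and gene selection are omitted, tied event times are assumed absent, and the function name and toy data are hypothetical.

```python
import math


def cox_partial_log_likelihood(times, events, covariates, beta):
    """Plain Cox partial log-likelihood, assuming no tied event times.

    times      -- observed follow-up time per patient
    events     -- 1 if the event (death) was observed, 0 if censored
    covariates -- one list of expression values per patient
    beta       -- regression coefficients, one per covariate
    """
    # Linear predictor x_i . beta for every patient.
    scores = [sum(x * b for x, b in zip(row, beta)) for row in covariates]
    total = 0.0
    for i, (t_i, observed) in enumerate(zip(times, events)):
        if not observed:
            continue  # censored patients only appear inside risk sets
        # Risk set: everyone still under observation at time t_i.
        risk = [math.exp(scores[j]) for j, t_j in enumerate(times) if t_j >= t_i]
        total += scores[i] - math.log(sum(risk))
    return total


# Hypothetical toy data: four patients, one gene.
toy_times = [5.0, 8.0, 12.0, 3.0]
toy_events = [1, 0, 1, 1]
toy_expression = [[0.2], [-0.1], [-0.5], [1.1]]
toy_value = cox_partial_log_likelihood(toy_times, toy_events, toy_expression, [0.8])
```

With thousands of genes and only a few hundred patients, maximizing this quantity alone is underdetermined, which is the motivation the abstract gives for placing a prior on the coefficients.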
Affiliation(s)
- Lars Kaderali
- Cologne University Bioinformatics Center and Center for Applied Computer Science Weyertal 80, 50931 Köln, Germany.
|
47
|
Carmona-Saez P, Pascual-Marqui RD, Tirado F, Carazo JM, Pascual-Montano A. Biclustering of gene expression data by Non-smooth Non-negative Matrix Factorization. BMC Bioinformatics 2006; 7:78. [PMID: 16503973 PMCID: PMC1434777 DOI: 10.1186/1471-2105-7-78] [Citation(s) in RCA: 94] [Impact Index Per Article: 5.2] [Received: 07/15/2005] [Accepted: 02/17/2006] [Indexed: 12/01/2022] Open
Abstract
Background The extended use of microarray technologies has enabled the generation and accumulation of gene expression datasets that contain expression levels of thousands of genes across tens or hundreds of different experimental conditions. One of the major challenges in the analysis of such datasets is to discover local structures composed of sets of genes that show coherent expression patterns across subsets of experimental conditions. These patterns may provide clues about the main biological processes associated with different physiological states. Results In this work we present a methodology able to cluster genes and conditions that are highly related in sub-portions of the data. Our approach is based on a new data mining technique, Non-smooth Non-Negative Matrix Factorization (nsNMF), able to identify localized patterns in large datasets. We assessed the potential of this methodology by analyzing several synthetic datasets as well as two large and heterogeneous sets of gene expression profiles. In all cases the method was able to identify localized features related to sets of genes that show consistent expression patterns across subsets of experimental conditions. The uncovered structures showed a clear biological meaning in terms of relationships among functional annotations of genes and the phenotypes or physiological states of the associated conditions. Conclusion The proposed approach can be a useful tool to analyze large and heterogeneous gene expression datasets. The method is able to identify complex relationships among genes and conditions that are difficult to identify by standard clustering algorithms.
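To make the factorization idea concrete, the sketch below runs ordinary NMF with the standard multiplicative updates for the squared-error objective on a small genes-by-conditions matrix, then assigns each gene and each condition to its dominant factor. It is only an illustration under stated assumptions: it is plain NMF rather than the non-smooth variant (nsNMF) described in the abstract, and the function names and toy matrix are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix v (genes x conditions) as w @ h."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


def frobenius_error(v, w, h):
    """Sum of squared differences between v and its reconstruction w @ h."""
    wh = matmul(w, h)
    return sum((a - b) ** 2 for rv, rw in zip(v, wh) for a, b in zip(rv, rw))


def dominant_factors(w, h):
    """Assign each gene (row of w) and condition (column of h) to one factor."""
    genes = [max(range(len(row)), key=row.__getitem__) for row in w]
    conditions = [max(range(len(col)), key=col.__getitem__) for col in transpose(h)]
    return genes, conditions


# Hypothetical toy matrix with two blocks of co-expressed genes.
toy = [[5.0, 5.0, 0.0, 0.0],
       [5.0, 5.0, 0.0, 0.0],
       [0.0, 0.0, 3.0, 3.0],
       [0.0, 0.0, 3.0, 3.0]]
toy_w, toy_h = nmf(toy, rank=2)
toy_genes, toy_conditions = dominant_factors(toy_w, toy_h)
```

Genes and conditions that share a dominant factor form a candidate bicluster; according to the abstract, the non-smooth variant is what yields the localized patterns that make such assignments clearer in real data.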
Affiliation(s)
- Pedro Carmona-Saez
- BioComputing Unit, National Center of Biotechnology, Campus Universidad Autónoma de Madrid, 28049, Spain
- Roberto D Pascual-Marqui
- The KEY Institute for Brain-Mind Research, University Hospital of Psychiatry, Lenggstr. 31, CH-8029 Zurich, Switzerland
- F Tirado
- Computer Architecture Department, Facultad de Ciencias Físicas, Universidad Complutense de Madrid, 28040, Spain
- Jose M Carazo
- BioComputing Unit, National Center of Biotechnology, Campus Universidad Autónoma de Madrid, 28049, Spain
- Alberto Pascual-Montano
- Computer Architecture Department, Facultad de Ciencias Físicas, Universidad Complutense de Madrid, 28040, Spain
|
48
|
Gulmann C, Espina V, Petricoin E, Longo DL, Santi M, Knutsen T, Raffeld M, Jaffe ES, Liotta LA, Feldman AL. Proteomic Analysis of Apoptotic Pathways Reveals Prognostic Factors in Follicular Lymphoma. Clin Cancer Res 2005; 11:5847-55. [PMID: 16115925 DOI: 10.1158/1078-0432.ccr-05-0637] [Citation(s) in RCA: 86] [Impact Index Per Article: 4.5] [Indexed: 11/16/2022]
Abstract
Follicular lymphoma (FL) is the second most common non-Hodgkin's lymphoma and generally is incurable. Reliable prognostic markers to differentiate patients who progress rapidly from those who survive for years with indolent disease have not been established. Most cases overexpress Bcl-2, but the pathogenesis of FL remains incompletely understood. To determine whether a proteomic approach could help overcome these obstacles, we procured lymphoid follicles from 20 cases of FL and 15 cases of benign follicular hyperplasia (FH) using laser capture microdissection. Lysates were spotted on reverse-phase protein microarrays and probed with 21 antibodies to proteins in the intrinsic apoptotic pathway, including those specific for posttranslational modifications such as phosphorylation. A panel of three antibodies [phospho-Akt(Ser473), Bcl-2, and cleaved poly(ADP-ribose) polymerase] segregated most cases of FL from FH. Phospho-Akt(Ser473) and Bcl-2 were significantly increased in FL (P = 0.001 and P < 0.0001, respectively). Additionally, the Bcl-2/Bak ratio completely segregated FL from FH. High ratios of Bcl-2/Bak and Bcl-2/Bax were associated with early death from disease with differences in median survival times of 7.3 years (P = 0.0085) and 3.8 years (P = 0.018), respectively. Using protein microarrays, we identified candidate proteins that may signify clinically relevant molecular events in FL. This approach showed significant changes at the posttranslational level, including Akt phosphorylation, and suggested new prognostic markers, including the Bcl-2/Bak and Bcl-2/Bax ratios. Proteomic end points should be incorporated in larger, multicenter trials to validate the clinical utility of these protein microarray findings.
Affiliation(s)
- Christian Gulmann
- National Cancer Institute--Food and Drug Administration Clinical Proteomics Program, Laboratory of Pathology, Bethesda, MD 20892, USA
|
49
|
Jiang YM, Yamamoto M, Kobayashi Y, Yoshihara T, Liang Y, Terao S, Takeuchi H, Ishigaki S, Katsuno M, Adachi H, Niwa JI, Tanaka F, Doyu M, Yoshida M, Hashizume Y, Sobue G. Gene expression profile of spinal motor neurons in sporadic amyotrophic lateral sclerosis. Ann Neurol 2005; 57:236-51. [PMID: 15668976 DOI: 10.1002/ana.20379] [Citation(s) in RCA: 191] [Impact Index Per Article: 10.1] [Indexed: 12/11/2022]
Abstract
The causative pathomechanism of sporadic amyotrophic lateral sclerosis (ALS) is not clearly understood. Using microarray technology combined with laser-captured microdissection, gene expression profiles of degenerating spinal motor neurons isolated from autopsied patients with sporadic ALS were examined. Gene expression was quantitatively assessed by real-time reverse transcription polymerase chain reaction and in situ hybridization. Spinal motor neurons showed a distinct gene expression profile from the whole spinal ventral horn. Three percent of genes examined were downregulated, and 1% were upregulated in motor neurons. Downregulated genes included those associated with cytoskeleton/axonal transport, transcription, and cell surface antigens/receptors, such as dynactin, microtubule-associated proteins, and early growth response 3 (EGR3). In contrast, cell death-associated genes were mostly upregulated. Promoters of the cell death pathway, death receptor 5, cyclins A1 and C, and caspases-1, -3, and -9, were upregulated, whereas cell death inhibitors, acetyl-CoA transporter, and NF-kappaB were also upregulated. Moreover, neuroprotective neurotrophic factors such as ciliary neurotrophic factor (CNTF), hepatocyte growth factor (HGF), and glial cell line-derived neurotrophic factor were upregulated. Inflammation-related genes, such as those belonging to the cytokine family, were not, however, significantly upregulated in either motor neurons or ventral horns. The motor neuron-specific gene expression profile in sporadic ALS can provide direct information on the genes leading to neurodegeneration and neuronal death and is helpful for developing new therapeutic strategies.
Affiliation(s)
- Yue-Mei Jiang
- Department of Neurology, Nagoya University Graduate School of Medicine, Nagoya, Japan
|
50
|
Atalay V, Cetin-Atalay R. Implicit motif distribution based hybrid computational kernel for sequence classification. Bioinformatics 2004; 21:1429-36. [PMID: 15598837 DOI: 10.1093/bioinformatics/bti212] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION We designed a general computational kernel for classification problems that require specific motif extraction and search from sequences. Instead of searching for explicit motifs, our approach finds the distribution of implicit motifs and uses it as a feature for classification. The implicit motif distribution approach may be used as a modus operandi for bioinformatics problems that require specific motif extraction and search, which is otherwise computationally prohibitive. RESULTS A system named P2SL that infers protein subcellular targeting was developed through this computational kernel. The targeting signal was modeled by the distribution of subsequence occurrences (implicit motifs) using self-organizing maps. The boundaries among the classes were then determined with a set of support vector machines. The P2SL hybrid computational system achieved a prediction accuracy of approximately 81% over ER-targeted, cytosolic, mitochondrial and nuclear protein localization classes. P2SL additionally offers the distribution potential of proteins among localization classes, which is particularly important for proteins that shuttle between the nucleus and the cytosol. AVAILABILITY http://staff.vbi.vt.edu/volkan/p2sl and http://www.i-cancer.fen.bilkent.edu.tr/p2sl CONTACT rengul@bilkent.edu.tr.
Affiliation(s)
- Volkan Atalay
- Virginia Bioinformatics Institute, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA
|