Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Liu Q, Xia F, Yin Q, Jiang R. Chromatin accessibility prediction via a hybrid deep convolutional neural network. Bioinformatics 2018;34:732-738. [PMID: 29069282 PMCID: PMC6192215 DOI: 10.1093/bioinformatics/btx679] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2017] [Accepted: 10/20/2017] [Indexed: 11/26/2022] Open

For:	Liu Q, Xia F, Yin Q, Jiang R. Chromatin accessibility prediction via a hybrid deep convolutional neural network. Bioinformatics 2018;34:732-738. [PMID: 29069282 PMCID: PMC6192215 DOI: 10.1093/bioinformatics/btx679] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2017] [Accepted: 10/20/2017] [Indexed: 11/26/2022] Open

Number

Cited by Other Article(s)

Zhao L, Hao R, Chai Z, Fu W, Yang W, Li C, Liu Q, Jiang Y. DeepOCR: A multi-species deep-learning framework for accurate identification of open chromatin regions in livestock. Comput Biol Chem 2024;110:108077. [PMID: 38691895 DOI: 10.1016/j.compbiolchem.2024.108077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/11/2024] [Revised: 03/27/2024] [Accepted: 04/16/2024] [Indexed: 05/03/2024]

Kim J, Seok J. ctGAN: combined transformation of gene expression and survival data with generative adversarial network. Brief Bioinform 2024;25:bbae325. [PMID: 38980369 PMCID: PMC11232285 DOI: 10.1093/bib/bbae325] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/19/2024] [Revised: 05/29/2024] [Accepted: 06/21/2024] [Indexed: 07/10/2024] Open

Gao Z, Liu Q, Zeng W, Jiang R, Wong WH. EpiGePT: a Pretrained Transformer model for epigenomics. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.07.15.549134. [PMID: 37502861 PMCID: PMC10370089 DOI: 10.1101/2023.07.15.549134] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 07/29/2023]

Sigala RE, Lagou V, Shmeliov A, Atito S, Kouchaki S, Awais M, Prokopenko I, Mahdi A, Demirkan A. Machine Learning to Advance Human Genome-Wide Association Studies. Genes (Basel) 2023;15:34. [PMID: 38254924 PMCID: PMC10815885 DOI: 10.3390/genes15010034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2023] [Revised: 12/19/2023] [Accepted: 12/22/2023] [Indexed: 01/24/2024] Open

Affiliation(s)

Rafaella E. Sigala Section of Statistical Multi-Omics, Department of Clinical and Experimental Medicine, Guildford GU2 7XH, Surrey, UK; (R.E.S.); (V.L.); (A.S.); (I.P.)
Vasiliki Lagou Section of Statistical Multi-Omics, Department of Clinical and Experimental Medicine, Guildford GU2 7XH, Surrey, UK; (R.E.S.); (V.L.); (A.S.); (I.P.)
Aleksey Shmeliov Section of Statistical Multi-Omics, Department of Clinical and Experimental Medicine, Guildford GU2 7XH, Surrey, UK; (R.E.S.); (V.L.); (A.S.); (I.P.)
Sara Atito Surrey Institute for People-Centred Artificial Intelligence, University of Surrey, Guildford GU2 7XH, Surrey, UK; (S.A.); (S.K.); (M.A.) Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford GU2 7XH, Surrey, UK
Samaneh Kouchaki Surrey Institute for People-Centred Artificial Intelligence, University of Surrey, Guildford GU2 7XH, Surrey, UK; (S.A.); (S.K.); (M.A.) Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford GU2 7XH, Surrey, UK
Muhammad Awais Surrey Institute for People-Centred Artificial Intelligence, University of Surrey, Guildford GU2 7XH, Surrey, UK; (S.A.); (S.K.); (M.A.) Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford GU2 7XH, Surrey, UK
Inga Prokopenko Section of Statistical Multi-Omics, Department of Clinical and Experimental Medicine, Guildford GU2 7XH, Surrey, UK; (R.E.S.); (V.L.); (A.S.); (I.P.) Surrey Institute for People-Centred Artificial Intelligence, University of Surrey, Guildford GU2 7XH, Surrey, UK; (S.A.); (S.K.); (M.A.)
Adam Mahdi Oxford Internet Institute, University of Oxford, Oxford OX1 3JS, Oxfordshire, UK;
Ayse Demirkan Section of Statistical Multi-Omics, Department of Clinical and Experimental Medicine, Guildford GU2 7XH, Surrey, UK; (R.E.S.); (V.L.); (A.S.); (I.P.) Surrey Institute for People-Centred Artificial Intelligence, University of Surrey, Guildford GU2 7XH, Surrey, UK; (S.A.); (S.K.); (M.A.)

Collapse

Deng Z, Ji Y, Han B, Tan Z, Ren Y, Gao J, Chen N, Ma C, Zhang Y, Yao Y, Lu H, Huang H, Xu M, Chen L, Zheng L, Gu J, Xiong D, Zhao J, Gu J, Chen Z, Wang K. Early detection of hepatocellular carcinoma via no end-repair enzymatic methylation sequencing of cell-free DNA and pre-trained neural network. Genome Med 2023;15:93. [PMID: 37936230 PMCID: PMC10631027 DOI: 10.1186/s13073-023-01238-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/27/2022] [Accepted: 09/26/2023] [Indexed: 11/09/2023] Open

Abstract

BACKGROUND

Early detection of hepatocellular carcinoma (HCC) is important in order to improve patient prognosis and survival rate. Methylation sequencing combined with neural networks to identify cell-free DNA (cfDNA) carrying aberrant methylation offers an appealing and non-invasive approach for HCC detection. However, some limitations exist in traditional methylation detection technologies and models, which may impede their performance in the read-level detection of HCC.

METHODS

We developed a low DNA damage and high-fidelity methylation detection method called No End-repair Enzymatic Methyl-seq (NEEM-seq). We further developed a read-level neural detection model called DeepTrace that can better identify HCC-derived sequencing reads through a pre-trained and fine-tuned neural network. After pre-training on 11 million reads from NEEM-seq, DeepTrace was fine-tuned using 1.2 million HCC-derived reads from tumor tissue DNA after noise reduction, and 2.7 million non-tumor reads from non-tumor cfDNA. We validated the model using data from 130 individuals with cfDNA whole-genome NEEM-seq at around 1.6X depth.

RESULTS

NEEM-seq overcomes the drawbacks of traditional enzymatic methylation sequencing methods by avoiding the introduction of unmethylation errors in cfDNA. DeepTrace outperformed other models in identifying HCC-derived reads and detecting HCC individuals. Based on the whole-genome NEEM-seq data of cfDNA, our model showed high accuracy of 96.2%, sensitivity of 93.6%, and specificity of 98.5% in the validation cohort consisting of 62 HCC patients, 48 liver disease patients, and 20 healthy individuals. In the early stage of HCC (BCLC 0/A and TNM I), the sensitivity of DeepTrace was 89.6 and 89.5% respectively, outperforming Alpha Fetoprotein (AFP) which showed much lower sensitivity in both BCLC 0/A (50.5%) and TNM I (44.7%).

CONCLUSIONS

By combining high-fidelity methylation data from NEEM-seq with the DeepTrace model, our method has great potential for HCC early detection with high sensitivity and specificity, making it potentially suitable for clinical applications. DeepTrace: https://github.com/Bamrock/DeepTrace.

Collapse

Affiliation(s)

Zhenzhong Deng Department of Oncology, Xinhua Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China
Yongkun Ji BamRock Research Department, Suzhou BamRock Biotechnology Ltd., Suzhou, Jiangsu Province, China
Bing Han Division of Hepatobiliary and Transplantation Surgery, Department of General Surgery, Nanjing Drum Tower Hospital, the Affiliated Hospital of Medical School, Nanjing University, Nanjing, China Department of Transplantation, Xinhua Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China
Zhongming Tan Hepatobiliary Center, the First Affiliated Hospital of Nanjing Medical University, Nanjing, Jiangsu Province, China
Yuqi Ren College of Intelligence and Computing, Tianjin University, Tianjin, China
Jinghan Gao Department of Software Engineering, Tsinghua University, Beijing, China
Nan Chen National Clinical Research Center for Hematologic Diseases, Jiangsu Institute of Hematology, The First Affiliated Hospital of Soochow University, Suzhou, Jiangsu Province, China
Cong Ma Suzhou Known Biotechnology Ltd, Suzhou, Jiangsu Province, China
Yichi Zhang Department of Transplantation, Xinhua Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China
Yunhai Yao Infectious Disease Department, the First Affiliated Hospital of Soochow University, Suzhou, Jiangsu Province, China
Hong Lu Infectious Disease Department, the First Affiliated Hospital of Soochow University, Suzhou, Jiangsu Province, China
Heqing Huang Infectious Disease Department, the First Affiliated Hospital of Soochow University, Suzhou, Jiangsu Province, China
Midie Xu Department of Pathology, Fudan University Shanghai Cancer Center, Shanghai, China Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
Lei Chen Department of Pathology, Xinhua Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China
Leizhen Zheng Department of Oncology, Xinhua Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China
Jianchun Gu Department of Oncology, Xinhua Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China.
Deyi Xiong College of Intelligence and Computing, Tianjin University, Tianjin, China.
Jianxin Zhao Department of Interventional Medicine, the affiliated hospital of infectious diseases of Soochow University, Suzhou, 215131, Jiangsu Province, China.
Jinyang Gu Department of Transplantation, Xinhua Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China. Liver Transplantation Center, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei Province, China.
Zutao Chen Infectious Disease Department, the First Affiliated Hospital of Soochow University, Suzhou, Jiangsu Province, China. Suzhou Key Laboratory of Pathogen Bioscience and Anti-Infective Medicine, Suzhou, Jiangsu Province, China.
Ke Wang Hepatobiliary Center, the First Affiliated Hospital of Nanjing Medical University, Nanjing, Jiangsu Province, China.

Collapse

Khodursky S, Zheng EB, Svetec N, Durkin SM, Benjamin S, Gadau A, Wu X, Zhao L. The evolution and mutational robustness of chromatin accessibility in Drosophila. Genome Biol 2023;24:232. [PMID: 37845780 PMCID: PMC10578003 DOI: 10.1186/s13059-023-03079-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2022] [Accepted: 09/29/2023] [Indexed: 10/18/2023] Open

Abstract

BACKGROUND

The evolution of genomic regulatory regions plays a critical role in shaping the diversity of life. While this process is primarily sequence-dependent, the enormous complexity of biological systems complicates the understanding of the factors underlying regulation and its evolution. Here, we apply deep neural networks as a tool to investigate the sequence determinants underlying chromatin accessibility in different species and tissues of Drosophila.

RESULTS

We train hybrid convolution-attention neural networks to accurately predict ATAC-seq peaks using only local DNA sequences as input. We show that our models generalize well across substantially evolutionarily diverged species of insects, implying that the sequence determinants of accessibility are highly conserved. Using our model to examine species-specific gains in accessibility, we find evidence suggesting that these regions may be ancestrally poised for evolution. Using in silico mutagenesis, we show that accessibility can be accurately predicted from short subsequences in each example. However, in silico knock-out of these sequences does not qualitatively impair classification, implying that accessibility is mutationally robust. Subsequently, we show that accessibility is predicted to be robust to large-scale random mutation even in the absence of selection. Conversely, simulations under strong selection demonstrate that accessibility can be extremely malleable despite its robustness. Finally, we identify motifs predictive of accessibility, recovering both novel and previously known motifs.

CONCLUSIONS

These results demonstrate the conservation of the sequence determinants of accessibility and the general robustness of chromatin accessibility, as well as the power of deep neural networks to explore fundamental questions in regulatory genomics and evolution.

Collapse

Prakash A, Banerjee M. An interpretable block-attention network for identifying regulatory feature interactions. Brief Bioinform 2023;24:bbad250. [PMID: 37401370 DOI: 10.1093/bib/bbad250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/28/2022] [Revised: 05/15/2023] [Accepted: 06/16/2023] [Indexed: 07/05/2023] Open

Wang M, Yan L, Jia J, Lai J, Zhou H, Yu B. DE-MHAIPs: Identification of SARS-CoV-2 phosphorylation sites based on differential evolution multi-feature learning and multi-head attention mechanism. Comput Biol Med 2023;160:106935. [PMID: 37120990 PMCID: PMC10140648 DOI: 10.1016/j.compbiomed.2023.106935] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/20/2023] [Revised: 03/12/2023] [Accepted: 04/13/2023] [Indexed: 05/02/2023]

Jing Y, Zhang S, Wang H. DapNet-HLA: Adaptive dual-attention mechanism network based on deep learning to predict non-classical HLA binding sites. Anal Biochem 2023;666:115075. [PMID: 36740003 DOI: 10.1016/j.ab.2023.115075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2022] [Revised: 01/30/2023] [Accepted: 02/02/2023] [Indexed: 02/05/2023]

Quan L, Chu X, Sun X, Wu T, Lyu Q. How Deepbics Quantifies Intensities of Transcription Factor-DNA Binding and Facilitates Prediction of Single Nucleotide Variant Pathogenicity With a Deep Learning Model Trained On ChIP-Seq Data Sets. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023;20:1594-1599. [PMID: 35471887 DOI: 10.1109/tcbb.2022.3170343] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]

Gan M, Ma Y. Mapping user interest into hyper-spherical space: A novel POI recommendation method. Inf Process Manag 2023. [DOI: 10.1016/j.ipm.2022.103169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]

Liu Q, Zeng W, Zhang W, Wang S, Chen H, Jiang R, Zhou M, Zhang S. Deep generative modeling and clustering of single cell Hi-C data. Brief Bioinform 2023;24:6858951. [PMID: 36458445 DOI: 10.1093/bib/bbac494] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/20/2022] [Revised: 09/28/2022] [Accepted: 10/18/2022] [Indexed: 12/05/2022] Open

Accurate Prediction of Anti-hypertensive Peptides Based on Convolutional Neural Network and Gated Recurrent unit. Interdiscip Sci 2022;14:879-894. [PMID: 35474167 DOI: 10.1007/s12539-022-00521-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/05/2021] [Revised: 03/30/2022] [Accepted: 04/06/2022] [Indexed: 12/30/2022]

Abstract

Hypertension (HT) is a general disease, and also one of the most ordinary and major causes of cardiovascular disease. Some diseases are caused by high blood pressure, including impairment of heart and kidney function, cerebral hemorrhage and myocardial infarction. Due to the limitations of laboratory methods, bioactive peptides for the treatment of HT need a long time to be identified. Therefore, it is of great immediate significance for the identification of anti-hypertensive peptides (AHTPs). With the prevalence of machine learning, it is suggested to use it as a supplementary method for AHTPs classification. Therefore, we develop a new model to identify AHTPs based on multiple features and deep learning. And the deep model is constructed by combining a convolutional neural network (CNN) and a gated recurrent unit (GRU). The unique convolution structure is used to reduce the feature dimension and running time. The data processed by CNN is input into the recurrent structure GRU, and important information is filtered out through the reset gate and update gate. Finally, the output layer adopts Sigmoid activation function. Firstly, we use Kmer, the deviation between the dipeptide frequency and the expected mean (DDE), encoding based on grouped weight (EBGW), enhanced grouped amino acid composition (EGAAC) and dipeptide binary profile and frequency (DBPF) to extract features. For Kmer, DDE, EBGW and EGAAC, it is widely used in the field of protein research. DBPF is a new feature representation method designed by us. It corresponds dipeptides to binary numbers, and finally obtains a binary coding file and a frequency file. Then these features are spliced together and input into our proposed model for prediction and analysis. After a tenfold cross-validation test, this model has a better competitive advantage than the previous methods, and the accuracy is 96.23% and 99.10%, respectively. From the results, compared with the previous methods, it has been greatly improved. It shows that the combination of convolution calculation and recurrent structure has a positive impact on the classification of AHTPs. The results show that this method is a feasible, efficient and competitive sequence analysis tool for AHTPs. Meanwhile, we design a friendly online prediction tool and it is freely accessible at http://ahtps.zhanglab.site/ .

Collapse

Wang H, Li H, Gao W, Xie J. PrUb-EL: A hybrid framework based on deep learning for identifying ubiquitination sites in Arabidopsis thaliana using ensemble learning strategy. Anal Biochem 2022;658:114935. [PMID: 36206844 DOI: 10.1016/j.ab.2022.114935] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/08/2022] [Revised: 09/25/2022] [Accepted: 09/26/2022] [Indexed: 12/30/2022]

Abstract

Identification of ubiquitination sites is central to many biological experiments. Ubiquitination is a kind of post-translational protein modification (PTM). It is a key mechanism for increasing protein diversity and plays a vital role in regulating cell function. In recent years, many models have been developed to predict ubiquitination sites in humans, mice and yeast. However, few studies have predicted ubiquitination sites in Arabidopsis thaliana. In view of this, a deep network model named PrUb-EL is proposed to predict ubiquitination sites in Arabidopsis thaliana. Firstly, six features based on the protein sequence are extracted with amino acid index database (AAindex), dipeptide deviates from the expected mean (DDE), dipeptide composition (DPC), blocks substitution matrix (BLOSUM62), enhanced amino acid composition (EAAC) and binary encoding. Secondly, the synthetic minority over-sampling technique (SMOTE) is utilized to process the imbalanced data set. Then a new classifier named DG is presented, which includes Dense block, Residual block and Gated recurrent unit (GRU) block. Finally, each of six feature extraction methods is integrated into the DG model, and the ensemble learning strategy is used to gain the final prediction result. Experimental results show that PrUb-EL has good predictive ability with the accuracy (ACC) and area under the ROC curve (auROC) values of 91.00% and 97.70% using 5-fold cross-validation, respectively. Note that the values of ACC and auROC are 88.58% and 96.09% in the independent test, respectively. Compared with previous studies, our model has significantly improved performance thus it is an excellent method for identifying ubiquitination sites in Arabidopsis thaliana. The datasets and code used for the article are available at https://github.com/Tom-Wangy/PreUb-EL.git.

Collapse

Li X, Zhang S, Shi H. An improved residual network using deep fusion for identifying RNA 5-methylcytosine sites. Bioinformatics 2022;38:4271-4277. [PMID: 35866985 DOI: 10.1093/bioinformatics/btac532] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2022] [Revised: 06/30/2022] [Accepted: 07/21/2022] [Indexed: 12/24/2022] Open

Abstract

MOTIVATION

5-Methylcytosine (m5C) is a crucial post-transcriptional modification. With the development of technology, it is widely found in various RNAs. Numerous studies have indicated that m5C plays an essential role in various activities of organisms, such as tRNA recognition, stabilization of RNA structure, RNA metabolism and so on. Traditional identification is costly and time-consuming by wet biological experiments. Therefore, computational models are commonly used to identify the m5C sites. Due to the vast computing advantages of deep learning, it is feasible to construct the predictive model through deep learning algorithms.

RESULTS

In this study, we construct a model to identify m5C based on a deep fusion approach with an improved residual network. First, sequence features are extracted from the RNA sequences using Kmer, K-tuple nucleotide frequency component (KNFC), Pseudo dinucleotide composition (PseDNC) and Physical and chemical property (PCP). Kmer and KNFC extract information from a statistical point of view. PseDNC and PCP extract information from the physicochemical properties of RNA sequences. Then, two parts of information are fused with new features using bidirectional long- and short-term memory and attention mechanisms, respectively. Immediately after, the fused features are fed into the improved residual network for classification. Finally, 10-fold cross-validation and independent set testing are used to verify the credibility of the model. The results show that the accuracy reaches 91.87%, 95.55%, 92.27% and 95.60% on the training sets and independent test sets of Arabidopsis thaliana and M.musculus, respectively. This is a considerable improvement compared to previous studies and demonstrates the robust performance of our model.

AVAILABILITY AND IMPLEMENTATION

The data and code related to the study are available at https://github.com/alivelxj/m5c-DFRESG.

Collapse

Wrightsman T, Marand AP, Crisp PA, Springer NM, Buckler ES. Modeling chromatin state from sequence across angiosperms using recurrent convolutional neural networks. THE PLANT GENOME 2022;15:e20249. [PMID: 35924336 DOI: 10.1002/tpg2.20249] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/08/2021] [Accepted: 06/20/2022] [Indexed: 06/06/2024]

Alharbi WS, Rashid M. A review of deep learning applications in human genomics using next-generation sequencing data. Hum Genomics 2022;16:26. [PMID: 35879805 PMCID: PMC9317091 DOI: 10.1186/s40246-022-00396-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/24/2021] [Accepted: 07/12/2022] [Indexed: 12/02/2022] Open

Dodlapati S, Jiang Z, Sun J. Completing Single-Cell DNA Methylome Profiles via Transfer Learning Together With KL-Divergence. Front Genet 2022;13:910439. [PMID: 35938031 PMCID: PMC9353187 DOI: 10.3389/fgene.2022.910439] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2022] [Accepted: 05/25/2022] [Indexed: 11/13/2022] Open

Asim MN, Ibrahim MA, Malik MI, Razzak I, Dengel A, Ahmed S. Histone-Net: a multi-paradigm computational framework for histone occupancy and modification prediction. COMPLEX INTELL SYST 2022. [DOI: 10.1007/s40747-022-00802-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/17/2022]

Yang CH, Wu KC, Chuang LY, Chang HW. DeepBarcoding: Deep Learning for Species Classification Using DNA Barcoding. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022;19:2158-2165. [PMID: 33600318 DOI: 10.1109/tcbb.2021.3056570] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]

Routhier E, Mozziconacci J. Genomics enters the deep learning era. PeerJ 2022;10:e13613. [PMID: 35769139 PMCID: PMC9235815 DOI: 10.7717/peerj.13613] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/24/2022] [Accepted: 05/30/2022] [Indexed: 01/17/2023] Open

Liu Q, Hua K, Zhang X, Wong WH, Jiang R. DeepCAGE: Incorporating Transcription Factors in Genome-wide Prediction of Chromatin Accessibility. GENOMICS, PROTEOMICS & BIOINFORMATICS 2022;20:496-507. [PMID: 35293310 PMCID: PMC9801045 DOI: 10.1016/j.gpb.2021.08.015] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 01/31/2021] [Revised: 05/31/2021] [Accepted: 09/27/2021] [Indexed: 01/26/2023]

Yu B, Zhang Y, Wang X, Gao H, Sun J, Gao X. Identification of DNA modification sites based on elastic net and bidirectional gated recurrent unit with convolutional neural network. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.103566] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023]

Yin Q, Liu Q, Fu Z, Zeng W, Zhang B, Zhang X, Jiang R, Lv H. scGraph: a graph neural network-based approach to automatically identify cell types. Bioinformatics 2022;38:2996-3003. [PMID: 35394015 DOI: 10.1093/bioinformatics/btac199] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2021] [Revised: 12/13/2021] [Accepted: 04/07/2020] [Indexed: 11/13/2022] Open

Affiliation(s)

Qijin Yin Ministry of Education Key Laboratory of Bioinformatics, Research Department of Bioinformatics at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing 100084, China
Qiao Liu Department of Statistics, Stanford University Stanford, CA 94305
Zhuoran Fu Ministry of Education Key Laboratory of Bioinformatics, Research Department of Bioinformatics at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing 100084, China
Wanwen Zeng Department of Statistics, Stanford University Stanford, CA 94305.,College of Software, Nankai University, Tianjin, 300350, China
Boheng Zhang Ministry of Education Key Laboratory of Bioinformatics, Research Department of Bioinformatics at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing 100084, China
Xuegong Zhang Ministry of Education Key Laboratory of Bioinformatics, Research Department of Bioinformatics at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing 100084, China
Rui Jiang Ministry of Education Key Laboratory of Bioinformatics, Research Department of Bioinformatics at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing 100084, China
Hairong Lv Ministry of Education Key Laboratory of Bioinformatics, Research Department of Bioinformatics at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing 100084, China.,Fuzhou Institute of Data Technology, Changle, Fuzhou, 350200, China

Collapse

SemanticCAP: Chromatin Accessibility Prediction Enhanced by Features Learning from a Language Model. Genes (Basel) 2022;13:genes13040568. [PMID: 35456374 PMCID: PMC9028922 DOI: 10.3390/genes13040568] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/17/2022] [Revised: 03/22/2022] [Accepted: 03/22/2022] [Indexed: 11/16/2022] Open

Liu S, Cheng H, Ashraf J, Zhang Y, Wang Q, Lv L, He M, Song G, Zuo D. Interpretation of convolutional neural networks reveals crucial sequence features involving in transcription during fiber development. BMC Bioinformatics 2022;23:91. [PMID: 35291940 PMCID: PMC8922751 DOI: 10.1186/s12859-022-04619-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2021] [Accepted: 02/22/2022] [Indexed: 11/10/2022] Open

Affiliation(s)

Shang Liu Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Anyang, 455000, China.,Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, 450001, China
Hailiang Cheng Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Anyang, 455000, China.,Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, 450001, China
Javaria Ashraf Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Anyang, 455000, China.,Department of Plant Breeding and Genetics, University College of Agriculture and Environmental Sciences, The Islamia University of Bahawalpur, Punjab, 63100, Pakistan
Youping Zhang Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Anyang, 455000, China.,Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, 450001, China
Qiaolian Wang Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Anyang, 455000, China.,Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, 450001, China
Limin Lv Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Anyang, 455000, China.,Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, 450001, China
Man He Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Anyang, 455000, China
Guoli Song Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Anyang, 455000, China. .,Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, 450001, China.
Dongyun Zuo Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Anyang, 455000, China. .,Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, 450001, China.

Collapse

Base-resolution prediction of transcription factor binding signals by a deep learning framework. PLoS Comput Biol 2022;18:e1009941. [PMID: 35263332 PMCID: PMC8982852 DOI: 10.1371/journal.pcbi.1009941] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/05/2021] [Revised: 04/05/2022] [Accepted: 02/19/2022] [Indexed: 01/13/2023] Open

Air Pollution and Perinatal Health in the Eastern Mediterranean Region: Challenges, Limitations, and the Potential of Epigenetics. Curr Environ Health Rep 2022;9:1-10. [PMID: 35080743 DOI: 10.1007/s40572-022-00337-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 12/26/2021] [Indexed: 11/03/2022]

Zhao Y, Dong Y, Hong W, Jiang C, Yao K, Cheng C. Computational modeling of chromatin accessibility identified important epigenomic regulators. BMC Genomics 2022;23:19. [PMID: 34996354 PMCID: PMC8742372 DOI: 10.1186/s12864-021-08234-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/30/2021] [Accepted: 12/03/2021] [Indexed: 11/28/2022] Open

Vaz JM, Balaji S. Convolutional neural networks (CNNs): concepts and applications in pharmacogenomics. Mol Divers 2021;25:1569-1584. [PMID: 34031788 PMCID: PMC8342355 DOI: 10.1007/s11030-021-10225-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/06/2021] [Accepted: 04/21/2021] [Indexed: 12/17/2022]

Atak ZK, Taskiran II, Demeulemeester J, Flerin C, Mauduit D, Minnoye L, Hulselmans G, Christiaens V, Ghanem GE, Wouters J, Aerts S. Interpretation of allele-specific chromatin accessibility using cell state-aware deep learning. Genome Res 2021;31:1082-1096. [PMID: 33832990 PMCID: PMC8168584 DOI: 10.1101/gr.260851.120] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/30/2020] [Accepted: 04/05/2021] [Indexed: 12/26/2022]

Liu Q, Hu Z, Jiang R, Zhou M. DeepCDR: a hybrid graph convolutional network for predicting cancer drug response. Bioinformatics 2021;36:i911-i918. [PMID: 33381841 DOI: 10.1093/bioinformatics/btaa822] [Citation(s) in RCA: 84] [Impact Index Per Article: 28.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/19/2022] Open

Chen S, Gan M, Lv H, Jiang R. DeepCAPE: A Deep Convolutional Neural Network for the Accurate Prediction of Enhancers. GENOMICS PROTEOMICS & BIOINFORMATICS 2021;19:565-577. [PMID: 33581335 PMCID: PMC9040020 DOI: 10.1016/j.gpb.2019.04.006] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/24/2018] [Revised: 03/15/2019] [Accepted: 04/29/2019] [Indexed: 12/12/2022]

Abstract

The establishment of a landscape of enhancers across human cells is crucial to deciphering the mechanism of gene regulation, cell differentiation, and disease development. High-throughput experimental approaches, which contain successfully reported enhancers in typical cell lines, are still too costly and time-consuming to perform systematic identification of enhancers specific to different cell lines. Existing computational methods, capable of predicting regulatory elements purely relying on DNA sequences, lack the power of cell line-specific screening. Recent studies have suggested that chromatin accessibility of a DNA segment is closely related to its potential function in regulation, and thus may provide useful information in identifying regulatory elements. Motivated by the aforementioned understanding, we integrate DNA sequences and chromatin accessibility data to accurately predict enhancers in a cell line-specific manner. We proposed DeepCAPE, a deep convolutional neural network to predict enhancers via the integration of DNA sequences and DNase-seq data. Benefitting from the well-designed feature extraction mechanism and skip connection strategy, our model not only consistently outperforms existing methods in the imbalanced classification of cell line-specific enhancers against background sequences, but also has the ability to self-adapt to different sizes of datasets. Besides, with the adoption of auto-encoder, our model is capable of making cross-cell line predictions. We further visualize kernels of the first convolutional layer and show the match of identified sequence signatures and known motifs. We finally demonstrate the potential ability of our model to explain functional implications of putative disease-associated genetic variants and discriminate disease-related enhancers. The source code and detailed tutorial of DeepCAPE are freely available at https://github.com/ShengquanChen/DeepCAPE.

Collapse

Jing F, Zhang SW, Cao Z, Zhang S. An Integrative Framework for Combining Sequence and Epigenomic Data to Predict Transcription Factor Binding Sites Using Deep Learning. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021;18:355-364. [PMID: 30835229 DOI: 10.1109/tcbb.2019.2901789] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]

The progress on the estimation of DNA methylation level and the detection of abnormal methylation. QUANTITATIVE BIOLOGY 2021. [DOI: 10.15302/j-qb-022-0289] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]

He Y, Shen Z, Zhang Q, Wang S, Huang DS. A survey on deep learning in DNA/RNA motif mining. Brief Bioinform 2020;22:5916939. [PMID: 33005921 PMCID: PMC8293829 DOI: 10.1093/bib/bbaa229] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2020] [Revised: 08/19/2020] [Accepted: 08/24/2020] [Indexed: 01/18/2023] Open

Zeng W, Wang Y, Jiang R. Integrating distal and proximal information to predict gene expression via a densely connected convolutional neural network. Bioinformatics 2020;36:496-503. [PMID: 31318408 DOI: 10.1093/bioinformatics/btz562] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2018] [Revised: 05/19/2019] [Accepted: 07/16/2019] [Indexed: 01/05/2023] Open

Abstract

MOTIVATION

Interactions among cis-regulatory elements such as enhancers and promoters are main driving forces shaping context-specific chromatin structure and gene expression. Although there have been computational methods for predicting gene expression from genomic and epigenomic information, most of them neglect long-range enhancer-promoter interactions, due to the difficulty in precisely linking regulatory enhancers to target genes. Recently, HiChIP, a novel high-throughput experimental approach, has generated comprehensive data on high-resolution interactions between promoters and distal enhancers. Moreover, plenty of studies suggest that deep learning achieves state-of-the-art performance in epigenomic signal prediction, and thus promoting the understanding of regulatory elements. In consideration of these two factors, we integrate proximal promoter sequences and HiChIP distal enhancer-promoter interactions to accurately predict gene expression.

RESULTS

We propose DeepExpression, a densely connected convolutional neural network, to predict gene expression using both promoter sequences and enhancer-promoter interactions. We demonstrate that our model consistently outperforms baseline methods, not only in the classification of binary gene expression status but also in regression of continuous gene expression levels, in both cross-validation experiments and cross-cell line predictions. We show that the sequential promoter information is more informative than the experimental enhancer information; meanwhile, the enhancer-promoter interactions within ±100 kbp around the TSS of a gene are most beneficial. We finally visualize motifs in both promoter and enhancer regions and show the match of identified sequence signatures with known motifs. We expect to see a wide spectrum of applications using HiChIP data in deciphering the mechanism of gene regulation.

AVAILABILITY AND IMPLEMENTATION

DeepExpression is freely available at https://github.com/wanwenzeng/DeepExpression.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Collapse

Fu L, Peng Q, Chai L. Predicting DNA Methylation States with Hybrid Information Based Deep-Learning Model. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2020;17:1721-1728. [PMID: 30951477 DOI: 10.1109/tcbb.2019.2909237] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]

Liu Q, Lv H, Jiang R. hicGAN infers super resolution Hi-C data with generative adversarial networks. Bioinformatics 2020;35:i99-i107. [PMID: 31510693 PMCID: PMC6612845 DOI: 10.1093/bioinformatics/btz317] [Citation(s) in RCA: 40] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022] Open

Attentive gated neural networks for identifying chromatin accessibility. Neural Comput Appl 2020. [DOI: 10.1007/s00521-020-04879-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]

Xu C, Liu Q, Zhou J, Xie M, Feng J, Jiang T. Quantifying functional impact of non-coding variants with multi-task Bayesian neural network. Bioinformatics 2020;36:1397-1404. [PMID: 31693090 DOI: 10.1093/bioinformatics/btz767] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/27/2019] [Revised: 09/29/2019] [Accepted: 11/04/2019] [Indexed: 11/14/2022] Open

Abstract

MOTIVATION

Advances in high-throughput genotyping and sequencing technologies during recent years have revealed essential roles of non-coding regions in gene regulation. Genome-wide association studies (GWAS) suggested that a large proportion of risk variants are located in non-coding regions and remain unexplained by current expression quantitative trait loci catalogs. Interpreting the causal effects of these genetic modifications is crucial but difficult owing to our limited knowledge of how regulatory elements function. Although several computational methods have been designed to prioritize regulatory variants that substantially impact human phenotypes, few of them achieve consistently high performance even when large-scale multi-omic data are integrated.

RESULTS

We propose a novel multi-task framework based on Bayesian deep neural networks, MtBNN, to quantify the deleterious impact of single nucleotide polymorphisms in non-coding genomic regions. With the high-efficiency provided by the multi-task Bayesian framework to integrate information from different sources, MtBNN is capable of extracting features from genomic sequences of large-scale chromatin-profiling data, such as chromatin accessibility and transcript factor binding affinities, and calculating the distribution of the probability that a non-coding variant disrupts regulatory activities. A series of comprehensive experiments show that MtBNN quantifies the functional impact of cis-regulatory variations with high accuracy, including expression quantitative trait locus, DNase I sensitivity quantitative trait locus and functional genetic variants located within ATAC-peaks that affect the accessibility of the corresponding peak and achieves significantly better performance than the existing methods. Moreover, MtBNN has applications in the discovery of potentially causal disease-associated single-nucleotide polymorphisms (SNPs), thus helping fine-map the GWAS SNPs.

AVAILABILITY AND IMPLEMENTATION

Code can be downloaded from https://github.com/Zoesgithub/MtBNN.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Collapse

Yan F, Powell DR, Curtis DJ, Wong NC. From reads to insight: a hitchhiker's guide to ATAC-seq data analysis. Genome Biol 2020;21:22. [PMID: 32014034 PMCID: PMC6996192 DOI: 10.1186/s13059-020-1929-3] [Citation(s) in RCA: 204] [Impact Index Per Article: 51.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2019] [Accepted: 01/08/2020] [Indexed: 12/16/2022] Open

Guo Y, Zhou D, Nie R, Ruan X, Li W. DeepANF: A deep attentive neural framework with distributed representation for chromatin accessibility prediction. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2019.10.091] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/30/2023]

Niu X, Yang K, Zhang G, Yang Z, Hu X. A Pretraining-Retraining Strategy of Deep Learning Improves Cell-Specific Enhancer Predictions. Front Genet 2020;10:1305. [PMID: 31969903 PMCID: PMC6960260 DOI: 10.3389/fgene.2019.01305] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/06/2019] [Accepted: 11/26/2019] [Indexed: 01/22/2023] Open

Abstract

Deciphering the code of cis-regulatory element (CRE) is one of the core issues of today’s biology. Enhancers are distal CREs and play significant roles in gene transcriptional regulation. Although identifications of enhancer locations across the whole genome [discriminative enhancer predictions (DEP)] is necessary, it is more important to predict in which specific cell or tissue types, they will be activated and functional [tissue-specific enhancer predictions (TSEP)]. Although existing deep learning models achieved great successes in DEP, they cannot be directly employed in TSEP because a specific cell or tissue type only has a limited number of available enhancer samples for training. Here, we first adopted a reported deep learning architecture and then developed a novel training strategy named “pretraining-retraining strategy” (PRS) for TSEP by decomposing the whole training process into two successive stages: a pretraining stage is designed to train with the whole enhancer data for performing DEP, and a retraining strategy is then designed to train with tissue-specific enhancer samples based on the trained pretraining model for making TSEP. As a result, PRS is found to be valid for DEP with an AUC of 0.922 and a GM (geometric mean) of 0.696, when testing on a larger-scale FANTOM5 enhancer dataset via a five-fold cross-validation. Interestingly, based on the trained pretraining model, a new finding is that only additional twenty epochs are needed to complete the retraining process on testing 23 specific tissues or cell lines. For TSEP tasks, PRS achieved a mean GM of 0.806 which is significantly higher than 0.528 of gkm-SVM, an existing mainstream method for CRE predictions. Notably, PRS is further proven superior to other two state-of-the-art methods: DEEP and BiRen. In summary, PRS has employed useful ideas from the domain of transfer learning and is a reliable method for TSEPs.

Collapse

Henderson J, Ly V, Olichwier S, Chainani P, Liu Y, Soibam B. Accurate prediction of boundaries of high resolution topologically associated domains (TADs) in fruit flies using deep learning. Nucleic Acids Res 2020;47:e78. [PMID: 31049567 PMCID: PMC6648328 DOI: 10.1093/nar/gkz315] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/09/2019] [Revised: 04/02/2019] [Accepted: 04/18/2019] [Indexed: 01/01/2023] Open

Su X, Xu J, Yin Y, Quan X, Zhang H. Antimicrobial peptide identification using multi-scale convolutional network. BMC Bioinformatics 2019;20:730. [PMID: 31870282 PMCID: PMC6929291 DOI: 10.1186/s12859-019-3327-y] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/17/2019] [Accepted: 12/16/2019] [Indexed: 01/14/2023] Open

Song S, Cui H, Chen S, Liu Q, Jiang R. EpiFIT: functional interpretation of transcription factors based on combination of sequence and epigenetic information. QUANTITATIVE BIOLOGY 2019. [DOI: 10.1007/s40484-019-0175-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]

Yin Q, Wu M, Liu Q, Lv H, Jiang R. DeepHistone: a deep learning approach to predicting histone modifications. BMC Genomics 2019;20:193. [PMID: 30967126 PMCID: PMC6456942 DOI: 10.1186/s12864-019-5489-4] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/11/2022] Open

Abstract

MOTIVATION

Quantitative detection of histone modifications has emerged in the recent years as a major means for understanding such biological processes as chromosome packaging, transcriptional activation, and DNA damage. However, high-throughput experimental techniques such as ChIP-seq are usually expensive and time-consuming, prohibiting the establishment of a histone modification landscape for hundreds of cell types across dozens of histone markers. These disadvantages have been appealing for computational methods to complement experimental approaches towards large-scale analysis of histone modifications.

RESULTS

We proposed a deep learning framework to integrate sequence information and chromatin accessibility data for the accurate prediction of modification sites specific to different histone markers. Our method, named DeepHistone, outperformed several baseline methods in a series of comprehensive validation experiments, not only within an epigenome but also across epigenomes. Besides, sequence signatures automatically extracted by our method was consistent with known transcription factor binding sites, thereby giving insights into regulatory signatures of histone modifications. As an application, our method was shown to be able to distinguish functional single nucleotide polymorphisms from their nearby genetic variants, thereby having the potential to be used for exploring functional implications of putative disease-associated genetic variants.

CONCLUSIONS

DeepHistone demonstrated the possibility of using a deep learning framework to integrate DNA sequence and experimental data for predicting epigenomic signals. With the state-of-the-art performance, DeepHistone was expected to shed light on a variety of epigenomic studies. DeepHistone is freely available in https://github.com/QijinYin/DeepHistone .

Collapse

Zou Q, Xing P, Wei L, Liu B. Gene2vec: gene subsequence embedding for prediction of mammalian N⁶-methyladenosine sites from mRNA. RNA (NEW YORK, N.Y.) 2019;25:205-218. [PMID: 30425123 PMCID: PMC6348985 DOI: 10.1261/rna.069112.118] [Citation(s) in RCA: 311] [Impact Index Per Article: 62.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/03/2018] [Accepted: 11/01/2018] [Indexed: 05/20/2023]

Xue L, Tang B, Chen W, Luo J. DeepT3: deep convolutional neural networks accurately identify Gram-negative bacterial type III secreted effectors using the N-terminal sequence. Bioinformatics 2018;35:2051-2057. [DOI: 10.1093/bioinformatics/bty931] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/26/2018] [Revised: 10/22/2018] [Accepted: 11/07/2018] [Indexed: 11/12/2022] Open