1
Liu S, Cheng H, Ashraf J, Zhang Y, Wang Q, Lv L, He M, Song G, Zuo D. Interpretation of convolutional neural networks reveals crucial sequence features involving in transcription during fiber development. BMC Bioinformatics 2022; 23:91. [PMID: 35291940 PMCID: PMC8922751 DOI: 10.1186/s12859-022-04619-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2021] [Accepted: 02/22/2022] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Upland cotton is the world's largest source of natural fiber. During fiber development, fiber quality and yield are influenced by gene transcription. Identifying sequence features related to transcription is therefore valuable for cotton molecular breeding. We applied convolutional neural networks to predict gene expression status from the sequences of gene transcription start regions. A gradient-based interpretation and an N-adjusted kernel transformation were then implemented to extract sequence features contributing to transcription. RESULTS: Our models achieved approximately 80% accuracy, and the area under the receiver operating characteristic curve exceeded 0.85. Gradient-based interpretation revealed that the 5' untranslated region contributed to gene transcription. Furthermore, 6 DOF binding motifs and 4 transcription activator binding motifs were obtained by N-adjusted kernel-motif transformation from models in three developmental stages. Apart from 10 general motifs, 3 DOF5.1 genes were also detected. In silico analysis of these motifs' binding proteins implied potential functions in fiber formation. We also found some novel plant motifs that are important sequence features for transcription. CONCLUSIONS: The N-adjusted kernel transformation method could interpret convolutional neural networks and reveal important sequence features related to transcription during fiber development. Potential functions of motifs interpreted from convolutional neural networks could be validated by further wet-lab experiments and applied in cotton molecular breeding.
Affiliation(s)
- Shang Liu
  - Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Anyang, 455000, China; Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, 450001, China
- Hailiang Cheng
  - Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Anyang, 455000, China; Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, 450001, China
- Javaria Ashraf
  - Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Anyang, 455000, China; Department of Plant Breeding and Genetics, University College of Agriculture and Environmental Sciences, The Islamia University of Bahawalpur, Punjab, 63100, Pakistan
- Youping Zhang
  - Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Anyang, 455000, China; Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, 450001, China
- Qiaolian Wang
  - Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Anyang, 455000, China; Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, 450001, China
- Limin Lv
  - Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Anyang, 455000, China; Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, 450001, China
- Man He
  - Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Anyang, 455000, China
- Guoli Song
  - Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Anyang, 455000, China; Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, 450001, China
- Dongyun Zuo
  - Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Anyang, 455000, China; Zhengzhou Research Base, State Key Laboratory of Cotton Biology, Zhengzhou University, Zhengzhou, 450001, China
3
Inácio de Carvalho V, de Carvalho M, Branscum AJ. Nonparametric Bayesian covariate‐adjusted estimation of the Youden index. Biometrics 2017; 73:1279-1288. [DOI: 10.1111/biom.12686] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 02/01/2016] [Revised: 01/01/2017] [Accepted: 01/01/2017] [Indexed: 11/29/2022]
Affiliation(s)
- Adam J. Branscum
  - College of Public Health and Human Sciences, Oregon State University, Oregon, U.S.A.