1
Mittal A, Ali SE, Mathews DH. Using the RNAstructure Software Package to Predict Conserved RNA Structures. Curr Protoc 2024; 4:e70054. [PMID: 39540715 DOI: 10.1002/cpz1.70054] [Indexed: 11/16/2024]
Abstract
The structures of many non-coding RNAs (ncRNA) are conserved by evolution to a greater extent than their sequences. By predicting the conserved structure of two or more homologous sequences, the accuracy of secondary structure prediction can be improved as compared to structure prediction for a single sequence. Here, we provide protocols for the use of four programs in the RNAstructure suite to predict conserved structures: Multilign, TurboFold, Dynalign, and PARTS. TurboFold iteratively aligns multiple homologous sequences and estimates the pairing probabilities for the conserved structure. Dynalign, PARTS, and Multilign are dynamic programming algorithms that simultaneously align sequences and identify the common secondary structure. Dynalign uses a pair of homologs and finds the lowest free energy common structure. PARTS uses a pair of homologs and estimates pairing probabilities from the base pairing probabilities estimated for each sequence. Multilign uses two or more homologs and finds the lowest free energy common structure using multiple pairwise calculations with Dynalign. It scales linearly with the number of sequences. We outline the strengths of each program. These programs can be run through web servers, on the command line, or with graphical user interfaces. © 2024 Wiley Periodicals LLC.
- Basic Protocol 1: Predicting a structure conserved in three or more sequences with the RNAstructure web server
- Basic Protocol 2: Predicting a structure conserved in two sequences with the RNAstructure web server
- Alternative Protocol 1: Predicting a structure conserved in multiple sequences in the RNAstructure graphical user interface
- Alternative Protocol 2: Predicting a structure conserved in two sequences with Dynalign in the RNAstructure graphical user interface
- Alternative Protocol 3: Running TurboFold on the command line
Affiliation(s)
- Abhinav Mittal
  - Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York
- Sara E Ali
  - Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York
- David H Mathews
  - Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, Rochester, New York
2
Zeng J, Song K, Wang J, Wen H, Zhou J, Ni T, Lu H, Yu Y. Characterization and optimization of 5´ untranslated region containing poly-adenine tracts in Kluyveromyces marxianus using machine-learning model. Microb Cell Fact 2024; 23:7. [PMID: 38172836 PMCID: PMC10763412 DOI: 10.1186/s12934-023-02271-3] [Received: 10/15/2023] [Accepted: 12/12/2023] [Indexed: 01/05/2024] [Open Access]
Abstract
BACKGROUND: The 5´ untranslated region (5´ UTR) plays a key role in regulating translation efficiency and mRNA stability, making it a favored target in genetic engineering and synthetic biology. A common feature found in the 5´ UTR is the poly-adenine (poly(A)) tract. However, the effect of 5´ UTR poly(A) on protein production remains controversial. Machine-learning models are powerful tools for explaining the complex contributions of features, but models incorporating features of 5´ UTR poly(A) are currently lacking. Thus, our goal is to construct such a model, using natural 5´ UTRs from Kluyveromyces marxianus, a promising cell factory for producing heterologous proteins.
RESULTS: We constructed a mini-library consisting of 207 5´ UTRs harboring poly(A) and 34 5´ UTRs without poly(A) from K. marxianus. The effects of each 5´ UTR on the production of a GFP reporter were evaluated individually in vivo, and the resulting protein abundance spanned an approximately 450-fold range throughout. The data were used to train a multi-layer perceptron neural network (MLP-NN) model that incorporated the length and position of poly(A) as features. The model exhibited good performance in predicting protein abundance (average R² = 0.7290). The model suggests that the length of poly(A) is negatively correlated with protein production, whereas poly(A) located between 10 and 30 nt upstream of the start codon (AUG) exhibits a weak positive effect on protein abundance. Using the model as guidance, the deletion or reduction of poly(A) upstream of 30 nt preceding AUG tended to improve the production of GFP and a feruloyl esterase. Deletions of poly(A) showed inconsistent effects on mRNA levels, suggesting that poly(A) represses protein production either with or without reducing mRNA levels.
CONCLUSION: The effects of poly(A) on protein production depend on its length and position. Integrating poly(A) features into machine-learning models improves simulation accuracy. Deleting or reducing poly(A) upstream of 30 nt preceding AUG tends to enhance protein production. This optimization strategy can be applied to enhance the yield of K. marxianus and other microbial cell factories.
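As a rough illustration of the two poly(A) features named in the abstract above (tract length and position relative to the start codon), the following Python sketch extracts them from a 5´ UTR sequence. The abstract does not say how a poly(A) tract is defined or how the features are encoded, so the function name `poly_a_features`, the minimum run length `min_len`, and the distance convention are all assumptions made for illustration, not the authors' method.

```python
import re


def poly_a_features(utr: str, min_len: int = 5) -> list[tuple[int, int]]:
    """Return (length, distance_to_AUG) for each poly(A) tract in a 5' UTR.

    `utr` is the sequence immediately upstream of the start codon, written
    5'->3' and excluding the AUG itself. `min_len` is an assumed threshold
    for calling a run of A's a poly(A) tract. The distance is the number of
    nucleotides between the 3' end of the tract and the AUG (an assumed
    convention).
    """
    seq = utr.upper()
    tracts = []
    # Find every maximal run of at least `min_len` consecutive A's.
    for match in re.finditer(r"A{%d,}" % min_len, seq):
        length = match.end() - match.start()
        distance = len(seq) - match.end()
        tracts.append((length, distance))
    return tracts


# Hypothetical UTR: an 8-nt A tract followed by 10 nt before the AUG.
example_utr = "GG" + "A" * 8 + "CUCUGCUCUG"
print(poly_a_features(example_utr))
```

Under these assumptions, a tract whose distance falls between 10 and 30 would lie in the window that the abstract associates with a weak positive effect, while tracts farther upstream would be the candidates for deletion or shortening.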
Affiliation(s)
- Junyuan Zeng
  - State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
  - Shanghai Engineering Research Center of Industrial Microorganisms, Shanghai, 200438, China
- Kunfeng Song
  - State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
  - Shanghai Engineering Research Center of Industrial Microorganisms, Shanghai, 200438, China
- Jingqi Wang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
  - Shanghai Engineering Research Center of Industrial Microorganisms, Shanghai, 200438, China
- Haimei Wen
  - State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
  - Shanghai Engineering Research Center of Industrial Microorganisms, Shanghai, 200438, China
- Jungang Zhou
  - State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
  - Shanghai Engineering Research Center of Industrial Microorganisms, Shanghai, 200438, China
- Ting Ni
  - State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
  - Shanghai Engineering Research Center of Industrial Microorganisms, Shanghai, 200438, China
- Hong Lu
  - State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
  - Shanghai Engineering Research Center of Industrial Microorganisms, Shanghai, 200438, China
- Yao Yu
  - State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
  - Shanghai Engineering Research Center of Industrial Microorganisms, Shanghai, 200438, China
3
Shang R, Lai EC. Parameters of clustered suboptimal miRNA biogenesis. Proc Natl Acad Sci U S A 2023; 120:e2306727120. [PMID: 37788316 PMCID: PMC10576077 DOI: 10.1073/pnas.2306727120] [Received: 04/24/2023] [Accepted: 08/23/2023] [Indexed: 10/05/2023] [Open Access]
Abstract
The nuclear cleavage of a suboptimal primary miRNA hairpin by the Drosha/DGCR8 complex ("Microprocessor") can be enhanced by an optimal miRNA neighbor, a phenomenon termed cluster assistance. Several features and biological impacts of this new layer of miRNA regulation are not fully known. Here, we elucidate the parameters of cluster assistance of a suboptimal miRNA and also reveal competitive interactions amongst optimal miRNAs within a cluster. We exploit cluster assistance as a functional assay for suboptimal processing and use this to invalidate putative suboptimal substrates, as well as identify a "solo" suboptimal miRNA. Finally, we report complexity in how specific mutations might affect the biogenesis of clustered miRNAs in disease contexts. This includes how an operon context can buffer the effect of a deleterious processing variant, but reciprocally how a point mutation can have a nonautonomous effect to impair the biogenesis of a clustered, suboptimal neighbor. These data expand our knowledge regarding regulated miRNA biogenesis in humans and represent a functional assay for empirical definition of suboptimal Microprocessor substrates.
Affiliation(s)
- Renfu Shang
  - Department of Developmental Biology, Sloan Kettering Institute, New York, NY 10065
- Eric C. Lai
  - Department of Developmental Biology, Sloan Kettering Institute, New York, NY 10065