Abstract

Recently, aCPSF1 was reported to function as the long-sought global transcription termination factor of archaea; however, its working mechanism remains elusive. This work, through analyzing transcript-3′end-sequencing data of Methanococcus maripaludis, found genome-wide positive correlations of both the terminator uridine (U)-tract and aCPSF1 with hierarchical transcription termination efficacies (TTEs). In vitro assays determined that aCPSF1 specifically binds to the terminator U-tract with U-tract number-related binding affinity, and in vivo assays demonstrated that the two elements are indispensable in dictating high TTEs, revealing that aCPSF1 and the terminator U-tract cooperatively determine high TTEs. The N-terminal KH domains equip aCPSF1 with specific-binding capacity to the terminator U-tract and underlie the aCPSF1-terminator U-tract cooperation, while the nuclease activity of aCPSF1 was also required for TTEs. aCPSF1 also guarantees the termination of transcripts with weak intrinsic terminator signals. aCPSF1 orthologs from Lokiarchaeota and Thaumarchaeota exhibited similar U-tract cooperation in dictating TTEs. Therefore, aCPSF1 and the intrinsic U-rich terminator could work in a noteworthy two-in-one termination mode in archaea, which may be widely employed by archaeal phyla; using one trans-action factor to recognize the U-rich terminator signal and cleave the transcript 3′-end, the archaeal aCPSF1-dependent transcription termination may represent a simplified archetypal mode of the eukaryotic RNA polymerase II termination machinery.

Editor's evaluation

The process of termination in archaeal species is poorly defined despite a close relationship to eukaryotes and a shared homology of termination factors.
In this study, the authors defined key features that drive termination to include an upstream uridine tract that is bound by the CPSF ribonuclease through KH RNA-binding domains not present in the CPSF counterparts. This work provides fundamental mechanistic insight into the conserved manner of termination in archaeal species. https://doi.org/10.7554/eLife.70464.sa0

Introduction

Transcription termination is an essential and highly regulated process in all forms of life, which not only determines the accurate 3′-end boundary of a transcript and transcription-related regulatory events, but is also important in shaping programmed transcriptomes of living organisms (Peters et al., 2011; Porrua et al., 2016; Porrua and Libri, 2015; Ray-Soni et al., 2016; Yue et al., 2020). Highly controlled transcription termination, which prevents undesired read-through-driven increases in the transcription of downstream coding regions and the accumulation of antisense transcripts, can be particularly important in prokaryotes because of their densely packed genomes (Peters et al., 2012; Yue et al., 2020). Research has indicated that bacteria primarily employ two transcription termination mechanisms, Rho-dependent and Rho-independent (intrinsic). In the Rho-dependent mechanism, the RNA translocase Rho recognizes a cytosine-rich sequence in nascent transcripts and, based on its ATPase activity, dissociates the processive transcription elongation complex (TEC). In contrast, intrinsic termination merely depends on a nascent RNA structure with a 7–8 base-paired hairpin followed by a run of uridines (Us) (Gusarov and Nudler, 1999; Peters et al., 2011; Porrua et al., 2016; Ray-Soni et al., 2016). A bacteria-like intrinsic termination mechanism that depends on a U-stretch is also found in the eukaryotic RNA polymerase (RNAP) III (Nielsen et al., 2013; Orioli et al., 2011).
Distinctively, transcription termination of the eukaryotic RNAP II, which transcribes mRNAs and non-coding RNAs, usually involves a transcript 3′-end processing event, in which the cleavage and polyadenylation factor complex (CPF/CPSF), under assistance of the accessory cleavage factors CFIA and CFIB, recognizes the termination signal, a poly(A) site at transcript 3′-end. Following recognition, the CPF/CPSF complex cleaves downstream the termination signal of the nascent RNA and polyadenylates at the cleaved 3′-end for mRNA maturation, and triggers RNAP II dissociation for transcription termination (Baejen et al., 2017; Eaton et al., 2018; Grzechnik et al., 2015; Kim et al., 2004; Kuehner et al., 2011; Larochelle et al., 2018; Porrua et al., 2016). Compared with bacteria and eukaryotes, knowledge of the transcription termination mechanisms in the third form of life, archaea, is very limited (Dar et al., 2016a; Maier and Marchfelder, 2019). Archaea represent a primary domain of cellular life and phylogenetically are more closely related to eukaryotes than bacteria (Eme et al., 2017; Williams et al., 2020; Zaremba-Niedzwiedzka et al., 2017). Specifically, archaea employ a eukaryotic RNAP II homolog, archaeal RNAP (aRNAP) (Werner and Grohmann, 2011), but have compact genomes with short intergenic regions (IGRs) and co-transcribed polycistronic operons, highlighting the importance of a controllable transcription termination. Earlier studies have suggested that, similar to the bacterial intrinsic termination, transcription termination of aRNAP may depend on a short U-rich sequence at the transcript 3′-end but with no strict requirements of an upstream hairpin structure (Hirtreiter et al., 2010; Maier and Marchfelder, 2019; Santangelo et al., 2009; Santangelo and Reeve, 2006; Spitalny and Thomm, 2008; Thomm et al., 1993). 
Recently, Term-seq, an approach that enables accurate mapping of all exposed RNA 3′-ends in prokaryotes and determines the transcription termination sites (TTSs) at the genome-wide level in representative bacteria and archaea (Dar et al., 2016b; Porrua et al., 2016; Yue et al., 2020), has been developed. Through Term-seq, U-rich sequences preceding TTSs, without preceding hairpin structures, were identified to be overrepresented in the transcripts of four representative archaeal species: Methanosarcina mazei, Sulfolobus acidocaldarius, Haloferax volcanii, and Methanococcus maripaludis (Berkemer et al., 2020; Dar et al., 2016b; Yue et al., 2020). Therefore, the U-tract sequences at the transcript 3′-ends are assumed to be the intrinsic termination signals of archaea; in addition, without strictly requiring an upstream hairpin structure in most of the archaeal terminator sequences suggests a distinct intrinsic termination mechanism of archaea from that of bacteria (Maier and Marchfelder, 2019). The protein factors that mediate archaeal transcription termination have been reported in recent years. The Thermococcus kodakarensis Eta (euryarchaeal termination activity) has been reported to transiently engage the TEC and release the stalled TEC from damaged DNA lesions, resembling the bacterial Mfd termination factor and functioning specifically in response to DNA damage (Walker et al., 2017). Most recently, aCPSF1, also named FttA (Factor that terminates transcription in Archaea), has been demonstrated as a transcription termination factor of archaea because it could competitively disrupt the processive TEC at normal transcription elongation rate and implement a kinetically competitive termination dependent on both the stalk domain of RNAP and the transcription elongation factor Spt4/5 in vitro (Sanders et al., 2020). 
aCPSF1 is affiliated within the β-CASP ribonuclease family, and is ubiquitously distributed in all archaeal phyla (Li et al., 2021; Phung et al., 2013; Yue et al., 2020). Initially, aCPSF1 was assumed to function in RNA maturation and turnover of Archaea (Clouet-d’Orval et al., 2015), and endoribonuclease activity was identified for three aCPSF1 orthologs in vitro (Levy et al., 2011; Phung et al., 2013; Silva et al., 2011), with one also exhibiting 5′–3′ exoribonuclease activity (Phung et al., 2013). Our recent study reported that aCPSF1, depending on its ribonuclease activity, controls in vivo transcription termination at the genome-wide level and ensures programmed transcriptome in M. maripaludis, and its orthologs from the distant relatives, Lokiarchaeota and Thaumarchaeota, implement the same function in termination (Yue et al., 2020). However, although the in vitro enzymatic assay determined that aCPSF1 primarily and endoribonucleolytically cleaves downstream of a U-rich motif that precedes the identified TTSs (Yue et al., 2020), some open questions remain, such as (i) whether the aCPSF1-dependent and the U-tract terminator-based intrinsic terminations are two independent mechanisms, or the two in fact work cooperatively in archaea, or (ii) if the aCPSF1-dependent termination simply serves as a backup mechanism for the genes/operons containing less-efficient intrinsic termination signals as assumed (Sanders et al., 2020; Wenck and Santangelo, 2020); (iii) what the exact sequence motifs that aCPSF1 recognizes are, and (iv) whether, like the eukaryotic multiple subunit composed termination complex, aCPSF1 also requires others to recognize the termination signals. In the present work, via an intensive analysis of the Term-seq data in M. maripaludis, we comprehensively evaluated the correlations of the transcription termination efficacies (TTEs) for all identified TTSs with both the cis-element U-tract terminator and the trans-action termination factor aCPSF1. 
Further, in combination with molecular and genetic validations, we determined that aCPSF1 and the terminator U-tract cooperatively dictate high TTEs. The in vitro and in vivo assays together demonstrated that the N-terminal K homolog (KH) domains of aCPSF1 specifically recognize and bind to the terminator U-tract. Therefore, the archaeal termination factor aCPSF1 could accomplish the U-tract terminator recognition and transcript 3′-end cleavage by itself, and the factor-dependent transcription termination may be the primary mechanism used by archaea.

Results

A positive correlation is found between the TTEs and the terminator four-uridine (U4) tract numbers preceding TTSs in M. maripaludis

In an attempt to evaluate the specific termination signals recognized by the termination factor aCPSF1 and its role in dictating the in vivo TTEs, we intensively reanalyzed the Term-seq data of M. maripaludis obtained previously (Yue et al., 2020). Following a stringent filtration workflow for TTS definition, and to preclude identifying sites derived from stale RNA processing or degradation products, TTS searching was restricted to within 200 nt downstream of the stop codon of a gene. This served to maximally enrich the authentic TTSs near the gene 3′-ends, and only sites that appeared in both biological replicates with high coverage (see Materials and methods) were selected. In total, 2,357 TTSs were obtained, including the previously identified 998 primary TTSs and 1,359 newly identified secondary TTSs (Supplementary file 1). Multiple consecutive TTSs were found in >50% of transcription units (TUs) of M. maripaludis (Figure 1—figure supplement 1), which could produce multiple isoforms of a transcript with varying 3′-UTRs, as found in M. mazei and S. acidocaldarius (Dar et al., 2016a).
Nevertheless, compared with the primary TTSs, which have the highest Term-seq reads among all identified 3′-end sites in each TU, much lower median read abundances, TTEs, and motif scores were found for the secondary TTSs (Figure 1—figure supplement 2). This indicates that TUs are mainly terminated at the primary TTSs, which were therefore used for further investigation. Sequence analysis of the 961 primary TTSs of coding TUs found a featured terminator motif, a 23 nt U-tract in which the four consecutive uridine nucleotides (U4) most proximal to the TTS show the strongest match. To evaluate the contribution of the U-tract sequence preceding TTSs to transcription termination in M. maripaludis, we first defined and calculated the TTE of each TU. Inspection of the genome-wide Term-seq mapping file revealed a sharp decrease in the mapping reads across the four nucleotides (nts) between sites +2 and −2 flanking the TTS (−1 nt) in the majority of the primary TTSs (Figure 1A and Figure 1—figure supplement 3). This indicates that transcription appears to be terminated most frequently within these four nucleotides, which were therefore defined as the TTS quadruplet. Pairwise comparison of the reads of each nt in a TTS quadruplet showed that the maximal abundance decrease occurred from site −2 nt (upstream) to +2 nt (downstream) flanking most TTSs (Figure 1B). Thus, the read ratio between the −2 and +2 nts was used as the measurement of TTE, calculated as TTE = 1 − [+2]/[−2], where [−2] and [+2] represent the read abundances at −2 nt and +2 nt in the Term-seq data, respectively.

Figure 1. A positive correlation is observed between the terminator U4-tract numbers and the TTEs among the TUs of M. maripaludis.
(A) A representative Term-seq map of MMP0020 showing a dramatic decrease of sequencing reads at the four nucleotides that flank the identified transcription termination site (TTS, −1 site indicated by the bent arrow). The magnified mapping region (dotted red frame) shows reads dramatically decreasing from −2 (two nts upstream) to +2 (two nts downstream) of the TTS. The chromosome locations of the genes are indicated at the top, and the Term-seq read heights are shown in brackets. (B) Box-plot diagrams showing the TTE statistics of the 998 transcripts, which were calculated based on the read ratio of nts +1 to −1 ([+1]/[−1]) and that of nts +2 to −2 ([+2]/[−2]), that is, of nts downstream to nts upstream of the primary TTSs. The box between the upper and lower lines covers the TTEs of 50% of transcripts, and the middle line represents the TTE median. (C) Logo representations of the terminator motif signatures in three groups of transcripts with different TTEs (> 60%, 30–60%, and < 30%). The transcript numbers of each group are indicated inside parentheses. The correlation of TTEs with the terminator U4-tract numbers was analyzed using the Wilcoxon test, and the P values between Groups I and II, I and III, and II and III were 3.4e-12, 2.22e-16, and 2.1e-5, respectively. (D) Box-plot diagrams showing the statistics of TTE values among the four groups of terminators that carry various numbers of U-tracts. The diagram representations are the same as those in (B). The statistical significance for the TTEs of the four groups analyzed by the Wilcoxon rank sum test is shown in Supplementary file 4c.

Figure 1—source data 1. Includes the statistical source data of Figure 1 and Figure 1—figure supplements 1 and 2. https://cdn.elifesciences.org/articles/70464/elife-70464-fig1-data1-v1.xlsx

Next, all identified TUs were ranked into three hierarchical groups: high TTE (> 60%), medium TTE (30% < TTE < 60%), and low TTE (< 30%) groups.
Statistically, approximately 32.5%, 44%, and 23.5% of TUs fell in the high, medium, and low TTE groups, respectively (Figure 1C). Sequence motifs, generated by Weblogo from −30 nt to +5 nt flanking the TTSs, showed characteristic U-rich tracts, each containing four consecutive uridine nucleotides (U4), preceding the TTSs among the overrepresented TUs in all three groups (Figure 1C). Noticeably, a positive correlation was found between the TTE and the terminator U4-tract numbers (P<2.2e-16, Spearman's correlation = 0.33): two or more U4-tracts were overrepresented in the high TTE group, while the U4-tract was underrepresented in the low TTE group (Figure 1C). To further evaluate the correlation between the U4-tracts and the TTEs, we first classified all of the defined TUs into four groups based on the U4-tract numbers preceding the primary TTSs (Figure 1—figure supplement 4A), and then generated the sequence motif (Figure 1—figure supplement 4B) and calculated the TTE distribution (Figure 1D) for each group. Similarly, a marked positive correlation between the U4-tract numbers and the TTEs was observed: TUs in the groups with >2, 2, 1, and 0 U4-tracts had median TTEs of 55.5%, 52.1%, 43.3%, and 30%, respectively (Figure 1D), demonstrating that TUs with more U4-tracts had higher TTEs. These analyses suggest that the U4-tract preceding the TTS could be a key signal (motif) in causing RNAP to pause and in triggering transcription termination, and that more U4-tracts could result in higher TTEs in M. maripaludis.
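The TTE definition and the U4-tract grouping used in this section can be sketched in Python as follows. This is only a minimal illustration, not the pipeline used in the study: the function names are hypothetical, a U4-tract is assumed here to be any maximal run of four or more consecutive uridines in the sequence upstream of the TTS, and the handling of TTEs exactly at 30% or 60% is an assumption, since the text leaves those boundaries open.

```python
import re


def tte(reads_minus2: float, reads_plus2: float) -> float:
    """Transcription termination efficacy: TTE = 1 - [+2]/[-2]."""
    if reads_minus2 <= 0:
        raise ValueError("read abundance at -2 nt must be positive")
    return 1.0 - reads_plus2 / reads_minus2


def count_u4_tracts(upstream_seq: str) -> int:
    """Count U4-tracts in the sequence preceding a TTS.

    Assumption (not stated in the source): each maximal run of >= 4
    consecutive U counts as one tract. DNA-style input (T) is accepted.
    """
    rna = upstream_seq.upper().replace("T", "U")
    return len(re.findall(r"U{4,}", rna))


def tte_group(value: float) -> str:
    """Assign a TU to the high (> 60%), medium, or low (< 30%) TTE group.

    Assumption: values of exactly 0.60 go to 'medium' and exactly 0.30 to 'low'.
    """
    if value > 0.60:
        return "high"
    if value > 0.30:
        return "medium"
    return "low"


if __name__ == "__main__":
    # Made-up read counts and sequence, for illustration only.
    example_tte = tte(reads_minus2=100, reads_plus2=25)
    print(example_tte, tte_group(example_tte))
    print(count_u4_tracts("AUUUUGCAUUUUUUA"))
```

Grouping TUs by `count_u4_tracts` and comparing the median `tte` per group would mirror the comparison summarized in Figure 1D, provided the tract-counting rule matches the one actually applied to the data.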
Concurrence of the terminator U-tract and termination factor aCPSF1 in dictating high TTEs

Based on our recent finding that aCPSF1 functions as the archaeal general transcription termination factor (Yue et al., 2020), we quantitatively compared the Term-seq identified TTEs in the wild-type (WT) strain and the aCPSF1 expression depleted strain (▽aCPSF1, a mutant retaining a residual 20% aCPSF1 abundance compared to WT at 22 °C), and found an average 50% reduction in the TTEs of primary TTSs in ▽aCPSF1 (Figure 2A and Figure 2—figure supplement 1). Further, to quantify the contribution of aCPSF1 to TTE, the aCPSF1 dependency of a TU in transcription termination was calculated based on its TTS quadruplet read changes in ▽aCPSF1 compared to WT, using the TTS quadruplet read ratio (TQRR):

$$\mathrm{TQRR} = \frac{\left([+2]/[-2]\right)_{\mathrm{S2}}}{\left([+2]/[-2]\right)_{\triangledown\mathrm{aCPSF1}}}$$

where [+2]/[−2] is the read ratio between the +2 and −2 nts of the TTS quadruplet in the wild-type strain S2 (numerator) and in ▽aCPSF1 (denominator). TUs with TQRR <1, that is, TUs whose read decrease from −2 to +2 nt in the TTS quadruplet is reduced by aCPSF1 depletion, were identified as showing aCPSF1-dependent termination. Unexpectedly, we found that 91.6% (880/961) of coding TUs have TQRR <1 (Figure 1—figure supplement 4A and Supplementary file 2), and observed an approximately linear correlation between TQRR and TTE for all 961 coding TUs studied (Figure 2B). This finding indicates that the majority of TUs were terminated in an aCPSF1-dependent manner, and that the higher the TTE of a TU, the stronger its dependency on aCPSF1. Therefore, aCPSF1, like the terminator U4-tracts, could display a positive correlation with TTEs and play a key role in dictating high TTEs at the genome-wide level.

Figure 2. Co-occurrence of aCPSF1 and the terminator U4-tract is correlated with the genome-wide TTEs of M. maripaludis.
(A) Visualized Term-seq read maps of the representative genes, MMP0511 (top) and MMP0760 (bottom), show a sharper read decrease between the −2 and +2 nts (dotted frame), respectively up- and downstream of the TTSs (−1), in the wild-type (WT) strain (S2) than in the aCPSF1 depletion mutant (▽aCPSF1). The chromosome locations of the genes are indicated at the top. The bent arrow indicates the Term-seq identified TTS. TQRR represents the TTS quadruplet read ratio of a TU in WT (S2) to that in ▽aCPSF1, with lower values representing a higher aCPSF1 dependency of a TU in transcription termination. The mapping read heights are shown inside the brackets. The TTE is calculated as above. (B) A linear correlation is observed between the TQRRs and TTEs of 961 protein coding TUs. (C) Logo representations of the terminator motif signatures are shown for the highly aCPSF1-dependent (TQRR ≤60%), moderately aCPSF1-dependent (60%< TQRR < 100%), and non-dependent (TQRR ≥100%) groups. The TU numbers of each group are shown in parentheses. (D) Box-plot diagrams showing the TQRR (aCPSF1 dependency) statistics of the terminators carrying >2, 2, 1, and 0 U4-tracts. The box between the upper and lower lines covers the TQRRs of 50% of transcripts, and the middle line represents the TQRR median. The statistical significance for the TQRRs of the four groups analyzed by the Wilcoxon rank sum test is shown in Supplementary file 4d.

Figure 2—source data 1. Includes the statistical source data of Figure 2 and Figure 2—figure supplements 1 and 2. https://cdn.elifesciences.org/articles/70464/elife-70464-fig2-data1-v1.xlsx

Subsequently, we explored the cis-elements that may determine the aCPSF1-dependent TTEs, that is, the cis-elements recognized by aCPSF1, by statistically analyzing the correlation between the aCPSF1 dependency of TTEs and the presence of sequence motifs preceding the TTSs of the coding TUs.
We classified the 961 TUs into three groups based on the TQRRs: the highly aCPSF1-dependent (TQRR ≤0.6), moderately aCPSF1-dependent (0.6< TQRR < 1), and non-aCPSF1-dependent (TQRR ≥1) groups, and generated the TTS preceding sequence motif for each group using Weblogo. Interestingly, we found not only that 29.3% (282/961) and 62.2% (597/961) of TUs belonged to the highly and moderately aCPSF1-dependent groups, respectively, and only 8.4% (81/961) of TUs belonged to the aCPSF1-independent group, but also a significant positive correlation between the aCPSF1 dependency and the numbers of U4-tracts preceding TTSs: the higher the aCPSF1 dependency of a TU group, the more prominent its U4-tracts (Figure 2C). Additionally, we evaluated the relationship between the aCPSF1 dependency and the four U4-tract TU groups analyzed in Figure 1D. We found that 94.5% (736/779) of TUs in the groups with ≥1 U4-tracts have a TQRR <1 (Figure 1—figure supplement 4A), and that the TU groups with >2, 2, 1, and 0 U4-tracts had median TQRRs of 0.655, 0.67, 0.76, and 0.92, respectively. These findings indicate that the majority of TUs with a U4-tract depend on aCPSF1 for termination, and that the TU groups containing more U4-tracts have lower median TQRR values, namely, higher aCPSF1 dependency (Figure 2D). Notably, even 79.1% (144/182) of TUs with 0 U4-tract had a TQRR <1 (Figure 1—figure supplement 4A), suggesting that the transcription of these TUs with weak terminators can also be terminated with the assistance of aCPSF1. Additionally, among the 8.4% (81/961) of TUs that fell into the aCPSF1-independent group, only 3.95% (38 of 961) had 0 U4-tract (Figure 1—figure supplement 4A), suggesting that these very few TUs may be terminated by mechanisms independent of both aCPSF1 and the U4-tract terminator, or that the TTSs identified for these TUs are potential RNA processing sites derived from stale RNA processing or degradation products.
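The aCPSF1-dependency measure and the three-way grouping described above can be sketched in Python. This is a hedged illustration only: it follows the Figure 2 legend in taking TQRR as the [+2]/[−2] read ratio in WT (S2) divided by the same ratio in ▽aCPSF1, the function names are hypothetical, and raising an error on zero read counts (rather than, say, adding pseudocounts) is an assumption about how such cases might be handled.

```python
def tqrr(wt_minus2: float, wt_plus2: float,
         dep_minus2: float, dep_plus2: float) -> float:
    """TTS quadruplet read ratio: ([+2]/[-2]) in WT over ([+2]/[-2]) in the depletion strain."""
    if wt_minus2 <= 0 or dep_minus2 <= 0 or dep_plus2 <= 0:
        # Assumption: sites with zero reads in a denominator are rejected.
        raise ValueError("read abundances used as denominators must be positive")
    wt_ratio = wt_plus2 / wt_minus2
    depleted_ratio = dep_plus2 / dep_minus2
    return wt_ratio / depleted_ratio


def dependency_group(value: float) -> str:
    """Classify a TU by TQRR using the thresholds given in the text."""
    if value <= 0.6:
        return "highly aCPSF1-dependent"
    if value < 1.0:
        return "moderately aCPSF1-dependent"
    return "non-aCPSF1-dependent"


if __name__ == "__main__":
    # Made-up read counts: strong read drop in WT, weaker drop after depletion.
    value = tqrr(wt_minus2=100, wt_plus2=20, dep_minus2=100, dep_plus2=50)
    print(round(value, 3), dependency_group(value))
```

A lower value means that the read drop across the TTS quadruplet is weakened more strongly when aCPSF1 is depleted, which is why TQRR <1 is read as aCPSF1-dependent termination.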
Similar transcriptional termination features were also found in the non-coding RNAs as follows: (i) a shorter U-tract terminator motif (U-tract) preceding the TTSs (Figure 2—figure supplement 2A and Supplementary file 3), (ii) a positive correlation between the TTE and the terminator U-tract length (Figure 2—figure supplement 2B), and (iii) a linear correlation between TQRRs and TTEs (Figure 2—figure supplement 2C). Consistently, by examining the transcription pattern of two representative non-coding RNAs, we found prolonged transcript 3′-ends (Figure 2—figure supplement 3A) and detected transcription readthrough using Northern blot (Figure 2—figure supplement 3B). These findings indicate that transcription termination of non-coding RNAs could resemble that of coding TUs and likewise depend on both the U-tract terminator and the termination factor aCPSF1. Collectively, these results indicate that both the terminator cis-element U4-tracts and the trans-action termination factor aCPSF1 exhibit a high positive correlation with the TTEs, and that the two appear to cooperatively dictate high TTEs at a genome-wide level in vivo.

aCPSF1 specifically binds to RNAs embedding the terminator U4-tract sequence in vitro

The collaboration of aCPSF1 and the terminator U-tract in dictating TTE suggests that this termination factor may specifically recognize the terminator U-tract motif embedded in the nascent transcript 3′-end to dictate archaeal transcription termination. To test this hypothesis, we first assayed the binding ability of aCPSF1 to three synthetic RNAs that contain the terminator U-tract sequences of the transcripts MMP0901, MMP1149, and MMP1100, which were determined to be cleaved by the recombinant aCPSF1 in our previous study (Yue et al., 2020). An RNA fragment with the transcript 3′-end sequence of MMP1697 lacking a U-tract was included as a control.
Using RNA electrophoretic mobility shift assay (rEMSA), shifted protein–RNA complex bands could be observed for the three U-tract containing RNAs, but not for the RNA without a U-tract, at the same concentrations of aCPSF1 (Figure 3—figure supplement 1). Next, 12 additional RNA fragments with a uniform length of 36 nt, covering the transcript 3′-end sequences of the genes listed in Figure 3 from 30 nts upstream to 5 nts downstream of the TTS, were used to compare the binding ability of aCPSF1. These sequences were derived from six transcripts embedding ≥2 U4-tracts (Figure 3A), three transcripts embedding 1 U4-tract (Figure 3B), and three with no (0) U4-tract (Figure 3C), respectively. The rEMSA results indicated that aCPSF1 exhibited the strongest binding to those with ≥2 U4-tracts, an average ~5 fold weaker binding to those with 1 U4-tract, and the weakest binding to those with 0 U4-tract (Figure 3 and Figure 3—figure supplement 2). In support of this, a surface plasmon resonance (SPR) assay using the same concentrations of aCPSF1 determined the highest and lowest resonance units (RUs) for the RNA containing the longest U-tract from the MMP0400 3′-end and that carrying the shortest U-tract from the MMP1406 3′-end, respectively (Figure 3—figure supplement 3). Therefore, these results demonstrated that aCPSF1 specifically recognizes the transcripts with U-tract sequences and binds preferentially to the transcripts carrying more U4-tracts at the 3′-ends.

Figure 3. Binding specificity of aCPSF1 to RNAs carrying different numbers of U4-tracts determined by rEMSA. RNAs with a uniform length of 36 nt derived from the indicated gene terminators that carry ≥2 U4-tracts (A), 1 U4-tract (B), and 0 U4-tract (C) were used as the binding substrates. RNA sequences are shown in the top panels with red letters indicating Term-seq identified TTSs.
The gradient concentrations of aCPSF1 used in the binding reactions are indicated at the top of gels. Detailed binding procedure is described in the Materials and methods section. The arrows and red asterisks indicate the free RNA substrates and the shifted RNA-aCPSF1 complexes, respectively. The binding assay for each RNA substrate was performed in triplicate. Equilibrium dissociation constants (Kd) were calculated from the binding curves based on the quantification of unbound and bound substrates, and the average Kd and standard deviations are shown.

Figure 3—source data 1. Includes the gel source data of Figure 3. https://cdn.elifesciences.org/articles/70464/elife-70464-fig3-data1-v1.zip

Next, the minimal RNA length and U-tract sequence required for aCPSF1 binding were investigated. The 36 nt RNA sequences embedding ≥2 U4-tracts from the transcript 3′-ends of MMP0204 and MMP0400 were sequentially truncated by six nts from the 5′-end to generate 30 nt, 24 nt, and 18 nt RNA substrates. The rEMSA results indicated that aCPSF1 bound to the 36 nt, 30 nt and 24 nt RNAs with similar affinity, but had an average ~2 fold reduced affinity to the RNA of 18 nt (Figure 4A and Figure 4—figure supplements 1A and 2). To confirm the role of the U-tracts in determining the binding specificity of aCPSF1, base mutation was performed on either one U-tract (18nt-M1 and 18nt-M2) or both U-tracts (18nt-M3) in the 18 nt RNAs. The rEMSA results indicated that mutation of either one U4-tract or both U-tracts remarkably reduced the binding ability of aCPSF1 to the RNA substrate (Figure 4B and Figure 4—figure supplements 1B and 2).
Reciprocally, when two As were mutated to Us to increase the U-tract number from 1 to 2 on the RNA with the MMP0229 3′-end sequence (T0229-18nt and T0229-18nt-M1, respectively), the binding affinity of aCPSF1 was notably increased compared to T0229-18nt, whereas mutation of the four Us to four As in T0229-18nt to obtain an RNA sequence lacking a U-tract (T0229-18nt-M2) completely abolished aCPSF1 binding (Figure 4C). Furthermore, a footprint assay was performed on the above RNA with the T0204 terminator sequence using RNase I digestion in the absence or presence of aCPSF1, and a clear footprint of aCPSF1 on the U-tract region was found (Figure 4—figure supplement 3). These results demonstrated that both U-tracts in the transcript 3′-end are necessary for efficient and specific binding of aCPSF1, and that the U-tract region is the exact region that aCPSF1 binds to. Subsequently, the minimum length of the consecutive uridines in the two U-tracts was evaluated. U to C mutation was performed on the two-U-tract RNA T0204-24nt to shorten the consecutive uridines, generating T0204-DU5, T0204-DU4, and T0204-DU3, which carry 5, 4, or 3 consecutive uridines in each U-tract, respectively. The rEMSA results indicated that aCPSF1 efficiently bound to T0204-DU5 and T0204-DU4, but could not bind to T0204-DU3. This suggests that two U4-tracts are the minimum terminator sequence for efficient binding of aCPSF1 (Figure 4D). Collectively, the in vitro RNA binding experiments demonstrated that an RNA fragment embedding two U4-tracts and with a minimum length of 18 nt is the cis-element required by the termination factor aCPSF1 for efficient and specific binding.

Figure 4. The minimal RNA length and U-tract base stringency required for the specific binding of aCPSF1 determined by rEMSA.
RNAs with indicated lengths and base mutations shown in the top panels that are derived from the native terminator sequences of MMP0204 (T0204) and MMP0229 (T0229) were used as the binding substrat
