Abstract

Alternative polyadenylation yields many mRNA isoforms whose 3’ termini occur disproportionately in clusters within 3’ untranslated regions. Previously, we showed that profiles of poly(A) site usage are regulated by the rate of transcriptional elongation by RNA polymerase (Pol) II (Geisberg et al., 2020). Pol II derivatives with slow elongation rates confer an upstream-shifted poly(A) profile, whereas fast Pol II strains confer a downstream-shifted poly(A) profile. Within yeast isoform clusters, these shifts occur steadily from one isoform to the next across nucleotide distances. In contrast, the shift between clusters – from the last isoform of one cluster to the first isoform of the next – is much less pronounced, even over large distances. GC content in a region 13–30 nt downstream from isoform clusters correlates with their sensitivity to Pol II elongation rate. In human cells, the upstream shift caused by a slow Pol II mutant also occurs continuously at single nucleotide resolution within clusters but not between them. Pol II occupancy increases just downstream of poly(A) sites, suggesting a linkage between reduced elongation rate and cluster formation.
These observations suggest that (1) Pol II elongation speed affects the nucleotide-level dwell time allowing polyadenylation to occur, (2) poly(A) site clusters are linked to the local elongation rate, and hence do not arise simply by intrinsically imprecise cleavage and polyadenylation of the RNA substrate, (3) DNA sequence elements can affect Pol II elongation and poly(A) profiles, and (4) the cleavage/polyadenylation and Pol II elongation complexes are spatially, and perhaps physically, coupled so that polyadenylation occurs rapidly upon emergence of the nascent RNA from the Pol II elongation complex.

Editor's evaluation

Geisberg et al. show, in yeast and human cells, a nucleotide-level relationship between the transcriptional elongation rate and the polyadenylation profile. This suggests that the cleavage/polyadenylation and Pol II elongation complexes are spatially, and perhaps physically, coupled so that polyadenylation occurs rapidly upon emergence of nascent RNA from the Pol II elongation complex. Furthermore, the GC content of sequences downstream of poly(A) clusters influences 3’ isoform cluster profiles by slowing down elongation, allowing more time for the 3’-cleavage complex to find the poly(A) site. These findings contribute new information on how the transcription machinery determines which poly(A) sites are utilized at the end of genes. https://doi.org/10.7554/eLife.83153.sa0

Introduction

The 3’ ends of eukaryotic mRNAs are generated during the process of transcriptional elongation by cleavage of the nascent transcript downstream of the coding region followed by addition of a poly(A) tail (Proudfoot et al., 2002; Tian and Manley, 2013; Tian and Manley, 2017; Kumar et al., 2019). Formation of 3’ ends is mediated by a multiprotein cleavage/polyadenylation (CpA) complex that performs both steps.
Alternative polyadenylation gives rise to many 3’ mRNA isoforms differing by the position at which the poly(A) tail is added. The poly(A) profile of a typical yeast gene has ~50 mRNA isoforms with 3’ endpoints occurring within an ‘end zone’ of ~200 nt (Ozsolak et al., 2010; Moqtaderi et al., 2013; Pelechano et al., 2013). The 3’ untranslated region (3’ UTR) is a modular entity that is sufficient to determine the poly(A) profile (Lui et al., 2022). Although mRNA isoforms with neighboring 3’ ends usually have similar properties, they can differ dramatically with respect to mRNA stability, structure throughout the 3’ UTR, and association with Pab1, the poly(A)-binding protein (Geisberg et al., 2014; Moqtaderi et al., 2018). Although polyadenylation occurs at numerous sites within the 3’ UTR, it rarely occurs within coding regions (Moqtaderi et al., 2013) and introns (Berg et al., 2012), even though these are usually much larger. This apparently paradoxical observation has implications for the specificity and mechanism of the CpA machinery, and hence, the poly(A) profile. Polyadenylation in yeast cells is associated with a degenerate sequence motif consisting of a long AU-rich stretch, followed by short U-rich sequences that flank several A residues immediately downstream of the cleavage site (Guo and Sherman, 1996; Moqtaderi et al., 2013). It has been suggested that long AU-rich stretches, which are not encountered until after coding regions, are important for restricting polyadenylation to 3’ UTRs (Lui et al., 2022). In metazoan mRNAs, an AAUAAA sequence is specifically recognized by the CpA complex (Chan et al., 2014; Schönemann et al., 2014; Sun et al., 2018), and it contributes significantly to determining where polyadenylation occurs. However, given its high frequency in the transcriptome, AAUAAA cannot be the only determinant of poly(A) sites. 
The large number of 3’ mRNA isoforms for individual genes indicates that the CpA machinery has relatively low sequence specificity. In addition, as previously noted and shown explicitly here, 3’ isoform endpoints tend to occur in clusters within the 3’ UTR. Such clustering, which is related to microheterogeneity, is usually explained by imprecision of the CpA machinery in the vicinity of a sequence recognition element (e.g. AAUAAA) and/or a preferred cleavage site. Polyadenylation is intimately connected to the process of transcriptional elongation in vivo (Nag et al., 2007; Pinto et al., 2011; Liu et al., 2017; Cortazar et al., 2019; Goering et al., 2021), and transcriptional pausing increases polyadenylation in vitro (Yonaha and Proudfoot, 1999). An intact RNA tether between RNA polymerase II (Pol II) and the poly(A) site is required for efficient 3’ end processing (Bird et al., 2005; Rigo et al., 2005). Furthermore, cleavage of the nascent mRNA not only leads to polyadenylation but is also the key step that initiates the processes of transcriptional termination and subsequent export of polyadenylated mRNAs from the nucleus (Connelly and Manley, 1988; Kim et al., 2004; West et al., 2004; Luo et al., 2006). In general, each nascent mRNA molecule is cleaved and polyadenylated just once, so the poly(A) profile represents an ensemble of independent events. However, at some human genes, it has been suggested that longer isoforms can be retained in the nuclear matrix and be processed into shorter poly(A) isoforms (Tang et al., 2022). In considering the link between elongation and polyadenylation, a key issue is the location of elongating Pol II, and hence, the length of accessible RNA at the time of cleavage and subsequent polyadenylation. The poly(A) profiles of most yeast genes are altered in cells expressing Pol II derivatives with altered elongation rates (Geisberg et al., 2020). In all cases, the same poly(A) sites are used but to different extents. 
Two slow Pol II mutants each cause a 5’ shift in poly(A) site use, with the slower mutant giving rise to a greater upstream shift. In contrast, each of two fast Pol II mutants causes a 3’ shift in preferred poly(A) sites, although the magnitude of this shift is less pronounced, and fewer genes are affected. These altered poly(A) profiles are due to the Pol II elongation rate because strains with reduced Pol II processivity but normal elongation rates have poly(A) profiles indistinguishable from wild-type strains (Geisberg et al., 2020; Yague-Sanz et al., 2020). Yeast cells undergoing the diauxic response have poly(A) profiles remarkably like those mediated by slow Pol II mutants, indicating the physiological relevance of Pol II elongation rate to poly(A) profiles (Geisberg et al., 2020). Transcription slows down in the vicinity of poly(A) sites, suggesting a functional link between 3’ end processing and elongation (Parua et al., 2018; Cortazar et al., 2019; Eaton et al., 2020). Conversely, elongation rate in metazoans affects alternative poly(A) site choice, with slow Pol II mutants favoring the use of more upstream sites, consistent with a ‘window of opportunity’ model of poly(A) site choice (Liu et al., 2017; Goering et al., 2021). However, the fine structure of metazoan poly(A) site clustering and how it is affected by elongation speed have not been investigated. The shifts in poly(A) profiles in strains expressing fast or slow Pol II mutants could occur gradually or in jumps throughout the 3’ UTR. Here, we address these possibilities by measuring the ratio of 3’ mRNA isoform levels in the speed-mutant vs. the wild-type Pol II in yeast and human cells. Unexpectedly, in both organisms, the mutant:wild-type ratio of isoform expression changes steadily on a nucleotide basis within isoform clusters, whereas it is only minimally changed between clusters. 
In yeast cells, DNA sequence preferences 13–30 nt downstream of isoform clusters suggest that DNA sequence elements can affect Pol II elongation, subsequent polyadenylation, and the formation of 3’ mRNA isoform clusters. In human cells, Pol II occupancy increases just downstream of poly(A) sites, suggesting a linkage between reduced elongation rate and isoform clusters. Taken together, our results suggest a spatial, and perhaps physical, coupling between the CpA and Pol II elongation complexes, such that cleavage and polyadenylation occur almost immediately upon emergence/accessibility of the RNA from the Pol II elongation complex.

Results

3’ mRNA isoforms frequently occur in clusters of closely-spaced poly(A) sites

The poly(A) profile of an individual gene is defined by the relative steady-state expression levels of all of its 3’ mRNA isoforms. In previous work, we used the 3’ READS technique to map 3’ mRNA isoforms, and hence poly(A) profiles, in yeast cells expressing wild-type, slow, or fast Pol II derivatives on a transcriptome scale (Geisberg et al., 2020). In yeast, 3’ mRNA isoform endpoints occur across a ~200 nt window within the 3’ UTR. Within this ‘end zone’, visual inspection suggests that isoforms are not randomly distributed but rather appear to occur in clusters of closely-spaced poly(A) sites (Figure 1A). For reasons to become apparent, we formalize this observation by considering the likelihood of cluster formation in randomly distributed isoforms for each gene.

Figure 1 (with 2 supplements). Isoforms in yeast 3’ untranslated regions (UTRs) are clustered. (A) Polyadenylation profile of ATG27, a typical yeast gene, illustrating that major isoforms appear in clusters (represented as C1, C2, and C3 in red lettering). (B) Frequency distribution of clusters (all isoforms in cluster ≤4 nt apart) containing the indicated number of isoforms in either the randomized or genomic population.
The number and frequency of all clusters were tabulated for 3774 genes (orange bars). Potential isoform positions were then shuffled 100,000 times within each gene’s 3’ UTR, and the frequency and number of isoforms for each cluster were tabulated for every shuffled instance. Cluster frequencies were then combined across all 3774 genes and 100,000 shuffled instances/gene (blue bars). (C) Median likelihood (−log10 P value) that the experimentally observed cluster pattern for genes with the indicated number of major isoforms occurs by chance. Each point represents the probability that a given gene’s experimentally observed cluster frequency pattern is random. Horizontal bars inside boxplots represent the median values, while the top and bottom of each box represent the 25th and 75th percentiles. Values above the dashed red line at –log10(P)=2 are considered statistically significant.

In previous work, we defined a 3’ mRNA isoform cluster as a collection of isoforms with closely-spaced 3’ ends and similar half-lives (Geisberg et al., 2014). Here, we consider only the spacing between isoform endpoints, defining an isoform cluster as a group of isoforms in which each 3’ endpoint is no more than four nucleotides from the next (Figure 1A and Supplementary file 1). Our analyses are restricted to ‘major isoforms’ that are expressed at >5% of the level of the gene’s most highly expressed isoform. Major 3’ isoforms account for >97% of overall steady-state mRNA expression. The prevalence of clustered isoform endpoints in each 3’ UTR is far higher than that obtained by randomly distributing the same number of major isoforms over the same window (Figure 1B). The same result is obtained when the definition of a cluster is changed by varying the maximal inter-isoform distance from three to seven nucleotides (Figure 1—figure supplement 1).
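The cluster definition and the shuffling comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names, the summary statistic (fraction of isoforms that fall in a cluster of two or more), and the empirical P value are assumptions chosen for brevity, whereas the study compares full cluster frequency patterns.

```python
import random

def find_clusters(positions, max_gap=4):
    """Group 3' endpoints so that each endpoint is at most max_gap nt from the next."""
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)   # extend the current cluster
        else:
            clusters.append([pos])     # start a new cluster
    return clusters

def clustered_fraction(positions, max_gap=4):
    """Hypothetical summary statistic: fraction of isoforms in clusters of >= 2."""
    clusters = find_clusters(positions, max_gap)
    return sum(len(c) for c in clusters if len(c) > 1) / len(positions)

def shuffle_p_value(positions, window, n_shuffles=1000, max_gap=4, seed=0):
    """Empirical P value: how often randomly placed endpoints cluster at least
    as strongly as the observed endpoints within a window of `window` nt."""
    rng = random.Random(seed)
    observed = clustered_fraction(positions, max_gap)
    hits = 0
    for _ in range(n_shuffles):
        shuffled = rng.sample(range(window), len(positions))
        if clustered_fraction(shuffled, max_gap) >= observed:
            hits += 1
    return (hits + 1) / (n_shuffles + 1)  # add-one correction avoids P = 0

# Illustrative endpoints (nt downstream of the stop codon), not real data.
endpoints = [3, 5, 9, 20, 22, 40]
print(find_clusters(endpoints))
print(shuffle_p_value(endpoints, window=200))
```

Changing `max_gap` from 3 to 7 corresponds to varying the maximal inter-isoform distance used to define a cluster.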
As expected, 3’ UTRs containing larger numbers of isoforms give rise to wider clusters but also to more complex cluster patterns that are exceedingly unlikely to be observed by chance (Figure 1C). Thus, poly(A) site isoforms occur disproportionately in clusters.

Distinct patterns of speed-sensitivity within and between clusters in yeast cells

The poly(A) profiles of most yeast genes are altered in yeast strains expressing Pol II derivatives with slow or fast elongation rates (Geisberg et al., 2020). Compared to the poly(A) profile in wild-type cells, poly(A) profiles in slow Pol II strains (‘slow’: F1086S; ‘slower’: H1085Q) are shifted in an upstream (ORF-proximal) direction, whereas poly(A) profiles in fast Pol II strains (‘fast’: L1101S; ‘faster’: E1103G) exhibit subtle downstream shifts. Some genes shift poly(A) profiles in both fast and slow Pol II strains. The Pol II elongation rate has no effect on isoform clustering (Figure 1—figure supplement 2). To address the mechanistic relationship between Pol II speed and poly(A) profiles, we asked whether the shifts in isoform distributions are continuous or occur in jumps throughout the 3’ UTR. For every isoform, we determined its sensitivity to Pol II speed by calculating the ratio of its expression in a Pol II elongation rate mutant (slow or fast) vs. that in a wild-type Pol II strain. Strikingly, the pattern of isoform ratios in slow vs. wild-type Pol II strains is very different for isoforms within clusters as opposed to isoforms between clusters (two specific examples shown in Figure 2A, and transcriptome-scale results shown in Figure 2B). Within isoform clusters, both the ‘slower’:wild-type and the ‘slow’:wild-type ratios continuously decrease from the most ORF-proximal to the most ORF-distal isoforms; i.e., the most downstream isoform within a cluster typically has the lowest slow:wild-type ratio. Remarkably, these decreases in the slow:wild-type ratios occur continuously at the nucleotide level (Figure 2B).
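As a rough illustration of the speed-sensitivity measure, the sketch below computes a mutant:wild-type ratio for each isoform and then, for consecutive isoform pairs, the downstream ratio divided by the upstream ratio together with their spacing. The function names, the dictionary layout, the restriction to adjacent pairs, and the numbers are all assumptions made for the example.

```python
def utilization_ratios(mutant, wild_type):
    """Mutant/WT ratio per isoform.

    Both arguments map an isoform position (nt downstream of the stop codon)
    to its relative expression within the gene.
    """
    return {
        pos: mutant[pos] / wild_type[pos]
        for pos in sorted(wild_type)
        if pos in mutant and wild_type[pos] > 0
    }

def relative_ratios(ratios):
    """(spacing, downstream ratio / upstream ratio) for consecutive isoforms."""
    positions = sorted(ratios)
    return [
        (down - up, ratios[down] / ratios[up])
        for up, down in zip(positions, positions[1:])
    ]

# Illustrative relative expression values, not real data.
slow = {100: 0.4, 102: 0.3, 105: 0.1}
wt = {100: 0.2, 102: 0.3, 105: 0.2}

ratios = utilization_ratios(slow, wt)
for spacing, rel in relative_ratios(ratios):
    # A relative ratio below 1 means the downstream isoform is used
    # comparatively less in the mutant, i.e. an upstream shift.
    print(spacing, round(rel, 3))
```

In this toy case every relative ratio is below 1, which is the pattern expected within a cluster for a slow Pol II derivative.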
In addition, the intra-cluster slope is steeper (i.e. the ratio decreases more rapidly) in the strain with the ‘slower’ Pol II derivative as compared to the ‘slow’ derivative. In contrast, both slow:wild-type ratios decrease only very slightly for isoforms from the end of one cluster to the beginning of the next cluster, even over large distances (Figure 2A and B). These observations do not depend on the maximal inter-isoform distance used to define clusters (Figure 2—figure supplement 1).

Figure 2 (with 1 supplement). Pol II elongation rate drives poly(A) cluster formation. (A) Examples of poly(A) profiles in which ‘slower’/wild-type (WT) major isoform ratios (purple) decrease more rapidly within clusters than between clusters. Individual isoforms are defined by the number of nt downstream of the stop codon (x-axis). Clusters and inter-cluster regions are depicted as Cn and In in red and black lettering, respectively. The subscript n refers to the relative position of either the cluster or the inter-cluster region within the 3’ untranslated region, while brackets around clusters indicate that they contain <4 isoforms and thus were not used in cluster slope analysis. (B) Median relative ratios (downstream/upstream isoform) of genome-wide Rpb1(mutant)/Rpb1(WT) utilization at major isoform pairs as a function of nucleotide spacing either within clusters (circles) or in between clusters (diamonds). For each major isoform, Rpb1(mutant)/Rpb1(WT) utilization is computed by dividing the relative expression value of the isoform in the mutant strain by its relative expression in the WT strain. Relative ratios for each isoform pair are calculated by dividing downstream isoform utilization by upstream isoform utilization. Trend lines for ‘slower’/WT and ‘slow’/WT are depicted via dashes (within clusters) or as dots (between clusters).

The same dichotomy of isoform ratios within clusters vs.
between isoform clusters is observed for fast Pol II strains, except that the slopes of the ‘fast’:wild-type and ‘faster’:wild-type ratios across clusters are positive. Within a cluster, ORF-distal isoforms typically have the highest fast:wild-type expression ratios, with the overall ratios increasing continuously at the nucleotide level (specific example shown in Figure 3A, and transcriptome-scale results shown in Figure 3B). As observed with the ‘slow’ and ‘slower’ Pol II derivatives, the ‘faster’ Pol II derivative shows a steeper median slope than the ‘fast’ derivative (Figure 3B and C). As with both slow Pol II derivatives, the slope of the ratio change is much flatter between clusters. Again, this effect is independent of the precise cluster definition used (Figure 3 and Figure 3—figure supplement 1). Taken together, these results demonstrate a nucleotide-level linkage between Pol II elongation and polyadenylation.

Figure 3 (with 1 supplement). Pol II elongation rate drives poly(A) cluster formation. (A) Example poly(A) profile in which ‘faster’/wild-type (WT) major isoform ratios (purple) increase more rapidly within clusters than between clusters. Clusters and inter-cluster regions are depicted as Cn and In in red and black lettering, respectively. The subscript n refers to the relative position of either the cluster or the inter-cluster region within the 3’ untranslated region, while brackets around clusters indicate that they contain <4 isoforms and thus were not used in cluster slope analysis. (B) Median relative ratios (downstream/upstream isoform) of genome-wide Rpb1(mutant)/Rpb1(WT) utilization at major isoform pairs as a function of nucleotide spacing either within clusters (circles) or in between clusters (diamonds). For each major isoform, Rpb1(mutant)/Rpb1(WT) utilization is computed by dividing the relative expression value of the isoform in the mutant strain by its relative expression in the WT strain.
Relative ratios for each isoform pair are calculated by dividing downstream isoform utilization by upstream isoform utilization. Trend lines for ‘faster’/WT and ‘fast’/WT are depicted via dashes (within clusters) or as dots (between clusters). (C) Median relative ratios (downstream/upstream isoform) of utilization at major isoform pairs as a function of nucleotide spacing in all four yeast elongation rate mutants (‘slower’/WT in red, ‘slow’/WT in yellow, ‘fast’/WT in light green, and ‘faster’/WT in dark green). Relative utilization ratios are depicted as either circles (within clusters) or diamonds (between clusters). Trend lines are dashed for within clusters and dotted between clusters.

Mammalian slow Pol II mutant affects poly(A) site micro-heterogeneity within clusters

We compared the polyadenylation profiles in human HEK293 cell lines expressing either an α-amanitin resistant wild-type Pol II or the slow-elongation Pol II derivative with the Rpb1-R749H mutation in the funnel domain (Fong et al., 2014). This slow-elongation Pol II derivative often confers an upstream shift in the poly(A) profile resembling that observed in yeast slow Pol II mutants, though occurring at fewer genes (Goering et al., 2021). Using 3’ READS, we obtained an average of ~30 million reads per biological replicate, with high reproducibility of the data across replicates (Figure 4—figure supplement 1). Analysis of clusters in human cells is more challenging than in yeast due to the greater complexity of the human genome, lower sequencing depth, and the much longer lengths of mammalian 3’ UTRs.
To work around these limitations, we modified the previous cluster analysis by including all isoforms that contained ≥5 reads in genes with <100 reads in the maximally expressed isoform in both wild-type and R749H cell lines, and by defining mammalian 3’ UTRs to encompass the region between 1 kb upstream and 5 kb downstream of the consensus stop codon in the Consensus protein coding sequences (CCDS) database. Remarkably, within clusters the median R749H:wild-type ratio exhibits a continuous, nucleotide-level decrease that resembles the decreases observed with both yeast slow Pol II derivatives (an example is shown in Figure 4A, and transcriptome-scale results shown in Figure 4B; compare Figure 4B to Figure 2B). As observed in yeast, R749H:wild-type ratios of isoforms from one cluster to the next exhibit much less change (Figure 4B). Importantly, the nucleotide-level decrease within clusters observed for R749H:wild-type ratios is independent of both 3’ UTR length and the minimal inter-cluster distance used for cluster definition (Figure 4—figure supplement 2).

Figure 4 (with 2 supplements). Poly(A) cluster formation is also linked to Pol II elongation rate in human cell lines. (A) An example of a poly(A) profile in which R749H/wild-type (WT) major isoform ratios (purple) decrease more rapidly within a cluster than between clusters. Clusters and inter-cluster regions are depicted as Cn and In in red and black lettering, respectively. The subscript n refers to the relative position of either the cluster or the inter-cluster region within the 3’ untranslated region, while brackets around clusters indicate that they contain <4 isoforms and thus were not used in cluster slope analysis. (B) Median relative ratios (downstream/upstream isoform) of isoform utilization (R749H/WT Pol II) in human 3’ isoform pairs as a function of nucleotide spacing.
Relative ratios are depicted either as circles (within clusters) or as diamonds (between clusters), while trend lines are either dashed (within clusters) or dotted (between clusters).

Cluster-independent analysis of isoform pairs in yeast and human cells

The nucleotide-level link between Pol II elongation rate and polyadenylation is observed only for isoforms within, but not between, clusters. To address the basis of this difference, we performed a cluster-independent measurement of the upstream shift. Specifically, we measured the relative levels of adjacent isoforms in cells expressing slow and wild-type Pol II simply as a function of the distance (in nucleotides) between the isoforms (Figure 5). For both yeast slow Pol II mutants and at all distances, the mutant:wild-type expression ratio of the downstream isoform is lower than that of the upstream isoform; the lower the value, the greater the upstream shift. As expected, the ‘slower’ Pol II mutant confers a greater upstream shift than the ‘slow’ Pol II derivative. Interestingly, the magnitude of the upstream shift increases slightly with distance at isoform spacings between one and five nucleotides, but it is essentially constant at distances greater than five nucleotides (Figure 5A). Similar analysis of human cells expressing the slow R749H vs. the wild-type α-amanitin resistant Rpb1 derivatives yields roughly comparable results, with consistently lower slow:wild-type expression ratios at downstream positions relative to upstream positions within clusters (Figure 5B). Thus, in both yeast and human cells, the apparent discordance between slow Pol II effects on isoforms within or between clusters largely reflects the greater distance between consecutive isoforms, not the overall distance traveled by Pol II.

Figure 5. Cluster-independent link between Pol II elongation rate and poly(A) formation.
(A) Median utilization difference (downstream isoform mutant/wild-type expression ratio minus upstream isoform mutant/wild-type expression ratio) is plotted for either ‘slower’/wild-type (gray bars) or ‘slow’/wild-type (blue bars) as a function of isoform spacing. (B) Median utilization difference (downstream isoform R749H/wild-type expression ratio minus upstream isoform R749H/wild-type expression ratio) as a function of isoform spacing.

DNA sequence features linked to isoform clusters

Although the nucleotide-level link between Pol II elongation and polyadenylation is based on the behavior of isoform clusters, the results above do not address why 3’ mRNA isoforms occur disproportionately in clusters. Toward this end, we considered the possibility that isoform clusters might form if Pol II encounters a DNA sequence element that slows elongation speed. We were unable to identify any such element spatially linked to yeast cluster formation in general. However, in yeast Pol II mutant strains, increased GC content in the region 13–30 bp downstream of a cluster’s most ORF-distal isoform is strongly correlated with more steeply declining (decreasing ‘slower’:wild-type Pol II and ‘slow’:wild-type Pol II isoform utilization) and more steeply rising (increasing ‘fast’:wild-type and ‘faster’:wild-type isoform ratios) cluster slopes (Figure 6A). ‘Slower’:wild-type Pol II and ‘slow’:wild-type Pol II clusters display excess GC content at +13 to +30 regions when cluster slopes are highly negative and reduced GC content when cluster slopes are positive (red and orange bars, respectively; Figure 6B).

Figure 6. GC-rich region just downstream of isoform clusters. (A) GC content in a region downstream of clusters is correlated to cluster slopes in ‘slow’/wild-type (top left), ‘slower’/wild-type (bottom left), ‘fast’/wild-type (top right), and ‘faster’/wild-type (bottom right) datasets.
Pearson R at each position (blue; left axes) represents the correlation of GC content and cluster slopes in a 10-nt window starting at the position indicated on the x-axis. Red lines (right axes) represent false discovery rate (FDR)-corrected –log10 P values of each correlation, the dashed red line is the significance cutoff (correlations above it are deemed significant), and significant regions are highlighted in gray boxes. Segments at the bottom of each graph indicate the span of the GC-enriched sequence in each mutant/wild-type Pol II dataset. (B) GC content at +13 to +30 is linked to cluster slopes. Cluster slopes for ‘slower’/wild-type, ‘slow’/wild-type, ‘fast’/wild-type, and ‘faster’/wild-type were individually separated into quintiles, with the most negatively sloped clusters depicted on the left and the most positively sloped clusters depicted on the right. For each cluster, the percent change in GC content at +13 to +30 was computed relative to the median GC content at the equivalent genomic coordinates within 3’ untranslated regions. The y-axis depicts the average percent change in GC content for all clusters of a given category within each quintile. Median slopes for cluster categories within each quintile are shown on the bottom. (C) GC-rich region immediately downstream of poly(A) clusters in yeast. Elongating Pol II makes numerous contacts (black circles: identical residues in both mammalian and yeast Rpb1; gray circles: conserved residues in both mammalian and yeast Rpb1) with both DNA strands (purple: template strand, blue: non-template strand) and nascent RNA (red). The RNA addition site (+1), Pol II-protected region (gray oval), RNA:DNA hybrid (yellow), and the +13 to +30 region (boxed) are shown. Adapted from Bernecky et al., 2016. 
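The windowed correlation described in the Figure 6A legend can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function names, the input layout (one downstream sequence per cluster, beginning at +1 relative to the cluster's most ORF-distal isoform), and the toy sequences and slopes are hypothetical, and the FDR correction of P values is not reproduced.

```python
from math import sqrt

def gc_fraction(seq):
    """Fraction of G or C bases in a DNA sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def windowed_gc_correlation(downstream_seqs, slopes, start, width=10):
    """Correlate cluster slopes with GC content in a window of `width` nt
    that begins at 1-based position `start` downstream of each cluster."""
    gc = [gc_fraction(s[start - 1:start - 1 + width]) for s in downstream_seqs]
    return pearson_r(gc, slopes)

# Toy input: four clusters, window +3 to +6, slopes for a slow mutant.
seqs = ["TTAAAATT", "TTGAAATT", "TTGGAATT", "TTGGCCTT"]
slopes = [0.0, -0.25, -0.5, -1.0]
print(windowed_gc_correlation(seqs, slopes, start=3, width=4))
```

In the toy data, higher GC content goes with a more negative slope, so the correlation is strongly negative, which is the direction reported for the slow Pol II derivatives. Scanning `start` across positions would give a profile analogous to the blue curves in panel A.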
Conversely, ‘fast’:wild-type Pol II and ‘faster’:wild-type Pol II clusters exhibit increased GC content at +13 to +30 when their slopes are highly positive and lower relative GC content as the cluster slopes decrease (light and dark green bars, respectively; Figure 6B). The contrasting relationships between slow and fast Pol II mutants and GC content downstream of clusters strongly suggest that GC composition at +13 to +30 plays an important role in shaping polyadenylation patterns in clusters by affecting Pol II speed. Intriguingly, the distance between the +13 to +30 region and the 3’ boundary of an isoform cluster is strikingly similar to the length of the sequence protected by the elongating Pol II machinery (Bernecky et al., 2016; Figure 6C). This observation suggests the existence of a DNA sequence element that contributes to isoform cluster formation in yeast cells. In human isoform clusters, the lower number of sequence reads did not permit a similar analysis.

Slower transcription downstream of polyadenylation sites in human genes

Although poly(A) site choice at the nucleotide level is strongly affected by Pol II speed, the relationship between Pol II elongation rate and CpA in the immediate vicinity of poly(A) sites is unknown. We investigated this question by performing eNETseq, a modification of the mNET-seq technique (Nojima et al., 2015) that maps the 3’-OH ends of Pol II-associated nascent transcripts, and hence Pol II occupancy, at single base pair resolution in human cells. Reduced Pol II speed at a particular region reflects a longer Pol II dwell time that results in a relative increase in Pol II occupancy within this region. The composite eNETseq profiles around the region of poly(A) sites show decreasing Pol II occupancy just upstream (region between –40 and –1) of poly(A) sites followed by a biphasic increase in Pol II occupan
