Abstract
Background
The majority of eukaryotic promoters utilize multiple transcription start sites (TSSs). How multiple TSSs are specified at individual promoters across eukaryotes is not understood for most species. In Saccharomyces cerevisiae, a pre-initiation complex (PIC) comprised of Pol II and conserved general transcription factors (GTFs) assembles and opens DNA upstream of TSSs. Evidence from model promoters indicates that the PIC scans from upstream to downstream to identify TSSs. Prior results suggest that TSS distributions at promoters where scanning occurs shift in a polar fashion upon alteration in Pol II catalytic activity or GTF function.
Results
To determine the extent of promoter scanning across promoter classes in S. cerevisiae, we perturb Pol II catalytic activity and GTF function and analyze their effects on TSS usage genome-wide. We find that alterations to Pol II, TFIIB, or TFIIF function widely alter the initiation landscape consistent with promoter scanning operating at all yeast promoters, regardless of promoter class. Promoter architecture, however, can determine the extent of promoter sensitivity to altered Pol II activity in ways that are predicted by a scanning model.
Conclusions
Our observations coupled with previous data validate key predictions of the scanning model for Pol II initiation in yeast, which we term the shooting gallery. In this model, Pol II catalytic activity and the rate and processivity of Pol II scanning together with promoter sequence determine the distribution of TSSs and their usage.
Background
Gene expression can be regulated at all levels, and its proper control is critical for cellular function. Transcription regulation has been of intense interest for decades as it determines how much RNA is synthesized for a given gene or locus. Much regulation occurs at the first step in transcription, initiation. A multitude of signals can be integrated with the activities of transcriptional regulators that converge on individual gene promoters. Subsequent to the integration of regulatory information, RNA Polymerase II (Pol II) and general transcription factors (GTFs) must together recognize core promoters and initiate transcription at specific sequences, the transcription start sites (TSSs). As with any biochemical process, the efficiency of individual steps will shape the overall output. Thus, core promoter output, both the overall expression level and the exact positions of TSSs, will be affected by the efficiency of biochemical events during initiation. How different core promoters modulate biochemical steps in initiation, and the nature of their functional interactions with the initiation machinery, remain to be determined.
Classes of eukaryotic core promoters can be distinguished by DNA sequence motifs and chromatin structure (reviews of the core promoter over time [1,2,3,4,5,6,7,8,9,10]). These features together comprise a promoter’s architecture, which may also correlate with differential recruitment or requirement for particular GTF complexes [11,12,13]. A theme across eukaryotes is that core promoters can be broadly separated into two main classes by examination of architectural features and factor requirements. A number of studies indicate that the most common eukaryotic promoters are nucleosome-depleted regions (NDRs) flanked by positioned nucleosomes, which can support divergent transcription through assembly of pre-initiation complexes (PICs) proximal to flanking nucleosomes (with exceptions) [14,15,16,17,18,19,20,21,22,23,24,25]. We will adhere to the definition of “core promoter” as representing the DNA elements and chromatin structure that facilitate transcription in one direction, to avoid the definitional confusion that a “promoter” inherently drives divergent transcription [26,27,28]. In yeast, promoter classes have been distinguished in many ways, with the general result that two main classes of promoter are recognized [16,17,18,29,30,31]. These classes are distinguished by the presence or absence of a consensus TATA element [32, 33], presence or absence of stereotypical nucleosome organization [18], enrichment for specific transcription factor binding [14, 34, 35], enrichment for non-TATA sequence motifs [36, 37], and differential sensitivity to mutations in GTFs or transcription coactivators [32, 34, 35]. Core promoters attached to defined NDRs tend to lack canonical TATA elements. Conversely, in yeast and other eukaryotes, core promoters with TATA elements can lack stereotypical nucleosome organization and may have nucleosomes positioned over the TATA box in the absence of gene activation.
While there have been a number of additional core promoter elements identified in other organisms, especially Drosophila melanogaster [38], we will focus on the distinction provided by the presence or absence of TATA elements.
The TATA element serves as a platform for core promoter binding of the TATA binding protein (TBP). TBP recognition of promoter DNA is assumed to be critical for PIC formation and Pol II promoter specificity. Functional distinction in promoter classes is supported by studies showing differential factor recruitment and requirements between them, with TATA promoters showing higher SAGA dependence and putatively reduced Taf1 (a TFIID subunit) recruitment [32,33,34,35]. Although recent data have been interpreted as indicating that both SAGA and TFIID function at all yeast promoters [39, 40], a distinction between the two classes seems to hold [31]. Conversely, TATA-less promoters show higher Taf1 recruitment by chromatin IP and greater requirement for TBP-associated factor (TAF) function. Given differences in reported factor requirements and promoter architectures, it is important to understand the mechanistic differences between promoters and how these relate to gene regulation.
TSS selection in Saccharomyces cerevisiae has been used as a model to understand how initiation factors collaborate to promote initiation [41, 42]. The vast majority of yeast core promoters specify multiple TSSs [43,44,45], and multiple TSS usage is now known to be common to the majority of core promoters in other eukaryotes [46,47,48,49,50]. Biochemical properties of RNA polymerase initiation lead to TSSs selectively occurring at a purine (R = A or G) just downstream from a pyrimidine (Y = C or T), the Y−1R+1 motif [51]. Y−1R+1 motifs may additionally be embedded in longer sequence motifs (the Inr element) [52, 53]. In yeast, the initiation factor TFIIB has been proposed to “read” TSS sequences to promote recognition of appropriate TSSs, with structural evidence supporting positioning of TFIIB to read DNA sequences upstream of a TSS [11, 54].
Budding yeast and their relatives differ from other model eukaryotes in that TSSs for TATA-containing core promoters are generally dispersed and are found ~ 40–120 nt downstream from the TATA [55, 56]. Conversely, TSSs at TATA promoters in other organisms are tightly associated ~ 31 nt downstream of the TATA (with the first T in “TATA” being + 1) [57]. As TATA promoters represent ~ 10% of promoters across well-studied organisms, they are the minority. Classic experiments using permanganate footprinting of melted DNA showed that promoter melting at two TATA promoters in yeast, GAL1 and GAL10, occurs far upstream of TSSs, at a distance downstream from TATA where melting would occur in other eukaryotes that have TSSs closer to the TATA element [58]. This discovery led Giardina and Lis to propose that yeast Pol II scans downstream from TATA boxes to find TSSs. A large number of mutants have been found in yeast which perturb TSS selection, allowing the genetic architecture of Pol II initiation to be dissected, from those in Pol II subunit-encoding genes RPB1, RPB2, RPB7, and RPB9 to GTF-encoding genes SUA7 (TFIIB), TFG1 and TFG2 (TFIIF), and SSL2 (TFIIH), along with the conserved transcription cofactor SUB1 [59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79]. Mutants in GTFs or Pol II subunits have been consistently found at model promoters to alter TSS usage distributions in a polar fashion by shifting TSS distributions upstream or downstream relative to WT. These observations coupled with the analysis of TSS mutations strongly support the directional scanning model for Pol II initiation (elegantly formulated in the work of Kuehner and Brow) [62].
Previous models for how initiation might be affected by Pol II mutants suggested that Pol II surfaces important for initiation functioned through interactions with GTFs within the PIC. We have previously found that altering residues deep in the Pol II active site, unlikely to be directly interacting with GTFs but instead altering Pol II catalytic activity, had strong, allele-specific effects on TSS selection for model promoters [80,81,82]. Observed effects on TSS distributions were polar in nature and consistent with the Pol II active site acting downstream of a scanning process but during TSS selection and not afterwards. In other words, Pol II catalytic efficiency appears to directly impact TSS selection. For example, it appeared that increased Pol II catalytic activity increased initiation probability, leading to an upstream shift in TSS usage at candidate promoters because less DNA needs to be scanned on average prior to initiation. Conversely, lowering Pol II catalytic activity results in downstream shifts to TSS usage at candidate promoters, because more promoter DNA has to be scanned prior to initiation. In general, candidate promoters examined for TSS selection have mostly been TATA containing (for example ADH1, HIS4); thus, it is not known how universal Pol II initiation behavior or mechanisms are across all yeast core promoters, which likely comprise different classes with distinct architectures. To examine initiation by promoter scanning on a global scale in yeast, we perturbed Pol II or GTF activity genetically to examine changes to TSS usage across a comprehensive set of promoters that likely represent all yeast promoter classes. We have found that promoter scanning appears to be universal across yeast core promoters. Furthermore, we find that core promoter architecture correlates with sensitivity of core promoters to TSS perturbation by Pol II and initiation factor mutants. 
Our results have enabled formulation of a model in which Pol II and GTFs function together in initiation to promote Pol II initiation efficiency at favorable DNA sequences. Finally, initiation by core promoter scanning prescribes a specific relationship between usable TSSs in a core promoter and the distribution of TSS usage, potentially allowing TSS distributions to be predicted if the sequence preferences for Pol II initiation can be measured.
Results
Initiation mutants affect TSS selection globally in Saccharomyces cerevisiae
We previously found that yeast Pol II active site catalytic mutants showed polar effects on TSS selection at the model ADH1 promoter in addition to some other promoters [81, 82]. ADH1 is a TATA-containing promoter with major TSSs positioned at 90 and 100 nucleotides downstream of its TATA box. A number of other mutants in Pol II and initiation factors also show TSS selection effects at ADH1. TSS selection effects have been hypothesized to relate to alterations in initiation sequence specificity. While the stereotypical polar effects of TSS-altering mutants are consistent with effects on scanning and not necessarily sequence specificity, these are not mutually exclusive models. To understand better how Pol II activity and GTFs cooperate to identify TSSs, we mapped capped RNA 5′ ends genome-wide in S. cerevisiae using TSS-seq for WT, a series of Pol II catalytic mutants, a TFIIB mutant (sua7-58A5) [80], and a TFIIF mutant (tfg2∆146-180) [83]. Positions of capped RNA 5′ ends are taken to represent positions of TSSs because Pol II-initiated RNA 5′ ends are capped shortly after they emerge from the enzyme. We first determined how reproducible our pipeline (Fig. 1a) was across the yeast genome by examining the correlation of read positions corresponding to 5′ ends across all genome positions containing at least three mapped reads in each library being compared (Fig. 1b; Additional file 1: Fig. S1a). Examples of correlations between biological replicates are shown in Fig. 1b for WT, one catalytically fast Pol II allele (rpb1 E1103G) [84,85,86], and one catalytically slow Pol II allele (rpb1 H1085Y) [82]. We refer to fast Pol II alleles and those genetically related to them as “gain of function” (GOF) alleles and slow Pol II alleles (and genetically related) as “loss of function” (LOF) alleles [87]. Correlation plots for all other strains are shown in Additional file 1: Fig. S1a.
Clustering analysis of Pearson correlation coefficients among libraries aggregated from biological replicates for each strain indicates that Pol II and initiation mutant classes can be distinguished based on RNA 5′ end mapping alone (Fig. 1c). Additional file 1: Fig. S1b shows clustering of Pearson correlation coefficients of individual biological replicate TSS-seq libraries for reads within promoter regions.
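The replicate comparison described above can be sketched in a few lines. This is a minimal illustration and not the pipeline behind the correlation plots: it assumes each library is a dictionary mapping a (chromosome, strand, position) key to a 5′ end read count, and it computes Pearson's r on raw counts. Whether counts were transformed (for example, log-scaled) before correlation is not stated in the text, and the function name `filtered_pearson` is hypothetical.

```python
import math


def filtered_pearson(lib_a, lib_b, min_reads=3):
    """Pearson r over positions with at least `min_reads` reads in BOTH libraries.

    lib_a, lib_b: dicts mapping a position key, e.g. (chrom, strand, pos),
    to a 5' end read count (assumed data layout).
    """
    shared = [pos for pos, n in lib_a.items()
              if n >= min_reads and lib_b.get(pos, 0) >= min_reads]
    if len(shared) < 2:
        raise ValueError("need at least two shared positions")

    xs = [lib_a[p] for p in shared]
    ys = [lib_b[p] for p in shared]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant counts")
    return sxy / math.sqrt(sxx * syy)
```

Applying the read threshold to both libraries before correlating keeps sparsely covered positions, where sampling noise dominates, from driving the coefficient.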
We first focused our analyses on promoter windows predicted from the localization of PIC components by Rhee and Pugh [14] and anchored on TATA or “TATA-like” elements (core promoter elements, or CPE, underlying PIC assembly points) as the + 1 position of the promoter window (Fig. 1d). RNA 5′ ends mapping to the top genome strand of these putative promoter windows indicates that these windows are associated with putative TSSs as expected. The majority of observed TSSs are downstream of predicted CPE/PIC locations from Rhee and Pugh, with TSSs originating at a range of distances from predicted CPE/PIC positions. We note that a fraction of promoter windows has TSS positions suggesting that the responsible PICs for those TSSs assemble at positions upstream or downstream from locations identified by Rhee and Pugh.
We asked if attributes of RNA 5′ end distributions within promoter windows could also distinguish mutant classes, given the distinct and polar alterations of TSS distribution at model genes by Pol II fast or Pol II slow mutants. To do this, we examined two attributes of TSS usage: the change in position of the median TSS usage in the promoter window from WT (TSS “shift”), and the change in the width between positions encompassing 80% of the TSS usage distribution (from 10 to 90%, the change (∆) in TSS “spread,” illustrated in Fig. 1a). TSS shifts found in each mutant for individual promoters are displayed in a heat map that clusters both by mutant and promoter profiles (Fig. 1e). Mutant TSS shift profiles in libraries compiled from all replicates distinguished two major groups representing slow and fast Pol II mutants. Principal component analysis (PCA) of TSS shifts (Additional file 1: Fig. S1c), total promoter reads (“Expression”, Additional file 1: Fig. S1d), or ∆ TSS spread (Additional file 1: Fig. S1e) distinguishes between two major classes of mutant for all individual biological replicates, corresponding to slow and fast Pol II mutants. Both Pol II and GTF mutants showed widespread directional shifting of TSSs across nearly all promoters, with individual mutants generally shifting TSSs for most promoters either upstream (Pol II fast mutants) or downstream (Pol II slow mutants) (Fig. 1e; Additional file 1: Fig. S1f). Pol II GOF and tfg2∆146-180 strains exhibited primarily upstream shifts in TSS distributions within promoter windows, while Pol II LOF (slow) and sua7-58A5 exhibited primarily downstream shifts. TSS shifts are consistent with previously observed shifts at individual promoters, such as ADH1, suggesting that promoter scanning is operating across all yeast promoter classes. Our analyses recapitulate a relationship between expression and TSS spread similar to that recently described for promoters from yeast, mouse, and human [88, 89].
Highly expressed promoters tend to be more focused than those expressed at lower levels (Additional file 1: Fig. S1g). Additional file 1: Fig. S1h shows browser tracks for the example TUB2 promoter illustrating reproducibility at the level of individual libraries.
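The two summary metrics used above, TSS shift and ∆ TSS spread, can be illustrated with a short sketch. It assumes a promoter window is represented as a list of 5′ end read counts ordered from upstream to downstream, and it takes each percentile position as the first position at which the cumulative read fraction reaches the target. The sign convention (negative values for upstream shifts) and all function names are assumptions made for illustration.

```python
def percentile_position(counts, fraction):
    """First window index where cumulative reads reach `fraction` of the total."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("window has no reads")
    target = fraction * total
    running = 0
    for pos, n in enumerate(counts):
        running += n
        if running >= target:
            return pos
    return len(counts) - 1


def median_tss(counts):
    """Position of the median (50th percentile) of TSS usage."""
    return percentile_position(counts, 0.5)


def tss_spread(counts):
    """Width between the 10th and 90th percentile positions (middle 80% of usage)."""
    return percentile_position(counts, 0.9) - percentile_position(counts, 0.1)


def tss_shift(wt_counts, mutant_counts):
    """Change in median TSS position relative to WT (negative = upstream)."""
    return median_tss(mutant_counts) - median_tss(wt_counts)


def delta_spread(wt_counts, mutant_counts):
    """Change in TSS spread relative to WT (positive = broader distribution)."""
    return tss_spread(mutant_counts) - tss_spread(wt_counts)
```

For example, with a WT window of `[0, 2, 6, 2, 0]` and a mutant window of `[4, 4, 2, 0, 0]`, `tss_shift` returns -1 (an upstream shift of one position) while `delta_spread` returns 0.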
We examined changes in TSS distribution relative to promoter class and Pol II mutant strength to determine how each related to magnitude of TSS changes. To visualize changes, we separated promoters using classification by Taf1 enrichment or depletion as done previously. While recent work indicates that TFIID (containing Taf1) functions at all yeast promoters [31, 39], differential detection of Taf1 in chromatin IP correlates with promoter nucleosome organization, underlying DNA sequence composition, and DNA element enrichment (TATA etc.) [14, 18, 32, 33, 36], suggesting this metric is a useful proxy for promoter class. Figure 2a shows example heat maps of the difference of normalized TSS distributions between WT and a Pol II fast or a Pol II slow mutant. The stereotypical patterns of polar changes to TSS distributions where distribution of TSSs shifts upstream (increases upstream and decreases downstream, such as in rpb1 E1103G), or shifts downstream (increases downstream and decreases upstream, such as in rpb1 H1085Y), are observed across essentially all promoters, and for all mutants examined including GTF mutants (Additional file 1: Fig. S2). By determining the shift in median TSS position in promoter windows, we can see that mutants exhibit different strengths of effects on TSS distributions (Fig. 2b). A double mutant between tfg2∆146-180 and rpb1 E1103G shows enhancement of TSS defects across promoter classes (Fig. 2b, c), similarly to what was observed at ADH1 [80]. Counts of promoters with upstream or downstream shifts or statistical analyses for significant upstream or downstream shifts at the level of individual promoters demonstrate large directional biases for essentially all mutants (Additional file 1: Fig. S3). Examination of average TSS shift and measured in vitro elongation rate for Pol II mutants shows a correlation between the strength of in vivo TSS selection defect and in vitro Pol II elongation rate [81, 82] (Fig. 2d). 
These results are consistent with our earlier work showing that TSS selection is directly sensitive to Pol II catalytic activity [80, 82].
Altered TSS motif usage in TSS-shifting mutants
To understand the basis of directional TSS shifting in Pol II mutants, we asked how changes to TSS selection related to potential sequence specificity of initiation (Fig. 3). Earlier studies of TSS selection defects in yeast suggested that mutants might have altered sequence preferences for the PIC [41]. Our identified TSSs reflect what has been observed for Pol II initiation preferences, i.e., the simplest TSS motif is Y−1R+1 as in most eukaryotes, with the previously observed budding yeast-specific preference for A−8 at the strongest TSSs [43] (Fig. 3b). Preference for Y−1R+1 is common across RNA polymerases and likely reflects the stacking of an initiating purine (R, A/G) triphosphate onto a purine at the −1 position on the template strand (reflected as pyrimidine (Y, C/T) on the non-template (coding) strand) [51]. Within the most strongly expressed promoters, preference for A−8 is greatest for the primary TSS and is progressively reduced for the secondarily and tertiarily preferred TSSs, even though these sites also support substantial amounts of initiation. Examination of the most focused, expressed promoters (promoters that contain the majority of their TSSs in a narrow window) reveals potential preferences at additional positions. We analyzed TSS usage within promoter windows by dividing all TSSs into 64 motifs based on the identities of the −8, −1, and +1 positions (Fig. 3c). We asked if Pol II or GTF mutants altered apparent preferences among these 64 motifs. Based on aggregate usage of sequences across our promoter set, we found that the top used motifs were generally A−8Y−1R+1, with the next preferred motifs found among B−8(not A)Y−1R+1 (Fig. 3d). Pol II and GTF mutants have apparent effects on motif usage distribution concerning the A−8 position.
Upstream TSS-shifting mutants (Pol II GOF and tfg2∆146-180) show apparent decreased preference for A−8Y−1R+1 motifs concomitant with a gain in relative usage of B−8Y−1R+1 motifs, while downstream TSS-shifting mutants (Pol II LOF and sua7-58A5) have the converse effect, though primarily through increases in A−8C−1A+1 and A−8C−1G+1. Total TSS usage might be affected by strong effects at a subset of highly expressed promoters; therefore, we also examined motif preference on a promoter-by-promoter basis (Additional file 1: Fig. S4a,b). rpb1 E1103G TSS preferences illustrate that the reduction in preference for A−8Y−1R+1 motifs is observed across yeast promoters (Additional file 1: Fig. S4a) while H1085Y shows the converse (Additional file 1: Fig. S4b).
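The division of TSSs into 64 motifs by the bases at −8, −1, and +1 can be sketched as follows. This is an illustrative reconstruction under stated assumptions: the promoter sequence is the top (non-template) strand, `tss_index` is the 0-based index of the +1 base, there is no position 0 (so −1 is the base immediately upstream of +1), and read counts are supplied as a list aligned to the sequence. The function names and class labels are hypothetical.

```python
from collections import Counter

PYRIMIDINES = "CT"  # Y
PURINES = "AG"      # R


def tss_motif(seq, tss_index):
    """Return the 3-letter (-8, -1, +1) motif for a TSS, or None if unavailable."""
    if tss_index < 8 or tss_index >= len(seq):
        return None  # not enough upstream sequence to read position -8
    motif = (seq[tss_index - 8] + seq[tss_index - 1] + seq[tss_index]).upper()
    return motif if all(base in "ACGT" for base in motif) else None


def motif_class(motif):
    """Group a 3-letter motif into A-8 Y-1 R+1, B-8 Y-1 R+1, or non-YR."""
    minus8, minus1, plus1 = motif
    if minus1 in PYRIMIDINES and plus1 in PURINES:
        return "A-8 Y-1 R+1" if minus8 == "A" else "B-8 Y-1 R+1"
    return "non-YR"


def motif_usage(seq, counts):
    """Sum reads per motif across a promoter window (counts aligned to seq)."""
    usage = Counter()
    for index, reads in enumerate(counts):
        motif = tss_motif(seq, index)
        if reads > 0 and motif is not None:
            usage[motif] += reads
    return usage
```

Summing `motif_usage` results over all promoter windows for WT and for a mutant, then normalizing each to its total, would give the kind of aggregate motif preference comparison described above.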
Different models might explain why initiation mutants alter apparent TSS sequence selectivity, and in doing so lead to polar changes to TSS distribution or vice versa (Fig. 3e). In one model, relaxation of a reliance on A−8 would allow, on average, earlier initiation in a scanning window. This would be because non-A−8 sites would be encountered by the PIC at higher frequency, whereas increased reliance on A−8 would have the opposite effect. Alternatively, altered Pol II catalytic activity or GTF function may broadly affect initiation efficiency across all sites, which allows at least two predictions. First, an apparent change in TSS selectivity could result from a corresponding uneven distribution in TSS motifs within promoter regions. It has already been observed that yeast promoter classes' sequence distributions deviate from random across promoters. Second, the enrichment of A−8Y−1R+1 TSSs and the ability of the −8 A to also function as a TSS when it is part of a YR element likely underlie the tendency of yeast TSSs to be spaced 8 nt apart [45]. Only a subset of −8 As will themselves be embedded in Y−1R+1 or A−8Y−1R+1 elements; therefore, any increase in TSS efficiencies across all sequences will be predicted to shift preference from A−8Y−1R+1 to B−8Y−1R+1. Here, we examined sequence distributions for individual nucleotides and for select A−8Y−1R+1 motifs relative to median TSS position for yeast promoters (Fig. 3f; Additional file 1: Fig. S4c). As noted previously, yeast promoter classes differ based on their distributions of A/T [36, 90]. In Wu and Li, promoters were classified based on their nucleosome structure. Our classification based on Taf1 enrichment similarly divides yeast promoters with Taf1-depleted promoters highly enriched for T and depleted for A on the top DNA strand (Additional file 1: Fig. S4c).
Furthermore, the extent of T/A depletion or enrichment correlates with promoter expression level in vivo, fitting with predictions based on promoter reporter analyses [91]. Enrichment or depletion of individual nucleotides would also be expected to potentially alter distributions of N−8Y−1R+1 TSS motifs. Therefore, we extended our analyses to N−8Y−1R+1 motifs (Fig. 3f). We find that A−8C−1A+1, the apparent most-preferred TSS motif for Pol II in yeast, is markedly enriched at the median TSS and downstream positions with a sharp drop off upstream. A−8C−1A+1 enrichment also shows correlation with apparent promoter expression level. A less preferred motif, T−8T−1A+1, shows a distinct enrichment pattern (enriched upstream of median TSS, depleted downstream). This biased distribution in promoter sequence for TSS sequence motifs makes it difficult to determine whether apparent altered sequence specificity is a cause or consequence of altered TSS distributions.
Altered TSS motif efficiency and usage across a number of TSS motifs
To examine this further, we looked at the overall shapes of TSS distributions to determine if mutants alter the shapes of TSS distributions or merely shift them (Fig. 4). To do this, we examined overall usage across TSS motifs as well as for particular TSS motifs. In parallel, we examined efficiencies of TSS usage for individual TSS motifs (Fig. 4a, b). Efficiency is determined by the ratio of observed reads for a particular TSS to the sum of those reads and all downstream reads, as defined by Kuehner and Brow [62] (Fig. 4b). A scanning mechanism predicts first-come, first-served behavior in observed TSS usage dependent on the innate efficiency of a given TSS (Fig. 4b). Scanning from upstream to downstream will create greater apparent usage for upstream TSSs relative to downstream TSSs, even if they are equally strong in promoting initiation. If Pol II mutants primarily affect initiation efficiency across TSSs, we have specific expectations for how TSS distributions will be affected. For example, if slow Pol II alleles decrease initiation efficiency across sequences, we predict that the usage distribution will be flatter than WT. This “flatness” will appear as a downstream shift in usage, but will result in the median observed TSS efficiency being lower than WT over all promoter positions except for the very downstream tail of usage. This would reflect a spreading out of the usage distribution to downstream positions, as fewer Pol II molecules would initiate at upstream positions and more Pol II would continue to scan downstream relative to WT. Conversely, if fast Pol II alleles increase initiation efficiency across sequences, we would predict that both TSS usage and median efficiency increase for upstream promoter positions but return to baseline efficiency sooner than WT.
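The first-come, first-served expectation described above can be made concrete with a small forward model. This is a minimal sketch under simplifying assumptions that are not taken from the analysis itself: each candidate TSS has a fixed intrinsic efficiency (the probability that a scanning PIC arriving at that site initiates there), every PIC scans strictly from upstream to downstream, and a Pol II mutant is modeled as a uniform scaling of all efficiencies. All function names are hypothetical.

```python
def usage_from_efficiencies(efficiencies):
    """Expected fraction of scanning PICs that initiate at each site, upstream first."""
    remaining = 1.0  # fraction of PICs still scanning when they reach this site
    usage = []
    for eff in efficiencies:
        usage.append(remaining * eff)
        remaining *= 1.0 - eff
    return usage


def scale_efficiencies(efficiencies, factor):
    """Model a mutant as a uniform change in efficiency at all sites (capped at 1)."""
    return [min(1.0, eff * factor) for eff in efficiencies]


def fraction_at_first_site(usage):
    """Share of all initiation events in the window that occur at the first site."""
    return usage[0] / sum(usage)


# Four equally strong sites still give declining usage downstream.
wt_eff = [0.5, 0.5, 0.5, 0.5]
wt_usage = usage_from_efficiencies(wt_eff)      # -> [0.5, 0.25, 0.125, 0.0625]

# Halving every efficiency (a "slow" allele in this toy model) flattens the
# distribution: [0.25, 0.1875, 0.140625, 0.10546875].
slow_usage = usage_from_efficiencies(scale_efficiencies(wt_eff, 0.5))
```

In this toy model, the upstream-most site accounts for about 53% of initiation events with the WT efficiencies but only about 37% after halving, which is the flattening and apparent downstream shift predicted for reduced initiation efficiency.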
To partially account for innate sequence differences among TSS motifs, we examined TSS usage and efficiency across promoters for specific N−8Y−1R+1 motifs (Fig. 4c, Additional file 1: Fig. S5). Usage is defined as the reads found at a particular TSS relative to the total reads for that promoter, whereas efficiency is an estimate of the strength of a TSS, assuming a polar scanning process as illustrated in Fig. 4b. Extending this motif analysis to a range of N−8Y−1R+1 motifs used at different levels (Fig. 4d, e, Additional file 1: Fig. S5a-d), we observe that upstream-shifting mutants shift TSS usage upstream for all examined motifs (Fig. 4d). Conversely, downstream-shifting mutants have the opposite effects on motif usage for all examined motifs. When examining N−8Y−1R+1 motif efficiencies across promoter positions, downstream-shifting mutants tended to reduce efficiencies across promoter positions while upstream-shifting mutants shifted TSS efficiencies upstream (Fig. 4e). These analyses are consistent with upstream-shifting mutants exhibiting increased efficiency across TSS motifs and promoter positions, which shifts both the usage and observed efficiency distributions to upstream positions. Downstream-shifting mutants, in contrast, reduced the efficiency curves and essentially flattened the usage distributions, as would be expected from reduced initiation efficiency across promoter positions. Analysis indicates broad statistical significance for TSS usage and efficiency effects for examined rpb1 H1085Y and E1103G mutants across promoter positions and TSS motifs (Additional file 1: Fig. S5c,d).
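The definitions of usage and efficiency given above translate directly into code. The following is a minimal sketch that assumes a promoter window is a list of read counts ordered from upstream to downstream; the function names are hypothetical, and the treatment of positions with no remaining downstream reads (efficiency set to 0) is a choice made here for illustration.

```python
def tss_usage(counts):
    """Reads at each position divided by total reads in the promoter window."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("window has no reads")
    return [n / total for n in counts]


def tss_efficiency(counts):
    """Reads at each position divided by reads at that position plus all downstream reads."""
    remaining = sum(counts)  # reads at this position and everything downstream
    efficiencies = []
    for n in counts:
        efficiencies.append(n / remaining if remaining > 0 else 0.0)
        remaining -= n
    return efficiencies


window = [50, 25, 25]
usage = tss_usage(window)            # -> [0.5, 0.25, 0.25]
efficiency = tss_efficiency(window)  # -> [0.5, 0.5, 1.0]
```

The example shows why the two metrics differ: the first two sites have the same efficiency (0.5) but the upstream site has twice the usage. The last position with reads in a window always has an efficiency of 1 by construction, so efficiencies near the downstream edge of a window should be interpreted with care.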
Analysis of promoter architecture to understand the location of PIC assembly and estimate scanning distances for yeast promoters
High-resolution TSS data allow us to evaluate promoter features and their potential relationships to observed median TSS positions instead of using annotated TSSs from the Saccharomyces Genome Database (one per gene and not necessarily accurate). For example, in a scanning mechanism, TSSs may have evolved at different distances from the point of scanning initiation. This would mean that different promoters may have different average scanning distances, which could result in differential sensitivity to perturbation to initiation. As has previously been determined, a minority of yeast promoters contain consensus TATA elements (TATAWAWR) and these are enriched in Taf1-depleted promoters (illustrated in Fig. 5a) within ~ 50–100 bp upstream of TSS clusters. Furthermore, TATA enrichment tracks with apparent expression level determined by total RNA 5′ reads within promoter windows. For this class of promoter, a consensus TATA element seems the likely anchor location for PIC assembly and the determinant for the beginning of the scanning window. However, TATAWAWR elements are not enriched in Taf1-enriched promoters. On the basis of finding TATA-like elements within an apparent stereotypical ChIP-exo signal for GTFs, it has been proposed by Rhee and Pugh that promoters lacking consensus TATA elements can use TATA-like elements (TATAWAWR with one or two mismatches) analogously to a TATA element [14]. Therefore, such elements might potentially serve as core promoter elements anchoring PIC formation and determining the scanning distance for these promoters. Evidence for the function of such TATA-like elements is sparse. In vitro experiments suggested that a TBP footprint is positioned over potential TATA-like element in the RPS5 promoter, but the element itself is not required for this footprint [92]. In contrast, more recent results have suggested modest requirement for TATA-like elements at three promoters (~ 2-fold) in an in vitro transcription system [93]. 
Examination of the prevalence of elements with two mismatches from TATA consensus TATAWAWR within relatively AT-rich yeast promoter regions suggests that there is a high probability of finding a TATA-like element for any promoter (Fig. 5a). Taf1-enriched promoters show enrichment for an alternate sequence motif, a G-capped A tract (sequence GAAAAA), also called the GA element (GAE) [36, 37]. This positioning of GAEs approximately 50–100 bp upstream of TSSs is reminiscent of TATA positioning (Fig. 5a), and the GAE has been proposed to function as a core promoter element at non-TATA promoters [37]. Other studies describe the relationship of this element to nucleosome positioning and suggest that these elements may function directionally in nucleosome remodeling at NDR promoters as asymmetrically distributed poly dA/dT elements [94, 95]. To understand if these potential elements function in gene expression, which would be predicted if they served as potential PIC assembly locations, we cloned a number of candidate promoters upstream of a HIS3 reporter and deleted or mutated identified TATA, TATA-like, or GAE elements and examined effects on expression by Northern blotting (Fig. 5b, Additional file 1: Fig. S6). As expected, identified consensus TATAs positioned upstream of TSSs were important for promoter-driven expression of the HIS3 reporter. In contrast, neither TATA-like nor GAE elements in general had strong effects on expression, though some individual mutations affected expression to the same extent as mutation of TATA elements in the control promoter set. We conclude that GAE or TATA-like elements do not generally function similarly to consensus TATAs for promoter expression.
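The notion of a TATA-like element as a match to the TATAWAWR consensus with a limited number of mismatches can be illustrated with a short scan. This is a sketch under stated assumptions rather than the procedure used for Fig. 5a: it searches a single strand only, uses the standard IUPAC meanings W = A or T and R = A or G, counts any non-matching base (including N) as a mismatch, and uses hypothetical function names.

```python
# IUPAC codes needed for the TATAWAWR consensus.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "AT", "R": "AG"}


def count_mismatches(site, consensus="TATAWAWR"):
    """Number of positions where `site` is not allowed by the IUPAC consensus."""
    return sum(base not in IUPAC[code] for base, code in zip(site, consensus))


def find_elements(seq, consensus="TATAWAWR", max_mismatches=0):
    """Return (start, mismatches) for every window within `max_mismatches` of consensus."""
    seq = seq.upper()
    width = len(consensus)
    hits = []
    for start in range(len(seq) - width + 1):
        mismatches = count_mismatches(seq[start:start + width], consensus)
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


promoter = "GGTATAAAAGCC"
consensus_hits = find_elements(promoter)                    # -> [(2, 0)]
tata_like_hits = find_elements(promoter, max_mismatches=2)  # -> [(0, 2), (2, 0)]
```

Even in this 12-nt example, allowing two mismatches returns an additional overlapping hit, which gives some intuition for why TATA-like matches are expected to be common in AT-rich promoter regions.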
TSS-shifting initiation mutants alter PIC component positioning consistent with the promoter scanning model
Given results above suggesting that TATA-like or GAE elements may not generally function as core promoter elements and therefore may lack value as potential PIC landmarks, we performed ChIP-exo for GTFs TFIIB (Sua7) and TFIIH (Ssl2) to directly examine PIC component localization in WT, rpb1 H1085Y, and rpb1 E1103G cells (Fig. 5c). Element-agnostic analyses of ChIP-exo [96] for Sua7 and Ssl2 were performed in duplicate for all strains. ChIP-exo v5.0 signal was highly reproducible (Additional file 1: Fig. S7a,b). We reasoned that ChIP-exo would allow us to determine where the PIC localizes for all promoter classes and, moreover, how PIC localization may be altered by Pol II mutants that alter TSS utilization. As discussed above, previous work anchored ChIP-exo signal for PIC components over TATA or TATA-like sequences and identified a stereotypical overall pattern for crosslinks relative to these anchor positions. These crosslink patterns were interpreted as relating to potential structure of the PIC open complex [14]. Subsequent work has identified that crosslinking in ChIP-exo can have some sequence bias [97] and this sequence bias may reflect partially the stereotypical crosslinking patterns observed around TATA/TATA-like sequences. Because the PIC must access TSSs downstream from the site of assembly, it is likely that observed ChIP-exo signal reflects the occupancies of PIC components across promoters and not only the site(s) of assembly. Using TATA-like sequences as anchors, Taf1-enriched promoters were found to have PIC components on average closer to TSSs than they were for Taf1-depleted promoters [14]. Here, we used our high-resolution TSS mapping data coupled with the determination of median position of ChIP-exo signal for Ssl2 or Sua7 within promoter windows to examine distance between putative PIC position and initiation zone as reflected by observed median TSSs (Fig. 5c–e). 
Figure 5c illustrates basic concepts of ChIP-exo in that the exonuclease approaches crosslinked promoter complexes from the upstream direction on the top DNA strand of a promoter and from the downstream direction on the bottom strand. Top and bottom strands are organized with the same upstream and downstream directions as they indicate the two DNA strands of a directional promoter region. Using median ChIP-exo signal within promoter windows for Ssl2 or Sua7 on top or bottom promoter strands (TOP or BOT), we find that this simple metric behaves as predicted for PIC component signal (Fig. 5d). Figure 5d shows the histogram for individual promoter median ChIP-exo positions for components on the two promoter strands. Sua7 signal is slightly upstream of Ssl2 signal, as expected for upstream and downstream components of the PIC, though there is considerable overlap in signal if considering TOP-BOT distance. We also confirm that on average, ChIP-exo signal for PIC components is closer to median TSS position for Taf1-enriched promoters than for Taf1-depleted promoters.
We reasoned that if ChIP-exo signal for PIC components at least partially reflects promoter scanning, i.e., the interaction of PIC components with downstream DNA between PIC assembly position and the zone of initiation, then Pol II mutants that alter TSS usage distribution should also alter PIC component distribution across promoters. We observed changes to the aggregate distribution of ChIP-exo signal for both Taf1-enriched and Taf1-depleted promoter classes. The most obvious effects observed were on the downstream edge of the PIC as detected by Ssl2 signal on the bottom strand of promoter DNA, especially for rpb1 H1085Y (Fig. 5e, Additional file 1: Fig. S7a-c). The shifts observed in aggregate are also observed if we examine shifts for ChIP-exo medians of promoters individually (Additional file 1: Fig. S7a-c). In single-molecule experiments examining putative promoter scrunching in the Pol II PIC, scrunching behavior was similar regardless of whether all NTPs (to allow initiation) were present [98]. This observation suggested the possibility that putative promoter scanning driven by TFIIH ATPase-mediated scrunching might be uncoupled from initiation (requiring additional NTPs). In other words, TFIIH translocation might continue independently of whether Pol II initiates. However, we observed altered PIC component localization in Pol II mutants predicted to directly alter initiation efficiency but not necessarily other aspects of PIC function such as TFIIH-mediated scanning (directly). Thus, there may in fact be coupling of initiation and scanning in vivo. Apparent coupling has been observed in magnetic tweezers experiments where a short unwinding event that is strictly TFIIH-dependent can be extended to a larger unwinding event by addition of NTPs, presumably reflecting Pol II transcription [99].
Relationships of TSS selection altering initiation mutants with promoter architectural features
TSSs evolve at certain distances from the site of PIC assembly. This means that TSSs will be found at a range of distances from sites of initial assembly and will theoretically require scanning of different distances. We asked whether presumed scanning distance correlated with promoter sensitivity to Pol II mutants for TSS shifts (Additional file 1: Fig. S8). We observed at most a very modest correlation for TSS-shifting extent based on where TSSs are relative to PIC location for Taf1-enriched promoters (Additional file 1: Fig. S8a). Even where correlation shows strong significance, such correlation explains only a small fraction of TSS shift relative to ChIP-exo positions. However, greater correlation between TSS shift in initiation mutants and ChIP-exo signal was observed for Taf1-depleted promoters having consensus TATA elements (Additional file 1: Fig. S8b). These latter promoters have putative PIC assembly points at greater distances from TSSs on average. Within the range of distances where most of these promoters have their TSSs, promoters with TSSs evolved at downstream positions show the greatest effects of upstream-shifting mutants on the TSS distribution (the TSS shift). Conversely, promoters with TSSs evolved at upstream positions show the greatest effects of downstream-shifting mutants. These results are consistent with a facet of promoter architecture correlating with altered initiation activity, but with potential upstream and downstream limiters on this sensitivity (see the “Discussion” section for more).
The majority of yeast promoters, especially the Taf1-enriched class, are found within an NDR and flanked by an upstream (− 1) and a downstream (+ 1) nucleosome. Previous work showed association between ChIP-exo for GTFs and + 1 nucleosomes [14]. ChIP-exo for PIC components appeared to correlate with nucleosome position for Taf1-enriched promoters. How the PIC recognizes promoters in the absence of a TATA box is an open question. Correlation of PIC ChIP-exo and nucleosome positions is consistent with the fact that TFIID has been found to interact with nucleosomes [100] and with the possibility that the + 1 nucleosome may be instructive for, or responsive to, PIC positioning. Nucleosomes have previously been proposed as barriers to Pol II promoter scanning to explain the shorter distance between PIC component ChIP-exo footprints and TSSs at Taf1-enriched promoters [14]. Nucleosomes can be remodeled or be moved by transcription in yeast [15, 101], likely during initiation. This is because even for promoters with NDRs, TSSs can be found within the footprints of the + 1 nucleosome. We do not observe a differential barrier to downstream shifting in Pol II or GTF mutants for Taf1-enriched promoters, which have positioned nucleosomes (Fig. 2b). Therefore, it remains unclear whether the + 1 nucleosome can act as a barrier for Pol II scanning or TSS selection from our existing data.
To determine if altered initiation and PIC positioning of Pol II mutants, especially downstream-shifting rpb1 H1085Y, occurs in conjunction with altered + 1 nucleosome positioning, we performed MNase-seq in rpb1 H1085Y and E1103G mutants along with a WT control strain (Fig. 6, Additional file 1: Fig. S9,10). Determination of nucleosome positioning by MNase-seq can be sensitive to a number of variables (discussed in [102]); therefore, we isolated mononucleosomal DNA from a range of digestion conditions and examined fragment length distributions in MNase-seq libraries from a number of replicates (Additional file 1: Fig. S9a) to ensure we had matched digestion ranges for WT and mutant samples. Our data recapitulate the observed relationship between PIC component and nucleosome positioning (Additional file 1: Fig. S9b,c) [14]. Nucleosomes and PIC component signal do correlate but in an intermediate fashion relative to PIC-TSS correlation, which appears more obvious. We asked if + 1 nucleosome midpoints were affected in aggregate, if array spacing over genes was altered, or if individual + 1 nucleosomes shifted on average in Pol II mutants vs. WT. Aligning genes of Taf1-enriched promoters by the + 1 nucleosome position in WT suggests that both rpb1 H1085Y and rpb1 E1103G nucleosomes show significantly increased nucleosome repeat length, which becomes visually obvious at the + 3, + 4, and + 5 positions relative to WT (Fig. 6a, b; Additional file 1: Fig. S10a,b,h). For rpb1 H1085Y, we observed a slight but apparently significant shift for the aggregate + 1 position (Fig. 6c, top). The downstream shift in aggregate + 1 position also is reflected at the individual nucleosome level across rpb1 H1085Y replicates (violin plots, Additional file 1: Fig. S10c). 
To ask if this effect on nucleosomes reflected a global defect across genes or instead correlated with transcription (whether it be initiation or elongation), we performed the same analyses on the top expression decile (Fig. 6c, middle, Additional file 1: Fig. S10d,e) and bottom expression decile Taf1-enriched promoters (Fig. 6c, bottom, Additional file 1: Fig. S10f,g). The downstream shift was apparent in top expression decile promoters but not in bottom expression decile promoters, as would be predicted if the alteration were coupled to transcription. For rpb1 E1103G, we observed a slight shift (~ 1 nt) (Fig. 6d, Additional file 1: Fig. S10h,i). To potentially identify subpopulations of nucleosomes, we employed a more sophisticated analysis of nucleosomes using the approach of Zhou et al. [102] (Additional file 1: Fig. S9b). This approach recapitulated a similarly slight effect of H1085Y on shifting the + 1 nucleosome downstream across most H1085Y datasets relative to WT.
Discussion
Budding yeast has been a powerful model for understanding key mechanisms for transcription by Pol II. An early identified difference in promoter behavior for yeast TATA-containing promoters from classically studied TATA-containing human viral promoters such as adenovirus major late led to proposals that initiation mechanisms were fundamentally different between these species [55, 103]. TSSs for yeast TATA promoters were found downstream and spread among multiple positions while TSSs for viral and cellular TATA promoters were found to be tightly positioned ~ 31 nt downstream of the beginning of the element [57]. This positioning for TSSs at TATA promoters holds for many species including S. pombe [104] but not budding yeast. This being said, genome-wide studies of initiation indicate that the vast majority of promoters use multiple TSSs, though evolution appears to restrict TSS usage at highly expressed promoters in multiple species, including budding yeast (our work, [30, 88, 90]). How these TSSs are generated and if by conserved or disparate mechanisms is a critical unanswered question in gene expression.
We have shown here that alterations to Pol II catalytic activity, as determined by mutations deep in the active and essential “trigger loop,” confer widespread changes in TSS distributions across the genome regardless of promoter type. Mutants in core Pol II GTFs TFIIB (sua7 mutant) or TFIIF (tfg2 mutant) confer defects of similar character to downstream-shifting or upstream-shifting Pol II alleles, respectively. The changes observed are consistent with a model (Fig. 7) wherein TSSs are displayed to the Pol II active site directionally from upstream to downstream, with the probability of initiation controlled by the rate at which sequences are displayed (scanning rate), and by Pol II catalytic rate. This system is analogous to a “shooting gallery” where targets (TSSs) move relative to a fixed firing position (the Pol II active site) [105]. In this model, Pol II catalytic activity, the rate of target movement, i.e., scanning rate, and the length of DNA that can be scanned, i.e., scanning processivity, should all contribute to initiation probability at any particular sequence. Biochemical potential of any individual sequence will additionally contribute to initiation efficiency. Our results suggest that Pol II and tested GTF mutants affect initiation efficiency across sequence motifs and that differential effects in apparent motif usage genome-wide likely result from skewed distributions of bases within yeast promoters. Our in vivo results are consistent with elegant in vitro transcription experiments showing that reduction of ATP levels (substrate for initiating base or for bases called for in very early elongation) confers downstream shifts in start site usage [106]. Reduction in substrate levels in vitro, therefore, is mimicked by reduction of catalytic activity in vivo.
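The directional logic of the shooting gallery can be illustrated with a toy calculation. The sketch below is not the quantitative model of this study: it assumes that each candidate site has a fixed probability of initiation (its efficiency) once reached, that complexes failing to initiate always scan on to the next site (unlimited processivity), and that a change in catalytic activity can be approximated by scaling every efficiency by one factor. The efficiency values, the scaling factors, and all function names are hypothetical.

```python
# Toy "shooting gallery" sketch; efficiencies and scaling factors are
# illustrative assumptions, not values measured in this study.

def usage_from_efficiencies(effs):
    """Convert per-site efficiencies (ordered upstream to downstream) to usage.

    A complex reaching site i initiates there with probability effs[i];
    otherwise it scans on to the next downstream site.
    """
    remaining = 1.0  # fraction of scanning complexes that have not yet initiated
    usage = []
    for e in effs:
        usage.append(remaining * e)
        remaining *= 1.0 - e
    return usage


def mean_position(usage):
    """Usage-weighted mean TSS position, in index units."""
    total = sum(usage)
    return sum(i * u for i, u in enumerate(usage)) / total


def scale(effs, factor):
    """Hypothetical stand-in for altered catalytic activity: scale each efficiency."""
    return [min(1.0, e * factor) for e in effs]


wt = [0.1, 0.3, 0.2, 0.4, 0.5]  # made-up efficiencies for five candidate sites
for label, factor in [("reduced activity", 0.5), ("WT", 1.0), ("increased activity", 1.5)]:
    usage = usage_from_efficiencies(scale(wt, factor))
    print(label, round(mean_position(usage), 2))
```

Under these assumptions, lowering all efficiencies moves the usage-weighted mean position downstream and raising them moves it upstream, which is the polar behavior described above. Scanning rate and processivity limits are deliberately left out of the sketch.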
How template sequence contributes to initiation beyond positions close to the template pyrimidine specifying the initial purine, and how it interacts with scanning, is an open question. For models employing a scanning mechanism such as the “shooting gallery,” it can be imagined that bases adjacent to the TSS affect TSS positioning to allow successful interaction with the first two NTPs, while distal bases such as the -8T on the template strand (-8A on the non-template strand) stabilize or are caught by interaction with the yeast TFIIB “reader” to hold TSSs in the active site longer during scanning [54]. Critical to this model are the structural studies just cited of Sainsbury et al. [54] on an artificial initial transcribing complex showing direct interaction of Sua7 D69 and R64 and -8T and -7T on the template strand. There are a number of ways TFIIB may alter initiation efficiency beyond recognition of upstream DNA. TFIIB has also been proposed by Sainsbury et al. to allosterically affect Pol II active site Mg2+ binding and RNA-DNA hybrid positioning [11, 54]. Direct analysis of Kuehner and Brow [62] found evidence for lack of effect of sua7 R64A on efficiency of one non--8A site, while -8A sites were affected, consistent with this residue functioning as proposed. We isolated individual motifs to examine efficiency (Fig. 4c), and our tested sua7-58A5 allele reduced efficiencies of both -8A and non--8A motifs alike. This allele contains a five-alanine insertion at position 58 in Sua7, likely reducing efficiency of the B-reader but possibly leaving some R64 interactions intact. Specific tests of Sua7 R64 mutants under controlled promoter conditions will directly address whether this contact confers TSS selectivity. Additionally, altered selectivity alleles of Sua7 would be predicted if interactions with the template strand were altered.
Core transcriptional machinery for Pol II initiation is highly conserved in eukaryotes leading to the general expectation that key mechanisms for initiation will be conserved. While it has long been believed that budding yeast represents a special case for initiation, this has not systematically been addressed in eukaryotes. The question of how broadly conserved initiation mechanisms are in eukaryotic gene expression is open for a number of reasons. There are examples of diverse transcription mechanisms within organisms across development, for example tissues, cells, or gene sets using TBP-related factors to replace TBP in initiation roles. For example, in zebrafish, distinct core promoter “codes” have been described for genes that are transcribed in oocytes (maternal transcription) versus those transcribed during zygotic development (zygotic transcription) [107]. The maternal code is proposed to utilize an alternate TBP for initiation, while zygotic promoters utilize TBP. Distinct core promoters are used to drive maternal and zygotic expression. For genes transcribed both maternally and zygotically, distinct TSS clusters specific to each phase of development can be quite close to one another in the genome and may have superficially similar distribution characteristics, for example promoter widths or spreads. Comparison of TSS distributions using analyses aware of distribution of possible TSSs would be a powerful tool to probe initiation mechanisms.
Another major question is how promoters without TATA elements are specified. Organization of PIC components is relatively stereotypical within a number of species, as detected by ChIP methods for Pol II and GTFs [14, 108, 109], with the caveat that these are population-based approaches. The most common organization for promoters across examined eukaryotes is an NDR flanked by positioned nucleosomes. Such NDRs can support transcription bidirectionally, reflecting a pair of core promoters with TSSs proximal to the flanking nucleosomes [20, 21, 24,25,26, 110,111,112]. While sequence elements have been sought for these promoters, an alternate attractive possibility is that NDR promoters use nucleosome positioning to instruct PIC assembly. The association of TSSs with the edges of nucleosomes is striking across species, though in species with high levels of promoter proximal pausing, nucleosomes may be positioned downstream of the pause. Transcription itself has been linked to promoter nucleosome positioning, turnover, or exchange in yeast (for example, see [101]). Bulk nucleosome positions are detected in MNase analysis. The ability to detect the initiating state of chromatin will depend on kinetics of initiation and the duration of chromatin states supporting initiation (expected to be relatively infrequent). Therefore, the nature of initiating chromatin is unclear.
Finally, how does initiation interact with nucleosomes? In a scanning model, Pol II activity will not be expected to control the interactions with the downstream nucleosome. Instead, TFIIH bound to downstream DNA and translocating further downstream to power scanning will be expected to be the major interaction point of the PIC and the + 1 nucleosome. This model explains why downstream nucleosomes may not limit changes to scanning incurred by alterations to Pol II activity, because Pol II will be acting downstream of the TFIIH-nucleosome interaction. DNA translocation by TFIIH is expected to be competitive with the + 1 nucleosome for DNA as scanning proceeds into the territory of the nucleosome. Indeed, transcription and TFIIH activity are proposed to drive H2A.Z exchange in the + 1 nucleosome [101]. How TFIIH activity is controlled to either allow scanning in addition to promoter opening or be restricted to promoter opening is a major question in eukaryotic initiation. The S. cerevisiae CDK module of TFIIH has been implicated in restricting initiation close to the core promoter in vitro, but no evidence has emerged in vivo for this mechanism [113]. TFIIH components have long been implicated in controlling activities of the two ATPases—Ssl2 and Rad3 in yeast, XPB and XPD in humans—to enable or promote transcription or nucleotide excision repair [114,115,116]. These inputs may regulate activity of ATPases and their ability to be coupled to translocation activity analogous to paradigms for DNA translocase control in chromatin remodeling complexes [117].
Methods
Yeast strains, plasmids, and oligonucleotides
Yeast strains used in this study were constructed as described previously [80,81,82]. Briefly, plasmids containing rpo21/rpb1 mutants were introduced by transformation into a yeast strain containing a chromosomal deletion of rpo21/rpb1 but with a wild type RPO21/RPB1 URA3 plasmid, which was subsequently lost by plasmid shuffling. GTF mutant parental strains used for GTF single or GTF/Pol II double mutant analyses were constructed by chromosomal integration of GTF mutants into their respective native locus by way of two-step integrations [80]. Strains used in ChIP-exo were TAP-tagged [118] at target genes (SSL2, SUA7) using homologous recombination of TAP tag amplicons obtained from the yeast TAP-tag collection [119] (Open Biosystems) and transferred into our lab strain background [120]. All strains with mutations at chromosomal loci were verified by selectable marker, PCR genotyping, and sequencing. rpo21/rpb1 mutants were introduced to parental strains with or without chromosomal GTF locus mutation by plasmid shuffling [121], selecting for cells containing rpo21/rpb1 mutant plasmids (Leu+) in the absence of the RPB1 WT plasmid (Ura−), thus generating single rpo21/rpb1 mutation strain or double mutant strains combining mutations in GTF and rpo21/rpb1 alleles. Yeast strains in all experiments were grown on YPD (1% yeast extract, 2% peptone, 2% dextrose) medium unless otherwise noted. Mutant plasmids for yeast promoter analyses were constructed by Quikchange mutagenesis (Stratagene) following adaptation for use of Phusion DNA polymerase (NEB) [122]. All oligonucleotides were obtained from IDT. Yeast strains, plasmids, and oligonucleotide sequences are described in Additional file 2.
Sample preparation for 5′-RNA sequencing
Yeast strains were diluted from a saturated overnight YPD culture and grown to mid-log phase (~ 1.5 × 10⁷/ml) in YPD and harvested. Total RNA was extracted by a hot phenol-chloroform method [123], followed by on-column incubation with DNase I to remove DNA (RNeasy Mini kit, Qiagen), and processing with a RiboZero rRNA removal kit (Epicentre/Illumina) to deplete rRNA. To construct the cDNA library, samples were treated with Terminator 5′ phosphate-dependent exonuclease (Epicentre) to remove RNAs with 5′ monophosphate (5′ P) ends, and remaining RNAs were purified using acid phenol/chloroform pH 4.5 (Ambion) and precipitated. Tobacco acid pyrophosphatase (TAP, Epicentre) was added to convert 5′ PPP or capped RNAs to 5′ P RNAs. RNAs were purified using acid phenol/chloroform and a SOLiD 5′ adaptor was ligated to RNAs with 5′ P (this step excludes 5′ OH RNAs), followed by gel size selection of 5′ adaptor ligated RNAs and reverse transcription (SuperScript III RT, Invitrogen) with 3′ random priming. RNase H (Ambion) was added to remove the RNA strand of DNA-RNA duplexes, and cDNA was size selected for 90–500 nt lengths. For SOLiD sequencing, these cDNA libraries were amplified using SOLiD total RNA-seq kit (Applied Biosystems) and SOLiD Barcoding kit (Applied Biosystems), and final DNA was gel size selected for 160–300 nt length and sequenced by SOLiD (Applied Biosystems) as described previously [124, 125].
5′-RNA sequencing data analyses
SOLiD TSS raw data for libraries 446–465 was based on 35 nt short reads. The data were delivered in XSQ format and subsequently converted into Color Space csfasta format. Raw data for libraries VV497-520 were in FASTQ format. Multiple read files from each library were concatenated and aligned to S. cerevisiae R64-1-1 (SacCer3) reference genome from Saccharomyces Genome Database. We explored the possibility that alignments might be affected by miscalling of the 5′ end base of the SOLiD reads. We trimmed one base at the 5′ end of the reads of the TSS libraries VV497-520 and aligned the trimmed reads independently from the raw reads for direct comparison. The alignment rates did not differ significantly, indicating that the 5′ ends of our SOLiD library reads were not enriched for sequencing errors relative to the rest of the reads. Sequences were aligned with Bowtie [126] allowing 2 mismatches but only retaining uniquely mapped alignments. The aligned BAM files were converted to bedgraphs, and the 5′ base (start tag) in each aligned read was extracted using Bedtools (v2.25.0) for downstream analyses [127]. Mapping statistics for TSS-seq, MNase-seq, and ChIP-exo libraries are described in Additional file 3.
To assess the correlation between biological replicates and different mutants, base-by-base coverage correlation between libraries was calculated for all bases genome-wide and for bases up and downstream of the promoter windows identified by [14] (408 nt total width, described below). Given that Pearson correlation is sensitive to variability at lower coverage levels, we examined correlations for positions above a threshold of ≥ 3 reads in each library. Heat scatter plots were generated by the LSD R package (4.0–0) and compiled in Adobe Photoshop. Heat maps were generated using Morpheus (https://software.broadinstitute.org/morpheus/) or Java TreeView [128] and Cluster [129].
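The thresholded correlation described above can be sketched as follows. This is a minimal illustration in plain Python rather than the R code used for the study; the function name and the toy coverage vectors are hypothetical, and the threshold is applied as "at least 3 reads in both libraries at that position," which is our reading of the text.

```python
import math


def thresholded_pearson(lib_a, lib_b, min_reads=3):
    """Pearson correlation over positions with >= min_reads in both libraries.

    lib_a and lib_b are per-base coverage lists of equal length.
    Returns None when fewer than two positions pass or a variance is zero.
    """
    pairs = [(a, b) for a, b in zip(lib_a, lib_b)
             if a >= min_reads and b >= min_reads]
    n = len(pairs)
    if n < 2:
        return None
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        return None
    return cov / math.sqrt(var_a * var_b)


# Toy coverage vectors: the first and last positions fall below the threshold.
replicate_1 = [0, 3, 6, 9, 1]
replicate_2 = [5, 6, 12, 18, 2]
print(thresholded_pearson(replicate_1, replicate_2))
```

Filtering before computing the coefficient keeps sparsely covered positions, where counting noise dominates, from driving the correlation.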
To create base-by-base coverage in selected windows of interest, computeMatrix reference-point() function from the deepTools package (2.1.0) was used [130]. There were two types of windows of interest. First, the promoter windows were established by expanding 200 nt up and downstream from the TATA/TATA-like elements identified by [14] (here we term them TATA/TATA-like centered windows) (408 nt total width). Most of these windows (5945/6044) were centered on TATA/TATA-like element annotated in [14], while 99 promoters did not have annotated TATA/TATA-like element and were centered on the TFIIB ChIP-exo peak. Second, we established windows centered on transcription start sites (TSSs) to investigate TSSs at promoters in a core promoter element-independent manner (here we term them TSS-anchored windows). For the TSS-anchored windows, we first determined the 50th percentile (median) TSS (see next paragraph for details) in the TATA/TATA-like centered promoter windows with WT TSS reads derived from RPB1 WT libraries VV446, VV456, VV497, and VV499 (see below) and expanded 200 nt upstream and 200 nt downstream from this “median” TSS position (401 nt total width), adjusting this window one time based on new TSSs potentially present after shifting the window, and then displaying 250 nt upstream and 150 nt downstream from the median TSS position.
Several characteristics of TSS utilization were calculated as follows: (1) The position of the TSS containing the 50th percentile of reads in the window was termed the “median” TSS. (2) Distance between 10th percentile and 90th percentile TSS position in each promoter was used to measure the width of the TSS distribution, termed the “TSS Spread.” Specifically, TSS positions with 10th and 90th percentile reads were determined in a directional fashion (from upstream to downstream), and the absolute value of the difference between the two positions was taken as the “TSS Spread.” (3) Total reads in windows of interest were summed as a measurement of apparent expression. (4) Normalized densities in windows were calculated as fraction of reads at each TSS position relative to the total number of reads in the window. The normalized densities were subsequently used for examination of TSS usage distribution at each promoter independent of expression level, comparison among different libraries, start site usage pattern changes in mutants, and visualization. We observed that replicates of each strain (WT or mutant) were highly correlated at the base coverage level as well as in primary characteristics of TSS usage (distance to core promoter element, apparent expression) as independently shown by pairwise Pearson correlation and Principal Component Analysis (PCA) (prcomp() in R). We therefore aggregated the counts from replicate strains for downstream analyses (i.e., aligned reads for all replicates of each strain were combined and treated as single “merged library”). Mutant vs WT relative changes of median TSS (Fig. 1e), TSS spread, and normalized TSS densities (Fig. 2) in the indicated windows were calculated in R and visualized in Morpheus or Graphpad Prism 8.
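The four window-level quantities defined above can be sketched in a few lines. This is an illustrative Python version, not the R code used in the study; the function names are hypothetical, and a percentile position is taken here to be the first position, counting from upstream, at which cumulative reads reach the stated percentage of the window total, which is an assumption about the exact convention.

```python
def percentile_position(counts, percent):
    """First position (upstream to downstream) where cumulative reads reach
    `percent` of the window total. Returns None for an empty window."""
    total = sum(counts)
    running = 0
    for pos, c in enumerate(counts):
        running += c
        # Integer arithmetic avoids floating-point edge cases at the threshold.
        if c > 0 and running * 100 >= percent * total:
            return pos
    return None


def tss_metrics(counts):
    """Median TSS, TSS spread, total reads, and normalized densities for one
    promoter window of per-position 5' end counts (ordered upstream to downstream)."""
    total = sum(counts)
    if total == 0:
        return None
    p10 = percentile_position(counts, 10)
    p90 = percentile_position(counts, 90)
    return {
        "median": percentile_position(counts, 50),
        "spread": abs(p90 - p10),
        "total": total,
        "density": [c / total for c in counts],
    }


window = [0, 10, 0, 30, 40, 0, 20]  # toy counts, not real data
print(tss_metrics(window))
```

Because the densities are divided by the window total, they can be compared between promoters or libraries of different depth, whereas `total` retains the apparent expression level.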
Kruskal-Wallis test was employed to test how many promoters have non-identical distribution in all libraries, as previously described [131], with post hoc Dunn’s test to test how many promoters were significantly shifted in each mutant as compared to WT. Mann-Whitney U test was also employed to test how many promoters were significantly shifted in each mutant as compared to WT (p < 0.05) for all samples where n ≥ 3 biological replicates.
In the TSS motif analyses, two major characteristics were computed. First was TSS usage defined by the number of reads at each TSS divided by the total number of reads in the promoter window. Second, we calculated TSS efficiency by dividing TSS reads at an individual position by the reads at or downstream of the TSS, as a proxy to estimate how well each TSS gets utilized with regard to the available Pol II (TSS efficiency) [62]. TSS positions with ≥ 20% efficiency calculated with ≤ 5 reads were excluded (which definitionally are only found at the downstream edges of windows). The corresponding −8, −1, and +1 positions underlying each TSS (N₋₈N₋₁N₊₁ motif) were extracted by Bedtools getfasta (v2.25.0). Start site motif compilation was done by WebLogo for indicated groups of TSSs. Reads for each N₋₈N₋₁N₊₁ motif of interest were summed, and fraction of the corresponding motif usage in total TSS reads was calculated for each library. Differences of fraction of start site motif usage in WT and mutants were calculated by subtracting the WT usage fraction from that in each mutant.
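The efficiency calculation can be sketched as below. This is an illustrative Python version with hypothetical names. One point is an assumption: the exclusion rule is applied here when the denominator (reads at or downstream of the position) is ≤ 5 and the resulting efficiency is ≥ 20%; the text does not state whether the read cutoff refers to the denominator or to the reads at the TSS itself.

```python
def tss_efficiency(counts, min_eff=0.20, max_reads=5):
    """Per-position TSS efficiency for one promoter window.

    counts: reads per position, ordered upstream to downstream.
    Efficiency = reads at the position / reads at or downstream of it.
    Positions with no reads remaining, or with efficiency >= min_eff computed
    from <= max_reads remaining reads (assumed exclusion rule), get None.
    """
    remaining = sum(counts)  # reads at or downstream of the current position
    efficiencies = []
    for c in counts:
        if remaining == 0:
            efficiencies.append(None)
        else:
            eff = c / remaining
            if eff >= min_eff and remaining <= max_reads:
                efficiencies.append(None)  # too few reads to trust the estimate
            else:
                efficiencies.append(eff)
        remaining -= c
    return efficiencies


window = [10, 30, 40, 16, 4]  # toy counts, not real data
print(tss_efficiency(window))
```

In this toy window the last position would have an efficiency of 100% by construction, since no reads lie downstream of it, and it is excluded because only 4 reads support the estimate.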
Northern blotting and RNA analysis
Northern blotting was performed essentially as described [132]. In brief, 20 μg of yeast total RNA was prepared in Glyoxal sample load dye (Ambion) and separated by 1% agarose gel electrophoresis. RNA was transferred onto a membrane by capillary blotting before pre-hybridization. Pre-hybridization solution contained 50% formamide, 10% Dextran sulfate, 5× Denhardt’s solution, 1 M NaCl, 50 mM Tris-HCl pH 7.5, 0.1% SDS, 0.1% sodium pyrophosphate, and 500 μg/ml denatured salmon sperm DNA. DNA double-stranded probes were generated by PCR and radiolabeled with 32P-dATP using the Decaprime II kit (Ambion) according to the manufacturer’s instructions. Blots were hybridized overnight at 42 °C and washed twice each in 2× SSC for 15 min at 42 °C, in 5× SSC with 0.5% SDS for 30 min at 65 °C, and in 0.2× SSC for 30 min at room temperature. Blots were visualized by phosphorimaging (Bio-Rad or GE Healthcare) and quantified using Quantity One (Bio-Rad).
ChIP-exo sequencing
Yeast cells containing the TAP-epitope [118, 119] were grown to an OD of 0.8 then crosslinked with formaldehyde to a final concentration of 1% for 15 min at room temperature. Crosslinking was quenched with a molar excess of glycine for 5 min at room temperature. Crosslinked cells were pelleted, washed, and then lysed in FA lysis buffer [133] using a chilled (− 20 °C) beadbeater for 3 min. The released nuclei were then pelleted and subsequently resuspended in 600 μl of FA Lysis buffer. The resuspended nuclei were sonicated in a Diagenode Bioruptor Pico for 12 cycles (15 s on/30 s off). Sonicated chromatin was then incubated overnight on Dynabeads conjugated with rabbit IgG (i5006). ChIP-exo was then performed as previously described [96]. The resulting ChIP-exo libraries were sequenced on a NextSeq 500 in paired-end mode: read 1, 40 bp and read 2, 36 bp with dual 8 bp indexes. Data were aligned to yeast R64-1-1 with BWA-MEM [134] with low-quality reads and PCR duplicates removed by Picard (http://broadinstitute.github.io/picard/) and samtools [135].
Nucleosome MNase sequencing
Nucleosomal DNAs were prepared by a method described elsewhere [136] with the following modifications. Yeast strains were grown in rich medium (YPD) to mid-log phase (~ 1.5 × 10⁷/ml) and crosslinked with methanol-free formaldehyde (1% final concentration, Polysciences Inc) for 30 min and quenched with 0.25 M final concentration of glycine (from 2.5 M stock, pH 7). Cells were washed and digested with zymolyase-20T (Sunrise International) (6 mg for 500 ml culture) for ~ 17 min or until ~ 90% of cells appeared as spheroplasts, followed by MNase (Thermo Fisher Scientific) digestion with different amounts of MNase to generate “less” and “more” digested nucleosomes (in general, digests were limited such that at least mono-, di-, and trinucleosomes were still apparent after agarose gel electrophoresis). Crosslinks on nucleosomes were reversed at 65 °C in the presence of Proteinase K (G-Biosciences) overnight. DNA was extracted by phenol/chloroform and digested with RNase A (Thermo Fisher Scientific) to remove RNAs. Nucleosomal DNA was separated on 1.5% agarose gels containing SYBR gold dye (Thermo Fisher Scientific) and mono-nucleosome bands were identified and selected under blue light and gel purified (Omega Biotek). Mononucleosomal DNA fragments were sequenced on an Illumina HiSeq 2500 instrument (2 × 125 paired-end sequencing). Paired-end nucleosome reads were aligned to V64 (SacCer3) reference genome using Bowtie2 [137] allowing 1 mismatch, with only uniquely mapped alignments kept. We used Samtools [135] to extract the alignments to build genome coverage for visualization and to obtain start and end positions of sequenced DNA fragments. Using the start and end positions of each fragment, fragment length and midpoint position of each fragment were calculated.
Midpoints were analyzed in two main windows of interest. First was median TSS centered window (− 250 upstream and + 150 downstream based on median TSS position as above). Second, windows were identified based on determined WT + 1 nucleosome peak position, as described below using custom scripts (NucSeq v1.0) [138]. Midpoints were assigned to relative coordinates of the window and smoothed using a triweight kernel (75 nt up/downstream total width with a uniform kernel with 5 nt up/downstream width) to get a “smoothed” midpoint profile. The nucleosome peak was called by identifying the local maximum using the smoothed profile. This method enabled us to call a single peak position in ranges of 150 nt windows using the smoothed nucleosome midpoint profiles, thus determining one peak per nucleosome. Average chromosomal coverage (sum of raw midpoints divided by current chromosome length) was calculated for each chromosome as a read threshold per position. The first peak downstream of the median TSS position that had larger than or equal to 20% of chromosomal average coverage and was also within a reasonable position range for a + 1 nucleosome was annotated as the + 1 nucleosome peak at each promoter (if present). + 1 nucleosome peaks were separately identified in two WT libraries (replicates for “less” and “more” digested chromatin). The replicates for “less” digested WT + 1 nucleosome peaks showed greater correlation. Five hundred nucleotides up/downstream of these base positions led to 5660 + 1 nucleosome centered 1001 nt wide windows, allowing observation of up to 8 nucleosomes surrounding + 1 nucleosomes. Nucleosome midpoints were subsequently assigned to this window using the same method as above. Aggregated nucleosome midpoint analysis was done by sorting the promoters by promoter class, expression level (TSS reads in window) followed by summing the nucleosome midpoint counts at each position in the window. 
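The smoothing and peak-calling step can be sketched as follows. This is not the custom script used in the study. It reads the kernel description as triweight smoothing with a 75 nt half-width followed by uniform smoothing with a 5 nt half-width, which is an assumption, and it omits the coverage threshold and the positional constraints applied when annotating + 1 nucleosomes. All function names and the toy midpoint profile are hypothetical.

```python
def triweight_weights(half_width):
    """Normalized triweight kernel, K(u) proportional to (1 - u^2)^3 on |u| < 1."""
    raw = [(1 - (d / (half_width + 1)) ** 2) ** 3
           for d in range(-half_width, half_width + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def uniform_weights(half_width):
    """Normalized uniform (boxcar) kernel."""
    n = 2 * half_width + 1
    return [1.0 / n] * n


def smooth(profile, weights):
    """Convolve a per-position profile with a symmetric kernel.

    Positions outside the window are treated as zero (no edge renormalization).
    """
    half = len(weights) // 2
    n = len(profile)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(weights):
            j = i + k - half
            if 0 <= j < n:
                acc += w * profile[j]
        out.append(acc)
    return out


def call_peak(midpoints, tri_half=75, uni_half=5):
    """Position of the maximum of the smoothed midpoint profile."""
    smoothed = smooth(smooth(midpoints, triweight_weights(tri_half)),
                      uniform_weights(uni_half))
    return max(range(len(smoothed)), key=lambda i: smoothed[i])


# Toy midpoint counts in a 401 nt window: a cluster near 200 and a stray midpoint.
profile = [0] * 401
profile[198], profile[200], profile[202], profile[50] = 3, 5, 3, 1
print(call_peak(profile))
```

A wide kernel merges midpoints belonging to one nucleosome into a single maximum, which is what allows one peak to be called per nucleosome-sized interval.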
To determine nucleosome repeat length, we first mapped nucleosome midpoints to windows spanning 200 nt upstream and 800 nt downstream of the determined average + 1 nucleosome positions in WT, and subsequently computed the autocorrelation as a function of distance to estimate the periodicity of the nucleosome midpoint peak signals. The periodicity of nucleosome signals was first confirmed by the sinusoidal shape of the autocorrelation function, and the nucleosome repeat length was then estimated as the distance of the first positive peak (> 0.05) of the autocorrelation function at non-zero distance. Kernel smoothing (5 nt up/downstream width) was applied to the autocorrelation function before peak calling to minimize outlier bias.
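The repeat-length estimate described above can be sketched under stated assumptions. The text does not specify the autocorrelation estimator or the smoothing kernel, so the sketch assumes a standard mean-centered estimator normalized by the lag-0 value and a uniform (moving-average) kernel. The function names and the synthetic 165 nt periodic signal are illustrative only and are not taken from the data.

```python
import math


def autocorrelation(signal, max_lag):
    # Mean-centered autocorrelation, normalized so that lag 0 equals 1.
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    var = sum(c * c for c in centered)
    acf = []
    for lag in range(max_lag + 1):
        cov = sum(centered[i] * centered[i + lag] for i in range(n - lag))
        acf.append(cov / var)
    return acf


def moving_average(values, half_width=5):
    # Uniform kernel of +/- half_width positions (assumed kernel type);
    # windows are truncated at the ends of the series.
    n = len(values)
    out = []
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def repeat_length(acf, min_height=0.05):
    # Lag of the first local maximum at non-zero lag exceeding min_height.
    for lag in range(1, len(acf) - 1):
        if (acf[lag] > min_height
                and acf[lag] > acf[lag - 1]
                and acf[lag] >= acf[lag + 1]):
            return lag
    return None


# Synthetic midpoint signal over a 1001 nt window (200 nt upstream plus
# 800 nt downstream of the + 1 position) with an illustrative 165 nt period.
signal = [1.0 + math.cos(2 * math.pi * i / 165) for i in range(1001)]
acf = autocorrelation(signal, max_lag=400)
nrl = repeat_length(moving_average(acf, half_width=5))
```

On real data the aggregated midpoint profile would replace the synthetic cosine, and the returned lag would be read as the estimated repeat length in nucleotides.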
Statistical analyses
Significance analyses of TSS shifts at individual promoters were done in R (3.5.1). All other statistical analyses were performed in GraphPad Prism 8.4.2, where p values for the statistical tests employed on large datasets are approximate.
Availability of data and materials
References
Vo Ngoc L, Wang YL, Kassavetis GA, Kadonaga JT. The punctilious RNA polymerase II core promoter. Genes Dev. 2017;31:1289–301.
Kadonaga JT. Perspectives on the RNA polymerase II core promoter. Wiley Interdiscip Rev Dev Biol. 2012;1:40–51.
Juven-Gershon T, Kadonaga JT. Regulation of gene expression via the core promoter and the basal transcriptional machinery. Dev Biol. 2010;339:225–9.
Juven-Gershon T, Hsu JY, Kadonaga JT. Perspectives on the RNA polymerase II core promoter. Biochem Soc Trans. 2006;34:1047–50.
Smale ST, Kadonaga JT. The RNA polymerase II core promoter. Annu Rev Biochem. 2003;72:449–79.
Butler JE, Kadonaga JT. The RNA polymerase II core promoter: a key component in the regulation of gene expression. Genes Dev. 2002;16:2583–92.
Danino YM, Even D, Ideses D, Juven-Gershon T. The core promoter: at the heart of gene expression. Biochim Biophys Acta. 2015;1849:1116–31.
Juven-Gershon T, Hsu JY, Theisen JW, Kadonaga JT. The RNA polymerase II core promoter - the gateway to transcription. Curr Opin Cell Biol. 2008;20:253–9.
Haberle V, Stark A. Eukaryotic core promoters and the functional basis of transcription initiation. Nat Rev Mol Cell Biol. 2018;19:621–37.
Roy AL, Singer DS. Core promoters in transcription: old problem, new insights. Trends Biochem Sci. 2015;40:165–71.
Sainsbury S, Bernecky C, Cramer P. Structural basis of transcription initiation by RNA polymerase II. Nat Rev Mol Cell Biol. 2015;16:129–43.
Patel AB, Greber BJ, Nogales E. Recent insights into the structure of TFIID, its assembly, and its binding to core promoter. Curr Opin Struct Biol. 2019;61:17–24.
Nogales E, Patel AB, Louder RK. Towards a mechanistic understanding of core promoter recognition from cryo-EM studies of human TFIID. Curr Opin Struct Biol. 2017;47:60–6.
Rhee HS, Pugh BF. Genome-wide structure and organization of eukaryotic pre-initiation complexes. Nature. 2012;483:295–301.
Weiner A, Hughes A, Yassour M, Rando OJ, Friedman N. High-resolution nucleosome mapping reveals transcription-dependent promoter packaging. Genome Res. 2010;20:90–100.
Tirosh I, Barkai N, Verstrepen KJ. Promoter architecture and the evolvability of gene expression. J Biol. 2009;8:95.
Jiang C, Pugh BF. A compiled and systematic reference map of nucleosome positions across the Saccharomyces cerevisiae genome. Genome Biol. 2009;10:R109.
Tirosh I, Barkai N. Two strategies for gene regulation by promoter nucleosomes. Genome Res. 2008;18:1084–91.
Mavrich TN, Jiang C, Ioshikhes IP, Li X, Venters BJ, Zanton SJ, Tomsho LP, Qi J, Glaser RL, Schuster SC, et al. Nucleosome organization in the Drosophila genome. Nature. 2008;453:358–62.
Xu Z, Wei W, Gagneur J, Perocchi F, Clauder-Munster S, Camblong J, Guffanti E, Stutz F, Huber W, Steinmetz LM. Bidirectional promoters generate pervasive transcription in yeast. Nature. 2009;457:1033–7.
Neil H, Malabat C, d’Aubenton-Carafa Y, Xu Z, Steinmetz LM, Jacquier A. Widespread bidirectional promoters are the major source of cryptic transcripts in yeast. Nature. 2009;457:1038–42.
Preker P, Nielsen J, Kammler S, Lykke-Andersen S, Christensen MS, Mapendano CK, Schierup MH, Jensen TH. RNA exosome depletion reveals transcription upstream of active human promoters. Science. 2008;322:1851–4.
Core LJ, Lis JT. Transcription regulation through promoter-proximal pausing of RNA polymerase II. Science. 2008;319:1791–2.
Seila AC, Calabrese JM, Levine SS, Yeo GW, Rahl PB, Flynn RA, Young RA, Sharp PA. Divergent transcription from active promoters. Science. 2008;322:1849–51.
Jin Y, Eser U, Struhl K, Churchman LS. The ground state and evolution of promoter region directionality. Cell. 2017;170:889–98 e810.
Andersson R, Chen Y, Core L, Lis JT, Sandelin A, Jensen TH. Human gene promoters are intrinsically bidirectional. Mol Cell. 2015;60:346–7.
Duttke SH, Lacadie SA, Ibrahim MM, Glass CK, Corcoran DL, Benner C, Heinz S, Kadonaga JT, Ohler U. Human promoters are intrinsically directional. Mol Cell. 2015;57:674–84.
Duttke SH, Lacadie SA, Ibrahim MM, Glass CK, Corcoran DL, Benner C, Heinz S, Kadonaga JT, Ohler U. Perspectives on unidirectional versus divergent transcription. Mol Cell. 2015;60:348–9.
Lee W, Tillo D, Bray N, Morse RH, Davis RW, Hughes TR, Nislow C. A high-resolution atlas of nucleosome occupancy in yeast. Nat Genet. 2007;39:1235–44.
Lu Z, Lin Z. Pervasive and dynamic transcription initiation in Saccharomyces cerevisiae. Genome Res. 2019;29:1198–210.
Donczew R, Warfield L, Pacheco D, Erijman A, Hahn S. Two roles for the yeast transcription coactivator SAGA and a set of genes redundantly regulated by TFIID and SAGA. eLife. 2020;9:e50109. https://doi.org/10.7554/eLife.50109.
Huisinga KL, Pugh BF. A genome-wide housekeeping role for TFIID and a highly regulated stress-related role for SAGA in Saccharomyces cerevisiae. Mol Cell. 2004;13:573–85.
Basehoar AD, Zanton SJ, Pugh BF. Identification and distinct regulation of yeast TATA box-containing genes. Cell. 2004;116:699–709.
Kuras L, Kosa P, Mencia M, Struhl K. TAF-containing and TAF-independent forms of transcriptionally active TBP in vivo. Science. 2000;288:1244–8.
Li XY, Bhaumik SR, Green MR. Distinct classes of yeast promoters revealed by differential TAF recruitment. Science. 2000;288:1242–4.
Wu R, Li H. Positioned and G/C-capped poly(dA:dT) tracts associate with the centers of nucleosome-free regions in yeast promoters. Genome Res. 2010;20:473–84.
Seizl M, Hartmann H, Hoeg F, Kurth F, Martin DE, Soding J, Cramer P. A conserved GA element in TATA-less RNA polymerase II promoters. PLoS One. 2011;6:e27595.
Vo Ngoc L, Kassavetis GA, Kadonaga JT. The RNA polymerase II Core promoter in Drosophila. Genetics. 2019;212:13–24.
Warfield L, Ramachandran S, Baptista T, Devys D, Tora L, Hahn S. Transcription of nearly all yeast RNA polymerase II-transcribed genes is dependent on transcription factor TFIID. Mol Cell. 2017;68:118–29 e115.
Baptista T, Grunberg S, Minoungou N, Koster MJE, Timmers HTM, Hahn S, Devys D, Tora L. SAGA is a general cofactor for RNA polymerase II transcription. Mol Cell. 2017;68:130–43 e135.
Hampsey M. The Pol II initiation complex: finding a place to start. Nat Struct Mol Biol. 2006;13:564–6.
Corden JL. Yeast Pol II start-site selection: the long and the short of it. EMBO Rep. 2008;9:1084–6.
Zhang Z, Dietrich FS. Mapping of transcription start sites in Saccharomyces cerevisiae using 5′ SAGE. Nucleic Acids Res. 2005;33:2838–51.
Park D, Morris AR, Battenhouse A, Iyer VR. Simultaneous mapping of transcript ends at single-nucleotide resolution and identification of widespread promoter-associated non-coding RNA governed by TATA elements. Nucleic Acids Res. 2014;42:3736–49.
Pelechano V, Wei W, Steinmetz LM. Extensive transcriptional heterogeneity revealed by isoform profiling. Nature. 2013;497:127–31.
Chen RA, Down TA, Stempor P, Chen QB, Egelhofer TA, Hillier LW, Jeffers TE, Ahringer J. The landscape of RNA polymerase II transcription initiation in C. elegans reveals promoter and enhancer architectures. Genome Res. 2013;23:1339–47.
Yamashita R, Sathira NP, Kanai A, Tanimoto K, Arauchi T, Tanaka Y, Hashimoto S, Sugano S, Nakai K, Suzuki Y. Genome-wide characterization of transcriptional start sites in humans by integrative transcriptome analysis. Genome Res. 2011;21:775–89.
Hoskins RA, Landolin JM, Brown JB, Sandler JE, Takahashi H, Lassmann T, Yu C, Booth BW, Zhang D, Wan KH, et al. Genome-wide analysis of promoter architecture in Drosophila melanogaster. Genome Res. 2011;21:182–92.
FANTOM Consortium and the RIKEN PMI and CLST (DGT), Forrest AR, Kawaji H, Rehli M, Baillie JK, de Hoon MJ, Haberle V, Lassmann T, et al. A promoter-level mammalian expression atlas. Nature. 2014;507:462–70.
Nepal C, Hadzhiev Y, Previti C, Haberle V, Li N, Takahashi H, Suzuki AM, Sheng Y, Abdelhamid RF, Anand S, et al. Dynamic regulation of the transcription initiation landscape at single nucleotide resolution during vertebrate embryogenesis. Genome Res. 2013;23:1938–50.
Gleghorn ML, Davydova EK, Basu R, Rothman-Denes LB, Murakami KS. X-ray crystal structures elucidate the nucleotidyl transfer reaction of transcript initiation using two nucleotides. Proc Natl Acad Sci U S A. 2011;108:3566–71.
Smale ST, Baltimore D. The “initiator” as a transcription control element. Cell. 1989;57:103–13.
Vo Ngoc L, Cassidy CJ, Huang CY, Duttke SH, Kadonaga JT. The human initiator is a distinct and abundant element that is precisely positioned in focused core promoters. Genes Dev. 2017;31:6–11.
Sainsbury S, Niesser J, Cramer P. Structure and function of the initially transcribing RNA polymerase II-TFIIB complex. Nature. 2013;493:437–40.
Struhl K. Promoters, activator proteins, and the mechanism of transcriptional initiation in yeast. Cell. 1987;49:295–7.
Lu Z, Lin Z. The origin and evolution of a distinct mechanism of transcription initiation in yeasts. bioRxiv 2020.04.04.025502. https://doi.org/10.1101/2020.04.04.025502.
Breathnach R, Chambon P. Organization and expression of eucaryotic split genes coding for proteins. Annu Rev Biochem. 1981;50:349–83.
Giardina C, Lis JT. DNA melting on yeast RNA polymerase II promoters. Science. 1993;261:759–62.
Yang C, Ponticelli AS. Evidence that RNA polymerase II and not TFIIB is responsible for the difference in transcription initiation patterns between Saccharomyces cerevisiae and Schizosaccharomyces pombe. Nucleic Acids Res. 2012;40:6495–507.
Goel S, Krishnamurthy S, Hampsey M. Mechanism of start site selection by RNA polymerase II: interplay between TFIIB and Ssl2/XPB helicase subunit of TFIIH. J Biol Chem. 2012;287:557–67.
Khaperskyy DA, Ammerman ML, Majovski RC, Ponticelli AS. Functions of Saccharomyces cerevisiae TFIIF during transcription start site utilization. Mol Cell Biol. 2008;28:3757–66.
Kuehner JN, Brow DA. Quantitative analysis of in vivo initiator selection by yeast RNA polymerase II supports a scanning model. J Biol Chem. 2006;281:14119–28.
Pal M, Ponticelli AS, Luse DS. The role of the transcription bubble and TFIIB in promoter clearance by RNA polymerase II. Mol Cell. 2005;19:101–10.
Majovski RC, Khaperskyy DA, Ghazy MA, Ponticelli AS. A functional role for the switch 2 region of yeast RNA polymerase II in transcription start site utilization and abortive initiation. J Biol Chem. 2005;280:34917–23.
Freire-Picos MA, Krishnamurthy S, Sun ZW, Hampsey M. Evidence that the Tfg1/Tfg2 dimer interface of TFIIF lies near the active center of the RNA polymerase II initiation complex. Nucleic Acids Res. 2005;33:5045–52.
Ghazy MA, Brodie SA, Ammerman ML, Ziegler LM, Ponticelli AS. Amino acid substitutions in yeast TFIIF confer upstream shifts in transcription initiation and altered interaction with RNA polymerase II. Mol Cell Biol. 2004;24:10975–85.
Chen BS, Hampsey M. Functional interaction between TFIIB and the Rpb2 subunit of RNA polymerase II: implications for the mechanism of transcription initiation. Mol Cell Biol. 2004;24:3983–91.
Faitar SL, Brodie SA, Ponticelli AS. Promoter-specific shifts in transcription initiation conferred by yeast TFIIB mutations are determined by the sequence in the immediate vicinity of the start sites. Mol Cell Biol. 2001;21:4427–40.
Pappas DL Jr, Hampsey M. Functional interaction between Ssu72 and the Rpb2 subunit of RNA polymerase II in Saccharomyces cerevisiae. Mol Cell Biol. 2000;20:8343–51.
Wu WH, Pinto I, Chen BS, Hampsey M. Mutational analysis of yeast TFIIB. A functional relationship between Ssu72 and Sub1/Tsp1 defined by allele-specific interactions with TFIIB. Genetics. 1999;153:643–52.
Bangur CS, Faitar SL, Folster JP, Ponticelli AS. An interaction between the N-terminal region and the core domain of yeast TFIIB promotes the formation of TATA-binding protein-TFIIB-DNA complexes. J Biol Chem. 1999;274:23203–9.
Pardee TS, Bangur CS, Ponticelli AS. The N-terminal region of yeast TFIIB contains two adjacent functional domains involved in stable RNA polymerase II binding and transcription start site selection. J Biol Chem. 1998;273:17859–64.
Sun ZW, Tessmer A, Hampsey M. Functional interaction between TFIIB and the Rpb9 (Ssu73) subunit of RNA polymerase II in Saccharomyces cerevisiae. Nucleic Acids Res. 1996;24:2560–6.
Sun ZW, Hampsey M. Identification of the gene (SSU71/TFG1) encoding the largest subunit of transcription factor TFIIF as a suppressor of a TFIIB mutation in Saccharomyces cerevisiae. Proc Natl Acad Sci U S A. 1995;92:3127–31.
Pinto I, Wu WH, Na JG, Hampsey M. Characterization of sua7 mutations defines a domain of TFIIB involved in transcription start site selection in yeast. J Biol Chem. 1994;269:30569–73.
Berroteran RW, Ware DE, Hampsey M. The sua8 suppressors of Saccharomyces cerevisiae encode replacements of conserved residues within the largest subunit of RNA polymerase II and affect transcription start site selection similarly to sua7 (TFIIB) mutations. Mol Cell Biol. 1994;14:226–37.
Pinto I, Ware DE, Hampsey M. The yeast SUA7 gene encodes a homolog of human transcription factor TFIIB and is required for normal start site selection in vivo. Cell. 1992;68:977–88.
Hampsey M, Na JG, Pinto I, Ware DE, Berroteran RW. Extragenic suppressors of a translation initiation defect in the cyc1 gene of Saccharomyces cerevisiae. Biochimie. 1991;73:1445–55.
Knaus R, Pollock R, Guarente L. Yeast SUB1 is a suppressor of TFIIB mutations and has homology to the human co-activator PC4. EMBO J. 1996;15:1933–40.
Jin H, Kaplan CD. Relationships of RNA polymerase II genetic interactors to transcription start site usage defects and growth in Saccharomyces cerevisiae. G3 (Bethesda). 2014;5:21–33.
Braberg H, Jin H, Moehle EA, Chan YA, Wang S, Shales M, Benschop JJ, Morris JH, Qiu C, Hu F, et al. From structure to systems: high-resolution, quantitative genetic analysis of RNA polymerase II. Cell. 2013;154:775–88.
Kaplan CD, Jin H, Zhang IL, Belyanin A. Dissection of Pol II trigger loop function and Pol II activity-dependent control of start site selection in vivo. PLoS Genet. 2012;8:e1002627.
Eichner J, Chen HT, Warfield L, Hahn S. Position of the general transcription factor TFIIF within the RNA polymerase II transcription preinitiation complex. EMBO J. 2010;29:706–16.
Kaplan CD, Larsson KM, Kornberg RD. The RNA polymerase II trigger loop functions in substrate selection and is directly targeted by alpha-amanitin. Mol Cell. 2008;30:547–56.
Kireeva ML, Nedialkov YA, Cremona GH, Purtov YA, Lubkowska L, Malagon F, Burton ZF, Strathern JN, Kashlev M. Transient reversal of RNA polymerase II active site closing controls fidelity of transcription elongation. Mol Cell. 2008;30:557–66.
Malagon F, Kireeva ML, Shafer BK, Lubkowska L, Kashlev M, Strathern JN. Mutations in the Saccharomyces cerevisiae RPB1 gene conferring hypersensitivity to 6-azauracil. Genetics. 2006;172:2201–9.
Qiu C, Erinne OC, Dave JM, Cui P, Jin H, Muthukrishnan N, Tang LK, Babu SG, Lam KC, Vandeventer PJ, et al. High-resolution phenotypic landscape of the RNA polymerase II trigger loop. PLoS Genet. 2016;12:e1006321.
Xu C, Park JK, Zhang J. Evidence that alternative transcriptional initiation is largely nonadaptive. PLoS Biol. 2019;17:e3000197.
Borlin CS, Cvetesic N, Holland P, Bergenholm D, Siewers V, Lenhard B, Nielsen J. Saccharomyces cerevisiae displays a stable transcription start site landscape in multiple conditions. FEMS Yeast Res. 2019;19(2):128. https://doi.org/10.1093/femsyr/foy128.
Lubliner S, Keren L, Segal E. Sequence features of yeast and human core promoters that are predictive of maximal promoter activity. Nucleic Acids Res. 2013;41:5569–81.
Lubliner S, Regev I, Lotan-Pompan M, Edelheit S, Weinberger A, Segal E. Core promoter sequence in yeast is a major determinant of expression level. Genome Res. 2015;25:1008–17.
Kamenova I, Warfield L, Hahn S. Mutations on the DNA binding surface of TBP discriminate between yeast TATA and TATA-less gene transcription. Mol Cell Biol. 2014;34:2929–43.
Donczew R, Hahn S. Mechanistic differences in transcription initiation at TATA-less and TATA-containing promoters. Mol Cell Biol. 2017;38(1):e00448-17. https://doi.org/10.1128/MCB.00448-17.
Lorch Y, Maier-Davis B, Kornberg RD. Role of DNA sequence in chromatin remodeling and the formation of nucleosome-free regions. Genes Dev. 2014;28:2492–7.
Krietenstein N, Wal M, Watanabe S, Park B, Peterson CL, Pugh BF, Korber P. Genomic nucleosome organization reconstituted with pure proteins. Cell. 2016;167:709–21 e712.
Rossi MJ, Lai WKM, Pugh BF. Simplified ChIP-exo assays. Nat Commun. 2018;9:2842.
Rossi MJ, Lai WKM, Pugh BF. Genome-wide determinants of sequence-specific DNA binding of general regulatory factors. Genome Res. 2018;28:497–508.
Fazal FM, Meng CA, Murakami K, Kornberg RD, Block SM. Real-time observation of the initiation of RNA polymerase II transcription. Nature. 2015;525:274–7.
Tomko EJ, Fishburn J, Hahn S, Galburt EA. TFIIH generates a six-base-pair open complex during RNAP II transcription initiation and start-site scanning. Nat Struct Mol Biol. 2017;24:1139–45.
Bhuiyan T, Timmers HTM. Promoter recognition: putting TFIID on the spot. Trends Cell Biol. 2019;29(9):752–73.
Tramantano M, Sun L, Au C, Labuz D, Liu Z, Chou M, Shen C, Luk E. Constitutive turnover of histone H2A.Z at yeast promoters requires the preinitiation complex. eLife. 2016;5:e14243. https://doi.org/10.7554/eLife.14243.
Zhou X, Blocker AW, Airoldi EM, O'Shea EK. A computational approach to map nucleosome positions and alternative chromatin states with base pair resolution. eLife. 2016;5:e16970. https://doi.org/10.7554/eLife.16970.
Struhl K. Molecular mechanisms of transcriptional regulation in yeast. Annu Rev Biochem. 1989;58:1051–77.
Li H, Hou J, Bai L, Hu C, Tong P, Kang Y, Zhao X, Shao Z. Genome-wide analysis of core promoter structures in Schizosaccharomyces pombe with DeepCAGE. RNA Biol. 2015;12:525–37.
Kaplan CD. Basic mechanisms of RNA polymerase II activity and alteration of gene expression in Saccharomyces cerevisiae. Biochim Biophys Acta. 2013;1829:39–54.
Fishburn J, Galburt E, Hahn S. Transcription start site scanning and the requirement for ATP during transcription initiation by RNA polymerase II. J Biol Chem. 2016;291:13040–7.
Haberle V, Li N, Hadzhiev Y, Plessy C, Previti C, Nepal C, Gehrig J, Dong X, Akalin A, Suzuki AM, et al. Two independent transcription initiation codes overlap on vertebrate core promoters. Nature. 2014;507:381–5.
Lai WK, Pugh BF. Genome-wide uniformity of human ‘open’ pre-initiation complexes. Genome Res. 2017;27:15–26.
Shao W, Zeitlinger J. Paused RNA polymerase II inhibits new transcriptional initiation. Nat Genet. 2017;49:1045–51.
Scruggs BS, Gilchrist DA, Nechaev S, Muse GW, Burkholder A, Fargo DC, Adelman K. Bidirectional transcription arises from two distinct hubs of transcription factor binding and active chromatin. Mol Cell. 2015;58:1101–12.
Kaplan CD. Pairs of promoter pairs in a web of transcription. Nat Genet. 2016;48:975–6.
Core LJ, Waterfall JJ, Lis JT. Nascent RNA sequencing reveals widespread pausing and divergent initiation at human promoters. Science. 2008;322:1845–8.
Murakami K, Mattei PJ, Davis RE, Jin H, Kaplan CD, Kornberg RD. Uncoupling promoter opening from start-site scanning. Mol Cell. 2015;59:133–8.
Singh A, Compe E, Le May N, Egly JM. TFIIH subunit alterations causing xeroderma pigmentosum and trichothiodystrophy specifically disturb several steps during transcription. Am J Hum Genet. 2015;96:194–207.
Egly JM, Coin F. A history of TFIIH: two decades of molecular biology on a pivotal transcription/repair factor. DNA Repair (Amst). 2011;10:714–21.
Compe E, Egly JM. TFIIH: when transcription met DNA repair. Nat Rev Mol Cell Biol. 2012;13:343–54.
Clapier CR, Iwasa J, Cairns BR, Peterson CL. Mechanisms of action and regulation of ATP-dependent chromatin-remodelling complexes. Nat Rev Mol Cell Biol. 2017;18:407–22.
Puig O, Caspary F, Rigaut G, Rutz B, Bouveret E, Bragado-Nilsson E, Wilm M, Seraphin B. The tandem affinity purification (TAP) method: a general procedure of protein complex purification. Methods. 2001;24:218–29.
Ghaemmaghami S, Huh WK, Bower K, Howson RW, Belle A, Dephoure N, O'Shea EK, Weissman JS. Global analysis of protein expression in yeast. Nature. 2003;425:737–41.
Winston F, Dollard C, Ricupero-Hovasse SL. Construction of a set of convenient Saccharomyces cerevisiae strains that are isogenic to S288C. Yeast. 1995;11:53–5.
Boeke JD, Trueheart J, Natsoulis G, Fink GR. 5-Fluoroorotic acid as a selective agent in yeast molecular genetics. Methods Enzymol. 1987;154:164–75.
Xia Y, Chu W, Qi Q, Xun L. New insights into the QuikChange process guide the use of Phusion DNA polymerase for site-directed mutagenesis. Nucleic Acids Res. 2015;43:e12.
Schmitt ME, Brown TA, Trumpower BL. A rapid and simple method for preparation of RNA from Saccharomyces cerevisiae. Nucleic Acids Res. 1990;18:3091–2.
Goldman SR, Sharp JS, Vvedenskaya IO, Livny J, Dove SL, Nickels BE. NanoRNAs prime transcription initiation in vivo. Mol Cell. 2011;42:817–25.
Vvedenskaya IO, Goldman SR, Nickels BE. Preparation of cDNA libraries for high-throughput RNA sequencing analysis of RNA 5′ ends. Methods Mol Biol. 2015;1276:211–28.
Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10:R25.
Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–2.
Saldanha AJ. Java Treeview--extensible visualization of microarray data. Bioinformatics. 2004;20:3246–8.
de Hoon MJ, Imoto S, Nolan J, Miyano S. Open source clustering software. Bioinformatics. 2004;20:1453–4.
Ramirez F, Dundar F, Diehl S, Gruning BA, Manke T. deepTools: a flexible platform for exploring deep-sequencing data. Nucleic Acids Res. 2014;42:W187–91.
Kawaji H, Frith MC, Katayama S, Sandelin A, Kai C, Kawai J, Carninci P, Hayashizaki Y. Dynamic usage of transcription start sites within core promoters. Genome Biol. 2006;7:R118.
Malik I, Qiu C, Snavely T, Kaplan CD. Wide-ranging and unexpected consequences of altered Pol II catalytic activity in vivo. Nucleic Acids Res. 2017;45:4431–51.
Kuras L, Struhl K. Binding of TBP to promoters in vivo is stimulated by activators and requires Pol II holoenzyme. Nature. 1999;399:609–13.
Li H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv. 2013;1303.3997.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25:2078–9.
van Bakel H, Tsui K, Gebbia M, Mnaimneh S, Hughes TR, Nislow C. A compendium of nucleosome and transcript profiles reveals determinants of chromatin architecture and transcription. PLoS Genet. 2013;9:e1003479.
Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9:357–9.
Abante J. NucSeq v1.0. Zenodo; 2016.
Qiu C, Jin H, Vvedenskaya I, Llenas JA, Zhao T, Malik I, Schwartz SL, Cui P, Čabart P, Han KH, et al. Datasets. BioProject. https://trace.ncbi.nlm.nih.gov/Traces/study/?acc=PRJNA522619. Sequence Read Archive; 2020.
Acknowledgements
Mahmoud Bassal and Kaplan lab members are acknowledged for discussions and comments on the manuscript. Yunye Zhu is acknowledged for the contribution to the design of the diagram for the shooting gallery model. Jonathan Dreyfuss is acknowledged for the Harvard Catalyst Biostatistical Consulting. Jie Wang is acknowledged for discussions with the University of Pittsburgh Statistics Consulting Center. We kindly acknowledge comments on the preprint of this work from Rafal Donczew and Steve Hahn.
Peer review information
Kevin Pang was the primary editor of this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.
Review history
The review history is available as Additional file 5.
Funding
Initial funding for this project was provided by grants from the National Institutes of Health R01GM097260 and Welch Foundation A-1763 to C.D.K. We acknowledge funding from NIH R01GM120450 to C.D.K. and R35GM118059 to B.E.N.
Author information
Authors and Affiliations
Contributions
C.Q. analyzed the data, made the figures, and contributed to writing the manuscript. H.J. initiated the project, generated strains, prepared the material for TSS-seq, generated material and libraries for MNase-seq, analyzed the data, piloted most informatics approaches, and generated the outline of the manuscript. I.V. generated libraries for TSS-seq. P.Č. generated strains for ChIP-exo analyses. J.A.L. collaborated with H.J. on nucleosome positioning analyses and generated scripts and code for the analyses. T.Z. provided informatics analysis of TSS-seq data. I.M. constructed strains and performed Northern blotting for promoter variant studies. S.S. initiated informatics analyses for TSS-seq in yeast. P.C. constructed strains and performed Northern blotting for promoter variant studies. K.H.H. and W.K.M.L. prepared ChIP-exo samples for sequencing. W.K.M.L. processed ChIP-exo sequencing reads. A.M.V. performed initial ChIP-exo v5.0 analysis and piloted figures. R.P.M. and C.D.J. consulted on Illumina sequencing strategies and library preparation. S-H.Z. implemented MNase analyses as described in [102]. B.F.P. consulted on ChIP-exo and enabled sequencing of ChIP-exo samples. B.E.N. provided funding and consulted on the development of TSS-seq for yeast Pol II RNAs. C.D.K. conceived the project, guided analyses, made figures, provided funding, and wrote the manuscript. All authors read and approved the final manuscript.
Authors’ information
Twitter handles: @IndranilMalik (Indranil Malik); @BioMath (Charles D. Johnson); @ThePughLab (B. Franklin Pugh); @TriggerLoop (Craig D. Kaplan).
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
B.F.P. has a financial interest in Peconic, LLC, which utilizes the ChIP-exo technology implemented in this study and could potentially benefit from the outcomes of this research. All other authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Additional file 1.
Supplemental Fig. S1-S10 and legends.
Additional file 2.
Oligonucleotides, yeast strains, and plasmids.
Additional file 3.
Mapping statistics for TSS-seq, MNase-seq, and ChIP-exo libraries.
Additional file 4.
Genomic positions and attributes of promoters analyzed.
Additional file 5.
Review history.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Qiu, C., Jin, H., Vvedenskaya, I. et al. Universal promoter scanning by Pol II during transcription initiation in Saccharomyces cerevisiae. Genome Biol 21, 132 (2020). https://doi.org/10.1186/s13059-020-02040-0
DOI: https://doi.org/10.1186/s13059-020-02040-0