Biological processes are often executed by a small number of molecules per individual cell, leading to significant cell-to-cell variability (“noise”) in gene expression (Paulsson 2004; Maheshri and O'Shea 2007; Raj and van Oudenaarden 2008; Tawfik 2010). When analyzing gene expression noise, it is convenient to distinguish between intrinsic variations, resulting from stochastic production, and extrinsic variations propagating from global (e.g., ribosomes, polymerases, metabolites) or pathway-specific factors (Elowitz et al. 2002). Intrinsic noise is of particular interest, as it reflects the transcription process itself (Paulsson 2004, 2005; Raser and O'Shea 2004; Raj and van Oudenaarden 2008; Rinott et al. 2011; So et al. 2011). The prevailing model of gene expression noise assumes that genes transit stochastically between states that are permissive or nonpermissive to transcription (Paulsson 2004, 2005; Raser and O'Shea 2004; Friedman et al. 2006; Raj et al. 2006; Zenklusen et al. 2008; So et al. 2011). This two-state model predicts a scaling relationship between mean expression $m$ and the coefficient of variation (noise, SD/mean) $\eta$: $\eta^2 = b/m + \eta_{\mathrm{ext}}^2$, where $b$ is the typical number of protein molecules made during a single “on” state (“burst size”) and $\eta_{\mathrm{ext}}$ denotes the extrinsic noise (Paulsson 2004, 2005; Raser and O'Shea 2004; Bar-Even et al. 2006; Raj et al. 2006; Pedraza and Paulsson 2008; Tan and van Oudenaarden 2010; Taniguchi et al. 2010). Note that burst size, $b$, accounts for all the transcription-translation processes following the main stochastic event (burst initiation), integrating the number of mRNA molecules produced per burst and the number of protein molecules made per mRNA molecule. Upon a perturbation, the noise–mean relationship may change, depending on whether burst size or burst frequency was modulated (Pedraza and Paulsson 2008; Zenklusen et al. 2008; Tan and van Oudenaarden 2010).
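To make the last point concrete, the scaling relation can be evaluated numerically. The sketch below is illustrative only: the function name `noise_squared` and all parameter values are hypothetical, and it assumes that mean expression is proportional to burst size times burst frequency, so that doubling either one doubles the mean.

```python
import math


def noise_squared(mean, burst_size, ext_noise):
    """Squared noise (CV^2) from the scaling relation eta^2 = b/m + eta_ext^2."""
    if mean <= 0:
        raise ValueError("mean expression must be positive")
    return burst_size / mean + ext_noise ** 2


# Hypothetical values, chosen only for illustration.
b = 4.0        # burst size (proteins per burst)
eta_ext = 0.1  # extrinsic noise
m = 100.0      # mean expression

baseline = noise_squared(m, b, eta_ext)

# Assumption: mean is proportional to burst size times burst frequency.
# Doubling burst frequency doubles the mean while b stays fixed,
# so the intrinsic term b/m is halved.
freq_mod = noise_squared(2 * m, b, eta_ext)

# Doubling burst size doubles both b and the mean,
# so the intrinsic term b/m is unchanged.
size_mod = noise_squared(2 * m, 2 * b, eta_ext)

print(f"baseline eta^2:       {baseline:.3f}")
print(f"frequency-modulated:  {freq_mod:.3f}")
print(f"burst-size-modulated: {size_mod:.3f}")
```

Under these assumptions, a perturbation that acts through burst frequency moves a gene along a curve of constant $b$, whereas one that acts through burst size shifts it onto a different curve.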
Genome-wide analysis of the noise–mean relationship in yeast (Bar-Even et al. 2006; Newman et al. 2006) or Escherichia coli (Taniguchi et al. 2010; So et al. 2011) genes reported a general dependency that was well described by the scaling relation $\eta^2 = b/m + \eta_{\mathrm{ext}}^2$, suggesting a similar burst size for many genes. The expression of genes deviating from the scaling curve, displaying higher-than-expected noise (Bar-Even et al. 2006; Newman et al. 2006), was more responsive to changing conditions and also diverged more between related species (Tirosh and Barkai 2008; Choi and Kim 2009; Lehner 2010). Notably, high noise, responsiveness, and divergence were all correlated with the organization of gene promoters: All three measures were low in promoters lacking a TATA box and containing a nucleosome-free region (NFR) proximal to the transcription start site (referred to as DPN promoters, for depleted proximal nucleosome), and were high in TATA-containing promoters that lack an NFR (OPN, occupied proximal nucleosome) (Field et al. 2008; Tirosh and Barkai 2008; Choi and Kim 2009). A TATA box was also shown to increase noise in Pho5 expression (Raser and O'Shea 2004) and in synthetic promoters (Blake et al. 2006; Murphy et al. 2010). The observation that genes with a characteristic promoter structure had high noise (relative to that expected given their mean expression) is consistent with the idea that promoter sequence influences not only burst frequency but also burst size. Still, the principles by which promoter sequences regulate those two processes are not understood, primarily because most studies analyzing the interplay between promoter sequence variations and gene expression consider mean expression only (e.g., Yun et al. 2012). To distinguish between the effects of promoter sequence on burst size and burst frequency, we generated large libraries of sequence-mutated promoters. Specifically, we chose 22 yeast promoters that span a range of expression and noise levels.
Using mutagenic PCR, we generated hundreds of sequence variants of each promoter. Each variant was fused to a fluorescent reporter, and the associated mean expression and noise (coefficient of variation) in a population of identical cells were measured. We found that sequence variants in each of the libraries defined a scaling curve $\eta^2 = b/m + \eta_{\mathrm{ext}}^2$, with a constant estimated burst size $b$, that was largely promoter-specific and was particularly large for OPN promoters containing a TATA box. A small fraction of sequence mutations leading to a large change in burst size was identified in the OPN-type promoters containing a TATA box. These changes were biased toward reducing burst size, and were almost fully explained by elimination of a TATA box or insertion of a new out-of-frame translation start site. Interestingly, mutations that deleted a TATA box in low-noise DPN-type promoters did not reduce burst size. Our results suggest that burst size is a promoter-specific property that is insensitive to most sequence mutations but is largely influenced by the interaction between TATA box and promoter nucleosomes.
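One simple way to estimate a burst size from a library of variants is to note that the scaling relation is linear in $1/m$: regressing $\eta^2$ on $1/m$ gives $b$ as the slope and $\eta_{\mathrm{ext}}^2$ as the intercept. The sketch below shows this with ordinary least squares. It is an illustration under stated assumptions, not the fitting procedure used in this study, which is not described here; the function name `fit_scaling_curve` and the synthetic data are hypothetical.

```python
import math


def fit_scaling_curve(means, cvs):
    """Ordinary least-squares fit of eta^2 = b * (1/m) + eta_ext^2.

    means: mean expression of each variant (positive numbers)
    cvs:   coefficient of variation (SD/mean) of each variant
    Returns (b, eta_ext).
    """
    if len(means) != len(cvs) or len(means) < 2:
        raise ValueError("need at least two (mean, CV) pairs of equal length")
    x = [1.0 / m for m in means]   # regressor: 1/m
    y = [cv ** 2 for cv in cvs]    # response: eta^2
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("means must not all be identical")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                      # slope = burst size
    intercept = y_bar - b * x_bar      # intercept = eta_ext^2
    # Clamp at zero so a slightly negative intercept from noisy data
    # does not break the square root.
    eta_ext = math.sqrt(max(intercept, 0.0))
    return b, eta_ext


# Synthetic, noise-free data generated from hypothetical parameters.
true_b, true_ext = 5.0, 0.15
means = [50.0, 100.0, 200.0, 400.0, 800.0]
cvs = [math.sqrt(true_b / m + true_ext ** 2) for m in means]

b_hat, ext_hat = fit_scaling_curve(means, cvs)
print(f"estimated b = {b_hat:.3f}, estimated eta_ext = {ext_hat:.3f}")
```

With real measurements, an unweighted fit of this kind gives the most leverage to low-expression variants, where $1/m$ is largest, so a weighted or log-space fit may be preferable.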