1. Composability of regulatory sequences controlling transcription and translation in Escherichia coli
- Author
- Daniel B. Goodman, George M. Church, Adam P. Arkin, Yuan Gao, Sriram Kosuri, Guillaume Cambray, Drew Endy, Vivek K. Mutalik, Independent researcher, University of Manchester [Manchester], Diversité, Génomes & Interactions Microorganismes - Insectes [Montpellier] (DGIMI), Institut National de la Recherche Agronomique (INRA)-Université Montpellier 2 - Sciences et Techniques (UM2)-Université de Montpellier (UM), St Vincents Hosp, Biofab3D, Melbourne, Vic, Australia, ON Semiconductor, Wyss Institute for Biologically Inspired Engineering [Harvard University], and Harvard University [Cambridge]
- Subjects
[SDV.BIO]Life Sciences [q-bio]/Biotechnology, Biodiversity and Ecology, [SDV]Life Sciences [q-bio], Messenger, Ribosome, Synthetic biology, chemistry.chemical_compound, population genetics, Transcription (biology), Models, Regulatory Elements, Transcriptional, Cloning, Molecular, Promoter Regions, Genetic, Genetics, 0303 health sciences, Multidisciplinary, Reverse Transcriptase Polymerase Chain Reaction, Systems Biology, 030302 biochemistry & molecular biology, Bacterial, High-Throughput Nucleotide Sequencing, Biological Sciences, Flow Cytometry, Regulatory sequence, Transcriptional, Genetic Engineering, Biotechnology, Bioengineering, Biology, next-generation sequencing, synthetic biology, systems biology, Promoter Regions, 03 medical and health sciences, Genetic, Escherichia coli, [SDV.BBM]Life Sciences [q-bio]/Biochemistry, Molecular Biology, RNA, Messenger, 030304 developmental biology, DNA Primers, Gene Library, Messenger RNA, Models, Genetic, Human Genome, RNA, Molecular, Promoter, Gene Expression Regulation, Bacterial, Regulatory Elements, population diversity, chemistry, Gene Expression Regulation, Ribosomes, DNA, Cloning
- Abstract
The inability to predict heterologous gene expression levels precisely hinders our ability to engineer biological systems. Using well-characterized regulatory elements offers a potential solution only if such elements behave predictably when combined. We synthesized 12,653 combinations of common promoters and ribosome binding sites and simultaneously measured DNA, RNA, and protein levels from the entire library. Using a simple model, we found that RNA and protein expression were within twofold of expected levels 80% and 64% of the time, respectively. The large dataset allowed quantitation of global effects, such as the effect of translation rate on mRNA stability and of mRNA secondary structure on translation rate. However, the worst 5% of constructs deviated from prediction by 13-fold on average, which could hinder large-scale genetic engineering projects. The ease and scale of this approach indicate that rather than relying on prediction or standardization, we can screen synthetic libraries for desired behavior.

next-generation sequencing | synthetic biology | systems biology

Organisms can be engineered to produce chemical, material, fuel, and medical products that are often superior to nonbiological alternatives (1). Biotechnologists have sought to discover, improve, and industrialize such products through the use of recombinant DNA technologies (2, 3). In recent years, these efforts have increased in complexity from expressing a few genes at once to optimizing multicomponent circuits and pathways (4-7). To attain desired systems-level function reliably, careful and time-consuming optimization of individual components is required (8-11).

To mitigate this slow trial-and-error optimization, two dominant approaches have taken hold. The first approach seeks to predict expression levels by elucidating the biophysical relationships between sequence and function.
For example, several groups have modified promoters (12, 13) and ribosome binding sites (RBSs) (14-16) to see how small sequence changes affect transcription or translation. Such studies are fundamentally challenging due to the vastness of sequence space. In addition, because these approaches mostly look at either transcription or translation individually, they are rarely able to investigate interactions between these processes.

The second approach uses combinations of individually characterized elements to attain desired expression without directly considering their DNA sequences (17-25). Current efforts have focused on approaches to limit the number of time-consuming steps required to characterize potential interactions and on identifying existing or engineered elements that act predictably when used in combination (26-28). However, these studies still suggest there are enough idiosyncratic interactions and context effects that it will be necessary to construct and measure many variants of a circuit to achieve desired function (29). For larger circuits, such approaches are necessarily limited in scope due to the difficulty in measuring large numbers of combinations (26, 27).

Here, we overcome previous limitations in generating and measuring large numbers of regulatory elements by combining recent advances in DNA synthesis with novel multiplexed methods for measuring DNA, RNA, and protein levels simultaneously using next-generation sequencing. We use the method to characterize all combinations of 114 promoters and 111 RBSs and quantify how often simple measures of promoter and RBS strengths can accurately predict gene expression when used in combination. In addition, because we measure both RNA and protein levels across the library, we can quantify how translation affects mRNA levels and how mRNA secondary structure affects translation efficiency.
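To make the composability question concrete, the following is a minimal Python sketch of one way such a check could be set up. It assumes a log-additive model in which each promoter and each RBS contributes an independent effect to log2 expression; this is an illustrative assumption and not necessarily the model used in the study. The function names `fit_log_additive` and `fraction_within_fold` and the toy strengths are hypothetical. Estimating effects from row and column means is exact only when every promoter-RBS combination has been measured.

```python
import math

def fit_log_additive(expr):
    """Fit log2(expr) ~ grand mean + promoter effect + RBS effect.

    expr maps (promoter, rbs) -> measured expression (must be > 0).
    ASSUMPTION: effects are additive in log space (multiplicative in
    linear space). Effects are estimated from group means, which is the
    least-squares solution only for a complete (balanced) library.
    """
    logs = {key: math.log2(value) for key, value in expr.items()}
    grand = sum(logs.values()) / len(logs)

    def effects(index):
        # Group log-expression by promoter (index 0) or RBS (index 1).
        groups = {}
        for key, value in logs.items():
            groups.setdefault(key[index], []).append(value)
        return {name: sum(vals) / len(vals) - grand
                for name, vals in groups.items()}

    return grand, effects(0), effects(1)

def fraction_within_fold(expr, fold=2.0):
    """Fraction of constructs whose measurement is within `fold` of the
    log-additive prediction."""
    grand, promoter_effect, rbs_effect = fit_log_additive(expr)
    limit = math.log2(fold)
    hits = 0
    for (promoter, rbs), value in expr.items():
        predicted = grand + promoter_effect[promoter] + rbs_effect[rbs]
        if abs(math.log2(value) - predicted) <= limit:
            hits += 1
    return hits / len(expr)

# Hypothetical toy library: expression is exactly promoter x RBS strength,
# so every construct should fall within twofold of its prediction.
promoter_strength = {"pA": 1.0, "pB": 8.0, "pC": 64.0}
rbs_strength = {"r1": 1.0, "r2": 4.0}
toy = {(p, r): ps * rs
       for p, ps in promoter_strength.items()
       for r, rs in rbs_strength.items()}
print(fraction_within_fold(toy))
```

On real data, the same function would report the share of constructs that behave as the simple model predicts, and the constructs falling outside the limit would be candidates for context effects.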
Finally, the size of the characterized library also provides a resource for researchers seeking to achieve particular expression levels. In lieu of using standardized elements or prediction-based design, library synthesis and screening allow precise tuning of expression in arbitrary contexts.

Results

Library Design, Construction, and Initial Characterization. To explore the effects of regulatory element composition systematically, we designed and synthesized all combinations of 114 promoters with 111 RBSs (12,653 constructs in total; one combination resulted in an incompatible restriction site). We used 90 promoters from an existing library from BIOFAB: International Open Facility Advancing Biotechnology, 17 promoters from the Anderson promoter library on the BioBricks registry, 6 promoters from common cloning vectors, and a spacer sequence chosen as a negative control. For RBSs, we used 55 RBSs from the BIOFAB library, 31 from the Anderson BioBrick library, 13 from the Salis RBS Calculator (14) expected to give a range of expression, 12 commonly used RBSs from cloning vectors and the BioBrick Registry, and one sequence chosen as a negative control (reverse complement of canonical RBS sequence). We synthesized the construct library using Agilent's oligo library synthesis (OLS) technology (30) and cloned at ∼50-fold coverage into a custom medium-copy vector (pGERC), where the constructs drive expression of superfolder GFP (31) (Fig. S1). pGERC also contains an mCherry (32) reporter under constant expression by
- Published
- 2013