Reporter genes for the analysis of gene expression

The expression pattern of a gene is central to its function. “Gene function” and “gene expression” are not synonyms; “gene function” is a broader term that could include an interpretation of mutant phenotypes, while “gene expression” typically refers to transcription and/or translation of the gene. Even “gene expression” includes several distinct, but related, concepts, since the time and location in which a gene is transcribed are not always the same as when and where its product is active. One of the most widely used methods to study the expression pattern of a gene—both its transcription and the location of its protein product—is to use a reporter gene.

The use of reporter genes as experimental tools is based directly on the modular structure of genes, as depicted in Figure 12.12. Note in this figure that the products encoded by the gene—its RNA and protein—function independently of the mechanism by which the expression pattern is controlled by the promoter and upstream regulatory region. To understand how a gene product functions in a cell or an organism, the coding portion of the gene is critical; to understand how the transcription is regulated, the upstream regulatory region is crucial. Reporter genes, or more formally reporter gene constructs made in vitro, separate the regulation of transcription from the function and are therefore used primarily to analyze the regulation of genes.

So how are reporter genes made in the laboratory? In overview, the DNA sequence of a gene of interest and the surrounding region of the genome, which is presumed or known to include the regulatory region, are isolated in vitro. This surrounding region is then retained, but the coding region of the gene is replaced by, or supplemented with, the coding region of an unrelated gene whose protein product can be readily visualized, as illustrated in Figure A; this unrelated coding region is the reporter. This reporter gene construct is reintroduced into the organism to make a transgenic organism, in which the expression of the easily visualized reporter is controlled by the regulatory region of the gene of interest. The expression pattern (that is, the time and location) of the easily visualized reporter protein in the transgenic organism thus reveals the specificity conferred by the regulatory region of the gene of interest.

Figure A  Making a reporter gene. A gene with its surrounding region, which includes the regulatory region, is isolated. The protein-coding region of a reporter gene, shown here in dark blue, is then attached to the gene in vitro, either replacing most of the coding region of the gene, above, or as a fusion to the end of the coding region of the gene, below. These are transcriptional and translational reporter constructs, respectively, which are then reintroduced into the cell or organism.

In this box, we briefly consider two main points: the easily visualized proteins (or rather, protein-coding regions) routinely used as reporters; and the difference between using the reporter gene to replace the coding region of the gene of interest and using it to supplement the coding region of the gene of interest.

A good reporter protein is easy to detect and quantify

In order to be seen in vivo, a reporter protein must be easy to visualize and to measure accurately. The expression of most common reporter proteins is determined using colorimetric assays, which can also be used quantitatively to compare changes in the level of expression. The reporter protein should not be quickly degraded in the organism or during extraction. The size of the reporter protein is also important: it should be large enough that it cannot diffuse out of the cell but not so big that it affects the activity of the protein being studied. Furthermore, the model species should not have an endogenous protein whose properties are similar to those of the reporter protein.

Although other reporter proteins have occasionally been employed, four reporters are commonly used. Examples depicting the use of each of these reporter genes are shown in Figure B. Green fluorescent protein (GFP) from the jellyfish Aequorea victoria and luciferase (LUC) from the firefly Photinus pyralis are used in both animals and plants; β-galactosidase, encoded by the lacZ gene of Escherichia coli, is widely used in animals, while β-glucuronidase (GUS), encoded by the Escherichia coli uidA gene, is widely used in flowering plants. All these reporters have been used in bacteria.  

Figure B  Examples of reporter genes. Four widely used reporter genes are shown. (i) GFP. (ii) Luciferase. (iii) β-glucuronidase or GUS. (iv) β-galactosidase (lacZ).

GFP has become the most widely used and versatile reporter, and the scientists who developed it for experimental use were recognized by a Nobel Prize in 2008. GFP has natural green fluorescence under blue light, and no additional substrate needs to be added for detection. This provides the important advantage that GFP can be assayed in living cells, unlike the other reporter proteins. Thus, GFP can be used for monitoring dynamic patterns of gene expression and subcellular localization, as well as for separating live cells based on expression patterns.

In general, detection of GFP is somewhat less sensitive than the other reporters, although enhanced GFP with increased fluorescence is commercially available. The GFP protein is small, and its spectral properties are easily modified in vitro. Thus, in addition to its natural green fluorescence, the GFP gene has been modified to emit yellow, cyan, or red fluorescence; these modified versions have been given the names YFP, CFP, and RFP. (The original RFP has been largely replaced by an analogous red reporter called mCherry, isolated and modified from the red coral Discosoma.) These color variants allow for the simultaneous labeling and tracing of multiple genes in the same cell or organism.

Reporters other than GFP usually cannot be used in living organisms, but even single fixed cells can be assayed quantitatively by colorimetric assays. LacZ and GUS reporters provide especially sensitive methods for studying expression patterns in fixed specimens. The lacZ gene will be discussed in detail in Chapter 14 and in Tool Box 15.2, but, for now, the important point is that it encodes an enzyme known as β-galactosidase or β-gal. For β-gal detection, the specimen is fixed, and the substrate 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (more commonly called X-gal) is added. β-galactosidase hydrolyzes X-gal to produce galactose and an indolyl product that dimerizes and oxidizes to the deep blue precipitate 5,5′-dibromo-4,4′-dichloro-indigo, so the presence of the blue color indicates the activity of β-gal. This is seen in Figure B (iv). Colorimetric substrates of β-gal other than X-gal are also commercially available, as discussed in Tool Box 15.2.

In addition to its use as a reporter gene in many animals and tissue culture cells, lacZ is also used as a reporter gene in some applications of gene cloning, as part of a process known as blue–white selection. During gene cloning, a gene or sequence of interest is inserted into a plasmid vector; the plasmid is ligated, and the new plasmid is transformed into bacterial cells. However, the process is quite inefficient, so most of the time the plasmid is ligated without an insert, and the gene or sequence of the insert is not actually cloned into the vector. In order to improve the process for detecting plasmids with an insert, a plasmid with the lacZ gene can be used as the vector. The cloning site for the insert is within the lacZ gene, so a successful insertion of foreign DNA (that is, the gene or sequence of interest) disrupts the lacZ gene. Thus, when the transformed bacteria are grown on medium containing X-gal, bacteria expressing the intact lacZ gene with no insert will be blue, while bacteria with the gene of interest inserted into the lacZ gene will be white. By screening for the white colonies in a background in which most colonies are blue, the investigator can more readily find the plasmid containing the gene of interest.
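The logic of blue–white selection can be sketched in a few lines of Python. This is only an illustrative model, not a laboratory protocol: the `Colony` class, the colony names, and the `has_insert` flag are hypothetical, and the sketch assumes that every colony carries the plasmid and that the plate contains X-gal.

```python
from dataclasses import dataclass


@dataclass
class Colony:
    """A hypothetical bacterial colony on an X-gal plate."""
    name: str
    has_insert: bool  # True if foreign DNA sits in the cloning site within lacZ


def colony_color(colony: Colony) -> str:
    """Return the expected colony color.

    An insert disrupts lacZ, so no functional beta-gal is made, X-gal is
    not hydrolyzed, and the colony stays white. Without an insert, lacZ
    is intact and the colony turns blue.
    """
    return "white" if colony.has_insert else "blue"


def pick_candidates(colonies: list[Colony]) -> list[str]:
    """Names of white colonies, i.e. those worth checking for the insert."""
    return [c.name for c in colonies if colony_color(c) == "white"]


# Toy plate: most plasmids religated without an insert (assumed data).
plate = [
    Colony("c1", has_insert=False),
    Colony("c2", has_insert=False),
    Colony("c3", has_insert=True),
    Colony("c4", has_insert=False),
]

candidates = pick_candidates(plate)
print(candidates)
```

In this toy plate only `c3` is white, so it is the only colony the investigator would pick for further analysis.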

GUS assays were developed for use in plants to avoid an endogenous enzyme activity that prevented the use of lacZ. They are done in the same manner as β-gal assays, with the substrate being either X-gluc (5-bromo-4-chloro-3-indolyl glucuronide), which yields a blue precipitate much like X-gal, or MUG (4-methylumbelliferyl-β-D-glucuronide), which yields a fluorescent product.

Luciferase was isolated from fireflies as the enzyme that makes fireflies luminesce. As observers who have seen fireflies (or lightning bugs) on a summer evening will realize, a small amount of light is released when luciferase acts on its substrate, luciferin. In luciferase assays, the cells are lysed and combined with luciferin immediately before quantification in a luminometer, which measures the amount of light emitted. Because the light released from the metabolism of one luciferin molecule is small and transient, luciferase is primarily used when rapid detection is important, often in cell cultures.

Transcriptional and translational reporter genes

As noted above, “gene expression” can refer to the pattern of transcription of a gene or to the location of a gene product, or both. While these two patterns are the same for many genes, there are enough exceptions (such as proteins secreted by a cell or tissue) that both transcription products and translation products need to be considered.

Reporter genes can be used to analyze either the transcription pattern or the translation pattern and subsequent protein localization associated with the expression of a given gene. In either case, the regulatory region of the gene of interest is retained. The difference between transcriptional and translational reporters lies in where the protein-coding region of the reporter gene is inserted and how much of the coding region of the gene of interest is retained; these are shown in Figure C. 

Figure C  Information provided by transcriptional and translational reporter genes. The properties of transcriptional and translational reporter genes are summarized. Transcriptional reporters, which do not require a functional gene product, are much easier to construct, while translational reporters offer more information.

In transcriptional reporters, the reporter coding region replaces most (or all) of the coding region of the gene of interest. Thus, no functional protein product of the gene of interest is made from the reporter transgene, and post-transcriptional or post-translational modifications (which depend on the sequence of the coding region) do not occur. The expression of the reporter protein displays the pattern of transcription for the gene.

In translational reporters, the reporter coding region is fused in frame to the coding region of the gene of interest to make a fusion gene, which encodes a fusion protein containing both the reporter and the protein of interest. In an ideal translational reporter, the protein of interest retains its function, as well as all of its post-translational modifications (its protein-trafficking pattern, for example). In this case, the expression of the reporter protein displays the location of the final protein product in the cell. This in-frame fusion between the coding region of the normal gene and the reporter gene makes translational reporters more informative but also more difficult to create than transcriptional reporters.
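The requirement that the fusion be in frame can be illustrated with a short Python sketch. It is a simplified model under stated assumptions: the sequences are invented toy strings rather than real gene or GFP sequences, the function names are hypothetical, introns and linker sequences are ignored, and the reporter is assumed to be fused to the end of the coding region of the gene of interest.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq: str) -> list[str]:
    """Split a DNA sequence into consecutive three-base codons."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def is_open_reading_frame(seq: str) -> bool:
    """True if seq starts with ATG, has a length divisible by three,
    and contains exactly one stop codon, at the very end."""
    if len(seq) % 3 != 0 or not seq.startswith("ATG"):
        return False
    triplets = codons(seq)
    if triplets[-1] not in STOP_CODONS:
        return False
    return not any(c in STOP_CODONS for c in triplets[:-1])


def translational_fusion(gene_cds: str, reporter_cds: str) -> str:
    """Fuse the reporter to the end of the gene's coding region.

    The gene's stop codon is removed so that translation continues
    into the reporter in the same reading frame.
    """
    if not (is_open_reading_frame(gene_cds) and is_open_reading_frame(reporter_cds)):
        raise ValueError("both inputs must be complete open reading frames")
    return gene_cds[:-3] + reporter_cds


# Invented toy coding regions (assumptions, not real sequences).
gene_cds = "ATGGCCAAATGA"      # ATG GCC AAA TGA
reporter_cds = "ATGGTGAGCTAA"  # ATG GTG AGC TAA

fusion = translational_fusion(gene_cds, reporter_cds)
print(fusion, is_open_reading_frame(fusion))

# Leaving one extra base at the junction shifts the reading frame.
frameshifted = gene_cds[:-2] + reporter_cds
print(is_open_reading_frame(frameshifted))
```

In a transcriptional reporter, by contrast, the reporter coding region simply takes the place of the gene's coding region, so no reading frame has to be preserved across a junction. This is one reason why such constructs are easier to make.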