Regulation of

Cellular function is influenced by cellular environment. Adaptation to specific environments is achieved by regulating the expression of genes that encode the enzymes and proteins needed for survival in a particular environment. Factors that influence gene expression include nutrients, temperature, light, toxins, metals, chemicals, and signals from other cells. Malfunctions in the regulation of gene expression can cause various human disorders and diseases.

Bacteria have a simple general mechanism for coordinating the regulation of genes that encode products involved in a set of related processes. The gene cluster and promoter, plus the additional sequences that function together in regulation, are called an operon.

The lactose operon of E. coli encodes the enzyme β-galactosidase, which hydrolyzes lactose into galactose and glucose.

The lac operon contains three cistrons, DNA segments that each encode a functional protein. The proteins encoded by cistrons may function alone or as subunits of larger enzymes or structural proteins.

The Z gene encodes β-galactosidase. The Y gene encodes a permease that facilitates the transport of lactose into the bacterium. The A gene encodes a thiogalactoside transacetylase whose function is not known. All three of these genes are transcribed as a single, polycistronic mRNA. Polycistronic RNA contains multiple genetic messages, each with its own translational initiation and termination signals.

The activity of the promoter that controls the expression of the lac operon is regulated by two different proteins. One of the proteins prevents RNA polymerase from transcribing (negative control); the other enhances the binding of RNA polymerase to the promoter (positive control).

The protein that inhibits transcription of the lac operon is the lac repressor, a tetramer of four identical subunits. The lac repressor is encoded by the lacI gene, which is located upstream of the lac operon and has its own promoter. Expression of the lacI gene is not regulated, and very low levels of the lac repressor are synthesized continuously. Genes whose expression is not regulated are called constitutive genes.

In the absence of lactose, the lac repressor blocks expression of the lac operon by binding to the DNA at a site called the operator, which lies downstream of the promoter and upstream of the transcriptional initiation site. The operator consists of a specific nucleotide sequence that is recognized by the repressor, which binds very tightly and physically blocks (strangles) the initiation of transcription.

The lac repressor has a high affinity for lactose (strictly, for allolactose, an isomer that the cell makes from lactose). When a small amount of lactose is present, the lac repressor binds it and dissociates from the operator, freeing the operon for gene expression. Substrates that cause repressors to dissociate from their operators are called inducers, and the genes that are regulated by such repressors are called inducible genes.

Although lactose can induce the expression of the lac operon, induction alone gives only a very low level of expression. The reason is that the lac operon is subject to catabolite repression, the reduced expression of genes brought on by growth in the presence of glucose. Glucose is very easily metabolized and is therefore preferred over lactose as a fuel, so it makes sense to prevent expression of the lac operon when glucose is present.

The strength of a promoter is determined by its ability to bind RNA polymerase and to form an open complex. The promoter for the lac operon is weak, and consequently the lac operon is poorly transcribed upon induction. There is a binding site upstream of the promoter for a protein called the catabolite activator protein (CAP). When CAP binds, it distorts the DNA so that RNA polymerase can bind more effectively, and transcription of the lac operon is greatly enhanced. To bind DNA, CAP must first bind cyclic AMP (cAMP), a second messenger synthesized from ATP by the enzyme adenylate cyclase.

In the presence of glucose, cAMP levels are very low, and consequently the initiation of transcription from the lac operon is very low. As glucose levels decrease, the concentration of cAMP increases, activating CAP, which in turn binds to the CAP site and stimulates transcription. The cAMP-CAP complex is called a positive regulator.
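
The combined effect of the two controls can be summarized as a small truth table. The sketch below is only a qualitative illustration: the function name and the three labels "off", "low" and "high" are assumptions introduced here, not measured expression levels.

```python
def lac_operon_expression(lactose_present: bool, glucose_present: bool) -> str:
    """Qualitative lac operon transcription level (illustrative labels only)."""
    # Negative control: without lactose the repressor stays on the operator.
    if not lactose_present:
        return "off"
    # Positive control: glucose keeps cAMP low, so CAP cannot bind and the
    # weak promoter is transcribed only poorly.
    if glucose_present:
        return "low"
    # Inducer present and cAMP-CAP bound at the CAP site: strong transcription.
    return "high"


# Print all four combinations of lactose and glucose.
for lactose in (False, True):
    for glucose in (False, True):
        level = lac_operon_expression(lactose, glucose)
        print(f"lactose={lactose!s:5} glucose={glucose!s:5} -> {level}")
```

Only the combination of lactose present and glucose absent gives the "high" label, which matches the description of CAP as a positive regulator acting on an induced operon.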

Arabinose is a five-carbon sugar that can serve as an energy and carbon source for E. coli. Arabinose must first be converted into xylulose-5-phosphate (by way of ribulose and ribulose-5-phosphate) before it can be metabolized. The arabinose operon has three genes, araB, araA and araD, that encode the three enzymes that carry out this conversion. A fourth gene, araC, which has its own promoter, encodes a regulatory factor called the C protein.

The regulatory sites of the ara operon include four sites that bind the C protein and one CAP binding site. The araO1 and araO2 sites are upstream of the promoter and CAP binding sites. The other two C protein binding sites, called araI1 and araI2, are located between the CAP binding site and the promoter.

In the absence of arabinose, dimers of the C protein bind to araO2, araO1 and araI1. The C proteins bound to araO2 and araI1 associate with one another, causing the DNA between them to form a loop that effectively blocks transcription of the operon.

The C protein binds arabinose and undergoes a conformational change that enables it to also bind the araO2 and araI2 sites. This results in the generation of a different DNA loop that is formed by the interaction of C proteins bound to the araO1 and araO2 sites.

The formation of this loop stimulates transcription of the araC gene, resulting in additional C protein synthesis; thus the C protein autoregulates its own synthesis. In the absence of glucose, cAMP-CAP is formed and binds to the CAP site. C protein bound at the araI1 and araI2 sites interacts with the bound CAP, enabling RNA polymerase to initiate transcription from the ara operon promoter.

E. coli can synthesize all 20 of the natural amino acids. Amino acid synthesis consumes a lot of energy, so to avoid waste the operons that encode the enzymes of amino acid synthesis are tightly regulated. The trp operon consists of five genes, trpE, trpD, trpC, trpB and trpA, that encode the enzymes required for the synthesis of tryptophan.

The trp operon is regulated by two mechanisms: negative control by a repressor that requires a corepressor, and attenuation. Most of the operons involved in amino acid synthesis are regulated by these two mechanisms.

The trp operon is negatively controlled by the trp repressor, a product of the trpR gene. The trp repressor binds to the operator and blocks transcription of the operon. However, to bind to the operator the repressor must first bind tryptophan (Trp); hence tryptophan is a corepressor. In the absence of Trp, the trp repressor dissociates and transcription of the trp operon is initiated.

Attenuation regulates the termination of transcription as a function of tryptophan concentration. At low levels of Trp, full-length mRNA is made; at high levels, transcription of the trp operon is halted prematurely. Attenuation works by coupling transcription to translation. Prokaryotic mRNA does not require processing, and because prokaryotes have no nucleus, translation of an mRNA can start before its transcription is complete. Consequently, regulation of gene expression via attenuation is unique to prokaryotes.

a. Attenuation is mediated by the formation of one of two possible stem-loop structures in a 5′ segment of the trp operon mRNA.

b. If tryptophan concentrations are low, translation of the leader peptide is slow (the ribosome stalls at the tryptophan codons in the leader) and transcription of the trp operon outpaces translation. This results in the formation of a nonterminating stem-loop structure between regions 2 and 3 in the 5′ segment of the mRNA. Transcription of the trp operon is then completed.

c. If tryptophan concentrations are high, the ribosome quickly translates the mRNA leader peptide. Because translation is occurring rapidly, the ribosome covers region 2 so that region 2 cannot pair with region 3. Consequently, a stem-loop structure forms between regions 3 and 4 and transcription is terminated.
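
The two trp control mechanisms described above can be put side by side in a short sketch. This is a qualitative summary only: the function name, the dictionary keys and the reduction of tryptophan concentration to a single high/low flag are assumptions made for illustration.

```python
def trp_operon_state(tryptophan_high: bool) -> dict:
    """Qualitative state of the trp operon at low or high tryptophan."""
    return {
        # Corepression: the repressor binds the operator only when it
        # has tryptophan (the corepressor) bound.
        "repressor_on_operator": tryptophan_high,
        # Attenuation: rapid leader translation covers region 2, so the
        # 3-4 terminator forms; slow translation allows 2-3 pairing instead.
        "leader_stem_loop": (
            "3-4 (terminating)" if tryptophan_high else "2-3 (nonterminating)"
        ),
        # Full-length mRNA is made only when tryptophan is scarce.
        "full_length_mrna": not tryptophan_high,
    }


for level in (False, True):
    print("high Trp" if level else "low Trp ", trp_operon_state(level))
```

Both mechanisms point the same way: high tryptophan shuts the operon down at initiation and again at the attenuator, while low tryptophan relieves both blocks.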

Regulation of Gene Expression in Eukaryotes

The genetic information of a human cell is a thousandfold greater than that of a prokaryotic cell. Things are further complicated by the number of cell types and by the fact that each cell type must express a particular subset of genes at different points in an organism's development. Regulating gene expression so that a particular subset of genes is expressed in a specific tissue at specific points of development is very complicated. This increased complexity in regulation lends itself to malfunctions that cause disease. Three ways that eukaryotes regulate gene expression will be discussed: alteration of gene content or position, transcriptional regulation and alternative RNA processing.

1. Alteration of Gene Content or Position

The copy number of a gene or its location on the chromosome can greatly affect its level of expression. Gene content or location can be altered by gene amplification, diminution or rearrangement.

The expression of a particular gene can be augmented by amplifying its copy number. Histone proteins and rRNA are needed in large quantities by almost all eukaryotic cells; therefore the genes encoding histones and rRNA exist in a permanently amplified state. Gene amplification can present problems with the use of chemotherapeutic drugs. Methotrexate inhibits dihydrofolate reductase, the enzyme responsible for regenerating the folates used in nucleotide synthesis. Tumor cells often become resistant to the drug because the gene encoding dihydrofolate reductase is amplified several hundredfold, resulting in more enzyme than the drug can inhibit.

A gene whose expression is needed only at a particular developmental point or in a particular tissue may be shut off by gene diminution. As red blood cell precursors mature into red blood cells, all of their genes are lost when the nucleus is expelled and degraded.

Gene rearrangement is used to generate the genes encoding the millions of different antibodies that are produced by B cells. Sometimes aberrant gene rearrangements occur that lead to improper gene regulation; this happens frequently in cancer cells. Translocation of a segment from chromosome 8 to a chromosome that carries immunoglobulin genes leads to activation of a gene that transforms healthy B cells into Burkitt's lymphoma cells (B cells that proliferate without regulation).

Regions of each chromosome are packaged either as heterochromatin or as euchromatin. In heterochromatin the DNA is very tightly condensed and inaccessible to the transcriptional machinery; consequently heterochromatin is transcriptionally inactive. In human females one of the two X chromosomes is almost completely inactivated by being packaged into heterochromatin to form a Barr body. The cytosine residues in the DNA of heterochromatin are heavily methylated, suggesting that methylation may play a role in the maintenance of heterochromatin. Drugs that interfere with methylation cause activation of previously inactive genes found in heterochromatin.

In euchromatin the DNA is not as condensed and is accessible to the transcription machinery. The regions of a chromosome that are maintained as heterochromatin and euchromatin may vary in a cell-specific manner. This may enable the cells of specific tissues to express the particular subset of genes required for tissue function.

Proteins that participate in regulating gene expression are often called trans-acting elements (or trans-acting factors). At least 100 different proteins, many specific for the regulation of a particular gene, are known. Others play a more general role in regulating gene expression, in a manner analogous to the activation of numerous prokaryotic genes by the CAP-cAMP complex. Trans-acting factors have multiple domains required for activity, which may include DNA-binding, transcription-activating and ligand-binding domains.

DNA-binding domains recognize specific DNA sequences in the regulatory regions of a gene. The DNA-binding domain of a regulatory protein generally consists of one of three motifs: helix-turn-helix, zinc finger or leucine zipper. DNA-binding proteins possessing these motifs bind with high affinity to their recognition sites and with low affinity to other DNA. A very small portion of the protein makes contact with the DNA through H-bonds and van der Waals interactions between amino acid side chains and the functional groups in the major groove and the phosphate backbone of the DNA. The remainder of the protein is involved in proper positioning of the DNA-binding domain and in making protein-protein contacts with other transcriptional proteins.

Proteins with the helix-turn-helix motif form symmetric dimers that recognize a symmetric palindromic DNA sequence. Each monomer of the dimer contains a region in which two α-helices are held at 90 degrees to each other by a turn of four amino acids. One set of helices makes contact with about five base pairs in the major groove. The other set sits atop the phosphate backbone and helps to properly position the set of helices that fits into the major groove.
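
A palindromic DNA sequence, in the sense used above, reads the same 5′ to 3′ on both strands, so it equals its own reverse complement. The following is a minimal sketch of that check; the function names are introduced here for illustration, and the example sequences are arbitrary strings rather than real binding sites.

```python
# Watson-Crick base pairing for DNA.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindromic(seq: str) -> bool:
    """True if the sequence reads the same on both strands."""
    return seq.upper() == reverse_complement(seq)


print(is_palindromic("GAATTC"))   # illustrative sequence
print(is_palindromic("GATTACA"))  # illustrative sequence
```

Because each half of such a site is the mirror of the other, each monomer of a symmetric dimer can contact an equivalent half-site.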

Proteins possessing the zinc finger motif contain between 2 and 9 repeated domains that are each centered on a tetrahedrally coordinated zinc ion. Each zinc-coordinated domain forms a loop containing an α-helix; this loop is called a zinc finger. There are two types of zinc fingers: the C2H2 finger and the Cx finger.

In the C2H2 type, three fingers interact with the major groove and wrap around the DNA. Many transcription factors have this type of domain.

Proteins with the Cx type of finger bind as dimers to the major groove of the DNA. Many steroid receptors have this type of domain.

Proteins with the leucine zipper motif have an amphipathic α-helix at their carboxyl terminus. One side of the helix consists of hydrophobic groups, usually leucine, that are repeated every seventh position for several turns of the helix. The other face consists of charged and polar groups.
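
The "every seventh position" spacing can be made concrete with a short scan of a protein sequence. This is a simplified sketch, not a motif predictor: it accepts only leucine (the text says "usually leucine"), the threshold of four repeats is an arbitrary assumption, and the test sequences are artificial.

```python
def has_leucine_heptad_repeat(protein: str, min_repeats: int = 4) -> bool:
    """True if leucine (L) occurs at every 7th residue at least min_repeats times in a row.

    Simplification: only leucine counts; min_repeats is an assumed threshold.
    """
    seq = protein.upper()
    for start in range(len(seq)):
        count = 0
        pos = start
        # Step along the sequence seven residues at a time while we see leucine.
        while pos < len(seq) and seq[pos] == "L":
            count += 1
            pos += 7
        if count >= min_repeats:
            return True
    return False


toy_zipper = "LAAAAAA" * 4  # artificial sequence with L at positions 0, 7, 14, 21
print(has_leucine_heptad_repeat(toy_zipper))
print(has_leucine_heptad_repeat("AAAAAAAAAAAAAA"))
```

A spacing of seven residues places the leucines on the same face of the α-helix, which is why they form the continuous hydrophobic side described above.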

Leucine zipper proteins bind as dimers to the major groove of the DNA. The two α-helices of each arm enter the major groove and wrap around the double helix. Several oncogenes use this type of motif.

Transcription-activating domains generally act separately and independently of the DNA-binding domains. They enhance transcription by physically interacting with other regulatory proteins and/or with RNA polymerase. The actual mechanisms by which these domains activate or enhance transcription are not known.

Steroid hormones, thyroid hormones and retinoic acid are examples of ligands that activate transcription by binding to a specific ligand-binding domain on a receptor protein. Upon binding, the receptor undergoes a conformational change that enables it to bind DNA. Once bound to the DNA, a receptor protein can activate or repress transcription of the target gene.

Cis-acting elements are DNA sequences that are recognized and bound by the trans-acting elements that regulate transcription. There are two major types of cis-acting elements: promoters and regulatory elements.

Promoters are the sites where RNA polymerase must bind to the DNA in order to initiate transcription (see RNA Synthesis and Processing lecture). The rate or efficiency of promoter use by RNA polymerase is affected by the regulatory elements.

Regulatory elements are specific DNA sequences that are recognized and bound by the trans-acting elements that stimulate or inhibit the expression of a particular gene. There are two types: enhancers and response elements.

Enhancers are regulatory elements that increase or repress the rate of gene transcription.

Response elements are regulatory sequences that facilitate the coordinated regulation of a group of genes. Certain ligands, such as steroid hormones and cAMP, bind to their receptors, which in turn bind to their response elements to activate or inhibit transcription.

Initiating transcription at an alternative start site places a different exon at the 5′ end of the transcript. Examples of genes that use alternative start sites as a form of regulation include amylase, myosin and alcohol dehydrogenase.

Immunoglobulin (antibody) heavy chains use an alternative polyadenylation site to alter the length of the transcript. The longer transcript encodes the μm form, which is localized to the cell membranes of lymphocytes; the shorter transcript encodes the secreted form, μs.

Alternative splice sites are used to generate isoforms, similar proteins with tissue-specific functions. Many peptide hormones exist as isoforms. For example, the transcript of the calcitonin gene is differentially spliced to produce calcitonin in the thyroid and calcitonin gene-related peptide in neurons.

The stability of mRNA is quite variable from gene to gene. These variations in stability govern the length of time that an mRNA is available for translation and hence the amount of protein that is synthesized. The half-lives of mRNAs vary from 10 hours to minutes. Sequences in the 3′ untranslated region that serve as signals for rapid degradation have been identified in some mRNAs with very short half-lives. The length of the poly(A) tail also affects mRNA stability: mRNAs with longer tails tend to have longer half-lives.
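
To see how strongly half-life affects the amount of message available, the sketch below computes the fraction of an mRNA pool that remains after a given time. It assumes simple first-order (exponential) decay with no new synthesis, which is a modeling assumption not stated in the text; the function name and the 10-minute example value are also chosen here for illustration.

```python
def fraction_remaining(elapsed_hours: float, half_life_hours: float) -> float:
    """Fraction of an mRNA pool left after elapsed_hours.

    Assumes first-order decay and no new transcription (illustrative model).
    """
    if half_life_hours <= 0:
        raise ValueError("half_life_hours must be positive")
    # Each half-life that passes halves the remaining pool.
    return 0.5 ** (elapsed_hours / half_life_hours)


# Compare a stable message (10 h half-life) with an unstable one (10 min)
# one hour after transcription stops.
stable = fraction_remaining(1.0, 10.0)
unstable = fraction_remaining(1.0, 10.0 / 60.0)
print(f"stable mRNA remaining:   {stable:.3f}")
print(f"unstable mRNA remaining: {unstable:.3f}")
```

Under this model, after one hour most of the stable message is still present while the unstable message has passed through six half-lives, so only a few percent remains.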
