If our genes are so similar, what really makes a eukaryote different from a prokaryote, or a human from E. coli? The answer lies in differences in how gene expression is carried out and regulated.
It is estimated that the human genome encodes approximately 25,000 genes, about the same number as that for corn and nearly twice as many as that for the common fruit fly. Even more interesting is the fact that those 25,000 genes are encoded in about 1.5% of the genome. So, what exactly does the other 98.5% of our DNA do? While many mysteries remain about what all of that extra sequence is for, we know that it does contain complex instructions that direct the intricate turning on and off of gene transcription.
Eukaryotes Require Complex Controls Over Gene Expression
While basic similarities in gene transcription exist between prokaryotes and eukaryotes (including the fact that RNA polymerase binds upstream of the gene on its promoter to initiate the process of transcription), multicellular eukaryotes control cell differentiation through more complex and precise temporal and spatial regulation of gene expression.
Multicellular eukaryotes have a much larger genome than prokaryotes, organized into multiple chromosomes with greater sequence complexity. Many eukaryotic species carry genes with the same sequences as other plants and animals. In addition, the same DNA sequences (though not the same proteins) are found within all of an organism's diploid, nucleated cells, even though these cells form tissues with drastically different appearances, properties, and functions. Why, then, is there such great variation among and within such organisms? Quite simply, the way in which different genes are turned on and off in specific cells generates the variety we observe in nature. In other words, the specific functions of different cell types are generated through differential gene regulation.
Of course, higher eukaryotes still respond to environmental signals by regulating their genes. But there is an additional layer of regulation that results from cell-to-cell interactions within the organism that orchestrate development. Specifically, gene expression is controlled on two levels. First, transcription is controlled by limiting the amount of mRNA that is produced from a particular gene. The second level of control is through post-transcriptional events that regulate the translation of mRNA into proteins. Even after a protein is made, post-translational modifications can affect its activity.
Transcriptional Regulation in Eukaryotes
Regulation of transcription in eukaryotes is a result of the combined effects of structural properties (how DNA is "packaged") and the interactions of proteins called transcription factors. The most important structural difference between eukaryotic and prokaryotic DNA is the formation of chromatin in eukaryotes. Chromatin results in the different transcriptional "ground states" of prokaryotes and eukaryotes (Table 1).
Table 1: Overview of Differences Between Prokaryotic and Eukaryotic Gene Expression and Regulation
Feature | Prokaryotes | Eukaryotes |
Structure of genome | Single, generally circular genome sometimes accompanied by smaller pieces of accessory DNA, like plasmids | Genome found in chromosomes; nucleosome structure limits DNA accessibility |
Size of genome | Relatively small | Relatively large |
Location of gene transcription and translation | Coupled; no nuclear envelope barrier because of prokaryotic cell structure | Nuclear transcription and cytoplasmic translation |
Gene clustering | Operons where genes with similar function are grouped together | Operons generally not found in eukaryotes; each gene has its own promoter element and enhancer element(s) |
Default state of transcription | On | Off |
DNA structure | Highly supercoiled DNA with some associated proteins | Highly supercoiled chromatin associated with histones in nucleosomes |
Transcription Factors and Combinatorial Control
Figure 1: DNA footprinting reveals transcription factor specificity in different cell types
In vivo footprinting analysis of the human beta globin promoter shows that adult erythroblasts (E, lane 4) have footprints on important regulatory motifs (note lighter regions, especially at CACC) as compared to the other samples. Here, lane N is control DNA, lane H is HeLa cells, lane K is K562 cells, lane R is Raji cells, and lane J is Jurkat cells. Of these cell lines, none is part of the lineage leading to red blood cells.
© 1994 American Society for Biochemistry and Molecular Biology Reddy, P. M. et al. Genomic footprinting and sequencing of human beta-globin locus: tissue specificity and cell line artifact. Journal of Biological Chemistry 269, 8287–8295 (1994). All rights reserved.
Transcription factors (TFs) are regulatory proteins whose function is to activate (or, more rarely, to inhibit) transcription of DNA by binding to specific DNA sequences. TFs have defined DNA-binding domains with up to 10^6-fold higher affinity for their target sequences than for the remainder of the DNA strand. These highly conserved sequences have been used to categorize the known TFs into various "families," such as the MADS box-containing proteins, SOX proteins, and POU factors (Remenyi et al., 2004). Transcription factors can also be classified by their three-dimensional protein structure, including basic helix-turn-helix, helix-loop-helix, and zinc finger proteins. These different structural motifs result in transcription factor specificity for the consensus sequences to which they bind.
Sequence-specific transcription factors are considered the most important and diverse mechanisms of gene regulation in both prokaryotic and eukaryotic cells (Pulverer, 2005). In eukaryotes, regulation of gene expression by transcription factors is said to be combinatorial, in that it requires the coordinated interactions of multiple proteins (in contrast to prokaryotes, in which a single protein is usually all that is required).
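The contrast between single-protein and combinatorial control can be sketched as a toy Boolean model. This is only an illustration, not a quantitative description of any real gene: the factor names (TF_A, TF_B, TF_C, REGULATOR_X) and the rule that every required factor must be present at once are simplifying assumptions.

```python
from typing import FrozenSet, Set


def single_regulator_gene(present: Set[str], regulator: str) -> bool:
    """Prokaryote-like rule: one regulatory protein decides expression."""
    return regulator in present


def combinatorial_gene(present: Set[str], required: FrozenSet[str]) -> bool:
    """Eukaryote-like rule: all required factors must be present together."""
    return required <= present


# Hypothetical factor names, used only for illustration.
REQUIRED = frozenset({"TF_A", "TF_B", "TF_C"})

if __name__ == "__main__":
    print(single_regulator_gene({"REGULATOR_X"}, "REGULATOR_X"))   # True
    print(combinatorial_gene({"TF_A", "TF_B"}, REQUIRED))          # False
    print(combinatorial_gene({"TF_A", "TF_B", "TF_C"}, REQUIRED))  # True
```

In this sketch, a cell that lacks even one of the required factors leaves the gene silent, which is one simple way to picture how different combinations of factors can produce cell-type-specific expression.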
Many genes, known as housekeeping genes, are needed by almost every type of cell and appear to be unregulated or constitutive. But at the core of cellular differentiation, manifested in the variety of cell types observed in different organisms, is the regulation of gene expression in a tissue-specific manner. The same genome is responsible for making the entire cadre of cell types, each of which has its own function—for example, red blood cells exchange oxygen, muscle cells expand and contract, and cells in the immune system recognize pathogens. Genes that regulate cell identity are turned on under very specific temporal, spatial, and environmental conditions to ensure that a cell is able to perform its designated function.
Take the example of the gene for beta globin, a protein used in red blood cells for oxygen exchange. Every cell in the human body contains the beta globin gene and the corresponding upstream regulatory sequences that regulate expression, but no cell type other than red blood cells expresses beta globin. Scientists can use a technique called DNA footprinting to map where transcription factors bind to specific regulatory sequences. When Reddy et al. examined the beta globin promoter in different cell types, they found that the transcription factors that could bind to the promoter sequences required for beta globin expression were expressed only in erythroblasts (immature adult red blood cells; see Figure 1). The two consensus sequences in the beta globin promoter known for binding transcription factors, CCAAT and CACC, were protected in the erythroid cells (E), but not in the other cell types (Reddy et al., 1994).
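The logic of this footprinting result can be mimicked with a small script. The sketch below locates the two consensus motifs named above (CCAAT and CACC) in a sequence and reports which of them would be "protected" in a given cell type. The toy sequence, the factor labels, and the one-factor-per-motif rule are all hypothetical; none of them is taken from the actual beta globin promoter or from the Reddy et al. data.

```python
from typing import Dict, List, Set

# Made-up sequence for illustration; NOT the real beta globin promoter.
TOY_PROMOTER = "TTTCACCTTTTCCAATTTT"

# Hypothetical labels for whichever factors recognize each motif.
MOTIF_TO_FACTOR = {"CCAAT": "factor_for_CCAAT", "CACC": "factor_for_CACC"}


def find_motif_sites(sequence: str, motifs=("CCAAT", "CACC")) -> Dict[str, List[int]]:
    """Return the 0-based start positions of every occurrence of each motif."""
    sites: Dict[str, List[int]] = {}
    for motif in motifs:
        positions = []
        start = sequence.find(motif)
        while start != -1:
            positions.append(start)
            start = sequence.find(motif, start + 1)
        sites[motif] = positions
    return sites


def protected_sites(sequence: str, factors_present: Set[str]) -> Dict[str, List[int]]:
    """Motif sites count as protected only if their binding factor is present."""
    sites = find_motif_sites(sequence, tuple(MOTIF_TO_FACTOR))
    return {
        motif: positions
        for motif, positions in sites.items()
        if positions and MOTIF_TO_FACTOR[motif] in factors_present
    }


if __name__ == "__main__":
    erythroid = {"factor_for_CCAAT", "factor_for_CACC"}
    non_erythroid: Set[str] = set()
    print(protected_sites(TOY_PROMOTER, erythroid))      # {'CCAAT': [11], 'CACC': [3]}
    print(protected_sites(TOY_PROMOTER, non_erythroid))  # {}
```

The same sequence is scanned in both cases; only the set of factors present differs, which mirrors the point that every cell carries the promoter but only the erythroid lineage shows footprints on it.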
The "Ground State" of DNA Expression
RNA polymerase in prokaryotes can access almost any promoter in a DNA strand without the presence of activators or repressors. Thus, the "ground state" of DNA expression in prokaryotes is said to be nonrestrictive, or "on." In eukaryotes, however, the ground state of expression is restrictive in that, although strong promoters might be present, they are inactive in the absence of some sort of recruitment to the promoter by transcription factors. For instance, RNA polymerase II, which transcribes mRNA, cannot bind to promoters in eukaryotic DNA without the help of transcription factors (Struhl, 1999). In many eukaryotic organisms, the promoter contains a conserved sequence called the TATA box. Various other consensus sequences also exist and are recognized by the different TF families. Transcription begins when one TF binds to one of these promoter sequences, initiating a series of interactions between multiple proteins (activators, regulators, and repressors) at the same site, or at other promoter, regulator, and enhancer sequences. Ultimately, a transcription complex is formed at the promoter that facilitates binding and transcription by RNA polymerase.
As in prokaryotes, eukaryotic repressor molecules can sometimes bind to silencer elements in the vicinity of a gene and inhibit the binding, assembly, or activity of the transcription complex, thus turning off expression of a gene. Positive regulation by TFs that are activators is common in eukaryotes. Considering the restrictive transcriptional ground state, it is logical that positive regulation is the predominant form of control in all systems characterized to date. Many activating TFs are generally bound to DNA until removed by a signal molecule, while others might only bind to DNA once influenced by a signal molecule. The binding of one type of TF can influence the binding of others, as well. Thus, gene expression in eukaryotes is highly variable, depending on the type of activators involved and what signals are present to control binding.
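The difference between the two ground states can be written as two default rules. The functions below are a deliberately crude Boolean sketch: real promoters respond in graded ways, and the function and parameter names are assumptions introduced here for illustration.

```python
def prokaryotic_expression(repressor_bound: bool = False) -> bool:
    """Nonrestrictive ground state: transcribed unless a repressor blocks it."""
    return not repressor_bound


def eukaryotic_expression(activator_recruited: bool = False,
                          repressor_bound: bool = False) -> bool:
    """Restrictive ground state: silent unless activators recruit the
    transcription machinery and no repressor interferes."""
    return activator_recruited and not repressor_bound


if __name__ == "__main__":
    # With no regulators at all, the defaults differ.
    print(prokaryotic_expression())  # True  ("on")
    print(eukaryotic_expression())   # False ("off")
    # Positive regulation switches the eukaryotic gene on.
    print(eukaryotic_expression(activator_recruited=True))  # True
    # A repressor can still shut it down.
    print(eukaryotic_expression(activator_recruited=True, repressor_bound=True))  # False
```

Calling each function with no arguments shows the two ground states directly, and it makes clear why positive regulation is the natural way to control a system whose default is "off."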
The Role of Chromatin
Even when transcription factors are present in a cell, transcription does not always occur, because often the TFs cannot reach their target sequences. The association of the DNA molecule with proteins is the first step in its silencing. The associated DNA and histone proteins are collectively called chromatin; the complex is held together tightly by the attraction of the negatively charged DNA to the positively charged histones (Table 1). The state of chromatin can limit access of transcription factors and RNA polymerase to DNA promoters, contributing to the restrictive ground state of gene expression. In order for gene transcription to occur, the chromatin structure must be unwound.
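Chromatin can be pictured as a gate that sits in front of the transcription factors. In the minimal sketch below, a gene is transcribed only if its chromatin is open and all of its required factors are present. The two-state open/closed flag and the factor names are simplifying assumptions, not a model of any particular locus.

```python
from typing import Set


def is_transcribed(chromatin_open: bool,
                   factors_present: Set[str],
                   required_factors: Set[str]) -> bool:
    """Closed chromatin blocks access no matter which factors are present."""
    if not chromatin_open:
        return False
    return required_factors.issubset(factors_present)


if __name__ == "__main__":
    required = {"TF_A", "TF_B"}  # hypothetical factor names
    present = {"TF_A", "TF_B"}
    print(is_transcribed(False, present, required))  # False: factors cannot reach the DNA
    print(is_transcribed(True, present, required))   # True: open chromatin plus all factors
    print(is_transcribed(True, {"TF_A"}, required))  # False: open, but a factor is missing
```

The first call captures the situation described above: the factors are all present, yet the gene stays silent because they cannot reach their target sequences.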
Chromatin structure contributes to the varying levels of complexity in gene regulation. It allows simultaneous regulation of functionally or structurally related genes that tend to be present in widely spaced clusters or domains on eukaryotic DNA (Sproul et al., 2005). Interactions of chromatin with activators and repressors can result in domains of chromatin that are open, closed, or poised for activation. Chromatin domains have various sizes and different extents of stability. These variations allow for phenomena found solely in eukaryotes, such as transcription at various stages of development and epigenetic memory throughout cell division cycles. They also allow for the maintenance of differentiated cellular states, which is crucial to the survival of multicellular organisms (Struhl, 1999).
Multiple Interactions Provide Synchronous Control
As you have seen, the state of chromatin structure at a specific region in eukaryotic DNA, along with the presence of specific transcription factors, works to regulate gene expression in eukaryotes. However, this complex interplay between proteins that serve as transcriptional activators or repressors and accessibility to the regulatory sequence is still just part of the story. Epigenetic mechanisms, including DNA methylation and imprinting, noncoding RNA, post-translational modifications, and other mechanisms, further enrich the cellular portfolio of gene expression control activities.
References and Recommended Reading
Pulverer, B. Sequence-specific DNA-binding transcription factors. Nature Milestones (2005) doi:10.1038/nrm1800
Reddy, P. M., Stamatoyannopoulos, G., Papayannopoulou, T., & Shen, C. K. Genomic footprinting and sequencing of human beta-globin locus: Tissue specificity and cell line artifact. Journal of Biological Chemistry 269, 8287–8295 (1994)
Remenyi, A., Scholer, H., & Wilmanns, M. Combinatorial control of gene expression. Nature Structural and Molecular Biology 11, 812–815 (2004) doi:10.1038/nsmb820
Struhl, K. Fundamentally different logic of gene regulation in eukaryotes and prokaryotes. Cell 98, 1–4 (1999)
Sproul, D., Gilbert, N., & Bickmore, W. The role of chromatin structure in regulating the expression of clustered genes. Nature Reviews Genetics 6, 775–781 (2005) doi:10.1038/nrg1688