TATA-Binding Protein
July 2005 Molecule of the Month by David Goodsell, doi: 10.2210/rcsb_pdb/mom_2005_7
Introduction
The enzyme RNA polymerase performs the delicate task of unwinding the two strands of DNA and transcribing the genetic information into a strand of RNA. But how does it know where to start? Our cells contain on the order of 20,000 protein-coding genes encoded in billions of nucleotides. For each gene, the cell must be able to start transcription at the right place and at the right time.
Getting Started
Specialized DNA sequences next to genes, called promoters, define the proper start site and direction for transcription. Promoters vary in sequence and location from organism to organism. In bacteria, typical promoters contain two regions that interact with the sigma subunit of their RNA polymerase. The sigma subunit binds to these DNA sequences, assists the start of transcription, and then detaches from the polymerase as it continues transcription through the gene. Our cells have a far more complex promoter system, using dozens of different proteins to ensure that the proper RNA polymerase is targeted to each gene. The TATA-binding protein is the central element of this system.
The TATA Box
Many of our protein-coding genes have a characteristic sequence of nucleotides, termed the TATA box, in front of the start site of transcription. The typical sequence is something like T-A-T-A-a/t-A-a/t, where a/t refers to positions that can be either A or T. Surprisingly, many variations on this theme also work, and one of the challenges in the study of transcription is discovering why some sequences work and others don't. The TATA-binding protein (sometimes referred to as TBP) recognizes this TATA sequence and binds to it, creating a landmark that marks the start site of transcription. When the first structures of TATA-binding protein were determined, researchers discovered that it is not gentle when it binds to DNA. Instead, it grabs the TATA sequence and bends it sharply, as seen in PDB entries 1ytb, 1tgh and 1cdw.
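The consensus described above, T-A-T-A-a/t-A-a/t, can be written as a simple pattern and used to scan a DNA sequence. The following is only a minimal sketch: the function name and the example sequence are made up for illustration, it searches just the strand it is given, and a strict match like this will miss the many variant sequences that also work in real promoters.

```python
import re

# Consensus from the text: T-A-T-A-(A or T)-A-(A or T).
# The lookahead lets overlapping matches be reported as well.
TATA_CONSENSUS = re.compile(r"(?=(TATA[AT]A[AT]))")

def find_tata_boxes(sequence):
    """Return (0-based position, matched site) for each strict consensus match."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in TATA_CONSENSUS.finditer(seq)]

# Hypothetical sequence, not taken from any real gene.
example = "GGCGCTATAAAAGGCTG"
hits = find_tata_boxes(example)
print(hits)
```

A more realistic search would score each position against a weighted profile rather than demand an exact match, since binding also depends on how easily the DNA can be bent.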
Helpers
TATA-binding protein works as part of a larger transcription factor, TFIID, that starts the process of transcription. After it binds to the promoter, it recruits additional transcription factors. TFIIB, shown at the top here from PDB entry 1vol, binds next. Then a string of other transcription factors bind, constructing a large protein complex that decides whether or not to start transcription. These may include transcription activators, such as TFIIA shown in the middle from PDB entry 1ytf, that promote the start of transcription. Other factors inhibit the start of transcription, such as the transcription regulator NC2 (negative cofactor 2), shown at the bottom from PDB entry 1jfi. In all of these pictures, TATA-binding protein is shown in blue, a small piece of DNA is shown in red and the transcription factor is shown in green.
Exploring the Structure
TATA-binding protein uses two types of interactions to recognize and hold the TATA sequence, as seen in this structure from PDB entry 1ytb. First, as shown at the top, it has a string of lysine and arginine amino acids (colored dark blue) that interact with the phosphate groups of the DNA (colored bright yellow and red). This glues the protein to the DNA. Second, the protein uses specially-placed amino acids to interact with DNA bases. As shown in the lower picture, four phenylalanine amino acids jam into the DNA minor groove and form the kinks that bend the DNA. There are also two symmetrical asparagine amino acids that form hydrogen bonds at the very center. The combination of the unusual flexibility of TATA DNA sequences and these specific hydrogen bonds allows TATA-binding protein to recognize the proper sequence.
As you are looking at these structures yourself, notice that TATA-binding protein, although it is a single protein chain, consists of two symmetrical halves. This symmetry is easily seen in the two pairs of phenylalanines and the two asparagines shown in the lower figure. It is thought that an ancient gene duplication created this protein by combining two copies of the same gene. For more information on TATA-binding protein from a genomics perspective, visit the Protein of the Month at the European Bioinformatics Institute.
These pictures were created with RasMol. You can create similar pictures by clicking on the accession codes here and picking one of the options under View Structure. The phenylalanines shown above are numbers 99, 116, 190, and 207, and the asparagines are numbers 69 and 159.
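The residue numbers quoted above give a quick way to see the two-halves symmetry in the sequence itself. The short sketch below pairs each residue in the first half with its counterpart in the second half and computes the spacing between them. The pairing (99 with 190, 116 with 207, 69 with 159) is an assumption made here for illustration, and the helper name is hypothetical.

```python
# Residue numbers quoted in the text (numbering as in PDB entry 1ytb).
phenylalanines = [99, 116, 190, 207]
asparagines = [69, 159]

def pair_offsets(residues):
    """Pair the first half of the list with the second half and return the spacings.

    Assumes the list is sorted and that the i-th residue of the first half
    corresponds to the i-th residue of the second half.
    """
    half = len(residues) // 2
    return [later - earlier
            for earlier, later in zip(residues[:half], residues[half:])]

phe_offsets = pair_offsets(phenylalanines)
asn_offsets = pair_offsets(asparagines)
print(phe_offsets, asn_offsets)
```

Under this pairing every counterpart sits about 90 residues further along the chain, which is what one would expect if the protein is built from two similar repeats laid end to end.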
Further reading about TATA-binding protein
R. G. Roeder (1996) The role of general initiation factors in transcription by RNA polymerase II. Trends in Biochemical Sciences 21, 327-335.
Z. S. Juo, T. K. Chiu, P. M. Leiberman, I. Baikalov, A. J. Berk and R. E. Dickerson (1996) How proteins recognize the TATA box. Journal of Molecular Biology 261, 239-254.
© 2013 David Goodsell & RCSB Protein Data Bank




