Oct and Sox Transcription Factors

Keywords: transcription factor complex, octamer transcription factors, sox transcription factors, pluripotent stem cells, nuclear reprogramming


The development of a complete human being from a single cell is one of the great miracles of life. A human egg cell contains about 30,000 genes that encode proteins, and about 3,000 of these encode transcription factors. Transcription factors determine when genes will be turned on and turned off, orchestrating the many processes involved in the development of an embryo and the many tasks performed by each cell after a child is born. Amazingly, there is only about 1 transcription factor for every 10 genes, posing a puzzle: how does this limited set of proteins control the many genes and processes that must be regulated?

Combinatorial Control

One of the answers to this question may be discovered by looking at the binding sites for transcription factors in the genome. Typical genes in our cells have extensive regulatory regions before and after the genes, sometimes 100,000 base pairs away, and occasionally even inside the genes. These regions act in many different ways, as enhancers, silencers, insulators, and promoters of the gene. Each gene is controlled by a combination of many transcription factors, which together form a consensus as to whether the gene will be expressed or not at any given time.
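
To get a feel for why combinations help, here is a small sketch in Python that counts how many distinct groups can be formed from a limited set of factors. It uses the round numbers quoted earlier (about 3,000 transcription factors and 30,000 genes) and assumes, purely for illustration, that any group of factors could act together. Real cells use only a small fraction of these combinations, and the name count_combinations is made up for this example.

```python
from math import comb

# Round figures from the article; treat them as approximate.
TRANSCRIPTION_FACTORS = 3_000
GENES = 30_000


def count_combinations(factors: int, group_size: int) -> int:
    """Return the number of distinct unordered groups of `group_size` factors.

    Illustrative assumption: every possible group is allowed, which
    overstates what actually happens in a cell.
    """
    return comb(factors, group_size)


# Compare single factors with pairs and triples acting together.
for size in (1, 2, 3):
    groups = count_combinations(TRANSCRIPTION_FACTORS, size)
    print(f"groups of {size}: {groups:,} ({groups / GENES:,.1f} per gene)")
```

Even pairs of factors give far more possible combinations than there are genes, which is the heart of the combinatorial answer to the puzzle.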

Choosing a Path

Oct4 and its cofactor Sox2 are at the center of a collection of transcription factors that control the first decisions in the development of an embryo. Oct4 is present in embryonic stem cells, and its levels drop when the cell starts to divide and differentiate into different types of cells. It has been called the "gatekeeper" of development, since it is necessary for maintaining the stem cell state. The structure shown here, from PDB entry 1gt0, shows the DNA-binding portions of a similar protein, Oct1 (at the bottom in turquoise), and Sox2 (at the top in blue) bound to a short piece of DNA (in orange and pink).


Unfortunately, once stem cells make their choices and differentiate into nerve cells or skin cells or other types of cells, they are normally unable to reverse their choices and become stem cells once again. If this were possible, however, it would be very useful: for instance, imagine taking a few skin cells from a patient with diabetes, and then changing these cells into pancreatic cells that can make insulin. Researchers have recently used Oct4 and Sox2 to make the first steps towards this amazing goal. By adding the genes for these proteins, along with a few other transcription factors, to skin cells, they were able to reprogram the cells into "pluripotent" stem cells that are able to form many other cell types.


Group Effort

The reprogramming of skin cells or other cells into stem cells requires a few helper proteins that relay the signal of Oct4 to the many genes that must be affected. The two proteins shown here, c-Myc (top) and Klf4 (bottom), were used in the first successful reprogramming experiments. The DNA-binding portion of c-Myc is shown bound to a small piece of DNA along with its partner protein Max, from PDB entry 1nkp. The DNA-binding portion of Klf transcription factors is composed of three zinc finger domains, shown here from PDB entry 2ebt.




Exploring the Structure

Combinatorial control, where several transcription factors bind together to control a gene, allows the same proteins to be used in different ways. This is shown in two structures of Oct1 and Sox2 bound to different pieces of regulatory DNA. In PDB entry 1gt0 (left), the proteins are bound to the FGF4 enhancer DNA and the two proteins interact weakly through an extended tail of Sox2. In PDB entry 1o4x (right), the proteins are bound closer together on the Hoxb1 regulatory sequence and they form a stronger interaction. In this way, the different spacing of the binding sites in the DNA can control the binding strength of the Oct and Sox complex.
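
If you would like to explore these two structures in your own viewer, the sketch below shows one way to get the coordinate files with Python. It assumes the usual download address pattern of the RCSB file server, and the helper names pdb_download_url and fetch_entry are made up for this example. The download function is defined but not called here, so running the block only prints the two addresses.

```python
from urllib.request import urlopen

# Assumed address pattern for PDB-format files on the RCSB file server.
RCSB_DOWNLOAD = "https://files.rcsb.org/download/{entry}.pdb"

# The two Oct1/Sox2 complexes compared in the text.
ENTRIES = ("1gt0", "1o4x")


def pdb_download_url(entry_id: str) -> str:
    """Build the download address for a four-character PDB entry ID."""
    entry = entry_id.strip().upper()
    if len(entry) != 4 or not entry.isalnum():
        raise ValueError(f"not a four-character PDB ID: {entry_id!r}")
    return RCSB_DOWNLOAD.format(entry=entry)


def fetch_entry(entry_id: str, timeout: float = 30.0) -> str:
    """Download one entry and return its text (needs a network connection)."""
    with urlopen(pdb_download_url(entry_id), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Print the addresses only; call fetch_entry("1gt0") to actually download.
for entry in ENTRIES:
    print(entry, pdb_download_url(entry))
```

With both files in hand, you can load them side by side and compare how closely Oct1 and Sox2 sit on the two pieces of DNA.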

Topics for further exploration

  1. Researchers have used a variety of other transcription factors along with Oct4 and Sox2 for reprogramming cells, including Nanog and Lin-28. The DNA-binding portions of these proteins are available in the PDB. Can you find similarities and differences with the transcription factors shown here?
  2. Many DNA-binding proteins bend DNA when they bind. Can you find other examples in the PDB?


Additional Reading

  • M. Levine and R. Tjian (2003) Transcription regulation and animal diversity. Nature 424, 147-151.
  • W. Buitrago and D. R. Roop (2007) Oct-4: the almighty POUripotent regulator? Journal of Investigative Dermatology 127, 260-262.
  • S. I. E. Guth and M. Wegner (2008) Having it both ways: Sox protein function between conservation and innovation. Cellular and Molecular Life Sciences 65, 3000-3018.
  • Y.-H. Loh, J.-H. Ng and H.-H. Ng (2008) Molecular framework underlying pluripotency. Cell Cycle 7, 885-891.
  • K. Hochedlinger and K. Plath (2009) Epigenetic reprogramming and induced pluripotency. Development 136, 509-523.

Author Note

Entries included in Molecule of the Month articles are selected by the author, and do not represent a record of scientific priority or comprehensive review.

© 2015 David Goodsell & RCSB Protein Data Bank