The Molecule of the Month series, by David S. Goodsell, explores the
functions and significance of selected biological macromolecules for a
general audience. These installments are available at
www.rcsb.org/pdb/molecules/molecule_list.html. A sample of the
molecules featured during this past quarter is included below:
October, 2003 -- Your body needs a steady supply of amino acids for
use in growth and repairs. Each day, a typical adult needs something
in the range of 35-90 grams of protein, depending on their weight.
Quite surprisingly, a large fraction of this may come from inside.
A typical North American diet may contain 70-100 grams of protein
each day. But your body also secretes 20-30 grams of digestive
proteins, which are themselves digested when they finish their
duties. Dead intestinal cells and proteins leaking out of blood
vessels are also digested and reabsorbed as amino acids, showing
that our bodies are experts at recycling.
Proteins are tough, so we use an arsenal of enzymes to digest them
into their component amino acids. Digestion of proteins begins in
the stomach, where hydrochloric acid unfolds proteins and the enzyme
pepsin begins a rough disassembly. The real work then starts in the
intestines. The pancreas adds a collection of protein-cutting enzymes,
with trypsin playing the central role, that chop the protein chains
into pieces just a few amino acids long. Then, enzymes on the surfaces
of intestinal cells and inside the cells chop them into amino acids,
ready for use throughout the body.
Trypsin uses a special serine amino acid in its protein-cutting
reaction, and is consequently known as a serine protease. The serine
proteases are a diverse family of enzymes, all of which use similar
enzymatic machinery. In digestion, trypsin, chymotrypsin and elastase
work together to chop up proteins. Each has a particular taste for
protein chains: trypsin (shown in PDB entry 2ptn) cuts next to lysine
and arginine, chymotrypsin (shown in PDB entry 2cha) cuts next to
phenylalanine and other large amino acids, and elastase (shown in PDB
entry 3est) likes chains with small amino acids like alanine.
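The cutting preferences described above can be illustrated with a small sketch. It assumes that "cuts next to" means cutting just after the named amino acid, and it uses one-letter amino acid codes. The function name digest, the table CLEAVAGE_SITES, and the example peptide are hypothetical, and the choice of tryptophan and tyrosine as the "other large amino acids" is an assumption. Real enzymes are more selective than this toy model.

```python
# Simplified cutting preferences, based on the description in the text.
# One-letter codes: K = lysine, R = arginine, F = phenylalanine, A = alanine.
CLEAVAGE_SITES = {
    "trypsin": set("KR"),
    # Assumption: W (tryptophan) and Y (tyrosine) stand in for the
    # "other large amino acids" mentioned in the text.
    "chymotrypsin": set("FWY"),
    # Only alanine is listed here; other small amino acids are omitted.
    "elastase": set("A"),
}


def digest(sequence, enzyme):
    """Cut a protein chain after every residue the enzyme prefers."""
    sites = CLEAVAGE_SITES[enzyme]
    fragments = []
    start = 0
    for i, residue in enumerate(sequence):
        if residue in sites:
            fragments.append(sequence[start:i + 1])
            start = i + 1
    # Keep whatever is left after the last cut.
    if start < len(sequence):
        fragments.append(sequence[start:])
    return fragments


peptide = "GAKLFRPAW"  # made-up example chain
for name in CLEAVAGE_SITES:
    print(name, digest(peptide, name))
```

Running the three enzymes over the same chain shows why they work well together: each one cuts at different positions, so the combined digestion leaves shorter pieces than any single enzyme would.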
Trypsin-like enzymes are also found in many other places in the body.
Some of these are highly specific, cleaving only a specific target
protein. For instance, thrombin, presented in the Molecule of the
Month in January 2002, is designed to make a specific cut in
fibrinogen, creating a blood clot.
For more information about trypsin, see
Simian Virus 40: Steering the Cycle of Life
November, 2003 -- Simian virus 40 is an example of how simple a virus
can be and still perform its deadly job. Viruses are tiny machines
with a single purpose: to reproduce themselves. They enter cells and
hijack their synthetic machinery, forcing them to create new viruses.
SV40 does this with very little molecular machinery. It is enclosed by
a spherical capsid composed of 360 copies of one protein, seen in PDB
entry 1sva, and a few copies of two others. This capsid is just big
enough to enclose a small circle of DNA 5,243 nucleotides long, which
contains the barest minimum of information needed to get into the cell
and make new viruses.
The circular SV40 genome is found in the cell as a "mini-chromosome"
wound into a handful of nucleosomes. It only has enough space to
encode a few functions, since it all has to fit inside the tiny
capsid. It has a regulatory region that controls the entire life
cycle of the virus. It also encodes several proteins: the T-antigen
(and a spliced version of it called the t-antigen) and three capsid
proteins, VP1, VP2 and VP3. Only a few tiny segments are not used.
Space is so limited in this genome that the capsid proteins are
actually encoded with overlapping reading frames, such that the end
portion of the gene for one protein also encodes the beginning
portion of the next protein. For more information on the parsimonious
genome of SV40, take a look at the European Bioinformatics
Institute's Protein of the Month feature at
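
The idea of overlapping reading frames can be shown with a toy example. The sketch below is not the SV40 sequence: the 14-letter DNA string, the tiny codon table, and the function name find_genes are all made up for illustration. It simply shows how the tail of one gene can also spell the start of the next when the second gene is read in a different frame.

```python
# Toy codon table: only the codons that occur in the made-up sequence below.
CODONS = {"ATG": "M", "GCA": "A", "AAA": "K"}
STOPS = {"TAA", "TAG", "TGA"}


def find_genes(dna):
    """Return (start, end, protein) for each ATG that is followed,
    in the same reading frame, by a stop codon."""
    genes = []
    for start in range(len(dna) - 2):
        if dna[start:start + 3] != "ATG":
            continue
        protein = ""
        for pos in range(start, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if codon in STOPS:
                genes.append((start, pos + 3, protein))
                break
            # Codons missing from the toy table are shown as "X".
            protein += CODONS.get(codon, "X")
    return genes


TOY_DNA = "ATGGCATGAAATAA"  # hypothetical sequence, not from SV40
genes = find_genes(TOY_DNA)
(a_start, a_end, a_protein), (b_start, b_end, b_protein) = genes

# The second gene begins before the first one ends, so these
# nucleotides are read twice, once in each frame.
shared = TOY_DNA[b_start:a_end]
print(genes)
print("shared nucleotides:", shared)
```

In this toy case the two genes share four nucleotides, so a 14-letter sequence does the work that would otherwise take 18.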
SV40 infects primate cells, forcing its way inside and releasing its
DNA circle. Once inside, it has two jobs: to replicate its DNA and to
package it inside new viral capsids. Amazingly, SV40 only needs one
protein, the T-antigen, to control both of these processes. Soon after
the virus enters the cell, the cell's own synthetic machinery
recognizes a TATA sequence at the center of the SV40 regulatory
region. The cell then creates a messenger RNA reading
counterclockwise around the DNA circle. This mRNA is used
to make the T-antigen protein. Then the virus really gets to work.
The T-antigen binds to the SV40 circle and helps to separate the
strands, making way for the cell's polymerases to copy the DNA. It
also directs the reading of the DNA in the opposite direction,
clockwise around the strand, to create many copies of the capsid
proteins.
For more information on simian virus 40, see
Catabolite Activator Protein: a Second Messenger
December, 2003 -- Bacteria love sugar. In particular, bacteria love
glucose, which is easily digestible and quickly converted to chemical
energy. When glucose is plentiful, bacteria ignore other nutrients in
their environment, feasting on their favored source. But, when glucose
is rare, they shift gears and mobilize the machinery needed to use
other sources of energy.
Bacteria use an unusual modification of ATP, the molecule that carries
chemical energy in the cell, to notify their synthetic machinery about
what they are currently eating. As glucose levels drop, the cell-surface
enzyme adenyl cyclase is activated. It grabs ATP molecules, clips off
two phosphates, and reconnects the free end back onto the molecule,
creating an odd little molecular loop through the phosphate. This
product, called cyclic AMP, is released and it spreads through the
cell, stimulating production of the enzymes that process other food
molecules. Because of its role in delivering messages from the primary
glucose sensor (adenyl cyclase) to the synthetic machinery, cyclic AMP
is often known as a second messenger.
Catabolite activator protein (CAP), also known as cyclic AMP receptor
protein (CRP), is activated by cyclic AMP and stimulates synthesis of
the enzymes that break down non-glucose food molecules. It is composed
of two identical subunits, shown in PDB entry 1cgp. When cyclic AMP
binds, it changes the conformation of the protein slightly, making it
perfect for binding to DNA. CAP binds to a specific DNA sequence,
which is found next to the genes that are activated. When CAP binds to
DNA, it coaxes RNA polymerase into place, beginning transcription.
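The chain of events described above, from glucose level to gene activation, can be summarized as a simple on/off model. The sketch below is a deliberate simplification: real cells respond in graded amounts rather than true/false steps, and all function names are hypothetical.

```python
def cyclic_amp_is_made(glucose_is_plentiful):
    """Adenyl cyclase is activated, and makes cyclic AMP, when glucose drops."""
    return not glucose_is_plentiful


def cap_binds_dna(cyclic_amp_present):
    """CAP takes on its DNA-binding shape only when cyclic AMP is bound."""
    return cyclic_amp_present


def other_food_enzymes_are_made(glucose_is_plentiful):
    """CAP on the DNA coaxes RNA polymerase into place, starting transcription."""
    cyclic_amp = cyclic_amp_is_made(glucose_is_plentiful)
    return cap_binds_dna(cyclic_amp)


for glucose in (True, False):
    state = "plentiful" if glucose else "rare"
    print("glucose", state, "->", other_food_enzymes_are_made(glucose))
```

The model makes the relay explicit: the glucose sensor never touches the DNA itself, and the message reaches the genes only through cyclic AMP and CAP.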
For more information about the catabolite activator protein, see
The RCSB PDB (citation) is managed by two members of the Research Collaboratory for Structural Bioinformatics:
The RCSB PDB is funded by a grant from the
National Science Foundation, the
National Institutes of Health, and the
US Department of Energy.