October 2011 Molecule of the Month by David Goodsell
Keywords: Protein Crystallography, Structural Biology, Protein Data Bank
Structural biology was born in 1958 with John Kendrew's atomic structure of myoglobin, and in the following decade the field grew rapidly. By the early 1970s, there were a dozen atomic structures of proteins, and researchers were discovering that they had a goldmine of information. However, the coordinate files for these structures were quite large, and in the days before the internet, it was difficult for individual researchers to share them with the growing number of interested structural biologists around the world. The Protein Data Bank (PDB) archive was created to solve this problem. Depositors would send their coordinates to the PDB, which would then mail them to interested users. To celebrate the 40th anniversary of the PDB, you can explore the historic protein structures that inspired the creation of the archive.
John Kendrew's structure of myoglobin (1mbn) revealed the folding of protein chains for the first time, and showed how protein chains interact with prosthetic groups and with ligands. Max Perutz's structure of hemoglobin (2dhb) extended this story, showing how four similar chains can associate and regulate the binding of ligands through small changes in shape. The early PDB also included one additional protein from this family: a hemoglobin from lamprey (2lhb), which is intermediate between myoglobin and hemoglobin and regulates its action by transitioning between monomers and dimers.
Enzyme Active Sites
The early PDB also included several structures of enzymes, revealing how protein chains fold to form chemical catalysts. D. C. Phillips's structure of lysozyme (1lyz), solved in 1965 and added to the PDB in 1975, revealed that enzymes have a form-fitting active site. With some careful modeling, his group proposed that lysozyme distorts its substrate, making it easier to cleave. Several protein-cutting enzymes were included on the PDB magnetic tapes, including carboxypeptidase (3cpa), subtilisin (1sbt), chymotrypsin (2cha), and papain (9pap), as well as a small inhibitory protein, pancreatic trypsin inhibitor (4pti). The largest structure in the early PDB was the oligomeric enzyme lactate dehydrogenase (6ldh), composed of four chains with a groundbreaking 334 amino acids each.
Two protein structures included in the early PDB, rubredoxin (4rxn) and cytochrome b5 (1cyo), revealed how proteins carry the smallest cargo: individual electrons. Both proteins use an iron atom to carry the electron, but they do this in different ways. Rubredoxin traps the iron ion in a cage of four sulfur atoms, provided by cysteine amino acids from the protein chain. Cytochrome b5, on the other hand, uses a heme group to position its iron atom in the proper place.
Exploring the Structures
Many of the PDB files for these pioneering structures have been modified and improved over the past 40 years, but you can still find all of them in the database. Of course, today you don't have to wait for them to arrive in the mail: you can just download them from the internet, or use the Jmol image included here to flip through all of them. When you're browsing, notice that these proteins are small, compact, and fairly plentiful. These properties made the early structures possible, because the first techniques of protein crystallography required lots of purified protein and lots of stable crystals. Today, sophisticated crystallization techniques and very bright synchrotron X-ray sources allow researchers to solve structures of much larger and more complex molecules, using much less material.
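As a minimal sketch of the download step, the Python snippet below builds download URLs for the structures named in this article and can fetch one on request. It assumes the RCSB file server pattern `https://files.rcsb.org/download/<ID>.pdb`; whether every one of these IDs still resolves at that address is not verified here. The names `HISTORIC_ENTRIES`, `pdb_url`, and `fetch_pdb` are hypothetical helpers introduced only for illustration.

```python
from urllib.request import urlopen

# PDB IDs named in this article, keyed by a short label.
HISTORIC_ENTRIES = {
    "myoglobin": "1mbn",
    "hemoglobin": "2dhb",
    "lamprey hemoglobin": "2lhb",
    "lysozyme": "1lyz",
    "carboxypeptidase": "3cpa",
    "subtilisin": "1sbt",
    "chymotrypsin": "2cha",
    "papain": "9pap",
    "pancreatic trypsin inhibitor": "4pti",
    "lactate dehydrogenase": "6ldh",
    "rubredoxin": "4rxn",
    "cytochrome b5": "1cyo",
}

# Assumed URL pattern for the RCSB file server.
BASE_URL = "https://files.rcsb.org/download"


def pdb_url(pdb_id: str) -> str:
    """Return the assumed download URL for a four-character PDB ID."""
    pdb_id = pdb_id.strip()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
    return f"{BASE_URL}/{pdb_id.upper()}.pdb"


def fetch_pdb(pdb_id: str, timeout: float = 30.0) -> str:
    """Download one coordinate file as text (requires network access)."""
    with urlopen(pdb_url(pdb_id), timeout=timeout) as response:
        return response.read().decode("ascii", errors="replace")


if __name__ == "__main__":
    # List the URL for each historic entry without downloading anything.
    for name, pdb_id in HISTORIC_ENTRIES.items():
        print(f"{name:30s} {pdb_url(pdb_id)}")
```

Calling `fetch_pdb("1mbn")` would then return the myoglobin coordinate file as a single string, provided the assumed URL pattern holds and a network connection is available.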
These early structures gave the first view of each protein, but each provides only a single snapshot. In each case, later structures fill out the biological story of how the protein works. You can find additional structures in the PDB that show the effect of mutations, ligand and inhibitor binding, motion of the protein, and other aspects of its function. For instance, you might look for structures that show features outlined in the "Topics for Further Exploration" below.
- J. C. Kendrew, G. Bodo, H. M. Dintzis, R. G. Parrish, H. Wyckoff & D. C. Phillips (1958) A three-dimensional model of the myoglobin molecule obtained by x-ray analysis. Nature 181, 662-666.
- C. C. F. Blake, D. F. Koenig, G. A. Mair, A. C. T. North, D. C. Phillips & V. R. Sarma (1965) Structure of hen egg-white lysozyme, a three-dimensional Fourier synthesis at 2 Angstroms resolution. Nature 206, 757-761.
- W. Bolton & M. F. Perutz (1970) Three dimensional Fourier synthesis of horse deoxyhaemoglobin at 2.8 Angstrom units resolution. Nature 228, 551-552.
- F. C. Bernstein, T. F. Koetzle, G. J. B. Williams, E. F. Meyer Jr., M. D. Brice, J. R. Rogers, O. Kennard, T. Shimanouchi & M. Tasumi (1977) The Protein Data Bank: a computer-based archival format for macromolecular structures. Journal of Molecular Biology 112, 535-542.
- H. M. Berman (2008) The Protein Data Bank: a historical perspective. Acta Crystallographica Section A 64, 88-95.
© 2015 David Goodsell & RCSB Protein Data Bank