Molecules of the Quarter:
Ribosome, Rubisco, and Pepsin

The PDB has continued to feature its popular "Molecule of the Month" piece. Written and drawn by David S. Goodsell, an assistant professor of molecular biology at The Scripps Research Institute in La Jolla, California, these articles provide an overview of significant milestones in the growth of the PDB's macromolecular structure data for a diverse audience. Here is a sample of the information that is presented in this feature:

Ribosome: The Elusive Protein Factory

October, 2000 -- Protein synthesis is the major task performed by living cells. For instance, roughly one third of the molecules in a typical bacterial cell are dedicated to this central task. Protein synthesis is a complex process involving many molecular machines. You can look at many of these molecules in the PDB, including DNA, DNA polymerases, and RNA polymerases; a host of repressors, DNA repair enzymes, topoisomerases, and histones; tRNA and acyl-tRNA synthetases; and molecular chaperones. This month, for the first time, you can also look at the factory of protein synthesis in atomic detail.

The ribosome has been under the scrutiny of scientists for decades. Electron microscopy has yielded an increasingly detailed view over the years, defining the overall shape of individual ribosomes and differences in this shape for ribosomes from different species. More recently, detailed electron micrograph reconstructions have examined the interaction of ribosomes with messenger RNA, transfer RNA, and the protein elongation factors. This legacy of morphological work lays the groundwork on which the atomic structures may be understood.

Ribosomes are composed of two subunits: a large subunit, shown on the right, and a small subunit, shown on the left. Of course, the term "small" is used in a relative sense here: both the large and the small subunits are huge compared to a typical protein. Both subunits are composed of long strands of RNA dotted with protein chains. When synthesizing a new protein, the two subunits lock together with a messenger RNA trapped in the space between. The ribosome then walks down the messenger RNA three nucleotides at a time, building a new protein piece-by-piece.
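The three-nucleotide stepping described above can be pictured with a short sketch. The function name and the sample message are illustrative assumptions, not part of the original feature; the code only models the reading frame, not the chemistry of the ribosome.

```python
def codons(mrna: str) -> list[str]:
    """Split a messenger RNA sequence into consecutive three-nucleotide codons."""
    # Step along the message three nucleotides at a time.
    # An incomplete codon at the end of the message is dropped.
    usable_length = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable_length, 3)]


# Hypothetical message: a start codon, one alanine codon, and a stop codon.
message = "AUGGCUUAA"
for step, codon in enumerate(codons(message), start=1):
    print(step, codon)
```

Each item in the returned list corresponds to one step of the ribosome along the message, and so to one amino acid added (or, for a stop codon, to the end of the chain).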

The structure of the large subunit is available in PDB entries 1ffk and 1fjf. The large subunit contains the active site of the ribosome: the site that creates the new peptide bonds when proteins are synthesized. This structure, along with several other structures with inhibitors bound, provides strong evidence that the ribosome is a ribozyme. Enzymes typically use amino acids to catalyze chemical reactions, but the ribosome appears to use an adenine RNA nucleotide to perform its synthetic task.

The large subunit is composed of two RNA strands. Dozens of proteins bind on the surface of the ribosome. Many have long, snaky tails that extend into the body of the ribosome, gluing the RNA strands into their proper shape. Several of the proteins were not seen in this crystallographic structure, perhaps because they are too flexible. Approximate shapes for these proteins form two prominent stalks which are commonly used as landmarks in electron micrographs.

The structure of the small subunit is available in the PDB entries 1fka and 1fjg. The small subunit is in charge of information flow during protein synthesis. It initially finds a messenger RNA strand and, after combining with a large subunit, ensures that each codon in the message is paired with the anticodon in the proper transfer RNA. The messenger RNA is thought to enter through a small hole and then extend up into the "decoding center" in the cleft between the "head" at one end and the "body" at the other. The messenger RNA does not have to thread through this hole like a needle, however, because the hole is actually formed by a loop of the ribosomal RNA, which can open like a latch to admit the messenger.

Before jumping into these structures, be prepared. Both the large subunit and the small subunit are enormous complexes with many atoms: the structure of the large subunit in PDB entry 1ffk contains over 64,000 atoms, even though the authors chose to release only alpha carbon positions for the proteins, and the small subunit structure (1fka), also with partial structures for the proteins, contains almost 35,000 atoms. Many interactive display programs become very sluggish when working on structures this large.
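Before loading a structure this large into a display program, it can help to count its atoms first. The following is a minimal sketch, assuming the entry is available as text in the classic fixed-column PDB file format; the function name and the abbreviated sample records are illustrative assumptions.

```python
def count_atoms(pdb_lines):
    """Return (total coordinate records, alpha carbons) for PDB-format lines."""
    total = 0
    alpha_carbons = 0
    for line in pdb_lines:
        # Coordinate records begin with ATOM or HETATM in columns 1-6.
        if line.startswith(("ATOM  ", "HETATM")):
            total += 1
            # The atom name occupies columns 13-16. Only ATOM records are
            # checked, so a calcium ion named CA in a HETATM record is not
            # mistaken for an alpha carbon.
            if line.startswith("ATOM  ") and line[12:16].strip() == "CA":
                alpha_carbons += 1
    return total, alpha_carbons


# Abbreviated, made-up records for illustration (coordinates omitted).
sample = [
    "ATOM      1  N   ALA A   1",
    "ATOM      2  CA  ALA A   1",
    "HETATM    3  K     K A 101",
    "REMARK  this line is not a coordinate record",
]
print(count_atoms(sample))
```

The function accepts any iterable of lines, so an open file handle for a locally downloaded entry could be passed in directly. A structure in which the alpha carbon count makes up most of the protein atoms is one where only the chain trace was deposited, as described above for 1ffk.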

The proposed active site in the large ribosomal subunit is composed of several nucleic acid bases and a potassium ion, connected by hydrogen bonds. Adenine is thought to perform the synthesis reaction. Two guanines and a potassium ion serve to activate this adenine through a series of hydrogen bonds.

Rubisco: Fixing Carbon

November, 2000 -- Carbon is essential to life. All of our molecular machines are built around a central scaffolding of organic carbon. Unfortunately, carbon in the earth and atmosphere is locked in highly oxidized forms, such as carbonate minerals and carbon dioxide gas. In order to be useful, this oxidized carbon must be "fixed" into more organic forms, rich in carbon-carbon bonds and decorated with hydrogen atoms. Powered by the energy of sunlight, plants perform this central task of carbon fixation.

Inside plant cells, the enzyme ribulose bisphosphate carboxylase/oxygenase (rubisco) forms the bridge between life and the lifeless, creating organic carbon from the inorganic carbon dioxide in the air. Rubisco takes carbon dioxide and attaches it to ribulose bisphosphate, a short sugar chain with five carbon atoms. Rubisco then clips the lengthened chain into two identical phosphoglycerate pieces, each with three carbon atoms. Phosphoglycerate is a familiar molecule in the cell, and many pathways are available to use it. Most of the phosphoglycerate made by rubisco is recycled to build more ribulose bisphosphate, which is needed to feed the carbon-fixing cycle. But one out of every six molecules is skimmed off and used to make sucrose (table sugar) to feed the rest of the plant, or stored away in the form of starch for later use.
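The carbon bookkeeping in this paragraph can be checked with a few lines of arithmetic. This is a sketch that counts carbon atoms only, using the numbers given above; it ignores phosphate groups, energy input, and the intermediate steps of the cycle, and the function and variable names are illustrative assumptions.

```python
RUBP_CARBONS = 5  # ribulose bisphosphate: five carbon atoms
CO2_CARBONS = 1   # carbon dioxide: one carbon atom
PGA_CARBONS = 3   # phosphoglycerate: three carbon atoms


def carbon_tally(co2_fixed: int) -> dict:
    """Tally carbons for a number of fixation reactions (use a multiple of 3)."""
    pga_made = 2 * co2_fixed  # each reaction yields two phosphoglycerates
    carbons_in = co2_fixed * (RUBP_CARBONS + CO2_CARBONS)
    carbons_out = pga_made * PGA_CARBONS
    skimmed = pga_made // 6   # one of every six molecules leaves the cycle
    recycled = pga_made - skimmed
    # Recycled carbon is regrouped into five-carbon ribulose bisphosphate.
    rubp_rebuilt = recycled * PGA_CARBONS // RUBP_CARBONS
    return {
        "carbons_in": carbons_in,
        "carbons_out": carbons_out,
        "pga_made": pga_made,
        "skimmed": skimmed,
        "rubp_rebuilt": rubp_rebuilt,
    }


print(carbon_tally(3))
```

For three fixation reactions, 18 carbons go in and 18 come out as six phosphoglycerates. Recycling five of them (15 carbons) rebuilds exactly the three ribulose bisphosphate molecules that were consumed, and the one skimmed off carries the three carbons gained from carbon dioxide, which is why the ratio is one in six.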

In spite of its central role, rubisco is remarkably inefficient. As enzymes go, it is painfully slow. Typical enzymes can process a thousand molecules per second, but rubisco fixes only about three carbon dioxide molecules per second. Plant cells compensate for this slow rate by building lots of the enzyme. Chloroplasts are filled with rubisco, which makes up half of their protein. This makes rubisco the most plentiful single enzyme on the Earth.

Rubisco also shows an embarrassing lack of specificity. Unfortunately, oxygen molecules and carbon dioxide molecules are similar in shape and chemical properties. In proteins that bind oxygen, like myoglobin, carbon dioxide is easily excluded because carbon dioxide is slightly larger. But in rubisco, an oxygen molecule can bind comfortably in the site designed to bind to carbon dioxide. Rubisco then attaches the oxygen to the sugar chain, forming a faulty oxygenated product. The plant cell must then perform a costly series of salvage reactions to correct the mistake.

Plants and algae build a large, complex form of rubisco, composed of eight copies of a large protein chain and eight copies of a smaller chain. The protein in the PDB entry 1rcx is taken from spinach leaves. The tobacco enzyme may be found in 1rlc. Many enzymes form similar symmetrical complexes. Often, the interactions between the different chains are used to regulate the activity of the enzyme in the process known as allostery. Rubisco, however, seems to be rigid as a rock, with each of the active sites acting independently of one another. In fact, photosynthetic bacteria build a smaller rubisco (shown in PDB entry 9rub) composed of only two chains, which performs its catalytic task just as well. So, why do plants build a large complex? The answer might lie in the crowded conditions under which rubisco performs its job. By packing many chains together into a tight complex, the protein reduces the surface that must be wetted by the surrounding water. This allows more protein chains, and thus more active sites, to be packed into the same space.

The active site of rubisco is arranged around a magnesium ion. The magnesium ion is held tightly by three amino acids, including a surprising modified form of lysine. An extra carbon dioxide molecule is attached firmly to the end of the snaky lysine sidechain. In plant cells, this "activator" carbon dioxide, which is different from the carbon dioxide molecules that are fixed in the reaction, is attached to rubisco during the day, turning the enzyme "on," and removed at night, turning the enzyme "off." The exposed side of the magnesium ion is then free to bind to both ribulose bisphosphate, holding onto two oxygen atoms, and the carbon dioxide molecule that will be attached to sugar. In the PDB entry 8ruc structure, the carbon dioxide is already attached to the sugar. You will find that this structure includes only one half of the entire rubisco complex--if you are interested in looking at the whole rubisco molecule, the structure in 1rcx contains all sixteen chains.

Pepsin: A Piece of Scientific History

December, 2000 -- During the holiday season, we often place greater demands on our digestive enzymes than at other times of the year. Our digestive system contains a host of tough, stable enzymes designed to seek out those rich holiday treats and break them into small pieces. Pepsin is the first in a series of enzymes that digest proteins. In the stomach, protein chains bind in the deep active site groove of pepsin, and are broken into smaller pieces. Then, a variety of proteases and peptidases in the intestine finish the job. The small fragments--amino acids and dipeptides--are then absorbed by cells for use as metabolic fuel or construction of new proteins.

Enzymes that digest proteins pose a real challenge. The enzyme must be constructed inside the cell, but controlled in some manner so that it doesn't immediately start digesting the cell's own proteins. To solve this problem, pepsin and many other protein-cutting enzymes are created as inactive "proenzymes," which may then be activated once safely outside the cell. Pepsin is constructed with an extra 44 amino acids which block the large active site groove and hobble the enzyme. In the stomach, this extraneous chain is clipped off and the enzyme begins its destructive campaign.

For several reasons, digestive enzymes are attractive candidates for scientific study. They are easily isolated and present in large amounts in digestive juices. They are also extraordinarily stable, because they perform their jobs under the harsh conditions present in the digestive system. The reactions catalyzed by digestive enzymes are also easily followed: you can add them to a protein such as gelatin and watch it lose its gel-like consistency. In the 19th century, pepsin was the first enzyme to be discovered, and later, pepsin was the second enzyme to be crystallized (after urease). These crystals played an important role in showing that enzymes were proteins and that they had a defined structure. Today, the structure of pepsin, determined from similar crystals, is available in PDB entry 5pep and several others.

Pepsin is one example of a group of enzymes termed "acid proteases." In the case of pepsin, this name is doubly appropriate. Pepsin works best in strong hydrochloric acid. But the similarity with other enzymes refers to a second type of acid. The active site of the acid proteases relies on two acidic aspartate amino acids, which activate a water molecule and use it to cleave protein chains.

The acid proteases have evolved to fill several functional roles in different organisms. Pepsin (PDB entry 5pep) is optimized for digestion of food in the acidic environment of the stomach. It is very promiscuous, cleaving proteins in many different places. Chymosin (PDB entry 4cms) is made by young calves to break down milk proteins. A purified form of chymosin, taken from calf stomach, has been used for centuries to curdle milk in the production of cheese. Cathepsin D (PDB entry 1lyb) digests proteins inside lysosomes, the tiny stomachs inside cells. Other cellular acid proteases, such as renin (PDB entry 1hrn), are designed to make very specific cuts in one particular protein, aiding in the maturation of a hormone or structural protein. Endothiapepsin (PDB entry 4ape) is made by a fungus and excreted into the surrounding environment, breaking up the surrounding proteins and allowing the fungus to feed on the pieces.

Pepsin uses a pair of aspartate residues to perform the protein cleavage reaction. In an example of parallel evolution (where two organisms independently develop the same method for solving a problem), the mechanism is similar to that used by HIV protease, discussed in a previous Molecule of the Month.