What's New in this Release (March 05, 2013)
Protein Stoichiometry and Symmetry
Drugs and Drug Targets in PDB
The Biologically Interesting Molecule Reference Dictionary (BIRD)
BIRD stands for the wwPDB's Biologically Interesting molecule Reference Dictionary, and contains information about the representation of peptide-like antibiotic and inhibitor molecules in the PDB archive.
BIRD contains chemical descriptions, sequence and linkage information, and functional and classification information as taken from the core structures and from external resources. PDB entries containing these molecules have been annotated using this dictionary, and will contain a corresponding BIRD ID code only in the PDBx-formatted file.
BIRD has enabled the RCSB PDB to offer improved searching and visualization options for these molecules.
Top Bar Search
Biologically interesting molecules from BIRD can be searched by typing a name (vancomycin), a BIRD ID (PRD_000204), type (glycopeptide), or class (antibiotic) in the top search bar. Suggestions will appear under the BIRD Molecules category.
Biologically interesting molecules from BIRD can also be searched from the Advanced Search menu. Search by text/name, BIRD type (structural classification of the entity), or BIRD class (broad definition of the entity function).
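As a rough illustration, a script that pre-sorts queries might first check whether a search term looks like a BIRD ID. The sketch below assumes that BIRD IDs follow the pattern of the example given for the top search bar (PRD_ followed by six digits). The function name looks_like_bird_id is hypothetical and is not part of any RCSB PDB interface.

```python
import re

# Assumption: BIRD IDs look like the example in the text, "PRD_" plus six digits.
BIRD_ID_PATTERN = re.compile(r"PRD_\d{6}")


def looks_like_bird_id(query: str) -> bool:
    """Return True if the whole query matches the assumed BIRD ID pattern."""
    return BIRD_ID_PATTERN.fullmatch(query.strip().upper()) is not None


print(looks_like_bird_id("PRD_000204"))
print(looks_like_bird_id("vancomycin"))
```

Anything that does not match the pattern would be treated as a name, type, or class term.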
View BIRD Annotations
The BIRD widget on an entry's Structure Summary page will display the ID, image, name, type, class, and chain location for any such molecules in the entry. Ligand Explorer can be launched to view the molecule and binding site in 3D.
As with other Structure Summary page features, the examples displayed (ID, name, type, and class) can be used to find other PDB entries with the same characteristics.
Visualization of BIRD Entries
Ligand Explorer can be launched from the BIRD Widget to visualize these molecules and their binding sites. By default, Ligand Explorer zooms in on the active ligand used to launch the program. The program can center on other molecules by clicking on a name or identifier from the left hand menu. Intermolecular interactions such as hydrogen bonds or hydrophobic interactions and a binding site surface can be turned on for the active ligand.
In Protein Workshop, BIRD molecules are listed in the right hand menu and can be manipulated as with any other macromolecule chain or ligand. Protein Workshop can be launched from any entry's Structure Summary page.
Protein Stoichiometry and Symmetry
The stoichiometry of a protein complex represents the composition of its subunits. For example, the biological assembly of hemoglobin has two alpha and two beta subunits, represented by the stoichiometry formula A2B2. In some cases there is minor heterogeneity among subunits caused by posttranslational modifications, point mutations, or micro-heterogeneity. In the stoichiometry calculation, minor differences among subunits are ignored: two protein chains are considered identical if their sequence identity is >= 95% over 90% of the sequence length. Protein chains with fewer than 24 residues, nucleic acid chains, and ligands are ignored.
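The identity criterion and the formula notation can be sketched in a few lines. This is a minimal illustration, not the RCSB PDB implementation. It assumes that "over 90% of the sequence length" means an alignment coverage of at least 90%, that letters are assigned to subunit groups in order of decreasing size, and that a count of 1 is left implicit. The function names are hypothetical.

```python
from string import ascii_uppercase

MIN_PROTEIN_LENGTH = 24  # chains shorter than this are ignored (from the text)


def is_counted(sequence: str) -> bool:
    """Protein chains with fewer than 24 residues are left out."""
    return len(sequence) >= MIN_PROTEIN_LENGTH


def same_subunit(identity: float, coverage: float) -> bool:
    """Criterion from the text: >= 95% identity over 90% of the length.

    Assumption: "over 90%" is read as coverage >= 0.90.
    """
    return identity >= 0.95 and coverage >= 0.90


def stoichiometry_formula(group_sizes) -> str:
    """Turn subunit group sizes into a formula such as A2B2.

    Assumption: the largest group gets the letter A, and a count of 1 is
    not written out. Only the first 26 groups receive a letter.
    """
    sizes = sorted(group_sizes, reverse=True)
    parts = []
    for letter, count in zip(ascii_uppercase, sizes):
        parts.append(letter if count == 1 else f"{letter}{count}")
    return "".join(parts)


print(stoichiometry_formula([2, 2]))  # hemoglobin: two alpha, two beta
print(stoichiometry_formula([5]))     # a homo-pentamer
```

For the hemoglobin assembly described above, the two groups of size two give A2B2.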
Symmetry refers to the point group symmetry of a protein complex. While a single protein chain with L-amino acids cannot be symmetric (point group C1), protein complexes with quaternary structure can have rotational symmetry belonging to the point groups: cyclic (Cn), dihedral (Dn), tetrahedral (T), octahedral (O), or icosahedral (I). Complexes are considered symmetric if all rotated identical subunits, generated by the symmetry operations of the point group, superpose to within 5 Å of the original structure. The identity of subunits is based on the same criteria as described for protein stoichiometry.
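The superposition test can be illustrated with a deliberately simplified sketch for cyclic symmetry. It makes several assumptions that are not stated in the text: each subunit is reduced to a single centroid, the candidate axis is the z-axis through the origin, and the 5 Å limit is applied to centroid distances rather than to a full structural superposition. The function names are hypothetical.

```python
import math


def rotate_about_z(point, angle):
    """Rotate a 3D point about the z-axis by angle (radians)."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)


def looks_cn_symmetric(centroids, n, tolerance=5.0):
    """Check whether a rotation by 360/n degrees about z maps every subunit
    centroid onto some original centroid within the tolerance (in Å).

    Simplification: centroids stand in for whole subunits, and the axis
    is assumed to be known in advance.
    """
    angle = 2 * math.pi / n
    for point in centroids:
        rotated = rotate_about_z(point, angle)
        nearest = min(math.dist(rotated, other) for other in centroids)
        if nearest > tolerance:
            return False
    return True


# Five idealized subunits on a ring of radius 20 Å around the z-axis.
ring = [rotate_about_z((20.0, 0.0, 0.0), 2 * math.pi * k / 5) for k in range(5)]
print(looks_cn_symmetric(ring, 5))
print(looks_cn_symmetric(ring, 4))
```

The idealized ring passes the 5-fold test and fails the 4-fold test, because a 90° rotation leaves each centroid more than 5 Å from its nearest neighbor.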
Pseudostoichiometry and Pseudosymmetry
By default, a 95% sequence identity threshold is used for the stoichiometry and symmetry assignments. In addition, these properties are calculated at 30% sequence identity. If we consider hemoglobin again, at the 95% sequence identity threshold the alpha and beta subunits are considered different, which corresponds to an A2B2 stoichiometry and a C2 point group. At the 30% sequence identity level, all four chains would be considered homologous (~45% sequence identity) with an A4 pseudostoichiometry and D2 pseudosymmetry. The word pseudo indicates that the stoichiometry and symmetry are approximate.
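The effect of the threshold on the hemoglobin example can be sketched as follows. The greedy clustering used here is a stand-in chosen for brevity and is not necessarily how the site groups chains. The identity values are illustrative: 1.0 within a subunit type and 0.45 between alpha and beta, following the approximate figure quoted above. The chain labels are placeholders.

```python
def cluster_chains(chains, identity, threshold):
    """Greedy clustering: a chain joins the first cluster whose
    representative (first member) it matches at or above the threshold."""
    clusters = []
    for chain in chains:
        for cluster in clusters:
            if identity(chain, cluster[0]) >= threshold:
                cluster.append(chain)
                break
        else:
            clusters.append([chain])
    return clusters


# Illustrative assignment: two alpha chains and two beta chains.
SUBUNIT_TYPE = {"A": "alpha", "B": "beta", "C": "alpha", "D": "beta"}


def hemoglobin_identity(a, b):
    """Illustrative identities: 1.0 within a type, ~0.45 across types."""
    return 1.0 if SUBUNIT_TYPE[a] == SUBUNIT_TYPE[b] else 0.45


sizes_95 = [len(c) for c in cluster_chains("ABCD", hemoglobin_identity, 0.95)]
sizes_30 = [len(c) for c in cluster_chains("ABCD", hemoglobin_identity, 0.30)]
print(sizes_95)  # two groups of two chains, i.e. A2B2
print(sizes_30)  # one group of four chains, i.e. A4
```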
Split entries (entries divided between multiple coordinate files due to the limitations of the PDB file format) are currently excluded from the protein stoichiometry and protein symmetry features.
Distribution drill-downs for protein stoichiometry and symmetry (at the 95% sequence identity threshold) have been added to the Explore Archive widget on the home page. The stoichiometry and symmetry information used in the drill-downs comes from the first biological assembly listed in the entry, if available; otherwise, the assembly is used as it appears in the entry.
The drill-downs can be applied successively. For example, to find C5-symmetric homo-pentameric human proteins, one would use the following sequence of drill-downs:
Step 1: Protein Symmetry Cyclic
Step 2: Protein Symmetry C5
Step 3: Protein Stoichiometry Homomer
Step 4: Protein Stoichiometry A5
Step 5: Organism Homo sapiens
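Successive drill-downs behave like a chain of filters, each applied to the result of the previous one. The sketch below mirrors the five steps above on a handful of hand-written, hypothetical records. The field names and entry identifiers are invented for illustration and do not correspond to real PDB entries or to any RCSB PDB data format.

```python
# Hypothetical records for illustration only; not real PDB entries.
entries = [
    {"id": "entry-1", "symmetry": "C5", "stoichiometry": "A5", "organism": "Homo sapiens"},
    {"id": "entry-2", "symmetry": "C5", "stoichiometry": "A5", "organism": "Mus musculus"},
    {"id": "entry-3", "symmetry": "D2", "stoichiometry": "A4", "organism": "Homo sapiens"},
    {"id": "entry-4", "symmetry": "C2", "stoichiometry": "A2B2", "organism": "Homo sapiens"},
]


def is_homomer(formula):
    """A homomer formula uses a single subunit letter, e.g. A5."""
    return {ch for ch in formula if ch.isalpha()} == {"A"}


# One predicate per drill-down step, in the order listed above.
steps = [
    lambda r: r["symmetry"].startswith("C"),    # Step 1: cyclic
    lambda r: r["symmetry"] == "C5",            # Step 2: C5
    lambda r: is_homomer(r["stoichiometry"]),   # Step 3: homomer
    lambda r: r["stoichiometry"] == "A5",       # Step 4: A5
    lambda r: r["organism"] == "Homo sapiens",  # Step 5: organism
]

result = entries
for keep in steps:
    result = [record for record in result if keep(record)]

print([record["id"] for record in result])
```

Each step can only narrow the set. For an exact-match chain like this one the order of the steps does not change the final result, only the intermediate counts.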
Drill-downs, including options for stoichiometry or symmetry, can also be used to further refine search results from the query result browser page.
Advanced Search Options for Stoichiometry and Symmetry
Protein stoichiometry and symmetry search functionality has been added to Advanced Search. By default, these searches use a 95% sequence identity threshold. Alternatively, a 30% sequence identity threshold can be specified to query for entries with either pseudostoichiometry or pseudosymmetry. These search options can be combined with other types of Advanced Searches to find structures of interest.
Visualizing Protein Symmetry in Jmol
Protein symmetry can be viewed in 3D using Jmol (select the "View in 3D" link on an entry's Structure Summary page). Protein symmetry is calculated for all entries containing at least one protein chain, including asymmetric units and all biological assemblies (except for entries split among several PDB files due to their size). The Jmol page for each asymmetric unit or biological assembly is accessible from the "View in 3D" link on every structure summary page.
To facilitate the exploration of symmetry, several options are available:
Protein complexes are aligned along the highest-order symmetry axis or, for asymmetric cases, along the principal axes of inertia. Several default orientations of the structure can be toggled using the < and > buttons. The default orientations are canonical views: sides and back, and along unique n-fold symmetry axes.
View PDB ID 3EAM in Jmol
Symmetry polyhedra and axes
Polyhedra and symmetry axes can be displayed to facilitate symmetry analysis. A complex is enclosed in a polyhedron that matches its symmetry. All symmetry axes and their symbols representing the fold (dyad for 2-fold, triad for 3-fold axis, or in general a polygon for n-fold axis) can be displayed. For asymmetric cases, the 3 axes of inertia are displayed. Polyhedron and Axes can be toggled on/off using the check boxes in the right panel.
View PDB ID 1AEW in Jmol
Structures can be colored by symmetry (see the list of color schemes used), sequence (subunits with >= 95% sequence identity shown in the same color), or by subunit. Two examples of structures colored by symmetry are shown below.
Example 1: For Cn symmetry (see example on left), the color scheme starts at the 12 o'clock position, and the color gradient (light to dark) increases in a clockwise direction. The polyhedron, a pentagonal prism, is displayed in a color complementary to the symmetry color scheme. The principal rotation axis is rendered in red with a pentagon representing the 5-fold rotation.
View PDB ID 3EAM in Jmol
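The clockwise ordering described in Example 1 can be sketched as follows. This is not the Jmol code used on the site. It assumes the complex is viewed down the rotation axis with x pointing right and y pointing up, so that 12 o'clock is the +y direction, and it uses arbitrary lightness values for the gradient. All names are hypothetical.

```python
import math


def clockwise_order(centroids_xy):
    """Return subunit indices ordered clockwise, starting at 12 o'clock (+y)."""
    def clock_angle(index):
        x, y = centroids_xy[index]
        # atan2(x, y) measures the angle from +y and grows clockwise.
        return math.atan2(x, y) % (2 * math.pi)
    return sorted(range(len(centroids_xy)), key=clock_angle)


def gradient_shades(n, lightest=0.9, darkest=0.3):
    """Evenly spaced lightness values running from light to dark."""
    if n == 1:
        return [lightest]
    step = (lightest - darkest) / (n - 1)
    return [lightest - k * step for k in range(n)]


# Four subunits at 9, 6, 3, and 12 o'clock, listed in that order.
positions = [(-1, 0), (0, -1), (1, 0), (0, 1)]
order = clockwise_order(positions)
shades = gradient_shades(len(positions))
for rank, index in enumerate(order):
    print(index, round(shades[rank], 2))
```

The subunit at 12 o'clock receives the lightest shade, and the shades darken in clockwise order.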
Example 2: In all cubic systems (T, O, I), different layers of the subunit are colored along a gradient (light to dark) from the plus and minus z-axis towards the origin (see example on left). The 4-fold axes are rendered in red, the 3-fold axes in green, and the 2-fold axes in blue. Using the toggle option underneath the Jmol applet, 3 views along these 3 different axes are available.
View PDB ID 1SHS in Jmol
Setting the sequence identity threshold
The display of symmetry depends on the sequence identity threshold. Either a 95% (default, far left) or a 30% (left) sequence identity threshold can be selected. At the lower identity level, some complexes may have pseudosymmetry, which is demonstrated with the hemoglobin example in the images to the left.
View PDB ID 4HHB in Jmol
Improved Visualization
Scripting Options
We have redesigned the Jmol page. The new layout should provide a more comprehensive display of all the options on this page.
Domains Tab
All available domain assignments for a protein structure entry are listed on the Domains Tab. Each domain name is clickable, and selecting one highlights the corresponding region of the 3D structure in the Jmol view above.
Ligands Tab
The Ligands Tab provides a new way to quickly select and highlight ligand residues in 3D.
Ligand Explorer With Binding Site Surfaces
The option to display binding site surfaces has been added to Ligand Explorer. Surfaces are created using the algorithm from D. Xu and Y. Zhang (2009) Generating Triangulated Macromolecular Surfaces by Euclidean Distance Transform. PLoS ONE 4(12): e8140. An edge-smoothing algorithm is applied to generate smooth surface edges.
Biotin (PDB ID 1STP), displayed with the default options of a solid surface, colored by hydrophobicity, with a distance threshold of 6.5 Å.
View biotin (PDB ID 1STP) in Ligand Explorer
The same example shown using a blue mesh surface.
Access Drugs and Drug Targets in the PDB
Stereoisomers of Drug Molecules
Drug and drug target data from DrugBank (www.drugbank.ca) were integrated with the RCSB PDB website in the last release. The integration was based on matching the stereochemistry and ionization state. This integration has been extended to include stereoisomer matches in this release.
Searches for a given drug name will return the different stereoisomers of the drug. For example, Ibuprofen is associated with several entries from the Chemical Component Dictionary that are stereoisomers of one another.
The example shows the auto-complete suggestions when searching Ibuprofen.
Drug Targets in the PDB
A Drug Info widget on the Ligand Summary pages lists the corresponding data from DrugBank (when available). The widget contains DrugBank ID, drug name, groups, brand name, and more, with links to the corresponding data at DrugBank.
Clicking on the drug target name will launch a sequence query using the drug target sequence, with the E-value cutoff set to 0.000001.
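The effect of the cutoff can be shown with a small filter over search hits. The hit identifiers and E-values below are invented for illustration, and treating the cutoff as inclusive is an assumption.

```python
EVALUE_CUTOFF = 1e-6  # 0.000001, the cutoff stated in the text


def significant_hits(hits, cutoff=EVALUE_CUTOFF):
    """Keep hits whose E-value is at or below the cutoff, best hit first.

    Assumption: the cutoff is inclusive.
    """
    kept = [hit for hit in hits if hit[1] <= cutoff]
    return sorted(kept, key=lambda hit: hit[1])


# Hypothetical (identifier, E-value) pairs; not real search results.
hits = [("hit-2", 2e-7), ("hit-3", 5e-4), ("hit-1", 3e-40)]
print(significant_hits(hits))
```

Only the two hits with E-values below one in a million remain, so weak, likely coincidental matches are dropped.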
Browse Drugs by ATC Classification
The Anatomical Therapeutic Chemical (ATC) Classification System is used for drug classification. It is controlled by the WHO Collaborating Centre for Drug Statistics Methodology.
The RCSB PDB database can now be browsed using the ATC system. Select the ATC tab from the Browse Database interface to navigate through the drug classification hierarchy, view the number of associated PDB structures, and access the related entries.
Top Bar Search
The top bar search has been given a cleaner and simpler look and feel. The options on top determine the scope of the search, i.e. whether to include all fields ("Everything") or to restrict the search to author names, macromolecule names, sequences, or chemical components ("Ligand"). After entering text, the same autocomplete suggestions are available as before. The Search History and Previous Results are stored only for the currently active session. Links to Advanced Search and the Browse Database interface are now available on the left.
The navigation of the PDB-101 site has new tabs for easier access to Educational Materials, Molecule of the Month, Understanding PDB Data, and Author Profiles. A "PDB-101 News" section has been added.
Miscellaneous Features
Search by Gene Name
A new auto-suggest option based on gene names from UniProt opens the Protein Feature View page, which can be used to investigate which PDB entries are related to a gene name.
SCOP Track in Protein Feature View
The Protein Feature View now displays SCOP domain assignments.