What's New in this Release (May 01, 2012)

Domain-based Structural Alignments
The 3D Similarity tab provides pre-calculated systematic structure comparisons for all PDB proteins. The new version of this feature provides domain-based protein structure alignments instead of chain-based alignments.

The procedure to calculate the domain-split representative is an extension of our sequence clustering approach. In order to remove redundancy, we start with a 40% sequence identity clustering procedure. A representative chain is taken from each sequence cluster. In cases where the representative chain comprises multiple domains, each of those domains is included in database searches. If available, the domain assignment provided by SCOP 1.75 is used. Otherwise algorithmic domain assignments are computed using the ProteinDomainParser software.
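The selection logic described above can be sketched as follows. This is only an illustration under stated assumptions: the `Chain` class, the `fallback_domains` stand-in, and the convention that the first member of each cluster is its representative are all hypothetical, and the real pipeline relies on the 40% identity clustering, SCOP 1.75, and ProteinDomainParser rather than on anything shown here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Range = Tuple[int, int]  # (first residue, last residue) of a domain


@dataclass
class Chain:
    chain_id: str
    length: int
    # Domain ranges from a curated source such as SCOP; empty if none exist.
    scop_domains: List[Range] = field(default_factory=list)


def fallback_domains(chain: Chain) -> List[Range]:
    # Hypothetical stand-in for an algorithmic domain parser:
    # it simply treats the whole chain as a single domain.
    return [(1, chain.length)]


def domain_split_representatives(
    clusters: Dict[str, List[Chain]]
) -> List[Tuple[str, Range]]:
    """Return one (chain_id, domain_range) search unit per domain.

    `clusters` maps a sequence-cluster id to its member chains; by
    assumption the first member is the cluster representative.
    """
    search_units = []
    for members in clusters.values():
        representative = members[0]
        # Prefer the curated assignment, otherwise fall back to the parser.
        domains = representative.scop_domains or fallback_domains(representative)
        for domain in domains:
            search_units.append((representative.chain_id, domain))
    return search_units
```

A two-domain representative therefore contributes two search units, while a representative without a curated assignment contributes whatever the fallback parser returns.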

Example: Try the Cyclodextrin glycosyl transferase 3BMV


Highlighting of Search Terms in Results

Keyword and Macromolecule name searches, which can be performed by clicking the "search" button on the top-bar search without specifically selecting an auto-complete suggestion, now highlight the matching sentence within the text of the PDB entry.

Also highlighted by each search is the name of the data field (title, citation, molecule name, remark, etc.) in which the search terms can be found. For example, searching for the terms "mutant p53" on the top-bar search reveals that some of the results match structure titles, while others match REMARK 900 records (related entries) or words within the keywords section.


SMILES String Search From Top Bar

Typing a SMILES string (which encodes a chemical structure) in the top-bar search provides auto-completion options for a chemical search.

For example, typing the SMILES string Clc1ccccc1 (chlorobenzene) into the top-bar search provides the user with options to perform substructure, exact structure, or similar structure searches.

Selecting the exact structure search option redirects the user to the Ligand Summary for 8CL, which is the wwPDB Chemical Component Dictionary code for chlorobenzene.

Link Records Search

The "Link Records" search finds structures containing interresidue connectivity (LINK records) that cannot be inferred from the primary structure.

When searching, it is possible to specify the type of connectivity (covalent bond, disulfide bridge, etc.), the three-letter codes for one or both of the linked residues, and/or the atom names of the linked atoms.

For example, a user can find any structure which contains coordination of an alanine (ALA) carbonyl oxygen (O) to calcium (CA).
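To make the search criteria concrete, here is a minimal sketch of filtering already-parsed link records by connectivity type, residue codes, and atom names. The `Link` class, its field names, and the `find_links` function are assumptions made for illustration; they do not describe how the site implements the search.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Link:
    pdb_id: str
    res1: str   # three-letter code of the first residue
    atom1: str  # atom name in the first residue
    res2: str
    atom2: str
    kind: str   # connectivity type label (hypothetical vocabulary)


def _end_matches(res: str, atom: str,
                 want_res: Optional[str], want_atom: Optional[str]) -> bool:
    # A criterion left as None acts as a wildcard.
    return (want_res is None or res == want_res) and \
           (want_atom is None or atom == want_atom)


def find_links(links: Iterable[Link], kind: Optional[str] = None,
               res_a: Optional[str] = None, atom_a: Optional[str] = None,
               res_b: Optional[str] = None, atom_b: Optional[str] = None) -> List[Link]:
    """Return links matching the criteria, regardless of partner order."""
    hits = []
    for link in links:
        if kind is not None and link.kind != kind:
            continue
        forward = (_end_matches(link.res1, link.atom1, res_a, atom_a)
                   and _end_matches(link.res2, link.atom2, res_b, atom_b))
        reverse = (_end_matches(link.res2, link.atom2, res_a, atom_a)
                   and _end_matches(link.res1, link.atom1, res_b, atom_b))
        if forward or reverse:
            hits.append(link)
    return hits
```

With this sketch, the alanine example corresponds to `find_links(records, res_a="ALA", atom_a="O", res_b="CA")`. Both partner orders are checked because the two ends of a link may be recorded in either order.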


Obsolete Ligands

There are several cases where ligand entries have been withdrawn from the wwPDB Chemical Component Dictionary and marked with the status "obsolete." Some are incorrectly defined chemical structures, but most are redundant ligands that have been superseded by an identical ligand with a different three-letter code.

For normal searches of the PDB, obsolete ligands are excluded. However, because literature references may cite obsolete ligand codes, we now provide an Obsolete Ligand search function that lets the user find information on both obsolete ligands and their superseding counterparts, simply by entering the three-letter code in the top-bar search.

For example, the obsolete ligand 0AC has been superseded by the identical ligand 1CR.
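Conceptually, this lookup behaves like a mapping from obsolete codes to their replacements. The sketch below is a hypothetical illustration only: the `SUPERSEDED_BY` table and the `resolve_ligand` function are invented names, the table holds just the one pair mentioned above, and obsolete ligands that were withdrawn without a replacement are not modeled.

```python
# Hypothetical table of obsolete code -> superseding code.
# Only the pair named in the text is included.
SUPERSEDED_BY = {"0AC": "1CR"}


def resolve_ligand(code, superseded_by=SUPERSEDED_BY):
    """Report whether a three-letter code is obsolete and what replaces it."""
    code = code.strip().upper()
    replacement = superseded_by.get(code)
    return {
        "query": code,
        "obsolete": replacement is not None,
        # For a current ligand, the code maps to itself.
        "current": replacement if replacement is not None else code,
    }


print(resolve_ligand("0ac"))
```

The input is normalized to upper case so that a code copied from the literature in lower case still resolves.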

Improved UniProt ID and EC Number Mapping

The UniProt IDs and EC numbers displayed on the structure summary page are checked weekly for consistency with the UniProt and IUBMB web sites.

Information from related UniProt entries is used, and simple improvements, such as the forwarding of obsolete EC numbers, are performed automatically.
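Forwarding an obsolete EC number amounts to following a chain of "transferred to" entries until a current number is reached. The sketch below assumes a hypothetical `transfers` dictionary; the EC numbers in the example are made-up placeholders, not real enzyme classifications, and the function name is invented for illustration.

```python
def forward_ec(ec, transfers):
    """Follow 'transferred to' entries until a non-obsolete number is found.

    `transfers` maps an obsolete EC number to the number that replaced it.
    Raises ValueError if the table contains a cycle.
    """
    seen = set()
    while ec in transfers:
        if ec in seen:
            raise ValueError("cycle in EC transfer table at " + ec)
        seen.add(ec)
        ec = transfers[ec]
    return ec


# Placeholder numbers for illustration only (not real EC entries).
example_transfers = {"9.9.9.1": "9.9.9.2", "9.9.9.2": "9.9.9.3"}
print(forward_ec("9.9.9.1", example_transfers))
```

Tracking the numbers already visited guards against an endless loop if the transfer table is ever inconsistent.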

This results in better coverage, quality, and consistency in the cross-referencing applied throughout the site: on structure pages, in the relevant hierarchical classification browsers, in the relevant searches and associated functionality (such as drill-down charts and auto-completion suggestions), and in the relevant reports and associated web services.

Weekly Updated Pfam Mapping

We now run HMMER3 scans on a weekly basis to annotate newly released PDB chains with Pfam domain information. The data are available for download via the RESTful web API.

Improved Ligand Summary Report
Ligand Summary Reports can be generated for query result sets. To generate a report, select the Ligand Hits tab followed by the summary report option from the Generate Reports pull-down menu.

These reports include information about the selected ligands such as formula, molecular weight, name, SMILES string, which PDB entries are related to the ligand, and how they are related.

For each ligand included in the report, a sub-table can be selected to show lists of all related PDB entries that contain the ligand, the entries that contain the ligand as a free ligand, and entries that contain the ligand as part of a larger, polymeric ligand.

These reports now list the PDB IDs associated with a ligand, rather than just including the number of related entries. To display this sub-table, select the triangle shown next to the Ligand ID. See a report example with ligand ID 017's sub-table displayed.

The sub-table limits the display to 15 PDB IDs in each column. For the ligands associated with more than 15 PDB IDs, the ... [more] link will launch a query for that set of structures.
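The display rule for each column can be sketched in a few lines. The function name, the constant, and the use of a plain string for the link are assumptions for illustration; the actual report renders a hyperlink that launches a query.

```python
MAX_SHOWN = 15  # per-column display limit stated in the text


def render_column(pdb_ids, max_shown=MAX_SHOWN):
    """Return the entries to display for one sub-table column.

    At most `max_shown` IDs are listed; if the ligand has more,
    a trailing "... [more]" marker stands in for the query link.
    """
    shown = list(pdb_ids[:max_shown])
    if len(pdb_ids) > max_shown:
        shown.append("... [more]")
    return shown
```

A column with 15 or fewer IDs is shown in full, so the marker appears only when something has actually been left out.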

The screenshot shows a Ligand Summary Report with 12 ligand IDs, with the sub-table displayed for ligand ID 017. The large red arrow points to the query result page that appears after clicking the ... [more] link. The result page is a query of "Chemical ID 017 and Polymeric Type Any."

As part of the tabular report system, the Ligand Summary Report can be exported in three formats: Excel 97-2003, Excel 2007 or newer versions, and CSV.

To support various data analyses on the exported file, the three PDB ID lists with complete PDB IDs are always included in the exported file as additional columns. Due to Excel's cell size limitation, we suggest that the user export the file in CSV format if some ligands in the report are associated with more than 6500 PDB IDs.
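The 6500 figure is consistent with a simple back-of-the-envelope check, sketched below. The text does not say which Excel limit is meant; this sketch assumes it is the 32,767-character limit on a single cell, that PDB IDs are four characters long, and that IDs in a cell are joined by a one-character separator. The function names are hypothetical.

```python
EXCEL_CELL_CHAR_LIMIT = 32767  # maximum characters Excel stores in one cell


def joined_length(n_ids, id_len=4, sep_len=1):
    """Characters needed to store n_ids IDs joined by a separator."""
    if n_ids == 0:
        return 0
    return n_ids * id_len + (n_ids - 1) * sep_len


def max_ids_per_cell(limit=EXCEL_CELL_CHAR_LIMIT, id_len=4, sep_len=1):
    """Largest n with n*id_len + (n-1)*sep_len <= limit."""
    return (limit + sep_len) // (id_len + sep_len)


def suggest_format(id_counts, limit=EXCEL_CELL_CHAR_LIMIT):
    """Suggest 'CSV' if any ligand's ID list would overflow one cell."""
    if any(joined_length(n) > limit for n in id_counts):
        return "CSV"
    return "Excel"
```

Under these assumptions, 6500 IDs need 32,499 characters and still fit, while the hard ceiling is 6553 IDs, so 6500 works as a round, slightly conservative threshold. A two-character separator such as a comma followed by a space would lower the ceiling considerably.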

All tabular report features are also available, including sorting, filtering, export to other report formats, and column customization.