New Features in Beta Testing

PDB users are encouraged to preview the biological unit files, curated (beta) mmCIF files, redundancy reduction cluster data, and new keyword search that are now in a beta testing phase. Comments on these new features are highly appreciated and may be sent to

Biological Unit - Images and Coordinate Files

The biological unit images and corresponding coordinate files for applicable structures are accessible from the Structure Explorer pages on the PDB Beta Web Site at

The View Structure section of the Structure Explorer offers still ribbon images of the assumed biological unit(s) for structures, where relevant, in addition to static images of the asymmetric unit. Links to the coordinate files that are used to generate the biological unit images are also accessible here, as well as from the Download/Display File section of the Structure Explorer.

Curated (Beta) mmCIF Files

The Download/Display File section of the Structure Explorer pages on the Beta Web Site provides links to view or download the curated mmCIF files. These files include remediated data from the Data Uniformity Project. The files follow the latest version of the mmCIF dictionary, supplemented by an exchange dictionary developed by the RCSB and the MSD-EBI. This exchange dictionary can be obtained from

The curated mmCIF files for a set of query results can be downloaded by selecting the Download Structures or Sequences option from the pull-down menu at the top of the Query Result Browser page.

Curated mmCIF files for all PDB structures are available in gzip (.gz) format at UNIX-compressed versions of these files (.Z) are available at
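
As an illustration of working with the gzip-compressed files, the following minimal Python sketch opens a downloaded .gz file and collects the simple one-line data items. The function names and the file name in the usage note are hypothetical, and the parsing is deliberately simplified: looped categories, quoted strings, and multi-line values are ignored, so this is not a complete mmCIF reader.

```python
import gzip


def read_mmcif_items(stream):
    """Collect single-line `_category.item value` pairs from mmCIF text.

    Simplifying assumption: loops, quoting, and multi-line values are
    not handled. Tag names inside a loop_ block have no value on the
    same line, so they are skipped.
    """
    items = {}
    for line in stream:
        line = line.strip()
        if line.startswith("data_"):
            # The data block name follows the "data_" prefix.
            items["data_block"] = line[len("data_"):]
        elif line.startswith("_"):
            parts = line.split(None, 1)
            if len(parts) == 2:
                items[parts[0]] = parts[1]
    return items


def read_gzipped_mmcif(path):
    """Open a gzip-compressed mmCIF file in text mode and parse it."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return read_mmcif_items(handle)
```

For example, read_gzipped_mmcif("1g9u.cif.gz") would return a dictionary of the simple items in that file, assuming a file of that (hypothetical) name has been downloaded. The UNIX-compressed (.Z) files use a different compression format and would need to be uncompressed by other means first.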

Redundancy Reduction Cluster Data

The results of the weekly clustering of protein chains in the PDB are available for beta testing at These clusters are used in the "remove sequence homologs" feature on the PDB web sites. Files that list the clusters and their rankings at 50%, 70% and 90% sequence identity are available. Smaller rank numbers indicate higher (better) ranking. Chains with rank number 1 are ranked as the best representative of their cluster.

The contents of these files and the details of the clustering and ranking are further described at and
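
The ranking rule described above (smaller rank number is better, and rank 1 is the best representative) can be sketched in Python as follows. The record layout of (cluster id, rank, chain id) is an assumption made for illustration only and is not the documented layout of the cluster files; the chain identifiers in the usage note are likewise invented.

```python
def best_representatives(records):
    """Return {cluster_id: chain_id} for the best-ranked chain per cluster.

    `records` is an iterable of (cluster_id, rank, chain_id) tuples.
    This tuple layout is hypothetical; adapt it to the real file format.
    A smaller rank number means a better ranking.
    """
    best = {}
    for cluster_id, rank, chain_id in records:
        current = best.get(cluster_id)
        if current is None or rank < current[0]:
            best[cluster_id] = (rank, chain_id)
    return {cluster_id: chain for cluster_id, (_, chain) in best.items()}
```

With invented input such as [(1, 2, "chainB"), (1, 1, "chainA"), (2, 1, "chainC")], the function keeps "chainA" for cluster 1 and "chainC" for cluster 2, which mirrors keeping one representative per cluster when sequence homologs are removed.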

New Keyword Search

A much-improved keyword search is now available from the beta website's home page, from SearchLite, and from the "Text Search" box on SearchFields. This new search engine (powered by Lucene) queries an index derived from the curated mmCIF files, and should return more accurate search results.

The inferred biologically active unit of the YopM cytotoxin

PDB ID: 1g9u

A.G. Evdokimov, D.E. Anderson, K.M. Routzahn, D.S. Waugh (2001): Unusual molecular architecture of the Yersinia pestis cytotoxin YopM: a leucine-rich repeat protein with the shortest repeating unit. J. Mol. Biol. 312, p. 807.