New Features in Beta Testing
PDB users are encouraged to preview the biological unit files, curated (beta) mmCIF files, redundancy reduction cluster data, and new keyword search, all of which are now in beta testing. Comments on these new features are highly appreciated and may be sent to firstname.lastname@example.org.
Biological Unit - Images and Coordinate Files
The biological unit images and corresponding coordinate files for applicable structures are accessible from the Structure Explorer pages on the PDB Beta Web Site at beta.rcsb.org/pdb.
The View Structure section of the Structure Explorer offers still ribbon images of the assumed biological unit(s) for structures, where relevant, in addition to still images of the asymmetric unit. Links to the coordinate files used to generate the biological unit images are available here, as well as from the Download/Display File section of the Structure Explorer.
Curated (Beta) mmCIF Files
The Download/Display File section of the Structure Explorer pages on the Beta Web Site provides links to view or download the curated mmCIF files. These files include remediated data from the Data Uniformity Project (www.rcsb.org/pdb/uniformity). The files follow the latest version of the mmCIF dictionary supplemented by an exchange dictionary developed by the RCSB and the MSD-EBI. This exchange dictionary can be obtained from deposit.pdb.org/mmcif.
The curated mmCIF files for a set of query results can be downloaded by selecting the Download Structures or Sequences option from the pull-down menu at the top of the Query Result Browser page.
Curated mmCIF files for all PDB structures are available in gzip (.gz) format at ftp://beta.rcsb.org/pub/pdb/uniformity/data/mmCIF.gz/. UNIX-compressed versions of these files (.Z) are available at ftp://beta.rcsb.org/pub/pdb/uniformity/data/mmCIF/.
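As a rough illustration of working with the gzip-compressed files, the Python sketch below builds a download location and reads the data block name(s) from a compressed mmCIF file held in memory. The file naming pattern (`<id>.cif.gz`) and the helper names `gz_url` and `data_block_names` are assumptions made for this example and are not specified above; check the FTP directory listing for the actual layout.

```python
import gzip
import io

# Directory given in the text above.
BASE_URL = "ftp://beta.rcsb.org/pub/pdb/uniformity/data/mmCIF.gz/"


def gz_url(pdb_id, base=BASE_URL):
    """Build a file location for one entry.

    ASSUMPTION: files are named "<id>.cif.gz" directly under the base
    directory. This pattern is hypothetical; verify it against the listing.
    """
    return f"{base}{pdb_id.lower()}.cif.gz"


def data_block_names(gz_bytes):
    """Return the names of the data_ blocks in gzip-compressed mmCIF bytes."""
    names = []
    with gzip.open(io.BytesIO(gz_bytes), mode="rt",
                   encoding="ascii", errors="replace") as handle:
        for line in handle:
            # An mmCIF data block starts with a "data_<name>" line.
            if line[:5].lower() == "data_":
                names.append(line[5:].strip())
    return names
```

For a file already downloaded to disk, something like `data_block_names(open(path, "rb").read())` would apply the same check.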
Redundancy Reduction Cluster Data
The results of the weekly clustering of protein chains in the PDB are available for beta testing at ftp://ftp.rcsb.org/pub/pdb/derived_data/NR/. These clusters are used in the "remove sequence homologs" feature on the PDB web sites. Files that list the clusters and their rankings at 50%, 70%, and 90% sequence identity are available. Smaller rank numbers indicate a higher (better) ranking: the chain with rank number 1 is ranked as the best representative of its cluster.
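The ranking rule can be sketched in Python as follows. The layout of the cluster files is not described here, so this example assumes a hypothetical whitespace-separated format of cluster number, rank number, and chain identifier on each line; the function name `best_representatives` and the chain identifier style are likewise illustrative only.

```python
def best_representatives(lines):
    """Map each cluster to the chain with the smallest rank number.

    ASSUMPTION: each line reads "<cluster> <rank> <chain>". This layout is
    hypothetical; adapt the field positions to the real files.
    """
    best = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or short lines
        cluster, rank_text, chain = fields[0], fields[1], fields[2]
        try:
            rank = int(rank_text)
        except ValueError:
            continue  # skip headers or comments
        # Smaller rank numbers are better, so keep the minimum per cluster.
        if cluster not in best or rank < best[cluster][0]:
            best[cluster] = (rank, chain)
    return {cluster: chain for cluster, (rank, chain) in best.items()}
```

Under this assumed layout, keeping one chain per cluster gives a reduced set of the kind the "remove sequence homologs" feature relies on, at whichever identity level the input file represents.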
New Keyword Search
A much-improved keyword search is now available on the Beta Web Site's home page, in SearchLite, and in the "Text Search" box on SearchFields. This new search engine (powered by Lucene) queries an index derived from the curated mmCIF files, and should return more accurate search results.