PDB users are encouraged to preview the biological unit files, curated
(beta) mmCIF files, redundancy reduction cluster data, and new keyword
search, all of which are now in beta testing. Comments on these new
features are appreciated and may be sent to firstname.lastname@example.org:
Biological Unit - Images and Coordinate Files
The biological unit images and corresponding coordinate files for
applicable structures are accessible from the Structure Explorer
pages on the PDB Beta Web Site at beta.rcsb.org/pdb.
The View Structure section of the Structure Explorer offers still
ribbon images of the assumed biological unit(s) for structures,
where relevant, in addition to static images of the asymmetric unit.
Links to the coordinate files that are used to generate the biological
unit images are also accessible here, as well as from the
Download/Display File section of the Structure Explorer.
Curated (Beta) mmCIF Files
The Download/Display File section of the Structure Explorer pages
on the Beta Web Site provides links to view or download the
curated mmCIF files. These files include remediated data from the
Data Uniformity Project (www.rcsb.org/pdb/uniformity). The files
follow the latest version of the mmCIF dictionary supplemented by
an exchange dictionary developed by the RCSB and the MSD-EBI.
This exchange dictionary can be obtained from deposit.pdb.org/mmcif.
The curated mmCIF files for a set of query results can be downloaded
by selecting the Download Structures or Sequences option from the
pull-down menu at the top of the Query Result Browser page.
Curated mmCIF files for all PDB structures are available in gzip (.gz)
format at ftp://beta.rcsb.org/pub/pdb/uniformity/data/mmCIF.gz/.
UNIX-compressed versions of these files (.Z) are available
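
For readers who script against the gzip (.gz) files, the following is a minimal sketch of reading one without unpacking it to disk first. It assumes the compressed file has already been fetched into memory as bytes; the helper name first_data_block is hypothetical and is not part of any PDB tool. It relies only on the general CIF convention that a data block opens with a line beginning "data_".

```python
import gzip
import io
from typing import Optional


def first_data_block(gz_bytes: bytes) -> Optional[str]:
    """Return the name of the first data_ block in a gzip-compressed mmCIF file.

    gz_bytes is the raw content of a .gz file (assumed to be already downloaded).
    Returns None if no line starting with "data_" is found.
    """
    # gzip.open accepts a file-like object; "rt" decodes the stream as text.
    with gzip.open(io.BytesIO(gz_bytes), mode="rt",
                   encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if line.startswith("data_"):
                return line[len("data_"):].strip()
    return None
```

Reading line by line keeps memory use low, which may matter when looping over every structure in the archive.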
Redundancy Reduction Cluster Data
The results of the weekly clustering of protein chains in the PDB
are available for beta testing at
ftp://ftp.rcsb.org/pub/pdb/derived_data/NR/. These clusters are
used in the "remove sequence homologs" feature on the PDB web sites.
Files that list the clusters and their rankings at 50%, 70% and
90% sequence identity are available. Smaller rank numbers indicate
higher (better) ranking. Chains with rank number 1 are ranked as the
best representative of their cluster.
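
The ranking rule above can be illustrated with a short sketch that picks the best-ranked chain from each cluster. The input layout used here (whitespace-separated columns of cluster identifier, rank number, and chain identifier) is an assumption made purely for illustration and may not match the actual files; the function name and the sample chain identifiers are likewise hypothetical.

```python
from typing import Dict, Iterable, Tuple


def representatives(lines: Iterable[str]) -> Dict[str, str]:
    """Map each cluster identifier to its best-ranked chain.

    Assumed (hypothetical) line layout: "<cluster> <rank> <chain>".
    Smaller rank numbers are better, so the chain with the lowest
    rank in each cluster is kept.
    """
    best: Dict[str, Tuple[int, str]] = {}
    for line in lines:
        fields = line.split()
        # Skip comments, blank lines, and lines with too few columns.
        if len(fields) < 3 or fields[0].startswith("#"):
            continue
        cluster, rank_text, chain = fields[0], fields[1], fields[2]
        try:
            rank = int(rank_text)
        except ValueError:
            continue  # not a data line (for example, a header row)
        if cluster not in best or rank < best[cluster][0]:
            best[cluster] = (rank, chain)
    return {cluster: chain for cluster, (_, chain) in best.items()}
```

Taking the minimum rank rather than looking only for rank 1 keeps the sketch usable even if a file happens to list a subset of a cluster.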
The contents of these files and the details of the clustering and
ranking are further described at
New Keyword Search
A much improved keyword search is now available on the beta website's
home page, SearchLite, and the "Text Search" box on SearchFields. This
new search engine (powered by Lucene) queries an index derived from
the curated mmCIF files, and should return more accurate search
results.
PDB ID: 1g9u
A.G. Evdokimov, D.E. Anderson, K.M. Routzahn, D.S. Waugh (2001): Unusual molecular architecture of the Yersinia pestis cytotoxin YopM: a leucine-rich repeat protein with the shortest repeating unit. J. Mol. Biol. 312, p. 807.
The RCSB PDB is funded by a grant from the
National Science Foundation, the
National Institutes of Health, and the
US Department of Energy.