One of the goals of the PDB is to make the archive as consistent and error-free as possible. The PDB's Data Uniformity Project enhances the consistency of existing (legacy) entries and maintains a consistent method of annotating current depositions. Recently, the PDB archive has been standardized and released in mmCIF format.
One focus of this work has been to resolve inconsistencies between the specified chemical sequence and the sequence inferred from the deposited coordinate data. Another focus has been to include in the mmCIF data files the results of prior uniformity processing of individual PDB records. The standardized data for records such as compound name, citation, and source organism were previously accessible from the PDB database, but this information was not available in all of the data files. The mmCIF data files now integrate all of this information, as well as additional macromolecular names and synonyms from related SwissProt sequence database entries.
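The kind of sequence inconsistency described above can be illustrated with a minimal sketch. This is not the PDB's actual procedure. The function name `compare_sequences` and the sample data are hypothetical, and the sketch assumes that each residue observed in the coordinates has already been assigned a position in the declared chemical sequence.

```python
def compare_sequences(declared, observed):
    """Return (position, declared_residue, observed_residue) mismatches.

    `declared` is the full chemical sequence as a list of residue names.
    `observed` maps 1-based sequence positions to the residue name found
    in the coordinates; positions without coordinates are simply absent
    (for example, disordered residues) and are not treated as errors.
    """
    mismatches = []
    for position, residue in observed.items():
        if position < 1 or position > len(declared):
            # The coordinates contain a residue outside the declared sequence.
            mismatches.append((position, None, residue))
        elif declared[position - 1] != residue:
            # The coordinates disagree with the declared residue type.
            mismatches.append((position, declared[position - 1], residue))
    return sorted(mismatches, key=lambda m: m[0])


# Hypothetical example data: residue 1 has no coordinates, residue 4
# differs in type, and residue 6 lies beyond the declared sequence.
declared = ["MET", "ALA", "GLY", "SER", "LYS"]
observed = {2: "ALA", 3: "GLY", 4: "THR", 6: "LEU"}

for position, expected, found in compare_sequences(declared, observed):
    print(position, expected, found)
```

Each reported tuple marks a position where the declared sequence and the coordinates disagree and where an annotator would need to decide which of the two is correct.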
All legacy PDB entries and the recent RCSB entries are available in mmCIF format from the PDB beta FTP site at ftp://beta.rcsb.org/pub/pdb/uniformity/data/mmCIF/. The files follow the latest version of the mmCIF dictionary supplemented by an exchange dictionary developed by the PDB and the EBI. This exchange dictionary can be obtained from http://deposit.pdb.org/mmcif/.
An application program called CIFTr is available for translating files in mmCIF format into files in PDB format. Further information on this program appears elsewhere in this newsletter.
Comments on this work, and on all aspects of the PDB, are welcome at firstname.lastname@example.org.
The RCSB PDB is funded by a grant from the National Science Foundation, the National Institutes of Health, and the US Department of Energy.