February 8, 2000 -- Helge Weissig and Phil Bourne of the SDSC-PDB team have published an analysis of the PDB that looks for trends in data quality and consistency. They found that, averaged over the complete collection, the stereochemical quality of atomic models has moved towards ideal values in the past few years. At the same time, there are inconsistencies in how data are reported. Water content is not reported consistently, and the percentage of data collected in the reported high-resolution shell varies, which detracts from the value of resolution as a yardstick for assessing the quality of a structure.
A more detailed analysis of these inconsistencies is hampered by the lack of machine-readable experimental data. For users of macromolecular structure data, this suggests that structural details beyond the standard quality measures of resolution and R value should be considered when using coordinate sets for further derivation or when inferring biological function. For the curators of the PDB, it suggests the need to capture more of the data associated with each experiment in a form that permits straightforward parsing.
Weissig, H. and P.E. Bourne. 1999. An analysis of the Protein Data Bank in search of temporal and global trends. Bioinformatics 15:807-831.
The RCSB PDB is funded by a grant from the National Science Foundation, the National Institutes of Health, and the US Department of Energy.