RCSB PDB Newsletter #20: Announcing the Worldwide Protein Data Bank
No. 20
Winter 2004

Announcing the Worldwide Protein Data Bank

Reprinted with permission from Nature Structural Biology.

In recognition of the growing international and interdisciplinary nature of structural biology, three organizations have formed a collaboration to oversee the newly formed worldwide Protein Data Bank (wwPDB; www.wwpdb.org). The Research Collaboratory for Structural Bioinformatics (RCSB), the Macromolecular Structure Database (MSD) at the European Bioinformatics Institute (EBI) and the Protein Data Bank Japan (PDBj) at the Institute for Protein Research of Osaka University will serve as custodians of the wwPDB, with the goal of maintaining a single archive of macromolecular structural data that is freely and publicly available to the global community.

The wwPDB represents a milestone in the evolution of the Protein Data Bank (PDB; www.pdb.org) [1, 2], which was established in 1971 at Brookhaven National Laboratory as the sole international repository for three-dimensional structure data of biological macromolecules. Since July 1, 1999, the PDB has been managed by three member institutions of the RCSB: Rutgers, The State University of New Jersey; the San Diego Supercomputer Center at the University of California, San Diego; and the Center for Advanced Research in Biotechnology of the National Institute of Standards and Technology.

The wwPDB recognizes the importance of providing equal access to the database, both in terms of depositing and retrieving data, from different regions of the world. Therefore, the wwPDB members will continue to serve as deposition, data processing, and distribution sites. Deposition procedures will not be altered by the formation of the wwPDB; data can still be deposited using ADIT at the RCSB and PDBj or by using AutoDep at the EBI.

To ensure the consistency of PDB data, all entries will be validated and annotated following a common set of criteria. All processed data will be sent to the RCSB, which distributes the data worldwide. All format documentation will be kept publicly available and the distribution sites will mirror the PDB archive using identical contents and subdirectory structure. However, each member of the wwPDB will be able to develop its own Web site, with a unique view of the primary data, providing a variety of tools and resources for the global community.
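
As an illustration of what "identical contents and subdirectory structure" means in practice, the following minimal Python sketch compares two local copies of an archive by relative path and SHA-256 checksum. The function names (tree_digest, compare_mirrors) are hypothetical and are not part of any wwPDB tool, and the sketch assumes that both copies are available as local directories.

```python
import hashlib
import os


def tree_digest(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents.

    Only files are recorded, so empty directories are not compared.
    """
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Normalise separators so the keys are comparable across platforms.
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            h = hashlib.sha256()
            with open(full, "rb") as fh:
                for chunk in iter(lambda: fh.read(65536), b""):
                    h.update(chunk)
            digests[rel] = h.hexdigest()
    return digests


def compare_mirrors(primary, mirror):
    """Return (missing, extra, changed) relative paths of mirror vs. primary."""
    a = tree_digest(primary)
    b = tree_digest(mirror)
    missing = sorted(set(a) - set(b))   # in primary, absent from mirror
    extra = sorted(set(b) - set(a))     # in mirror, absent from primary
    changed = sorted(p for p in set(a) & set(b) if a[p] != b[p])
    return missing, extra, changed
```

Under these assumptions, a mirror matches the primary copy when all three returned lists are empty.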

An Advisory Board consisting of appointees from the wwPDB, the International Union of Crystallography and the International Council on Magnetic Resonance in Biological Systems will provide guidance through annual meetings with the wwPDB consortium. This board is responsible for reviewing and determining policy as well as providing a forum for resolving issues related to the wwPDB. Specific details about the Advisory Board can be found in the wwPDB charter, available on the wwPDB Web site.

The RCSB is the 'archive keeper' of wwPDB. It has sole write access to the PDB archive and control over directory structure and contents, as well as responsibility for distributing new PDB identifiers to all deposition sites. The PDB archive is a collection of flat files in the legacy PDB file format [3] and in the mmCIF [4] format that follows the PDB exchange dictionary (deposit.pdb.org/mmcif). This dictionary describes the syntax and semantics of PDB data that are processed and exchanged during the process of data annotation. It was designed to provide consistency in data produced in structure laboratories, processed by the wwPDB members and used in bioinformatics applications. The PDB archive does not include the Web sites, browsers, software and database query engines developed by researchers worldwide.
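
To give a sense of what a flat file in the legacy PDB file format looks like to software, the sketch below reads one fixed-column ATOM record in Python. The column positions follow the published ATOM record layout; the sample line is made up for illustration, and Atom and parse_atom_record are hypothetical names rather than part of any PDB software.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float
    occupancy: float
    temp_factor: float
    element: str


def parse_atom_record(line):
    """Parse one fixed-column ATOM/HETATM line (columns are 1-based in the spec)."""
    if line[0:6] not in ("ATOM  ", "HETATM"):
        raise ValueError("not an ATOM/HETATM record")
    return Atom(
        serial=int(line[6:11]),           # columns 7-11
        name=line[12:16].strip(),         # columns 13-16
        res_name=line[17:20].strip(),     # columns 18-20
        chain_id=line[21],                # column 22
        res_seq=int(line[22:26]),         # columns 23-26
        x=float(line[30:38]),             # columns 31-38
        y=float(line[38:46]),             # columns 39-46
        z=float(line[46:54]),             # columns 47-54
        occupancy=float(line[54:60]),     # columns 55-60
        temp_factor=float(line[60:66]),   # columns 61-66
        element=line[76:78].strip(),      # columns 77-78
    )


# Illustrative record, assembled field by field so the columns are explicit.
SAMPLE = (
    "ATOM  " "    1" " " " N  " " " "MET" " " "A" "   1" "    "
    "  27.340" "  24.430" "   2.614" "  1.00" "  9.67" "          " " N"
)

atom = parse_atom_record(SAMPLE)
```

Because every field sits at a fixed column, a reader needs no tokenizer, which is one reason changes to the legacy format are disruptive for existing software.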

The members of the wwPDB will jointly agree to any modifications or extensions to the PDB exchange dictionary. As data technology progresses, other data formats (such as XML) and delivery methods may be included in the official PDB archive if all the wwPDB members concur on the alteration. Any new formats will follow the naming and description conventions of the PDB exchange dictionary. In addition, the legacy PDB format will not be modified unless there is a compelling reason for a change. Should such a situation occur, all three wwPDB members would have to agree on the changes and give the structural biology community 90 days' advance notice.

The creation of the wwPDB formalizes the international character of the PDB and ensures that the archive remains single and uniform. It provides a mechanism to ensure consistent data for software developers and users worldwide. We hope that this will encourage individual creativity in developing tools for presenting structural data, which could benefit the scientific research community in general.

REFERENCES
1. H.M. Berman, et al. (2000): Nucleic Acids Res. 28, pp. 235-242.
2. F.C. Bernstein, et al. (1977): J. Mol. Biol. 112, pp. 535-542.
3. J. Callaway, et al. (1996): Protein Data Bank Contents Guide: Atomic coordinate entry format description. (Brookhaven National Laboratory).
4. P.E. Bourne, H.M. Berman, K. Watenpaugh, J.D. Westbrook, & P.M.D. Fitzgerald (1997): Methods Enzymol. 277, pp. 571-590.

ACKNOWLEDGMENTS
The RCSB PDB is supported by funds from the National Science Foundation, the Department of Energy, and the National Institutes of Health. The MSD-EBI is supported by funds from the Wellcome Trust, the European Union (TEMBLOR, NMRQUAL, SPINE, AUTOSTRUCT, and IIMS awards), CCP4, the Biotechnology and Biological Sciences Research Council (UK), the Medical Research Council (UK), and the European Molecular Biology Laboratory. PDBj is supported by a grant-in-aid from the Institute for Bioinformatics Research and Development, Japan Science and Technology Agency (BIRD-JST), and the Ministry of Education, Culture, Sports, Science and Technology (MEXT).

H.M. Berman, K. Henrick, H. Nakamura (2003): Announcing the worldwide Protein Data Bank. Nature Structural Biology 10 (12), p. 980.