The December 10, 2014 release offers features related to large structure support.
With this week's update, large structures (containing more than 62 chains and/or more than 99999 ATOM lines), each represented as a single file, have been fully integrated into the main PDB FTP archive in both PDBx/mmCIF and PDBML formats. Previously, large structures were divided across multiple "SPLIT" entries, which have now been removed (obsoleted).
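The size criteria above can be sketched as a simple check. This is only an illustration and not wwPDB or RCSB PDB code: the function and constant names are hypothetical, and the thresholds are taken directly from the text.

```python
# Thresholds as stated in the announcement; all names here are hypothetical.
MAX_CHAINS = 62
MAX_ATOM_LINES = 99999


def exceeds_pdb_format_limits(chain_count: int, atom_line_count: int) -> bool:
    """Return True if a structure counts as 'large' in the sense used above."""
    return chain_count > MAX_CHAINS or atom_line_count > MAX_ATOM_LINES
```

Under this sketch, a structure with 63 chains or with 100000 ATOM lines would be treated as large, while one with exactly 62 chains and 99999 ATOM lines would not.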
A separate directory in the PDB FTP archive contains a TAR file with a collection of "best-effort", minimal PDB-format files for large structures. These files contain authorship, citation details, and coordinate data; an accompanying index file maps the chains present in the large entry to the chains present in the limited PDB-format files. DOIs for large structures will point to these TAR files.
In the main PDB FTP directory, large structures, including their biological assembly files, will be distributed only in PDBx/mmCIF and PDBML formats. Structures that do not exceed the limitations of the PDB format will continue to be provided as PDB files in the archive for the foreseeable future.
Detailed information is available at http://wwpdb.org.
The RCSB PDB website has been updated to support these new files. Users searching for ID codes of "SPLIT" entries will be automatically redirected to the combined entry. Download and Display options for coordinate files access the corresponding files in the main archive.
A multi-scale rendering option has been implemented for the efficient display of large structures in Simple Viewer and Protein Workshop. These viewers are accessible from the Structure Summary page.
We thank Henry Truong (UCSD Computer Science) for working on this project as part of the UCSD 2014 STARS program (Summer Training Academy for the Research in the Sciences).
Very large structures can be challenging for visualization programs. To improve loading time, each protein residue is rendered as a single sphere at its alpha-carbon position, using an average radius for its residue type. This results in a low-resolution surface as shown in the images below.
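The per-residue simplification described above can be sketched roughly as follows. This is not the viewers' actual implementation: the `Atom` and `Sphere` types, the `coarse_grain` function, and the radius values are all hypothetical and chosen only for illustration. The sketch also assumes the input contains protein atoms only.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

# Hypothetical average radii in angstroms, for illustration only.
AVERAGE_RADIUS = {"GLY": 2.3, "ALA": 2.5, "TRP": 3.4}
DEFAULT_RADIUS = 3.0  # fallback for residue types not listed above

Point = Tuple[float, float, float]


@dataclass(frozen=True)
class Atom:
    chain: str
    residue_name: str
    residue_number: int
    atom_name: str
    xyz: Point


@dataclass(frozen=True)
class Sphere:
    center: Point
    radius: float


def coarse_grain(atoms: Iterable[Atom]) -> List[Sphere]:
    """Keep one sphere per residue, centred on its alpha-carbon (CA) atom."""
    spheres = []
    for atom in atoms:
        if atom.atom_name != "CA":
            continue  # all other atoms of the residue are dropped
        radius = AVERAGE_RADIUS.get(atom.residue_name, DEFAULT_RADIUS)
        spheres.append(Sphere(atom.xyz, radius))
    return spheres
```

Reducing each residue to one sphere cuts the number of rendered objects from one per atom to one per residue, which is the source of the loading-time improvement the text describes.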
For large protein-nucleic acid complexes, such as ribosomes, protein chains are rendered as low-resolution surfaces and nucleic acid chains as ribbons.
Users can quickly find all ribosomes and viruses in the PDB using the top bar simple search. For example, entering "ribosome" in the text search box returns a "View Ribosomes" option in the search suggestions (see image below). Similarly, entering "virus" retrieves all virus structures.
Text search for the keyword ribosome will present the "Retrieve" feature to quickly find all ribosome structures.
The RCSB PDB is funded by a grant from the National Science Foundation, the National Institutes of Health, and the US Department of Energy.