2000 PDB News

Contents:

19-Dec-2000 Data Uniformity Project Web Page
12-Dec-2000 PDB Focus: Get Educated
5-Dec-2000 PDB Releases mmCIF Files from the Data Uniformity Project and Translation Software
28-Nov-2000 PDB Focus: ADIT Help Services
21-Nov-2000 PDB CD-ROM Set 94 Now Available
14-Nov-2000 The PDB and Structural Genomics
7-Nov-2000 New ADIT Features
31-Oct-2000 New Query Features Available on the Beta Test Site
31-Oct-2000 Seven Ways of Looking at a Protein
24-Oct-2000 Issue 7 of the PDB Newsletter Now Available
17-Oct-2000 PDB Focus: FAQ Lists
10-Oct-2000 PDB Focus: WWW User Guides
3-Oct-2000 Data Deposited and Processed using ADIT mirror at Osaka University
26-Sep-2000 PDB Focus: info@rcsb.org
19-Sep-2000 PDB Focus: The ADIT Validation Server
12-Sep-2000 New Query Features Available on the PDB Web Site
5-Sep-2000 PDB's NMR Task Force at ICMRBS Conference
29-Aug-2000 PDB at the ISMB Conference
15-Aug-2000 July 2000 CD-ROM Now Available
15-Aug-2000 PDB at the Protein Society Symposium
8-Aug-2000 PDB Exhibit Booth at the ISMB Conference
1-Aug-2000 PDB at the ACA Meeting
25-Jul-2000 RCSB PDB Exhibit Booth at Protein Society's Annual Symposium
18-Jul-2000 New Query Features in Production and Beta Test
18-Jul-2000 PDB to Host Booth and Users Lunch at the ACA Meeting
18-Jul-2000 PDB Newsletter Issue 6 Now Available
27-Jun-2000 PDB to be updated on July 4, 2000
20-Jun-2000 PDB Launches a Deposition Mirror Site at Osaka University
6-Jun-2000 PDB Plans for Releasing Data from the Uniformity Project
30-May-2000 RCSB Hosts CCPN Meeting
30-May-2000 A Reminder to Bookmark and Use the Closest Mirror Site
9-May-2000 PDB CD-ROM Set 92 Released
2-May-2000 PDB File Format FAQ Now Available
2-May-2000 Alternative URL for PDB Site
18-Apr-2000 Issue 5 PDB Newsletter Now Available
18-Apr-2000 Automated Filing System for the PDB Master Archive
11-Apr-2000 PDB Featured at BIO2000
4-Apr-2000 New Queries, New Reports
7-Mar-2000 New Ligand Search Capability Available for Beta Testing
7-Mar-2000 New Query and Reporting Features
7-Mar-2000 Release of Cleaned-up Citation Data
22-Feb-2000 New PDB Mirror Site in Brazil
22-Feb-2000 Proposal for CORBA standard for Macromolecular Structure Data Submitted
22-Feb-2000 PDB at the Biophysical Society Annual Meeting
8-Feb-2000 New Publication Analysing the PDB
1-Feb-2000 New PDB Query and Reporting Capability Available for Beta Testing
1-Feb-2000 Cleaned-up Citation Data Available
1-Feb-2000 Assignment of Chain IDs in PDB Files
11-Jan-2000 Issue 4 PDB Newsletter Now Available
1-Jan-2000 Happy New Year!


Read the latest PDB news. Earlier news is archived in the RCSB PDB newsletters.


19-Dec-2000

Data Uniformity Project Web Page

Links and update notices will be archived at the Data Uniformity Project Web page at http://www.rcsb.org/pdb/uniformity/index.html. Currently, links to mmCIF-formatted files on the PDB beta FTP site and to the CIFTr translation software are highlighted.

12-Dec-2000

PDB Focus: Get Educated

One of the goals of the RCSB is to educate the community about the PDB portal and related topics of interest. A collection of valuable links is available from the PDB Get Educated page at http://www.rcsb.org/pdb/education_discussion/educational_resources/index.html. This resource contains a wealth of information for audiences ranging from elementary level students to undergraduates to the general public. Included are links to such features as general information about proteins and nucleic acids, several articles and animated presentations on the PDB, protein documentaries providing detailed descriptions of specific well-characterized proteins from different families, a query interface for the novice user, interactive 3-dimensional tutorials and college course materials, and an illustrated glossary of crystallographic and NMR terminology.

Suggestions for additions to this page are appreciated and can be sent to info@rcsb.org.

5-Dec-2000

PDB Releases mmCIF Files from the Data Uniformity Project and Translation Software

Approximately 1,000 mmCIF-formatted files for nucleic acid-containing crystal structures are now available from the PDB beta FTP site at ftp://beta.rcsb.org/pub/pdb/uniformity/data/mmCIF/. These entries were curated by the Nucleic Acid Database project and revisited for data uniformity processing.

The files follow the latest version of the mmCIF dictionary supplemented by an exchange dictionary developed by the PDB and the European Bioinformatics Institute. This exchange dictionary can be obtained from http://pdb.rutgers.edu/mmcif/.

An application program called CIFTr is available for translating files in mmCIF format into files in PDB format. CIFTr works on UNIX platforms, and can be downloaded at http://pdb.rutgers.edu/software/. CIFTr also provides the option of producing a file with a blank chain ID field for structures with a single chain, and the option of producing files with standard IUPAC hydrogen nomenclature for standard L-amino acids.

28-Nov-2000

PDB Focus: ADIT Help Services

An e-mail help desk, a deposition FAQ, and several tutorials are available for depositors using the AutoDep Input Tool (ADIT).

General deposition and processing questions or comments can be submitted to deposit@rcsb.rutgers.edu, and are generally answered within one business day. Information is also provided at the PDB Data Deposition and Processing FAQ, which supplies links to several data deposition and processing resources. Tutorials for ADIT and the ADIT Validation Server are also available. Sample "in progress" depositions are available at http://pdb.rutgers.edu:81/.

21-Nov-2000

PDB CD-ROM Set 94 Now Available

The latest PDB CD-ROM (release #94) is currently being distributed. This release contains the macromolecular structure entries and the experimental data when available for the 13,270 structures available as of the September 27, 2000 update.

The documentation for the CD-ROM set is now included on disk 1 in directory "pub" as an Adobe Acrobat file (document.pdf) and is also available at http://www.rcsb.org/pdb/general_information/about_pdb/cdrom_distribution/index.html. Please browse this file for more information on the CD-ROM contents and ordering information.

14-Nov-2000

The PDB and Structural Genomics

As part of Nature Structural Biology's Structural Genomics Supplement issue, the PDB has a paper entitled "The Protein Data Bank and the challenge of structural genomics" (Berman et al. 2000). This document describes how the PDB's systems for the processing, exchange, query, and distribution of data will enable many aspects of high throughput structural genomics in the next few years.

H.M. Berman, T.N. Bhat, P.E. Bourne, Z. Feng, G. Gilliland, H. Weissig, J. Westbrook (2000): The Protein Data Bank and the challenge of structural genomics. Nature Structural Biology 7 (11), pp. 957-959.

7-Nov-2000

New ADIT Features

The ADIT servers at http://pdb.rutgers.edu/adit/ and http://pdbdep.protein.osaka-u.ac.jp/adit/ have been updated with additional data items, enhanced help and examples, and updated pull-down menu values.

In the X-ray view, depositors can now include more detailed information in the refinement and diffraction categories. In both X-ray and NMR views, the help and example descriptions for all items have been expanded. New pull-down menus, such as for source organism name, have been included to make depositing data faster and more uniform.

Questions and comments about ADIT can be sent to deposit@rcsb.rutgers.edu.

31-Oct-2000

New Query Features Available on the Beta Test Site

In the continued efforts to improve the capabilities of the PDB, the RCSB is pleased to announce the release of the following new features on the beta test site at http://beta.rcsb.org/pdb/:

Exact and Partial Word Match

The SearchFields and SearchLite interfaces now support both partial and exact word match queries on the text search fields. The exact word match feature is available as an option that the user can select. Partial word searching functionality remains available as the default, since this is also an important capability for certain queries.

Keyword searches through the SearchLite interface as well as the "Text Search" form field on the SearchFields form fully support matching on word boundaries. For example, searches for the keyword "man" with the "exact word match" option checked will not match entries containing words like "human" or "manitose".

For all other textual queries, "exact word match" is currently implemented only as a match against the entire field value: a query for "kinase" will not match values like "protein kinase" or "tyrosine kinase". This limitation will be removed in the near future to provide the same functionality as with keyword searches. In addition, keyword searches are now possible with arbitrary numbers of optionally nested parentheses.
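The difference between the three matching behaviors described above can be illustrated with a minimal Python sketch. This is not the PDB's implementation; the function names and the case-insensitive comparison are assumptions made for illustration only.

```python
import re


def keyword_matches(keyword: str, text: str, exact_word: bool = False) -> bool:
    """Return True if `keyword` occurs in `text`.

    exact_word=False: partial match, the keyword may sit inside a longer word.
    exact_word=True:  the keyword must be delimited by word boundaries.
    Case-insensitive matching is an assumption of this sketch.
    """
    pattern = re.escape(keyword)
    if exact_word:
        # \b anchors the keyword on word boundaries on both sides.
        pattern = r"\b" + pattern + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def whole_value_matches(keyword: str, value: str) -> bool:
    """Exact match against the entire field value (the interim behavior
    described for the other textual queries)."""
    return keyword.strip().lower() == value.strip().lower()


# Partial matching finds "man" inside "human"; exact word matching does not.
print(keyword_matches("man", "human lysozyme"))                    # partial
print(keyword_matches("man", "human lysozyme", exact_word=True))   # exact word
print(whole_value_matches("kinase", "protein kinase"))             # whole value
```

With word-boundary matching, "kinase" would match "protein kinase"; with whole-value matching it does not, which is the limitation the announcement says will be lifted.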

Title Record Results

Title records, when available, have been added to the search results on the Query Result Browser. The ability to sort results based on certain criteria has also been implemented, and users will be able to customize the output of these results in the future.

Title Record Search

The capability to search on structure titles has also been added. The title is based on information provided by the author which describes the structure in a brief sentence.

We appreciate any comments you may have on these features. Please send your feedback to notify@rcsb.org.

Seven Ways of Looking at a Protein

The PDB was used to create images for "Seven Ways of Looking at a Protein," an article by Clay Shirky in the online magazine FEED at http://www.feedmag.com/feature/fr409_master.html

24-Oct-2000

Issue 7 of the PDB Newsletter Now Available

The PDB newsletter's 7th issue is now available from the PDB Web site. This issue highlights the activities of the past three months and contains summaries of the current status of Data Deposition and Processing, Data Uniformity and NMR, Data Query Reporting, Access, and Distribution, and PDB's recent Outreach Activities. This periodical is available in text and HTML format, and a printer-friendly version is also available for this issue.

17-Oct-2000

PDB Focus: FAQ Lists

In order to assist PDB users, frequently asked questions with their corresponding answers have been compiled into two lists. One list of general questions and answers is available under "Contact Us" at http://www.rcsb.org/pdb/pdb_help.html#faqs. Another list of file format inquiries and responses is included with the list of "File Formats and Dictionaries" at http://www.rcsb.org/pdb/info.html#File_Formats_and_Dictionaries. Users are encouraged to utilize these helpful resources. Please send any unanswered questions to info@rcsb.org.

10-Oct-2000

PDB Focus: WWW User Guides

A list of useful guides to PDB's Web-based capabilities is available from the PDB home page at http://www.rcsb.org/pdb/info.html#PDB_Users_Guides. Topics covered include tutorials on using ADIT and the Validation Server, instructions for downloading files and optimizing search results using the different PDB interfaces, how to interpret query results, trouble-shooting tips for molecular graphics applications, and documentation on the directory structure of the FTP site. Users can also obtain details about SearchFields features directly from that interface by clicking on their field titles. These guides are in place to provide PDB users with helpful information about using the PDB's many features.

3-Oct-2000

Data Deposited and Processed using ADIT mirror at Osaka University

During the past three months, depositions have been accepted at the ADIT site established at the Institute for Protein Research at Osaka University in Osaka, Japan. These entries have been processed by staff at the Laboratory of Protein Informatics (Head, Professor Haruki Nakamura) at the Institute for Protein Research at Osaka University using the same ADIT tools and procedures as the RCSB. These entries are automatically mirrored to the RCSB's processing database and are released into the PDB archive. Under Dr. Masami Kusunoki's direction, Dr. Genji Kurisu, Reiko Igarashi, and Takashi Kosada have processed over 40 structures deposited at the new ADIT site.

ADIT is available at http://pdb.rutgers.edu/adit and http://pdbdep.protein.osaka-u.ac.jp/adit/

26-Sep-2000

PDB Focus: info@rcsb.org

The PDB Help Service at info@rcsb.org is provided by RCSB's scientists and staff. All inquiries and comments sent to this address receive a response, and most are answered within 1-2 working days. The queries and answers are archived internally for future reference and study. Inquiries range from general questions about the PDB to specific queries about using PDB features. So, if you have any questions or comments, please let us know!

19-Sep-2000

PDB Focus: The ADIT Validation Server

Depositors are encouraged to use the ADIT Validation Server (http://pdb.rutgers.edu/validate/) prior to the deposition of a structure to the PDB. The Validation Server may be used to check a structure at any time during structure determination and refinement. There is no limit to the number of times it can be used.

The RCSB developed the Validation Server to allow users to check the format consistency of coordinates (Precheck) and to create validation reports about a structure before deposition (Validation).

To use the Validation Server, the coordinate file should be in either PDB or mmCIF format. The structure factor file must be in mmCIF format (see http://pdb.rutgers.edu/sf_cif.html for more information).

The Precheck step will produce a brief report identifying any changes that need to be made in your data files in order to obtain a validation report.

The Validation step produces a validation report which includes an Atlas entry, a summary report, and a collection of structural diagnostics including bond distance and angle comparisons, torsion angle comparisons, base morphology comparisons (for nucleic acids), and molecular graphic images. Reports from PROCHECK [1], NUCheck [2], and SFCheck [3] are made available.

Tutorials are available online.

[1] Laskowski, RA, MacArthur, MW, Moss, DS, and Thornton, JM. J. Appl. Cryst., 1993; 26:283-291.
[2] Feng, Z, Westbrook, J, and Berman, HM. NUCheck. 1998. NDB-407 Rutgers University, New Brunswick, NJ.
[3] Vaguine, AA, Richelle, J, and Wodak, SJ. Acta Crystallogr., 1999; D55:191-205.

12-Sep-2000

New Query Features Available on the PDB Web Site

The PDB is pleased to announce the availability of new features on the PDB Web site (http://www.rcsb.org/pdb) and its worldwide mirrors (see http://www.rcsb.org/pdb/general_information/mirror_sites/index.html).

Several additions have been made to the SearchFields query options. A major addition is accurate querying of enzymes according to the Enzyme Commission classification, including the EC number and name. The enzyme classification was derived directly from the recommendations of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (IUBMB). The PDB would like to acknowledge the work of Dietmar Schomburg, of the Institute of Biochemistry at the University of Cologne and project leader of the BRENDA database, on the enzyme classification remediation.

The combination of a complete enzyme classification of PDB structures including the enzyme nomenclature enables users to identify all structures available for a particular enzyme class at all four enzyme classification levels. A convenient way of accessing all structures belonging to a particular enzyme class is provided with the EC Classification Browser linked to the SearchFields interface. To locate this information, simply choose the custom option "EC Number and Classification" and generate a new form.
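As an illustration of querying at any of the four classification levels, the following Python sketch compares dot-separated EC numbers level by level. It is only a sketch under stated assumptions: the function name and the entry labels are hypothetical, and it does not describe how the EC Classification Browser is implemented.

```python
def ec_matches(entry_ec: str, query: str) -> bool:
    """Return True if `entry_ec` falls under the EC class given by `query`.

    `query` may specify one to four dot-separated levels,
    e.g. "3", "3.4", "3.4.21" or "3.4.21.4".
    """
    entry_parts = entry_ec.split(".")
    query_parts = query.split(".")
    if not 1 <= len(query_parts) <= 4:
        raise ValueError("an EC query has between one and four levels")
    # Compare whole levels, so that "3.4.2" does not match "3.4.21.4".
    return entry_parts[: len(query_parts)] == query_parts


# Hypothetical entry labels mapped to EC numbers, for illustration only.
entries = {
    "entry_A": "3.4.21.4",
    "entry_B": "3.4.23.16",
    "entry_C": "2.7.1.1",
}

# All entries in subclass 3.4, whatever their lower levels are.
hits = sorted(name for name, ec in entries.items() if ec_matches(ec, "3.4"))
print(hits)
```

Comparing split levels rather than raw string prefixes avoids false hits between, for example, sub-subclass 2 and sub-subclass 21.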

Another addition to the PDB site concerns the Sequence Details available for every structure. The Sequence Details section of the Structure Explorer now points to the sequence entries in the major sequence databases corresponding to the particular structure being analyzed. This cross-link is currently limited to structures deposited after January 27, 1999. Furthermore, the Structure Explorer interface has been enhanced with navigation arrows, for result sets larger than a single structure, that allow for browsing of individual structures within the set.

5-Sep-2000

PDB's NMR Task Force at ICMRBS Conference

The PDB's NMR Task Force met during the XIX International Conference on Magnetic Resonance in Biological Systems (ICMRBS), August 20-25 in Florence, Italy. The Task Force discussed IUPAC nomenclature, representative structures, and constraints files. The next meeting of this group will be at the Keystone Meeting - Frontiers of NMR in Molecular Biology VII, to be held in Big Sky, Montana, on January 20-26, 2001.

29-Aug-2000

PDB at the ISMB Conference

Many thanks to all who came by to visit the PDB exhibit at the 8th International Conference on Intelligent Systems for Molecular Biology (ISMB) in La Jolla, CA. We received a lot of feedback from the user community, and appreciate the great suggestions that were given. We hope to meet you all again at future conferences!

15-Aug-2000

July 2000 CD-ROM Now Available

The July release of the PDB CD-ROM (Issue 93) is currently being shipped. This set of PDB CD-ROMs includes the data files for the 12,592 structures released from the PDB through the June 28th update. The available structure factor files and NMR constraint files are also included. Five CDs are required to hold the PDB data.

The PDB releases these CD-ROMs on a quarterly basis, at no cost. CD-ROMs are delivered to more than fifty countries throughout the world, including all continents. Ordering information is available at http://www.rcsb.org/pdb/general_information/about_pdb/cdrom_distribution.html.

PDB at the Protein Society Symposium

The PDB exhibit at the Protein Society meeting was a success. Thanks to all who came by to show their support! Many good suggestions and comments were given. We appreciate the proactive feedback from the user community. We hope for a similar turnout at ISMB next week.

8-Aug-2000

PDB Exhibit Booth at the ISMB Conference

The PDB will be hosting an exhibit booth at the 8th International Conference on Intelligent Systems for Molecular Biology (ISMB). This event will take place at the Price Center on the campus of the University of California, San Diego in La Jolla, CA, on August 19-23, 2000. At booth E in gallery B, PDB members will be available to answer your questions. We hope to see you at this exciting event!

1-Aug-2000

PDB at the ACA Meeting

The PDB thanks everyone who visited our booth during the American Crystallographic Association's Annual Meeting in St. Paul, MN. We were especially pleased to have so many people at our Users Meeting. We hope to see you again at the Protein Society Meeting!

25-Jul-2000

RCSB PDB Exhibit Booth at Protein Society's Annual Symposium

The PDB will host an exhibit booth at the 14th Annual Symposium of the Protein Society, to be held at the San Diego Convention Center on August 5-9, 2000. We look forward to meeting many of our users at this event - please stop by and visit the PDB team members in booth 211.

18-Jul-2000

New Query Features in Production and Beta Test

The PDB is pleased to announce the addition of several new features to the production PDB website (http://www.rcsb.org/pdb), its worldwide mirrors (see http://www.rcsb.org/pdb/general_information/mirror_sites/index.html) and the PDB Preview (beta) site (http://beta.rcsb.org/pdb/).

Apart from new functionalities based on existing information and minor interface enhancements, the most interesting features are based on data from the RCSB's data uniformity project. As the name suggests, this project reviews all PDB entries to provide consistent nomenclature and format for specific fields of data so they may be queried accurately across the complete contents of the PDB. It should be noted that while this uniform specification is available for query and display on the PDB WWW site, the PDB files themselves have not been changed.

Production Site and Mirrors - Number of Chains and Source Organism

It is now possible to use the number of chains as derived from the PDB SEQRES records as a search criterion via the SearchFields interface. Select the custom option "Number of Chains and Chain Length" and generate a new form to make this field available.
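To show how the number of chains and their lengths can be read from SEQRES records, here is a minimal Python sketch. It assumes the fixed-column SEQRES layout of the PDB format (chain identifier in column 12, residue count in columns 14-17); the function names and the sample records are illustrative and are not taken from the PDB's own software.

```python
def seqres_line(ser_num: int, chain_id: str, num_res: int, residues: str) -> str:
    """Build one SEQRES record in fixed columns (illustrative helper)."""
    return f"SEQRES {ser_num:3d} {chain_id} {num_res:4d}  {residues}"


def chain_lengths(pdb_text: str) -> dict[str, int]:
    """Map each chain ID found in SEQRES records to its residue count."""
    lengths: dict[str, int] = {}
    for line in pdb_text.splitlines():
        if line.startswith("SEQRES") and len(line) >= 17:
            chain_id = line[11]           # column 12
            num_res = int(line[13:17])    # columns 14-17
            lengths[chain_id] = num_res
    return lengths


# Illustrative two-chain sample (residue lists shortened).
sample = "\n".join([
    seqres_line(1, "A", 21, "GLY ILE VAL GLU GLN CYS CYS THR SER ILE CYS SER LEU"),
    seqres_line(2, "A", 21, "TYR GLN LEU GLU ASN TYR CYS ASN"),
    seqres_line(1, "B", 30, "PHE VAL ASN GLN HIS LEU CYS GLY SER HIS LEU VAL GLU"),
])

lengths = chain_lengths(sample)
print(len(lengths), lengths)
```

The number of chains is simply the number of distinct chain identifiers; an entry whose single chain has a blank identifier would appear here under the key " ".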

To query source organism, select the custom option "Source" from the SearchFields interface and generate a new form. Specifications of the scientific name for source were parsed from the NCBI's MMDB assignments. Users may also substitute a common name for a scientific name, such as a substitution of human for Homo sapiens.

Preview (beta) Site

Additions have been made to the SearchFields query option on the beta site. A major addition is accurate querying of enzymes according to the Enzyme Commission classification, including the EC number and name. The enzyme classification was directly derived from the recommendations of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (IUBMB). The PDB would like to acknowledge the work of Dietmar Schomburg, of the Institute of Biochemistry at the University of Cologne and project leader of the BRENDA database, on the enzyme classification remediation.

The combination of a complete enzyme classification of PDB structures with the enzyme nomenclature enables users to identify all structures available for a particular enzyme class at all four enzyme classification levels. A convenient way of accessing all structures belonging to a particular enzyme class is provided with the EC Classification Browser linked to from the SearchFields interface (choose the custom option "EC Number and Classification" and generate a new form).

A further addition to the beta site concerns the Sequence Details available for every structure. The Sequence Details section of the Structure Explorer now points to the sequence entries in the major sequence databases corresponding to the particular structure being analyzed. This cross-link is currently limited to structures deposited after January 27, 1999. In addition, the Structure Explorer interface has been enhanced with navigation arrows, for result sets larger than a single structure, that allow for browsing of individual structures within the set.

Enhancements have been made for viewing structures on the PDB beta site. Detailed images are now available for nucleic acid-only structures. For all structures, custom-sized images can be generated in JPEG and TIFF formats using a new feature in the View Structure section of Structure Explorer.

Your comments on these new features are highly appreciated and should be mailed to notify@rcsb.org.

PDB to Host Booth and Users Lunch at the ACA Meeting

The PDB will be hosting an exhibit booth at the American Crystallographic Association's Annual Meeting in St. Paul, MN on July 22-27. At Booth 210, PDB members will be happy to answer your questions. We will also be having a PDB Users Meeting on July 24 in rooms 7, 8, and 9 at noon. We look forward to meeting you.

PDB Newsletter Issue 6 Now Available

Issue 6 of the PDB newsletter is now available from the PDB Web site. This issue archives the breaking news items published over the last three months and contains summaries of the current status of Data Deposition and Processing, Data Uniformity, Data Query and Reporting, Access, and Outreach.

27-Jun-2000

PDB to be updated on July 4, 2000

The PDB Web and FTP server is updated weekly on Tuesday afternoons, making the updated information available to the public no later than 1:00am Pacific Standard Time on Wednesday mornings. As Tuesday, July 4, 2000, falls on a US holiday, that week's update will take place on Monday, July 3, and the updated information will be available to PDB users by 1:00am on Tuesday, July 4. The regular Tuesday-afternoon update schedule will resume on July 11, 2000.
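The scheduling rule in this notice (update on Tuesday, or on the preceding Monday when Tuesday is a holiday) can be written as a short Python sketch. The function name and the way holidays are supplied are assumptions for illustration; the notice itself only describes the July 4 case.

```python
from datetime import date, timedelta


def update_day(week_tuesday: date, holidays: set[date]) -> date:
    """Return the day on which that week's update is run.

    Assumes the rule illustrated in the notice: the update moves to
    Monday when the regular Tuesday falls on a holiday.
    """
    if week_tuesday.weekday() != 1:  # Monday is 0, Tuesday is 1
        raise ValueError("expected a Tuesday")
    if week_tuesday in holidays:
        return week_tuesday - timedelta(days=1)
    return week_tuesday


holidays_2000 = {date(2000, 7, 4)}
print(update_day(date(2000, 7, 4), holidays_2000))   # shifted to Monday, July 3
print(update_day(date(2000, 7, 11), holidays_2000))  # regular Tuesday
```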

20-Jun-2000

PDB Launches a Deposition Mirror Site at Osaka University

The PDB has established a deposition mirror site at the Institute for Protein Research at Osaka University in Osaka, Japan. The AutoDep Input Tool (ADIT) is now open to accept PDB depositions at http://pdbdep.protein.osaka-u.ac.jp/adit/ as well as at http://pdb.rutgers.edu/adit/. The Osaka ADIT mirror has all of the features available from the main ADIT server, including automatic ID assignment. ADIT was developed by the RCSB to provide a simple method for depositing structure data. Entries deposited at this site will be forwarded to the PDB for processing and inclusion in the database.

6-Jun-2000

PDB Plans for Releasing Data from the Uniformity Project

The goal of the PDB Data Uniformity project is to maintain the greatest possible consistency within the entire archive. Uniformity is a key prerequisite for any meaningful query or systematic analysis of the archive. Two complementary methods have been used to update and unify the data in the PDB archive.

The Data Uniformity project began by examining individual entries within groups of chemically related structures. During the PDB's long history the PDB format has undergone a number of changes. In the file-by-file uniformity processing, each entry is brought up to the current PDB format standard. This includes adding records that were not present in early entries where possible, correcting outstanding reported problems, and providing standard nomenclature. Each file is rechecked using our current validation software. Approximately 3000 entries containing nucleic acids, globins, retroviral proteases, and aspartic proteases have been processed in this way.

In addition to file-by-file uniformity procedures the Uniformity Project has also targeted key records within each PDB entry for archive-wide uniformity processing. This archive-wide approach has been used to update citation, R-factor, and resolution records. These results have been loaded into the PDB database where they can be accessed from one of the PDB query interfaces or viewed in the PDB Structure Explorer reports. Other records targeted for archive-wide uniformity processing include ligand descriptions, protein classification, sequence, and source data. Some of these records are now available on the PDB beta test web site. This work complements the data clean-up project undertaken by the MSD group at the European Bioinformatics Institute. In the future, all of the records resulting from the archive-wide uniformity processing will be updated in the PDB entries as part of the file-by-file uniformity procedure using the plan described below.

One of the important issues for PDB's on-going data processing, including its Data Uniformity project, is the management of multiple nomenclatures. Providing alternative nomenclatures within the PDB format is a well-recognized problem. In assuming the stewardship of the PDB archive, RCSB was charged with the responsibility for maintaining the greatest possible consistency within the entire archive. Unfortunately, uniformity considerations are often at odds with preferences of depositors who provide additional insight into the description of an entry that is outside traditional PDB practice. Recent discussions on the PDB list server regarding the assignment of chain identifiers to ligands and solvent provide important illustrations of this on-going problem.

In planning for the release of the entries from the various uniformity projects, PDB has sought a release scheme that would: (1) provide the flexibility to permit users of the archive to access alternative nomenclatures within the limitations of the existing PDB format, (2) integrate the results of archive-wide and file-by-file uniformity processing, and (3) preserve the integrity of the archival PDB format files available from the PDB ftp site. In consultation with the PDB Database Committee and the PDB Advisory Committee, we arrived at the following plan:

  • Data will continue to be distributed in the current PDB data format from the RCSB PDB FTP site. The nomenclature including chain ID assignment will continue to follow the rules previously described.
  • Data will also be distributed in mmCIF format from a new ftp area. mmCIF provides a detailed and fully parsable description of macromolecular structure and experiment. mmCIF is also equipped to deal with alternative nomenclatures. This mmCIF ftp area will be used to distribute the remediated data from the Data Uniformity project, and to distribute newly processed entries in mmCIF format with support for multiple nomenclatures.
  • Software tools to create PDB format files from the new mmCIF files will be provided. These software tools will permit users to select the particular nomenclature to be written to a PDB format file. For instance, it will be possible to create a PDB file using either PDB hydrogen or IUPAC hydrogen atom name conventions, or using either author or PDB chain ID conventions. An important benefit of this approach is that all of this flexibility can be provided using a single archival mmCIF file.
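The idea of writing different nomenclatures from a single archival record can be sketched in Python as follows. This is a hedged illustration, not the translation software mentioned in the plan: the dictionary keys (`pdb_chain_id`, `author_chain_id`, and so on) and the function name are hypothetical, and only the fixed-column ATOM layout of the PDB format is assumed.

```python
def format_atom_line(atom: dict, chain_convention: str = "pdb") -> str:
    """Write one PDB-format ATOM record from a record that stores
    both chain ID conventions (hypothetical keys, for illustration)."""
    if chain_convention not in ("pdb", "author"):
        raise ValueError("chain_convention must be 'pdb' or 'author'")
    chain = atom[chain_convention + "_chain_id"]
    if len(chain) != 1:
        raise ValueError("the PDB format holds a one-character chain ID")
    return (
        f"ATOM  {atom['serial']:5d} {atom['name']:<4s} "   # cols 1-17 (blank altLoc)
        f"{atom['res_name']:>3s} {chain}"                  # cols 18-22
        f"{atom['res_seq']:4d}    "                        # cols 23-30 (blank iCode)
        f"{atom['x']:8.3f}{atom['y']:8.3f}{atom['z']:8.3f}"
        f"{atom['occupancy']:6.2f}{atom['b_factor']:6.2f}"
    )


# One stored record, two possible outputs (values are made up).
atom = {
    "serial": 1, "name": " CA ", "res_name": "GLY", "res_seq": 1,
    "pdb_chain_id": "A", "author_chain_id": "H",
    "x": 11.104, "y": 6.134, "z": -6.504,
    "occupancy": 1.0, "b_factor": 12.5,
}

print(format_atom_line(atom, "pdb"))
print(format_atom_line(atom, "author"))
```

Both lines come from the same stored record; only the value written to column 22 differs, which is the kind of flexibility the plan attributes to a single archival mmCIF file.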

The new ftp area for mmCIF data will be implemented in the Fall. More details about this site will be provided in the near future.

Questions and comments should be sent to info@rcsb.org.

30-May-2000

RCSB Hosts CCPN Meeting

The RCSB was pleased to host the second workshop for the Collaborative Computational Project (CCP) for NMR on May 22, 2000 at NIST. The first workshop was held February 7-8, 2000 in Hinxton, UK. The CCP NMR (CCPN) project is funded by the Biotechnology and Biological Sciences Research Council (BBSRC) of the United Kingdom. The CCPN project aims to develop for the NMR community data exchange standards and software packages analogous to what the CCP4 project has developed for the crystallographic community. Further information about CCPN and workshop summaries are available at http://www.bio.cam.ac.uk/nmr/ccp/. The next CCPN meeting will be held in Florence, Italy in conjunction with the XIX International Conference on Magnetic Resonance in Biological Systems, August 20-25, 2000.

A Reminder to Bookmark and Use the Closest Mirror Site

The PDB primary site (http://www.rcsb.org/pdb) enjoys greater than 99% uptime. However, as a precaution and to balance load across PDB sites worldwide, we would suggest that users bookmark and use the PDB mirror sites. A recent brief loss of service at the primary site as a result of an Internet failure outside of the control of the PDB highlights this need.

We welcome any comments you have on the quality of service, particularly uptime and access to the mirror sites. Comments can be made by email to info@rcsb.org.

9-May-2000

PDB CD-ROM Set 92 Released

Issue 92 of the PDB CD-ROM set, containing 12,009 structure files on five disks, is now available. Files containing the experimental data (Structure factors and NMR constraints) are also now included in this distribution.

A new file included with this release is holdings.doc. This file lists the numbers of structures determined by various experimental techniques (X-ray, NMR, Theoretical) and by molecule type (Proteins, Peptides, and Viruses; Protein Nucleic Acid Complexes; Nucleic Acids; and Carbohydrates). This file also lists the number of experimental data files (X-ray and NMR) available. The holdings.doc file is found on CD-ROM disk 1 in directory pub.

The names of two directories have been changed with this release in order to provide a more obvious connection between directory names and directory contents. Entries is the new name for the old Distr directory that contains the main structure entries (coordinate files). Strucfac is the new name for the old Nonst directory that contains the structure factor data files.

The CD-ROM set is provided by the PDB to assist researchers who do not have good internet access to the primary PDB Website or any of its mirrors.

The PDB CD-ROM set is released quarterly at no charge. The CD-ROM set may be ordered on-line, by email, fax, or mail as follows:

Online orders: http://www.nist.gov/srd/nist80.htm
Email orders: srdata@nist.gov
Fax orders: (+1) 301 926 0416
Mail orders: RCSB/NIST, Mail Stop 2310, Gaithersburg, MD 20899-2310

2-May-2000

PDB File Format FAQ Now Available

A document that addresses questions about the PDB file format frequently posed by depositors and users over the past year is now available. The information in this document has been gathered from the PDB Contents Guide document originally created at Brookhaven National Laboratory, a careful study of existing files, an RCSB Workshop held in October 1998, and discussion with many users of the data. The guidelines presented here are those used by the annotation staff at the RCSB PDB.

This will be an evolving document. Questions, comments or suggestions about this document should be sent to format-faq@rcsb.rutgers.edu.

Alternative URL for PDB Site

The URL http://www.pdb.org/ now directs the Web browser to the home page for the primary PDB site at http://www.rcsb.org/pdb/. This alternative URL has been established to help new PDB users find the RCSB PDB Web site.

18-Apr-2000

Issue 5 PDB Newsletter Now Available

Issue 5 of the PDB newsletter is now available from the PDB Web site. This issue archives the breaking news items published over the last three months and contains summaries of the current status of Data Deposition and Processing, Data Uniformity, Data Query and Reporting, Access, and Outreach.

Automated Filing System for the PDB Master Archive

A computer-driven filing system is being installed at the PDB Master Archive site at NIST. The filing system comprises two carousel-type automated systems. The carousel systems are 10 feet high and each has 18 carriers, one with eight and the other with nine drawers per carrier, for a combined storage capacity of over 4500 linear filing feet. The physical systems balance ease of operation and improved access to files with complete security for the contents.

As the files are transferred to the new filing system they will be bar coded for future access. The PDB files contain correspondence with the authors in addition to the deposited data. While the data may be needed for historical reference or for checking the current files, the privacy of the authors' correspondence will be maintained.

11-Apr-2000

PDB Featured at BIO2000

The PDB saw extremely high and broad interest in biotechnology at the BIO2000 meeting held in Boston, Massachusetts during March 26-30. Attendance, anticipated at 5000, soared to over 10,000, with a focus on industry and business. The Protein Data Bank (PDB) was featured at an exhibit booth.

Many visitors to the booth saw the PDB website for the first time. From students to professors, pharmaceutical company representatives to software developers, all were impressed with the amount of free information and the simplicity of access to the data.

We look forward to meeting PDB users at future meetings.

4-Apr-2000

New Queries, New Reports

Several new query and reporting features previously available on the PDB beta test site have been moved to the PDB production sites. These features include:

Ligand Search Capability. The SearchFields interface now provides a 'Ligands and Prosthetic groups' field to enable queries for entries that contain a specific ligand. Ligands may be specified using either their common names or by the three-character identifiers found in the PDB het group dictionary.

Expanded Information on Ligands and Prosthetic groups. The Structure Explorer interface now provides a table of ligands and prosthetic groups (referred to as "HET groups") within a particular macromolecule. While the previous interface only provided non-polymeric groups, the new implementation also shows non-canonical residues within protein or nucleic acid chains. A hyperlink provides convenient access to a graphical representation of these groups within Rasmol.

Direct Hyperlink to Primary Citation. Where available, hyperlinks to Medline on the Structure Explorer summary page now point directly to the abstracts of the primary citations. The underlying information was collected and is maintained by the NIST data curation team within the RCSB. Note that the link to "Other Sources" within Structure Explorer holds an extensive set of links to all related publications within Medline.

Faster and Improved Access to Dynamic Links. Access to the very extensive set of cross-links provided via the Molecular Information Agent (MIA) has been considerably improved after feedback generated from the initial beta testing deployment. All information is now broken down into categories and the initial screen only provides access to resources with direct links possible via the PDB identifier. Hyperlinks to extended sets of links allow convenient access to all other links.

In addition, new SearchFields queries for data collection information, number of chains and source organism have now been implemented in the beta test site. See the beta news page for more information.

7-Mar-2000

New Ligand Search Capability Available for Beta Testing

The SearchFields interface on the PDB beta test site now provides a 'Ligands and Prosthetic groups' field to enable queries for entries that contain a specific ligand. Ligands may be specified using either their common names or by the three-character identifiers found in the PDB het group dictionary. An example of the SearchFields form with this option active is here.
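For readers who work with the coordinate files directly, the following is a minimal Python sketch (not part of any PDB software) of how the three-character identifiers can be read from HETATM records. It assumes the standard fixed-column layout, in which the residue name occupies columns 18-20. The function name and the sample records are hypothetical and invented for illustration.

```python
from collections import Counter


def het_group_counts(pdb_text):
    """Count HETATM atoms per three-character HET group identifier."""
    counts = Counter()
    for line in pdb_text.splitlines():
        # Columns 1-6 hold the record name; columns 18-20 hold the residue name.
        if line[0:6] == "HETATM":
            counts[line[17:20].strip()] += 1
    return counts


# Invented sample records, truncated after the residue sequence number.
HET_SAMPLE = "\n".join([
    "ATOM      1  N   MET A   1",
    "HETATM 1200  C1  RET A 216",
    "HETATM 1201  C2  RET A 216",
    "HETATM 1300  O   HOH   301",
])

print(dict(het_group_counts(HET_SAMPLE)))  # {'RET': 2, 'HOH': 1}
```

Note that solvent (HOH) is also stored in HETATM records, so a real ligand listing would probably filter it out.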

New Query and Reporting Features

These features include:

Capability for limiting queries to entries for which the experimental data was deposited. The SearchFields interface now contains an 'Experimental Data Availability' checkbox in the customizable section of the form. Checking this box and clicking the 'New Form' button creates a new form containing options that allow restriction of the search to only those entries that were deposited along with experimental data, namely structure factors and NMR restraint files.

Extensive links to other resources via incorporation of MIA. The static links previously available for each PDB entry from the Structure Explorer/Other Sources pages are replaced by a dynamically updated and far more extensive set of links created by the Molecular Information Agent (MIA). Additional information on MIA is available.

An expanded VRML interface for generating molecular images. The interface linked as 'VRML (custom options, full screen display)' from the Structure Explorer/View Structure pages has been enhanced. New features include options for scene annotation, for marking residues, drawing sites and symmetry copies.

Facile cross-linking of all files for NMR determined structures. Coordinate sets for structures determined by NMR may be stored in several different files. Specifically, if an averaged minimized structure is deposited to the PDB, this file is separate from the file(s) containing the ensemble members comprising the multiple structure solutions. The Structure Explorer page for these entries now provides cross-linking of these files.
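To tell an ensemble file from a single averaged structure when working with downloaded files, one possible approach is to count MODEL records. The Python sketch below is only an illustration and not part of the PDB site software. It assumes that ensemble members are delimited by MODEL/ENDMDL records, as in the PDB file format; the function name and the sample records are hypothetical.

```python
def count_models(pdb_text):
    """Return the number of MODEL records found in a PDB-format file."""
    return sum(
        1
        for line in pdb_text.splitlines()
        # The record name occupies columns 1-6; "ENDMDL" does not match.
        if line[0:6].strip() == "MODEL"
    )


# Invented two-member ensemble, truncated for brevity.
ENSEMBLE_SAMPLE = "\n".join([
    "MODEL        1",
    "ATOM      1  N   MET A   1",
    "ENDMDL",
    "MODEL        2",
    "ATOM      1  N   MET A   1",
    "ENDMDL",
])

print(count_models(ENSEMBLE_SAMPLE))  # 2
```

A file with no MODEL records at all gives a count of 0, which in this sketch is taken to mean a single coordinate set.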

Release of Cleaned-up Citation Data

As part of the data clean-up project led by the NIST-PDB team, the citation data available on the beta site will now be available on the main production sites.

All primary citations for PDB entries as of July 1999 have been validated or corrected if necessary. This work involved verifying all primary citation data values (title, authors, journal, year, volume, pages) against the published literature using either electronic or hardcopy journal resources. The procedure also involved presenting the citation data values in a uniform format. Whenever possible, links have been added to PUBMED. At present, the primary citations are more than 95% complete. Updating citations is an ongoing process, and we greatly appreciate input from the user community for further corrections.

You may send us corrections by mailing to the PDB help service.

22-Feb-2000

New PDB Mirror Site in Brazil

A new RCSB PDB mirror site has been established by the Universidade Federal de Minas Gerais/CENAPAD in Brazil. Following the addition of this new mirror site, there are now seven RCSB PDB sites available worldwide. The complete list of RCSB PDB mirrors is available here.

Proposal for CORBA standard for Macromolecular Structure Data Submitted

On February 11, 2000 the Research Collaboratory for Structural Bioinformatics submitted an initial technology proposal to the Object Management Group (OMG) that defines a CORBA interface for macromolecular structure data. The final CORBA specification, when accepted, will enable the PDB to publish a robust and efficient interface definition for use by programs and other databases accessing the PDB. The submission is based, in part, upon scientific nomenclature defined by the International Union of Crystallography.

The submission is available in Framemaker (as a gzipped tar file), Postscript and PDF format. Please note that these files are quite large and may take some time to download.

PDB at the Biophysical Society Annual Meeting

A presentation, award, and an exhibition booth were the focus of the PDB's attendance at the 44th Annual Biophysical Society Meeting in New Orleans, LA, February 12-16. Dr. Helen M. Berman, director of the PDB, presented a talk entitled "The Past, Present, and Future of the Protein Data Bank" at the Awards Symposium after receiving the Biophysical Society's Distinguished Service Award for her work with structural databases.

PDB members were able to meet with users one-on-one in the PDB's exhibition booth. Thanks to everyone who stopped by with their questions, input, and support! The next PDB exhibit booth will be at the American Crystallographic Association's Annual Meeting in St. Paul, MN in July.

8-Feb-2000

New Publication Analyzing the PDB

A paper entitled "An analysis of the Protein Data Bank in search of temporal and global trends" has been published (H. Weissig and P.E. Bourne, Bioinformatics, 15:807-831 (1999)).

1-Feb-2000

New PDB Query and Reporting Capability Available for Beta Testing

The public RCSB PDB beta test site (http://beta.rcsb.org/pdb/) has been updated to include:

  1. Capability for limiting queries to entries for which the experimental data was deposited
  2. Extensive links to other resources via incorporation of MIA
  3. An expanded VRML interface for generating molecular images
  4. Facile cross-linking of all files (average minimized structure, multiple files containing ensembles) for structures determined by NMR

These options are described in greater detail on the beta site news page.

Cleaned-up Citation Data Available on Beta Site

As part of the data clean-up project led by the NIST-PDB team, the citation data available through the SearchFields and Structure Explorer interfaces is now up-to-date for almost all structures in the PDB.

All primary citations for PDB entries as of July 1999 have been validated or corrected if necessary. This work involved verifying all primary citation data values (title, authors, journal, year, volume, pages) against the published literature using either electronic or hardcopy journal resources. The procedure also involved presenting the citation data values in a uniform format. Whenever possible, links have been added to PUBMED. At present, the primary citations are more than 95% complete. Updating citations is an ongoing process, and we greatly appreciate input from the user community for further corrections.

You may send us corrections by mailing to the PDB help service.

Assignment of Chain IDs in PDB Files

Over the last few weeks there have been numerous mailings to the PDB discussion listserver regarding the assignment of chain ids in PDB files. What follows is an informational message on current and past practices that was originally posted by John Westbrook to the PDB discussion group on 7-Jan-2000.

Chain IDs

Thank you all for your comments regarding the assignment of chain identifiers in PDB data files. Now that there are a variety of issues on the table we would like to take a moment to provide some background information pertinent to the rationale behind the procedures in current use for the assignment of these identifiers, and then indicate how a change might be brought about.

First, some background. When we assumed responsibility for the PDB we spent many months studying the data processing procedures that were in use at the time. The purpose of this effort was to develop data processing and annotation practices that were as consistent and as familiar as possible to those accustomed to using the database.

In our studies on the use of chain identifiers, we found that prior to 1996 structures with a single macromolecular chain did not receive a chain identifier, while from 1996 onward, essentially all macromolecular components of the structure were assigned chain identifiers. We chose to continue the latter practice when we took over the data processing, again in the interests of consistency.

Similarly, in developing procedures for the assignment of chain identifiers for ligands and solvent we have worked to stay faithful to prior practice as much as possible. Covalently bound ligands have generally been assigned the chain identifier of the macromolecule to which they are bound. Examination of the database suggests there is less consistency in chain identifier assignment for noncovalently bound ligands; however, by far the most common practice was to assign no chain identifier to these ligands. Traditionally, chain identifiers have not been assigned to solvent molecules. Thus current policy is to assign chain identifiers only to covalently bound ligands.
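As an illustration of how this policy appears in the files themselves, the Python sketch below collects the chain identifiers found for each residue name. It is a minimal sketch, not the procedure used by the annotation staff. It assumes the standard fixed-column layout of ATOM and HETATM records, with the residue name in columns 18-20 and the chain identifier in column 22. The function name and the sample records are hypothetical and invented for this example.

```python
def chain_ids_by_residue(pdb_text):
    """Map each residue name to the set of chain identifiers it carries.

    An empty string in a set means the record has a blank chain identifier.
    """
    result = {}
    for line in pdb_text.splitlines():
        if line[0:6].strip() not in ("ATOM", "HETATM"):
            continue
        res_name = line[17:20].strip()   # columns 18-20
        chain_id = line[21:22].strip()   # column 22, blank if unassigned
        result.setdefault(res_name, set()).add(chain_id)
    return result


# Invented records: a macromolecule atom, a ligand carrying the chain
# identifier of its macromolecule, and a solvent atom with no identifier.
CHAIN_SAMPLE = "\n".join([
    "ATOM      1  N   MET A   1",
    "HETATM 1200  C1  RET A 216",
    "HETATM 1300  O   HOH   301",
])

print(chain_ids_by_residue(CHAIN_SAMPLE))
# {'MET': {'A'}, 'RET': {'A'}, 'HOH': {''}}
```

Under the policy described above, one would expect solvent and noncovalently bound ligands to show up with the empty identifier.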

While we believe that maintaining consistency in the archive is a very important part of our role in managing the PDB resource, we also wish to acknowledge the different views that have been expressed about the current practice of chain identifier assignment. To summarize briefly what we understand from the recent discussions, it is being suggested that the assignment of chain identifiers should be permitted for solvent and noncovalently bound ligands. This could be used either to show a specific interaction with a macromolecule or to show some other relationship common to a group of ligands or solvent.

We want the PDB resource to serve the needs of the community as much as possible. We are always open to suggestions that lead to improved quality of the data and our services and encourage continued discussion on this particular point. If there is continuing strong support in the community for this change then we will propose this as a modification to the format in a future Format Advisory Notice.

11-Jan-2000

Issue 4 PDB Newsletter Now Available

Issue 4 of the PDB newsletter is now available from the PDB Web site. This issue archives the breaking news items published over the last three months and contains summaries of the current status of data deposition and processing, data uniformity, and data distribution.

1-Jan-2000

Happy New Year!

The Protein Data Bank would like to wish all of our users a very Happy New Year.