
Introduction

A short online questionnaire was emailed to 1450 RCSB PDB CD-ROM subscribers in February 2004. The purpose of this instrument was to get to know the subscribers, solicit suggestions for improvement, ask their opinion on how well we are doing and gauge interest in a DVD data product. The questionnaire was well received and had a 23% response rate. Many respondents, in addition to answering the questions, provided additional information in the comments section. Several even took additional time to write us separately.

The responding subscribers are located in 42 different countries in seven regions of the world. A comparison of the geographic distribution of respondents with the CD subscription list shows that the respondents are generally representative of the geographic distribution of the subscribers. The European and Australian communities are slightly better represented while North America is slightly less represented in the responses.

Additional User Demographics
As expected, 99.4% of the respondents use the PDB data format; however, 7.4% also use the mmCIF data format. The full data releases are used by 77% of those responding, while the incremental updates are used by only 23%.

Many subscribers have good online access to the PDB.

  • more than 80% of the respondents use the PDB web site
  • 26% use the ftp site
  • 46% have a T1 or better connection to the Internet
  • 11% connect to the Internet through a modem at 56 kbps or less

A majority of the respondents, about 60%, work at educational institutions. A total of 63% of all the organizations are involved in research. More specifically, approximately 35% are educational institutions, 28% are research organizations, 25% have both an educational and research component and 12% were identified as commercial entities. The commercial organizations included companies involved in art, research, education, software development and medicine.

Data CD Use

The RCSB PDB data CDs contain various types of content. As expected, more than 90% of the subscribers use the PDB atomic coordinates. X-ray structure factors are used by 45% of respondents and NMR constraints by about 28%. Approximately 10% use the obsolete listing, 33% use the CD-ROM documentation and approximately 40% use included software. Some of the respondents added that they specifically use the secondary structure sections, annotations, seqres, and protein-ligand information from the data files. Also listed were educational resources on the web, information found on the web discussion list and the PDB Select list content not distributed on CD-ROM.

How many different people make use of the CD sent to a subscriber?
The number of people the data CDs reach is greater than the number of subscribers. Answers to how many additional people make use of a subscriber's CD range from 0 to hundreds. Those answering that hundreds were using the CD were often involved in teaching. The chart below provides a summary of how the respondents answered.

Which content is most important to users?
To understand how important various content and formats are, we asked the subscribers to rate a list of a dozen items on a five-point scale: Essential, Very Important, Important, Somewhat Important, or Not Important. A large majority of respondents, 77%, believe that the atomic coordinate data is essential. Slightly fewer (73%) believe that the PDB format is essential. The only item rated by the majority as "not important" is documentation in PC word processing format.

What should be considered in future products?
Many respondents expressed concern in various ways about the growing number of disks required to store the complete set of data. Interest in the development of a DVD product is high: 59% are very interested and 26% are moderately interested. A majority of respondents were also interested in the ability to download packaged data currently released on CD.

How are we doing?
We asked the subscribers "Are you satisfied with the overall usability of RCSB PDB CD-ROMS? If not, tell us how we can improve." Thirteen percent let us know that there is room for improvement.

2.5%   Not used as yet
84.5%   Satisfied
13%   Not Satisfied

User Comments and Suggestions

A compilation of comments from users about problems or suggested improvements is listed below. Each comment is followed by a brief response. Suggestions for improvements are being further reviewed and considered by RCSB PDB staff.

Comment: Add an index, preferably an html page.
Response: A top-level index to find specific data spread across six or more disks could be generated in either pdf or html format. Rudimentary html indexes were compiled by hand and included on the April 2004 incremental update currently in production. A more detailed index that includes compound information and classification requires the development of a program to assemble that information.

Comment: Organize the data files differently.
Response: The practice of naming directories with a two-letter code used to be convenient and meaningful. However, placing new data files in directories named with the 2nd and 3rd characters in the PDB ID has become confusing.

The placement of experimental data adds to the complexity of locating data. Because fewer requests are received for structure factor or NMR constraint files, this experimental data is placed on separate disk sets. This reduces CD production costs, but creates additional inconvenience.

We recognize that changing naming conventions would inconvenience those users who have developed automatic processes to load data from the CDs into databases. The reorganization of data into groups will need to be explored in more depth with input from our user community.
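
The directory convention described in this response can be illustrated with a minimal sketch. The function names are hypothetical, and lowercasing the ID is an assumption made for the example; the actual layout on a given CD release may differ.

```python
def pdb_subdirectory(pdb_id: str) -> str:
    """Return the two-character directory name for a four-character PDB ID.

    The name is taken from the 2nd and 3rd characters of the ID.
    Lowercasing is an assumption made for this illustration.
    """
    pdb_id = pdb_id.strip().lower()
    if len(pdb_id) != 4:
        raise ValueError(f"expected a four-character PDB ID, got {pdb_id!r}")
    return pdb_id[1:3]


def group_by_directory(pdb_ids):
    """Group PDB IDs by the directory each one would be placed in."""
    groups = {}
    for pdb_id in pdb_ids:
        groups.setdefault(pdb_subdirectory(pdb_id), []).append(pdb_id.strip().lower())
    return groups


if __name__ == "__main__":
    # Hypothetical IDs, used only to show the grouping.
    print(group_by_directory(["1abc", "2abd", "1xyz"]))
```

Because the directory name comes from the middle of the ID, entries with adjacent IDs usually land in different directories, which is one reason the scheme can be hard to browse by hand.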

Comment: Provide an interface
Response: No interface is provided on the data CDs because the purpose of the current product is to deliver PDB data on disk for those who cannot easily ftp data or who prefer to work with the data from disk. A growing minority of users, however, continue to request a different product that includes a user interface.

Comment: Provide a SEARCH engine or search feature
Response: Suggestions for a search feature reiterate previously expressed requests for a PDB-In-A-Box product, which is currently in development. Ways in which the In-A-Box product might connect, overlap, or interact with the current data CDs require further consideration.

Comment: Provide a guided tour or a tutorial.
Response: The PDB staff receives several queries a month from newcomers asking for assistance in understanding what the data files contain and how that data can be used. Frequently, first-time users expect an automatic program to run when they put the disk into the drive. Although the CD documentation was recently revised to include additional content descriptions, newcomers continue to ask for an expanded, easy, basic-level introduction to the data and how it can be used. Development of a guided tour or tutorial is under consideration.

Comment: Provide an automated way to move the data from a flat text file into a database.
Response: Some types of software for working with PDB data automatically reformat the data for use as needed by the program. However, some users must reformat the data themselves to fit specific needs. Determining if a general and useful tool can be developed to assist users who transfer PDB data into some type of database will require further study.
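
As a rough illustration of the kind of reformatting users do themselves, the sketch below reads the fixed-column ATOM and HETATM records of a PDB format file into an SQLite table. It is not an RCSB PDB tool: the table layout, the function names, and the sample record are assumptions made for the example, and only a subset of the record's fields is stored.

```python
import sqlite3

# Illustrative record built column by column; the values are made up
# and are not taken from a real entry.
SAMPLE_LINE = (
    "ATOM  " + "    1" + " " + " N  " + " " + "MET" + " " + "A" + "   1" + "    "
    + "  27.340" + "  24.430" + "   2.614"
)


def parse_atom_line(line):
    """Slice one fixed-column ATOM/HETATM record into a tuple."""
    return (
        int(line[6:11]),       # atom serial number (columns 7-11)
        line[12:16].strip(),   # atom name (columns 13-16)
        line[17:20].strip(),   # residue name (columns 18-20)
        line[21],              # chain identifier (column 22)
        int(line[22:26]),      # residue sequence number (columns 23-26)
        float(line[30:38]),    # x coordinate (columns 31-38)
        float(line[38:46]),    # y coordinate (columns 39-46)
        float(line[46:54]),    # z coordinate (columns 47-54)
    )


def load_atoms(lines, conn):
    """Insert every ATOM/HETATM record from `lines`; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS atom ("
        "serial INTEGER, name TEXT, res_name TEXT, chain TEXT, "
        "res_seq INTEGER, x REAL, y REAL, z REAL)"
    )
    rows = [
        parse_atom_line(line)
        for line in lines
        if line.startswith(("ATOM  ", "HETATM"))
    ]
    conn.executemany("INSERT INTO atom VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(load_atoms([SAMPLE_LINE, "REMARK not an atom record"], connection))
```

A general-purpose tool would also need to handle the other record types, alternate locations, and multi-model files, which is part of why the response above calls for further study.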

Comment: Include improved or additional software.
Response: We continue to get regular requests to add software and to support software currently included on the data CDs. It is beyond the scope of the CD product and beyond current resources to include and support additional software. Current development efforts are focused on the improvement of tools available from the website.

Comment: Use DVD to reduce the number of disks.
Response: As noted earlier, at least 85% of the respondents expressed some level of interest in a DVD product. A DVD product prototype, already scheduled for development, will be available in the last quarter of 2004.

Comment: Incremental updates are not popular with some users.
Response: Production costs continue to rise. Update disks were introduced as a cost-saving measure. As we work toward producing a DVD product, it may be possible to fit all the data onto a single disk and we may be able to return to producing full releases each quarter.

Comment: Provide help for problems with included software.
Response: Although we state in the CD documentation that we do not and cannot provide support for included software, we regularly receive requests for assistance with problems users experience when working with software included on the CDs. To address these concerns we will:
  • Identify and correct problems that may have been introduced in the disk mastering process.
  • Eliminate some problematic, out-of-date software currently provided.
  • Include an HTML document providing descriptions and links to software, tools and resources available on the Internet.

Comment: CDs ordered were not received.
Response: It is distressing to learn that subscribers requesting data CDs were missed or did not receive requested data. We are currently working toward improving the subscription fulfillment services. The online CD information page and order form have been changed to more clearly display product descriptions and subscription options. To facilitate communication, a new email address (pdbcd@rcsb.org) has been added to the documentation and web pages.

Comment: Perform a quick data review before CD production.
Response: Data distributed on CD comes directly from the PDB ftp site. Data uniformity and data reliability review is a continuous process. The CD mastering procedures include testing on both PC and Unix/Linux platforms. Testing procedures have been expanded to include the Mac platform. Please use info@rcsb.org to report questions or concerns about data.

Comment: Update pre-1997 PDB format data files.
Response: Adopted changes to the PDB Format (Version 2.0) do not appear in entries released before March 31, 1997. Currently there are no plans or resources available to convert entries released prior to 1997. However, all mmCIF format data files are completely updated. Software available on the PDB website allows users to take an mmCIF file and generate a file in PDB format. We are exploring how best to inform CD subscribers about this website resource.

Comment: Be consistent with data locations on subsequent releases.
Response: Data on the CDs is currently classified into four general categories: entries, models, X-ray structure factor experimental data, and NMR constraint experimental data. To use disk space effectively, files are arranged across disk sets. Recently, the supporting files and documents in the pub directory were augmented and reorganized. Data reorganization causes changes that inconvenience users who have automated data transfer into their own systems. This type of reorganization is not expected on a continuing basis. Our goal is to produce a well-organized and consistent product.

Summary

While the approval rating from 84.5% of our subscribers is very good, there is room for improvement. Problems are being addressed immediately. Suggestions for improvements are being reviewed and considered by RCSB PDB staff. Suggested improvements already being developed include a top-level index, PDB-In-A-Box, and a DVD data product. Additional questions or concerns about CD data products can be addressed to pdbcd@rcsb.org or to info@rcsb.org. The RCSB PDB staff thanks the user community for their response to the questionnaire.