What's New in this Release (September 13, 2011)
Explore the Archive
PDB-101: Educational Resources for Exploring a Structural View of Biology
Finding Protein Modifications
New Top Bar Searching: Category Options, Autocomplete Feature, Improved Macromolecule Name Queries and more
The top search bar has been redesigned to help users easily and intuitively create precise searches.
Typing in the top search bar displays an interactive pop-up box with suggestions of common related search terms, organized in different categories from different areas of the PDB. The autocomplete feature will make suggestions based on even a few letters.
For example, entering the word "human" in the top search bar will present several search term options organized by category (such as macromolecules, authors, organism). Each suggestion includes the number of results (in parentheses) and links to the set of matching structures.
Selecting the term from the appropriate category will give more precise results than the default simple text search. The results for Organism "Homo sapiens (human)", for instance, will not include entries by author "Human, J." or from organism "Human immunodeficiency virus", both of which would be returned by a simple text search for the word "human".
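The difference between a category search and a simple text search can be illustrated with a minimal sketch. The records, IDs, and function names below are made up for illustration and do not reflect how the RCSB PDB site stores or queries its data.

```python
# Toy records; the IDs and field values are hypothetical.
ENTRIES = [
    {"id": "E1", "organism": "Homo sapiens (human)", "authors": ["Smith, A."]},
    {"id": "E2", "organism": "Escherichia coli", "authors": ["Human, J."]},
    {"id": "E3", "organism": "Human immunodeficiency virus", "authors": ["Lee, K."]},
]


def text_search(entries, word):
    """Match the word anywhere in any field (simple text search)."""
    word = word.lower()
    hits = []
    for entry in entries:
        text = " ".join([entry["organism"], *entry["authors"]]).lower()
        if word in text:
            hits.append(entry["id"])
    return hits


def organism_search(entries, organism):
    """Match only entries whose organism field equals the chosen suggestion."""
    return [entry["id"] for entry in entries if entry["organism"] == organism]


print(text_search(ENTRIES, "human"))                     # all three toy entries
print(organism_search(ENTRIES, "Homo sapiens (human)"))  # only E1
```

In this toy data set the text search picks up the author "Human, J." and the virus entry, while the organism search returns only the entry from human.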
Furthermore, the provided suggestions enable searches that would not be possible with a simple text search. Typing "bird" in the search box provides a suggestion to get all PDB entries for birds (organisms classified below "Aves (birds)" in the taxonomy tree). Most of these entries do not contain the word "bird" in their actual text, and the results will not include entries from authors who just happen to have the last name "Bird".
The suggestion box provides matches from several data fields and associated classifications or ontologies, including: citation authors, macromolecule, organisms, chemical component names, various identifiers (PDB ID, chemical component, PubMed, UniProt), common words in the PDB text, enzyme and domain classifications, protein sequences, chemical formulas, and more.
For example, entering the text string "GNAAAAKKGSEQESVKEFLAKAKEDFLKKWETPSQN" in the search box will return possible BLAST search options. Typing "C14 O5 S" will direct the user to a chemical formula search for PDB ligands.
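How an input string might be routed to a sequence or formula search can be sketched as follows. This is only a guess at the kind of heuristic involved: the length cutoff, the token pattern, and the function name are assumptions, and the rules actually used by the site are not described here.

```python
import re

# The 20 standard amino acid one-letter codes.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")

# One formula token: a capital letter, an optional lowercase letter, and an
# optional count, e.g. "C14", "O5", "S", "Cl2". This checks only the shape of
# the token, not whether the symbol is a real element.
FORMULA_TOKEN = re.compile(r"[A-Z][a-z]?\d*")

MIN_SEQUENCE_LENGTH = 20  # hypothetical cutoff, chosen for this sketch


def guess_query_type(text):
    """Return "sequence", "formula", or "text" for a search box input."""
    text = text.strip()
    tokens = text.split()
    if (
        len(tokens) == 1
        and len(text) >= MIN_SEQUENCE_LENGTH
        and set(text.upper()) <= AMINO_ACIDS
    ):
        return "sequence"
    if len(tokens) > 1 and all(FORMULA_TOKEN.fullmatch(t) for t in tokens):
        return "formula"
    return "text"


print(guess_query_type("GNAAAAKKGSEQESVKEFLAKAKEDFLKKWETPSQN"))
print(guess_query_type("C14 O5 S"))
print(guess_query_type("human"))
```

With these assumed rules, the long amino acid string is treated as a sequence, "C14 O5 S" as a formula, and "human" as plain text.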
The suggestion box first displays a few options from each category, ranked by similarity and the number of matches. If the first suggestions shown aren't enough, selecting the "more" link in a category will provide additional choices.
For example, typing "coli" in the search box will first return the more popular variants of "E. coli". Selecting the "more" option will display the more rarely found organism "Colinus virginianus (northern bobwhite)". Users may also choose to search directly and bypass the provided suggestions. To retrieve all structures with citation author "Korkegian" (with or without middle initial), click on the "Find all" link below the "Author" suggestions or select the magnifying glass icon to run the search.
Top Bar Searching: Specific Categories
Searches can also be limited to specific categories by selecting the Author, Macromolecule, Sequence, or Ligand option.
By clicking on Author above the search box, users will limit their searches (and autosuggestions) to primary citation authors.
By clicking on Macromolecule above the search box, users will limit their searches (and autosuggestions) to macromolecule names, using an improved name search.
This new search is based on names of macromolecules as defined by sequence reference databases (like UniProt) and their cross-references to the PDB.
For example, entering "prothrombin" will return all PDB entries which have cross-references to UniProt entries with that name. Users can then utilize the drill-down functionality to focus on a particular organism or other category of prothrombin entries.
The "Sequence" search mode will offer BLAST search options based upon the sequence entered (e.g., "GNAAAAKKGSEQESVKEFLAKAKEDFLKKWETPSQN").
The link for "additional sequence options" connects to the Advanced Search options for sequence searching.
Enter a small molecule name or 3-character code in the "Ligand" search mode to view the corresponding Ligand Summary page. This page gives an overview of the chemical component, and links to the PDB entries that contain it.
Select the "additional ligand options" link to search for ligands by specifying Chemical structure, Name, or Formula in the Chemical Component Search.
Top Bar Searching: Browse
The Browse Trees option has been relocated for accessibility and consistency.
The Browse icon in the right section of the top search bar directs to the interface for browsing different ontology trees (taxonomy, GO terms, Enzyme classification, SCOP/CATH, Protein Modifications, and more) that cross-reference the PDB.
Search for nodes in these trees based on their textual description and link to the PDB entries associated with a node (and its subtree) with a single click.
For instance, typing "birds" in the "Source Organism" taxonomy tree selects the position in the taxonomy hierarchy for "Aves (birds)". Users can browse the tree to see bird species with structures in the PDB, or click directly on that node to retrieve all structures from birds.
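Retrieving the entries for a node and its subtree can be sketched with a small tree traversal. The miniature taxonomy, the entry IDs, and the function names below are hypothetical and serve only to show the idea.

```python
# Hypothetical miniature taxonomy: each node maps to its child nodes.
CHILDREN = {
    "Aves (birds)": [
        "Gallus gallus (chicken)",
        "Colinus virginianus (northern bobwhite)",
    ],
    "Gallus gallus (chicken)": [],
    "Colinus virginianus (northern bobwhite)": [],
}

# Hypothetical entry IDs attached directly to each organism node.
ENTRIES_BY_NODE = {
    "Gallus gallus (chicken)": ["X1", "X2"],
    "Colinus virginianus (northern bobwhite)": ["X3"],
}


def subtree(node):
    """Yield the node and all of its descendants (iterative depth-first)."""
    stack = [node]
    while stack:
        current = stack.pop()
        yield current
        stack.extend(CHILDREN.get(current, []))


def entries_under(node):
    """Collect the entries associated with a node and its whole subtree."""
    ids = set()
    for name in subtree(node):
        ids.update(ENTRIES_BY_NODE.get(name, []))
    return sorted(ids)


print(entries_under("Aves (birds)"))
```

None of the toy entries needs to contain the word "bird" in its own text; membership in the subtree below "Aves (birds)" is what selects it.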
Most RCSB PDB search functions, including the top bar searches, autocomplete suggestions, and classification tree browser, are implemented using Advanced Searches of various types.
The Advanced Search interface offers full access to all possible searches and their parameters to help users build complex queries combining different types of searches.
For example, users can combine a "Sequence (BLAST/FASTA/PSI-BLAST)" search for the sequence "GNAAAAKKGSEQESVKEFLAKAKEDFLKKWETPSQN" with a "Chemical Formula" search for "Cl2 S" to retrieve any structures with proteins that match the given sequence and contain ligands with two chlorine atoms and one sulfur atom in their chemical formula.
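Combining two searches amounts to intersecting their result sets. The sketch below shows this with made-up data: the ligand codes, entry IDs, sequence hits, and the matching rule (the requested element counts must be equal in the ligand formula) are all assumptions, not a description of how the Advanced Search is implemented.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Turn a formula such as "Cl2 S" into element counts: Cl -> 2, S -> 1."""
    counts = Counter()
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return counts


# Hypothetical ligands, entries, and sequence search results.
LIGAND_FORMULAS = {"LG1": "C6 H4 Cl2 O2 S", "LG2": "C2 H6 O"}
ENTRY_LIGANDS = {"E1": ["LG1"], "E2": ["LG2"], "E3": ["LG1", "LG2"]}
SEQUENCE_HITS = {"E1", "E2"}  # stands in for the result of a sequence search


def formula_hits(query):
    """Entries with at least one ligand whose counts match the query elements."""
    wanted = parse_formula(query)
    matching_ligands = {
        ligand
        for ligand, formula in LIGAND_FORMULAS.items()
        if all(parse_formula(formula)[el] == n for el, n in wanted.items())
    }
    return {
        entry
        for entry, ligands in ENTRY_LIGANDS.items()
        if matching_ligands & set(ligands)
    }


# Both conditions must hold, so the combined result is the intersection.
combined = SEQUENCE_HITS & formula_hits("Cl2 S")
print(sorted(combined))
```

In this toy data only E1 satisfies both the sequence condition and the formula condition.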
Sorting of Results
Results of a PDB text search or a sequence search are now ordered by the relevance of the term (for a text search) or by the alignment score (for a sequence search).
Sorting by "Relevance" is the default option in the "Sort by" drop-down list on the results page of such searches.
Text search relevance depends on the number of times a term was matched in the text, on how similar the matching word is to the search term, and on the significance of the PDB field in which the match was found. Entries where the term was found in the title or the classification will therefore come first in the results.
For example, typing the word "blood" in the top bar search box and clicking on the search button (ignoring any suggestions) runs a plain PDB text search that returns entries classified as "Blood Clotting" at the top of the results.
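The effect of field significance on ranking can be sketched with a simple weighted count. The weights, field names, and toy entries are hypothetical, and the sketch counts only exact word matches, so it ignores the word similarity component mentioned above. It is not the scoring formula used by the site.

```python
# Hypothetical weights: matches in the title or classification count more
# than matches in the remaining text.
FIELD_WEIGHTS = {"title": 3.0, "classification": 3.0, "text": 1.0}

# Toy entries with made-up IDs and contents.
ENTRIES = [
    {
        "id": "E1",
        "title": "Structure of a transport protein",
        "classification": "Transport Protein",
        "text": "purified from blood and stored in blood plasma",
    },
    {
        "id": "E2",
        "title": "Thrombin inhibitor complex",
        "classification": "Blood Clotting",
        "text": "inhibitor bound in the active site",
    },
    {
        "id": "E3",
        "title": "Lysozyme at high resolution",
        "classification": "Hydrolase",
        "text": "crystallized at room temperature",
    },
]


def relevance(entry, term):
    """Sum the weighted number of exact word matches over all fields."""
    term = term.lower()
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        occurrences = entry.get(field, "").lower().split().count(term)
        score += weight * occurrences
    return score


def ranked(entries, term):
    """Return the IDs of matching entries, most relevant first."""
    scored = [(relevance(entry, term), entry["id"]) for entry in entries]
    scored.sort(key=lambda pair: -pair[0])
    return [entry_id for score, entry_id in scored if score > 0]


print(ranked(ENTRIES, "blood"))
```

Here E2 matches "blood" only once, but in its classification, so it outranks E1, which matches twice in the lower-weighted text.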
New Structures Widget
Quickly find new entries, related structure articles, and unreleased articles in the "New Structures" widget, located on the top right section of the home page.
Query History and MyPDB
View recent searches and recent query results with the MyPDB Widget in the left-side menu. With these options, users can also modify or extend earlier queries.
Users can also log in to (or register for a new) MyPDB account to save queries for future reference, set up automatic weekly notifications for structures of interest, and store personal annotations about interesting structures.
Even for users who choose not to log in, the MyPDB widget allows access to their latest "Query Results" and their recent "Query History".
New Home Page Widget: Explore Archive
Tour the PDB archive by "drilling down" on significant properties of structures, like "Organism" and "Polymer type", with just a few clicks using the home page's new Explore Archive widget.
This widget applies the same drill-down options available from each set of search results to the contents of the entire archive. Click on the "Homo sapiens" link under Organism to view all structures from human. From the next set of drill-down options, select "DNA" to retrieve all solved structures that are purely human and purely DNA.
The "Explore Archive" widget also gives a quick statistical overview of the PDB. Users can browse the charts individually, or view them all together by clicking on the "Show all" link.
Molecule of the Month Structures in Search Results
Molecule of the Month articles are now linked to search results. Whenever a set of search results contains one of the structures discussed in a Molecule of the Month article, a summary of that article is displayed.
Mobile Access to Molecule of the Month Articles
A new link at the top of every Molecule of the Month article points to a downloadable ePub document that contains a complete copy of that article. These articles can then be viewed offline in an ePub reader application, such as the free iBooks application for iPhone, iPad, and iPod devices, or the Aldiko ePub reader for Android devices. ePub (Electronic Publication) is an open standard for digitally published documents.
PDB-101 Structure Focus: Summary Views of Highlighted Structures
PDB-101 Structure Focus pages highlight specific PDB entries discussed in a Molecule of the Month article. Each Focus provides a description that explains why it was selected as an example structure, and offers an interactive 3D representation of the structure, sequence display, ligand information, and links to any other articles discussed in the Molecule of the Month feature.
Alternative Interactive Molecular View for Mobile Devices
The PDB-101 Structure Summary page supports mobile device access by providing an alternative molecular view, since Java applets (and hence Jmol) are not normally supported on mobile devices. This interactive molecular view, which requires an HTML5-compliant web browser, enables the user to rotate the molecule about the Y-axis by dragging the image left and right.
Redesigned Educational Resources Page
The Educational Resources page has been organized into individual sections with tabs. Images and download links are now easier to find, and general readability has been improved.
RSS Feed on PDB-101 Page
Using the Protein Modification Browser
The Protein Modification Browser is built on the protein modification ontology (PSI-MOD) from the Proteomics Standards Initiative (http://www.psidev.info/). From here users can browse the protein residue modifications, view the number of associated PDB structures, and search for the specific associated structures.
- browse the protein residue modifications ontology
Click the grey triangle at the beginning of a term to see its children. The children will be highlighted in grey. Users can browse the hierarchy by clicking the grey triangles at each level.
Users can also enter a term or PSI-MOD ID in the text box to see its context in the ontology. The browser will expand to show its parents and children in the full hierarchy.
- view the number of associated PDB structures
Hover over a term to see the number of associated PDB structures in a tooltip. Clicking the term will issue a query and retrieve the PDB structures in the query results page.
- search for the specific associated structures.
Enter a term or PSI-MOD ID in the text box to find its context in the ontology tree. Autocomplete suggestions help users enter the term or ID more precisely. Then click the highlighted term to retrieve the associated PDB structures.
The image shows an example of searching for "glucosyl" and then selecting "N4-(N-acetylamino)glucosyl-L-asparagine (PSI-MOD:831)" from the autocomplete suggestion list. A tooltip shows the number of associated structures when hovering over the term.
Using the Protein Modification Advanced Search
Protein modifications represent pre-, co-, or post-translational modifications. More than 200 types of protein modifications have been collected from the RESID database, PSI-MOD protein modification ontology, and the wwPDB Chemical Components Dictionary, and mapped to PDB sequences. The protein modification advanced search has been further expanded in this release.
Two more sources (Name and Keyword) have been added. Users can specify the protein modification source type (Name, Keyword, RESID, PSI-MOD, and Chemical Component Dictionary) and the associated name/ID of the modification.
The image on the left shows an example search by Source PSI-MOD with ID MOD:00831. The number of associated structures is the same as the one shown in the tooltip in the tree browser above.
Redesigned File Download Page
The Download Files page, available from the Tools widget in the left hand menu, has been reorganized to make it easier to download:
- Coordinate and experimental data files in different formats
- FASTA file of sequences from PDB entries
- Chemical component files
- Other files available for download, including data files available via HTTP, FTP index files, theoretical models, and snapshots
The Protein Comparison Tool interface can now suggest PDB and chain IDs as well as SCOP domain IDs based on various user inputs. It supports searching by:
- PDB ID (e.g. 1cdg)
- SCOP ID (e.g. d1cdga1)
- SCOP classification ID (e.g. b.1.18)
- SCOP stable ID (e.g. 21816)
- text search (based on SCOP descriptions)
The Protein Comparison Tool has been extended to allow the pairwise alignment of SCOP domains. It automatically applies the SCOP domain definition for provided IDs, but otherwise behaves the same as the previous version, which uses whole chains for the alignment. View alignment of d4hhba_ vs d4hhbb_
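The input types listed above differ in shape, so a tool could tell them apart with a few patterns. The following is a rough sketch under that assumption: the patterns, their order, and the function names are illustrative guesses, not the Protein Comparison Tool's actual logic. It relies on the common SCOP convention that a domain ID is "d" followed by the 4-character PDB ID, a chain character, and a domain character.

```python
import re

# Patterns are tried in order; the first full match wins. An all-digit,
# 4-character input is ambiguous (PDB ID or stable ID) and is treated as a
# PDB ID here, which is an arbitrary choice made for this sketch.
PATTERNS = [
    ("SCOP ID", re.compile(r"d[0-9][a-z0-9]{3}[a-z0-9_.][a-z0-9_]")),
    ("SCOP classification ID", re.compile(r"[a-z]\.\d+(\.\d+){0,2}")),
    ("PDB ID", re.compile(r"[0-9][a-z0-9]{3}")),
    ("SCOP stable ID", re.compile(r"\d+")),
]


def guess_input_type(text):
    """Classify user input as one of the supported ID types or free text."""
    normalized = text.strip().lower()
    for name, pattern in PATTERNS:
        if pattern.fullmatch(normalized):
            return name
    return "text search"


def split_scop_id(scop_id):
    """Split a SCOP domain ID into (PDB ID, chain character, domain character)."""
    scop_id = scop_id.strip().lower()
    return scop_id[1:5], scop_id[5], scop_id[6]


for example in ["1cdg", "d1cdga1", "b.1.18", "21816", "hemoglobin"]:
    print(example, "->", guess_input_type(example))
print(split_scop_id("d4hhba_"))
```

For the alignment example above, `split_scop_id` separates d4hhba_ and d4hhbb_ into the same PDB ID, 4hhb, with chain characters a and b.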