Advanced Search: Overview
Advanced search lets you combine multiple searches on specific types of data with a logical AND or OR. The result is a list of structures that satisfy ALL or ANY of the search criteria, respectively. More complex logic will be provided in a later release of the RCSB PDB. The easiest way to understand how to use advanced search is to work through an example:
1. Choose a Query Type
This step presents a long list of possible query types organized by category. If you are not sure which one to pick, select one and a brief description will appear. Once a query type is selected, you can also select the question mark to get more help on that query. If it is not what you were seeking, select a different query type.
In this example, the query type UniProt Accession Number(s) is selected and a brief prompt is provided.
2. Get a Result Count
Click the button that says Result Count. This will provide the number of PDB structures that match the query.
In this example, 183 structures are found to contain the UniProt identifier P69905. Note that the query matches a sequence, but it is structures that are returned. That is, a structure may contain not only this sequence but other, different sequences as well. Every structure that contains the selected sequence is listed, irrespective of what else it contains.
The result count provides a sense of how focused the query has become and whether it needs to be refined further. At this point the user may select Submit Query to list the 183 structures that match the search criteria or further refine the query.
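The sequence-to-structure matching described above can be sketched in a few lines. This is only an illustration on toy data: the structure IDs (S001, S002, S003) and the function name are hypothetical, and the code does not call any RCSB PDB service.

```python
# Toy data (hypothetical structure IDs): each structure maps to the
# UniProt accessions of the chains it contains.
STRUCTURE_CHAINS = {
    "S001": {"P69905", "P68871"},  # two different sequences in one structure
    "S002": {"P69905"},
    "S003": {"P68871"},
}


def structures_with_accession(accession, structure_chains):
    """Return IDs of structures in which at least one chain has the accession."""
    return sorted(
        sid for sid, accessions in structure_chains.items()
        if accession in accessions
    )


hits = structures_with_accession("P69905", STRUCTURE_CHAINS)
print(hits)       # ['S001', 'S002']
print(len(hits))  # the "result count" for this toy data: 2
```

S001 is returned even though it also contains a second sequence, which mirrors the behavior described for the UniProt query.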
3. Refining the Query
Selecting the + adds another search criterion.
Here we have added a search (query type deposition date) for all structures deposited between January 1, 2008 and November 1, 2009. As before, the result count indicates the number of structures that match this single criterion. In this example, 10008 structures were deposited in that period.
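A deposition date criterion amounts to a range filter. The sketch below assumes that both end dates are included in the range, which the text above does not state; the structure IDs and dates are made up for illustration.

```python
from datetime import date

# Toy data: hypothetical structure IDs with made-up deposition dates.
DEPOSITED = {
    "S001": date(2007, 12, 31),
    "S002": date(2008, 1, 1),
    "S003": date(2009, 11, 1),
    "S004": date(2009, 11, 2),
}


def deposited_between(records, start, end):
    """Return IDs whose deposition date lies in [start, end] (assumed inclusive)."""
    return {sid for sid, deposited in records.items() if start <= deposited <= end}


in_range = deposited_between(DEPOSITED, date(2008, 1, 1), date(2009, 11, 1))
print(sorted(in_range))  # ['S002', 'S003']
```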
4. Final Query Result
At this point, additional search criteria could be added if desired. Alternatively, the Submit Query button returns all structures that were deposited in the selected time period AND contain the designated protein sequence from UniProt.
5. Sequence Filtering
Although it is not relevant here, since the structures selected all match a specific sequence, in other circumstances you can reduce the number of sequence-redundant structures returned by using the Remove Similar Sequences option.
In this example, the 10008 structures deposited between January 1, 2008 and November 1, 2009 are filtered such that if any polypeptide chain in a given structure matches any polypeptide chain in another structure at 90% or greater sequence identity, only one of those structures is displayed. For this query, the filter reduces the number of resulting structures from 10008 to 5523. Several sequence identity cutoffs can be selected.
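One way to picture this filtering is a greedy pass over the results. The sketch below assumes that every chain has already been assigned to a sequence cluster at the chosen identity cutoff (the cluster labels c1, c2, c3 are hypothetical) and that the first structure encountered is the one kept. How the RCSB PDB actually chooses which structure to display is not described here, so treat this as an illustration only.

```python
def remove_similar(structures):
    """Keep a structure only if none of its chain clusters was already kept.

    `structures` is a list of (structure_id, set_of_cluster_ids) pairs in
    display order. Cluster IDs are assumed to be precomputed at one
    sequence identity cutoff (for example 90%).
    """
    seen_clusters = set()
    kept = []
    for structure_id, clusters in structures:
        if clusters & seen_clusters:
            continue  # shares a cluster with a structure that was already kept
        kept.append(structure_id)
        seen_clusters |= clusters
    return kept


# Toy data: hypothetical structure IDs and cluster labels.
results = [
    ("S001", {"c1", "c2"}),
    ("S002", {"c1"}),        # similar to a chain in S001
    ("S003", {"c3"}),
    ("S004", {"c2", "c3"}),  # similar to chains in S001 and S003
]
print(remove_similar(results))  # ['S001', 'S003']
```

Because the pass is greedy, the structures that survive depend on the order of the input list.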
6. Changing the Logic
The default is an AND condition between the individual components of the query, which corresponds to the Match all condition.
The alternative is Match Any, which is a logical OR operator.
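In terms of result sets, Match all corresponds to a set intersection and Match Any to a set union. A minimal sketch on toy data follows; the function name, the `match` parameter, and the structure IDs are hypothetical.

```python
def combine(result_sets, match="all"):
    """Combine per-criterion result sets with AND ("all") or OR ("any")."""
    if not result_sets:
        return set()
    if match == "all":
        return set.intersection(*result_sets)
    if match == "any":
        return set.union(*result_sets)
    raise ValueError(f"unknown match mode: {match!r}")


# Toy result sets for two criteria (hypothetical structure IDs).
by_accession = {"S001", "S002"}
by_date = {"S002", "S003"}

print(sorted(combine([by_accession, by_date], match="all")))  # ['S002']
print(sorted(combine([by_accession, by_date], match="any")))  # ['S001', 'S002', 'S003']
```

Adding a criterion under Match all can only shrink or preserve the result list, while under Match Any it can only grow or preserve it.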
7. Selecting the Type of Results
Several types of searches related to chemical components, such as "Chemical Name" or "SMILES/SMARTS", provide the option to retrieve Ligand (short for chemical component) results instead of the default "Structure" results.
For example, selecting "Ligand" from the "Results" drop-down box for a Chemical Name search on the word "adenosine" retrieves the ligands that match the search criteria instead of the PDB structures that contain those ligands.
It is possible to use this option even in composite searches as long as one of the sub-queries relates to chemical components.
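The difference between the two result types can be sketched as follows. The data are a toy illustration: the structure IDs are hypothetical, the ligand entries are examples only, and matching by case-insensitive substring is an assumption rather than a description of how the Chemical Name search works.

```python
# Toy data: illustrative ligand IDs and names, hypothetical structure IDs.
LIGAND_NAMES = {
    "ADN": "ADENOSINE",
    "ATP": "ADENOSINE-5'-TRIPHOSPHATE",
    "HEM": "PROTOPORPHYRIN IX CONTAINING FE",
}
STRUCTURE_LIGANDS = {
    "S001": {"HEM"},
    "S002": {"ATP"},
    "S003": {"ADN", "HEM"},
}


def chemical_name_search(term, results="Structure"):
    """Return matching ligands, or the structures that contain them."""
    # Assumption: a ligand matches if the term occurs anywhere in its name.
    matching = {
        ligand_id for ligand_id, name in LIGAND_NAMES.items()
        if term.lower() in name.lower()
    }
    if results == "Ligand":
        return sorted(matching)
    if results == "Structure":
        return sorted(
            sid for sid, ligands in STRUCTURE_LIGANDS.items()
            if ligands & matching
        )
    raise ValueError(f"unknown result type: {results!r}")


print(chemical_name_search("adenosine", results="Ligand"))     # ['ADN', 'ATP']
print(chemical_name_search("adenosine", results="Structure"))  # ['S002', 'S003']
```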