PDB Focus: Redundancy Reduction Cluster Data Available on the PDB FTP Site
The results of the weekly clustering of protein chains in the
PDB are posted at ftp://ftp.rcsb.org/pub/pdb/derived_data/NR/. These clusters are used in the "remove similar sequences" feature on SearchLite, SearchFields, and the home page on the RCSB PDB Web sites.
Files that list the clusters and their rankings at 50%, 70% and
90% sequence identity are available. Smaller rank numbers indicate
higher (better) ranking. The chain with rank number 1 is the best
representative of its cluster.
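
As an illustration of how the rankings might be used, the sketch below picks the best-ranked chain from each cluster. The file layout is an assumption, not taken from the README: each line is assumed to hold three whitespace-separated fields (cluster number, rank, chain identifier). The function name best_representatives and the sample chain identifiers are made up for the example, so check the README for the real layout before adapting it.

```python
from typing import Dict, Iterable, Tuple


def best_representatives(lines: Iterable[str]) -> Dict[str, str]:
    """Map each cluster to its lowest-ranked (best) chain.

    ASSUMED line layout (hypothetical, verify against the README):
        <cluster number> <rank> <chain identifier>
    """
    best: Dict[str, Tuple[int, str]] = {}
    for line in lines:
        fields = line.split()
        # Skip blank lines, headers, or anything not matching the assumed layout.
        if len(fields) != 3 or not fields[1].isdigit():
            continue
        cluster_id, rank_text, chain_id = fields
        rank = int(rank_text)
        # Smaller rank numbers indicate better ranking.
        if cluster_id not in best or rank < best[cluster_id][0]:
            best[cluster_id] = (rank, chain_id)
    return {cluster_id: chain for cluster_id, (_, chain) in best.items()}


# Made-up sample data in the assumed layout; not real PDB cluster output.
sample = """\
1 1 1ABC:A
1 2 2XYZ:B
2 2 3DEF:A
2 1 4GHI:C
"""

representatives = best_representatives(sample.splitlines())
print(representatives)
```

To use this on a downloaded cluster file, pass an open file object in place of sample.splitlines(), provided the file follows the assumed layout.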
The contents of these files and the details of the clustering and
ranking are further described at
ftp://ftp.rcsb.org/pub/pdb/derived_data/NR/README and www.rcsb.org/pdb/redundancy.html.