6WN7

Homo sapiens S100A5


Experimental Data Snapshot

  • Method: X-RAY DIFFRACTION
  • Resolution: 1.25 Å
  • R-Value Free: 0.206 
  • R-Value Work: 0.171 
  • R-Value Observed: 0.186 


This is version 1.3 of the entry.


Literature

Learning peptide recognition rules for a low-specificity protein.

Wheeler, L.C., Perkins, A., Wong, C.E., Harms, M.J.

(2020) Protein Sci 29: 2259-2273

  • DOI: https://doi.org/10.1002/pro.3958
  • Primary Citation of Related Structures:  
    6WN7

  • PubMed Abstract: 

    Many proteins interact with short linear regions of target proteins. For some proteins, however, it is difficult to identify a well-defined sequence motif that defines their target peptides. To overcome this difficulty, we used supervised machine learning to train a model that treats each peptide as a collection of easily-calculated biochemical features rather than as an amino acid sequence. As a test case, we dissected the peptide-recognition rules for human S100A5 (hA5), a low-specificity calcium-binding protein. We trained a Random Forest model against a recently released, high-throughput phage display dataset collected for hA5. The model identifies hydrophobicity and shape complementarity, rather than polar contacts, as the primary determinants of peptide binding specificity in hA5. We tested this hypothesis by solving a crystal structure of hA5 and through computational docking studies of diverse peptides onto hA5. These structural studies revealed that peptides exhibit multiple binding modes at the hA5 peptide interface, all of which have few polar contacts with hA5. Finally, we used our trained model to predict new, plausible binding targets in the human proteome. This revealed a fragment of the protein α-1-syntrophin that binds to hA5. Our work helps better understand the biochemistry and biology of hA5, as well as demonstrating how high-throughput experiments coupled with machine learning of biochemical features can reveal the determinants of binding specificity in low-specificity proteins.
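
    As a rough illustration of what treating a peptide as "a collection of easily-calculated biochemical features" might look like, the sketch below turns a sequence into a few numeric descriptors. The function name `peptide_features` and the choice of features (mean Kyte-Doolittle hydropathy, a simple count-based net charge, aromatic fraction) are assumptions made for this example; they are not the feature set used in the cited paper, and no model training is shown.

```python
# Kyte-Doolittle hydropathy scale (Kyte & Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def peptide_features(seq):
    """Hypothetical featurizer: maps a peptide to simple descriptors.

    Illustrative only; the paper's actual features are not reproduced here.
    """
    seq = seq.upper()
    if not seq or any(aa not in KD for aa in seq):
        raise ValueError("sequence must contain only the 20 standard residues")
    n = len(seq)
    positive = sum(aa in "KR" for aa in seq)  # Lys, Arg
    negative = sum(aa in "DE" for aa in seq)  # Asp, Glu
    return {
        "length": n,
        "mean_hydropathy": sum(KD[aa] for aa in seq) / n,
        "net_charge": positive - negative,  # crude count, ignores pH and His
        "aromatic_fraction": sum(aa in "FWY" for aa in seq) / n,
    }
```

    A feature dictionary like this could be stacked into a table and passed to any supervised learner, which is the general idea the abstract describes.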


  • Organizational Affiliation

    Institute of Molecular Biology, University of Oregon, Eugene, Oregon, USA.


Macromolecules
Entity ID: 1
  • Molecule: Protein S100-A5
  • Sequence Length: 95
  • Organism: Homo sapiens
  • Details: Mutation(s): 2
  • Gene Names: S100A5, S100D
UniProt & NIH Common Fund Data Resources
  • UniProtKB: P33763 (Homo sapiens)
  • PHAROS: P33763
  • GTEx: ENSG00000196420
Entity Groups  
  • UniProt Group: P33763
Small Molecules
Ligands: 1 unique
  • ID: CA
  • Name: CALCIUM ION
  • Formula: Ca
  • InChI Key: BHPQYMZQTOCNFJ-UHFFFAOYSA-N
  • Chains: G [auth A], H [auth A], I [auth A], J [auth A], K [auth B], L [auth B], M [auth B], N [auth C], O [auth C], P [auth C], Q [auth D], R [auth D], S [auth D], T [auth F], U [auth F], V [auth F], W [auth E], X [auth E]
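
Each ligand chain label in this listing pairs a PDB-assigned chain ID with the author-assigned chain of the associated protein, given in brackets. As a small sketch of how the listing can be tallied, the snippet below counts calcium ions per author chain. The variable names are illustrative, and the label string is simply copied from the listing for this entry.

```python
import re
from collections import Counter

# Ligand chain labels as listed for CA in this entry.
labels = (
    "G [auth A], H [auth A], I [auth A], J [auth A], "
    "K [auth B], L [auth B], M [auth B], "
    "N [auth C], O [auth C], P [auth C], "
    "Q [auth D], R [auth D], S [auth D], "
    "T [auth F], U [auth F], V [auth F], "
    "W [auth E], X [auth E]"
)

# Each match is (PDB-assigned chain ID, author chain ID).
pairs = re.findall(r"(\w+) \[auth (\w+)\]", labels)

# Number of calcium ions associated with each author chain.
ions_per_chain = Counter(auth for _, auth in pairs)

for chain in sorted(ions_per_chain):
    print(chain, ions_per_chain[chain])
```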
Experimental Data & Validation

Experimental Data

  • Space Group: P 32
  • Unit Cell:
      Length (Å): a = 76.28, b = 76.28, c = 84.24
      Angle (°): α = 90, β = 90, γ = 120
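
The cell lengths and angles listed here fix the unit-cell volume. As a minimal sketch, the general triclinic formula V = abc·sqrt(1 − cos²α − cos²β − cos²γ + 2·cosα·cosβ·cosγ) can be evaluated directly; the function name `unit_cell_volume` is an assumption for this example and is not part of any PDB tooling.

```python
import math


def unit_cell_volume(a, b, c, alpha, beta, gamma):
    """Unit-cell volume in cubic angstroms; lengths in angstroms, angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


# Cell parameters reported for this entry.
volume = unit_cell_volume(76.28, 76.28, 84.24, 90, 90, 120)
print(f"{volume:.0f} cubic angstroms")
```

With α = β = 90° and γ = 120°, the expression reduces to a·b·c·sin(γ), which gives a volume of roughly 4.2 × 10⁵ Å³ for this cell.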
Software Package:
  • PHENIX: refinement
  • MOSFLM: data reduction
  • PHASER: phasing
  • SCALA: data scaling



Entry History 

Revision History

  • Version 1.0: 2020-09-30
    Type: Initial release
  • Version 1.1: 2020-10-07
    Changes: Database references
  • Version 1.2: 2020-11-11
    Changes: Database references
  • Version 1.3: 2023-10-18
    Changes: Data collection, Database references, Refinement description