SMILES/SMARTS
Searches for a chemical component (ligand) using a SMILES or a SMARTS string.
You may launch the chemical structure editor, based on the MarvinSketch applet, to provide or edit a SMILES or SMARTS string.
SMILES (Simplified Molecular Input Line Entry System) is a linear notation for describing chemical structures.
The SMILES search supports four search types:
Exact - finds an exact structure match
Substructure - finds ligands that contain the specified structure as a substructure
Superstructure - finds ligands that are substructures (fragments) of the specified structure
Similar - finds structures that bind similar ligands. Specify a similarity threshold in the range [0, 1] to set the required degree of similarity: 0 means dissimilar, 1 means identical.
The similarity is based on the number of chemical features in common between the query and the target molecule.
Similarity is calculated using the Tanimoto Coefficient.
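The Tanimoto coefficient is the number of features shared by two molecules divided by the number of features present in either. The following is a minimal sketch of that calculation and of how a similarity threshold filters results. It assumes each molecule has already been reduced to a set of features; the feature names, the `tanimoto` and `passes_threshold` helpers, and the treatment of two empty sets are illustrative assumptions, not the search service's actual implementation.

```python
def tanimoto(features_a: set, features_b: set) -> float:
    """Tanimoto coefficient: shared features / features present in either set."""
    if not features_a and not features_b:
        # Assumption made for this sketch: two empty feature sets count as identical.
        return 1.0
    shared = len(features_a & features_b)
    return shared / (len(features_a) + len(features_b) - shared)


def passes_threshold(query: set, target: set, threshold: float = 0.8) -> bool:
    """True when the target is at least as similar to the query as the threshold."""
    return tanimoto(query, target) >= threshold


# Hypothetical feature sets; a real search derives features from the structure.
query_features = {"ureido ring", "thiophane ring", "carboxylic acid", "alkyl chain"}
target_features = query_features | {"sulfoxide"}

# 4 shared features out of 5 distinct features in total.
score = tanimoto(query_features, target_features)
is_hit = passes_threshold(query_features, target_features, threshold=0.8)
```

Raising the threshold toward 1 narrows the results to near-identical ligands; lowering it admits more distant derivatives.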
SMARTS (SMiles ARbitrary Target Specification) is an extension of SMILES to specify a complex substructure pattern in a flexible way.
The SMARTS search supports the following query:
Substructure - finds ligands that match the SMARTS substructure pattern.
Polymeric type option
The polymeric type option can be used to specify the context where the chemical component is used:
Free: in entries where it is not part of a polymer chain of a protein or nucleic acid.
Polymeric: in entries where it is part of a polymer chain.
Any: in either of the cases listed above.
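The three options can be read as a simple filter on where a chemical component occurs. The sketch below illustrates that logic only; the `PolymericType` enum, the `matches_polymeric_type` helper, and the `occurrences` records are hypothetical names invented for this example and do not describe the search service's interface.

```python
from enum import Enum


class PolymericType(Enum):
    FREE = "Free"            # component is not part of a polymer chain
    POLYMERIC = "Polymeric"  # component is part of a polymer chain
    ANY = "Any"              # either case


def matches_polymeric_type(is_part_of_chain: bool, option: PolymericType) -> bool:
    """Return True when an occurrence of a component satisfies the chosen option."""
    if option is PolymericType.ANY:
        return True
    if option is PolymericType.POLYMERIC:
        return is_part_of_chain
    return not is_part_of_chain


# Hypothetical occurrences of one chemical component in two entries.
occurrences = [
    {"entry": "entry-1", "in_chain": False},
    {"entry": "entry-2", "in_chain": True},
]


def select(option: PolymericType) -> list:
    """Entries whose occurrence of the component satisfies the option."""
    return [o["entry"] for o in occurrences
            if matches_polymeric_type(o["in_chain"], option)]


free_hits = select(PolymericType.FREE)
polymeric_hits = select(PolymericType.POLYMERIC)
any_hits = select(PolymericType.ANY)
```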
SMILES Examples
- Similarity with a similarity threshold of 0.8
  C1C2C(C(S1)CCCCC(=O)O)NC(=O)N2 - SMILES string for biotin
  This query returns structures containing biotin, its stereoisomers, and various biotin derivatives, including BTN (biotin), BTQ (epi-biotin), BSO (biotin-d-sulfoxide), IMI (2-iminobiotin), SHM (homobiotin), SNR (norbiotin), etc.
- Exact match (without stereochemistry)
  C1C2C(C(S1)CCCCC(=O)O)NC(=O)N2 - SMILES string for biotin
  This query returns structures containing BTN (biotin) and its stereoisomer BTQ (epi-biotin).
- Exact match (with complete stereochemistry specified in the SMILES string)
  C1[C@H]2[C@@H]([C@@H](S1)CCCCC(=O)O)NC(=O)N2 - isomeric SMILES string for BTN (biotin)
  This query returns only structures containing a single stereoisomer: BTN (biotin).
  C1[C@H]2[C@@H]([C@H](S1)CCCCC(=O)O)NC(=O)N2 - isomeric SMILES string for BTQ (epi-biotin)
  This query returns only structures containing a single stereoisomer: BTQ (epi-biotin).
- Substructure match
  C1C2C(C(S1)CCCCC(=O)O)NC(=O)N2
  This query returns structures that contain ligands with a biotin substructure, e.g. biotin-d-sulfoxide and biotinyl-5-AMP.
  C1[C@H]2[C@@H]([C@@H](S1)CCCCC(=O)O)NC(=O)N2 - isomeric SMILES string for biotin
  This query returns structures that contain ligands with a biotin substructure that has the same stereochemistry within the substructure. For example, the stereoisomer epi-biotin is not found by this query.
- Substructure match where polymeric type is Polymeric (modified residues of proteins or nucleic acids)
  c1ccccc1
  This query returns structures that contain a benzene ring in a non-standard (modified) residue of their polymer chains.
SMARTS Examples
Please note that all SMARTS searches should be used with the "substructure" search type.
- Substructure search for compounds containing halogen atoms
  [F,Cl,Br,I]
- Substructure search for an amidinium group
  [NX3][CX3]=[NX3+]
- Substructure search for a carbonyl group, excluding amide
  (A),[$(C=O)!$(C(N)=O)]
- Substructure search for rings of size 8-10 with at least one carbon atom
  [C;r8,r9,r10]
- Substructure search for a hydrogen bond donor
  [!$([#6,H0,-1,-2,-3])]
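In a SMARTS atom expression, a comma means "or", so `[F,Cl,Br,I]` matches an atom that is fluorine, chlorine, bromine, or iodine. The sketch below illustrates only that one idea: it handles nothing but a bracketed, comma-separated list of element symbols and checks it against a list of atom symbols. It is not a SMARTS parser; the helper names and the atom lists are assumptions made for illustration, and real SMARTS matching requires a cheminformatics toolkit.

```python
def parse_symbol_list(pattern: str) -> set:
    """Parse a bracketed, comma-separated list of element symbols, e.g. "[F,Cl,Br,I]"."""
    if not (pattern.startswith("[") and pattern.endswith("]")):
        raise ValueError("expected a bracketed atom expression")
    symbols = pattern[1:-1].split(",")
    if not all(symbol.isalpha() for symbol in symbols):
        raise ValueError("only plain element symbols are supported in this sketch")
    return set(symbols)


def contains_any(atom_symbols: list, pattern: str) -> bool:
    """True when at least one atom symbol is one of the alternatives in the pattern."""
    allowed = parse_symbol_list(pattern)
    return any(symbol in allowed for symbol in atom_symbols)


halogens = "[F,Cl,Br,I]"

# Element symbols present in each molecule (hydrogens omitted).
biotin_atoms = ["C", "N", "O", "S"]
chlorobenzene_atoms = ["C", "Cl"]

biotin_has_halogen = contains_any(biotin_atoms, halogens)
chlorobenzene_has_halogen = contains_any(chlorobenzene_atoms, halogens)
```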
Background on the SMARTS notation and many examples can be found at Daylight Chemical Information Systems.


