REAL Compound Libraries
In addition to the full REAL database, we provide a diverse set of 38.2 million compounds that represents the REAL drug-like space: compounds that comply with the “rule of 5” and Veber criteria (MW≤500, SlogP≤5, HBA≤10, HBD≤5, rotatable bonds≤10, and TPSA≤140), with PAINS and toxic compounds removed.
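The drug-like criteria above can be sketched as a simple property filter. This is a minimal illustration, not the pipeline used to build the set: it assumes the descriptors have already been computed by a cheminformatics toolkit, and the `Descriptors` container, the `is_real_drug_like` function, and the example values are hypothetical. PAINS and toxicity filtering are not covered here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed properties for one compound (hypothetical container)."""
    mw: float        # molecular weight
    slogp: float     # SlogP
    hba: int         # hydrogen-bond acceptors
    hbd: int         # hydrogen-bond donors
    rot_bonds: int   # rotatable bonds
    tpsa: float      # topological polar surface area


def is_real_drug_like(d: Descriptors) -> bool:
    """Apply the "rule of 5" and Veber thresholds quoted in the text."""
    return (
        d.mw <= 500
        and d.slogp <= 5
        and d.hba <= 10
        and d.hbd <= 5
        and d.rot_bonds <= 10
        and d.tpsa <= 140
    )


# Illustrative values only, not data from the REAL database.
small = Descriptors(mw=320.4, slogp=2.1, hba=5, hbd=2, rot_bonds=4, tpsa=78.0)
heavy = Descriptors(mw=612.7, slogp=5.8, hba=11, hbd=3, rot_bonds=12, tpsa=151.0)

print(is_real_drug_like(small), is_real_drug_like(heavy))
```

All bounds are treated as inclusive, matching the "≤" signs in the criteria.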
The diverse REAL drug-like set contains only compounds that have no analogs with a Tanimoto similarity above 0.6 (Morgan 2 fingerprint, 512 bits), either within the set or within the entire Enamine stock screening compound collection. We prepared the diverse REAL drug-like sets from the REAL drug-like set using the MaxMin algorithm.
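To make the selection idea concrete, here is a small greedy MaxMin sketch with a 0.6 Tanimoto cutoff. It is an illustration under stated assumptions rather than the procedure actually used: fingerprints are represented as sets of on-bit indices standing in for 512-bit Morgan 2 fingerprints, the function names `tanimoto` and `maxmin_pick` are hypothetical, and the toy fingerprints are invented. The additional comparison against the stock screening collection is not modeled.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def maxmin_pick(fps, n_pick, max_sim=0.6, seed_index=0):
    """Greedy MaxMin selection.

    Repeatedly picks the candidate whose most similar already-picked
    compound is least similar, and stops once every remaining candidate
    has a picked analog with similarity above max_sim.
    """
    if not fps or n_pick <= 0:
        return []
    picked = [seed_index]
    chosen = {seed_index}
    # nearest[i] = highest similarity of compound i to any picked compound
    nearest = [tanimoto(fp, fps[seed_index]) for fp in fps]
    while len(picked) < n_pick:
        best = None
        for i, sim in enumerate(nearest):
            if i in chosen:
                continue
            if best is None or sim < nearest[best]:
                best = i
        if best is None or nearest[best] > max_sim:
            break
        picked.append(best)
        chosen.add(best)
        for i, fp in enumerate(fps):
            nearest[i] = max(nearest[i], tanimoto(fp, fps[best]))
    return picked


# Toy fingerprints (invented on-bit indices).
fps = [
    frozenset({1, 2, 3, 4}),
    frozenset({1, 2, 3, 4, 5}),     # similarity 0.8 to compound 0
    frozenset({10, 11, 12}),
    frozenset({10, 11, 12, 13}),    # similarity 0.75 to compound 2
    frozenset({20, 21}),
]

print(maxmin_pick(fps, n_pick=5))
```

In this toy run, compounds 1 and 3 are never selected because each has a picked analog above the 0.6 cutoff. For real libraries a toolkit implementation of MaxMin picking would be the practical choice; the quadratic loop here is only meant to show the logic.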
REAL lead-like compounds
The lead-like subset of the REAL database was obtained by filtering the entire REAL database with the following molecular criteria: MW≤460, -4≤SlogP≤4.2, HBA≤9, HBD≤5, rings≤4, rotatable bonds≤10. Within this set, we have defined a “350/3” subset of compounds with the most stringent physicochemical profiles, giving them high potential for optimization: 270≤MW≤350, 14≤heavy atoms≤26, SlogP≤3, and aryl rings≤2. PAINS and toxic compounds were removed.
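The nesting of the “350/3” subset inside the lead-like set can be expressed as two tables of inclusive bounds. The sketch below is only illustrative: it assumes descriptors are supplied as a plain dictionary, and the key names, the `within` and `classify` helpers, and the example values are hypothetical rather than part of any REAL tooling.

```python
import math

# (lower, upper) inclusive bounds transcribed from the criteria above.
LEAD_LIKE = {
    "mw": (-math.inf, 460),
    "slogp": (-4, 4.2),
    "hba": (-math.inf, 9),
    "hbd": (-math.inf, 5),
    "rings": (-math.inf, 4),
    "rot_bonds": (-math.inf, 10),
}

SUBSET_350_3 = {
    "mw": (270, 350),
    "heavy_atoms": (14, 26),
    "slogp": (-math.inf, 3),
    "aryl_rings": (-math.inf, 2),
}


def within(props, bounds):
    """True if every bounded property lies inside its inclusive range."""
    return all(lo <= props[name] <= hi for name, (lo, hi) in bounds.items())


def classify(props):
    """Label a compound as not lead-like, lead-like, or lead-like 350/3."""
    if not within(props, LEAD_LIKE):
        return "not lead-like"
    if within(props, SUBSET_350_3):
        return "lead-like, 350/3"
    return "lead-like"


# Illustrative values only, not data from the REAL database.
compact = {"mw": 310.0, "slogp": 2.4, "hba": 4, "hbd": 1, "rings": 3,
           "rot_bonds": 5, "heavy_atoms": 22, "aryl_rings": 2}
larger = {"mw": 430.0, "slogp": 3.6, "hba": 6, "hbd": 2, "rings": 4,
          "rot_bonds": 7, "heavy_atoms": 31, "aryl_rings": 3}
greasy = {"mw": 410.0, "slogp": 4.8, "hba": 3, "hbd": 1, "rings": 3,
          "rot_bonds": 6, "heavy_atoms": 29, "aryl_rings": 2}

for name, props in [("compact", compact), ("larger", larger), ("greasy", greasy)]:
    print(name, "->", classify(props))
```

Keeping the thresholds in data rather than in code makes it straightforward to add or adjust a subset definition without touching the filtering logic.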
Enamine has a large fragment collection in stock. The REAL database expands this fragment space, allowing you to find novel fragments for your in-house collection as well as analogues of your hits. We prepared the REAL Fragment collection by applying “rule of 3” criteria (MW<300, SlogP≤3, HBA≤3, HBD≤3, rotatable bonds≤3, and TPSA≤60) to the entire REAL collection. We have also extracted a single pharmacophore subset that complies with even more stringent molecular selection criteria: 140≤MW≤230, 0≤SlogP≤2, 10≤heavy atoms≤16, rotatable bonds≤3, and chiral centers≤1. PAINS and toxic compounds were removed.
REAL compounds by chemical classes
Prefiltering the REAL database by distinct structural motifs that frequently come up in virtual screening significantly reduces computational time. We have created a number of REAL database subsets based on the presence of specific chemical moieties or pharmacophores in the compound structures. PAINS and toxic compounds were removed.
- REAL amino acids, 4.8M cpds, CXSMILES
- REAL carboxylic acids, MW≤400, clogP≤3, 57.7M cpds, CXSMILES
- REAL lead-like aliphatic carboxylic acids, 41.7M cpds, CXSMILES
- REAL lead-like aromatic carboxylic acids, 14M cpds, CXSMILES
- REAL lead-like aliphatic primary amines, 49.9M cpds, CXSMILES
- REAL lead-like aromatic primary amines, 220.8M cpds, CXSMILES
- REAL secondary amines, 8-21 heavy atoms, 59.4M cpds, CXSMILES
- REAL hydroxamates, 282K cpds, CXSMILES
- REAL terminal acetylenes, 151.3M cpds, CXSMILES
REAL natural product-like compounds
We have used the approach published by P. Ertl et al. to predict the natural product-likeness of the REAL compounds. The REAL natural product-like compounds comprise drug-like molecules with a positive natural product-likeness score.