REAL database
The largest enumerated database of synthetically feasible molecules
The REAL Database is one of the ways to explore the REAL Compounds. It is a classical database of enumerated structures, designed for finding new hit molecules through large-scale virtual screening and for searching for analogs of your hit molecules. The REAL Database is available in SMILES and SDF formats and is searchable on Enaminestore.
The current release of the REAL database comprises over 4.5 billion molecules that comply with the “rule of 5” and Veber criteria: MW≤500, SlogP≤5, HBA≤10, HBD≤5, rotatable bonds≤10, and TPSA≤140.
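The thresholds above can be expressed as a simple filter. The following is a minimal sketch in Python, assuming the six descriptors have already been computed with a cheminformatics toolkit of your choice; the `Descriptors` class and the function names are illustrative and are not part of the REAL database or any Enamine tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed descriptors for one molecule (hypothetical container)."""
    mw: float             # molecular weight
    slogp: float          # SlogP
    hba: int              # hydrogen-bond acceptors
    hbd: int              # hydrogen-bond donors
    rotatable_bonds: int  # rotatable bond count
    tpsa: float           # topological polar surface area


# Upper limits quoted in the text ("rule of 5" plus Veber criteria).
LIMITS = {
    "mw": 500.0,
    "slogp": 5.0,
    "hba": 10,
    "hbd": 5,
    "rotatable_bonds": 10,
    "tpsa": 140.0,
}


def failed_criteria(d: Descriptors) -> list[str]:
    """Return the names of all descriptors that exceed their limit."""
    return [name for name, limit in LIMITS.items() if getattr(d, name) > limit]


def passes_real_filters(d: Descriptors) -> bool:
    """True when every descriptor is at or below its limit."""
    return not failed_criteria(d)
```

Returning the list of failed criteria, rather than only a boolean, makes it easier to see why a candidate analog falls outside the property space covered by the database.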
The REAL database is provided in CXSMILES (extended SMILES that can represent special features of molecules, such as enhanced stereochemistry). The database is split into parts, with molecules sorted by heavy atom count (HAC):
- REAL database, 4.5Bn cpds, HAC 6-21, CXSMILES
- REAL database, 4.5Bn cpds, HAC 22-23, CXSMILES
- REAL database, 4.5Bn cpds, HAC 24, CXSMILES
- REAL database, 4.5Bn cpds, HAC 25, CXSMILES
- REAL database, 4.5Bn cpds, HAC 26 Part 1, CXSMILES
- REAL database, 4.5Bn cpds, HAC 26 Part 2, CXSMILES
- REAL database, 4.5Bn cpds, HAC 27 Part 1, CXSMILES
- REAL database, 4.5Bn cpds, HAC 27 Part 2, CXSMILES
- REAL database, 4.5Bn cpds, HAC 28 Part 1, CXSMILES
- REAL database, 4.5Bn cpds, HAC 28 Part 2, CXSMILES
- REAL database, 4.5Bn cpds, HAC 29-38, CXSMILES
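When only a certain size range is of interest, the HAC split listed above tells you which parts to download. The sketch below, in Python, encodes that list as a lookup table; the labels are shortened from the list entries, and the function name is hypothetical. How molecules are divided between "Part 1" and "Part 2" of the same HAC is not stated here, so both parts are returned.

```python
# (min HAC, max HAC, label) for each part in the list above.
PARTS = [
    (6, 21, "HAC 6-21"),
    (22, 23, "HAC 22-23"),
    (24, 24, "HAC 24"),
    (25, 25, "HAC 25"),
    (26, 26, "HAC 26 Part 1"),
    (26, 26, "HAC 26 Part 2"),
    (27, 27, "HAC 27 Part 1"),
    (27, 27, "HAC 27 Part 2"),
    (28, 28, "HAC 28 Part 1"),
    (28, 28, "HAC 28 Part 2"),
    (29, 38, "HAC 29-38"),
]


def parts_for_hac_range(lo: int, hi: int) -> list[str]:
    """Return labels of all parts whose HAC range overlaps [lo, hi]."""
    if lo > hi:
        raise ValueError("lo must not exceed hi")
    return [label for start, end, label in PARTS if start <= hi and end >= lo]
```

For example, `parts_for_hac_range(24, 26)` selects the HAC 24 and HAC 25 parts together with both HAC 26 parts.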
Despite its size, the REAL database is easy to work with. Along with SMILES and catalog IDs, each REAL molecule comes with key physicochemical parameters (MW, sLogP, HBA, HBD, etc.), structural alerts (PAINS, Brenk, and Eli Lilly medchem rules), its relation to the REAL compound libraries, and the type of chemistry, and therefore the effort, required for its synthesis (“s”: simple chemistry, standard effort; “m”: advanced chemistry, high effort). The list of building blocks used to assemble the REAL compounds is available upon request.
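These per-molecule annotations make it possible to pre-filter a part before screening. The Python sketch below keeps only molecules made with simple chemistry ("s") that carry no structural alerts. It assumes a tab-separated file with a header row; the column names (`Type`, `PAINS`, `Brenk`, `Lilly`) and the convention that an empty alert field means "no alert" are assumptions for illustration, so check the header of the actual download before using it.

```python
import csv
from typing import Iterable, Iterator


def simple_clean_rows(
    lines: Iterable[str],
    type_col: str = "Type",                      # assumed column name
    alert_cols: tuple[str, ...] = ("PAINS", "Brenk", "Lilly"),  # assumed names
) -> Iterator[dict]:
    """Yield rows with chemistry type "s" and no structural alerts."""
    reader = csv.DictReader(lines, delimiter="\t")
    for row in reader:
        # Skip advanced-chemistry ("m") molecules.
        if row[type_col].strip().lower() != "s":
            continue
        # Assumption: a non-empty alert field means the alert was triggered.
        if any(row[col].strip() for col in alert_cols):
            continue
        yield row
```

Because the function consumes any iterable of lines and yields rows lazily, an open file handle can be passed directly and a multi-gigabyte part can be streamed without loading it into memory.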