Recursion Bridges the Protein and Chemical Space with Massive Protein-Ligand Interaction Predictions Spanning 36 Billion Compounds

Recursion has predicted the protein target(s) for approximately 36 billion chemical compounds in the Enamine REAL Space, reported to be the world’s largest searchable chemical library. These advances were made possible by NVIDIA DGX Cloud supercomputing and the recent acquisition of Cyclica’s MatchMaker technology.

SALT LAKE CITY, Aug. 08, 2023 (GLOBE NEWSWIRE) -- Recursion (NASDAQ: RXRX), a leading clinical-stage TechBio company decoding biology to industrialize drug discovery, today announced it has successfully screened the Enamine REAL Space chemical library using its MatchMaker technology, recently acquired from Cyclica, to predict the protein target(s) for approximately 36 billion chemical compounds. This accomplishment was made possible by several other enabling discoveries, including the predicted structures derived from the AlphaFold2 database for more than 15,000 human proteins containing more than 80,000 potential binding pockets, as well as the Enamine REAL Space, which is reported to be the world’s largest searchable chemical library, comprising approximately 36 billion make-on-demand molecules. In total, this screen digitally evaluated more than 2.8 quadrillion small molecule-target pairs.
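The 2.8 quadrillion figure is consistent with pairing every compound with every predicted binding pocket. The short Python check below is only a sketch of that arithmetic: it assumes each of the roughly 36 billion molecules was scored against each of the roughly 80,000 pockets, which the announcement implies but does not state outright, and it uses the rounded counts quoted above.

```python
# Rounded figures quoted in the announcement (both are approximate).
compounds = 36_000_000_000   # ~36 billion Enamine REAL Space molecules
binding_pockets = 80_000     # "more than 80,000" potential binding pockets

# Assumption: every compound is evaluated against every pocket.
pairs = compounds * binding_pockets

print(f"{pairs:.2e} molecule-pocket pairs")
print(f"{pairs / 1e15:.2f} quadrillion")
```

With these rounded inputs the product is about 2.88 quadrillion, in line with the "more than 2.8 quadrillion" stated in the announcement.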

“This achievement represents a significant and exciting step toward achieving our mission of decoding biology and chemistry,” said Chris Gibson, Ph.D., Co-Founder and CEO of Recursion. “Until this point, the groundbreaking progress across biology and chemistry that enabled this moment – namely, AlphaFold, the Enamine virtual chemical library and the rapid advancement of large-scale compute and new machine learning approaches – have largely lived in isolation of one another or have been bridged at relatively small scales. Leveraging Recursion’s machine learning and computational expertise and NVIDIA’s technology, we have layered these advances together to predict how each of the molecules in this vast chemical universe may interact with the protein universe.”

The company generated this massive new data layer of predicted interactions in less than 90 days after closing the acquisition of Cyclica and in under 30 days since initiating the collaboration with NVIDIA.

MatchMaker uses machine learning to assess whether a small molecule is compatible with a specific protein binding pocket, providing a solution that is significantly less computationally intensive and much more scalable than traditional docking and physics-based interaction simulations. Similar to Recursion’s phenomics platform, the scalability of MatchMaker enables a “high-dimensional” view of biochemistry: activity is predicted not just for a single target, but for many at the same time. This enables three core advantages: First, this predicted data layer can be used to determine which wet-lab experiments should be executed to advance programs faster across a wide range of targets and chemical space. Second, this predicted data layer can be used as part of Recursion’s multi-modal dataset to better understand biological activity across programs quickly and at scale. Finally, this approach can pre-screen for more computationally expensive precision modeling techniques implemented by Recursion’s computational and digital chemistry teams, to more efficiently advance programs.

“We are excited to collaborate with Recursion to explore the chemical space and support our mission to accelerate drug discovery,” said Andrey Tolmachev, Ph.D., Founder and Owner of Enamine. He continued: “This achievement in the 36 Billion REAL Space is just a start of our journey. The chemical knowledge accumulated at Enamine over its 35-years history allows us to explore trillions of relationships without compromising the high success rate of synthesis. We believe the predictions made by Recursion can help us prioritize parts of the chemical universe and provide an opportunity to develop focused chemical spaces and novel compounds around discovered hits quickly.”

Much of the initial testing and infrastructure development for the project was completed using BioHive-1, Recursion’s in-house supercomputer, an NVIDIA DGX SuperPOD, which is ranked among the top 125 most powerful supercomputers in the world across any industry by TOP500 as of June 2023. The final analysis was made possible by NVIDIA’s DGX Cloud, an advanced AI-training-as-a-service solution to which Recursion gained access following its recently announced collaboration with NVIDIA. Recursion worked with urgency to make this effort happen in a short period of time.

“Bringing together powerful data, AI and data-center scale compute, Recursion’s MatchMaker running on NVIDIA DGX Cloud essentially created a time machine for the company’s drug discovery program and sets a new bar for the industry,” said Kimberly Powell, Vice President of Healthcare at NVIDIA. “Within one week, the Recursion team was able to achieve what would have otherwise taken 100,000 years to compute with physics-based methods, setting the stage for a wet-lab, dry-lab flywheel to better predict drug-target interactions and increase a drug’s probability of success in the clinic.”
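For a sense of scale, the two durations in the quote above can be turned into a rough ratio. This is a back-of-the-envelope sketch only: it takes "one week" and "100,000 years" at face value as wall-clock times, and it says nothing about the hardware assumed for the physics-based estimate, which the announcement does not specify.

```python
DAYS_PER_YEAR = 365.25            # average year length, including leap years

physics_days = 100_000 * DAYS_PER_YEAR   # quoted physics-based estimate
ml_days = 7                              # "within one week"

# Ratio of the two quoted durations; a lower bound if the run took under a week.
speedup = physics_days / ml_days
print(f"Implied speedup: roughly {speedup:,.0f}x")
```

Under these assumptions the implied ratio is on the order of five million.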

Recursion plans to leverage this new database of predictions to industrialize its chemistry operations across its pipeline and in service to its partners, enabling significantly greater efficiency in its medicinal chemistry cycles. Further, Recursion plans to continue improving and expanding the number and type of chemical properties and interactions it can predict using in-house tools, tools acquired through the acquisition of Cyclica, and tools being developed by Valence Labs, the semi-autonomous research hub powered by Recursion and formed through the acquisition of Valence Discovery.
