Chemical Space and Virtual Reality for Drug Discovery

Among the many applications of virtual reality, one is particularly useful for the NCCR TransCure: it helps explore the immense chemical space in which chemists and biologists search for small molecules able to act on proteins of interest. The group of Jean-Louis Reymond is world-renowned for its expertise in this field and can also give us tips on how learning chemistry can be fun!

Having at your disposal a small molecule capable of selectively blocking or enhancing the function of a protein can be a decisive advantage when trying to understand its biological function and its link to disease. For the proteins under study in TransCure – transporters and channels – such small molecules are mostly unknown. For this reason, we decided to address this problem by assembling an interdisciplinary team of chemists and biologists. The former design and synthesise small molecules, while the latter establish and implement assays to test their activity. This approach resembles what is done in the pharmaceutical industry in the early stages of drug discovery.

In our projects in Jean-Louis Reymond’s lab (University of Bern), we use the concept of chemical space to select the small molecules to be tested. Chemical space is a virtual multidimensional space in which molecules with similar structures and properties are positioned close to each other (Fig. 1). When searching for a compound to block a particular transporter or channel, we start with a small molecule already known to act on a related protein, or to act only weakly and unselectively on the protein of interest. We then use our dedicated browser to choose a few hundred compounds in its chemical space vicinity from over twelve million screening compounds in the ZINC database. Next, we purchase a sample of each selected compound and test whether it blocks or increases the activity of the transporter or channel under study. If an active compound is found, we repeat the cycle of chemical space sampling, purchasing and testing with closely related molecules to further improve the activity. In the advanced stages of the project, we turn to organic synthesis to fine-tune the chemical structure of the active compound. This approach has worked beautifully for several TransCure projects, yielding novel and now-patented inhibitors of endocannabinoid transport, as well as inhibitors of the iron transporter DMT1 and of the TRPV6 and TRPM4 ion channels.
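
To make the idea of a "chemical space vicinity" concrete, here is a minimal sketch of a nearest-neighbour search. It assumes that each molecule has already been reduced to a vector of integer descriptors and that closeness is measured with the city-block (Manhattan) distance. The compound names, the four-number descriptors and the choice of distance are illustrative assumptions, not a description of the browser we actually use.

```python
from typing import Dict, List, Tuple


def city_block(a: List[int], b: List[int]) -> int:
    """City-block (Manhattan) distance between two descriptor vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_neighbours(
    query: List[int], library: Dict[str, List[int]], k: int
) -> List[Tuple[str, int]]:
    """Return the k library compounds closest to the query, nearest first."""
    ranked = sorted((city_block(query, vec), name) for name, vec in library.items())
    return [(name, dist) for dist, name in ranked[:k]]


# Hypothetical screening library: made-up names and made-up descriptors.
library = {
    "cmpd_A": [12, 2, 1, 3],
    "cmpd_B": [30, 5, 4, 0],
    "cmpd_C": [13, 2, 1, 2],
    "cmpd_D": [11, 1, 1, 3],
}

query = [12, 2, 1, 2]  # descriptors of the known starting molecule (made up)
hits = nearest_neighbours(query, library, k=2)
print(hits)  # [('cmpd_A', 1), ('cmpd_C', 1)]
```

In a real project the library would hold millions of entries and k would be a few hundred, so an indexed search would replace the full sort used here.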

What better tool than virtual reality (VR) to understand the virtual chemical space? After testing several versions during public outreach events at the Botanical Garden in Bern and the Night of Research at the University of Bern, we released a VR view of DrugBank (Fig. 2 and YouTube video). DrugBank is a database of all drugs in use or under investigation worldwide. In this VR tool, the chemical structures of marketed drugs are shown in a 3-dimensional space projected from a 42-dimensional chemical space called Molecular Quantum Numbers (MQN). Our web-based tool Faerun similarly features SureChEMBL, a database of all molecules found in the patent literature, with additional interactive options including SmilesDrawer to display the structural formulae of molecules. We routinely use these 3-dimensional representations in our projects to better understand the molecular diversity and to choose molecules wisely.
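
For readers curious how a many-dimensional space can be squeezed into three dimensions, the sketch below uses principal component analysis (PCA), computed with a simple power iteration. PCA is one common way to do such a projection; that the VR tool uses exactly this procedure is an assumption made for illustration, and the five-number descriptors are made up (real MQN vectors have 42 entries).

```python
import math
from typing import List

Matrix = List[List[float]]


def centre(data: Matrix) -> Matrix:
    """Subtract the per-column mean so every descriptor averages to zero."""
    n, dims = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(dims)]
    return [[row[j] - means[j] for j in range(dims)] for row in data]


def covariance(centred: Matrix) -> Matrix:
    """Sample covariance matrix of already-centred data."""
    n, dims = len(centred), len(centred[0])
    return [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(dims)]
            for i in range(dims)]


def dominant_axis(matrix: Matrix, iterations: int = 500) -> List[float]:
    """Power iteration: approximate the direction of largest variance."""
    dims = len(matrix)
    vec = [1.0 / math.sqrt(dims)] * dims
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm < 1e-12:  # no variance left to capture
            break
        vec = [x / norm for x in nxt]
    return vec


def project_to_3d(data: Matrix) -> Matrix:
    """Project each row of data onto its first three principal axes."""
    centred = centre(data)
    work = covariance(centred)
    dims = len(work)
    axes = []
    for _ in range(3):
        axis = dominant_axis(work)
        # Variance captured along this axis (Rayleigh quotient).
        variance = sum(axis[i] * work[i][j] * axis[j]
                       for i in range(dims) for j in range(dims))
        # Deflation: remove this axis so the next pass finds the next one.
        work = [[work[i][j] - variance * axis[i] * axis[j] for j in range(dims)]
                for i in range(dims)]
        axes.append(axis)
    return [[sum(row[j] * axis[j] for j in range(dims)) for axis in axes]
            for row in centred]


# Six hypothetical molecules described by five made-up descriptor counts.
data = [
    [12, 2, 1, 3, 0],
    [30, 5, 4, 0, 2],
    [13, 2, 1, 2, 1],
    [11, 1, 1, 3, 0],
    [22, 4, 2, 1, 3],
    [18, 3, 3, 2, 1],
]

coords = project_to_3d(data)
for point in coords:
    print([round(c, 2) for c in point])
```

Each molecule ends up with three coordinates that can be drawn as a point in a 3-dimensional scene; molecules with similar descriptors land close together.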

Another use of chemical space is to find out if a newly discovered inhibitor might also bind to other proteins. For instance, our Polypharmacology Browser 2 (PPB2) uses chemical space analysis to predict which proteins are most likely to interact with any given molecule. Binding to several proteins is quite a general phenomenon: on average, drug molecules interact with at least seven different proteins. Luckily, this is not always the case; for instance, by using PPB2 in combination with experimental testing, we found that our best TRPM4 inhibitor acts on this channel alone, without any detectable off-target effects.
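
The principle behind such predictions can be sketched as a vote among neighbours: look up the reference molecules closest to the query in chemical space and count which proteins they are known to hit. The code below is a toy version under stated assumptions. The reference set, descriptor values and target names are all made up, and PPB2 itself is not implemented this way line for line.

```python
from collections import Counter
from typing import Dict, List, Tuple


def city_block(a: List[int], b: List[int]) -> int:
    """City-block (Manhattan) distance between two descriptor vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def predict_targets(
    query: List[int], references: List[Dict], k: int = 3
) -> List[Tuple[str, int]]:
    """Rank targets by how many of the k nearest reference molecules hit them."""
    nearest = sorted(
        references, key=lambda ref: city_block(query, ref["descriptor"])
    )[:k]
    votes = Counter(target for ref in nearest for target in ref["targets"])
    return votes.most_common()


# Hypothetical reference molecules with known (made-up) protein targets.
references = [
    {"name": "ref_1", "descriptor": [10, 2, 1], "targets": ["target_X"]},
    {"name": "ref_2", "descriptor": [11, 2, 1], "targets": ["target_X", "target_Y"]},
    {"name": "ref_3", "descriptor": [10, 3, 1], "targets": ["target_X"]},
    {"name": "ref_4", "descriptor": [25, 6, 4], "targets": ["target_Z"]},
]

prediction = predict_targets([10, 2, 2], references, k=3)
print(prediction)  # [('target_X', 3), ('target_Y', 1)]
```

A molecule whose neighbours all point to a single protein, as in this toy output dominated by target_X, is a candidate for a selective compound; the prediction still has to be confirmed experimentally.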

In the context of polypharmacology predictions, the accuracy of the virtual chemical space chosen to describe molecules is critical. In PPB2, we obtain good predictions when comparing molecules in the 42-dimensional MQN space featured in our VR visualisation; however, by far the best predictions come from machine learning with ECFP4. ECFP4 is a so-called ‘fingerprint’ defining a complex 1,024-dimensional chemical space, which is widely used in drug design because it describes molecules very accurately. Nevertheless, we recently designed a new fingerprint called MHFP6, which outperforms ECFP4 for drug design. MHFP6 is a probabilistic description of molecules using concepts of natural language processing. This charts a new course in chemical space exploration in which we visualise project data with WebMolCS, yet another web-based tool developed for TransCure projects.
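
To give a flavour of what a "probabilistic" fingerprint can look like, the sketch below uses MinHash, a technique from text mining that estimates how much two sets overlap without comparing them element by element. It assumes each molecule has already been broken down into a set of substructure strings; the fragment sets, the SHA-1 based hash and the signature length of 128 are illustrative assumptions and not the actual MHFP6 implementation.

```python
import hashlib
from typing import List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Exact set similarity: shared fragments divided by all fragments."""
    return len(a & b) / len(a | b)


def _hash(token: str, seed: int) -> int:
    """Deterministic 64-bit hash of a fragment, varied by seed."""
    digest = hashlib.sha1(f"{seed}:{token}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash(tokens: Set[str], n_hashes: int = 128) -> List[int]:
    """Signature: for each seeded hash function, keep the smallest hash value."""
    return [min(_hash(t, seed) for t in tokens) for seed in range(n_hashes)]


def estimated_similarity(sig_a: List[int], sig_b: List[int]) -> float:
    """Fraction of matching signature positions approximates the Jaccard value."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


# Hypothetical fragment sets for two small molecules (made up for illustration).
mol_a = {"C", "CC", "CO", "CCO", "O"}
mol_b = {"C", "CC", "CN", "CCN", "N"}

exact = jaccard(mol_a, mol_b)
estimate = estimated_similarity(minhash(mol_a), minhash(mol_b))
print(exact)     # 0.25
print(estimate)  # an approximation of the exact value
```

The appeal of such a scheme is that the signatures have a fixed length however many fragments a molecule contains, and they can be compared and indexed very quickly.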

In addition to the updates on chemical space given so far, we would also like to offer chemistry beginners a tip: in order to understand chemical space, you should know about structural formulae, which are in a way the ‘language’ of chemists. A structural formula is a drawing featuring letters connected by lines. The letters represent atoms, using abbreviations from the periodic table of the elements, and the lines represent bonds between these atoms. Such a drawing defines the identity and properties of a molecule. To help learn this language, we recommend the chemistry learning game, which we developed in collaboration with the EU project BigChem. In this game, you are challenged to associate the names and properties of molecules with structural formulae, and you quickly learn them. Try the game out and enjoy it – you might become a chemical space expert!
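
For those who like to see things written out, a structural formula can also be stored in a computer as a list of atoms (the letters) and a list of bonds (the lines). The sketch below does this for ethanol, the alcohol in wine and beer. The function names are invented for this example, and the ordering of the formula (carbon first, hydrogen second, then alphabetical) is a simplified convention.

```python
from collections import Counter
from typing import List, Tuple

# Ethanol, CH3-CH2-OH: each atom is a letter, each bond a pair of atom positions.
atoms: List[str] = ["C", "C", "O", "H", "H", "H", "H", "H", "H"]
bonds: List[Tuple[int, int]] = [
    (0, 1), (1, 2),          # C-C and C-O
    (0, 3), (0, 4), (0, 5),  # three H on the first carbon
    (1, 6), (1, 7),          # two H on the second carbon
    (2, 8),                  # one H on the oxygen
]


def molecular_formula(atom_list: List[str]) -> str:
    """Count the letters: carbon first, hydrogen second, the rest alphabetically."""
    counts = Counter(atom_list)
    order = [e for e in ("C", "H") if e in counts]
    order += sorted(e for e in counts if e not in ("C", "H"))
    return "".join(e + (str(counts[e]) if counts[e] > 1 else "") for e in order)


def neighbours(index: int) -> List[str]:
    """Follow the lines: which atoms are bonded to the atom at this position?"""
    linked = []
    for a, b in bonds:
        if a == index:
            linked.append(atoms[b])
        elif b == index:
            linked.append(atoms[a])
    return linked


print(molecular_formula(atoms))  # C2H6O
print(neighbours(2))             # ['C', 'H'] : the oxygen sits between a carbon and a hydrogen
```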


Daniel Probst, PhD student, Reymond group
Jean-Louis Reymond, NCCR TransCure PI