Researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and Department of Electrical Engineering and Computer Science (EECS) have developed a model that better selects lead molecule candidates based on desired properties. It also modifies the molecular structure needed to achieve a higher potency, while ensuring the molecule is still chemically valid.
MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) is a research institute at the Massachusetts Institute of Technology (MIT) formed by the 2003 merger of the Laboratory for Computer Science and the Artificial Intelligence Laboratory. The institute is based in Cambridge, Massachusetts (USA).
The research was conducted as part of the Machine Learning for Pharmaceutical Discovery and Synthesis Consortium between MIT and eight pharmaceutical companies, announced in May. The consortium identified lead optimization as one key challenge in drug discovery.
Systems that attempt to automate molecule design have cropped up in recent years, but their problem is validity. Those systems often generate molecules that are invalid under chemical rules, and they fails to produce molecules with optimal properties. This essentially makes full automation of molecule design infeasible.
The new model basically takes as input molecular structure data and directly creates molecular graphs – detailed representations of a molecular structure, with nodes representing atoms and edges representing bonds. It breaks those graphs down into smaller clusters of valid functional groups that it uses as “building blocks” that help it more accurately reconstruct and better modify molecules.
The researchers trained their model on 250,000 molecular graphs from the ZINC database, a collection of 3-D molecular structures available for public use. They tested the model on tasks to generate valid molecules, find the best lead molecules, and design novel molecules with increase potencies. The researchers next aim to test the model on more properties, beyond solubility, which are more therapeutically relevant.