White paper:
Unlocking the Chemistry of Tomorrow with Computer-Aided Synthesis Planning

Interview with Professor Tim Cernak
We had the privilege of interviewing Prof. Tim Cernak, an Assistant Professor of Medicinal Chemistry at the University of Michigan, whose diverse research interests span chemical synthesis, automation, data science, and more. With over a decade of experience, Prof. Cernak has been at the forefront of revolutionizing the field. In this interview, we delve into the world of computer-aided synthesis planning (CASP), a field that leverages automation, computational analysis, and artificial intelligence to streamline chemical syntheses. Prof. Cernak will discuss the pivotal role of CASP, the evolution from rule-based programs to machine learning, and the synergy between human expertise and AI. We’ll explore how neural networks and expert-coded reaction rules enhance synthetic accuracy and discuss recent breakthroughs like sustainable synthesis from industrial waste. Moreover, we’ll uncover the potential of AI-driven organic synthesis and explore the outlook for the future. Join us in this enlightening conversation with Prof. Tim Cernak as we journey through the realms of CASP, laboratory automation, and the promising future of intelligent automation in synthetic chemistry.

Prof. Timothy Cernak
Assistant Professor of Medicinal Chemistry and Chemistry
Tim Cernak was born in Montreal, Canada in 1980. He obtained a B.Sc. in Chemistry from the University of British Columbia Okanagan and there studied the aroma profile of Chardonnay wines. Following PhD training in total synthesis with Prof. Jim Gleason at McGill University, Tim was an FQRNT Postdoctoral Fellow with Tristan Lambert at Columbia University. In 2009, Tim joined the Medicinal Chemistry team at Merck Sharp & Dohme in Rahway, New Jersey. There he developed technologies for miniaturized synthesis and late-stage functionalization. In 2013, Tim moved to Merck’s Boston site. In 2018, Dr. Cernak joined the Department of Medicinal Chemistry at the University of Michigan in Ann Arbor as an Assistant Professor. The Cernak Lab is exploring an interface of chemical synthesis and data science. Tim is a co-Founder of Entos, Inc.
In the field of computer-aided synthesis planning, what role does CASP play in integrating human intuition and computational capabilities?
Computer-Aided Synthesis Planning (CASP) is a rapidly growing field with a rich history. CASP systems are designed to aid chemists in the decision-making process by suggesting synthetic routes that meet specific criteria such as yield, cost, and safety.
Perhaps more than any other field of science, total synthesis embraces and celebrates the art and elegance of its process. The field distills years of developments in strategic planning, systems of developed logic, and novel experimental reactivity modes into a series of planned reaction steps. The growing corpus of chemical reactions, and their associated rules and mechanisms, cannot possibly be memorized by a human. Interestingly, though, these transformations behave much like the rules of a game, and can be encoded into a computer. CASP combines the human decision-making process, expert knowledge, and chemical intuition with the computational power of machine learning models and algorithms to generate efficient synthetic routes.
Can you highlight the progression from rule-based programs to machine learning in the context of computational analysis of synthetic planning?
Sure, in the beginning rule-based programs relied on manually curated sets of chemical reactions and functional group transformations to generate synthetic routes. These approaches involve manual curation of reaction rules, which imparts expert-level context but limits the number of rules that can be considered, especially with the pace that new reaction rules are reported in the primary literature. More recently, machine learning (ML) methods have emerged as a powerful tool for chemical synthesis planning, because they can incorporate large datasets of chemical reactions.
What are the significant milestones and implications of using artificial intelligence algorithms for proposing synthetic routes in small-molecule synthesis?
Modern synthesis algorithms are beginning to push beyond the obvious and into more complex synthesis challenges. Earlier CASP generations could follow the rules of organic chemistry, but their answers weren’t far from the obvious ones. As the corpus of available reactions to consider has grown, more novel suggestions are starting to pop out. Experimental validation of computer-planned routes is finally becoming more popular, with key milestones including fully automated execution on robotic platforms, generation of complex natural products by computer-planned routes that are indistinguishable from human-planned routes, and human-computer partnerships that have arrived at exceptionally brief synthesis recipes for natural products. The merger with high-throughput experimentation techniques is an exciting new area that promises to further accelerate the drug discovery process. As more systematically captured, machine-readable reaction data becomes available, machine learning predictions will improve. In the future we will likely see even more physics-based predictions encoded into retrosynthesis calculations.
What is the synergy between expert and machine-learning approaches in improving retrosynthetic planning?
In retrosynthetic planning, the goal is to identify optimal synthetic routes for a target molecule by working backwards from the target molecule to simpler starting materials. Expert and machine learning (ML) approaches can be used together to improve the effectiveness of retrosynthetic planning.
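The backward search described above can be sketched in a few lines of code. This is a deliberately toy model, assuming symbolic molecule names and a hand-written rule table (the names, rules, and starting materials below are illustrative placeholders, not real chemistry):

```python
# Toy retrosynthetic search: molecules are symbolic names, and each "rule"
# maps a product back to the simpler precursors that could form it.
RULES = {
    "amide": [("amine", "carboxylic_acid")],
    "carboxylic_acid": [("alcohol",)],   # e.g. via oxidation
    "amine": [("nitro_compound",)],      # e.g. via reduction
}
STARTING_MATERIALS = {"alcohol", "nitro_compound"}  # "purchasable" leaves

def retrosynthesize(target, depth=0, max_depth=5):
    """Work backwards from `target` until every leaf is a starting
    material; return the route as a nested dict, or None if no route."""
    if target in STARTING_MATERIALS:
        return target
    if depth >= max_depth or target not in RULES:
        return None
    for precursors in RULES[target]:
        subroutes = [retrosynthesize(p, depth + 1, max_depth) for p in precursors]
        if all(s is not None for s in subroutes):
            return {target: subroutes}
    return None

route = retrosynthesize("amide")
print(route)
# {'amide': [{'amine': ['nitro_compound']}, {'carboxylic_acid': ['alcohol']}]}
```

Real CASP systems replace the symbolic names with molecular graphs, the rule table with thousands of expert- or machine-learned templates, and the depth-first loop with guided search, but the recursive backward structure is the same.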
Expert approaches rely on the knowledge and experience of human chemists to identify key steps in a synthesis. These approaches are often based on a set of manually curated chemical rules that have been developed over years of research. While these expert approaches are valuable, they are limited by the scope and complexity of the transformations that can be described using manually curated rules.
ML approaches, on the other hand, can analyze large datasets of known chemical reactions and automatically learn patterns and transformations that are difficult or impossible to capture using expert approaches. The ability of ML algorithms to identify these new patterns and transformations can augment the domain knowledge and intuition of human chemists.
The synergy between expert and ML approaches in retrosynthetic planning can be seen in the development of expert-guided ML algorithms. In these approaches, expert knowledge is used to guide the selection of potential synthetic routes generated by an ML model. This allows for increased accuracy and specificity in the selection of synthetic routes, while still benefiting from the efficiency and scale of ML algorithms.
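One simple way to picture an expert-guided selection step is as a filter over model output: a (hypothetical) ML model has already scored candidate disconnections, and hand-coded expert checks veto suggestions that violate known constraints. Every name, score, and rule below is an illustrative assumption, not output from a real system:

```python
# Candidate disconnections, as a (hypothetical) ML ranking model might
# score them. Fields like "precedented" stand in for expert annotations.
ml_candidates = [
    {"disconnection": "amide_coupling",       "ml_score": 0.92, "uses_protecting_group": False},
    {"disconnection": "ester_hydrolysis",     "ml_score": 0.88, "uses_protecting_group": True},
    {"disconnection": "exotic_rearrangement", "ml_score": 0.85, "precedented": False},
]

def expert_approved(candidate):
    """Expert-coded vetoes: require literature precedent and, as a simple
    strategy rule, prefer routes that avoid protecting groups."""
    if not candidate.get("precedented", True):
        return False
    if candidate.get("uses_protecting_group", False):
        return False
    return True

# Keep only expert-approved candidates, ranked by the ML score.
approved = sorted(
    (c for c in ml_candidates if expert_approved(c)),
    key=lambda c: c["ml_score"], reverse=True,
)
print([c["disconnection"] for c in approved])  # ['amide_coupling']
```

The division of labor mirrors the synergy described above: the model supplies breadth and ranking at scale, while the expert rules supply hard chemical constraints the model may not have learned.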
Overall, the combination of expert and ML approaches in retrosynthetic planning has the potential to significantly improve the speed and effectiveness of this process, leading to more efficient drug discovery and ultimately, the development of new therapies for patients.
Could you elaborate on how neural networks trained on expert-coded reaction rules contribute to achieving higher synthetic accuracy in retrosynthetic planning?
Neural networks that are trained on expert-coded reaction rules can improve the accuracy of synthetic predictions in retrosynthetic planning by incorporating detailed chemical knowledge while remaining flexible enough to address incomplete or novel reactions.
Expert-coded reaction rules are a set of pre-defined chemical rules that describe known chemical transformations. These rules are based on years of research and domain knowledge and can be used as a foundation for training neural networks. By training neural networks on these rules, the networks can learn to recognize patterns of reactions and better predict the outcomes of chemical transformations.
The use of neural networks trained on expert-coded reaction rules can also help address the challenge of incomplete or novel reactions. For example, if a chemical transformation has only been previously observed in a limited set of reactions, the neural network can be trained to predict the outcome of that transformation based on the available data and the expert rules. This is particularly useful in the context of drug discovery, where many of the target molecules have never been synthesized before.
By incorporating detailed chemical knowledge in the form of expert-coded rules, neural networks trained on these rules can achieve higher accuracy in retrosynthetic planning. This not only helps chemists to predict the outcomes of chemical reactions more accurately, but also speeds up the drug discovery process by assisting chemists in identifying optimal synthetic routes for new compounds.
How do you leverage a library of chemical reactions and metadata to design sustainable syntheses?
This is an important aspect of future work. The beauty of computational retrosynthesis is that you can demerit protocols that use environmentally harmful reagents, for instance reactions requiring dichloromethane as a solvent or those that produce a large amount of metal waste. Meanwhile, you can reward protocols that leverage more environmentally friendly options.
One way to do this is to incorporate information on the environmental impact of specific chemical reactions into the library. A variety of factors such as the amount of waste generated, energy required, and toxicity of reagents, can be encoded. This enables chemists to identify sustainable synthetic routes for target molecules, by selecting reactions from the library that meet the required sustainability criteria.
In addition to information on the environmental impact of specific chemical reactions, metadata about chemical reactions can also be used to design sustainable syntheses. This metadata can include information such as reaction yields, solvents used, and energy requirements. By analyzing this metadata, chemists can identify more efficient and sustainable synthetic routes.
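A minimal sketch of such a metadata-driven "green score" is shown below. The weights, field names, and penalty values are illustrative assumptions, not a standard sustainability metric:

```python
# Toy route scoring over per-step reaction metadata: demerit hazardous
# solvents (e.g. dichloromethane) and metal waste, reward high yields.
HAZARDOUS_SOLVENTS = {"dichloromethane", "chloroform", "benzene"}

def step_score(step):
    score = step["yield"]                   # start from fractional yield
    if step["solvent"] in HAZARDOUS_SOLVENTS:
        score -= 0.3                        # penalty: hazardous solvent
    score -= 0.1 * step.get("metal_waste_kg_per_kg", 0.0)  # penalty: metal waste
    return score

def route_score(route):
    """Average green score across a route's steps."""
    return sum(step_score(s) for s in route) / len(route)

# A high-yielding but wasteful route vs. a lower-yielding, greener one.
route_a = [
    {"yield": 0.90, "solvent": "dichloromethane", "metal_waste_kg_per_kg": 2.0},
    {"yield": 0.80, "solvent": "ethanol"},
]
route_b = [
    {"yield": 0.75, "solvent": "water"},
    {"yield": 0.70, "solvent": "ethyl_acetate"},
]
print(route_score(route_a), route_score(route_b))  # route_b scores higher
```

In a real CASP system this kind of scoring term would sit alongside cost and feasibility terms inside the route search, so that greener routes rise to the top rather than being filtered afterwards.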
By leveraging a library of chemical reactions and metadata to design sustainable syntheses, chemists can reduce the environmental impact of synthetic processes, leading to a more sustainable and less wasteful chemical industry. Furthermore, by selecting reactions from a library of sustainable reactions, chemists can also improve the speed and cost-effectiveness of the drug discovery process, leading to more accessible and affordable treatments for patients.
In the context of artificial intelligence-driven organic synthesis, how can AI algorithms and robotic platforms be coupled together?
The coupling of AI algorithms and robotic platforms in artificial intelligence-driven organic synthesis could significantly improve the speed and efficiency of the discovery process. AI-driven organic synthesis automates the logic of synthesis, while robotic platforms automate the hands-on lab work, so it’s a powerful combination.
AI algorithms can be used to predict the optimal synthetic routes for a given target molecule, leveraging vast databases of chemical reactions and related data. Robotic platforms can then be used to synthesize the target molecule based on the predicted routes. The use of these platforms allows for rapid experimentation and high-throughput synthesis of a large number of compounds.
The combination of AI algorithms and robotic platforms can be further enhanced by using feedback loop mechanisms to optimize the synthetic process in real-time. For example, an AI algorithm could monitor the synthetic process and provide feedback on reaction conditions, leading to further optimization of the synthetic route and ultimately better outcomes.
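The feedback loop just described can be reduced to a minimal closed-loop sketch: a simulated "experiment" returns yield as a function of temperature, and a simple hill-climbing controller adjusts the temperature based on that feedback. The yield model, the 80 °C optimum, and the hill-climbing strategy are all made-up stand-ins for real instrument data and real optimizers:

```python
# Closed-loop condition optimization, in miniature.
def run_experiment(temp_c):
    """Simulated fractional yield, peaking at an assumed 80 deg C optimum."""
    return max(0.0, 1.0 - ((temp_c - 80.0) / 50.0) ** 2)

def optimize(temp_c=40.0, step=5.0, iterations=50):
    """Hill-climb on temperature: keep moves that improve yield,
    reverse and shrink the step when a move overshoots."""
    best_yield = run_experiment(temp_c)
    for _ in range(iterations):
        candidate = temp_c + step
        y = run_experiment(candidate)
        if y > best_yield:                # feedback: keep the improvement
            temp_c, best_yield = candidate, y
        else:
            step = -step / 2.0            # overshoot: reverse, smaller step
    return temp_c, best_yield

temp, yld = optimize()
print(round(temp, 1), round(yld, 3))  # converges near 80.0 with yield ~1.0
```

Production platforms would replace the toy yield function with live analytical data and the hill climber with Bayesian optimization or similar, but the monitor-decide-adjust loop is the same.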
Another way in which AI algorithms and robotic microfluidic platforms can be coupled together is through the use of machine learning algorithms to improve the performance of the microfluidic platform. Through continuous monitoring and feedback, machine learning algorithms can learn to predict optimal reaction conditions and improve the efficiency and accuracy of the microfluidic platform.
Ultimately, by combining the power of AI algorithms and robotic microfluidic platforms, the drug discovery process can be accelerated, allowing for the rapid identification of novel drug candidates. This has the potential to significantly improve patient outcomes by providing faster and more effective treatments for a wide range of diseases.
The exciting thing is that laboratory robotics are becoming more and more accessible to all, through commercial vendors or open-source democratized hardware platforms. The field is sure to accelerate even further as lab automation plays an increasing role in undergraduate and graduate education, preparing the next generation of researchers for this future of work.
How has laboratory automation transformed traditional synthesis, and what are the key advantages it offers over manual operation?
Laboratory automation has transformed traditional synthesis by allowing for high-throughput experimentation and faster, more precise synthesis of complex organic compounds. Some of the key advantages it offers over manual operation include increased efficiency, higher precision, improved reproducibility, the ability to handle hazardous materials safely, and integration with artificial intelligence and machine learning algorithms.
Automation allows for the rapid and reproducible synthesis of large numbers of compounds, reducing the time and effort required for routine chemical tasks and freeing up scientists to focus on more complex research. Automated equipment can dispense precise volumes of reagents, leading to less variability in reaction outcomes and more accurate results. It can also handle hazardous materials and reactions safely, protecting researchers from potentially harmful chemicals.
Automation ensures that experiments are performed under consistent conditions, reducing the likelihood of human error and increasing the reproducibility of results. It can be integrated with artificial intelligence and machine learning algorithms to further optimize laboratory operations and accelerate the development of new compounds.
Overall, laboratory automation has the potential to significantly improve the efficiency and precision of traditional synthesis, leading to faster drug discovery and ultimately more effective treatments for patients.
What is the outlook for the future of synthetic automation, and how do you envision the progression from labor-intensive processes to intelligent automation?
The future of synthetic automation is expected to be marked by continued advances in robotics, machine learning, and artificial intelligence, leading to increasingly sophisticated and intelligent automation systems.
One of the key areas of focus in future synthetic automation will be the development of autonomous laboratories that can operate 24/7 without human intervention. These systems will be equipped with robotics and machine learning algorithms that are capable of performing routine tasks, monitoring experiments, and making autonomous decisions based on the data generated.
Another area of focus will be the further integration of artificial intelligence into the drug discovery process. This will involve the development of AI algorithms that can analyze vast amounts of chemical data, predict the outcomes of chemical reactions, and optimize experimental conditions in real-time.
Over time, it is expected that synthetic automation will become more and more intelligent, with the automation systems taking on increasingly complex tasks and generating new insights that can accelerate drug discovery. These highly automated systems will allow researchers to perform a wide range of experiments with minimal labor and human oversight, leading to faster and more efficient drug development.
Overall, the trajectory of synthetic automation is toward more intelligent and sophisticated systems that can handle complex experiments and rapidly generate large amounts of data that can be used to further optimize the drug discovery process. In the future, we can expect to see a progression from labor-intensive processes to highly automated, intelligent systems that can revolutionize drug discovery and lead to new treatments for a wide range of diseases.
Considering the insights from these papers and the advancements in computer-aided retrosynthesis, how do you see the continued integration of artificial intelligence and automation shaping the landscape of chemical synthesis in the years to come?
The continued integration of artificial intelligence and automation is expected to dramatically shape the landscape of chemical synthesis in the future. AI and automation are expected to speed up drug discovery by allowing chemists to synthesize compounds more quickly and efficiently. This will enable researchers to test a greater number of molecules and explore a wider range of chemical space. The integration of AI and automation may also help reduce the environmental impact of chemical synthesis by enabling researchers to identify more sustainable chemical routes.
Coupling AI algorithms with automated synthesis can reduce human error and result in more precise synthesis, thereby yielding higher-quality products. It will accelerate the exploration of chemical space in drug discovery and lead to the identification of new compounds that would otherwise be difficult to discover through traditional methods. The integration of AI and automation into chemical synthesis has the potential to transform the field in profound ways, greatly accelerating the speed and efficiency of drug discovery while simultaneously improving our ability to identify sustainable and effective drug molecules. The future of chemical synthesis looks bright with the continued integration of these technologies.
Disclosure: The Cernak Lab has received research funding or in-kind donations from MilliporeSigma, Burlington, MA, an affiliate of Merck, Relay Therapeutics, Janssen Therapeutics, SPT Labtech, and MSD Inc. T.C. holds equity in Scorpion Therapeutics and is a cofounder of and equity holder in Iambic Therapeutics.