Interpretable Graph Neural Networks with Counterfactual Analysis for Sustainable Chemical Transformations
Keywords:
Graph Neural Network; Interpretability; Counterfactual Analysis; Catalyst Design; Sustainable Chemical Transformation
Abstract
Sustainable chemical transformation plays a vital role in achieving carbon neutrality and promoting green development, and catalyst design is a key factor. Traditionally, developing new catalysts relies on expensive, time-consuming trial-and-error experiments or computational modeling. Machine learning methods have recently shown promise in accelerating this process, but their "black-box" nature makes it difficult to understand their predictions or to use those predictions to design better catalysts. To address this challenge, we introduce a framework that combines interpretable Graph Neural Networks (GNNs) with counterfactual analysis to uncover the complex links between a catalyst's structure and its performance. Specifically, we built a GNN model that accurately predicts both the activity and selectivity of catalysts, using the selective transformation of biomass-based molecules such as 5-hydroxymethylfurfural on different metal catalysts as the test case. To enhance interpretability, we incorporated Grad-CAM and attention mechanisms, which let the model visually highlight the atomic sites and structural features that most influence catalyst performance. We then used counterfactual analysis to answer a key question: "What small changes to the catalyst's structure would make it perform better?" This technique applies targeted, minimal modifications to the catalyst model to reveal possible improvements, offering practical guidance for rational catalyst optimization. Our findings show that combining GNN interpretability with counterfactual reasoning not only delivers accurate performance predictions but also uncovers structural insights that traditional chemical reasoning might overlook.
This data-driven approach offers a promising path toward smarter, greener catalysts, with the potential to lower research costs and speed up the discovery of efficient and eco-friendly catalytic materials.
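
To make the counterfactual question concrete, the following is a minimal Python sketch of one possible form of such an analysis: enumerating single-site element substitutions on a toy catalyst graph and ranking those that raise a predicted score. It is not the implementation used in this work. The catalyst representation, the `toy_predictor` function (a stand-in for a trained GNN), its element weights, and the candidate element list are all hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical toy catalyst: element symbol per site, plus bonds as site-index pairs.
Catalyst = Tuple[List[str], List[Tuple[int, int]]]


def toy_predictor(catalyst: Catalyst) -> float:
    """Stand-in for a trained GNN (assumption, not the paper's model).

    Scores a catalyst as the sum of invented per-element weights plus a
    bonus for every bond that joins two different elements.
    """
    elements, bonds = catalyst
    weight = {"Pd": 1.0, "Cu": 0.6, "Ni": 0.4}  # illustrative values only
    site_score = sum(weight[e] for e in elements)
    mixed_bonds = sum(1 for i, j in bonds if elements[i] != elements[j])
    return site_score + 0.5 * mixed_bonds


def single_site_counterfactuals(
    catalyst: Catalyst,
    predict: Callable[[Catalyst], float],
    candidates: List[str],
) -> List[Tuple[float, int, str, str]]:
    """Try every one-site substitution and keep those that improve the score.

    Returns (gain, site, original_element, new_element) tuples, best first.
    """
    elements, bonds = catalyst
    base = predict(catalyst)
    improved = []
    for site, original in enumerate(elements):
        for new in candidates:
            if new == original:
                continue  # not a modification
            edited = elements[:site] + [new] + elements[site + 1:]
            gain = predict((edited, bonds)) - base
            if gain > 0:
                improved.append((gain, site, original, new))
    improved.sort(key=lambda t: t[0], reverse=True)
    return improved


# Usage: a three-site Cu chain, with Pd and Ni as candidate substituents.
chain: Catalyst = (["Cu", "Cu", "Cu"], [(0, 1), (1, 2)])
ranked = single_site_counterfactuals(chain, toy_predictor, ["Pd", "Cu", "Ni"])
best_gain, best_site, old, new = ranked[0]
print(f"Best edit: site {best_site} {old} -> {new} (gain {best_gain:.2f})")
```

In this toy setting the search is exhaustive over one-site edits, which keeps each counterfactual minimal by construction. A realistic search space would be far larger and would likely need chemical validity constraints on the proposed edits.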