A team of Transregio researchers from Bielefeld University and LMU Munich has developed a new open-source tool: shapiq. The software extends existing methods of explainable AI by analyzing not only the influence of individual features, but also their joint interactions. This makes complex models such as neural networks or decision trees more transparent, an important step towards trust, fairness and optimization in artificial intelligence. Maximilian Muschalik and Fabian Fumagalli will present their results at the Neural Information Processing Systems (NeurIPS) 2024 conference.
Modern artificial intelligence (AI) is based on models that are often referred to as “black boxes” because their decisions are difficult for users to understand. However, in sensitive areas such as medicine, finance or autonomous driving, it is crucial to understand how a model arrives at its results. This is where Shapley values come into play, a method from game theory that measures the contribution of each individual feature to a model's output.
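To make the idea concrete, here is a minimal sketch of how exact Shapley values can be computed for a tiny, made-up example. It does not use shapiq and is not how shapiq works internally; the feature names, the payoff function `toy_value` and the helper `shapley_values` are all hypothetical and chosen only for illustration. The sketch enumerates every coalition of features, which is feasible only for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values by enumerating all coalitions (small inputs only)."""
    n = len(players)
    result = {}
    for player in players:
        others = [p for p in players if p != player]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                # Marginal contribution of `player` when joining `coalition`.
                total += weight * (value(coalition | {player}) - value(coalition))
        result[player] = total
    return result


def toy_value(coalition):
    """Hypothetical payoff: what a model 'gains' from knowing these features."""
    score = 0.0
    if "age" in coalition:
        score += 10.0
    if "income" in coalition:
        score += 20.0
    if {"age", "income"} <= coalition:
        score += 6.0  # assumed bonus that only appears when both are known
    return score


features = ["age", "income", "city"]
phi = shapley_values(features, toy_value)
for name in features:
    print(f"{name}: {phi[name]:.2f}")
```

In this toy setup the bonus of 6 is split evenly between "age" and "income", so the individual values alone do not show that it arises only from the combination of the two.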
The problem is that many decisions are influenced not only by individual features, but also by their interactions. For example, features such as “latitude” and “longitude” can only be used together to determine an exact location. To capture this complexity, the team uses Shapley interactions, an extension of the classic Shapley values. “Shapley interactions allow us to go beyond the purely isolated consideration of features and better understand complex relationships,” explains Maximilian Muschalik, lead author of the project. “With shapiq, we are not only contributing to fundamental research, but also creating a practical solution for users.” Fabian Fumagalli, co-author and expert in the field, explains further: “The calculation of Shapley interactions is a complex problem that requires specific algorithms, which we have now presented.”
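The latitude and longitude example can be illustrated with a small sketch. It computes pairwise interaction scores using the classic Shapley interaction index from game theory, one of several possible interaction definitions; it does not use shapiq and makes no claim about the algorithms shapiq implements. The payoff function `location_value` and the helper `pairwise_shapley_interactions` are hypothetical, and the brute-force enumeration only works for a few features.

```python
from itertools import combinations
from math import factorial


def pairwise_shapley_interactions(players, value):
    """Exact pairwise Shapley interaction index by full enumeration."""
    n = len(players)
    result = {}
    for i, j in combinations(players, 2):
        others = [p for p in players if p not in (i, j)]
        total = 0.0
        for size in range(n - 1):
            # Weight of a coalition of this size that contains neither i nor j.
            weight = factorial(size) * factorial(n - size - 2) / factorial(n - 1)
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Joint effect of i and j beyond their separate effects.
                delta = value(s | {i, j}) - value(s | {i}) - value(s | {j}) + value(s)
                total += weight * delta
        result[(i, j)] = total
    return result


def location_value(coalition):
    """Hypothetical payoff: a position is only known with BOTH coordinates."""
    score = 0.0
    if {"latitude", "longitude"} <= coalition:
        score += 1.0
    if "altitude" in coalition:
        score += 0.5  # assumed independent contribution
    return score


features = ["latitude", "longitude", "altitude"]
interactions = pairwise_shapley_interactions(features, location_value)
for pair, score in interactions.items():
    print(pair, round(score, 3))
```

In this made-up game the pair ("latitude", "longitude") receives an interaction score of 1.0, while both pairs involving "altitude" receive 0.0, reflecting that altitude contributes independently of the other two features.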
The tool: shapiq
The shapiq Python package has been specifically designed to standardize and simplify the research and application of Shapley values and interactions. Key features include:
- Efficient calculation: Despite the high theoretical complexity of Shapley interactions, shapiq provides algorithms that enable efficient calculation, even with large data sets and complex models.
- Visualization: The results can be presented clearly so that even non-experts can intuitively understand the interactions.
- Benchmarking: The tool includes a comprehensive benchmark suite of eleven real-world cases, allowing researchers to systematically evaluate the performance of new algorithms.
- Flexibility: From tree-based models (such as XGBoost) to neural networks and modern language models, shapiq can be applied across model types.
The tool is not only a step forward for research, but also offers practitioners an instrument to make models easier to understand and safer to use.
Presentation at NeurIPS conference
The results of the project will be presented at the NeurIPS 2024 conference. Interested parties can learn more about the tool and its possible applications at the conference or via the GitHub repository, which also contains extensive documentation and examples of practical use.
Open source development
Interested users can directly support the research of the Transregio team by starring the GitHub repository or by sending the team suggestions for improvement. Because the software is developed as open source, they can also contribute directly to the implementation.