With the superior results achieved by black box machine learning techniques, in particular neural networks, there has been an increasing demand for understanding how an artificial intelligence (AI) system achieves its results and decisions. This sparked the field of explainable artificial intelligence (also coined explainable AI or XAI).
In this seminar, you will familiarize yourself with recent advancements in the field of explainable artificial intelligence. You will read research papers as well as conduct your own experiments where applicable, and you will discuss the insights with the other participants of the seminar.
As a participant, you will introduce a particular XAI technique and present it to the other seminar participants. Each seminar paper undergoes a peer review process within the seminar. Presentations should be about 25 minutes long.
This seminar is organized by Prof. Dr. Heiko Paulheim.
Available for Master students (2 SWS, 4 ECTS)
For now, we plan with an on-campus seminar.
There will be four dates with 2–3 presentations each:
Note: the topic list below contains one literature pointer per topic. These papers are examples, but they are not exhaustive, i.e., it is part of your task to collect further papers on your topic.
| Topic | Literature pointer |
| --- | --- |
| Local interpretable model-agnostic explanations (LIME) | Ribeiro et al.: “Why Should I Trust You?”: Explaining the Predictions of Any Classifier, 2016 |
| Feature importance | Casalicchio et al.: Visualizing the Feature Importance for Black Box Models, 2018 |
| Shapley values | Lundberg and Lee: A Unified Approach to Interpreting Model Predictions, 2017 |
| Saliency maps | Simonyan et al.: Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps, 2013 |
| Activation maximization | Erhan et al.: Understanding Representations Learned in Deep Architectures, 2010 |
| Surrogate models | Bastani et al.: Interpretability via Model Extraction, 2017 |
| Rule extraction | Zilke et al.: DeepRED – Rule Extraction from Deep Neural Networks, 2016 |
| Partial dependence plots (PDP) | Cafri and Bailey: Understanding Variable Effects from Black Box Prediction: Quantifying Effects in Tree Ensembles Using Partial Dependence, 2021 |
| Individual conditional expectation | Goldstein et al.: Peeking Inside the Black Box: Visualizing Statistical Learning With Plots of Individual Conditional Expectation, 2015 |
| Decomposition | Robnik-Šikonja and Kononenko: Explaining Classifications For Individual Instances, 2008 |
| Model distillation | Tan et al.: Distill-and-Compare: Auditing Black-Box Models Using Transparent Model Distillation, 2017 |
| Sensitivity analysis | Cortez and Embrechts: Using sensitivity analysis and visualization techniques to open black box data mining models, 2013 |
| Layer-wise relevance propagation (LRP) | Montavon et al.: Layer-Wise Relevance Propagation: An Overview, 2019 |
| Prototypes and criticisms | Kim et al.: Examples are not Enough, Learn to Criticize! Criticism for Interpretability, 2016 |
| Counterfactual explanations | Wachter et al.: Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR, 2017 |
The following articles provide introductions and surveys for the topic and are recommended for all seminar participants to read: