Existing methods for explainable artificial intelligence (XAI), including popular feature importance measures such as SAGE, are mostly restricted to the batch learning scenario. However, machine learning is often applied in dynamic environments, where data arrives continuously and learning must be done in an online manner. Therefore, we propose iSAGE, a time- and memory-efficient incrementalization of SAGE, which is able to react to changes in the model as well as to drift in the data-generating process. We further provide efficient feature removal methods that break (interventional) and retain (observational) feature dependencies. Moreover, we formally analyze our explanation method to show that iSAGE adheres to similar theoretical properties as SAGE. Finally, we evaluate our approach in a thorough experimental analysis based on well-established data sets and data streams with concept drift.


Muschalik, M., Fumagalli, F., Hammer, B., Hüllermeier, E., (2023) iSAGE: An Incremental Version of SAGE for Online Explanation on Data Streams. In: Koutra, D., Plant, C., Gomez Rodriguez, M., Baralis, E., Bonchi, F. (eds) Machine Learning and Knowledge Discovery in Databases: Research Track. ECML PKDD 2023. Lecture Notes in Computer Science, vol 14171. Springer, Cham.

Explainable artificial intelligence has mainly focused on static learning scenarios so far. We are interested in dynamic scenarios where data is sampled progressively, and learning is done in an incremental rather than a batch mode. We seek efficient incremental algorithms for computing feature importance (FI). Permutation feature importance (PFI) is a well-established model-agnostic measure to obtain global FI based on feature marginalization of absent features. We propose an efficient, model-agnostic algorithm called iPFI to estimate this measure incrementally and under dynamic modeling conditions including concept drift. We prove theoretical guarantees on the approximation quality in terms of expectation and variance. To validate our theoretical findings and the efficacy of our approaches in incremental scenarios dealing with streaming data rather than traditional batch settings, we conduct multiple experimental studies on benchmark data with and without concept drift.


Fumagalli, F., Muschalik, M., Hüllermeier, E., Hammer, B., (2023) Incremental permutation feature importance (iPFI): towards online explanations on data streams. Mach Learn (2023).
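The incremental estimation idea described in the abstract above can be sketched as follows. This is an illustrative simplification, not the authors' algorithm: the class name, the `loss_fn` wrapper, and the reservoir-sampling details are all assumptions made for the sketch.

```python
import random

class IncrementalPFI:
    """Sketch of incremental permutation feature importance (iPFI-style):
    a feature's relevance is estimated as the increase in loss when that
    feature is replaced by a value sampled from previously seen data,
    smoothed with an exponential moving average so the estimate can
    track model changes and concept drift."""

    def __init__(self, loss_fn, n_features, alpha=0.05, reservoir_size=100):
        self.loss_fn = loss_fn            # loss_fn(x, y) -> float, wraps the current model
        self.alpha = alpha                # smoothing factor: higher = faster drift reaction
        self.reservoir_size = reservoir_size
        self.reservoir = []               # sample of past observations for marginalization
        self.seen = 0
        self.pfi = [0.0] * n_features

    def update(self, x, y):
        if self.reservoir:
            base = self.loss_fn(x, y)
            donor = random.choice(self.reservoir)
            for j in range(len(self.pfi)):
                x_perm = list(x)
                x_perm[j] = donor[j]      # break the link between feature j and y
                delta = self.loss_fn(x_perm, y) - base
                self.pfi[j] = (1 - self.alpha) * self.pfi[j] + self.alpha * delta
        # reservoir sampling keeps an approximately uniform sample of the stream
        self.seen += 1
        if len(self.reservoir) < self.reservoir_size:
            self.reservoir.append(list(x))
        elif random.random() < self.reservoir_size / self.seen:
            self.reservoir[random.randrange(self.reservoir_size)] = list(x)
```

On a stream where only the first feature determines the label, `pfi[0]` grows positive while `pfi[1]` stays at zero, reflecting the importance ranking one would expect.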

Removal-based explanations are a general framework to provide feature importance scores, where feature removal, i.e., restricting a model to a subset of features, is a central component. While many machine learning applications require dynamic modeling environments, where distributions and models change over time, removal-based explanations and feature removal have mainly been considered in a static batch learning environment. Recently, an interventional and an observational perturbation method were presented that make it possible to remove features efficiently in dynamic learning environments with concept drift. In this paper, we compare these two algorithms on two synthetic data streams. We showcase how both yield substantially different explanations when features are correlated, and provide guidance on the choice based on the application.


Fumagalli, F., Muschalik, M., Hüllermeier, E., Hammer, B., (2023) On Feature Removal for Explainability in Dynamic Environments. ESANN 2023 proceedings.

Post-hoc explanation techniques such as the well-established partial dependence plot (PDP), which investigates feature dependencies, are used in explainable artificial intelligence (XAI) to understand black-box machine learning models. While many real-world applications require dynamic models that constantly adapt over time and react to changes in the underlying distribution, XAI, so far, has primarily considered static learning environments, where models are trained in a batch mode and remain unchanged. We thus propose a novel model-agnostic XAI framework called incremental PDP (iPDP) that extends the PDP to extract time-dependent feature effects in non-stationary learning environments. We formally analyze iPDP and show that it approximates a time-dependent variant of the PDP that properly reacts to real and virtual concept drift. The time-sensitivity of iPDP is controlled by a single smoothing parameter, which directly corresponds to the variance and the approximation error of iPDP in a static learning environment. We illustrate the efficacy of iPDP by showcasing an example application for drift detection and conducting multiple experiments on real-world and synthetic data sets and streams.


Muschalik, M., Fumagalli, F., Jagtani, R., Hammer, B., Hüllermeier, E., (2023) iPDP: On Partial Dependence Plots in Dynamic Modeling Scenarios. In: Longo, L. (eds) Explainable Artificial Intelligence. xAI 2023. Communications in Computer and Information Science, vol 1901. Springer, Cham.
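The mechanism stated in the abstract, a single smoothing parameter that trades drift reaction speed against variance, can be sketched roughly as follows. This is a hedged illustration, not the published algorithm; the class name and interface are assumptions.

```python
class IncrementalPDP:
    """Sketch of an incremental partial dependence plot (iPDP-style): for each
    grid value g of the feature of interest, keep an exponentially smoothed
    average of the model's prediction when that feature is forced to g.
    The single smoothing parameter alpha controls time-sensitivity: a larger
    alpha reacts faster to concept drift but yields a higher-variance curve."""

    def __init__(self, predict_fn, feature, grid, alpha=0.05):
        self.predict_fn = predict_fn      # the current (possibly changing) model
        self.feature = feature            # index of the feature of interest
        self.grid = grid                  # evaluation points g_1, ..., g_k
        self.alpha = alpha
        self.pdp = [0.0] * len(grid)      # one smoothed estimate per grid point

    def update(self, x):
        for i, g in enumerate(self.grid):
            x_mod = list(x)
            x_mod[self.feature] = g       # intervene on the feature of interest
            y_hat = self.predict_fn(x_mod)
            self.pdp[i] = (1 - self.alpha) * self.pdp[i] + self.alpha * y_hat
```

For a static model and stream, each `pdp[i]` converges to the ordinary PDP value at grid point `g_i`; under drift, older predictions are discounted geometrically.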


A scoring system is a simple decision model that checks a set of features, adds a certain number of points to a total score for each feature that is satisfied, and finally makes a decision by comparing the total score to a threshold. Scoring systems have a long history of active use in safety-critical domains such as healthcare and justice, where they provide guidance for making objective and accurate decisions. Given their genuine interpretability, the idea of learning scoring systems from data is obviously appealing from the perspective of explainable AI. In this paper, we propose a practically motivated extension of scoring systems called probabilistic scoring lists (PSL), as well as a method for learning PSLs from data. Instead of making a deterministic decision, a PSL represents uncertainty in the form of probability distributions. Moreover, in the spirit of decision lists, a PSL evaluates features one by one and stops as soon as a decision can be made with enough confidence. To evaluate our approach, we conduct a case study in the medical domain.


Hanselle, J., Fürnkranz, J., Hüllermeier, E., (2023) Probabilistic Scoring Lists for Interpretable Machine Learning. Lecture Notes in Computer Science, vol 14050, pp. 189–203.
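The decision procedure described in the abstract, checking features one by one and stopping once a decision can be made with enough confidence, might look roughly like this. This is a sketch under our own assumptions, not the authors' method: the stage structure (`prob_table` mapping partial scores to class probabilities) is illustrative.

```python
def psl_decide(x, stages, min_confidence=0.85):
    """Sketch of probabilistic scoring list (PSL) evaluation: each stage is
    (feature_index, points, prob_table), where prob_table maps a partial
    total score to an estimated probability of the positive class.
    Features are evaluated sequentially; evaluation stops early once the
    probability is confident enough in either direction."""
    score = 0
    p = 0.5                               # uninformed prior before any feature is checked
    for feature_index, points, prob_table in stages:
        if x[feature_index]:
            score += points               # add points for each satisfied feature
        p = prob_table[score]             # probability estimate given the partial score
        if p >= min_confidence or p <= 1 - min_confidence:
            break                         # decision possible with enough confidence
    return p >= 0.5, p
```

Unlike a deterministic scoring system, the output is a probability rather than a hard threshold decision, and, in the spirit of decision lists, cheap features can settle easy cases without evaluating the rest.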

Explanation has been identified as an important capability for AI-based systems, but research on systematic strategies for achieving understanding in interaction with such systems is still sparse. Negation is a linguistic strategy that is often used in explanations. It creates a contrast space between the affirmed and the negated item that enriches explaining processes with additional contextual information. While negation in human speech has been shown to lead to higher processing costs and worse task performance in terms of recall or action execution when used in isolation, it can decrease processing costs when used in context. So far, it has not been considered as a guiding strategy for explanations in human-robot interaction. We conducted an empirical study to investigate the use of negation as a guiding strategy in explanatory human-robot dialogue, in which a virtual robot explains tasks and possible actions to a human explainee, who solves them via gestures on a touchscreen. Our results show that negation vs. affirmation 1) increases processing costs measured as reaction time and 2) improves several aspects of task performance. While there was no significant effect of negation on the number of initially correctly executed gestures, we found a significantly lower number of attempts (measured as breaks in the finger movement data before the correct gesture was carried out) when participants were instructed through a negation. We further found that the gestures resembled the presented prototype gesture significantly more closely following an instruction with a negation as opposed to an affirmation. Also, the participants rated the benefit of contrastive vs. affirmative explanations significantly higher. Repeating the instructions decreased the effects of negation, yielding similar processing costs and task performance measures for negation and affirmation after several iterations.
We discuss our results with respect to possible effects of negation on linguistic processing of explanations and limitations of our study.


Groß, A., Singh, A., Banh, N.C., Richter, B., Scharlau, I., Rohlfing, K.J., Wrede, B., (2023) Scaffolding the human partner by contrastive guidance in an explanatory human-robot dialogue. Front. Robot. AI 10:1236184. DOI: 10.3389/frobt.2023.1236184

In recent years the use of Artificial Intelligence (AI) has become increasingly prevalent in a growing number of fields. As AI systems are being adopted in more high-stakes areas such as medicine and finance, ensuring that they are trustworthy is of increasing importance, a concern that is prominently addressed by the development and application of explainability methods, which are purported to increase trust from users and wider society. While an increase in trust may be desirable, an analysis of literature from different research fields shows that an exclusive focus on increasing trust may not be warranted. This is well exemplified by the recent development of AI chatbots, which, while highly coherent, tend to make up facts. In this contribution, we investigate the concepts of trust, trustworthiness, and user reliance.

In order to foster appropriate reliance on AI, we need to prevent both disuse of these systems and overtrust. From our analysis of research on interpersonal trust, trust in automation, and trust in (X)AI, we identify the potential merit of the distinction between trust and distrust (in AI). We propose that alongside trust, a healthy amount of distrust is of additional value for mitigating disuse and overtrust. We argue that by considering and evaluating both trust and distrust, we can ensure that users can rely appropriately on trustworthy AI, which can be both useful and fallible.


Peters, T.M., Visser, R.W. (2023). The Importance of Distrust in AI. In: Longo, L. (eds) Explainable Artificial Intelligence. xAI 2023. Communications in Computer and Information Science, vol 1903. Springer, Cham.

In XAI it is important to consider that, in contrast to explanations for professional audiences, one cannot assume common expertise when explaining for laypeople. But such explanations between humans vary greatly, making it difficult to research commonalities across explanations. We used the dual nature theory, a techno-philosophical approach, to cope with these challenges. According to it, one can explain, for example, an XAI's decision by addressing its dual nature: by focusing on the Architecture (e.g., the logic of its algorithms) or the Relevance (e.g., the severity of a decision, the implications of a recommendation). We investigated 20 game explanations using the theory as an analytical framework. We elaborate on how we used the theory to quickly structure and compare explanations of technological artifacts. We supplemented results from analyzing the explanation contents with results from a video recall to explore how explainers justified their explanation. We found that explainers focused on the physical aspects of the game first (Architecture) and only later on aspects of the Relevance. Reasoning in the video recalls indicated that the explainers regarded the focus on the Architecture as important for structuring the explanation initially, explaining the basic components before focusing on more complex, intangible aspects. Shifting between addressing the two sides was justified by explanation goals, emerging misunderstandings, and the knowledge needs of the explainee. We discovered several commonalities that inspire future research questions which, if further generalizable, provide first ideas for the construction of synthetic explanations.


Terfloth, L., Schaffer, M., Buhl, H.M., Schulte, C., (2023) Adding Why to What? Analyses of an Everyday Explanation. arXiv:2308.04187.

The study investigates two different ways of guiding the addressee of an explanation (an explainee) through action demonstration: contrastive and non-contrastive. Their effect was tested on attention to specific action elements (goal) as well as on event memory. In an eye-tracking experiment, participants were shown different motion videos that were either contrastive or non-contrastive with respect to the segments of movement presentation. Given that everyday action demonstration is often multimodal, the stimuli were created with respect to their visual and verbal presentation. For visual presentation, a video combined two movements in a contrastive (e.g., an Up-motion following a Down-motion) or non-contrastive way (e.g., two Up-motions following each other). For verbal presentation, each video was combined with a sequence of instruction descriptions in the form of negative (i.e., contrastive) or assertive (i.e., non-contrastive) guidance. It was found that a) attention to the event goal increased for the contrastive condition in the later time window, and b) participants' recall of the event was facilitated when a visually contrastive motion was combined with a verbal contrast.


Singh, A., Rohlfing, K.J., (2023) Contrastiveness in the context of action demonstration: an eye-tracking study on its effects on action perception and action recall. In: Proceedings of the Annual Meeting of the Cognitive Science Society 45 (45). Cognitive Science Society.

Emotions play an important role in human decision-making. However, first approaches to incorporating knowledge of this influence into AI-based decision support systems are only very recent. Accordingly, our target is to develop an interactive intelligent agent that is capable of explaining the recommendations of AI systems while taking emotional constraints into account. This article addresses the following research questions based on the emotions of happiness and anxiety: (1) How do induced emotions influence risk propensity in HCI? (2) To what extent does the explanation strategy influence the human explanation recipient in a lottery choice? (3) How well can an HCI system estimate the emotional state of the human? Our results showed that (1) our emotion induction strategy was successful; however, the trend took the opposite direction of ATF predictions. (2) Our explanation strategy yielded a change in the risk decision in only 26% of the participants; in some cases, participants even changed their selection in the opposite direction. (3) Emotion recognition from facial expressions did not provide sufficient indications of the emotional state (because of head position and a lack of emotional display), but heart rate showed significant effects of emotion induction in the expected direction. Importantly, in individual cases, the dynamics of facial expressions followed the expected path. We concluded that (1) more differentiated explanation strategies are needed and that temporal dynamics may play an important role in the explanation process, and (2) that a more interactive setting is required to elicit more emotional cues that can be used to adapt the explanation strategy accordingly.


Schütze, C., Lammert, O., Richter, B., et al., (2023) Emotional Debiasing Explanations for Decisions in HCI. In: Artificial Intelligence in HCI. Lecture Notes in Computer Science, vol 14050, pp. 318–336.

Human integration in machine learning can take place in various forms and stages. The current study examines the process of feature selection, with a specific focus on eliciting and aggregating feature rankings by human subjects. The elicitation is guided by the principles of expert judgment elicitation, a field of study that has investigated the aggregation of multiple opinions for the purpose of mitigating biases and enhancing accuracy. An online experiment was conducted with 234 participants to evaluate the impact of different elicitation and aggregation methods, namely behavioral aggregation, mathematical aggregation, and the Delphi method, compared to individual expert opinions, on feature ranking accuracy. The results indicate that the aggregation method significantly affects the rankings, with behavioral aggregation having a more significant impact than mean and median aggregation. On the other hand, the Delphi method had minimal impact on the rankings compared to individual rankings.


Kornowicz, J., Thommes, K., (2023) Aggregating Human Domain Knowledge for Feature Ranking. In: Artificial Intelligence in HCI. Lecture Notes in Computer Science, vol 14050, pp. 98–114.

Artificial intelligence (AI) outperforms humans in numerous domains. Despite security and ethical concerns, AI is expected to provide crucial improvements on both personal and societal levels. However, algorithm aversion is known to reduce the effectiveness of human-AI interaction and diminish the potential benefits of AI. In this paper, we build upon Dual System Theory and investigate the effect of the AI response time on algorithm aversion for slow-thinking and fast-thinking tasks. To answer our research question, we conducted a 2×2 incentivized laboratory experiment with 116 students in an advice-taking setting. We manipulated the length of the AI response time (short vs. long) and the task type (fast-thinking vs. slow-thinking). In addition to these treatments, we varied the domain of the task. Our results demonstrate that long response times are associated with lower algorithm aversion, both when subjects think fast and when they think slow. Moreover, when subjects were thinking fast, we found significant differences in algorithm aversion between the task domains.



Lebedeva, A., Kornowicz, J., Lammert, O., Papenkordt, J., (2023) The Role of Response Time for Algorithm Aversion in Fast and Slow Thinking Tasks. In: Artificial Intelligence in HCI. Lecture Notes in Computer Science, vol 14050, pp. 131–149. This publication was created in cooperation with Arbeitswelt.Plus.

Scharlau, I., Karsten, A., (2023) Schreibfokussierte Graduiertenförderung: Reflexive Spezialisierung für interdisziplinäre Forschungskontexte. In: Berendt, B., Fleischman, A., Salmhofe, G., Schaper, N., Szczyrba, B., Wiemer, M., Wildt, J. (eds) Neues Handbuch Hochschullehre. DUZ medienhaus, pp. 17–35.

Machine learning is frequently used in affective computing, but presents challenges due to the opacity of state-of-the-art machine learning methods. Because of the impact affective machine learning systems may have on an individual's life, it is important that models be made transparent to detect and mitigate biased decision making. In this regard, affective machine learning could benefit from the recent advancements in explainable artificial intelligence (XAI) research. We perform a structured literature review to examine the use of interpretability in the context of affective machine learning. We focus on studies using audio, visual, or audiovisual data for model training and identify 29 research articles. Our findings show an emergence of the use of interpretability methods in the last five years. However, their use is currently limited regarding the range of methods used, the depth of evaluations, and the consideration of use-cases. We outline the main gaps in the research and provide recommendations for researchers that aim to implement interpretable methods for affective machine learning.



Johnson, S., Hakobyan, O., Drimalla, H., (2023) Towards Interpretability in Audio and Visual Affective Machine Learning: A Review. arXiv preprint arXiv:2306.08933.

We investigate how people with atypical bodily capabilities interact within virtual reality (VR) and the way they overcome interactional challenges in these emerging social environments. Based on a videographic multimodal single case analysis, we demonstrate how non-speaking VR participants furnish their bodies, at-hand instruments, and their interactive environment for their practical purposes. Our findings are subsequently related to renewed discussions of the relationship between agency and environment, and the co-constructed nature of situated action. We thus aim to contribute to the growing vocabulary of atypical interaction analysis and the broader context of ethnomethodological conceptualizations of unorthodox and fractured interactional ecologies.


Klowait, N., Erofeeva, M., (2023) Halting the Decay of Talk. In: Social Interaction. Video-Based Studies of Human Sociality, vol. 6.


Popular speech disentanglement systems decompose a speech signal into a content embedding and a speaker embedding, from which a decoder reconstructs the input signal. Often it is unknown which information is encoded in the speaker embeddings. In this work, such a system is investigated on German speech data. We show that directions in the speaker embedding space correlate with different acoustic signal properties that are known to be characteristic of a speaker, and that by manipulating the embeddings along these directions, the decoder synthesises a speech signal with modified acoustic properties.



Rautenberg, F., Kuhlmann, M., Ebbers, J., et al., (2023) Speech Disentanglement for Analysis and Modification of Acoustic and Perceptual Speaker Characteristics. In: Fortschritte der Akustik - DAGA 2023, pp. 1409–1412.

In virtual reality (VR), participants may not always have hands, bodies, eyes, or even voices—using VR helmets and two controllers, participants control an avatar through virtual worlds that do not necessarily obey familiar laws of physics; moreover, the avatar’s bodily characteristics may not neatly match our bodies in the physical world. Despite these limitations and specificities, humans get things done through collaboration and the creative use of the environment. While multiuser interactive VR is attracting greater numbers of participants, there are currently few attempts to analyze the in situ interaction systematically. This paper proposes a video-analytic detail-oriented methodological framework for studying virtual reality interaction. Using multimodal conversation analysis, the paper investigates a nonverbal, embodied, two-person interaction: two players in a survival game strive to gesturally resolve a misunderstanding regarding an in-game mechanic—however, both of their microphones are turned off for the duration of play. The players’ inability to resort to complex language to resolve this issue results in a dense sequence of back-and-forth activity involving gestures, object manipulation, gaze, and body work. Most crucially, timing and modified repetitions of previously produced actions turn out to be the key to overcome both technical and communicative challenges. The paper analyzes these action sequences, demonstrates how they generate intended outcomes, and proposes a vocabulary to speak about these types of interaction more generally. The findings demonstrate the viability of multimodal analysis of VR interaction, shed light on unique challenges of analyzing interaction in virtual reality, and generate broader methodological insights about the study of nonverbal action.



Klowait, N., (2023) On the Multimodal Resolution of a Search Sequence in Virtual Reality. Human Behavior and Emerging Technologies, vol. 2023, Article ID 8417012, 15 pages.

In this paper, we investigate the effect of distractions and hesitations as a scaffolding strategy. Recent research points to the potential beneficial effects of a speaker’s hesitations on the listeners’ comprehension of utterances, although results from studies on this issue indicate that humans do not make strategic use of them. The role of hesitations and their communicative function in human-human interaction is a much-discussed topic in current research. To better understand the underlying cognitive processes, we developed a human–robot interaction (HRI) setup that allows the measurement of the electroencephalogram (EEG) signals of a human participant while interacting with a robot. We thereby address the research question of whether we find effects on single-trial EEG based on the distraction and the corresponding robot’s hesitation scaffolding strategy. To carry out the experiments, we leverage our LabLinking method, which enables interdisciplinary joint research between remote labs. This study could not have been conducted without LabLinking, as the two involved labs needed to combine their individual expertise and equipment to achieve the goal together. The results of our study indicate that the EEG correlates in the distracted condition are different from the baseline condition without distractions. Furthermore, we could differentiate the EEG correlates of distraction with and without a hesitation scaffolding strategy. This proof-of-concept study shows that LabLinking makes it possible to conduct collaborative HRI studies in remote laboratories and lays the first foundation for more in-depth research into robotic scaffolding strategies.



Richter, B., Putze, F., Ivucic, G., Brandt, M., Schütze, C., Reisenhofer, R., Wrede, B., Schultz, T., (2023) EEG Correlates of Distractions and Hesitations in Human–Robot Interaction: A LabLinking Pilot Study. Multimodal Technol. Interact. 7(4), 37.

Since robots can facilitate our everyday life by assisting us in basic tasks, they are continuously being integrated into our lives. However, for a robot to establish itself, a user must accept and trust its doing. As the saying goes, you don't trust things you don't understand. Therefore, the base hypothesis of this paper is that providing technical transparency for users can increase understanding of the robot architecture and its behaviors, as well as trust in and acceptance of it. In this work, we aim to improve understanding, trust, and acceptance of a robot by displaying transparent visualizations of its intention and perception in augmented reality (AR). We conducted a user study where robot navigation with certain interruptions was demonstrated to two groups. The first group did not have AR visualizations displayed during the first demonstration; in the second demonstration, the visualizations were shown. The second group had the visualizations displayed throughout only one demonstration. Results showed that understanding increased with AR visualizations when prior knowledge had been gained in previous demonstrations.

Leonie Dyck, Helen Beierling, Robin Helmert, and Anna-Lisa Vollmer. 2023. Technical Transparency for Robot Navigation Through AR Visualizations. In Companion of the 2023 ACM/IEEE International Conference on Human-Robot Interaction (HRI '23). Association for Computing Machinery, New York, NY, USA, 720–724.

Sociologica – International Journal for Sociological Debate is a peer-reviewed journal published three times a year. The journal publishes theoretical, methodological and empirical articles providing original and rigorous contributions to the current sociological debate. Founded in 2007, Sociologica is one of the first international journals of sociology published solely online.

Esposito, E. (ed.) (2022) Sociologica, Vol. 16, No. 3.

This short introduction presents the symposium ‘Explaining Machines’. It locates the debate about Explainable AI in the history of the reflection about AI and outlines the issues discussed in the contributions.

Esposito, E. (2023). Explaining Machines: Social Management of Incomprehensible Algorithms. Introduction. Sociologica, 16(3), 1–4.

Dealing with opaque algorithms, the frequent overlap between transparency and explainability produces seemingly unsolvable dilemmas, such as the much-discussed trade-off between model performance and model transparency. Referring to Niklas Luhmann's notion of communication, the paper argues that explainability does not necessarily require transparency and proposes an alternative approach. Explanations as communicative processes do not imply any disclosure of thoughts or neural processes, but only reformulations that provide the partners with additional elements and enable them to understand (from their perspective) what has been done and why. Recent computational approaches aiming at post-hoc explainability reproduce what happens in communication, producing explanations of the workings of algorithms that can differ from the algorithms' actual processes.

Esposito, E. (2023). Does Explainability Require Transparency?. Sociologica, 16(3), 17–27.

The automatic generation of explanations is an increasingly important problem in the field of Explainable AI (XAI). However, while most work looks at how complete and correct information can be extracted or how it can be presented, the success of an explanation also depends on the person the explanation is targeted at. We present an adaptive explainer model that constructs and employs a partner model to tailor explanations during the course of the interaction. The model incorporates different linguistic levels of human-like explanations in a hierarchical, sequential decision process within a non-stationary environment. The model is based on online planning (using Monte Carlo Tree Search) to solve a continuously adapted MDP for explanation action and explanation move selection. We present the model as well as first results from explanation interactions with different kinds of simulated users.

Robrecht, A. and Kopp, S. (2023). SNAPE: A Sequential Non-Stationary Decision Process Model for Adaptive Explanation Generation. In Proceedings of the 15th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, ISBN 978-989-758-623-1; ISSN 2184-433X, pages 48-58. DOI: 10.5220/0011671300003393


Metaphors frame a given target domain using concepts from another, usually more concrete, source domain. Previous research in NLP has focused on the identification of metaphors and the interpretation of their meaning. In contrast, this paper studies to what extent the source domain can be predicted computationally from a metaphorical text. Given a dataset with metaphorical texts from a finite set of source domains, we propose a contrastive learning approach that ranks source domains by their likelihood of being referred to in a metaphorical text. In experiments, it achieves reasonable performance even for rare source domains, clearly outperforming a classification baseline.

Meghdut Sengupta, Milad Alshomary, and Henning Wachsmuth. 2022. Back to the Roots: Predicting the Source Domain of Metaphors using Contrastive Learning. In Proceedings of the 3rd Workshop on Figurative Language Processing (FLP), pages 137–142, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.

It's conference time and with their selection of keynote speakers conferences tend to be seismographs for trends in research. I had a look at a handful of this year's international AI conferences. Here is my selection: a continuing trend seems to be explainable AI, with IJCAI featuring even two keynotes in this area: Mihaela van der Schaar brings our attention to machine learning interpretability in medicine, which requires new methods for non-static data and which aims to enable medical scientists to make new discoveries by unraveling the underlying governing equations of medicine from data. However, Tim Miller reminds us not to let the "inmates run the asylum". He argues that machine learning researchers may not bring in the best perspective to develop approaches for explanations that are helpful and understandable for lay persons. He makes a case for taking social scientists on board together with experts from human-computer interaction. Indeed, interdisciplinary research has to be at the core of making AI decisions understandable and tractable for lay persons. At AAAI, Cynthia Rudin shared her experiences of bringing interpretable models into situations with high societal stakes, such as decisions in criminal justice, healthcare, financial lending, and beyond, collaborating with people from different fields. It appears that this branch of AI requires new efforts in trans- and interdisciplinary research, and I think we can expect highly interesting new insights from this field.

Schmid U, Wrede B, eds. Explainable AI. KI - Künstliche Intelligenz. 2022.

During the last years, Explainable AI (XAI) has been established as a new area of research focusing on approaches that allow humans to comprehend and possibly control machine-learned (ML) models and other AI systems whose complexity makes the process leading to a specific decision opaque. In the beginning, most approaches were concerned with post-hoc explanations for classification decisions of deep learning architectures, especially for image classification. Furthermore, a growing number of empirical studies addressed the effects of explanations on trust in and acceptability of AI/ML systems. Recent work has broadened the perspective of XAI, covering topics such as verbal explanations, explanations by prototypes and contrastive explanations, combining explanations with interactive machine learning, multi-step explanations, explanations in the context of machine teaching, relations between interpretable approaches to machine learning and post-hoc explanations, and neuro-symbolic and other hybrid approaches combining reasoning and learning for XAI. Addressing criticism regarding missing adaptivity, more interactive accounts have been developed to take individual differences into account. Also, the question of evaluation beyond mere batch testing has come into focus.

Wrede, B. AI: Back to the Roots?. Künstl Intell 36, 117–120 (2022).

Recent approaches to Explainable AI (XAI) promise to satisfy diverse user expectations by allowing them to steer the interaction in order to elicit content relevant to them. However, little is known about how and to what extent the explainee takes part actively in the process of explaining. To tackle this empirical gap, we exploratively examined naturally occurring everyday explanations in doctor–patient interactions (N = 11). Following the social design of XAI, we view explanations as emerging in interactions: first, we identified the verbal behavior of both the explainer and the explainee in the sequential context, which we could assign to phases that were either monological or dialogical; second, we investigated in particular who was responsible for the initiation of the different phases. Finally, we took a closer look at the global conversational structure of explanations by applying a context-sensitive model of organizational jobs, thus adding a third layer of analysis. Results show that in our small sample of conversational explanations, both monological and dialogical phases varied in their length, timing of occurrence (at the early or later stages of the interaction) and their initiation (by the explainer or the explainee). They alternated several times in the course of the interaction. However, we also found some patterns suggesting that all interactions started with a monological phase initiated by the explainer. Both conversational partners contributed to the core organizational job that constitutes an explanation. We interpret the results as an indication for naturally occurring everyday explanations in doctor–patient interactions to be co-constructed on three levels of linguistic description: (1) by switching back and forth between monological to dialogical phases that (2) can be initiated by both partners and (3) by the mutual accomplishment and thus responsibility for an explanation’s core job that is crucial for the success of the explanation. 
Because of the explorative nature of our study, these results need to be investigated (a) with a larger sample and (b) in other contexts. However, our results suggest that future designs of artificial explainable systems should design the explanatory dialogue in such a way that it includes monological and dialogical phases that can be initiated not only by the explainer but also by the explainee, as both contribute to the core job of explicating procedural, clausal, or conceptual relations in explanations.

Fisher, J.B., Lohmer, V., Kern, F. et al. Exploring Monological and Dialogical Phases in Naturally Occurring Explanations. Künstl Intell (2022).

With the perspective on applications of AI technology, especially data-intensive deep learning approaches, the need for methods to control and understand such models has been recognized, giving rise to a new research domain labeled explainable artificial intelligence (XAI). In this overview paper we give an interim appraisal of what has been achieved so far and where gaps in the research remain. On the one hand, we take an interdisciplinary perspective to identify challenges for XAI research and point to open questions regarding the faithfulness and consistency of explanations. On the other hand, we see a need regarding the interaction between XAI systems and users: allowing for adaptability to specific information needs, for explanatory dialog supporting informed decision making, and for the possibility to correct models and explanations through interaction. This endeavor requires an integrated interdisciplinary perspective and rigorous approaches to empirical evaluation based on psychological, linguistic and even sociological theories.

Schmid, U., Wrede, B. What is Missing in XAI So Far?. Künstl Intell (2022).

Smart home systems contain plenty of features that enhance well-being in everyday life through artificial intelligence (AI). However, many users feel insecure because they do not understand the AI’s functionality and do not feel they are in control of it. Combining technical, psychological and philosophical views on AI, we rethink smart homes as interactive systems where users can partake in an intelligent agent’s learning. Parallel to the goals of explainable AI (XAI), we explored user involvement in supervised learning of the smart home as a first approach to improving acceptance, supporting subjective understanding and increasing perceived control. In this work, we conducted two studies: In an online pre-study, we asked participants about their attitude towards teaching AI via a questionnaire. In the main study, we performed a Wizard of Oz laboratory experiment with human participants, who spent time in a prototypical smart home and taught activity recognition to the intelligent agent through supervised learning based on their behaviour. We found that involvement in the AI’s learning phase enhanced the users’ feeling of control, perceived understanding and perceived usefulness of AI in general. The participants reported positive attitudes towards training a smart home AI and found the process understandable and controllable. We suggest that involving the user in the learning phase could lead to better personalisation and increased understanding and control by users of intelligent agents for smart home automation.

Sieger, L., Hermann, J., Schomäcker, A., Heindorf, S., Meske, C., Hey, C., and Doğangün, A.. 2022. User Involvement in Training Smart Home Agents: Increasing Perceived Control and Understanding. In Proceedings of the 10th International Conference on Human-Agent Interaction (HAI ’22), December 5–8, 2022, Christchurch, New Zealand. ACM, New York, NY, USA, 10 pages.

In this paper, we present visual programming software that enables non-technical domain experts to create robot-assisted therapy scenarios for multiple robotic platforms. We evaluate our new approach by comparing it with Choregraphe, the standard visual programming framework for the widely used robotics platforms Pepper and NAO. We show that our approach receives higher usability ratings and allows users to perform better in several practical tasks, including understanding, changing and creating small robot-assisted therapy scenarios.

Schütze, C., Groß, A., Wrede, B., & Richter, B. (2022). Enabling Non-Technical Domain Experts to Create Robot-Assisted Therapeutic Scenarios via Visual Programming. In R. Tumuluri, N. Sebe, G. Pingali, D. B. Jayagopi, A. Dhall, R. Singh, L. Anthony, et al. (Eds.), ICMI '22 Companion: INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION (pp. 166-170). New York, NY, USA: ACM.

In this paper, we present a software architecture for robot-assisted, configurable and autonomous joint-attention training scenarios to support autism therapy. The focus of the work is the expandability of the architecture to different robots, as well as maximizing the usability of the interface for the therapeutic user. By evaluating the user experience, we draw initial conclusions about the usability of the system for computer scientists and non-computer scientists. Both groups could solve different tasks without major issues, and the overall usability of the system was rated as good.

Groß, A., Schütze, C., Wrede, B., and Richter, B. (2022). An Architecture Supporting Configurable Autonomous Multimodal Joint-Attention-Therapy for Various Robotic Systems. In R. Tumuluri, N. Sebe, G. Pingali, D. B. Jayagopi, A. Dhall, R. Singh, L. Anthony, et al. (Eds.), ICMI '22 Companion: INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION (pp. 154-159). New York, NY, USA: ACM.

The algorithmic imaginary as a theoretical concept has received increasing attention in recent years, as it addresses users’ appropriation of algorithmic processes that operate in opacity. But the concept originally starts only from the users’ point of view, while the processes on the platforms’ side are largely left out. In contrast, this paper argues that what is true for users is also valid for algorithmic processes and the designers behind them. On the one hand, the algorithm imagines users’ future behavior via machine learning, which is supposed to predict all their future actions. On the other hand, the designers anticipate different actions that could potentially be performed by users with every new implementation of features such as social media feeds. In order to bring into view this permanently reciprocal interplay coupled to the imaginary, in which not only the users are involved, I argue for a more comprehensive and theoretically precise algorithmic imaginary, referring to the theory of Cornelius Castoriadis. From such a perspective, an important contribution can be formulated for a theory of social media platforms that goes beyond praxeocentrism or structural determinism.

Schulz, C. (2022). A new algorithmic imaginary. Media, Culture & Society.

Modified action demonstration—dubbed motionese—has been proposed as a way to help children recognize the structure and meaning of actions. However, until now, it has been investigated only in young infants. This brief research report presents findings from a cross-sectional study of parental action demonstrations to three groups of 8–11, 12–23, and 24–30-month-old children that applied seven motionese parameters; a second study investigated the youngest group of participants longitudinally to corroborate the cross-sectional results. Results of both studies suggested that four motionese parameters (Motion Pauses, Pace, Velocity, Acceleration) seem to structure the action by organizing it in motion pauses. Whereas these parameters persist over different ages, three other parameters (Demonstration Length, Roundness, and Range) occur predominantly in the younger group and seem to serve to organize infants' attention on the basis of movement. Results are discussed in terms of facilitative vs. pedagogical learning.

Rohlfing K. J., Vollmer A.-L., Fritsch J. and Wrede B. (2022). Which “motionese” parameters change with children's age? Disentangling attention-getting from action-structuring modifications. Front. Commun. 7:922405. doi: 10.3389/fcomm.2022.922405

One of the many purposes for which social robots are designed is education, and there have been many attempts to systematize their potential in this field. What these attempts have in common is the recognition that learning can be supported in a variety of ways because a learner can be engaged in different activities that foster learning. Up to now, three roles have been proposed when designing these activities for robots: as a teacher or tutor, a learning peer, or a novice. Current research proposes that deciding in favor of one role over another depends on the content or preferred pedagogical form. However, the design of activities changes not only the content of learning, but also the nature of a human–robot social relationship. This is particularly important in language acquisition, which has been recognized as a social endeavor. The following review aims to specify the differences in human–robot social relationships when children learn language through interacting with a social robot. After proposing categories for comparing these different relationships, we review established and more specific, innovative roles that a robot can play in language-learning scenarios. This follows Mead’s (1946) theoretical approach proposing that social roles are performed in interactive acts. These acts are crucial for learning, because not only can they shape the social environment of learning but also engage the learner to different degrees. We specify the degree of engagement by referring to Chi’s (2009) progression of learning activities that range from active, constructive, toward interactive with the latter fostering deeper learning. Taken together, this approach enables us to compare and evaluate different human–robot social relationships that arise when applying a robot in a particular social role.

Rohlfing K. J., Altvater-Mackensen N., Caruana N., van den Berghe R., Bruno B., Tolksdorf N. F. and Hanulíková A. (2022) Social/dialogical roles of social robots in supporting children’s learning of language and literacy—A review and analysis of innovative roles. Front. Robot. AI 9:971749. doi: 10.3389/frobt.2022.971749

As AI is more and more pervasive in everyday life, humans have an increasing demand to understand its behavior and decisions. Most research on explainable AI builds on the premise that there is one ideal explanation to be found. In fact, however, everyday explanations are co-constructed in a dialogue between the person explaining (the explainer) and the specific person being explained to (the explainee). In this paper, we introduce a first corpus of dialogical explanations to enable NLP research on how humans explain as well as on how AI can learn to imitate this process. The corpus consists of 65 transcribed English dialogues from the Wired video series 5 Levels, explaining 13 topics to five explainees of different proficiency. All 1550 dialogue turns have been manually labeled by five independent professionals for the topic discussed as well as for the dialogue act and the explanation move performed. We analyze linguistic patterns of explainers and explainees, and we explore differences across proficiency levels. BERT-based baseline results indicate that sequence information helps predict topics, acts, and moves effectively.

Wachsmuth, H., & Alshomary, M. (2022). "Mama Always Had a Way of Explaining Things So I Could Understand": A Dialogue Corpus for Learning to Construct Explanations. In Proceedings of the 29th International Conference on Computational Linguistics.

This paper presents preliminary work on the formalization of three prominent cognitive biases in the diagnostic reasoning process over epileptic seizures, psychogenic seizures and syncopes. Diagnostic reasoning is understood as iterative exploration of medical evidence. This exploration is represented as a partially observable Markov decision process where the state (i.e., the correct diagnosis) is uncertain. Observation likelihoods and belief updates are computed using a Bayesian network which defines the interrelation between medical risk factors, diagnoses and potential findings. The decision problem is solved via partially observable upper confidence bounds for trees in Monte-Carlo planning. We compute a biased diagnostic exploration policy by altering the generated state transition, observation and reward during look ahead simulations. The resulting diagnostic policies reproduce reasoning errors which have only been described informally in the medical literature. We plan to use this formal representation in the future to inversely detect and classify biased reasoning in actual diagnostic trajectories obtained from physicians.

Battefeld, D., & Kopp, S. (2022). Formalizing cognitive biases in medical diagnostic reasoning. Presented at the 8th Workshop on Formal and Cognitive Reasoning (FCR), Trier.
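The belief update at the heart of such a partially observable formulation is ordinary Bayesian conditioning on each new finding: the belief over candidate diagnoses is reweighted by the observation likelihoods produced by the Bayesian network and renormalized. A minimal sketch, where the diagnosis labels and likelihood values are illustrative only:

```python
def update_belief(belief, likelihoods):
    """One Bayesian belief update over candidate diagnoses.

    belief      -- prior probability for each diagnosis
    likelihoods -- P(observed finding | diagnosis) for each diagnosis
    Returns the normalized posterior belief.
    """
    posterior = {d: belief[d] * likelihoods[d] for d in belief}
    z = sum(posterior.values())  # normalizing constant
    return {d: p / z for d, p in posterior.items()}

# Illustrative example: a finding that is typical of epileptic seizures.
prior = {"epileptic": 1 / 3, "psychogenic": 1 / 3, "syncope": 1 / 3}
finding_likelihood = {"epileptic": 0.8, "psychogenic": 0.3, "syncope": 0.1}
posterior = update_belief(prior, finding_likelihood)
```

A biased policy, as studied in the paper, would distort the transitions, observations, or rewards used during look-ahead simulation rather than this update itself.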

Explainable Artificial Intelligence (XAI) has mainly focused on static learning tasks so far. In this paper, we consider XAI in the context of online learning in dynamic environments, such as learning from real-time data streams, where models are learned incrementally and continuously adapted over the course of time. More specifically, we motivate the problem of explaining model change, i.e. explaining the difference between models before and after adaptation, instead of the models themselves. In this regard, we provide the first efficient model-agnostic approach to dynamically detecting, quantifying, and explaining significant model changes. Our approach is based on an adaptation of the well-known Permutation Feature Importance (PFI) measure. It includes two hyperparameters that control the sensitivity and directly influence explanation frequency, so that a human user can adjust the method to individual requirements and application needs. We assess and validate our method’s efficacy on illustrative synthetic data streams with three popular model classes.

Muschalik, M., Fumagalli, F., Hammer, B., & Hüllermeier, E. (2022). Agnostic Explanation of Model Change based on Feature Importance. Künstl Intell. doi: 10.1007/s13218-022-00766-6
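The incremental estimation of permutation feature importance underlying such an approach can be sketched roughly as follows: each feature's recent values are kept in a sliding reservoir, every arriving sample is scored with and without a sampled replacement of each feature, and the resulting loss increase is folded into an exponential moving average. All class names and parameters below are illustrative assumptions, not the authors' implementation:

```python
import random
from collections import deque

class IncrementalPFI:
    """Sketch of an incrementally updated permutation-feature-importance estimate."""

    def __init__(self, n_features, alpha=0.01, reservoir_size=100):
        self.alpha = alpha                    # smoothing factor of the moving average
        self.importance = [0.0] * n_features  # running PFI estimate per feature
        self.reservoirs = [deque(maxlen=reservoir_size)
                           for _ in range(n_features)]

    def update(self, model, loss, x, y):
        """Fold one arriving sample (x, y) into the importance estimates."""
        base = loss(model(x), y)
        for i, res in enumerate(self.reservoirs):
            res.append(x[i])
            if len(res) > 1:
                x_perm = list(x)
                x_perm[i] = random.choice(res)  # marginalize feature i
                increase = loss(model(x_perm), y) - base
                self.importance[i] += self.alpha * (increase - self.importance[i])
        return self.importance
```

Because the estimate is an exponential moving average, it forgets old behavior and can track model change under concept drift; the paper's sensitivity hyperparameters would then govern when a change in these estimates is flagged as significant.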

Advances in the development of AI and its application in many areas of society have given rise to an ever-increasing need for society’s members to understand, at least to a certain degree, how these technologies work. Where users are concerned, most approaches in Explainable Artificial Intelligence (XAI) assume a rather narrow view of the social process of explaining and show an undifferentiated assessment of explainees’ understanding; explainees are mostly considered passive recipients of information. The actual knowledge, motives, needs and challenges of (lay) users in algorithmic environments remain mostly unexamined. We argue for the consideration of explanation as a social practice in which explainer and explainee jointly co-construct understanding. Therefore, we seek to enable lay users to document, evaluate, and reflect on distinct AI interactions and, correspondingly, on how explainable AI actually is in their daily lives. With this contribution we discuss our methodological approach, which enhances the documentary method by implementing ‘digital diaries’ via the mobile instant messaging app WhatsApp, the most used instant messaging service worldwide. Furthermore, from a theoretical stance, we examine the socio-cultural patterns of orientation that guide users’ interactions with AI and their imaginaries of the technologies, a sphere that is mostly obscured and hard to access for researchers. Finally, we complete our paper with empirical insights by referring to previous studies that point out the relevance of perspectives on explaining and understanding as a co-constructive social practice.

Finke, J., Horwath, I., Matzner, T., Schulz, C. (2022). (De)Coding Social Practice in the Field of XAI: Towards a Co-constructive Framework of Explanations and Understanding Between Lay Users and Algorithmic Systems. In: Degen, H., Ntoa, S. (eds) Artificial Intelligence in HCI. HCII 2022. Lecture Notes in Computer Science(), vol 13336. Springer, Cham. doi:

Intelligent agents interacting with humans through conversation (such as a robot, embodied conversational agent, or chatbot) need to receive feedback from the human to make sure that their communicative acts have the intended consequences. At the same time, the human interacting with the agent will also seek feedback, in order to ensure that her communicative acts have the intended consequences. In this review article, we give an overview of past and current research on how intelligent agents should be able both to give meaningful feedback to humans and to understand feedback given by users. The review covers feedback across different modalities (e.g., speech, head gestures, gaze, and facial expression), different forms of feedback (e.g., backchannels, clarification requests), and models for allowing the agent to assess the user's level of understanding and adapt its behavior accordingly. Finally, we analyse some shortcomings of current approaches to modeling feedback and identify important directions for future research.

Axelsson A., Buschmeier H. and Skantze G. (2022) Modeling Feedback in Interaction With Conversational Agents - A Review. Front. Comput. Sci. 4:744574. doi: 10.3389/fcomp.2022.744574

The recent surge of interest in explainability in artificial intelligence (XAI) is propelled by not only technological advancements in machine learning but also by regulatory initiatives to foster transparency in algorithmic decision making. In this article, we revise the current concept of explainability and identify three limitations: passive explainee, narrow view on the social process, and undifferentiated assessment of explainee’s understanding. In order to overcome these limitations, we present explanation as a social practice in which explainer and explainee co-construct understanding on the microlevel. We view the co-construction on a microlevel as embedded into a macrolevel, yielding expectations concerning, e.g., social roles or partner models: typically, the role of the explainer is to provide an explanation and to adapt it to the current level of explainee’s understanding; the explainee, in turn, is expected to provide cues that direct the explainer. Building on explanations being a social practice, we present a conceptual framework that aims to guide future research in XAI. The framework relies on the key concepts of monitoring and scaffolding to capture the development of interaction. We relate our conceptual framework and our new perspective on explaining to transparency and autonomy as objectives considered for XAI.

Rohlfing, K. J., Cimiano, P., Scharlau, I., Matzner, T., Buhl, H. M., Buschmeier, H., Esposito, E, Grimminger, A, Hammer, B., Häb-Umbach, R., Horwath, I., Hüllermeier, E., Kern, F., Kopp, S., Thommes, K., Ngonga Ngomo, A., Schulte, C., Wachsmuth, H., Wagner, P. & Wrede, B. (2021). Explanation as a social practice: Toward a conceptual framework for the social design of AI systems. IEEE Transactions on Cognitive and Developmental Systems 13(3), 717-728. doi: 10.1109/TCDS.2020.3044366