About the project
Objective
This project aims to develop adaptive social robots that can understand humans’ communicative behaviour and task-related physical actions and adapt their interaction accordingly. We aim to investigate and demonstrate fluid and seamless adaptation of intelligent systems to users’ contexts, needs or preferences. To achieve fluidity, such adaptation needs to happen with minimal interruption to the users’ ongoing interaction with the system and without requiring user intervention, while providing accountability and control of the adaptation in a task-appropriate, timely, and understandable manner. This will be explored in multiple embodiments: smart speakers, back-projected robotic heads, and dual-arm robots.
Our use case scenario is an adaptive intelligent kitchen assistant that helps humans prepare food and carry out other kitchen-centric tasks, with a focus on supporting ageing in place. Our systems will engage in face-to-face spoken and physical collaboration with humans, track the users’ affective states and task-related actions in real time, adjust performance based on previous interactions, adapt to user preferences, and show intention using a self-regulating perception-production loop. The project will use the Intelligence Augmentation Lab that TMH and RPL plan to set up.
Background
Intelligent systems built around big datasets and machine learning techniques are becoming ubiquitous in people’s lives – smart appliances, wearables, and, increasingly, robots. As these systems are intended to assist an ever wider range of users in their homes, workplaces or public spaces, a typical one-size-fits-all approach becomes insufficient. Instead, these systems will need to take advantage of the machine learning techniques upon which they are built to adapt continually to the specific task, user constellation, and shared environment in which they are operating. In long-term deployments, the state of the environment, user preferences, skills, and abilities change, and the system must adapt to those changes. This is relevant for socially assistive robots in people’s homes, education or healthcare settings, and robots working alongside workers in small-scale manufacturing environments.
Cross-disciplinary collaboration
The research team represents the School of Electrical Engineering and Computer Science (EECS, KTH), the School of Engineering Sciences in Chemistry, Biotechnology and Health (CBH, KTH) and the Department of Computer and Systems Sciences (DSV) at Stockholm University.
Watch the recorded presentation at the Digitalize in Stockholm 2023 event.
Articles
Avancerade adaptiva intelligenta system (Advanced adaptive intelligent systems)
Activities & Results
Find out what’s going on!
Results
The main results achieved in the first half of the project towards the three main objectives enumerated in the proposal:
- Understanding and Design for the Smart Kitchen
A deeper understanding of the cooking process (Kuoppamäki et al., 2021) led to a number of studies on interaction in and around cooking, including list advancement (Jaber & McMillan, 2022), content navigation (Zhao et al., 2022), designing conversational interaction (Kuoppamäki et al., 2023), the impact of contextual awareness on command construction and delivery (Jaber et al., in submission), and an ongoing study comparing proactive organisational support between young adults and adults over 65 (Kuoppamäki et al.).
- Perception and Representation of Long Term Human Action
We have shown that thermal imaging is a relevant modality for detecting frustration in human-robot interaction (Mohamed et al., 2022). Models predicting frustration were trained on a dataset of 18 participants interacting with a Nao social robot in our lab and tested using features from several modalities: thermal, RGB, Electrodermal Activity (EDA), and all three combined. The models reached an accuracy of 89% with just RGB features, 87% using only thermal features, 84% using EDA, and 86% when using all modalities (an illustrative sketch of this kind of per-modality classification follows this list). We are also investigating the accuracy of frustration prediction models using data collected at the KTH Library, where students ask a Furhat robot for directions.
- Input, Output, and Interaction for Smart Assistive Technologies
To fulfil the goal of adapting virtual agent and robot behaviour to different contexts, we have researched interlocutor-aware facial expressions in dyadic interaction (Jonell et al., 2020) as well as controllable generation of speech and gesture: we have developed techniques to generate conversational speech with control over speaking style, e.g. to signal certainty or uncertainty (Wang et al., 2022; Kirkland et al., 2022), as well as models able to generate coherent speech and gesture from a common representation (Wang et al., 2021). Another direction concerns adaptation to spatial contexts and environments, where we have used imitation learning and physical simulation to produce referential pointing gestures (Deichler et al., 2022).
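To make the per-modality frustration evaluation described above concrete, here is a minimal sketch of how features extracted from thermal, RGB, and EDA streams could each be fed to a classifier and then combined through simple feature-level fusion. The feature dimensions, window counts, random-forest model, and synthetic data are illustrative assumptions only; the actual features, models, and evaluation protocol in Mohamed et al. (2022) may differ.

```python
# Minimal sketch of per-modality frustration classification.
# All data below is synthetic; feature sets, model choice, and evaluation
# protocol are illustrative assumptions, not the project's actual pipeline.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_windows = 360  # e.g. time windows pooled over 18 participants (assumed)

# Hypothetical per-window feature vectors for each modality.
features = {
    "thermal": rng.normal(size=(n_windows, 8)),   # e.g. facial-region temperatures
    "rgb":     rng.normal(size=(n_windows, 16)),  # e.g. facial action unit activations
    "eda":     rng.normal(size=(n_windows, 4)),   # e.g. skin-conductance statistics
}
labels = rng.integers(0, 2, size=n_windows)       # frustrated vs. not frustrated

def evaluate(X, y):
    """Return mean cross-validated accuracy for one feature set."""
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    return cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean()

# Single-modality models.
for name, X in features.items():
    print(f"{name:8s} accuracy: {evaluate(X, labels):.2f}")

# Simple feature-level fusion: concatenate all modalities.
X_all = np.hstack(list(features.values()))
print(f"combined accuracy: {evaluate(X_all, labels):.2f}")
```

Feature-level concatenation is only one possible fusion strategy; decision-level fusion, where per-modality predictions are averaged or voted on, is a common alternative when modalities differ greatly in sampling rate or reliability.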
Publications
We like to inspire and share interesting knowledge!
- Parag Khanna, Mårten Björkman, Christian Smith. “Human Inspired Grip-Release Technique for Robot-Human Handovers”, 2022 IEEE-RAS International Conference on Humanoid Robots, Ginowan, Japan. (Accepted for publication)
- Sanna Kuoppamäki, Mikaela Hellstrand, and Donald McMillan (2023, forthcoming). Designing conversational scenarios with older adults’ digital repertoires: Graphic transcript as a design method. In: Hänninen, R., Taipale, S., Haapio-Kirk, L. (eds). Embedded and everyday technology: Digital repertoires in an ageing society. UCL Press.
- Youssef Mohamed, Giulia Ballardini, Maria Teresa Parreira, Séverin Lemaignan, and Iolanda Leite. 2022. Automatic Frustration Detection Using Thermal Imaging. In Proceedings of the 2022 ACM/IEEE International Conference on Human-Robot Interaction (HRI ’22). IEEE Press, 451–460
- Razan Jaber and Donald McMillan. 2022. Cross-Modal Repair: Gaze and Speech Interaction for List Advancement. In Proceedings of the 4th Conference on Conversational User Interfaces (CUI ’22). Association for Computing Machinery, New York, NY, USA, Article 25, 1–11.
- Yaxi Zhao, Razan Jaber, Donald McMillan, and Cosmin Munteanu. 2022. “Rewind to the Jiggling Meat Part”: Understanding Voice Control of Instructional Videos in Everyday Tasks. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (CHI ’22). Association for Computing Machinery, New York, NY, USA, Article 58, 1–11.
- Kirkland, A., Lameris, H., Gustafson, J., Székely, É. (2022) “Where’s the uh, hesitation? The interplay between filled pause location, speech rate and fundamental frequency”, Interspeech 2022, Incheon, Korea.
- Wang, S., Gustafson, J., Székely, É. (2022) “Evaluating Sampling-based Filler Insertion with Spontaneous TTS”, 13th Edition of the Language Resources and Evaluation Conference (LREC 2022), Marseille.
- Deichler, A., Wang, S., Alexanderson, S., & Beskow, J. (2022). Towards Context-Aware Human-like Pointing Gestures with RL Motion Imitation. In Context-Awareness in Human-Robot Interaction: Approaches and Challenges, workshop at 2022 ACM/IEEE International Conference on Human-Robot Interaction.
- Donald McMillan and Razan Jaber. 2021. Leaving the Butler Behind: The Future of Role Reproduction in CUI. In Proceedings of the 3rd Conference on Conversational User Interfaces (CUI ’21). Association for Computing Machinery, New York, NY, USA, Article 11, 1–4.
- Sanna Kuoppamäki, Sylvaine Tuncer, Sara Eriksson, and Donald McMillan. 2021. Designing Kitchen Technologies for Ageing in Place: A Video Study of Older Adults’ Cooking at Home. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 5, 2, Article 69 (June 2021), 19 pages.
- Katie Winkle, Gaspar Isaac Melsión, Donald McMillan, and Iolanda Leite. 2021. Boosting Robot Credibility and Challenging Gender Norms in Responding to Abusive Behaviour: A Case for Feminist Robots. In Companion of the 2021 ACM/IEEE International Conference on Human-Robot Interaction (HRI ’21 Companion). Association for Computing Machinery, New York, NY, USA, 29–37.
- Wang, S., Alexanderson, S., Gustafson, J., Beskow, J., Henter, G. and Székely, É. (2021) “Integrated Speech and Gesture Synthesis”, 23rd ACM International Conference on Multimodal Interaction (ICMI 2021), Montreal.
- Jonell, P., Kucherenko, T., Henter, G. E., & Beskow, J. (2020, October). Let’s Face It: Probabilistic Multi-modal Interlocutor-aware Generation of Facial Gestures in Dyadic Settings. In Proceedings of the 20th ACM International Conference on Intelligent Virtual Agents (pp. 1-8).