Abstract

Human-robot interaction is an integral part of social robotics. Any social robot needs to be able to interact with humans and other physical agents while taking into account the social cues and behaviors involved. In this case study we introduce Nadine, a robot able to express human-like emotions, personality, behaviors and dialogue, and capable of perceiving user and environmental cues and responding to them in a natural, realistic manner. We present Nadine's social robot characteristics, such as speech recognition and synthesis, gaze, face and object recognition, her affective system and dialogue interaction capabilities, as well as her presence in several museums.

Keywords

Social Robotics, Human-Computer Interaction, Human-Robot Interaction

A social robot should be designed to interact with humans and possibly also with other robots. Social robots should be autonomous systems with local AI that allows them to act independently in response to cues from people and objects in their environment. Such a robot's intelligence is typically based on a cognitive computing model that simulates human thought processes.

The most important features that influence the perception of a virtual character or a robot are its human likeness, the naturalness of its movements and the emotions it expresses [1]. To avoid a possible feeling of eeriness, the visual and auditory components should be consistent with these features. These characteristics also depend on the purpose the robot serves, whether that is to reduce human labor or menial work, to automate tasks, or simply to behave in a socially appropriate way.

Nadine is one of the most realistic female humanoid social robots in the world. She looks and acts remarkably lifelike, having been modelled on Prof. Nadia Magnenat Thalmann. The robot has a realistic human appearance, natural-looking skin and hair, and very realistic hands. Nadine is a socially intelligent robot who is friendly, greets you back, makes eye contact and remembers the conversations you have had with her. She can answer questions in several languages and show emotions both in her gestures and in her face, depending on the content of the interaction with the user. Nadine can recognize people she has previously met and engage in flowing conversation. She is also fitted with a personality, meaning her mood can sour depending on what you say to her. Nadine has a total of 27 degrees of freedom (DoF) for facial expressions and upper-body movements. She can recognize anybody she has met and remembers facts and events related to each person. Nadine is also an ideal companion when nobody else is there. She can assist people with special needs, read stories, show images, hold Skype sessions, send emails and communicate with the family. She is part of the assistive technology that is badly needed in societies that cannot afford a full-time social worker for each person with special needs, and she can play the role of a personal, private coach who is always available.

Fig.1 Professor Nadia Magnenat-Thalmann and Nadine

Social Robotics Architecture

Depending on its design and capabilities, a social robot can find application in healthcare, banking, teaching and other fields. Regardless of the field, the basic requirements of any social robot include interaction with humans, environmental awareness and a dynamic understanding of social cues. The robot also needs to maintain a natural, human-like tone and flow in any conversation with any user. Designing a social robot platform is therefore a complex task that must take into account several factors, such as maintaining the naturalness of conversation, generalizing to any field and supporting multiple languages.

To fulfill these requirements, Nadine's architecture consists of three layers, namely, perception, processing, and interaction.

Fig.2 Proposed social robot architecture of Nadine with perception-processing-interaction layers


The first layer is the perception layer, which serves to perceive the user's intentions and to understand the environment. For this purpose, a robot can use various sensors, cameras and microphones. In Nadine, a Microsoft Kinect, a webcam and a microphone provide the raw vision and audio inputs.

The second layer is the processing layer, the core module of Nadine, which receives all results from the perception layer about the environment and the user and acts upon them. This layer includes various sub-modules such as dialogue processing (chatbot), the affective system (emotions, personality, mood) and Nadine's memory of previous encounters with users. Each perception-layer input is associated with an activation level in this layer.

The third layer is the interaction layer, which receives the appropriate responses for the current interaction and carries them out so that they become visible in Nadine's face or gestures; it can also be called the Nadine controller. The interaction layer is hardware dependent, since every response has to be shown or spoken by Nadine, but it can easily be adapted to other robots and virtual humans.
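
As a rough illustration of this layering, the following Python sketch shows how percepts with activation levels could flow from a perception layer through a processing layer to a hardware-dependent interaction layer. All class, function and stimulus names are illustrative and do not correspond to Nadine's actual code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Percept:
    """A single stimulus produced by the perception layer."""
    source: str        # e.g. "face", "speech", "object"
    payload: str       # recognized identity, transcribed text, object label
    activation: float  # how strongly this stimulus should drive processing


def perception_layer() -> List[Percept]:
    """Stand-in for the Kinect / webcam / microphone pipeline."""
    return [
        Percept("face", "known_user_42", activation=0.9),
        Percept("speech", "hello nadine", activation=0.8),
    ]


def processing_layer(percepts: List[Percept]) -> dict:
    """Turns the strongest stimuli into a verbal and emotional response."""
    percepts = sorted(percepts, key=lambda p: p.activation, reverse=True)
    user = next((p.payload for p in percepts if p.source == "face"), "stranger")
    return {"speech": f"Nice to see you again, {user}!", "emotion": "happy"}


def interaction_layer(response: dict) -> None:
    """Hardware-dependent layer: would drive TTS, lip sync and gestures."""
    print(f"[{response['emotion']}] {response['speech']}")


if __name__ == "__main__":
    interaction_layer(processing_layer(perception_layer()))
```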


Nadine Social Robot Characteristics

It is essential that Nadine exhibits the important characteristics of a social robot, and the proposed architecture is generic and easy to customize when required. A social robot's main task, as mentioned before, is to interact and communicate with humans following accepted social behaviors and rules. Its operation also depends on the place of deployment, as the robot has to be familiarized with its role there.

  1. Face recognition

In Nadine's architecture, face recognition is one of the perception layer's sub-modules. Nadine uses the identity of the person to customize her responses and behavior, and the affective module can also use this information to change her emotions or mood if necessary.
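
A minimal sketch of such a face recognition sub-module is shown below, written with the open-source face_recognition package rather than Nadine's actual implementation; the file names and user identities are illustrative.

```python
import face_recognition  # open-source library, not necessarily what Nadine uses

# Enrolment: compute one embedding per known user (image paths are illustrative).
known = {
    "alice": face_recognition.face_encodings(
        face_recognition.load_image_file("alice.jpg"))[0],
}

def identify(frame_path: str) -> str:
    """Return the name of the first known user found in the frame, if any."""
    frame = face_recognition.load_image_file(frame_path)
    for encoding in face_recognition.face_encodings(frame):
        for name, reference in known.items():
            if face_recognition.compare_faces([reference], encoding)[0]:
                return name
    return "unknown"

# The resulting identity becomes a stimulus for the processing layer,
# which can personalise the dialogue and adjust mood accordingly.
print(identify("webcam_frame.jpg"))
```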

  2. Speech Recognition and Synthesis

For speech recognition, Nadine employs the Google Cloud Speech-to-Text service [3], which supports up to 120 languages. Once a verbal response has been decided for a given situation, there are two main tasks: synthesizing speech from the provided response with the appropriate tone of conversation, and performing lip synchronization by moving Nadine's mouth.
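
The recognition step could look roughly like the following sketch, which uses the public google-cloud-speech client library; the exact integration in Nadine's platform, as well as the downstream speech synthesis and lip synchronization, is not shown.

```python
from google.cloud import speech  # pip install google-cloud-speech


def transcribe(wav_bytes: bytes, language: str = "en-US") -> str:
    """Send a short utterance to Google Cloud Speech-to-Text and return the text."""
    client = speech.SpeechClient()
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code=language,  # one of the many supported language codes
    )
    audio = speech.RecognitionAudio(content=wav_bytes)
    response = client.recognize(config=config, audio=audio)
    return " ".join(r.alternatives[0].transcript for r in response.results)
```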

  3. Action/Gesture Recognition

Based on the user's actions and the current context, Nadine has to modify her behaviors and reactions (verbal and non-verbal) to enhance human-robot interaction. In this design, action recognition is a perception-layer sub-module that serves as a stimulus for the processing layer to change Nadine's responses and behaviors.
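
As a toy illustration (not the classifier actually used by Nadine), a gesture label can be derived from skeleton joints such as those provided by the Kinect and passed on as a stimulus; joint names and thresholds here are invented for the example.

```python
from typing import Dict, Tuple

Joint = Tuple[float, float, float]  # x, y, z in metres, e.g. from a Kinect skeleton


def recognise_action(joints: Dict[str, Joint]) -> str:
    """Very small rule-based stand-in for an action recognition sub-module."""
    head_y = joints["head"][1]
    right_hand_y = joints["hand_right"][1]
    if right_hand_y > head_y:  # hand raised above the head
        return "wave"
    return "idle"


# Example: the detected label becomes a stimulus for the processing layer.
skeleton = {"head": (0.0, 1.6, 2.0), "hand_right": (0.3, 1.8, 1.9)}
print(recognise_action(skeleton))  # -> "wave"
```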

  4. Object Recognition

Nadine's design treats object recognition as a perception-layer sub-module, making each recognized object a stimulus to which Nadine can react based on the user conversation, the context and the task at hand. Using the detected objects as stimuli, Nadine's platform allows her to react to them, for example by providing information about the objects detected around her.
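
A small sketch of how detections from any off-the-shelf object detector could be turned into conversational stimuli; the labels and confidence values below are illustrative.

```python
from typing import List, Tuple

Detection = Tuple[str, float]  # (label, confidence) from any off-the-shelf detector


def objects_to_stimuli(detections: List[Detection], threshold: float = 0.5) -> List[str]:
    """Keep confident detections and phrase them as things the robot can talk about."""
    return [f"I can see a {label} near you."
            for label, score in detections if score >= threshold]


# Example output of a detector running on the webcam stream (values illustrative).
print(objects_to_stimuli([("book", 0.91), ("cup", 0.42)]))
# -> ['I can see a book near you.']
```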

  5. Memory

Nadine uses face recognition to identify people, and the Nadine Episodic Memory [4], which is part of her processing layer, enables her to remember past discourses and to retrieve appropriate past sentences in current conversations.
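
The retrieval idea can be illustrated with a toy stand-in for the episodic memory; this is not the MCAEM model of [4], just a word-overlap sketch with invented utterances.

```python
from collections import defaultdict
from typing import Optional


class EpisodicMemory:
    """Toy episodic memory: stores past utterances per user and retrieves
    the one that overlaps most with the current sentence."""

    def __init__(self):
        self.episodes = defaultdict(list)  # user id -> list of past utterances

    def store(self, user: str, utterance: str) -> None:
        self.episodes[user].append(utterance)

    def recall(self, user: str, current: str) -> Optional[str]:
        past = self.episodes.get(user, [])
        if not past:
            return None
        words = set(current.lower().split())
        return max(past, key=lambda s: len(words & set(s.lower().split())))


memory = EpisodicMemory()
memory.store("alice", "i really enjoyed the museum exhibition last week")
memory.store("alice", "my cat is called felix")
print(memory.recall("alice", "shall we talk about the museum again?"))
```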

  6. Gaze behaviour

In Nadine, a real-time neck-eye motion generation scheme based on [5], which provides an analytical solution to this problem, is used to control eye gaze. Nadine actively divides her attention among all the users in her proximity, using social features such as social distance, whether a specific user within a group is known, who is speaking, and whether a specific user is displaying a recognized movement.
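
One way to picture this attention sharing is a simple scoring of the social features listed above; the weights below are illustrative and are not the ones used in [5] or in Nadine.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class User:
    name: str
    distance_m: float  # social distance to the robot
    known: bool        # previously met
    speaking: bool
    gesturing: bool    # displaying a recognised movement


def attention_score(u: User) -> float:
    """Illustrative weighting of the social cues mentioned above."""
    score = max(0.0, 1.0 - u.distance_m / 4.0)  # closer users attract more attention
    score += 0.5 if u.speaking else 0.0
    score += 0.3 if u.gesturing else 0.0
    score += 0.2 if u.known else 0.0
    return score


def gaze_target(users: List[User]) -> User:
    return max(users, key=attention_score)


group = [User("A", 2.0, True, False, False), User("B", 2.5, False, True, False)]
print(gaze_target(group).name)  # the speaking user wins here
```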

  7. Affective system – Emotion engine

The affective system is part of the processing layer of Nadine's architecture and controls her emotions, personality and mood during interaction. This emotion engine is responsible for simulating the affect dynamics of social robots.
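
A toy sketch of such affect dynamics reduces emotion to a single valence value with a personality baseline and decay; Nadine's actual emotion engine is richer than this, so the code below is only an assumption-laden illustration.

```python
class AffectiveState:
    """Toy emotion engine: one valence value pulled toward a personality
    baseline and pushed around by positive / negative stimuli."""

    def __init__(self, baseline: float = 0.2, decay: float = 0.9):
        self.baseline = baseline  # friendly personality -> slightly positive baseline
        self.decay = decay        # how quickly mood returns to the baseline
        self.valence = baseline   # current mood in [-1, 1]

    def update(self, stimulus_valence: float) -> None:
        self.valence = self.decay * self.valence + (1 - self.decay) * self.baseline
        self.valence = max(-1.0, min(1.0, self.valence + stimulus_valence))

    def current_emotion(self) -> str:
        if self.valence > 0.3:
            return "happy"
        if self.valence < -0.3:
            return "upset"
        return "neutral"


affect = AffectiveState()
affect.update(-0.6)             # e.g. a rude remark from the user
print(affect.current_emotion())  # -> "upset"
```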

  8. Data processing – Dialogue Manager

Nadine's dialogue manager focuses on generating appropriate responses to any user speech input.
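
A minimal sketch of intent-based response selection is given below; the canned intents and responses are invented for illustration, and Nadine's dialogue manager also draws on memory, affect and deployment-specific knowledge.

```python
import random

RESPONSES = {
    "greeting": ["Hello! How can I help you today?", "Hi there, nice to see you!"],
    "farewell": ["Goodbye, see you soon!"],
    "fallback": ["Could you tell me a bit more about that?"],
}


def classify_intent(utterance: str) -> str:
    text = utterance.lower()
    if any(word in text for word in ("hello", "hi", "hey")):
        return "greeting"
    if any(word in text for word in ("bye", "goodbye")):
        return "farewell"
    return "fallback"


def respond(utterance: str) -> str:
    """Pick a canned response for the detected intent."""
    return random.choice(RESPONSES[classify_intent(utterance)])


print(respond("Hello Nadine"))
```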

Fig.3 Nadine in NTU IMI

Nadine Social Robot Actions

In the beginning, Nadine worked as a receptionist at Nanyang Technological University (NTU), and her circle of friends was thus limited to students, staff and visitors at NTU's Institute of Media Innovation.

Nadine then made her first public appearance, as a key highlight, at the ArtScience Museum in Singapore during the exhibition "HUMAN+: The Future of our Species", held from May to October 2017 [2]. There she interacted with more than 100,000 visitors.

The main purpose of the exhibition HUMAN+: The Future of our Species was to explore the possible future paths of our species: to delve into what it means to be human in a world of artificial intelligence, lifelike robots and genetic modification. It probed the social, ethical and environmental questions raised by using technology to modify ourselves. Will virtual reality be the new reality? What would happen if a robot knew what we wanted before we knew it ourselves? How might we modify ourselves to adapt to an environment that we are drastically transforming? Is longevity a noble aspiration or a terrible threat to the planet?

Presently, Nadine works as a customer service agent at AIA Singapore [6]. She has been trained to handle the questions that are usually asked of AIA customer service agents, and she also encourages AIA customers to sign up with the AIA e-care registration portal. Recorded customer service interactions were used to train a machine-learning-based conversational dialogue engine, and a client-server architecture was set up between our platform and the AIA portal to allow fast and secure communication.
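
The general idea of such a retrieval-based engine can be sketched with TF-IDF similarity over recorded question/answer pairs; the pairs below are invented, and this is not AIA's or Nadine's actual engine.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Illustrative question/answer pairs standing in for recorded service interactions.
qa_pairs = [
    ("how do I register for the e-care portal", "You can sign up on the e-care portal."),
    ("how do I check my policy status", "Let me look up your policy for you."),
]

questions = [q for q, _ in qa_pairs]
vectorizer = TfidfVectorizer().fit(questions)
question_matrix = vectorizer.transform(questions)


def answer(query: str) -> str:
    """Return the answer whose recorded question is most similar to the query."""
    sims = cosine_similarity(vectorizer.transform([query]), question_matrix)[0]
    return qa_pairs[sims.argmax()][1]


print(answer("where can I register for e-care?"))
```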

Conclusions

It is important for Nadine to possess human-like social interaction skills: to be able, for example, to read documents in hard copy, send emails, and make phone or Skype calls. Nadine has all of these capabilities in addition to the basic ones, and this makes her more socially acceptable. This social acceptance makes her capable of serving in several positions and purposes in different fields, such as medicine, education and cultural heritage.

References

  1. de Borst, A., de Gelder, B. (2015). Is it the real deal? Perception of virtual characters versus humans: An affective cognitive neuroscience perspective. Frontiers in Psychology, 6:576. doi:10.3389/fpsyg.2015.00576
  2. Martin, Mayo (2017-05-19). "Singapore's receptionist robot makes her public debut at ArtScience Museum's futuristic show". Channel NewsAsia. Retrieved 2017-05-19.
  3. Google Cloud Platform, https://cloud.google.com/, 2018.
  4. Zhang, J., Zheng, J., Magnenat Thalmann, N. (2018). MCAEM: mixed-correlation analysis-based episodic memory for companion–user interactions. The Visual Computer, 34, 1129–1141.
  5. Zhang, Z., Beck, A., Magnenat-Thalmann, N. (2015). Human-like behavior generation based on head-arms model for robot tracking external targets and body parts. IEEE Transactions on Cybernetics, 45, 1390–1400.
  6. Benjamin, Ang (2018-10-22). "Singapore: AIA transforms customer service with insurance industry's first humanoids". Asia Insurance Review. Retrieved 2018-10-26.


Authors

Evangelia Baka, Manoj Ramanathan, Nidhi Mishra, Nadia Magnenat Thalmann

University of Geneva – MIRALab

Nanyang Technological University - IMI

Thematic Area 6