Projects / Programmes source: ARIS

Fusion of verbal and non-verbal signals for the next generation of intelligent communication interfaces – HUMANIPA

Research activity

Code Science Field Subfield
2.08.00  Engineering sciences and technologies  Telecommunications   

Code Science Field
T180  Technological sciences  Telecommunication engineering 

Code Science Field
2.02  Engineering and Technology  Electrical engineering, Electronic engineering, Information engineering 
Keywords
human-machine interaction, communication technologies, conversational interfaces, multimodal fusion, interactive alignment, social interaction
Evaluation (rules)
source: COBISS
Researchers (7)
no. Code Name and surname Research area Role Period No. of publications
1.  53072  Špela Antloga  Linguistics  Researcher  2020 - 2022  38 
2.  38013  Uroš Berglez  Telecommunications  Researcher  2019  51 
3.  06821  PhD Zdravko Kačič  Telecommunications  Head  2019 - 2022  705 
4.  51357  Simona Majhenič  Linguistics  Researcher  2019 - 2022  41 
5.  18876  PhD Matej Rojc  Telecommunications  Researcher  2019 - 2022  246 
6.  23838  PhD Darinka Verdonik  Linguistics  Researcher  2019 - 2022  198 
7.  34282  Danilo Zimšek  Telecommunications  Researcher  2019  40 
Organisations (1)
no. Code Research organisation City Registration number No. of publications
1.  0796  University of Maribor, Faculty of Electrical Engineering and Computer Science  Maribor  5089638003  27,415 
Abstract
The proposed research project addresses a novel approach to natural human-machine communication that tries to imitate conversation among people and is also capable of addressing the individual psychological and sociological signals of interactional episodes. The influence of human-like characteristics, such as the capability to hold a conversation and a representative embodiment, is one of the key factors in face-to-face interaction.
In HMI and user-centered design tasks, affective computing and social awareness represent key directions in how a machine should respond to various user-generated stimuli in order to also evoke an attitude and a response from the user. To deliver natural, conversation-like human-machine interaction, machine-operated systems must resolve the following issues: a) understanding and interpreting users’ inputs (embodied requests conveyed through verbal and nonverbal signals), e.g. recognizing the multimodal communicative intent, and b) modelling and relaying the targeted communicative intents as a truly viable natural response.
The proposed project defines a new communication paradigm called the understanding of the spoken language (CLU). The paradigm builds on the ‘socially aware interaction’ and ‘computers are social actors’ paradigms. The proposed paradigm and model extend the definition of natural language processing with embodied cognition (especially kinesics) and are based on the idea of a coherent fusion of verbal and nonverbal signals in natural discourse. Namely, in natural face-to-face interaction, the co-verbal signals conveyed together with the spoken content (or even in its absence) are essential for establishing discourse cohesion. One could say that the verbal/linguistic parts of spoken language (e.g. words, grammar, syntax) carry the symbolic/semantic interpretation of the message, while the co-verbal parts (e.g. gestures, expressions, prosody) carry the social component of each message and serve as an orchestrator of communication.
Overall, we will emphasize work on approaches originating from natural language processing, big-data analytics, statistical modelling, and machine learning. Thus, we will search for optimal inferences (possible fusion functions) by evaluating them through newly developed CLU algorithms. Based on exploratory and descriptive research implemented through analytics of contextually sound scenarios, we will define hypotheses and inferences regarding the fusion functions in conversational space. Through explanatory research, we will investigate multi-signal relationships and the intertwining of conversational concepts, and outline new conversational strategies underpinned by the required language and paralanguage resources. Thus, we will create conversational knowledge. We will develop new fusion functions as AI algorithms, aligned with the stated hypotheses, and test them through quantitative and qualitative approaches. Thus, we will transform the new conversational strategies into interactive strategies and models and evaluate the CLU paradigm and model. Social and empathic interaction, based on this new paradigm, will enable the development of much more socially aware and acceptable machine-operated systems and user interfaces in next-generation communication systems.
Significance for science
The proposed research will promote and enable original ‘conversation-like’ information exchange, capable of utilizing belief-based dialog managers that incorporate more ‘human-like’ senses and conversational phenomena to achieve the targeted goal in a more natural way. Consequently, these systems could also help tackle social issues such as active ageing, long-term care, and inclusion. As a significant advance in CIs and conversational intelligence research worldwide, we propose to analyse, discover and understand communicative intent as the basis of any interactive action. This involves intertwining all conversational signals as part of the machine’s cognition. Multiple technologies, including NLP, NLU and ELP, will be fused into a new concept called embodied language understanding (CLU). The expected benefit is the delivery of new social and empathic belief-based interaction models capable of adapting to the user’s context and of facilitating the context of the conversational situation not only via speech but also through visual interpretation of the collocutor’s responses and social cues. Such socially enabled systems and agents have great potential in various HCI concepts, from tour guides, tutors and instructors to companions and helpers. Moreover, these agents can adapt to the user’s needs and therefore successfully combat technological ignorance and significantly improve the use and exploitation of ICT in ambient assisted living (AAL) and other environments, such as robotics, automotive assistance, virtual assistance, healthcare, and marketing. Overall, social and empathic interaction, as promoted by the ELU, will enable the development of more socially aware and socially acceptable systems. Thus, the proposed results are in line with S4 (smart city and smart home), as well as EU strategies targeting social inclusion, active ageing, and ambient assisted living.
With the fusion of verbal and non-verbal signals in interaction models and machine-learned algorithms, and with the utilization of deep learning, current black-box approaches will become obsolete. By means of the outlined and analytically confirmed theoretical inferences, one can devise one’s own deep networks to determine the interplay between conversational signals and give the machine the capability to understand and interpret the user’s requests. Technologically, the results envisaged in the proposed project will have a significant impact in all scientific fields where multimodal human behaviour plays a central role, for instance context-initiated telecommunication services, human-machine interaction and conversational interfaces, embodied conversational agents, natural language processing and affective computing, and opinion mining.
Significance for the country
The proposed research will promote and enable original ‘conversation-like’ information exchange, capable of utilizing belief-based dialog managers that incorporate more ‘human-like’ senses and conversational phenomena to achieve the targeted goal in a more natural way. Consequently, these systems could also help tackle social issues such as active ageing, long-term care, and inclusion. As a significant advance in CIs and conversational intelligence research worldwide, we propose to analyse, discover and understand communicative intent as the basis of any interactive action. This involves intertwining all conversational signals as part of the machine’s cognition. Multiple technologies, including NLP, NLU and ELP, will be fused into a new concept called embodied language understanding (CLU). The expected benefit is the delivery of new social and empathic belief-based interaction models capable of adapting to the user’s context and of facilitating the context of the conversational situation not only via speech but also through visual interpretation of the collocutor’s responses and social cues. Such socially enabled systems and agents have great potential in various HCI concepts, from tour guides, tutors and instructors to companions and helpers. Moreover, these agents can adapt to the user’s needs and therefore successfully combat technological ignorance and significantly improve the use and exploitation of ICT in ambient assisted living (AAL) and other environments, such as robotics, automotive assistance, virtual assistance, healthcare, and marketing. Overall, social and empathic interaction, as promoted by the ELU, will enable the development of more socially aware and socially acceptable systems. Thus, the proposed results are in line with S4 (smart city and smart home), as well as EU strategies targeting social inclusion, active ageing, and ambient assisted living.
With the fusion of verbal and non-verbal signals in interaction models and machine-learned algorithms, and with the utilization of deep learning, current black-box approaches will become obsolete. By means of the outlined and analytically confirmed theoretical inferences, one can devise one’s own deep networks to determine the interplay between conversational signals and give the machine the capability to understand and interpret the user’s requests. Technologically, the results envisaged in the proposed project will have a significant impact in all scientific fields where multimodal human behaviour plays a central role, for instance context-initiated telecommunication services, human-machine interaction and conversational interfaces, embodied conversational agents, natural language processing and affective computing, and opinion mining.