Projects / Programmes source: ARIS

Learning, analysis, and detection of motion in the framework of a hierarchical compositional visual architecture

Research activity

Code Science Field Subfield
2.07.07  Engineering sciences and technologies  Computer science and informatics  Intelligent systems - software 

Code Science Field
P176  Natural sciences and mathematics  Artificial intelligence 

Code Science Field
1.02  Natural Sciences  Computer and information sciences 
Keywords
Computer vision; modeling visual categories of motion, actions, activities; learning visual categories of motion; visual categorization of motion, actions, activities; hierarchical motion modeling; video databases; interactive user interfaces.
Evaluation (rules)
source: COBISS
Researchers (17)
no. Code Name and surname Research area Role Period No. of publications
1.  34707  Robert Bevec  Manufacturing technologies and systems  Technical associate  2012 - 2014  21 
2.  19284  PhD Marko Boben  Computer intensive methods and applications  Researcher  2013  84 
3.  29381  PhD Luka Čehovin Zajc  Computer science and informatics  Researcher  2014  125 
4.  26454  PhD Matjaž Depolli  Computer science and informatics  Researcher  2011 - 2014  100 
5.  32148  PhD Denis Forte  Computer intensive methods and applications  Researcher  2011 - 2014  15 
6.  06856  PhD Stanislav Kovačič  Systems and cybernetics  Researcher  2011 - 2014  390 
7.  30155  PhD Matej Kristan  Computer science and informatics  Researcher  2011 - 2014  330 
8.  05896  PhD Aleš Leonardis  Computer science and informatics  Head  2011 - 2014  455 
9.  35073  Matjaž Majnik  Computer science and informatics  Researcher  2013  10 
10.  33172  PhD Rok Mandeljc  Systems and cybernetics  Junior researcher  2011 - 2014  56 
11.  21310  PhD Janez Perš  Systems and cybernetics  Researcher  2011 - 2014  244 
12.  32441  PhD Aleksandra Rashkovska Koceva  Computer science and informatics  Researcher  2011 - 2014  82 
13.  18198  PhD Danijel Skočaj  Computer science and informatics  Researcher  2011 - 2014  317 
14.  29551  PhD Vildana Sulić Kenk  Systems and cybernetics  Researcher  2011 - 2014  33 
15.  34398  PhD Domen Tabernik  Computer science and informatics  Researcher  2012 - 2014  51 
16.  06875  PhD Roman Trobec  Computer science and informatics  Researcher  2011 - 2014  469 
17.  11772  PhD Aleš Ude  Manufacturing technologies and systems  Researcher  2011 - 2014  475 
Organisations (3)
no. Code Research organisation City Registration number No. of publications
1.  0106  Jožef Stefan Institute  Ljubljana  5051606000  91,908 
2.  1538  University of Ljubljana, Faculty of Electrical Engineering  Ljubljana  1626965  28,002 
3.  1539  University of Ljubljana, Faculty of Computer and Information Science  Ljubljana  1627023  16,688 
Abstract
Perception of motion plays a central role in biological visual systems. Sophisticated mechanisms for observing, extracting, and utilizing motion exist even in primitive animals. For humans, successful motion processing is a prerequisite for accomplishing many everyday tasks. For example, action and activity recognition and categorisation are crucial for awareness of one’s environment and for interaction with one’s surroundings. Current state-of-the-art computer vision methods work well for problems within limited domains and for specific tasks, and activity recognition and categorisation are no exception. However, when applied in more general settings, such methods turn out to be frail, less efficient, or even computationally intractable. In a nutshell, the classic approaches are neither general nor scalable. Consequently, new paradigms that would alleviate these shortcomings are constantly sought. Scientific advances in recent years, especially in neuroscience, have provided inspiration and insights that have given rise to novel approaches in computer vision. Far from duplicating the functionality of the human brain, these approaches aim to improve the performance of computer vision methods by utilizing a selection of biologically inspired design principles. One such principle is hierarchical compositionality, which has already been exploited in the design of state-of-the-art object categorization methods, with significant contributions from the proposers of this project. Compared to other state-of-the-art approaches, approaches based on hierarchical compositionality make much more efficient use of existing resources. This is achieved by sharing both the representation units and the computations, and by transferring knowledge, which makes the learning process much more efficient.
Recently, hierarchical approaches to the analysis of motion have begun to emerge. Nevertheless, the analysis of motion using hierarchical compositional models is still in its infancy.
Aims: The proposed project aims at a holistic approach to learning, detection, and recognition/categorisation of visual motion and the phenomena derived from it. The approach is based on a novel and powerful paradigm of learning multi-layer compositional hierarchies. While individual ingredients, such as hierarchical processing, compositionality, and incremental learning, have already been subjects of research, they have, to the best of our knowledge, never been treated in a unified motion-related framework. Such a framework is crucial for robustness, versatility, ease of learning and inference, generalisation, real-time performance, transfer of knowledge, and scalability across a variety of cognitive vision tasks.
Challenge: The main scientific challenge lies in designing the structures and the learning process so as to enable efficient learning of robust, extendable, and general-purpose visual representations that would facilitate the execution of various motion-related tasks in real-world settings. Another part of the challenge is the introduction of proper benchmarks for these tasks in order to establish a basis for the evaluation and comparison of competing approaches.
Scientific novelty & relevance: The current state-of-the-art engineered solutions require extensive training and hand design, are frail, and cannot generalize well enough to respond to novel situations. The paradigm of learning multi-layer compositional visual hierarchies offers a way to overcome these limitations.
Feasibility: The proposed project is feasible due to its focus on a set of well-defined requirements that have been translated into a highly advanced methodology, from both a scientific and a technological point of view, followed by a carefully defined set of experiments that have been planned in order to validate the proposed ap
Significance for science
Project results are directly relevant both to the primary scientific field, computer vision, and to the development of motion analysis. Learning and recognition of motion is one of the central topics of computer vision as well as artificial cognitive vision. The reason is the significant scientific challenge and the applicability of the developed solutions to various automated systems that are based on efficient motion sensing, object tracking, and activity recognition. The main contribution of the project is a novel approach, based on learning hierarchical compositional motion models, which provides a link to perception of the shape category, whether or not the shape is modelled in a hierarchical compositional way. This leads to a more consistent treatment of shape and motion, which can be augmented with a third component, i.e., tracking using properly adapted hierarchical compositional models. An important result of the project is the realization that a learned generative model (including hierarchical compositional models) is not necessarily widely usable, even if the model works well on the established datasets. This realization speaks in favor of hierarchical compositional models over other learning strategies such as deep neural networks; with a properly designed hierarchical compositional model, one can verify, layer by layer, whether the learned structures make semantic sense, in addition to the objective testing results. The developed framework requires a mechanism for limiting the “attention” during training, which can be implemented either by tracking the important objects in a scene or with an additional source of non-visual information (e.g. gaze direction, if available, or non-visual robot sensors). Fast parallel implementations increase the usability of the developed methods in the field of cognitive robotics.
Therefore, in the field of cognitive robotics, the project results will enable faster and more robust detection and interpretation of motion. This will extend the possibility of achieving complex tasks and interactions with the environment and users.
Significance for the country
Analysis of human motion is an important part of future applications, whether in surveillance and security systems, intelligent buildings, and traffic surveillance, or in entertainment, games, and sports. The know-how obtained and disseminated by the project group in the course of this project is of great importance for the promotion of Slovenia, for access to foreign knowledge, for inclusion in the international distribution of work, and for the education of new generations of scientists. The project group is deeply involved in knowledge transfer from academia to industry, as demonstrated by its socio-economic achievements. Results of this project have been applied in an industrial project that involves the development of an automatic multimedia device for video-conference calls, which allows the user to select and track the person they are conferencing with while that person moves around the room. There is intensive collaboration with sports scientists on applications that include modeling of athletes’ motion and analysis of their activity in sports games. Members of the project group developed a low-cost, low-complexity intelligent camera, which is an ideal platform for visual sensor networks, especially for intelligent environments. Members of the project group are also taking the initiative in certain areas of computer vision by organizing challenges/competitions and workshops in conjunction with renowned computer vision conferences, as seen from the list of socio-economic achievements. Members of the project group are frequently in contact with potential clients for computer vision solutions, from both industry and academia. They have frequently faced the problem that the state of the art was not sufficiently advanced to provide all the solutions the clients required. The knowledge acquired in this project will therefore find its way into practical use via the above-mentioned channels.
The project group has acquired new competence in the fields of motion modeling, object detection, and object tracking, which it can offer to all interested parties, including startup companies.
Most important scientific results Annual report 2011, 2012, 2013, final report, complete report on dLib.si
Most important socioeconomically and culturally relevant results Annual report 2011, 2012, 2013, final report, complete report on dLib.si