Projects / Programmes source: ARIS

Deep generative appearance modeling in visual tracking

Research activity

Code Science Field Subfield
2.07.07  Engineering sciences and technologies  Computer science and informatics  Intelligent systems - software 

Code Science Field
P176  Natural sciences and mathematics  Artificial intelligence 

Code Science Field
1.02  Natural Sciences  Computer and information sciences 
Keywords
computer vision, visual tracking, generative modeling, artificial neural networks
Evaluation (rules)
source: COBISS
Researchers (1)
no. Code Name and surname Research area Role Period No. of publications
1.  29381  PhD Luka Čehovin Zajc  Computer science and informatics  Head  2019 - 2022  124 
Organisations (1)
no. Code Research organisation City Registration number No. of publications
1.  1539  University of Ljubljana, Faculty of Computer and Information Science  Ljubljana  1627023  16,239 
Abstract
Predicting object state (e.g., pose) in video streams is one of the fundamental challenges of computer vision. Knowing where an object is at a given point in time can help autonomous vehicles avoid obstacles, raise an alert when elderly people fall at home, analyze performance in professional sport, reveal the behaviour of animals, or help robots actively learn new concepts. Yet numerous open challenges have to be solved to develop a general visual tracking method capable of robustly handling the scenarios mentioned above. If the illumination changes, the target becomes partially occluded, or the target is articulated and moves only some of its parts (e.g. human limbs), the resulting appearance changes are difficult to explain with simple image transformations. Humans, on the other hand, can solve complex tracking scenarios by relying on a massive amount of knowledge about the world accumulated through lifelong learning. This knowledge contains information about object categories and their possible deformations and appearance variations, which is crucial for retaining a stable representation of the tracked object. Following this insight and building on the recent successes of the deep learning paradigm, the main research goal of this project is to build deep generative models of object appearance suitable for visual tracking. We want to determine a mapping from an image representation to a high-dimensional latent parameter space that would structure the appearance variations of various objects in a way that is useful in visual tracking scenarios, i.e. when we have access to only a limited number of trustworthy training examples of an object's appearance and would like to generalize from them using prior knowledge. The work will be divided into four work packages: WP1 (Generative deep neural network models for appearance modeling), WP2 (Generative appearance models in visual tracking), WP3 (Training and testing data acquisition), WP4 (Dissemination).
This is a postdoctoral project, so only one person will work on it. The applicant is a member of the Visual Cognitive Systems Laboratory (ViCoS) and has published numerous papers in top computer vision journals and at major computer vision conferences, mostly related to visual tracking, and also has significant research experience in other computer vision topics. He is one of the founders of the largest competition in visual tracking methods and has excellent insight into the current state of the art in visual tracking, which will be crucial in the development of new algorithms.
Significance for science
The proposed project is a basic-research, high-risk/high-gain project. Its research challenges are highly relevant and novel. Generative neural networks are a vibrant research domain with many open research questions. The application of generative deep network architectures to the problem of visual tracking is nearly unexplored territory. The interaction between a discriminative and a generative component for visual tracking has also not been explored. Long-term tracking, i.e. tracking an object for a very long time, possibly through occlusions and out-of-view disappearances, is also a less-explored research area that is slowly gaining momentum. Such tracking scenarios clearly require a stable appearance representation that can be refined with new data as it becomes available. This is what this project aims to achieve. The potential impact of the project is significant and transcends visual tracking alone. From the standpoint of long-term computer vision development, generative models can bring together several separate tasks, e.g. detection, classification, segmentation and tracking, into a joint framework. This is in line with the recent introduction of multi-task learning of deep neural networks. In the short run, the introduction of deep generative models into visual tracking will improve the stability of trackers and open new research directions for the field. From the application standpoint, the methods developed within the project could be extended and used in robotics for better object perception, in multimedia applications for improved object segmentation, and in augmented reality for more accurate object localisation, to name just a few. The results of the proposed project will also be published at major conferences and in journals in the field of computer vision.
Significance for the country
The proposed project is a basic-research, high-risk/high-gain project. Its research challenges are highly relevant and novel. Generative neural networks are a vibrant research domain with many open research questions. The application of generative deep network architectures to the problem of visual tracking is nearly unexplored territory. The interaction between a discriminative and a generative component for visual tracking has also not been explored. Long-term tracking, i.e. tracking an object for a very long time, possibly through occlusions and out-of-view disappearances, is also a less-explored research area that is slowly gaining momentum. Such tracking scenarios clearly require a stable appearance representation that can be refined with new data as it becomes available. This is what this project aims to achieve. The potential impact of the project is significant and transcends visual tracking alone. From the standpoint of long-term computer vision development, generative models can bring together several separate tasks, e.g. detection, classification, segmentation and tracking, into a joint framework. This is in line with the recent introduction of multi-task learning of deep neural networks. In the short run, the introduction of deep generative models into visual tracking will improve the stability of trackers and open new research directions for the field. From the application standpoint, the methods developed within the project could be extended and used in robotics for better object perception, in multimedia applications for improved object segmentation, and in augmented reality for more accurate object localisation, to name just a few. The results of the proposed project will also be published at major conferences and in journals in the field of computer vision.
Most important scientific results Interim report
Most important socioeconomically and culturally relevant results Interim report