Deep generative appearance modeling in visual tracking

Code

Z2-1866 (B) - included in ARIS records

Head

PhD Luka Čehovin Zajc

Period

7/1/2019 - 2/28/2022

Range in 2022

0.5 FTE

Science

Engineering sciences and technologies (1)

Researcher status

Researcher (1)
Junior expert or technical associate (0)

Education

Doctor's degree (1)

Sex

Man (1)

Status

Employed at RO and RRD (1)

No. of publications

100–999 (1)

Projects / Programmes source: ARIS

Deep generative appearance modeling in visual tracking

Research activity

Code	Science	Field	Subfield
2.07.07	Engineering sciences and technologies	Computer science and informatics	Intelligent systems - software

Code	Science	Field
P176	Natural sciences and mathematics	Artificial intelligence

Code	Science	Field
1.02	Natural Sciences	Computer and information sciences

Keywords

computer vision, visual tracking, generative modeling, artificial neural networks

Evaluation (rules)

Evaluation of bibliographic research performance indicators according to ARIS methodology

Citations for bibliographic records in COBIB.SI that are linked to records in citation databases

Researchers (1)

no.	Code	Name and surname	Research area	Role	Period	No. of publications
1.	29381	PhD Luka Čehovin Zajc	Computer science and informatics	Head	2019 - 2022	125

Organisations (1)

no.	Code	Research organisation	City	Registration number	No. of publications
1.	1539	University of Ljubljana, Faculty of Computer and Information Science	Ljubljana	1627023	16,716

Abstract

Predicting object state (e.g. pose) in video streams is one of the fundamental challenges of computer vision. Knowing where an object is at a given point in time can help autonomous vehicles avoid obstacles, raise an alert when an elderly person falls at home, analyze performance in professional sport, reveal the behaviour of animals, or help robots actively learn new concepts. Yet numerous open challenges have to be solved to develop a general visual tracking method capable of robustly handling the scenarios mentioned above. If the illumination changes, if the target becomes partially occluded, or if the target is articulated and moves only some of its parts (e.g. human limbs), the resulting appearance changes are difficult to explain with simple image transformations. Humans, on the other hand, can solve complex tracking scenarios by relying on a massive amount of knowledge about the world accumulated through lifelong learning. This knowledge contains information about object categories and their possible deformations and appearance variations, which is crucial for retaining a stable representation of the tracked object. Following this insight, and building on the recent successes of the deep learning paradigm, the main research goal of this project is to build deep generative models of object appearance suitable for visual tracking. We want to determine a mapping from an image representation to a high-dimensional latent parameter space that structures the appearance variations of various objects in a way that is useful in visual tracking scenarios, i.e. when we have access to only a limited number of trustworthy training examples of an object's appearance and would like to generalize from them using prior knowledge. The work will be divided into four work packages: WP1 (Generative deep neural network models for appearance modeling), WP2 (Generative appearance models in visual tracking), WP3 (Training and testing data acquisition), WP4 (Dissemination).
This is a postdoctoral project, so only one person will work on it. The applicant is a member of the Visual Cognitive Systems Laboratory (ViCoS) and has published numerous papers in top computer vision journals and at major computer vision conferences, mostly on visual tracking, and also has significant research experience in other computer vision topics. He is one of the founders of the largest competition for visual tracking methods and has excellent insight into the current state of the art in visual tracking, which will be crucial in the development of new algorithms.

Significance for science

The proposed project is a high-risk/high-gain basic research project. Its research challenges are highly relevant and novel. Generative neural networks are a vibrant research domain with many open research questions. The application of generative deep network architectures to the problem of visual tracking is nearly unexplored territory. The interaction between a discriminative and a generative component for visual tracking has also not been explored. Long-term tracking, i.e. tracking an object for a very long time, possibly through occlusions and out-of-view disappearances, is also a less-explored research area that is slowly gaining momentum. Such tracking scenarios require a stable appearance representation that can be refined with new data as it becomes available. This is what this project aims to achieve.

The potential impact of the project is significant and transcends visual tracking alone. From the standpoint of long-term computer vision development, generative models can bring together several separate tasks, e.g. detection, classification, segmentation and tracking, into a joint framework. This is in line with the recent introduction of multi-task learning of deep neural networks. In the short run, the introduction of deep generative models into visual tracking will improve the stability of trackers and open new research directions for the field. From the application standpoint, the methods developed within the project could be extended and used in robotics for better object perception, in multimedia applications for improved object segmentation, and in augmented reality for more accurate object localisation, to name just a few. The results of the proposed project will also be published at major conferences and in journals in the field of computer vision.

Significance for the country

The proposed project is a high-risk/high-gain basic research project. Its research challenges are highly relevant and novel. Generative neural networks are a vibrant research domain with many open research questions. The application of generative deep network architectures to the problem of visual tracking is nearly unexplored territory. The interaction between a discriminative and a generative component for visual tracking has also not been explored. Long-term tracking, i.e. tracking an object for a very long time, possibly through occlusions and out-of-view disappearances, is also a less-explored research area that is slowly gaining momentum. Such tracking scenarios require a stable appearance representation that can be refined with new data as it becomes available. This is what this project aims to achieve.

The potential impact of the project is significant and transcends visual tracking alone. From the standpoint of long-term computer vision development, generative models can bring together several separate tasks, e.g. detection, classification, segmentation and tracking, into a joint framework. This is in line with the recent introduction of multi-task learning of deep neural networks. In the short run, the introduction of deep generative models into visual tracking will improve the stability of trackers and open new research directions for the field. From the application standpoint, the methods developed within the project could be extended and used in robotics for better object perception, in multimedia applications for improved object segmentation, and in augmented reality for more accurate object localisation, to name just a few. The results of the proposed project will also be published at major conferences and in journals in the field of computer vision.

Most important scientific results

Interim report

Most important socioeconomically and culturally relevant results

Interim report
