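Maintenance of large databases based on visual information using incremental learning (original Slovene title: Vzdrževanje velikih podatkovnih baz na podlagi vizualne informacije z inkrementalnim učenjem)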

Code

L2-6765 (B) - included in ARIS records

Head

PhD Danijel Skočaj

Period

7/1/2014 - 6/30/2017

Range in 2017

0.94 FTE

Science

Engineering sciences and technologies (15)

Researcher status

Researcher (15)
Junior expert or technical associate (0)

Education

Doctoral degree (7)
Master's degree (4)
Other (4)

Sex

Man (15)

Status

Employed at RO and RRD (11)
No data on employment in RO (4)

No. of publications

0 (2)
10–99 (8)
100–999 (5)

Projects / Programmes source: ARIS

Research activity

Code	Science	Field	Subfield
2.07.00	Engineering sciences and technologies	Computer science and informatics

Code	Science	Field
P176	Natural sciences and mathematics	Artificial intelligence

Code	Science	Field
1.02	Natural Sciences	Computer and information sciences

Keywords

semisupervised learning; incremental learning; interactive learning; object detection; object recognition; context learning; semiautonomous system; image databases; traffic sign detection; mobile mapping system; register of traffic signalization

Evaluation (methodology)

Evaluation of bibliographic research performance indicators according to ARIS methodology

Citations for bibliographic records in COBIB.SI that are linked to records in citation databases

Organisations (2), Researchers (15)

1539 University of Ljubljana, Faculty of Computer and Information Science

no.	Code	Name and surname	Research area	Role	Period	No. of publications
1.	29381	PhD Luka Čehovin Zajc	Computer science and informatics	Researcher	2016 - 2017	158
2.	30155	PhD Matej Kristan	Computer science and informatics	Researcher	2014 - 2017	369
3.	05896	PhD Aleš Leonardis	Computer science and informatics	Researcher	2014 - 2017	457
4.	35073	Matjaž Majnik	Computer science and informatics	Researcher	2014	10
5.	33172	PhD Rok Mandeljc	Systems and cybernetics	Researcher	2015 - 2017	56
6.	18198	PhD Danijel Skočaj	Computer science and informatics	Head	2014 - 2017	361
7.	34398	PhD Domen Tabernik	Computer science and informatics	Researcher	2014 - 2017	60
8.	38300	MSc Peter Uršič	Computer science and informatics	Researcher	2015 - 2016	17

1945 DFG CONSULTING information systems, Ltd.

no.	Code	Name and surname	Research area	Role	Period	No. of publications
1.	11117	MSc Tomaž Gvozdanović	Geodesy	Researcher	2014 - 2017	68
2.	30802	Simon Jud	Geodesy	Researcher	2014 - 2017	0
3.	37267	Samo Kumar	Geodesy	Researcher	2015 - 2016	0
4.	23717	Marko Mahnič	Computer science and informatics	Researcher	2015 - 2017	20
5.	25143	MSc Uroš Ranfl	Computer science and informatics	Researcher	2014 - 2017	23
6.	24391	MSc Domen Smole	Geodesy	Researcher	2015 - 2017	65
7.	28775	PhD Rok Vezočnik	Geodesy	Researcher	2014 - 2015	100

Abstract

We live in an era of information abundance. However, the central concern is shifting from the quantity of acquired data to its quality and credibility. This is especially true for visual information databases. Although computer vision has made significant progress recently, methods for automatic image interpretation are still not reliable enough for autonomous annotation and maintenance of image and video databases (e.g. databases of detected objects). On the other hand, manually annotating relevant objects in video sequences is time-consuming, expensive, and tedious, and therefore prone to errors. In this project we aim to combine two approaches: computer-based automation of the image interpretation needed for database maintenance, and the suitable introduction of a human verifier into the loop. This combination is central to developing a methodology for semi-automatic maintenance of traffic signalization records, which is one of the project's practical goals. The database of such records for state roads in the Republic of Slovenia alone may contain more than 250,000 entries, along with additional information. Automation is therefore crucial for the continuous maintenance of such databases.
The main goal of the project is to develop a framework for semi-supervised incremental learning, together with specific methods for visual learning and recognition, that will increase the quality and efficiency of maintaining large visual information databases. We will approach the problem holistically. Incremental learning in interaction with a human will be addressed at a general meta level, where we will determine the interactive learning strategy that gives the best results in terms of learning success and recognition rate while reducing the number of required manual interventions. To achieve this, we will develop powerful visual learning methods at the base level.
These methods will not rely on visual information alone, as is common in traditional computer vision systems. Instead, we will also consider context and the fusion of different kinds of available information (multimodal information fusion, temporal information fusion, geometry, etc.). We will use different kinds of context (temporal, spatial, and semantic) to narrow down the search area in the images and improve recognition results. In addition, we will develop methods for learning and incrementally updating the context. We therefore expect significant scientific contributions to incremental learning for object detection, context learning, and learning of optimal strategies for interactive incremental learning.
The developed methodology will be applied to the maintenance of records of vertical and horizontal traffic signalization. This use case is well suited to evaluating the developed algorithms: the problem is well defined, an abundance of multimodal data is at our disposal, and the domain enables efficient learning and use of contextual information. The proposed algorithms are also required for any significant automation of the records maintenance process. We therefore also expect our research to contribute significantly to more efficient monitoring of traffic signalization, which would in the long run significantly reduce the cost of some elements of the traffic infrastructure. We believe that the project can be realised well because its goals are clearly defined and its requirements have been translated into a scientifically and technologically advanced methodology suited to the considered use case.

Significance for science

From a scientific point of view, the successfully realized objectives of this research project are important both for the primary research areas of computer vision and machine learning and for the development of the field of image and video databases. Many large databases containing visual information are maintained manually, which is very time-consuming and expensive. Within this project, we developed methods that (semi-)automate this process. We developed algorithms for learning, detecting, and recognising objects that are suitable for maintaining large image databases. Objects in the stored images are detected automatically. The developed prototype system detects objects in images and, when a detection is reliable, automatically updates the database; otherwise the operator is asked to provide additional information. Manual work is thus significantly reduced, since such requests for additional verification are quite rare. The ambiguous detections, once verified, can later be used to update the learned model, which can further increase its reliability.
The main contributions from the computer vision point of view are effective and reliable methods for learning, detecting, and recognising objects. We developed a new machine learning method based on incremental learning of representative samples. We also designed a different representation of visual information that models images as unordered sets of parts encoded by a convolutional neural network. We proposed a general method for learning discriminative parts that supports sequential learning. We also combined two paradigms for modelling visual information: the hierarchical paradigm, with explicit modelling of relations between object parts, and the paradigm based on artificial neural networks that has come to prevail in computer vision.
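The routing of detections described above (automatic database update for reliable detections, operator verification otherwise) can be illustrated with a minimal sketch. It is not the project's implementation: the `Detection` record, the `triage` and `collect_training_examples` helpers, and the 0.9 acceptance threshold are all hypothetical, and a single confidence threshold is assumed as the reliability criterion.

```python
from dataclasses import dataclass, field


@dataclass
class Detection:
    """Hypothetical detector output for one object in one image."""
    image_id: str
    category: str
    score: float  # detector confidence, assumed to lie in [0, 1]


@dataclass
class TriageResult:
    auto_accepted: list = field(default_factory=list)  # written to the database directly
    needs_review: list = field(default_factory=list)   # queued for the human operator


def triage(detections, accept_threshold=0.9):
    """Split detections into an auto-accept queue and an operator-review queue.

    The threshold value is an assumption; in practice it would be tuned so that
    auto-accepted detections are reliable enough for the target database.
    """
    result = TriageResult()
    for det in detections:
        if det.score >= accept_threshold:
            result.auto_accepted.append(det)
        else:
            result.needs_review.append(det)
    return result


def collect_training_examples(reviewed):
    """Turn operator verdicts into labelled examples for a later model update.

    `reviewed` is an iterable of (Detection, label) pairs, where `label` is the
    category confirmed or corrected by the operator, or None if the operator
    rejected the detection. Rejected detections are skipped in this sketch.
    """
    return [(det.image_id, label) for det, label in reviewed if label is not None]


if __name__ == "__main__":
    queues = triage([
        Detection("img_001", "stop", 0.97),
        Detection("img_002", "yield", 0.55),
    ])
    print(len(queues.auto_accepted), "auto-accepted,", len(queues.needs_review), "for review")
```

The verified examples returned by `collect_training_examples` stand in for the data that would feed an incremental update of the learned model; the update step itself is outside the scope of this sketch.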
In deep architectures, we introduced more explicit modelling of spatial relations between individual parts, along with the concept of spatially-adaptive filter units for deep neural networks, which enables efficient work with large receptive fields. This approach is a novelty in the field of deep neural networks and has considerable potential for both object recognition and semantic segmentation.
Our solution to the traffic sign detection problem also builds on the latest object detection methods based on deep architectures. We proposed a data augmentation process based on modelling the various factors that influence the appearance of learning samples. We also addressed the problems of uneven representation of objects of different sizes and of positive and negative learning examples. The developed process is especially suitable for the target application domain, the maintenance of traffic sign records, i.e., the detection and recognition of traffic signs in images.
An important contribution of this project is also a dataset of traffic signs. It contains 7,000 annotated images covering 200 different categories of traffic signs, with at least 20 examples of at least 30 pixels in size for each category, and all the traffic signs in all images are very accurately labelled. To our knowledge, it is currently the best such image dataset available, with the largest number of traffic sign categories, and it is suitable for evaluating detection and recognition algorithms.
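The per-category criterion stated for the dataset (at least 20 examples of at least 30 pixels) can be checked with a short sketch like the following. The annotation format of `(category, width_px, height_px)` tuples and the function name `eligible_categories` are hypothetical, and "size" is assumed here to mean the larger side of the bounding box, which the record does not specify.

```python
from collections import Counter


def eligible_categories(annotations, min_size_px=30, min_examples=20):
    """Return the categories that have enough sufficiently large examples.

    `annotations` is an iterable of (category, width_px, height_px) tuples;
    this format is an assumption made for illustration. An example counts
    if the larger side of its bounding box is at least `min_size_px`.
    """
    counts = Counter(
        category
        for category, width, height in annotations
        if max(width, height) >= min_size_px
    )
    return sorted(c for c, n in counts.items() if n >= min_examples)


if __name__ == "__main__":
    # Synthetic annotations: "stop" meets the criterion, "yield" falls one short.
    demo = [("stop", 40, 40)] * 20 + [("yield", 40, 40)] * 19 + [("yield", 10, 10)] * 5
    print(eligible_categories(demo))
```

A filter of this kind would typically be run when assembling an evaluation set, so that every retained category has enough usable instances for both training and testing.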

Significance for the country

The successfully realized objectives of this project will have a multifaceted influence on the development of the Slovenian economy and society.
The direct relevance of the project to the economy is evident: new methods for image interpretation enable more efficient and less expensive use of content based on visual information. High-tech companies that develop and offer services in the field of digital image and multimedia databases could use the methods developed within the project to improve their services and increase their competitiveness. Similar technology could be used by highly innovative companies in robotics and cognitive systems, where visual information plays a very important role.
A very suitable domain for applying the results is the maintenance of traffic signalization records. Implementing and applying the designed prototype system will improve the frequency and precision with which such databases are maintained; as a result, maintenance costs will decrease. We expect an impact on the entire remote sensing and geodesy industry. To our knowledge, current mobile mapping systems do not offer computer vision and artificial intelligence functionality as advanced as that which we developed. As a result, the majority of the work relies on time-consuming manual labour, which in many cases has already been outsourced to cheaper (frequently non-European) countries. Implementing the results of this project could stop this trend.
In this project we focused on detecting traffic signalization for the purpose of maintaining records. However, the developed methods could also lead to more efficient methods for real-time embedded mobile platforms, e.g., a driving assistant or an aid that would help blind people navigate in traffic more easily.
In the era of self-driving car development, traffic sign detection is, of course, a very desirable capability. Either way, the research results are very useful from the application perspective and promise many opportunities for exploiting the results, as well as for developing and marketing the new technology with immediate economic effects.
In recent years, we have witnessed increased public interest in artificial intelligence and machine learning. Since the detection and recognition of traffic signs is a well-known problem, an automatic solution to it certainly attracts a lot of interest. By demonstrating the operation of the underlying algorithms, we will be able to explain the main principles of machine learning and thus realistically demonstrate the potential of this technology.
Last but not least, during the project we also invited students to collaborate with the research team, and they gained valuable research experience. The findings obtained in the project were also included in the teaching process, and they enriched our pedagogical activities and experience. We will pass them on to the next generation of students of computer and information science, an area that is extremely important for the development of Slovenia.

Most important scientific results

Annual report 2014, 2015, final report

Most important socioeconomically and culturally relevant results

Annual report 2014, 2015, final report
