Projects / Programmes source: ARIS

Predictive clustering on data streams

Research activity

Code Science Field Subfield
2.07.07  Engineering sciences and technologies  Computer science and informatics  Intelligent systems - software 

Code Science Field
1.02  Natural Sciences  Computer and information sciences 
Keywords
data mining, evolving data streams, multi-target regression, multi-label classification, semisupervised learning, feature ranking, change detection
Evaluation (rules)
source: COBISS
Points  7,593.71
A''  1,852.67
A'  3,808.48
A1/2  4,911.2
CI10  8,916
CImax  629
h10  41
A1  25.65
A3  6.06
Data for the last 5 years (citations for the last 10 years) on April 24, 2024; A3 for period 2018-2022
Data for ARIS tenders ( 04.04.2019 – Programme tender, archive )
Database Linked records Citations Pure citations Average pure citations
WoS  373  8,992  7,853  21.05 
Scopus  470  12,836  11,274  23.99 
Researchers (16)
no. Code Name and surname Research area Role Period No. of publications
1.  31818  PhD Andreja Abina  Chemistry  Researcher  2023 - 2024  62 
2.  53798  Jure Brence  Computer science and informatics  Researcher  2020 - 2024  21 
3.  36220  PhD Martin Breskvar  Computer science and informatics  Researcher  2020 - 2023  36 
4.  11130  PhD Sašo Džeroski  Computer science and informatics  Head  2020 - 2024  1,204 
5.  57060  Boštjan Gec  Computer science and informatics  Researcher  2022 - 2024 
6.  31050  PhD Dragi Kocev  Computer science and informatics  Researcher  2020 - 2024  204 
7.  53530  Ana Kostovska  Computer science and informatics  Junior researcher  2020 - 2024  41 
8.  27800  PhD Zoran Levnajić  Physics  Researcher  2020 - 2024  135 
9.  36356  PhD Aljaž Osojnik  Computer science and informatics  Researcher  2020 - 2024  47 
10.  27759  PhD Panče Panov  Computer science and informatics  Researcher  2020 - 2024  155 
11.  38206  PhD Matej Petković  Computer science and informatics  Researcher  2020 - 2023  65 
12.  34452  PhD Nikola Simidjievski  Computer science and informatics  Researcher  2020 - 2024  58 
13.  39156  PhD Tomaž Stepišnik  Computer science and informatics  Junior researcher  2020  28 
14.  57192  Sintija Stevanoska  Computer science and informatics  Researcher  2022 - 2024 
15.  16302  PhD Ljupčo Todorovski  Computer science and informatics  Researcher  2020 - 2024  443 
16.  28893  MSc Sergeja Vogrinčič  Computer science and informatics  Technical associate  2023 - 2024 
Organisations (2)
no. Code Research organisation City Registration number No. of publications
1.  0106  Jožef Stefan Institute  Ljubljana  5051606000  90,724 
2.  2338  Jožef Stefan International Postgraduate School  Ljubljana  1917544  11,430 
Abstract
Data streams are high-frequency information sources that have recently become ubiquitous. Their distinguishing properties are the high rate at which new examples arrive and the time order of those examples. Crucial among these properties is the possibility that the data, and the underlying mechanisms generating it, can change over time; this is called concept drift. Data stream mining methods must therefore be able to detect concept drift and adapt to it.

The need for mining data streams has increased, and so has the complexity of the streams themselves, which can be categorized along several dimensions. One is the complexity of the target to predict: multi-target prediction tasks are encountered increasingly often. Another is the need to handle examples with missing target values, in the context of semi-supervised learning or clustering. Finally, specific to data streams is the occurrence of concept drift and the need to detect it and adapt to it.

Responding to the need to handle complex data streams, this project will develop online learning methods that can:
1) handle tasks of both flat and hierarchical multi-target regression and multi-label classification;
2) efficiently perform unsupervised learning (clustering), as well as semi-supervised learning for (hierarchical) multi-target prediction tasks;
3) estimate the importance of features for supervised and semi-supervised multi-target prediction tasks; and
4) detect and handle changes during the learning of predictive models for different types of structured outputs, also in the context of semi-supervised learning.

The project will systematically evaluate the developed methods using appropriate evaluation methodology. The developed methods will be made publicly available through a major data stream mining platform. Their use will also be promoted and facilitated by annotating the methods with terms from an ontology of data stream mining, making them easier to find and use. Finally, the utility of the developed methods will be demonstrated on real-world case studies from the challenging areas of environmental and health monitoring, as well as space operations monitoring and optimization.
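
As a purely illustrative aside, the kind of change detection mentioned in the abstract can be sketched with the classical Page-Hinkley test, which monitors a stream of numbers (for example, a model's per-example prediction error) and raises an alarm when the mean shifts upward. This is not the project's own method; the class name, the default parameter values and the synthetic error stream below are all assumptions made for the sake of the example.

```python
import random


class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a stream.

    delta, threshold and min_samples are illustrative defaults, not tuned values.
    """

    def __init__(self, delta=0.005, threshold=50.0, min_samples=30):
        self.delta = delta              # tolerated magnitude of change
        self.threshold = threshold      # alarm level (often written lambda)
        self.min_samples = min_samples  # warm-up period before alarms are allowed
        self.reset()

    def reset(self):
        self.n = 0
        self.mean = 0.0     # running mean of the observed values
        self.cum = 0.0      # cumulative deviation from the running mean
        self.cum_min = 0.0  # smallest cumulative deviation seen so far

    def update(self, x):
        """Consume one value; return True if a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        if self.n >= self.min_samples and self.cum - self.cum_min > self.threshold:
            self.reset()  # adapt: forget statistics of the old concept
            return True
        return False


def first_alarm(stream, detector):
    """Return the index of the first alarm, or None if no alarm is raised."""
    for t, x in enumerate(stream):
        if detector.update(x):
            return t
    return None


def synthetic_errors(n=2000, drift_at=1000, seed=0):
    """Hypothetical error stream whose mean jumps from 0.2 to 0.8 at drift_at."""
    rng = random.Random(seed)
    return [rng.gauss(0.2 if t < drift_at else 0.8, 0.1) for t in range(n)]


if __name__ == "__main__":
    print("first alarm at index:", first_alarm(synthetic_errors(), PageHinkley()))
```

In an online learner, an alarm of this kind would typically trigger an adaptation step such as resetting or retraining part of the model; how the methods developed in the project handle this is not specified in the abstract.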