Projects / Programmes source: ARIS

Analysis of large text datasets

Research activity

Code Science Field Subfield
2.07.07  Engineering sciences and technologies  Computer science and informatics  Intelligent systems - software 

Code Science Field
T171  Technological sciences  Microelectronics 
Keywords
machine learning, text learning, learning on the Web, information retrieval
Evaluation (rules)
source: COBISS
Researchers (1)
no. Code Name and surname Research area Role Period No. of publications
1.  12570  PhD Dunja Mladenić  Computer science and informatics  Head  1999 - 2001  662 
Organisations (1)
no. Code Research organisation City Registration number No. of publications
1.  0106  Jožef Stefan Institute  Ljubljana  5051606000  90,724 
Abstract
The research will focus on developing new computer methods, and improving existing ones, for the analysis of large text datasets. Special emphasis will be placed on the analysis of Slovenian text. The developed methods will enable automatic categorization of Slovenian documents, adaptation of existing text-learning methods to Slovenian texts, analysis of text datasets based on a new, extended document representation, and better Web browsing through a personal browsing assistant built on the new text analysis methods. This will enable the development of various applications, including automatic updating of existing document categorizations that are currently updated manually, for example the categorization of Slovenian Web documents named "Mat Kurja" or the specialized categorization of Slovenian text documents "Biomedicina Slovenica", a national bibliography for biomedicine.