Projects / Programmes source: ARIS

Language Resources for Slovene

Research activity

Code Science Field Subfield
6.05.00  Humanities  Linguistics   

Code Science Field
H350  Humanities  Linguistics 
H352  Humanities  Grammar, semantics, semiotics, syntax 
H360  Humanities  Applied linguistics, foreign languages teaching, sociolinguistics 
P175  Natural sciences and mathematics  Informatics, systems theory 
P176  Natural sciences and mathematics  Artificial intelligence 
Keywords
language resources, corpus linguistics, corpora, Slovene language, linguistic annotation, lemmatization, disambiguation, parsing, text mining, semantic web
Evaluation (rules)
source: COBISS
Researchers (10)
no.  Code  Name and surname  Research area  Role  Period  No. of publications
1.  14681  PhD Vojko Gorjanc  Linguistics  Researcher  2003 - 2005  478 
2.  17137  Marko Grobelnik  Computer science and informatics  Technical associate  2003 - 2005  439 
3.  15000  PhD Monika Kalin Golob  Linguistics  Researcher  2003 - 2005  576 
4.  20331  PhD Tina Lengar Verovnik  Linguistics  Researcher  2003 - 2005  338 
5.  12570  PhD Dunja Mladenić  Computer science and informatics  Researcher  2004 - 2005  662 
6.  11651  PhD Marko Stabej  Linguistics  Head  2003 - 2005  627 
7.  21346  Maja Škrjanc  Computer science and informatics  Researcher  2003  42 
8.  20453  PhD Špela Vintar  Linguistics  Researcher  2003 - 2005  265 
9.  13232  PhD Primož Vitez  Linguistics  Researcher  2003 - 2005  357 
10.  16060  PhD Jana Zemljarič Miklavčič  Linguistics  Researcher  2003 - 2005  111 
Organisations (3)
no.  Code  Research organisation  City  Registration number  No. of publications
1.  0106  Jožef Stefan Institute  Ljubljana  5051606000  90,682 
2.  0581  University of Ljubljana, Faculty of Arts  Ljubljana  1627058  97,913 
3.  0582  University of Ljubljana, Faculty of Social Sciences  Ljubljana  1626957  40,399 
Abstract
The aim of the project "Language resources for the Slovene language" is to develop text corpora and software tools for researching Slovene texts and the Slovene language in general. It is designed as a qualitative and quantitative upgrade of the Slovene reference corpus FIDA, involving the original partners in the FIDA project (Faculty of Arts, University of Ljubljana; Jožef Stefan Institute; DZS d.d.; Amebis d.o.o.) and one new partner (Faculty of Social Sciences, University of Ljubljana). The upgrade will consist of several components: the size of the corpus will be doubled (to 200,000,000 words), a spoken corpus component and internet texts will be added, and new guidelines for balancing the corpus will be implemented. In parallel with the corpus enlargement, software tools for automatic processing of incoming texts will be developed, as well as software for the extraction and analysis of linguistic information. All results will be publicly available for research and teaching purposes. In that respect, the project will represent a major step forward in developing research and language-policy infrastructure in linguistics, the social sciences and information technology.