This special issue of the journal Slovenščina 2.0 is dedicated to collocations and stems from a workshop on collocations organised by members of the project group as part of the eLex 2019 conference. The special issue contains seven papers, which address different perspectives on collocation: what a collocation actually is, how it relates to other multiword expressions, how much collocational data should be included in dictionaries and how it should be presented, and how collocational information should be encoded to make it useful for different purposes. The contributions thus cover a wide range of topics related to collocations, in six different languages, giving this special issue a truly international focus and relevance.
C.03 Guest-associated editor
COBISS.SI-ID: 24561667
The Collocations Dictionary of Modern Slovene, which contains 35,989 headwords and 7,717,561 collocations, is the first dictionary of collocations for Slovene and a first step towards filling the gap in language resources for Slovene, particularly those aimed at facilitating language production. The dictionary is characterized by a phase-based entry representation, collocational data in context, and numerous filtering and sorting options. It is available at https://viri.cjvt.si/kolokacije and also as a database in the CLARIN.SI repository. The database has already been used for various social and educational purposes, e.g. in the development of the language didactic game Game of Words and in the mobile version of the Collocations Dictionary and the Synonym Dictionary (developed independently by a developer from Croatia).
F.11 Development of a new service
COBISS.SI-ID: 20172291
SLONEST-noun 1.0 is an ontology developed for nouns and contains a total of 271 categories of semantic types: 21 top-level categories, which are further divided into up to three levels of hierarchical subcategories. The ontology was also developed for, and tested on, collocation data, so a selection of collocations is provided for most categories. For every collocation, noun headwords and collocates are clearly labelled, and information on the grammatical structure (id and name) is provided. The ontology is already used in lexicographic practice, namely for the Digital Dictionary Database for Slovene and the Comprehensive Slovene-Hungarian Dictionary, which are being compiled at the Centre for Language Resources and Technologies, University of Ljubljana. Where relevant, especially for top-level semantic types, the corresponding semantic type (i.e. lexicographer file) from Wordnet is listed, along with the level of matching ("full" or "partial"). It is precisely this information that makes the ontology relevant in the international context, as it will facilitate the linking of dictionaries and other lexical resources. Linking is currently a prominent topic in linguistics and lexicography, with benefits for different aspects of society (smart cities, artificial intelligence, etc.).
F.15 Development of a new information system/databases
COBISS.SI-ID: 62581507
The Slovenian-Hungarian dictionary is compiled at the Centre for Language Resources and Technologies at the University of Ljubljana (funded by the Slovenian Research Agency via the infrastructure network centres at the University of Ljubljana) and is important for different segments of society, from culture to education and the economy. The methodology for identifying and extracting collocations, developed in the KOLOS project, is key for this dictionary. Moreover, the ontology of semantic types is used to label dictionary senses/concepts. The dictionary also plays a vital role in the development of the data model and semantic concepts for the Digital Dictionary Database for Slovenian, which is being prepared in the Development of Slovene in a Digital Environment project (funded by the Ministry of Culture).
D.01 Chairing over/coordinating (international and national) projects
COBISS.SI-ID: 67672930
The Language Monitor shows temporal trends of words and N-grams. It uses data from the Gigafida 2.0 reference corpus of Modern Slovene (Krek et al. 2020) for the period up to 2018 and, from 2019 onwards, from the IJS NewsFeed service, which extracts texts from over 100 different Slovenian online sources. Its development also drew on the methodology for analysing temporal trends of collocations (and individual words), which was developed and evaluated in the KOLOS project. It is the first tool of its kind in Slovenia, enabling the monitoring of language change in Slovenian (including neologisms), and thus puts Slovenia on the same level as leading countries and institutions in this field.
F.17 Transfer of existing technologies, know-how, methods and procedures into practice
COBISS.SI-ID: 62272771