1.

FIŠER, Darja. Semantic Annotation of Corpora.

Semantically annotated corpora are indispensable in natural language processing tasks such as automatic word sense disambiguation, information retrieval and machine translation. For Slovene, no previous attempt has been made to build such a corpus. This paper presents and discusses a project in which the most frequent nouns in a corpus of Slovene were manually annotated with wordnet senses.

COBISS.SI-ID: 43099234

2.

FIŠER, Darja, POLAK, Senja, VINTAR, Špela. Learning to mine definitions from Slovene structured and unstructured knowledge-rich resources.

The paper presents an innovative approach to extracting Slovene definition candidates from domain-specific corpora using morphosyntactic patterns, automatic terminology recognition and semantic tagging with wordnet senses. The results of the experiment are encouraging, with accuracy ranging from 67% to 71%. The paper also addresses some drawbacks of the approach and suggests ways to overcome them in future work.

COBISS.SI-ID: 43122530

Z6-3668 — Annual report 2010
