In this paper we present the legal framework, dataset and annotation schema for socially unacceptable discourse practices on social networking platforms in Slovenia. On this basis we aim to train an automatic identification and classification system, with which we wish to contribute towards an improved methodology, understanding and treatment of such practices in the contemporary, increasingly multicultural information society.
In this paper we present a set of experiments and analyses on predicting the gender of Twitter users based on language-independent features extracted either from the text or the metadata of users' tweets. We perform our experiments on the TwiSty dataset containing manual gender annotations for users speaking six different languages. Our classification results show that, while the prediction model based on language-independent features performs worse than the bag-of-words model when training and testing on the same language, it regularly outperforms the bag-of-words model when applied to different languages, showing very stable results across various languages. Finally, we perform a comparative analysis of feature effect sizes across the six languages and show that differences in our features correspond to cultural distances.
The aim of the paper is to compare, on the one hand, the case law of the European Court of Human Rights, which allows states to exercise a relatively wide scope of powers in prosecuting hate speech, and, on the other hand, the Slovenian legal regulation and in particular its implementation, which only rarely considers hateful comments problematic. The paper analyses this problem and exposes the reasons for this state of affairs. These are associated with an indulgent attitude (of the state prosecution in particular) towards the hate speech problem; as a possible reason for this attitude, the paper identifies the historical role of the state prosecution in prosecuting the so-called verbal delict in the times of the former Yugoslavia.