1.

Decision support system for traceability and co-existence management in food supply chains

Within the EU project Co-Extra (FP6-7158: GM and non-GM supply chains: their CO-EXistence and TRAceability), we developed a decision support system (DSS) called "Co-Extra DSS". The system addresses decision problems that arise in supply chains with or without genetically modified food and feed, and that are of interest to users such as food and feed producers, analytical laboratories and policy makers. The core of the DSS consists of eight qualitative multi-attribute models covering three problem areas: assessment of analytical and sampling methods, assessment of products based on traceability data, and assessment of production processes. The models were developed in collaboration with domain experts using our decision method DEX and the software DEXi. The approach has proven successful: it is being applied, or is planned to be applied, in several other bilateral and EU projects.

COBISS.SI-ID: 2676303
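
To illustrate what a qualitative multi-attribute model of the kind mentioned in item 1 looks like, the following is a minimal sketch in Python. It is not taken from Co-Extra DSS or DEXi: the attribute names, value scales and rule tables are hypothetical and chosen only to show the idea of aggregating qualitative values through decision-rule tables arranged in a hierarchy.

```python
# Hypothetical attribute tree: every name, scale and rule below is
# illustrative only and does not come from the Co-Extra DSS models.
MODEL = {
    "method_quality": {
        "inputs": ("sampling", "analysis"),
        "rules": {
            ("poor", "poor"): "poor",
            ("poor", "good"): "poor",
            ("good", "poor"): "poor",
            ("good", "good"): "good",
        },
    },
    "product_assessment": {
        "inputs": ("method_quality", "traceability"),
        "rules": {
            ("poor", "incomplete"): "reject",
            ("poor", "complete"): "recheck",
            ("good", "incomplete"): "recheck",
            ("good", "complete"): "accept",
        },
    },
}


def evaluate(attribute, leaf_values):
    """Evaluate an attribute bottom-up from qualitative leaf values."""
    # Leaf attributes are supplied directly by the user.
    if attribute in leaf_values:
        return leaf_values[attribute]
    # Aggregate attributes look up their value in a rule table.
    node = MODEL[attribute]
    key = tuple(evaluate(name, leaf_values) for name in node["inputs"])
    return node["rules"][key]


if __name__ == "__main__":
    option = {"sampling": "good", "analysis": "good", "traceability": "complete"}
    print(evaluate("product_assessment", option))
```

Because every aggregate value is produced by an explicit rule, the path from the leaf values to the final assessment can be read off the tables, which is the property that makes such models convenient to build together with domain experts.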

2.

Applying semantic technology to business news analysis

An interdisciplinary method combining financial text mining and ontology-based reasoning is proposed and evaluated. The experiments are performed on texts from the financial domain, using the well-known Cyc ontology. The results show that using semantic technologies for business news analysis gives users more relevant answers to their queries.

COBISS.SI-ID: 26855719

3.

Class imbalance and the curse of minority hubs

The paper evaluates the impact of hubness on learning under class imbalance with nearest neighbor methods. Our results suggest that, contrary to common belief, minority class hubs might be responsible for most of the misclassification in many high-dimensional datasets.

COBISS.SI-ID: 27022119
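
The notion of hubness in item 3 can be made concrete with a small sketch. The code below is an illustration under stated assumptions, not the paper's implementation: it uses Euclidean distance and a brute-force neighbor search, and the function names `k_nearest` and `hubness_counts` are hypothetical. For each point it counts how often the point appears in the k-nearest-neighbor lists of other points, and how many of those appearances involve a label mismatch ("bad" occurrences).

```python
import math
from collections import Counter


def k_nearest(points, i, k):
    """Indices of the k nearest neighbours of points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return others[:k]


def hubness_counts(points, labels, k):
    """Return (occurrences, bad_occurrences) as Counters keyed by point index.

    occurrences[j]     : how often point j appears in other points' k-NN lists
    bad_occurrences[j] : how many of those appearances have mismatched labels
    """
    occurrences = Counter()
    bad_occurrences = Counter()
    for i in range(len(points)):
        for j in k_nearest(points, i, k):
            occurrences[j] += 1
            if labels[j] != labels[i]:
                bad_occurrences[j] += 1
    return occurrences, bad_occurrences


if __name__ == "__main__":
    # Toy data (assumed for illustration): one minority point among majority points.
    points = [(0.0,), (1.0,), (2.5,), (10.0,)]
    labels = ["major", "minor", "major", "major"]
    occ, bad = hubness_counts(points, labels, k=1)
    for idx in range(len(points)):
        print(idx, labels[idx], occ[idx], bad[idx])
```

In this toy example the single minority point is the nearest neighbor of two majority points, so all of its occurrences are bad ones. A point with a high bad-occurrence count would mislead a nearest neighbor classifier in the same way.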

4.

Semantic subgroup explanations

The paper describes a methodology for explaining the results of other data mining methods. We used an existing semantic subgroup discovery algorithm that takes as input groups of examples (the output of other methods) and a domain ontology. The output of the algorithm is a set of descriptive rules that describe and explain the differences between the input groups using conjunctions of ontological concepts. In the paper we used the publicly available Gene Ontology (GO) together with experimental breast cancer data (gene expression profiles). The ontology encodes biological domain knowledge about the cells of various organisms. In the first part of the experiment we employed subgroup discovery to obtain rules expressed in terms of genes. These rules represent a low-level description of the data. In the second part of the experiment we employed semantic subgroup discovery to infer high-level explanations of the rules inferred in the first part. These high-level descriptions were constructed from the biological knowledge encoded in GO and offer a useful, different view of the same data. We also implemented the experiments as a workflow on the ClowdFlows platform, available at http://clowdflows.org/workflow/911/.

COBISS.SI-ID: 27322407

5.

Computational protein function prediction

We developed a new method for computational gene (or protein) function prediction, which was published in PLOS Computational Biology. The method is based on the principles of homology and phyletic profiles and uses ensembles of trees for hierarchical multi-label classification. In addition, the method was evaluated in wet-lab experiments. The results show that the confidence estimates obtained with our method can be used to make informed decisions about the experimental validation of computational predictions. The method was also used in the first large-scale community-based critical assessment of protein function annotation (CAFA) experiment, which was published in Nature Methods. Fifty-four methods representing the state of the art in protein function prediction were evaluated on a target set of 866 proteins from 11 organisms. Two findings stand out: (i) today's best protein function prediction algorithms substantially outperform the first-generation methods that are still widely used; and (ii) although the top methods perform well enough to guide experiments, there is considerable need for improvement of currently available tools.

COBISS.SI-ID: 26401319

P2-0103 — Annual report 2013
