1.

The complete Gabor-Fisher classifier for robust face recognition

Face recognition systems that exploit Gabor filters are currently among the most robust and efficient face-based biometric systems. These systems commonly apply a bank of complex Gabor filters and use the magnitude responses of the filtering operation to derive features for the recognition task. In this paper, we extend this common approach and introduce a novel feature type derived from the Gabor phase responses. We show that the Gabor phase features contain information complementary to the Gabor magnitude features and that the two feature types result in competitive recognition performance.

COBISS.SI-ID: 7787604
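
The magnitude and phase responses referred to in the abstract above can be illustrated with a minimal sketch. It builds one complex Gabor kernel and evaluates the filter response at a single pixel. The function names, the isotropic Gaussian envelope, and the parameter values are assumptions made for illustration; they are not taken from the paper, and the sketch does not reproduce the feature extraction of the complete Gabor-Fisher classifier.

```python
import cmath
import math


def gabor_kernel(size, wavelength, theta, sigma):
    """Complex Gabor kernel: Gaussian envelope times a complex sinusoid.

    Assumes an odd `size` and an isotropic envelope (illustrative choice).
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Coordinate along the direction of the sinusoidal carrier.
            xr = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma ** 2))
            carrier = cmath.exp(2j * math.pi * xr / wavelength)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def gabor_response_at(image, kernel, row, col):
    """Complex filter response at one pixel (pixels outside the image count as zero)."""
    half = len(kernel) // 2
    total = 0j
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            r, c = row + dy, col + dx
            if 0 <= r < len(image) and 0 <= c < len(image[0]):
                total += image[r][c] * kernel[dy + half][dx + half]
    return total


def magnitude_and_phase(response):
    """Split a complex response into the two feature sources named in the abstract."""
    return abs(response), cmath.phase(response)


# Synthetic test image: vertical stripes with a period of 4 pixels.
stripes = [[math.cos(2 * math.pi * c / 4) for c in range(16)] for _ in range(16)]

aligned = gabor_kernel(size=9, wavelength=4.0, theta=0.0, sigma=2.0)
rotated = gabor_kernel(size=9, wavelength=4.0, theta=math.pi / 2, sigma=2.0)

mag_aligned, phase_aligned = magnitude_and_phase(gabor_response_at(stripes, aligned, 8, 8))
mag_rotated, phase_rotated = magnitude_and_phase(gabor_response_at(stripes, rotated, 8, 8))
```

On this synthetic image, the kernel whose carrier matches the stripe direction gives a much larger magnitude than the rotated one, while the phase is an angle in the range from -pi to pi that depends on where the pixel lies relative to the stripe pattern.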

2.

An evaluation of video-to-video face verification

Due to the widespread use of web-cams and mobile devices with embedded cameras, it is now possible to perform face recognition on video rather than on still images alone. This paper presents an evaluation of person identity verification using facial video data. It involves 18 systems made available by seven academic institutes, including IDIAP, the University of Surrey, and the University of Ljubljana. These systems rest on a diverse set of assumptions, allowing us to assess the effect of differences in approaches to video-to-video face authentication.

COBISS.SI-ID: 8062804

3.

Towards the optimal minimization of a pronunciation dictionary model

This paper presents the results of our efforts to obtain the minimum possible finite-state representation of a pronunciation dictionary. Finite-state transducers are widely used to encode word pronunciations, and our experiments revealed that the conventional redundancy-reduction algorithms developed within this framework yield suboptimal solutions. We found that the incremental construction and redundancy reduction of acyclic finite-state transducers creates considerably smaller models (up to 60% smaller) than the conventional, non-incremental (batch) algorithms implemented in the OpenFST toolkit.

COBISS.SI-ID: 7879764
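
The idea of incremental construction with redundancy reduction, mentioned in the abstract above, can be sketched on a simplified case. The example below builds a minimal acyclic finite-state acceptor from a sorted word list, merging equivalent states as soon as they can no longer change. This is the well-known incremental construction for sorted input, applied here to plain acceptors without output labels; it is an assumption-laden illustration, not the transducer algorithm evaluated in the paper. All names and the toy word list are hypothetical.

```python
class State:
    """One automaton state with labelled outgoing edges."""

    def __init__(self):
        self.edges = {}
        self.final = False

    def key(self):
        # Two states are equivalent if they agree on finality and on all
        # (label, target) pairs; targets are already canonical when this is called.
        return (self.final, tuple(sorted((ch, id(t)) for ch, t in self.edges.items())))


def build_minimal_acceptor(words):
    """Incrementally build a minimal acyclic acceptor from a word list."""
    register = {}   # canonical representative for each equivalence class
    unchecked = []  # (parent, label, child) along the path of the previous word
    root = State()
    previous = ""

    def minimize(down_to):
        # Merge or register the states below the common prefix, deepest first.
        while len(unchecked) > down_to:
            parent, ch, child = unchecked.pop()
            k = child.key()
            if k in register:
                parent.edges[ch] = register[k]
            else:
                register[k] = child

    for word in sorted(set(words)):
        common = 0
        while (common < min(len(word), len(previous))
               and word[common] == previous[common]):
            common += 1
        minimize(common)
        node = unchecked[-1][2] if unchecked else root
        for ch in word[common:]:
            nxt = State()
            node.edges[ch] = nxt
            unchecked.append((node, ch, nxt))
            node = nxt
        node.final = True
        previous = word
    minimize(0)
    return root


def count_states(root):
    """Number of states reachable from the root."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if id(node) not in seen:
            seen.add(id(node))
            stack.extend(node.edges.values())
    return len(seen)


def accepts(root, word):
    """Check whether the acceptor recognizes the given word."""
    node = root
    for ch in word:
        node = node.edges.get(ch)
        if node is None:
            return False
    return node.final


toy_words = ["tap", "taps", "top", "tops"]  # hypothetical toy lexicon
acceptor = build_minimal_acceptor(toy_words)
# An unminimized prefix tree needs one state per distinct prefix.
trie_states = len({w[:i] for w in toy_words for i in range(len(w) + 1)})
minimal_states = count_states(acceptor)
```

For this toy lexicon the shared suffixes are merged, so the minimal acceptor has fewer states than the prefix tree. The size reductions reported in the abstract concern transducers with pronunciation outputs and cannot be inferred from this sketch.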

4.

Multimodal emotion recognition based on the decoupling of emotion and speaker information

A gradient descent transformation for decoupling the emotion and speaker information contained in acoustic features is presented. The Interspeech ’09 Emotion Challenge feature set is used as the baseline for the audio part. For the video signal, nuisance attribute projection is used to derive the transformation matrix representing emotional state features. The audio and video sub-systems are combined at the matching score level. The presented system is assessed on the eNTERFACE ’05 database, where improved recognition performance is observed compared to state-of-the-art systems.

COBISS.SI-ID: 7879508
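
Combining two sub-systems at the matching score level, as described in the abstract above, can be sketched as follows. The abstract does not state which fusion rule is used, so the z-score normalization and the weighted sum below are assumptions chosen as a common baseline; the function names, weight, labels, and scores are hypothetical.

```python
import math


def z_normalize(scores):
    """Map scores to zero mean and unit variance so both modalities are comparable."""
    mean = sum(scores) / len(scores)
    variance = sum((s - mean) ** 2 for s in scores) / len(scores)
    std = math.sqrt(variance) or 1.0  # guard against identical scores
    return [(s - mean) / std for s in scores]


def fuse_scores(audio_scores, video_scores, audio_weight=0.5):
    """Weighted sum of normalized per-class scores (assumed fusion rule)."""
    audio = z_normalize(audio_scores)
    video = z_normalize(video_scores)
    return [audio_weight * a + (1.0 - audio_weight) * v
            for a, v in zip(audio, video)]


def classify(labels, audio_scores, video_scores, audio_weight=0.5):
    """Return the label with the highest fused score."""
    fused = fuse_scores(audio_scores, video_scores, audio_weight)
    best = max(range(len(labels)), key=fused.__getitem__)
    return labels[best]


# Hypothetical per-class matching scores from the two sub-systems.
emotions = ["anger", "joy", "sadness"]
audio = [0.2, 0.5, 0.3]
video = [1.0, 4.0, 2.0]
decision = classify(emotions, audio, video)
```

Normalizing before the weighted sum matters because the two sub-systems may produce scores on different scales, as in the hypothetical values above.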

P2-0250 — Annual report 2010
