6th Annual Meeting of the International Multisensory Research Forum

Speaker Identification based on Automatic Crossmodal Fusion of Audio and Visual Data
Poster Presentation

Richard Reilly
University College Dublin

Niall Fox
University College Dublin

     Abstract ID Number: 158
     Full text: Not available
     Last modified: March 21, 2005
     Presentation date: 06/07/2005 11:30 AM in MART Auditorium

Abstract
It has long been reported that multimodal integration enhances our ability to detect, locate and discriminate external stimuli. Translating this automatic integration process to electronic systems and devices for speech or speaker recognition is of great research interest.

This paper reports an audio-visual speaker identification system that employs multimodal integration, combining the audio and visual speech modalities through automatic classifier fusion. The visual modality uses the speaker's lip information. The fusion uses a feedback mechanism that automatically adapts the audio or visual information according to reliability estimates output by both the audio and the visual feedforward recognisers.
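As a rough illustration of reliability-weighted classifier fusion, the sketch below combines per-speaker scores from an audio and a visual classifier. It is not the authors' system: the abstract does not specify the reliability measure or the adaptation rule, so the reliability estimate used here (the gap between the two best normalised scores), the weighting formula, and the names `normalise`, `reliability` and `fuse_scores` are all assumptions made for illustration.

```python
def normalise(scores):
    """Min-max normalise a dict of speaker -> score into the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All speakers scored equally: the classifier carries no information.
        return {spk: 0.0 for spk in scores}
    return {spk: (s - lo) / (hi - lo) for spk, s in scores.items()}


def reliability(scores):
    """Assumed reliability estimate: gap between best and second-best normalised score."""
    ranked = sorted(normalise(scores).values(), reverse=True)
    return ranked[0] - ranked[1] if len(ranked) > 1 else 1.0


def fuse_scores(audio, visual):
    """Weight each modality by its relative reliability and pick the best speaker.

    Returns (identified_speaker, audio_weight). Both dicts must share the same keys.
    """
    ra, rv = reliability(audio), reliability(visual)
    # Fall back to equal weights when neither modality is informative.
    w = 0.5 if ra + rv == 0 else ra / (ra + rv)
    a, v = normalise(audio), normalise(visual)
    fused = {spk: w * a[spk] + (1.0 - w) * v[spk] for spk in a}
    return max(fused, key=fused.get), w


# Hypothetical scores: the audio classifier is ambiguous, the visual one is decisive.
audio = {"s1": -10.0, "s2": -10.5, "s3": -11.0}
visual = {"s1": -30.0, "s2": -12.0, "s3": -29.0}
speaker, audio_weight = fuse_scores(audio, visual)
```

In this made-up example the audio scores are closely spaced, so the audio weight falls below 0.5 and the decision follows the more confident visual classifier.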

The robustness of the system was assessed by adding white Gaussian noise to the audio modality and applying ten levels of JPEG compression to the visual modality. Experiments were carried out on a large data set of 251 subjects from an international audio-visual database (XM2VTS). The results show that audio-visual speaker identification outperforms identification from the audio or visual modality alone at all tested levels of audio and visual mismatch. By combining multisensory information in this way, audio-visual speaker identification accuracies range from 99.2% with no audio or visual noise to 71.4% at the most severe mismatch levels.
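The audio degradation described above can be sketched as follows: white Gaussian noise is scaled so that the noisy signal has a chosen signal-to-noise ratio (SNR). The abstract does not state the SNR levels used, so the 10 dB value, the test tone, and the function names `add_awgn` and `measured_snr_db` are hypothetical. The JPEG compression of the visual stream is not shown.

```python
import math
import random


def add_awgn(signal, snr_db, rng=None):
    """Return a copy of `signal` with additive white Gaussian noise at `snr_db` dB SNR."""
    rng = rng or random.Random()
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR (dB) = 10 * log10(signal_power / noise_power), solved for noise_power.
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR in dB from a clean signal and its noisy version."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)


# Hypothetical test tone: one second of a 5 Hz sine sampled at 8 kHz.
clean = [math.sin(2.0 * math.pi * 5.0 * n / 8000.0) for n in range(8000)]
noisy = add_awgn(clean, snr_db=10.0, rng=random.Random(0))
```

Calling `measured_snr_db(clean, noisy)` should return a value close to the requested SNR, with a small deviation because the noise is a finite random sample.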

The automatic fusion of information from the different modalities based on this physiological model offers enormous benefit for speech identification, recognition and other applications.

This is a non-refereed conference abstract.