Sonification of images for the visually impaired using a multi-level approach

Michael Banf and Volker Blanz
This paper presents a system that strives to give visually impaired persons direct perceptual access to images via an acoustic signal. The user actively explores the image on a touch screen and receives auditory feedback about the image content at the current position. The design of such a system involves two major challenges: identifying the most useful and relevant image information, and capturing as much of that information as possible in an audio signal. We address both problems, and propose…
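As a toy illustration of the touch-screen idea, the sketch below maps the pixel under the user's finger to a short tone. The brightness-to-pitch and saturation-to-amplitude mapping is a hypothetical example for illustration, not the paper's actual multi-level model:

```python
import numpy as np

def sonify_pixel(image, x, y, f_min=220.0, f_max=880.0):
    """Map the pixel at the touch position to simple tone parameters.

    Hypothetical mapping (not the paper's model): brightness controls
    pitch, saturation controls amplitude.
    image: H x W x 3 float array with RGB values in [0, 1].
    Returns (frequency_hz, amplitude).
    """
    r, g, b = image[y, x]
    brightness = (r + g + b) / 3.0             # 0 = dark, 1 = bright
    saturation = max(r, g, b) - min(r, g, b)   # crude chroma estimate
    frequency = f_min + brightness * (f_max - f_min)
    amplitude = 0.2 + 0.8 * saturation
    return frequency, amplitude

def render_tone(frequency, amplitude, duration=0.1, sr=16000):
    """Render a short sine burst for the current touch position."""
    t = np.arange(int(duration * sr)) / sr
    return amplitude * np.sin(2 * np.pi * frequency * t)
```

In a real system the tone would be resynthesized continuously as the finger moves, so the user hears the image content change with position.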


A Hierarchical Visual Feature-Based Approach For Image Sonification

This paper presents a new image sonification system that strives to help visually impaired users access visual information via an easily decodable audio signal that is generated in real time when…

Man made structure detection and verification of object recognition in images for the visually impaired

This paper presents two learning-based algorithms designed to extract and process suitable information in images for the visually impaired, and strives to reject false object detections before sonification.

Exploring Eye-Tracking-Driven Sonification for the Visually Impaired

This work enhances the sonification approaches for color, text and facial expressions with eye tracking mechanisms and proposes an eye tracking system to allow the user to choose which elements of the field of view should be sonified.

PictureSensation – a mobile application to help the blind explore the visual world through touch and sound

PictureSensation introduces a swipe-gesture-based, speech-guided, barrier-free user interface to guarantee autonomous usage by a blind user, and implements a recently proposed exploration and audification principle, which harnesses exploration methods that the visually impaired are used to from everyday life.

Interactive web-based image and graph analysis using Sonification for the Blind

An interactive web application is developed for blind students in schools to help them grasp useful information from various images and graphs, and a functional prototype of the academic tool is created using modern technologies such as web development, object detection and sonification to enhance the visual comprehension of visually impaired students.

A Study of Multi-Sensory Experience and Color Recognition in Visual Arts Appreciation of People with Visual Impairment

Visually impaired visitors experience many limitations when visiting museum exhibits, such as a lack of cognitive and sensory access to exhibits or replicas. Contemporary art is evolving in the…

Facilitating Independence for Photo Taking and Browsing by Blind Persons

This dissertation research aims to facilitate independence for blind persons in locating and browsing photos in a sequential rather than global manner, through user-centered development of a smartphone application that can be used without sight.

Using non-speech sounds to increase web image accessibility for screen-reader users

It is suggested that non-speech sounds could substitute for or complement alternative text when describing images on the Web; they required lower mental and temporal demands and led to less effort, less frustration and better task performance.

Seeing the Movement through Sound: Giving Trajectory Information to Visually Impaired People

This paper presents a sonification model to convert object tracking information into sound in real time. The goal is to generate a sound that describes the information given by a trajectory, such as…

A qualitative study to support a blind photography mobile application

A mobile app is developed to help blind persons take pictures and recognize picture content using non-visual cues; it was tested with five legally and totally blind persons, with mostly positive results.

A Modular Computer Vision Sonification Model For The Visually Impaired

The model exploits techniques from Computer Vision and aims to convey as much information as possible about the image to the user, including color, edges and what the authors refer to as Orientation maps and Micro-Textures.

An experimental system for auditory image representations

Computerized sampling of the system output and subsequent calculation of the approximate inverse (sound-to-image) mapping provided the first convincing experimental evidence for the preservation of visual information in sound representations of complicated images.

EdgeSonic: image feature sonification for the visually impaired

Preliminary experiments show that the combination of local edge gradient sonification and distance-to-edge sonification is effective for understanding basic line drawings, and that proper user training brings a significant improvement in image understanding.


An experimental system called SoundView, which allows the exploration of a color image through touch and hearing, tries to achieve maximal alignment of the color and sound spaces by preserving the perceptual metrical and topological structure of color space, as well as by incorporating common associations between sound and color.

An outdoor navigation aid system for the visually impaired

Initial usability evaluation shows the feasibility and potential of AudioGuide, an outdoor navigation aid system developed on a PDA for visually impaired people.

Principles of Visual Information Retrieval

  • M. Lew, Advances in Pattern Recognition, 2001
This book is essential reading for researchers in VIR, and final-year undergraduate and postgraduate students on courses such as Multimedia Information Retrieval, Multimedia Databases, and others.

Bilateral filtering for gray and color images

In contrast with filters that operate on the three bands of a color image separately, a bilateral filter can enforce the perceptual metric underlying the CIE-Lab color space, and smooth colors and preserve edges in a way that is tuned to human perception.
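The edge-preserving smoothing described above can be sketched as follows. This is a minimal grayscale version (the cited filter also handles color, where it can enforce the CIE-Lab perceptual metric); `sigma_s`, `sigma_r` and `radius` are illustrative parameter names:

```python
import numpy as np

def bilateral_filter_gray(img, sigma_s=2.0, sigma_r=0.1, radius=3):
    """Minimal bilateral filter for a grayscale image with values in [0, 1].

    Each output pixel is a weighted average of its neighborhood, where the
    weight combines spatial closeness (sigma_s) with intensity similarity
    (sigma_r), so smoothing is suppressed across strong edges.
    """
    h, w = img.shape
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(xs**2 + ys**2) / (2 * sigma_s**2))  # fixed spatial kernel
    padded = np.pad(img, radius, mode='edge')
    out = np.empty_like(img)
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            # Range kernel: down-weight neighbors that differ in intensity.
            range_w = np.exp(-((patch - img[i, j])**2) / (2 * sigma_r**2))
            weights = spatial * range_w
            out[i, j] = (weights * patch).sum() / weights.sum()
    return out
```

With a small `sigma_r`, a sharp step edge survives the filtering almost unchanged, while low-contrast texture on either side is smoothed away.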

Visual categorization with bags of keypoints

This bag of keypoints method is based on vector quantization of affine invariant descriptors of image patches and shows that it is simple, computationally efficient and intrinsically invariant.
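A minimal sketch of the vector-quantization step, assuming a codebook of cluster centers has already been learned (e.g. with k-means over training descriptors); the function name and array shapes are illustrative:

```python
import numpy as np

def bag_of_keypoints(descriptors, codebook):
    """Quantize local descriptors against a visual codebook and return a
    normalized histogram (the "bag of keypoints") for one image.

    descriptors: N x D array of local feature descriptors from image patches.
    codebook:    K x D array of cluster centers (the visual vocabulary).
    The resulting K-bin histogram is the image representation fed to a
    classifier.
    """
    # Squared Euclidean distance from every descriptor to every codeword.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :])**2).sum(axis=2)
    nearest = d2.argmin(axis=1)  # vector quantization: index of closest codeword
    hist = np.bincount(nearest, minlength=len(codebook)).astype(float)
    return hist / hist.sum()     # normalize by the number of keypoints
```

Because the histogram discards descriptor positions, the representation inherits whatever invariances the patch descriptors themselves have.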

Discriminative random fields: a discriminative framework for contextual interaction in classification

  • Sanjiv Kumar, M. Hebert, Proceedings Ninth IEEE International Conference on Computer Vision, 2003
This work presents discriminative random fields (DRFs), a discriminative framework for the classification of image regions that incorporates neighborhood interactions in the labels as well as the observed data, and offers several advantages over the conventional Markov random field framework.