Adaptive Virtual Sound Imaging (2003-2007)
Advances in
computer technology and low-cost cameras open up new possibilities
for three-dimensional (3D) sound reproduction. The problem is to
update the audio signal processing scheme for a moving listener, so
that the listener perceives only the intended virtual sound image.
Background
Binaural
technology is often used for the reproduction of virtual sound
images. The principle of binaural technology is to control the sound
field at the listener's ears so that the reproduced sound field
coincides with the desired real sound field. For the implementation
of binaural technology over loudspeakers, it is necessary to cancel
the cross-talk that prevents a signal meant for one ear from being
heard at the other. However, such cross-talk cancellation, normally
realized by time-invariant filters, works only for a specific
listening location and the sound field can only be controlled in a
limited area referred to as the "sweet-spot". If the listener
moves away from the optimal listening location, the inverse filters
must be updated so that the sweet-spot is steered to
the listener's new location. The issues related to filter updates
have been investigated intensively in this work.
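The sweet-spot limitation described above can be illustrated with a minimal sketch. It assumes a two-loudspeaker, two-ear system in a free field with point sources and no head shadowing, which is a simplification and not necessarily the model used in this project. The function names, the ear spacing of 0.15 m and the speed of sound are illustrative assumptions. The inverse filters are computed at a single frequency for a design position, and the residual cross-talk is then evaluated at the position the listener actually occupies.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value


def plant_matrix(freq_hz, head_x, speakers, ear_spacing=0.15):
    """Free-field transfer matrix C[ear][speaker] at one frequency.

    Assumption: point sources, no head shadowing, ears on the x axis
    centred on head_x (hypothetical simplified geometry).
    """
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND
    ears = [(head_x - ear_spacing / 2.0, 0.0), (head_x + ear_spacing / 2.0, 0.0)]
    matrix = []
    for ex, ey in ears:
        row = []
        for sx, sy in speakers:
            r = math.hypot(sx - ex, sy - ey)
            row.append(cmath.exp(-1j * k * r) / r)  # delay and spherical spreading
        matrix.append(row)
    return matrix


def inverse_2x2(m):
    """Exact inverse of a 2x2 complex matrix (the cross-talk canceller)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul_2x2(p, q):
    return [[sum(p[i][n] * q[n][j] for n in range(2)) for j in range(2)]
            for i in range(2)]


def crosstalk_ratio(freq_hz, design_x, actual_x, speakers):
    """Cross-talk to direct-path magnitude ratio at the left ear.

    The filters are designed for design_x but heard at actual_x.
    """
    filters = inverse_2x2(plant_matrix(freq_hz, design_x, speakers))
    result = matmul_2x2(plant_matrix(freq_hz, actual_x, speakers), filters)
    return abs(result[0][1]) / abs(result[0][0])
```

With loudspeakers at, say, (-0.2, 1.5) and (0.2, 1.5) metres, the ratio is essentially zero at the design position and grows quickly once `actual_x` differs from `design_x`, which is why the filters have to follow the listener.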
The aim of
this project is to find a way to improve filter update techniques as
well as to determine the filter update rate necessary to stabilize
an acoustic image regardless of listener movement. This work is
based on the assumption that the location of the listener is known
from a visual head tracking device.
The
effectiveness of cross-talk cancellation depends on the geometry of
the system and in theory each frequency band can be reproduced from
a loudspeaker pair with an optimal source span. Therefore the
concept of Frequency Distributed Loudspeakers (FDLs) has been
studied, and the idea is to reproduce each frequency from an optimal
source angle within a given listening area.
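As a rough illustration of why each frequency has its own optimal source span, consider a simplified far-field, free-field model (an assumption made here for the sketch, not a statement of the model used in the project). For loudspeakers at angles of plus and minus theta from the median plane and an ear spacing dr, the normalised transfer matrix is [[1, z], [z, 1]] with z = exp(-j k dr sin(theta)). Its singular values are |1 + z| and |1 - z|, which are equal, giving the best conditioning, when k dr sin(theta) = pi/2. The function name and default values below are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value


def optimal_source_span_deg(freq_hz, ear_spacing=0.15):
    """Full loudspeaker span (2 * theta) in degrees that satisfies
    k * ear_spacing * sin(theta) = pi / 2 in the simplified model.

    Returns None when no real angle satisfies the condition, which
    happens at low frequencies where sin(theta) would exceed 1.
    """
    sin_theta = SPEED_OF_SOUND / (4.0 * freq_hz * ear_spacing)
    if sin_theta > 1.0:
        return None
    return 2.0 * math.degrees(math.asin(sin_theta))
```

In this model the span narrows as frequency rises, so high frequencies are best reproduced from closely spaced sources and lower frequencies from widely spaced ones.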
The region
within which the listener can move while the filters are updated is
described by the concept of the "operational area": the region over
which the "sweet-spot" can be steered by an adaptive virtual sound
system. The extent of the operational area depends on the performance
criteria and is investigated thoroughly.
The
relatively small "sweet-spot" of a static virtual sound imaging
system creates a strong demand for an effective head tracking
algorithm within the field of virtual sound. Adding a video camera
to the audio system makes it possible to track
head movements and update the inverse filters accordingly. The
increasing interest in visual tracking is due in part to the falling
cost of computing power, video cameras and memory. A sequence of
images grabbed at or near video rate typically does not change
radically from frame to frame, and this redundancy of information
over multiple images can be extremely helpful for analysing the
input, in order to track individual objects.
Operational area
The performance of the binaural audio signal processing scheme is
limited by the condition number of the associated inversion problem.
The condition number as a function of frequency for different
listener positions and rotations is examined using an analytical
model. The resulting size of the "operational area" with listener
head tracking is illustrated for different geometries of loudspeaker
configurations together with related cross-over design techniques.
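The following sketch shows how a condition number can be evaluated as a function of frequency, listener position and head rotation. It uses a simple free-field point-source model, which is an assumption made here and may differ from the analytical model used in the project. All names and default values are illustrative. The singular values of the 2x2 matrix are obtained in closed form from the trace and determinant of C^H C.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value


def plant_matrix(freq_hz, head_xy, yaw_rad, speakers, ear_spacing=0.15):
    """Free-field transfer matrix C[ear][speaker] for a translated, rotated head.

    Assumption: point sources and no head shadowing (hypothetical model).
    """
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND
    hx, hy = head_xy
    ax = 0.5 * ear_spacing * math.cos(yaw_rad)
    ay = 0.5 * ear_spacing * math.sin(yaw_rad)
    ears = [(hx - ax, hy - ay), (hx + ax, hy + ay)]
    matrix = []
    for ex, ey in ears:
        row = []
        for sx, sy in speakers:
            r = math.hypot(sx - ex, sy - ey)
            row.append(cmath.exp(-1j * k * r) / r)
        matrix.append(row)
    return matrix


def condition_number(m):
    """2-norm condition number of a 2x2 complex matrix."""
    (a, b), (c, d) = m
    trace = abs(a) ** 2 + abs(b) ** 2 + abs(c) ** 2 + abs(d) ** 2  # tr(C^H C)
    det_sq = abs(a * d - b * c) ** 2                               # det(C^H C)
    disc = math.sqrt(max(trace * trace - 4.0 * det_sq, 0.0))
    s_max_sq = (trace + disc) / 2.0
    s_min_sq = det_sq / s_max_sq  # avoids cancellation in (trace - disc) / 2
    if s_min_sq <= 0.0:
        return math.inf
    return math.sqrt(s_max_sq / s_min_sq)


def condition_vs_frequency(freqs_hz, head_xy, yaw_rad, speakers):
    """Condition number at each frequency for one listener pose."""
    return [condition_number(plant_matrix(f, head_xy, yaw_rad, speakers))
            for f in freqs_hz]
```

Sweeping `head_xy` and `yaw_rad` over a grid and thresholding the result against a chosen performance criterion is one possible way to map out an operational area in this simplified setting.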
[Figure: Adaptive and static cross-over frequencies]
HRTF database
The measurement of arguably the most comprehensive KEMAR database of
head-related transfer functions (HRTFs) yet available is presented. A
complete database of head-related transfer functions measured
without the pinna is also presented.
Visual tracking
The update
of the audio signal processing scheme is initiated by a visual
tracking system that performs head tracking without the need for the
listener to wear any sensors.
[Figure: HRTF database measurement rig]
[Figures: Colour tracker; Contour tracker]
Filter update techniques
The problem of updating the filters without any audible change is
solved by using either a very fine mesh for the inverse filters or
commutation techniques. The filter update techniques are evaluated in
subjective experiments and have proven to be effective both in an
anechoic chamber and in a listening room, which supports the
implementation of virtual sound imaging systems under realistic
conditions.
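The source does not spell out how the filters are switched, so the sketch below shows only one generic way to change filters without an audible click: the signal is run through both the old and the new FIR filter, and the two outputs are linearly cross-faded over a short interval. Whether this corresponds to the commutation techniques used in the project is an assumption. The function names and parameters are hypothetical.

```python
def fir_filter(signal, taps):
    """Direct-form FIR filtering with zero initial state."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out


def crossfade_update(signal, old_taps, new_taps, fade_start, fade_len):
    """Switch from old_taps to new_taps by linearly cross-fading the outputs.

    Assumption: both filters run in parallel during the fade, which doubles
    the filtering cost for that interval.
    """
    y_old = fir_filter(signal, old_taps)
    y_new = fir_filter(signal, new_taps)
    out = []
    for n in range(len(signal)):
        # Fade weight rises from 0 to 1 between fade_start and fade_start + fade_len.
        w = min(max((n - fade_start) / fade_len, 0.0), 1.0)
        out.append((1.0 - w) * y_old[n] + w * y_new[n])
    return out
```

A longer `fade_len` gives a smoother transition but slows the response to listener movement, which is one reason the filter update rate and the update technique need to be considered together.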
Integrated virtual sound imaging system
The design and implementation of a
visually adaptive Virtual Sound Imaging (VSI) system is carried out. The
system is evaluated with respect to filter update rates and cross-talk
cancellation effectiveness.
[Figure: VSI System]