Projects


Here's a list of projects I've worked on in reverse chronological order (most recent projects listed first). The project descriptions are taken from paper abstracts.

Papers describing these projects can be found in my list of publications.


Head-tracked 3-D audio using loudspeakers


[Images: front view, side view]

A loudspeaker-based, head-tracked, 3-D virtual acoustic display. This is the topic of my recently completed dissertation.

Abstract

3-D audio systems, which can surround a listener with sounds at arbitrary locations, are an important part of immersive interfaces. A new approach is presented for implementing 3-D audio using a pair of conventional loudspeakers. The new idea is to use the tracked position of the listener's head to optimize the acoustical presentation, and thus produce a much more realistic illusion over a larger listening area than existing loudspeaker 3-D audio systems. By using a remote head tracker, for instance based on computer vision, an immersive audio environment can be created without donning headphones or other equipment.

The general approach to a 3-D audio system is to reconstruct the acoustic pressures at the listener's ears that would result from the natural listening situation to be simulated. To accomplish this using loudspeakers requires that first, the ear signals corresponding to the target scene are synthesized by appropriately encoding directional cues, a process known as "binaural synthesis," and second, these signals are delivered to the listener by inverting the transmission paths that exist from the speakers to the listener, a process known as "crosstalk cancellation." Existing crosstalk cancellation systems only function at a fixed listening location; when the listener moves away from the equalization zone, the 3-D illusion is lost. Steering the equalization zone to the tracked listener preserves the 3-D illusion over a large listening volume, thus simulating a reconstructed soundfield, and also provides dynamic localization cues by maintaining stationary external sound sources during head motion.
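As a rough illustration of the crosstalk cancellation step described above, the sketch below inverts a 2x2 matrix of speaker-to-ear transfer functions at a single frequency. The symmetric free-field model (a unit ipsilateral path and an attenuated, delayed contralateral path) and the names `transfer_matrix`, `contra_gain`, and `contra_delay_s` are assumptions made for illustration, not the model used in the dissertation. In a head-tracked system these parameters would presumably be recomputed as the listener moves.

```python
import cmath
import math


def transfer_matrix(freq_hz, contra_gain=0.7, contra_delay_s=0.0002):
    """Speaker-to-ear transfer matrix H at one frequency (assumed toy model).

    Rows are ears (left, right); columns are speakers (left, right).
    """
    w = 2.0 * math.pi * freq_hz
    h_ipsi = 1.0 + 0.0j  # same-side path, taken as the reference
    h_contra = contra_gain * cmath.exp(-1j * w * contra_delay_s)  # crosstalk path
    return [[h_ipsi, h_contra], [h_contra, h_ipsi]]


def invert_2x2(m):
    """Inverse of a 2x2 complex matrix; this is the crosstalk canceller C."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul_2x2(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


h = transfer_matrix(1000.0)
c = invert_2x2(h)
# Ear signals = H * C * binaural signals, so H * C should be the identity.
print(matmul_2x2(h, c))
```

Because the contralateral gain is assumed to be below 1, the determinant never reaches zero and the inverse exists at every frequency in this toy model.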

The dissertation discusses the theory, implementation, and testing of a head-tracked loudspeaker 3-D audio system. Crosstalk cancellers that can be steered to the location of a tracked listener are described. The objective performance of these systems has been evaluated using simulations and acoustical measurements made at the ears of human subjects. Many sound localization experiments were also conducted; the results show that head tracking significantly improves localization when the listener is displaced from the ideal listening location, and that it also enables dynamic localization cues.

For more information:

Gardner, W. G. (1997). 3-D Audio Using Loudspeakers. Ph.D. Thesis, MIT Media Lab.

The dissertation has been published by Kluwer Academic Publishers:

Gardner, W. G. (1998). 3-D Audio Using Loudspeakers. Kluwer Academic Publishers, Norwell, MA. ISBN 0-7923-8156-4.

Gardner, W. G. (1997). Head-Tracked 3-D Audio Using Loudspeakers. Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY.

For information on visual head tracking at the Media Lab, see Sumit Basu's work on model-based head tracking and Nuria Oliver's work on LAFTER: Lips and Face Real-Time Tracker.


Spatial audio synthesis


A headphone-based, 3-D auditory display using HRTFs measured from a KEMAR.

Abstract

An extensive set of head-related transfer function (HRTF) measurements of a Knowles Electronic Mannequin for Acoustic Research (KEMAR) has been made. The measurements consist of the left and right ear impulse responses from a Realistic Optimus Pro 7 loudspeaker mounted 1.4 meters from the KEMAR. Maximum length (ML) pseudo-random binary sequences were used to obtain the impulse responses at a sampling rate of 44.1 kHz. In total, 710 different positions were sampled at elevations from -40 degrees to +90 degrees.
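The maximum length sequence technique mentioned above can be illustrated on a toy scale. The sketch below generates a 15-sample sequence from a 4-bit shift register, passes it through a short made-up "system", and recovers the impulse response by circular cross-correlation. The sequence order, the helper names, and the example response are assumptions chosen for brevity; the actual measurements would have used far longer sequences.

```python
def mls_order4():
    """15-sample maximum length sequence as +/-1 values.

    Uses the recurrence a[n] = a[n-3] XOR a[n-4], which has period 15.
    """
    bits = [1, 0, 0, 0]
    while len(bits) < 15:
        bits.append(bits[-3] ^ bits[-4])
    return [1.0 if b else -1.0 for b in bits]


def circular_convolve(x, h):
    """Periodic response of a system with impulse response h to x."""
    n = len(x)
    return [sum(h[m] * x[(i - m) % n] for m in range(len(h)))
            for i in range(n)]


def circular_correlate(x, y):
    """Circular cross-correlation r[k] = sum_i x[i] * y[i + k]."""
    n = len(x)
    return [sum(x[i] * y[(i + k) % n] for i in range(n)) for k in range(n)]


def recover_impulse_response(x, y):
    """Estimate h from the excitation x and the measured response y.

    The autocorrelation of x is n at lag 0 and -1 elsewhere, so
    r[k] = (n + 1) * h[k] - sum(h), and sum(r) = sum(h).
    """
    n = len(x)
    r = circular_correlate(x, y)
    total = sum(r)
    return [(rk + total) / (n + 1) for rk in r]


x = mls_order4()
y = circular_convolve(x, [1.0, 0.5, -0.25])  # made-up system response
print(recover_impulse_response(x, y)[:4])
```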

This data has been used to implement a realtime 3-D spatialization system which runs on an SGI Indigo computer. The system allows a single monophonic source to be positioned arbitrarily around the head of a listener wearing headphones. The system uses two 128-point convolvers, one each for the left and right channels, and runs at a 32 kHz sampling rate. Control of source elevation, azimuth, and distance is achieved using a MIDI controller.
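The core operation of such a spatializer, convolving one mono signal with a left and a right impulse response, might be sketched as follows. The 128-tap responses here are placeholders (a unit impulse for the near ear, a delayed and attenuated impulse for the far ear), not measured KEMAR data, and the function names are hypothetical.

```python
def convolve(signal, ir):
    """Direct-form FIR convolution of a signal with an impulse response."""
    out = [0.0] * (len(signal) + len(ir) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(ir):
            out[n + k] += x * h
    return out


def spatialize(mono, hrir_left, hrir_right):
    """Return (left, right) headphone signals for one mono source."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


TAPS = 128

# Placeholder responses for a source on the listener's left: the right ear
# hears the sound 20 samples later (about 0.6 ms at 32 kHz) and quieter.
hrir_left = [1.0] + [0.0] * (TAPS - 1)
hrir_right = [0.0] * 20 + [0.5] + [0.0] * (TAPS - 21)

left, right = spatialize([1.0, -1.0, 0.5], hrir_left, hrir_right)
print(left[:3], right[20:23])
```

A realtime system would process audio in small buffers and switch or interpolate between responses as the source moves; this sketch handles a single fixed position only.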

For more information:

Gardner, W. G., and Martin, K. D. (1995). HRTF measurements of a KEMAR. J. Acoust. Soc. Am. 97 (6), pp. 3907-3908.

Gardner, W. G., and Martin, K. D. (1994). HRTF measurements of a KEMAR dummy head microphone. MIT Media Lab Perceptual Computing Technical Report #280. Included on the CD-ROM "Standards in Computer Generated Music", Goffredo Haus and Isabella Pighi, eds., published by the IEEE CS Technical Committee on Computer Generated Music, 1996.

The HRTF data, 3-D spatialization software, and related information can be accessed via the KEMAR HRTF page.


Reverberation algorithms

A study of reverberation algorithms resulting in a book chapter on the subject.

Abstract

This chapter discusses reverberation algorithms, with emphasis on algorithms that can be implemented for realtime performance. The chapter begins with a concise framework describing the physics and perception of reverberation. This includes a discussion of geometrical, modal, and statistical models for reverberation, the perceptual effects of reverberation, and subjective and objective measures of reverberation. Algorithms for simulating early reverberation are discussed first, followed by a discussion of algorithms that simulate late, diffuse reverberation. This latter material is presented in chronological order, starting with reverberators based on comb and allpass filters, then discussing allpass feedback loops, and proceeding to recent designs based on inserting absorptive losses into a lossless prototype implemented using feedback delay networks or digital waveguide networks.
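The comb and allpass filters that the earliest reverberators are built from can be written in a few lines. This is a minimal sketch of the standard textbook forms, not code from the chapter; the delay length and gain are arbitrary example values.

```python
def comb(x, delay, g):
    """Feedback comb filter: y[n] = x[n] + g * y[n - delay]."""
    y = []
    for n, xn in enumerate(x):
        y_d = y[n - delay] if n >= delay else 0.0
        y.append(xn + g * y_d)
    return y


def allpass(x, delay, g):
    """Allpass filter: y[n] = -g * x[n] + x[n - delay] + g * y[n - delay]."""
    y = []
    for n, xn in enumerate(x):
        x_d = x[n - delay] if n >= delay else 0.0
        y_d = y[n - delay] if n >= delay else 0.0
        y.append(-g * xn + x_d + g * y_d)
    return y


impulse = [1.0] + [0.0] * 399

# The comb produces echoes decaying by a factor of g every `delay` samples.
echoes = comb(impulse, delay=5, g=0.7)
print(echoes[0], echoes[5], echoes[10])

# The allpass also produces a decaying echo train, but its magnitude response
# is flat, so the impulse response carries unit energy overall.
ap = allpass(impulse, delay=5, g=0.7)
print(sum(v * v for v in ap))
```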

For more information:

Gardner, W. G. (1998). "Reverberation algorithms", in Applications of Digital Signal Processing to Audio and Acoustics, ed. M. Kahrs, K. Brandenburg, Kluwer Academic Publishers, Norwell, MA.

This book is now available; for more information, see the publisher's page or the editor's page for the book.


Efficient convolution without latency

A block convolution algorithm that runs without latency by partitioning the filter response into blocks of exponentially increasing size.

Abstract

A block FFT implementation of convolution is vastly more efficient than the direct form FIR filter. Unfortunately, block processing incurs significant input/output delay, which is undesirable for realtime applications. A hybrid convolution method is proposed which combines direct form and block FFT processing. The result is a zero delay convolver that is significantly more efficient than direct form methods.
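The partitioning idea can be sketched as follows. The layout used here (a head of size n, followed by blocks of size n, 2n, 4n, and so on, each starting at an offset at least as large as its own size) is one simple scheme assumed for illustration; the partitioning in the paper may differ. To stay self-contained, the sketch uses direct convolution for every block, where a real implementation would use block FFT convolution for all but the head, and it does not model realtime scheduling.

```python
def convolve(signal, ir):
    """Direct-form convolution (stands in for block FFT convolution)."""
    out = [0.0] * (len(signal) + len(ir) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(ir):
            out[n + k] += x * h
    return out


def partition_plan(filter_len, n):
    """Return (offset, size) pairs covering the filter response.

    The head block (offset 0) must run in direct form to avoid delay.
    Every later block starts at an offset >= its size, so a block
    convolver of that size can deliver its output in time.
    """
    plan = [(0, min(n, filter_len))]
    offset, size = n, n
    while offset < filter_len:
        plan.append((offset, min(size, filter_len - offset)))
        offset += size
        size = offset  # the next block may be as long as its start offset
    return plan


def partitioned_convolve(x, h, n):
    """Sum the per-block convolutions, each delayed by its block offset."""
    out = [0.0] * (len(x) + len(h) - 1)
    for offset, size in partition_plan(len(h), n):
        part = convolve(x, h[offset:offset + size])
        for i, v in enumerate(part):
            out[offset + i] += v
    return out


print(partition_plan(20, 2))
```

The sum of the delayed block outputs equals the ordinary convolution with the full response, which is why the partitioning changes the cost and latency but not the result.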

For more information:

Gardner, W. G. (1995). Efficient convolution without input-output delay. J. Audio Eng. Soc. 43 (3), 127-136.

This paper won the 1997 AES Publications Award for the outstanding paper published in the Journal of the Audio Engineering Society during the two preceding years.

Gardner, W. G. (1994). Efficient convolution without input-output delay. Presented at the 97th convention of the Audio Engineering Society, San Francisco. Preprint 3897.


Perception of reverberation

Reverberation level matching experiments that determine how the loudness of running reverberation depends on the source signal and the reverberant response.

Abstract

Several psychoacoustics experiments are being conducted to investigate the audibility of reverberation in music. Anechoic musical signals are processed by two electronic reverberators, one with a fixed reverberation level, the other with an adjustable reverberation level control. The two reverberator outputs are selected with an A/B switch and presented to a subject via headphones. Subjects are instructed to adjust the level control to make the two stimuli sound equally reverberant.

One set of experiments matches tone sequences with different on/off duty cycles using the same reverberation time and decay shape. The results indicate that the perception of reverberation level is highly dependent on the quiet gaps present in the signal, as predicted by loudness masking. Dependence on melody is also significant, but less easily explained.

Another set of experiments matches different reverberation times and decay shapes using the same musical input signal. At a reference reverberation level of -20 dB, a typical solo piece may require 10 dB more reverberation at RT = 0.5 sec to sound as reverberant as the same piece at RT = 2.0 sec.

For more information:

Gardner, W. G., and Griesinger, D. (1994). Reverberation level matching experiments. Proc. of the Sabine Centennial Symposium, Acoust. Soc. of Am., pp. 263-266.


The virtual acoustic room


A surround sound system that simulates room acoustics. This was the topic of my Master's thesis.

Abstract

A room simulator has been developed as part of a project involving virtual acoustic environments. The system is similar to auditorium simulators for home use. The simulated reverberant field is rendered using four to six loudspeakers evenly spaced around the perimeter of a listening area. Listeners are not constrained to any particular orientation, although best results are obtained near the center of the space. The simulation is driven from a simple description of the desired room and the location of the sound source. The system accepts monophonic input sound and renders the simulated reverberant field in realtime. Early echo generation is based on the source image model, which determines a finite impulse response filter per output channel. Diffuse reverberant field generation is accomplished using infinite impulse response reverberators based on nested and cascaded allpass filters. The system is implemented using Motorola 56001 digital signal processors, one per output channel.
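The source image model for early echoes can be illustrated with first-order reflections in a rectangular room. This is a simplified sketch under stated assumptions: a box-shaped room with one corner at the origin, a single frequency-independent `reflection_gain` for all walls, 1/distance attenuation, and one set of FIR taps rather than one filter per output channel. All names and values are hypothetical and not taken from the thesis.

```python
import math

SPEED_OF_SOUND = 343.0  # meters per second, assumed


def first_order_images(source, room):
    """Mirror the source across each of the six walls of a box-shaped room."""
    sx, sy, sz = source
    lx, ly, lz = room
    return [
        (-sx, sy, sz), (2 * lx - sx, sy, sz),
        (sx, -sy, sz), (sx, 2 * ly - sy, sz),
        (sx, sy, -sz), (sx, sy, 2 * lz - sz),
    ]


def echo_taps(source, listener, room, sample_rate, reflection_gain=0.8):
    """Return (delay_in_samples, gain) FIR taps: direct path, then six echoes."""
    taps = []
    positions = [source] + first_order_images(source, room)
    for index, pos in enumerate(positions):
        dist = math.dist(pos, listener)
        delay = round(dist / SPEED_OF_SOUND * sample_rate)
        wall_loss = 1.0 if index == 0 else reflection_gain
        taps.append((delay, wall_loss / dist))
    return taps


# A 6 x 4 x 3 meter room with the source 3 meters from the listener.
for delay, gain in echo_taps((2.0, 3.0, 1.5), (5.0, 3.0, 1.5),
                             (6.0, 4.0, 3.0), 32000):
    print(delay, round(gain, 4))
```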

For more information:

Gardner, W. G. (1992). A realtime multichannel room simulator. J. Acoust. Soc. Am., 92 (A), 2395.

Gardner, W. G. (1992). The virtual acoustic room. Master's thesis, MIT Media Lab.

The virtual acoustic room was implemented using the Reverb application described below.


Reverb application for the Macintosh

A reverberator design tool.

Reverb is a Macintosh application that allows the user to construct stereo delays, echo effects, chorusers, and reverberators using a simple programming language. The resulting audio effects can be applied to soundfiles, or can be run in realtime on a Digidesign Audiomedia card. When running in realtime, effects parameters can be adjusted via external MIDI control.

For more information:

Gardner, W. G. (1992). Reverb: a reverberator design tool for Audiomedia. Proc. Int. Comp. Music Conf., San Jose, CA.

The program is available via FTP, and comes with a manual and sample effects programs. The program is very out of date at this point, though it is still functional under certain conditions. Be sure to read the release notes accompanying the program.



Bill Gardner <billg@media.mit.edu>
Wave Arts, Inc.
99 Massachusetts Avenue
Arlington, MA 02474
Tel/Fax: 781-646-3794