Auditory contours and Gestalt rules for sound analysis

Yoonseob Lim (Boston University), Barbara Shinn-Cunningham (Boston University), Timothy Gardner (Boston University)

Biological auditory systems have evolved to extract meaningful signals in complex acoustic environments, where engineering solutions often fail. The principles underlying this performance are not known, but a closer examination of neural auditory systems could inspire new paradigms. One distinguishing feature of the biological solution is the existence of parallel processing pathways from the cochlea to the cortex. Theories are needed to explain how this diverse set of channels could be optimally combined at higher levels of processing. Here we define a method for sound analysis that builds an efficient high-level representation from parallel early streams. In the proposed method, redundant pathways represent the sound at a range of time-scales. Through these streams, the sound waveform is converted into an over-complete set of auditory contours: smooth curves in the time-frequency plane. From this collection of contours, Gestalt principles that apply to human visual and auditory perception guide the selection of the simplest, most compact representation for a given sound. As a result of this analysis, components of complex sounds are portrayed as coherent shapes extending over long ranges in time and frequency, allowing for a variety of new forms of sound transformation and identification.
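
The idea of representing one sound at several time-scales in parallel can be illustrated with a short sketch. This is not the authors' implementation: the Gaussian window, the window widths, and the names `gaussian_stft` and `multiscale_spectrograms` are assumptions chosen for illustration, and the contour extraction and Gestalt-based selection steps are not shown. The sketch only computes one spectrogram per time-scale, which is the kind of redundant early representation from which contours could later be drawn.

```python
import numpy as np


def gaussian_stft(signal, sample_rate, sigma_ms, hop=64, n_fft=1024):
    """Magnitude spectrogram with a Gaussian window (hypothetical helper).

    sigma_ms sets the window's standard deviation in milliseconds, i.e. the
    analysis time-scale. Returns an array of shape (freq_bins, frames).
    """
    sigma = sigma_ms * 1e-3 * sample_rate        # std. dev. in samples
    t = np.arange(n_fft) - (n_fft - 1) / 2.0     # sample index, centered
    window = np.exp(-0.5 * (t / sigma) ** 2)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop: i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)).T


def multiscale_spectrograms(signal, sample_rate, sigmas_ms=(0.5, 1.5, 4.0)):
    """One spectrogram per time-scale; the widths are assumed values."""
    return {s: gaussian_stft(signal, sample_rate, s) for s in sigmas_ms}


if __name__ == "__main__":
    sr = 44100
    t = np.arange(sr // 4) / sr
    tone = np.sin(2 * np.pi * 3000 * t)          # tonal component
    click = np.zeros_like(t)
    click[len(t) // 2] = 1.0                     # percussive component
    for sigma, spec in multiscale_spectrograms(tone + click, sr).items():
        print(f"sigma = {sigma} ms -> spectrogram shape {spec.shape}")
```

In such a bank, short windows localize the click sharply in time, while long windows resolve the tone sharply in frequency. That trade-off is why no single time-scale serves every component of a sound equally well.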

a. Multi-band analysis of a zebra finch syllable. Contour representations are calculated for different time-scales; only the simplest contours are preserved from each time-scale. (Low-power contours are also removed.) Percussive elements are captured in the short time-scale analysis, while more harmonic and tonal structure is represented in the long time-scale analysis. Combining “good” contours from different bands yields a sound that is perceptually similar to the original sound (compare the sonogram of the resynthesis with the original sonogram in (b)).
b. Sonogram of the original zebra finch syllable.
Preferred presentation format: Poster
Topic: Computational neuroscience
