ABRL Ongoing Projects

Contribution of Cochlear Mechanics to Informational Masking

Informational masking has generally been considered a central process, but there is some evidence indicating a possible contribution from the auditory periphery (e.g., the cochlea).  Our early studies indicate that listeners with high-frequency sensorineural hearing loss make greater use of high-frequency information than low-frequency information, even though this strategy is nonoptimal; normal-hearing listeners show no such trend.  Moreover, how listeners attend to the information contained in the individual frequencies of suprathreshold sounds varies across individuals yet is remarkably consistent over time.  The contribution of cochlear mechanics to informational masking is being explored using otoacoustic emissions (OAEs).  This project is a collaboration with Glenis Long at the Graduate Center of CUNY.

For more information, see Glenis Long's site.

Computational Auditory Scene Analysis

Current computational models of sound source identification fall far short of the human capacity for identification. A recent advance in sparse signal encoding suggests a means of significantly improving the performance of these models.

Computational models can play an important role in helping to understand our ability to identify everyday objects and events from sound (Bregman, 1990). The traditional approach to modeling has relied on the extraction of structured features and Gestalt schema for identifying sound sources in a mixture (Ellis, 1996; Martin, 1999). These models require high information rates and much prior knowledge of signals for their performance, yet they still fall short of the human capacity for identification. 

A recent and significant advance in sparse signal encoding suggests a means of improving the performance of these models. Compressed sensing (CS; Donoho, 2006) replaces the extraction of knowledge-based features with the projection of signals onto a small set of incoherent basis functions. For sparse signals, the result is accurate identification from few samples and with little prior information about the signals.
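
As a minimal illustration of the CS idea (not our identification model itself), the Python sketch below recovers a sparse signal from a handful of random projections using orthogonal matching pursuit; the dimensions, sparsity level, and choice of recovery algorithm are illustrative assumptions.

    # Minimal compressed-sensing sketch: a k-sparse signal is recovered
    # from m << n random projections via orthogonal matching pursuit (OMP).
    # All parameter values here are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(0)
    n, m, k = 256, 64, 5               # signal length, measurements, sparsity

    x = np.zeros(n)                    # k-sparse signal
    x[rng.choice(n, k, replace=False)] = rng.standard_normal(k)

    A = rng.standard_normal((m, n)) / np.sqrt(m)   # incoherent (random) basis
    y = A @ x                          # m << n linear measurements

    def omp(A, y, k):
        """Greedy sparse recovery: pick the column most correlated with the
        residual, then re-fit by least squares on the chosen support."""
        support, residual = [], y.copy()
        for _ in range(k):
            support.append(int(np.argmax(np.abs(A.T @ residual))))
            coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
            residual = y - A[:, support] @ coef
        x_hat = np.zeros(A.shape[1])
        x_hat[support] = coef
        return x_hat

    x_hat = omp(A, y, k)
    print("recovery error:", np.linalg.norm(x - x_hat))   # ~0 for sparse x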

Our goal in this project is to determine whether CS can be included as an early stage of encoding in traditional models to substantially reduce the information rate required by these models to approach the identification performance of human listeners.

[Work Supported by NIDCD Grant #R01 DC06875] 

Auditory Pattern Analysis

Information in sound is conveyed by patterns of variation across frequency and time. We have developed a computational model (the Component Relative-Entropy Model) that in many cases accurately predicts how listeners detect both lawful and random patterns of acoustic variation such as occur in nature.

Our remarkable ability to process information in sound is demonstrated every day as we make sense of the complex and continuous patterns of variation in the acoustic signals we encounter. The purpose of this project is to achieve a better understanding of this ability through a formal analysis of listeners' capacity to discriminate variable acoustic patterns made up of tones.

There are three key elements of our approach. First, all efforts are linked by a single mathematical-methodological framework where the information in a pattern is given precise meaning and listener performance is evaluated relative to a common theoretical standard. Second, the relative extent to which listeners make use of (weight) different sources of information within patterns is determined from trial-by-trial analyses of the data. Third, specific hypotheses regarding the outcome of experiments are generated based on known nonlinear transformations performed at the auditory periphery and a decision model that has made accurate predictions for the results of many past studies [R.A. Lutfi, J. Acoust. Soc. Am. 94, 748-758 (1993)].
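
To illustrate the second element, the sketch below shows one generic form of trial-by-trial weighting analysis under assumed parameter values: a simulated listener's binary responses are regressed on per-component level perturbations, and the fitted coefficients estimate the relative weight given each component. It is a sketch of the logic only, not the CoRE model itself.

    # Illustrative trial-by-trial weight analysis (assumed parameter values):
    # each component's level is perturbed independently on every trial, and
    # the listener's binary responses are regressed on those perturbations.
    import numpy as np

    rng = np.random.default_rng(1)
    n_trials, n_components = 2000, 6

    # Per-trial level perturbations (dB) added to each frequency component.
    perturb = rng.normal(0.0, 2.0, size=(n_trials, n_components))

    # Simulated listener: weights components unequally, plus internal noise.
    true_w = np.array([0.05, 0.1, 0.15, 0.2, 0.4, 0.1])
    decision_var = perturb @ true_w + rng.normal(0.0, 0.5, n_trials)
    response = (decision_var > 0).astype(float)   # 1 = "signal", 0 = "noise"

    # Logistic regression fit by gradient ascent on the log-likelihood.
    w = np.zeros(n_components)
    for _ in range(2000):
        p = 1.0 / (1.0 + np.exp(-(perturb @ w)))
        w += 0.5 * perturb.T @ (response - p) / n_trials

    rel_w = w / np.abs(w).sum()        # normalized (relative) weights
    print(np.round(rel_w, 2))          # approaches true_w / sum(true_w)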

The results of the proposed studies are intended to further our understanding of how natural redundancies in patterns aid detection in noisy backgrounds, and how listeners process invariant relations among components that define dynamic properties of patterns like those of speech and other meaningful sounds.

[Work Supported by NIDCD Grant #R01 DC01262] 

Human Sound Source Identification

Our ability to identify simple objects and events from sound is crucial for normal function in the world. Understanding the normal processes underlying this ability is key to the development of effective technologies for dealing with the impact of dysfunctional hearing on everyday listening.

What information in a sound allows us to identify its source? If the physical properties of the source (and the force driving it) are known, then theoretically it is possible to determine the sound it will produce. The inverse problem is rarely as simple: if one is uncertain regarding the source, then recovering source properties analytically from the sound can prove quite difficult. The human auditory system excels at this task, even in the face of enormous uncertainty regarding possible sources.

How it does this remains largely a mystery. In this project we take a novel approach to investigating human sound source identification. Using the principles of theoretical acoustics, we approximate the sound pressure waveform at the ear as it is generated by a number of simple resonant objects. We then examine the listener's ability to detect the lawful covariation among parameters of the resultant acoustic waveform.
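
The sketch below illustrates the forward problem for one such idealized source: a struck bar modeled as a sum of exponentially damped sinusoids whose frequencies and decay rates covary lawfully with the source's properties. The mode ratios, damping values, and 1/f decay scaling are textbook-style assumptions, not our exact synthesis parameters.

    # Forward-problem sketch: radiated pressure of a simple resonant object
    # approximated as a sum of damped sinusoids. Parameter values assumed.
    import numpy as np

    fs = 44100.0
    t = np.arange(0, 1.0, 1.0 / fs)

    def struck_bar(f1, tau1, n_modes=4):
        """Idealized free-bar partials: inharmonic frequency ratios, with
        decay times assumed to shorten as 1/f for higher modes."""
        ratios = np.array([1.0, 2.76, 5.40, 8.93])[:n_modes]
        wave = np.zeros_like(t)
        for r in ratios:
            f, tau = f1 * r, tau1 / r      # lawful covariation of parameters
            wave += np.exp(-t / tau) * np.sin(2 * np.pi * f * t) / r
        return wave / np.max(np.abs(wave))

    # Two "objects": same fundamental, different damping (e.g., material).
    wood_like = struck_bar(f1=400.0, tau1=0.05)    # heavily damped
    metal_like = struck_bar(f1=400.0, tau1=0.8)    # lightly damped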

By measuring correlations between various features of the waveform and the listener's response, we are able to identify the relevant aspects of the dynamic variation in the acoustic signal that listeners use to identify these sources.
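
Schematically, this correlational analysis can be sketched as follows (the feature names and all parameter values are assumptions): per-trial variation in two waveform parameters is correlated with a simulated listener's binary responses to estimate relative cue weights.

    # Sketch of the feature-response correlation analysis described above.
    import numpy as np

    rng = np.random.default_rng(2)
    n_trials = 1000

    # Per-trial values of two acoustic features (z-scored): modal frequency
    # and decay rate of the synthesized source.
    features = rng.standard_normal((n_trials, 2))

    # Simulated listener relying mostly on decay (second column).
    resp = (features @ np.array([0.3, 1.0])
            + rng.standard_normal(n_trials) > 0).astype(float)

    # Point-biserial correlation of each feature with the response.
    r = np.array([np.corrcoef(features[:, j], resp)[0, 1] for j in range(2)])
    weights = np.abs(r) / np.abs(r).sum()
    print("relative weights (freq, decay):", np.round(weights, 2))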

[Work Supported by NIDCD Grant #R01 DC06875]


ABRL Completed Projects

Auditory Preferences of Children with Autism

Knowledge of what makes speech interesting to children with autism can be exploited in strategies to increase their processing and production of speech. This project aims to evaluate the listening preference of children with autism for different prosodic features of speech.

Recent research has found evidence of enhanced prosodic perception and decreased prosodic production among children with Autism Spectrum Disorder (ASD). Because a large number of children with ASD remain nonverbal, it is important to understand the factors responsible for the mismatch between perceptual input and speech output. 

The current study evaluates one such factor by examining the effect of prosodic features on listening preference for speech. Comparisons are made across three types of utterances of spondees (naturally spoken, sung, and monotonic) produced by both male and female speakers. 

Listening preference of 2- to 5-year-old children with ASD and a matched control group of typically developing children is assessed by measuring the frequency with which the child initiates different utterances as a form of play.

[Work Supported by an Undergraduate Hilldale Foundation Award from the University of Wisconsin-Madison] 

Auditory Abilities of Children

This project represented an effort to characterize and quantify the development of auditory processing skills in preschool and school-aged children. 

There were two distinguishing features of this effort. First, each child was tested repeatedly in all conditions of an experiment, producing precise estimates of both between- and within-subject variability. Second, the psychophysical procedures were adaptations of the rigorous forced-choice paradigms routinely used to assess adult auditory function. 

This permitted meaningful comparisons of adult and child performance. Our published results showed that the average frequency resolution, temporal resolution, and spectral pattern discrimination skills of children did not reach adult levels until age six or later. 

However, variability in performance among children was much higher than among adults. Our experiments suggested that this variability may have resulted from suboptimal listening strategies employed by children. A major goal of the later experiments was to understand and characterize these listening strategies in detail. 

We applied a simplified version of the sample-discrimination paradigm (Lutfi, 1989) in which various aspects of a child's performance in discriminating tone patterns were compared to the performance of an ideal observer as specified by detection theory.
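
The detection-theoretic benchmark can be sketched as below; the stimulus values and the observed-sensitivity figure are hypothetical, used only to show how efficiency is computed as the squared ratio of observed to ideal d'.

    # Ideal-observer benchmark for sample discrimination (assumed values):
    # each trial draws n tone frequencies from one of two distributions
    # differing in mean; the ideal observer's sensitivity is
    # d' = sqrt(n) * (mu2 - mu1) / sigma.
    from math import erf, sqrt

    def ideal_dprime(mu1, mu2, sigma, n):
        return sqrt(n) * (mu2 - mu1) / sigma

    def pc_2afc(dprime):
        """Percent correct in 2AFC given d' (standard normal model)."""
        return 0.5 * (1.0 + erf(dprime / 2.0))

    d_ideal = ideal_dprime(mu1=1000.0, mu2=1020.0, sigma=40.0, n=4)  # Hz
    print("ideal d':", round(d_ideal, 2), " PC:", round(pc_2afc(d_ideal), 3))

    d_child = 0.6 * d_ideal            # hypothetical observed sensitivity
    print("efficiency:", round((d_child / d_ideal) ** 2, 2))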

[Work Supported by NICHHD Grant #R01 HD NS23333] 

Discrimination of Complex Sounds by the Hard of Hearing

Threshold measures such as the pure-tone audiogram might tell us how well a person hears but they cannot tell us how well a person listens. How might reduced sensitivity for particular frequencies affect the distribution of attention across frequency? 

In this project we applied a recently developed analytical technique to assess how normal-hearing and sensorineural hearing-impaired listeners attend to the information contained in the individual frequencies of suprathreshold sounds. The result was an information weighting function of frequency, which we referred to as a pure-tone infogram. 

Our early studies indicated that listeners with high-frequency sensorineural hearing loss made greater use of high-frequency information than low, even though this strategy was nonoptimal. It is as if the hearing-impaired listeners somehow attempted to compensate for their loss by attending more to information in a region of reduced sensitivity. 

The results cannot be explained by a general tendency to listen at high frequencies since the normal-hearing listeners did not show such a trend. We experimented with ways of improving the performance of hearing-impaired listeners in these tasks by modifying their weighting function or by “repackaging” the information so as to match their weighting function.

[Work Supported by Grants from the DRF and NOHR]

Additivity of Auditory Masking

This project implemented a formal methodology for studying and modeling the auditory processes involved in the detection of signals in the presence of complex masking sounds. 

In the method, signal threshold was obtained in the presence of two or more simple maskers (tones or noise bands), individually and in various combinations. The maskers could be separated in frequency, in time, or in both frequency and time. An attempt was then made to predict the combined effect of the maskers from their individual effects using basic axioms of Measurement Theory (Coombs, Dawes, and Tversky, 1970).

Specifically, mathematical transformations were sought whereby the combined masking effects were related to the individual effects by addition. The rationale underlying this "measurement approach" was that the transformations, if they existed, could be used in lieu of a process model to summarize the results of past experiments, to predict the results of future experiments, and to shed light on the nature of the mechanisms underlying masking by complex sounds.
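
One simple way to make this concrete (with an assumed power-law transform and hypothetical thresholds) is sketched below: if some exponent p makes the transformed individual masking effects sum to the transformed combined effect, that transform summarizes the masker interaction. Here p = 1 corresponds to simple intensity additivity, while p < 1 predicts excess masking.

    # "Additivity after transformation" sketch with an assumed power-law
    # transform T(I) = I**p applied to masked-threshold intensity.
    import math

    def db_to_intensity(db):
        return 10.0 ** (db / 10.0)

    def predict_combined_db(thr1_db, thr2_db, p):
        """Predicted combined-masker threshold if transformed effects add."""
        total = db_to_intensity(thr1_db) ** p + db_to_intensity(thr2_db) ** p
        return 10.0 * math.log10(total ** (1.0 / p))

    # Hypothetical individual masked thresholds (dB SPL) for two maskers.
    thr1, thr2 = 40.0, 40.0
    for p in (1.0, 0.5, 0.33):
        print(f"p={p}: predicted combined threshold "
              f"{predict_combined_db(thr1, thr2, p):.1f} dB SPL")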

[Work Supported by NSF Grant #BNS83-08498, ONR Grant #1142-88-A0138 and AFOSR Grant #86-NL-178] 

Perception of Auditory Motion

The goal of this project was to simulate an out-of-head experience of sound motion over headphones. Our approach was to reconstruct the information intrinsic to the sound pressure wavefront of a moving object as it travels along an arbitrary trajectory in three-dimensional space. 

Various cues to motion, such as Doppler shifts, changes in interaural time delays, changes in interaural intensity differences, and sound scattering by the head and pinnae, were shown to have dependencies obeying the physical laws of kinematics. We applied these lawful relations and developed mathematical techniques for synthesizing stimuli to be played over headphones. 
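
Two of these cues can be sketched with textbook formulas, as below; the trajectory, head radius, and source parameters are assumed values, and the interaural time difference (ITD) uses the Woodworth spherical-head approximation rather than our actual synthesis.

    # Motion-cue sketch for a straight-line fly-by (assumed parameters):
    # Doppler shift from the radial velocity, and ITD from the Woodworth
    # spherical-head approximation.
    import numpy as np

    c = 343.0                      # speed of sound (m/s)
    a = 0.0875                     # head radius (m), assumed
    f0 = 500.0                     # source frequency (Hz)
    v, d = 20.0, 5.0               # source speed (m/s), closest distance (m)

    t = np.linspace(-2.0, 2.0, 401)        # time re: closest approach (s)
    x, y = v * t, d                        # source path parallel to x-axis
    r = np.hypot(x, y)                     # source-to-head distance
    v_radial = v * x / r                   # positive = receding

    f_heard = f0 * c / (c + v_radial)      # Doppler-shifted frequency
    azimuth = np.arctan2(x, y)             # angle from straight ahead (rad)
    itd = (a / c) * (azimuth + np.sin(azimuth))   # Woodworth ITD (s)

    print(f"max Doppler swing: {f_heard.max() - f_heard.min():.1f} Hz")
    print(f"max |ITD|: {1e6 * np.abs(itd).max():.0f} us")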

The precise stimulus control afforded by this approach allows future psychophysical tests to determine which features of the acoustic wavefront human listeners use to track sound trajectory.