Roughness and Fluctuation Strength

Roughness and Fluctuation Strength are sensations induced by signal modulations. The phenomenon was first described by Helmholtz: two tones of different frequencies coexist in the propagation medium, but the resulting sound pressure exhibits amplitude fluctuations called beats. Helmholtz distinguishes three perceptual qualities (see the figure below).

If the frequencies of the two tones differ only slightly (by less than 20 Hz), a single tone at the mean of the two frequencies is heard, and the beating is perceived as such because the ear is able to follow the modulation. The resulting sensation is called Fluctuation Strength.

When the frequency difference is increased above 20 Hz, the modulation becomes too fast to follow, and the sensation becomes similar to what is heard when rough surfaces are rubbed together. The resulting sensation is called Roughness.

When the frequency difference becomes even larger (typically above 120 Hz), the two tones are heard separately, without any modulation.

The three perceptual qualities for a pair of pure tones as a function of their frequency difference (Df) from Helmholtz.
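
The following Python sketch illustrates the three regimes and the origin of the beats. It is only an illustration: the function names are hypothetical, and the 20 Hz and 120 Hz boundaries are the approximate values quoted above, not sharp perceptual limits. The last function rewrites the sum of two tones as a carrier at the mean frequency multiplied by a slow envelope, which is the standard trigonometric identity behind the beating.

```python
import math

# Approximate boundaries quoted in the text (assumed here to be hard thresholds).
FLUCTUATION_LIMIT_HZ = 20.0
ROUGHNESS_LIMIT_HZ = 120.0


def two_tone_quality(f1_hz, f2_hz):
    """Return the perceptual quality expected for a pair of pure tones."""
    delta_f = abs(f1_hz - f2_hz)
    if delta_f == 0:
        return "single tone"  # identical frequencies: no beating at all
    if delta_f < FLUCTUATION_LIMIT_HZ:
        return "fluctuation strength"
    if delta_f <= ROUGHNESS_LIMIT_HZ:
        return "roughness"
    return "separate tones"


def two_tone_pressure(f1_hz, f2_hz, t):
    """Sum of two unit-amplitude pure tones at time t (seconds)."""
    return math.sin(2 * math.pi * f1_hz * t) + math.sin(2 * math.pi * f2_hz * t)


def beat_form(f1_hz, f2_hz, t):
    """Same signal as a carrier at the mean frequency times a slow envelope."""
    mean_f = 0.5 * (f1_hz + f2_hz)
    half_diff = 0.5 * (f1_hz - f2_hz)
    envelope = 2 * math.cos(2 * math.pi * half_diff * t)
    return envelope * math.sin(2 * math.pi * mean_f * t)
```

For example, `two_tone_quality(1000, 1004)` falls in the Fluctuation Strength regime, while a 70 Hz difference falls in the Roughness regime.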

Increasing Roughness tends to make sounds more aggressive and annoying, even if it does not modify the acoustic level or loudness. Zwicker and Fastl propose the asper as the unit of the Roughness sensation. One asper is defined as the Roughness induced by a pure tone at 1 kHz with a level of 60 dB, amplitude-modulated at a frequency of 70 Hz, with a modulation depth of 100%.

Fluctuation Strength reaches its maximum at a modulation frequency of 4 Hz. Its unit is the vacil. One vacil is defined as the Fluctuation Strength induced by a pure tone at 1 kHz with a level of 60 dB, amplitude-modulated at a frequency of 4 Hz, with a modulation depth of 100%.
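
The two reference signals can be synthesized with a few lines of Python. This is a minimal sketch under stated assumptions: `am_tone` is a hypothetical helper, the 60 dB level is interpreted as the sound pressure level of the unmodulated carrier (re 20 µPa), and the 48 kHz sampling rate is arbitrary.

```python
import math

P_REF = 20e-6  # reference pressure for dB SPL, in pascals


def am_tone(carrier_hz, mod_hz, depth, level_db, duration_s, fs=48000):
    """Amplitude-modulated pure tone, returned as a list of pressures in Pa.

    Assumption: level_db is the level of the unmodulated carrier.
    depth is the modulation depth as a fraction (1.0 means 100%).
    """
    rms = P_REF * 10 ** (level_db / 20)
    peak = rms * math.sqrt(2)
    n = int(round(duration_s * fs))
    return [
        peak
        * (1 + depth * math.cos(2 * math.pi * mod_hz * i / fs))
        * math.sin(2 * math.pi * carrier_hz * i / fs)
        for i in range(n)
    ]


# Reference signal for 1 asper: 1 kHz, 60 dB, 70 Hz modulation, 100% depth.
one_asper_reference = am_tone(1000, 70, 1.0, 60, duration_s=1.0)

# Reference signal for 1 vacil: 1 kHz, 60 dB, 4 Hz modulation, 100% depth.
one_vacil_reference = am_tone(1000, 4, 1.0, 60, duration_s=1.0)
```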

Although defined around the concept of amplitude modulation, Roughness and Fluctuation Strength can also be caused by frequency modulation. Note also that these two sensations are not independent of loudness (and therefore acoustic level): they increase when loudness is increased.

For amplitude-modulated sinusoidal signals, Roughness and Fluctuation Strength sensations can be easily estimated on the basis of the modulation of the signal envelopes.
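
As an illustration of such an envelope-based estimate, the sketch below recovers the modulation depth and modulation frequency of an amplitude-modulated tone. It is not the model described here: the envelope is approximated crudely by the peak of the signal over each carrier period, which is only reasonable when the carrier is much faster than the modulation (a Hilbert-transform envelope would be the more usual choice). All function names are hypothetical.

```python
import math


def envelope_by_period_peaks(samples, fs, carrier_hz):
    """Crude envelope: peak of |x| over consecutive windows of one carrier period."""
    win = max(1, int(round(fs / carrier_hz)))
    return [
        max(abs(v) for v in samples[start:start + win])
        for start in range(0, len(samples) - win + 1, win)
    ]


def modulation_depth(envelope):
    """Depth m = (Emax - Emin) / (Emax + Emin); 1.0 corresponds to 100%."""
    e_max, e_min = max(envelope), min(envelope)
    return (e_max - e_min) / (e_max + e_min)


def modulation_frequency(envelope, envelope_rate_hz):
    """Count upward crossings of the envelope mean per second."""
    mean = sum(envelope) / len(envelope)
    crossings = sum(1 for a, b in zip(envelope, envelope[1:]) if a < mean <= b)
    return crossings * envelope_rate_hz / len(envelope)


# Test signal: 1 kHz carrier, 4 Hz modulation, 50% depth, 1 second long.
fs, carrier_hz, mod_hz, depth = 48000, 1000.0, 4.0, 0.5
signal = [
    (1 + depth * math.cos(2 * math.pi * mod_hz * i / fs))
    * math.sin(2 * math.pi * carrier_hz * i / fs)
    for i in range(fs)
]
envelope = envelope_by_period_peaks(signal, fs, carrier_hz)
envelope_rate_hz = fs / round(fs / carrier_hz)  # one envelope point per window
estimated_depth = modulation_depth(envelope)
estimated_rate_hz = modulation_frequency(envelope, envelope_rate_hz)
```

On this noise-free test signal, the estimates should come out close to the 50% depth and 4 Hz rate used to build it.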

For broadband signals, the estimation becomes more difficult. For example, white noise is random by definition and, as a consequence, its envelope also presents random modulations. Yet no real Roughness or Fluctuation Strength sensation is heard, because the modulations at the different frequencies are not in phase.

To account for this phenomenon, our model uses the following strategy:

First, the signal is passed through a filter bank that models the cochlear filtering of the auditory system. A modulation rate is then computed for each channel at the output of the filter bank.

These modulations are then aggregated, taking into account the correlations between the different channels. For a random signal, the modulations in the different channels are not correlated, and their aggregation gives a weak value. For a truly modulated signal, the modulations in all channels are synchronized and correlated, and their aggregation gives a higher value.
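
The aggregation step can be sketched as follows. This is an illustrative assumption, not the actual model: it supposes that one envelope per channel has already been extracted by the filter bank, measures the modulation of each channel as the RMS fluctuation of its envelope relative to its mean, and weights it by the (non-negative) Pearson correlation with the neighbouring channels. The function names and the weighting rule are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def relative_modulation(envelope):
    """RMS fluctuation of the envelope divided by its mean."""
    n = len(envelope)
    mean = sum(envelope) / n
    rms = math.sqrt(sum((e - mean) ** 2 for e in envelope) / n)
    return rms / mean if mean > 0 else 0.0


def aggregate_modulation(channel_envelopes):
    """Average per-channel modulation, weighted by correlation with neighbours."""
    k = len(channel_envelopes)
    if k < 2:
        raise ValueError("at least two channels are needed")
    total = 0.0
    for i, env in enumerate(channel_envelopes):
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < k]
        corr = sum(
            max(0.0, pearson(env, channel_envelopes[j])) for j in neighbours
        ) / len(neighbours)
        total += relative_modulation(env) * corr
    return total / k


# Synthetic envelopes (1 kHz envelope rate, 2 seconds, 8 channels).
rng = random.Random(0)
n_points, n_channels = 2000, 8
# Truly modulated signal: the same 4 Hz modulation in every channel.
coherent = [
    [1 + 0.5 * math.cos(2 * math.pi * 4 * t / 1000) for t in range(n_points)]
    for _ in range(n_channels)
]
# Random signal: independent envelope fluctuations in every channel.
incoherent = [
    [1 + 0.5 * rng.uniform(-1, 1) for _ in range(n_points)]
    for _ in range(n_channels)
]
coherent_score = aggregate_modulation(coherent)
incoherent_score = aggregate_modulation(incoherent)
```

Although each random envelope fluctuates about as much as a modulated one, its aggregated score stays close to zero because the channels are uncorrelated, which mirrors the behaviour described above.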