Psychoacoustic Tonality

Psychoacoustic tonality is based on the ECMA-418-2 standard (formerly ECMA-74 Annex G). ECMA-418-2 proposes a tonality indicator based on the principle of "partial loudness" applied to tones in noise. Partial loudness is the loudness of a sound when it is presented together with masking noise, as opposed to when it is presented alone in quiet. A sound at a given level is perceived as less loud when presented with masking noise than when presented alone, an effect that is often confirmed experimentally in the literature. Here, the partial loudness of the tonal content is considered, and it serves as the basis for the calculation of tonality.

The unit of tonality according to ECMA-418-2 is the tuHMS (tonality unit – Hearing Model of Sottek). A value of 1 tuHMS corresponds to the tonality of a 1-kHz tone with a sound pressure level of 40 dB SPL in quiet. For other signals, tonality values typically range from 0 tuHMS to a few tuHMS (sometimes even above 10 tuHMS), depending on the tonal content of the signal. A value of 0 tuHMS indicates that no tonality could be detected (that is, no tonal content), while high values indicate prominent tonal content.
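To make the 1 tuHMS reference condition concrete, the following Python sketch generates a 1-kHz sine tone calibrated to 40 dB SPL and checks its level. The function names, the 1 s duration, and the 48 kHz sampling rate are assumptions chosen for illustration. The code only builds the reference signal; it does not compute tonality.

```python
import numpy as np

P_REF = 20e-6  # reference sound pressure in Pa, corresponding to 0 dB SPL


def reference_tone(level_db=40.0, freq_hz=1000.0, duration_s=1.0, fs=48000):
    """Return a sine tone in pascals whose RMS level equals level_db (dB SPL)."""
    p_rms = P_REF * 10.0 ** (level_db / 20.0)      # RMS pressure for the target level
    t = np.arange(int(duration_s * fs)) / fs        # time axis in seconds
    return np.sqrt(2.0) * p_rms * np.sin(2.0 * np.pi * freq_hz * t)


def spl_of(pressure):
    """Return the sound pressure level in dB SPL of a pressure signal in pascals."""
    rms = np.sqrt(np.mean(np.asarray(pressure, dtype=float) ** 2))
    return 20.0 * np.log10(rms / P_REF)


tone = reference_tone()        # hypothetical helper defined above
print(round(spl_of(tone), 3))  # level of the generated tone in dB SPL
```

A signal like this one, presented in quiet, is the condition that the standard associates with a tonality of 1 tuHMS.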

The tonality calculation is summarized in the second figure below (Figure 2). Its input is the specific loudness calculated according to the hearing model described in the standard (summarized in Figure 1).

Specific loudness according to ECMA-418-2

Calculation of specific loudness according to ECMA-418-2 relies on a perception model, and includes these steps:

  • Outer and middle/inner ear filtering
  • Decomposition of the signal into overlapping critical bands by means of an auditory filterbank
  • Half-wave rectification, to account for the fact that auditory nerve fibers fire only when the basilar membrane vibrates in a specific direction
  • Time-block-wise root-mean-square value calculation
  • Compressive non-linearity to transform signal energy into loudness values
  • Comparison to threshold in quiet to obtain the final specific loudness values
Figure 1. Calculation of ECMA-418-2 specific loudness. From ECMA-418-2 (Figure F.1)
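The last four steps listed above can be sketched in Python for a single critical-band signal. This is a simplified illustration, not the ECMA-418-2 procedure: the ear filtering and the auditory filterbank are omitted, and the block size, hop size, power-law exponent, and subtraction of a threshold are placeholder assumptions. The function name band_loudness_sketch is hypothetical.

```python
import numpy as np


def band_loudness_sketch(band_signal, block_size=1024, hop=256,
                         exponent=0.5, threshold=0.0):
    """Illustrative loudness-like value per time block for one band signal.

    All parameter defaults are assumptions, not values from the standard.
    """
    x = np.asarray(band_signal, dtype=float)
    if len(x) < block_size:
        raise ValueError("signal is shorter than one block")

    # Half-wave rectification: keep only the positive half of the waveform.
    rectified = np.maximum(x, 0.0)

    # Time-block-wise RMS over overlapping blocks.
    n_blocks = 1 + (len(rectified) - block_size) // hop
    rms = np.empty(n_blocks)
    for i in range(n_blocks):
        block = rectified[i * hop:i * hop + block_size]
        rms[i] = np.sqrt(np.mean(block ** 2))

    # Compressive non-linearity: a plain power law stands in for the
    # standard's function (assumption).
    compressed = rms ** exponent

    # Comparison to a threshold in quiet: values below it are set to zero.
    return np.maximum(compressed - threshold, 0.0)


fs = 48000                                   # assumed sampling rate in Hz
t = np.arange(4800) / fs
band = np.sin(2.0 * np.pi * 1000.0 * t)      # stand-in for one filterbank output
loud = band_loudness_sketch(band)
quiet = band_loudness_sketch(0.1 * band)     # same tone, 20 dB lower
```

With the assumed exponent of 0.5, lowering the input by 20 dB reduces the output by a factor of about 3.2 rather than 10, which illustrates the compressive behavior.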

Tonality from specific loudness

The calculation of tonality relies on the hypothesis that neuronal processing in human hearing performs a running autocorrelation analysis of the critical band signals. The calculation follows these steps:

  • Block-wise autocorrelation functions (ACF) applied to the outputs of the auditory filterbank
  • Averaging across neighboring critical bands and time blocks
  • Windowing of the averaged ACFs to focus the analysis on the tonal part
  • Estimation of the tonal loudness using a Discrete Fourier Transform of the windowed ACFs
  • Noise reduction, based on the ratio of tonal loudness to total loudness
  • Final estimation of specific tonality, time-dependent tonality, and overall tonality values
Figure 2. Calculation of ECMA-418-2 tonality. From ECMA-418-2 (Figure G.1)
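The idea behind the steps above can be sketched in Python for a single band signal. This is a simplified illustration under stated assumptions, not the ECMA-418-2 algorithm: averaging across neighboring critical bands is omitted, the ACF is computed on the raw signal rather than on filterbank outputs, and the block size, hop size, lag range, and Hann window are placeholder choices. The function names block_acf and tonal_ratio_sketch are hypothetical.

```python
import numpy as np


def block_acf(block):
    """ACF of one block via the FFT, normalized so that lag 0 equals 1."""
    n = len(block)
    spectrum = np.fft.rfft(block, 2 * n)             # zero-pad to avoid wrap-around
    acf = np.fft.irfft(np.abs(spectrum) ** 2)[:n]    # Wiener-Khinchin relation
    return acf / acf[0] if acf[0] > 0 else acf


def tonal_ratio_sketch(signal, block_size=2048, hop=512,
                       lag_start=256, lag_len=1024):
    """Rough tonal-to-total ratio in [0, 1]; all defaults are assumptions."""
    x = np.asarray(signal, dtype=float)
    if len(x) < block_size or lag_start + lag_len > block_size:
        raise ValueError("signal or block too short for the chosen lags")

    # Block-wise ACFs, then averaging across time blocks.
    n_blocks = 1 + (len(x) - block_size) // hop
    acfs = [block_acf(x[i * hop:i * hop + block_size]) for i in range(n_blocks)]
    mean_acf = np.mean(acfs, axis=0)

    # Windowing: skip the short lags, where noise contributes most to the
    # ACF, and keep a lag range where periodic (tonal) content persists.
    window = np.hanning(lag_len)
    segment = mean_acf[lag_start:lag_start + lag_len]
    windowed = (segment - segment.mean()) * window

    # DFT of the windowed ACF: the strongest peak estimates the tonal part.
    # Since the ACF is normalized by lag 0 (total energy), the result is
    # already a ratio of tonal to total content.
    peak = np.abs(np.fft.rfft(windowed)).max()
    return float(np.clip(2.0 * peak / window.sum(), 0.0, 1.0))


fs = 48000                                            # assumed sampling rate in Hz
t = np.arange(fs) / fs
tone_ratio = tonal_ratio_sketch(np.sin(2.0 * np.pi * 1000.0 * t))
noise_ratio = tonal_ratio_sketch(np.random.default_rng(0).standard_normal(fs))
```

In this sketch, the pure tone yields a ratio well above that of white noise, because the ACF of a periodic signal stays periodic at long lags while the ACF of noise decays quickly toward zero. The values are dimensionless ratios and are not expressed in tuHMS.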