librosa.feature.spectral_flatness

librosa.feature.spectral_flatness(y=None, S=None, n_fft=2048, hop_length=512, win_length=None, window='hann', center=True, pad_mode='reflect', amin=1e-10, power=2.0)

Compute spectral flatness

Spectral flatness (or tonality coefficient) quantifies how noise-like a sound is, as opposed to tone-like [1]. A high spectral flatness (closer to 1.0) indicates that the spectrum is similar to white noise. It is often converted to decibels.

[1] Dubnov, Shlomo. “Generalization of spectral flatness measure for non-Gaussian linear processes.” IEEE Signal Processing Letters, 2004, Vol. 11.
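
Spectral flatness is commonly defined as the ratio of the geometric mean to the arithmetic mean of the power spectrum in each frame. The following is a minimal NumPy sketch of that definition, not necessarily the library's exact implementation. The helper name `flatness_sketch` and the synthetic frames are hypothetical, and applying `amin` as an elementwise floor before taking logarithms is an assumption.

```python
import numpy as np

def flatness_sketch(S, amin=1e-10, power=2.0):
    """Geometric mean / arithmetic mean per frame (hypothetical helper).

    S is a non-negative magnitude spectrogram of shape (d, t).
    """
    # Raise magnitudes to the requested power and floor at amin (assumed behavior)
    S_pow = np.maximum(amin, np.asarray(S, dtype=float) ** power)
    # Geometric mean over frequency bins, computed in the log domain
    gmean = np.exp(np.mean(np.log(S_pow), axis=0, keepdims=True))
    # Arithmetic mean over frequency bins
    amean = np.mean(S_pow, axis=0, keepdims=True)
    return gmean / amean  # shape (1, t)

# Two synthetic frames: a flat (noise-like) spectrum and a single peak (tone-like)
flat_frame = np.ones(1025)
peak_frame = np.full(1025, 1e-3)
peak_frame[100] = 1.0
S_demo = np.stack([flat_frame, peak_frame], axis=1)  # shape (1025, 2)

demo = flatness_sketch(S_demo)
```

In this sketch the flat frame evaluates to a flatness of 1.0, while the single-peak frame falls well below 0.01, matching the noise-like versus tone-like interpretation.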
Parameters:
y : np.ndarray [shape=(n,)] or None

audio time series

S : np.ndarray [shape=(d, t)] or None

(optional) pre-computed spectrogram magnitude

n_fft : int > 0 [scalar]

FFT window size

hop_length : int > 0 [scalar]

hop length for STFT. See librosa.core.stft for details.

win_length : int <= n_fft [scalar]

Each frame of audio is windowed by window(). The window will be of length win_length and then padded with zeros to match n_fft.

If unspecified, defaults to win_length = n_fft.

window : string, tuple, number, function, or np.ndarray [shape=(n_fft,)]
  • a window specification (string, tuple, or number); see scipy.signal.get_window
  • a window function, such as scipy.signal.windows.hann
  • a vector or array of length n_fft
center : boolean
  • If True, the signal y is padded so that frame t is centered at y[t * hop_length].
  • If False, then frame t begins at y[t * hop_length]
pad_mode : string

If center=True, the padding mode to use at the edges of the signal. By default, STFT uses reflection padding.

amin : float > 0 [scalar]

minimum threshold for S (an added noise floor for numerical stability)

power : float > 0 [scalar]

Exponent for the magnitude spectrogram, e.g., 1 for energy, 2 for power, etc. A power spectrogram is usually used for computing spectral flatness.
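
To illustrate why the amin floor matters, the sketch below evaluates a fully silent frame with and without it. It assumes the floor is applied as an elementwise maximum on the spectrogram raised to `power`, which is an assumption about the implementation. The variable names are hypothetical.

```python
import numpy as np

amin = 1e-10
power = 2.0

# One silent frame: all magnitudes are exactly zero
S_silent = np.zeros((1025, 1))

# Without a floor, log(0) is -inf and the ratio of means becomes 0/0 (nan)
with np.errstate(divide="ignore", invalid="ignore"):
    raw = S_silent ** power
    unfloored = np.exp(np.mean(np.log(raw), axis=0)) / np.mean(raw, axis=0)

# With the floor (assumed elementwise maximum), every bin equals amin,
# so the geometric and arithmetic means coincide
floored = np.maximum(amin, S_silent ** power)
flatness_silent = np.exp(np.mean(np.log(floored), axis=0)) / np.mean(floored, axis=0)
```

Under this assumption a silent frame is reported as maximally flat (approximately 1.0) rather than undefined, so it can be worth masking silent regions before interpreting the result.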

Returns:
flatness : np.ndarray [shape=(1, t)]

spectral flatness for each frame. The returned value is in [0, 1] and is often converted to the dB scale.
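
One possible way to perform the dB conversion mentioned above is sketched here with plain NumPy. The helper `flatness_to_db` and its `floor` argument are hypothetical. The `10 * log10` convention assumes the flatness was computed from a power spectrogram (the default `power=2.0`), so it is a ratio of power-like quantities.

```python
import numpy as np

def flatness_to_db(flatness, floor=1e-10):
    """Convert a flatness ratio in [0, 1] to decibels (hypothetical helper)."""
    # Clip at a small floor (assumed value) so that log10(0) never occurs
    clipped = np.maximum(floor, np.asarray(flatness, dtype=float))
    # Flatness is treated as a ratio of power-like quantities, hence 10 * log10
    return 10.0 * np.log10(clipped)

# Shape (1, t), as returned by spectral_flatness
flatness_demo = np.array([[1.0, 0.1, 0.01]])
flatness_db = flatness_to_db(flatness_demo)  # approximately 0, -10 and -20 dB
```

On this scale a perfectly flat spectrum maps to 0 dB, and more tone-like frames take increasingly negative values.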

Examples

From time-series input

>>> y, sr = librosa.load(librosa.util.example_audio_file())
>>> flatness = librosa.feature.spectral_flatness(y=y)
>>> flatness
array([[  1.00000e+00,   5.82299e-03,   5.64624e-04, ...,   9.99063e-01,
      1.00000e+00,   1.00000e+00]], dtype=float32)

From spectrogram input

>>> S, phase = librosa.magphase(librosa.stft(y))
>>> librosa.feature.spectral_flatness(S=S)
array([[  1.00000e+00,   5.82299e-03,   5.64624e-04, ...,   9.99063e-01,
      1.00000e+00,   1.00000e+00]], dtype=float32)

From power spectrogram input

>>> S, phase = librosa.magphase(librosa.stft(y))
>>> S_power = S ** 2
>>> librosa.feature.spectral_flatness(S=S_power, power=1.0)
array([[  1.00000e+00,   5.82299e-03,   5.64624e-04, ...,   9.99063e-01,
      1.00000e+00,   1.00000e+00]], dtype=float32)