Core IO and DSP

Audio processing

load(path[, sr, mono, offset, duration, ...]) Load an audio file as a floating point time series.
to_mono(y) Force an audio signal down to mono.
resample(y, orig_sr, target_sr[, res_type, ...]) Resample a time series from orig_sr to target_sr
get_duration([y, sr, S, n_fft, hop_length, ...]) Compute the duration (in seconds) of an audio time series, feature matrix, or filename.
autocorrelate(y[, max_size, axis]) Bounded auto-correlation
zero_crossings(y[, threshold, ...]) Find the zero-crossings of a signal y: indices i such that sign(y[i]) != sign(y[i-1]).
clicks([times, frames, sr, hop_length, ...]) Return a signal with a click placed at each specified time.
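The zero-crossing definition can be illustrated with a minimal pure-Python sketch. The helper name `zero_crossing_flags` is hypothetical, and two choices here are assumptions of the sketch rather than documented library behavior: each sample is compared with its immediate predecessor, and the first sample is never flagged.

```python
def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def zero_crossing_flags(y):
    """Flag index i when sign(y[i]) differs from sign(y[i-1]).

    Hypothetical helper for illustration only. The first sample has no
    predecessor, so this sketch (by assumption) never flags it.
    """
    if not y:
        return []
    flags = [False]
    for prev, cur in zip(y, y[1:]):
        flags.append(sign(prev) != sign(cur))
    return flags


y = [1.0, 0.5, -0.2, -0.7, 0.3, 0.9]
flags = zero_crossing_flags(y)
crossings = [i for i, flagged in enumerate(flags) if flagged]
print(crossings)
```

The sign changes between indices 1 and 2 and between indices 3 and 4, so the sketch reports indices 2 and 4.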

Spectral representations

stft(y[, n_fft, hop_length, win_length, ...]) Short-time Fourier transform (STFT)
istft(stft_matrix[, hop_length, win_length, ...]) Inverse short-time Fourier transform (ISTFT).
ifgram(y[, sr, n_fft, hop_length, ...]) Compute the instantaneous frequency (as a proportion of the sampling rate) obtained as the time-derivative of the phase of the complex spectrum as described by [R3].
cqt(y[, sr, hop_length, fmin, n_bins, ...]) Compute the constant-Q transform of an audio signal.
hybrid_cqt(y[, sr, hop_length, fmin, ...]) Compute the hybrid constant-Q transform of an audio signal.
pseudo_cqt(y[, sr, hop_length, fmin, ...]) Compute the pseudo constant-Q transform of an audio signal.
fmt(y[, t_min, n_fmt, kind, beta, ...]) The fast Mellin transform (FMT) [R5] of a uniformly sampled signal y.
interp_harmonics(x, freqs, h_range[, kind, ...]) Compute the energy at harmonics of time-frequency representation.
salience(S, freqs, h_range[, weights, ...]) Harmonic salience function.
phase_vocoder(D, rate[, hop_length]) Phase vocoder.
magphase(D) Separate a complex-valued spectrogram D into its magnitude (S) and phase (P) components, so that D = S * P.
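The magphase relation D = S * P can be checked with a small standard-library sketch. The function name `split_magphase` is hypothetical, and the spectrogram is represented as a nested list of complex numbers purely for illustration; this is not the library's implementation.

```python
import cmath


def split_magphase(D):
    """Split complex values into magnitude S and unit-modulus phase P.

    Hypothetical helper: D is a nested list (rows of complex numbers).
    For each entry z, S = |z| and P = exp(1j * angle(z)), so z == S * P.
    """
    S = [[abs(z) for z in row] for row in D]
    P = [[cmath.exp(1j * cmath.phase(z)) for z in row] for row in D]
    return S, P


D = [[3 + 4j, -1j], [0j, -2 + 0j]]
S, P = split_magphase(D)

# Rebuild D elementwise from the two components.
rebuilt = [[s * p for s, p in zip(s_row, p_row)] for s_row, p_row in zip(S, P)]
print(S)
```

Every entry of P has modulus 1, so all of the energy information lives in S and all of the phase information in P.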

Magnitude scaling

amplitude_to_db(S[, ref, amin, top_db]) Convert an amplitude spectrogram to dB-scaled spectrogram.
db_to_amplitude(S_db[, ref]) Convert a dB-scaled spectrogram to an amplitude spectrogram.
power_to_db(S[, ref, amin, top_db, ref_power]) Convert a power spectrogram (amplitude squared) to decibel (dB) units
db_to_power(S_db[, ref]) Convert a dB-scale spectrogram to a power spectrogram.
perceptual_weighting(S, frequencies, **kwargs) Perceptual weighting of a power spectrogram.
A_weighting(frequencies[, min_db]) Compute the A-weighting of a set of frequencies.
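The dB conversions above follow the usual decibel arithmetic, which the sketch below illustrates for a flat list of values. The function names carry a `_sketch` suffix to mark them as hypothetical, and the default values for `ref`, `amin`, and `top_db` are assumptions of this example rather than values confirmed by the listing.

```python
import math


def power_to_db_sketch(S, ref=1.0, amin=1e-10, top_db=80.0):
    """Convert power values to dB relative to ref.

    Illustrative only. Values below amin are clamped before the log,
    and the result is floored at (max dB - top_db) when top_db is set.
    """
    ref_db = 10.0 * math.log10(max(amin, ref))
    db = [10.0 * math.log10(max(amin, s)) - ref_db for s in S]
    if top_db is not None:
        floor = max(db) - top_db
        db = [max(d, floor) for d in db]
    return db


def amplitude_to_db_sketch(S, ref=1.0, amin=1e-5, top_db=80.0):
    """Convert amplitudes to dB by squaring them into power first."""
    power = [s * s for s in S]
    return power_to_db_sketch(power, ref=ref * ref, amin=amin * amin, top_db=top_db)


print(power_to_db_sketch([1.0, 0.1, 1e-12]))
print(amplitude_to_db_sketch([1.0, 0.1]))
```

A power ratio of 0.1 maps to about -10 dB, while an amplitude ratio of 0.1 maps to about -20 dB. The very small third power value is first clamped to `amin` and then raised to the `top_db` floor.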

Time and frequency conversion

frames_to_samples(frames[, hop_length, n_fft]) Converts frame indices to audio sample indices
frames_to_time(frames[, sr, hop_length, n_fft]) Converts frame counts to time (seconds)
samples_to_frames(samples[, hop_length, n_fft]) Converts sample indices into STFT frames.
samples_to_time(samples[, sr]) Convert sample indices to time (in seconds).
time_to_frames(times[, sr, hop_length, n_fft]) Converts time stamps into STFT frames.
time_to_samples(times[, sr]) Convert timestamps (in seconds) to sample indices.
hz_to_note(frequencies, **kwargs) Convert one or more frequencies (in Hz) to the nearest note names.
hz_to_midi(frequencies) Get the closest MIDI note number(s) for given frequencies
midi_to_hz(notes) Get the frequency (Hz) of MIDI note(s)
midi_to_note(midi[, octave, cents]) Convert one or more MIDI numbers to note strings.
note_to_hz(note, **kwargs) Convert one or more note names to frequency (Hz)
note_to_midi(note[, round_midi]) Convert one or more spelled notes to MIDI number(s).
hz_to_mel(frequencies[, htk]) Convert Hz to Mels
hz_to_octs(frequencies[, A440]) Convert frequencies (Hz) to (fractional) octave numbers.
mel_to_hz(mels[, htk]) Convert mel bin numbers to frequencies
octs_to_hz(octs[, A440]) Convert octave numbers to frequencies.
fft_frequencies([sr, n_fft]) Alternative implementation of np.fft.fftfreq
cqt_frequencies(n_bins, fmin[, ...]) Compute the center frequencies of Constant-Q bins.
mel_frequencies([n_mels, fmin, fmax, htk]) Compute the center frequencies of mel bands.
tempo_frequencies(n_bins[, hop_length, sr]) Compute the frequencies (in beats-per-minute) corresponding to an onset auto-correlation or tempogram matrix.
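Several of the conversions above reduce to one-line formulas. The sketch below shows the frame/sample/time arithmetic and the standard equal-temperament Hz/MIDI mapping. The `_sketch` function names are hypothetical, and the defaults (`sr=22050`, `hop_length=512`) and the `n_fft // 2` centering offset are assumptions of this example.

```python
import math


def frames_to_samples_sketch(frames, hop_length=512, n_fft=None):
    """Map frame indices to sample indices.

    Assumption: when n_fft is given, frames are offset by n_fft // 2
    to account for windowing.
    """
    offset = n_fft // 2 if n_fft is not None else 0
    return [f * hop_length + offset for f in frames]


def frames_to_time_sketch(frames, sr=22050, hop_length=512, n_fft=None):
    """Map frame indices to times in seconds via sample indices."""
    samples = frames_to_samples_sketch(frames, hop_length=hop_length, n_fft=n_fft)
    return [s / sr for s in samples]


def hz_to_midi_sketch(freq_hz):
    """Fractional MIDI number, with A4 = 440 Hz = MIDI note 69."""
    return 12.0 * math.log2(freq_hz / 440.0) + 69.0


def midi_to_hz_sketch(midi):
    """Frequency in Hz of a (possibly fractional) MIDI number."""
    return 440.0 * 2.0 ** ((midi - 69.0) / 12.0)


print(frames_to_samples_sketch([0, 1, 2]))
print(frames_to_time_sketch([0, 1, 2]))
print(hz_to_midi_sketch(440.0), midi_to_hz_sketch(60))
```

With these assumed defaults, consecutive frames are 512 samples (about 23 ms) apart, and MIDI note 60 (middle C) comes out near 261.63 Hz.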

Pitch and tuning

estimate_tuning([y, sr, S, n_fft, ...]) Estimate the tuning of an audio time series or spectrogram input.
pitch_tuning(frequencies[, resolution, ...]) Given a collection of pitches, estimate its tuning offset (in fractions of a bin) relative to A440=440.0Hz.
piptrack([y, sr, S, n_fft, hop_length, ...]) Pitch tracking on thresholded parabolically-interpolated STFT
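One plausible way to estimate a tuning offset in fractions of a bin is sketched below: compute each pitch's deviation from the nearest semitone relative to A440, quantize the deviations, and take the most common value. The function names, the 12 bins per octave, and the 0.01 resolution are assumptions of this sketch; it is not presented as the library's algorithm.

```python
import math
from collections import Counter


def tuning_residual(freq_hz, bins_per_octave=12, a440=440.0):
    """Deviation of a pitch from the nearest bin, in fractions of a bin.

    Hypothetical helper. The result lies in [-0.5, 0.5].
    """
    bins = bins_per_octave * math.log2(freq_hz / a440)
    return bins - round(bins)


def estimate_offset_sketch(freqs, resolution=0.01):
    """Return the most common quantized residual across all pitches."""
    counts = Counter(round(tuning_residual(f) / resolution) for f in freqs)
    best_bin, _ = counts.most_common(1)[0]
    return best_bin * resolution


# Synthetic pitches, each 0.2 of a semitone sharp of an equal-tempered note.
freqs = [440.0 * 2.0 ** ((k + 0.2) / 12.0) for k in range(-12, 13)]
offset = estimate_offset_sketch(freqs)
print(offset)
```

Because every synthetic pitch is sharp by the same fraction of a semitone, the mode of the residuals recovers that offset.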

Dynamic Time Warping

dtw(X, Y[, metric, step_sizes_sigma, ...]) Dynamic time warping (DTW).
fill_off_diagonal(x, radius[, value]) Sets all cells of a matrix to a given value if they lie outside a constraint region.
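Dynamic time warping can be sketched in a few lines for one-dimensional sequences. This is the textbook dynamic-programming formulation with an absolute-difference cost and unit steps (diagonal, vertical, horizontal). The function name `dtw_sketch` and its return format are hypothetical and do not mirror the signature listed above.

```python
def dtw_sketch(x, y):
    """Align two non-empty 1-D sequences with classic DTW.

    Returns (total_cost, path), where path is a list of (i, j) index
    pairs running from (0, 0) to (len(x) - 1, len(y) - 1).
    """
    if not x or not y:
        raise ValueError("both sequences must be non-empty")
    n, m = len(x), len(y)
    inf = float("inf")
    # D[i][j] is the accumulated cost of aligning x[:i] with y[:j].
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            D[i][j] = cost + min(D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])
    # Backtrack from the end, always stepping to the cheapest predecessor.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min(
            (D[i - 1][j - 1], i - 1, j - 1),
            (D[i - 1][j], i - 1, j),
            (D[i][j - 1], i, j - 1),
        )
        path.append((i - 1, j - 1))
    path.reverse()
    return D[n][m], path


total_cost, path = dtw_sketch([1, 2, 3], [1, 2, 2, 3])
print(total_cost, path)
```

In this example the repeated value in the second sequence is absorbed by aligning two of its samples with the same sample of the first sequence, so the total cost is zero.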


Deprecated

logamplitude(S[, ref, amin, top_db, ref_power]) Convert a power spectrogram (amplitude squared) to decibel (dB) units