Open an interactive online version by clicking the badge Binder badge or download the notebook.

Pyrato Introduction#

Pyrato (Python Room Acoustics Tools) is a collection of functions for commonly used operations in room acoustics. It is part of the pyfar ecosystem of Python packages for acoustics research.

The pyrato package contains classes and functions for commonly used operations in room acoustics.

We develop all packages openly across diverse research institutions, avoiding general branding of code or functionality. Contributions (in any form, i.e. reporting issues, submitting bug-fixes or implementation of new features) are welcome. To get in touch, either contact us via

Room Acoustic Parameter Calculation Using Pyrato#

This notebook introduces Pyrato, a room‑acoustics toolbox, and demonstrates how to calculate standard room acoustic parameters from a measured room impulse response. The analysis uses a real room impulse response (RIR) recorded in the Berliner Philharmonie, preserving the full decay to capture the authentic acoustics of the hall. The measured RIR is pre‑processed by bandpass filtering into octave bands and visualized (including interactive plots) to inspect the time‑domain structure in each band. On this basis, the notebook computes key parameters such as the Energy Decay Curve (EDC), reverberation time T20, clarity C80, and Speech Transmission Index (STI), illustrating a complete workflow from loading the broadband RIR measurement to parameter evaluation within the pyrato ecosystem.

For our investigations, we use a real room impulse response (RIR) provided by the pyfar.signlas.files module.

This RIR was measured in the Berliner Philharmonie, a world-famous concert hall in Berlin, Germany. It was recorded using a high-quality microphone and sound source, capturing the hall’s unique acoustic signature.

Loading the RIR via pyfar.signals.files.room_impulse_response returns a pyfar.Signal object, which:

  • Stores the time-domain impulse response

  • Preserves the entire decay, including the late reverberant tail

  • Supports multi-channel signals (e.g., stereo or array measurements)

  • Integrates sampling rate and meta-data

The RIR is part of the example signals provided by the pyfar.signals.files module. To load the data, run the following cell

[3]:
rir = pf.signals.files.room_impulse_response(sampling_rate=48000, crop_noise_tail=False)

plt.figure()
pf.plot.time(rir, unit='ms', dB=True)
plt.title("Measured RIR — Berliner Philharmonie (raw)")
plt.show()

Pre-Processing#

For frequency-dependent room acoustic analysis, the RIR is filtered into octave bands using pyfar.dsp.filter.fractional_octave_bands. This returns a multi-channel Signal where each channel corresponds to one octave band (e.g., 125 Hz, 250 Hz, 500 Hz, etc.), enabling separate analysis of how the Berliner Philharmonie’s acoustics vary across the frequency spectrum.

Before filtering, the RIR is aligned by detecting the first sample which is below the threshold level relative to the maximum level of the impulse response and shifting the impulse start to \(t=0\) using pyfar.dsp.find_impulse_response_start and pyfar.dsp.time_shift.

The Philharmonie RIR is split into these standard ISO octave bands. This pre-processing step is essential because parameters like T20, C80, and STI are computed per frequency band to capture the hall’s frequency-dependent behavior.

[4]:
start_sample = pf.dsp.find_impulse_response_start(rir)
rir_shifted = pf.dsp.time_shift(rir, -start_sample, mode='linear')

# Filter into octave bands
rir_octave_bands  = pf.dsp.filter.fractional_octave_bands(
    rir_shifted, num_fractions=1, frequency_range=(125, 4000))
octave_bands  = pf.constants.fractional_octave_frequencies_nominal(
    num_fractions=1, frequency_range=(125, 4000))

Room Acoustic Parameters#

Room acoustic parameters quantify the perceptual quality of a room for music and speech. The pyrato.parameters module provides implementations of ISO 3382 parameters calculated directly from the RIR or its octave-band filtered versions.

Lundeby’s Method#

Lundeby et al. [2] proposed a noise compensation method, where the RIR is truncated to the intersection time and additionally intruced a correction term \(C_\mathrm{Comp}\).

\[E_\mathrm{Lundeby}(t) = \int_t^{t_i} h^2(\tau) d\tau + C_\mathrm{Comp}\]
\[C_\mathrm{Comp} = p_{i}^2 \cdot T_{60, \mathrm{late}} \cdot \frac{1}{6 \cdot ln(10)} \cdot R_s\]

\(p_{i}^2\) denotes the squared sound pressure at the intersection time, \(T_{60, \mathrm{late}}\) the late reverberation time after Lundeby and \(R_\mathrm{s}\) the sampling rate.

This is implemented in pyrato.energy_decay_curve.energy_decay_curve_lundeby.

The band-passed EDC can be inspected interactively. Pyfar provides convenient keyboard shortcuts for switching between channels, zooming, and panning.

All following figures support interactive shortcuts. If the plot does not respond, click the figure first to activate it.

Use , / . (DE keyboard) or [ / ] (EN keyboard) to cycle through octave bands. Shift + a toggles between showing all channels and a single channel.

[6]:
edc_lundeby =  pyrato.edc.energy_decay_curve_lundeby(
        rir_octave_bands,
        freq='broadband',
        noise_level='auto',
        is_energy=False,
        time_shift=True,
        channel_independent=True,
        normalize=True,
        plot=False)
plt.figure()
pf.plot.time(edc_lundeby, unit='s', dB=True)
plt.title(f"Energy Decay Curve")


[6]:
Text(0.5, 1.0, 'Energy Decay Curve')

Reverberation Time T20#

Reverberation time T20 measures how long it takes for sound to decay by 60 dB after the source stops. T20 fits a regression line to the −5 dB to −25 dB range of the EDC (optimal SNR region) and extrapolates to −60 dB (pyrato.parameters.reverberation_time()).

[ ]:
T20 = pyrato.parameters.reverberation_time_linear_regression(edc_lundeby,T='T20',return_intercept=False)
print("Octave band center frequencies in Hz:", octave_bands)
print("\nReverberation time T20 in s:\n",T20)

Now, to plot T20 over our octave band center frequenciea we can simply create a pyfar.FrequencyData() type object and plot it directly in pyfar.

[ ]:
T20_FrequencyData = pf.FrequencyData(T20[:, 0], octave_bands)

plt.figure()
pf.plot.freq(T20_FrequencyData,marker='o', dB=False)
plt.xlabel('Frequency in Hz')
plt.ylabel('T20 in s')
plt.title('Reverberation Time T20 in Octave Bands')

Clarity C80#

Parameter Clarity is frequently used for measuring intelligibility and musical definition. It quantifies the balance between early (first 80 ms: direct sound + early reflections) and late reverberation energy via their logarithmic ratio:

\[C_{80} = 10 \log_{10} \left( \frac{ \displaystyle \int_{0}^{80} p^2(t)\,\mathrm{d}t }{ \displaystyle \int_{80}^{\infty} p^2(t)\,\mathrm{d}t } \right)\]

Clarity can be efficiently computed directly from the EDC as it is implemented in pyrato.parameters.clarity():

\[C_{80} = 10 \log_{10} \frac{E(0) - E(80)}{E(80) - E(\infty)} = 10 \log_{10} \left( \frac{E(0)}{E(80)} - 1 \right),\]

where \(E(\infty) = 0\) by definition of the EDC. This approach applies to many room-acoustic parameters that are based on an energy ratio.

For flexibility we also allow different time limits. So calculation of a \(C_{\mathrm{limit}}\) with limit = 50, 60, … milliseconds is also possible.

We calculate the C80 for our previously computed octave bands EDCs with the early time limit in milliseconds:

[ ]:
C80_octave_bands = pyrato.parameters.clarity(
    edc_lundeby,
    early_time_limit=80
)

print("Octave band center frequencies in Hz:", octave_bands)
print("\nClarity C80 in dB:\n", C80_octave_bands)

Single number averages#

Rooms are aligned with one another using the single value specified in ISO 3382-2.

According to ISO 3382-2, the average should be taken over the 500 Hz and 1 kHz octave bands.

[ ]:
C80_singhle_number = np.mean(C80_octave_bands[2:4])
print(f"\nClarity C80 single number: {C80_singhle_number.item():.2f} dB\n",)

Speech Transmission Index#

The Speech Transmission Index (STI) quantifies speech intelligibility on a scale from 0 (unintelligible) to 1 (perfect). It measures how well the temporal amplitude modulations of speech are preserved when transmitted through a room or audio system.

STI is calculated by analyzing the modulation transfer across seven octave bands (125 Hz – 8 kHz) and 14 defined modulation frequencies (≈0.5–16 Hz).

Room reflections, reverberation, and background noise reduce modulation depth, lowering STI values.

In pyrato, the pyrato.parameters.speech_transmission_index_indirect() function can compute STI directly from a room impulse response (RIR) without additional specific test signals.

Therefore we load measurements for typical male speech (65 dB(A), IEC 60268-16:2020 Annex B) and background noise representative of an unoccupied auditorium (NC-20, ANSI/ASA S12.2-2019).

And compute the SNR simply by subtracting the noise from the signal level.

[ ]:
level       = np.array([70, 66, 83, 67, 60, 60, 60])   # dB(A), octave bands 125 Hz–8 kHz
noise_level = np.array([40, 33, 26, 22, 19, 17, 16])   # dB(A), NC-20 reference

snr = level - noise_level

Then, we can simply calculate the STI with pyrato and format the result numpy-like.

[ ]:
sti = pyrato.parameters.speech_transmission_index_indirect(
    rir,
    rir_type="acoustical",
    level=level,
    snr=snr,
    ambient_noise_correction=True,
).item()

print(f"Speech Transmission Index (STI): {sti:.2f}")

Other Packages#

Now, feel free to have a look at pyfar.org to get an idea of the pyfar sub-packages.

License notice#

This notebook © 2024 by the pyfar developers is licensed under CC BY 4.0

CC BY 4.0

Watermark#

[ ]:
%load_ext watermark
%watermark -v -m -iv