Auditory Events of Multi-loudspeaker Playback

Zotter, Franz; Frank, Matthias

doi:10.1007/978-3-030-17207-7_2


Part of the book series: Springer Topics in Signal Processing (STSP, volume 19)


Abstract

This chapter describes the perceptual properties of auditory events, the sound images that we localize in terms of direction and width, when distributing a signal with different amplitudes to one or a couple of loudspeakers. These amplitude differences are what methods for amplitude panning implement, and they are also what the mapping of any coincident-microphone recording implies when reproduced over the directions of a loudspeaker layout. Therefore, several listening experiments on localization are described and analyzed that are essential to understand and model the psychoacoustical properties of amplitude panning on multiple loudspeakers of a 3D audio system. For delay-based recordings or diffuse sounds, there is some relation; however, it is found to be less stable for the desired applications. Moreover, amplitude panning is not only about consistent directional localization: loudness, spectrum, temporal structure, and the perceived width should be panning-invariant. The chapter also shows experiments and models required to understand and provide those panning-invariant aspects, especially for moving sounds. It concludes with openly available response data of most of the presented listening experiments.

It is evident that until one knows what information needs to be presented at the listener’s ears, no rational system design can proceed.

Michael A. Gerzon 1976 and AES Vienna [1] 1992.


Starting from classic listening experiments on stereo panning by Leakey [2] and Wendt [3], and on pairwise horizontal panning by Theile [4], this chapter explores the relevant perceptual properties for 3D amplitude panning and their models. Important experimental studies considered here are, for instance, those by Simon [5], Kimura [6], F. Wendt [7], Lee [8], Helm [9], and Frank [10, 11]. Based on the experimental results, it is possible to firmly establish Gerzon’s [1] E, $\varvec{r}_\mathrm {E}$, and $\Vert \varvec{r}_\mathrm {E}\Vert $ estimators for perceived loudness, direction, and width that apply to most stationary sounds in typical studio and performance environments.

2.1 Loudness

At a measurement point in the free field, the same signal fed to equalized loudspeakers of exactly the same acoustic distance would superimpose constructively ($+6$ dB).

In a room with early reflections and a less strict equality of the incoming pair of sounds (typical, slight inaccuracy in loudspeaker/listener position, different mounting situations, different directions in the directivities of ears and loudspeakers), the superposition can be regarded as stochastically constructive ($+3$ dB) in particular at frequencies that aren’t very low.

For the above reasons, typical amplitude panning rules normalize the weights distributing the signal to the loudspeakers by the root of the sum of their squares instead of by their linear sum, in order to obtain constant loudness ([12], VBAP):

$$\begin{aligned} g_l\leftarrow \frac{g_l}{\sqrt{\sum _{l=1}^\mathrm {L} g_l^2}}. \end{aligned}$$

(2.1)

Loudness Model. If all loudspeakers are equalized, located at the same distance to the listener, and fed by the same signal with different amplitude gains $g_l$, a constructive interference could be expected so that the amplitude becomes [1]

$$\begin{aligned} P=\sum _{l=1}^\mathrm {L} g_l. \end{aligned}$$

(2.2)

However, the interference ceases to be strictly constructive as soon as the room is not entirely anechoic or the sitting position is not exactly centered. Even under anechoic and centered conditions, the superposition at the ears cannot be assumed to be purely constructive anymore at high frequencies. Then it is better to assume a less well-defined, stochastic superposition in which the squared amplitude is determined by the sum of the squared weights [1]:

$$\begin{aligned} E=\sum _{l=1}^\mathrm {L} g_l^2. \end{aligned}$$

(2.3)

Therefore, the most common amplitude panning rules use root-squares normalization to obtain a loudness impression that is as constant as possible.
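The normalization of Eq. (2.1) and the two measures of Eqs. (2.2) and (2.3) can be sketched in a few lines. This is only an illustrative sketch; the function names are hypothetical and not taken from the chapter.

```python
import math


def normalize_gains(gains):
    """Scale the gains so that their squares sum to 1, as in Eq. (2.1)."""
    energy = sum(g * g for g in gains)
    if energy == 0.0:
        raise ValueError("at least one gain must be non-zero")
    scale = 1.0 / math.sqrt(energy)
    return [g * scale for g in gains]


def amplitude_measure(gains):
    """P of Eq. (2.2): strictly constructive (coherent) superposition."""
    return sum(gains)


def energy_measure(gains):
    """E of Eq. (2.3): stochastic superposition of squared amplitudes."""
    return sum(g * g for g in gains)


# Two loudspeakers fed with the same signal at unit gain, before normalization:
raw = [1.0, 1.0]
coherent_gain_db = 20.0 * math.log10(amplitude_measure(raw))   # about +6 dB
stochastic_gain_db = 10.0 * math.log10(energy_measure(raw))    # about +3 dB

# After root-of-squares normalization, E equals 1 for either panning position.
centered = normalize_gains(raw)
one_sided = normalize_gains([1.0, 0.0])
```

With this normalization, E stays at 1 whether one or both loudspeakers are active, whereas P would still vary with the panning position.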

The measure E seems to be most useful when designing and evaluating amplitude-panning or coincident microphone techniques. It is not surprising that ITU-R BS.1770-4 (see Note 1) uses the Leq(RLB) measure as a loudness model: it is essentially the RMS level after high-pass filtering, cf. [13], which is closely related to the E measure detected from loudspeaker signals.

An interesting refinement was proposed by Laitinen et al. [14], which uses a measure $\root p \of {\sum _{l=1}^\mathrm {L}g_l^p}$ in which the exponent p is close to 1 at low frequencies under anechoic conditions and close to 2 at high frequencies/under reverberant conditions.
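A generalized normalization with an adjustable exponent could look like the following sketch. How p should depend on frequency and reverberation is not specified here, so the exponent is simply passed in by the caller; the function name is hypothetical.

```python
def normalize_gains_p(gains, p):
    """Scale the gains so that the p-norm of the gain vector equals 1."""
    if p <= 0.0:
        raise ValueError("p must be positive")
    norm = sum(abs(g) ** p for g in gains) ** (1.0 / p)
    if norm == 0.0:
        raise ValueError("at least one gain must be non-zero")
    return [g / norm for g in gains]


# p = 1 corresponds to linear-sum normalization, p = 2 to root-of-squares.
linear_sum_case = normalize_gains_p([1.0, 1.0], p=1.0)
root_squares_case = normalize_gains_p([1.0, 1.0], p=2.0)
```

For two equal gains, p = 1 yields a gain of 0.5 per loudspeaker, while p = 2 yields about 0.707, which is the 3 dB difference between coherent and stochastic summation.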

2.2 Direction

In the early years of stereophony, researchers investigated the differences in delay times and amplitudes required to control the perceived direction. Below, only experiments are considered that did not use fixation of the listener’s head.

2.2.1 Time Differences on Frontal, Horizontal Loudspeaker Pair

The dissertation of K. Wendt in 1963 [3] shows notably accurate listening experiments done on $\pm 30^\circ $ two-channel stereophony using time delays, in which listeners indicated from where they heard the sounds for each of the tested time differences. H. Lee revisited the properties in 2013 [8], but with musical sound material and an experiment, in which the listener adjusted the time differences until the perceived direction matched the one of a corresponding fixed reference loudspeaker, Fig. 2.1.

Time differences are seldom applicable to reliable angular placement of auditory events: the auditory images are strongly frequency-dependent (not shown here) and therefore unstable for narrow-band sounds. Leakey and Cherry showed in 1957 [2] that time-delay stereophony loses its effect in the presence of background noise.

2.2.2 Level Differences on Frontal, Horizontal Loudspeaker Pair

K. Wendt’s [3] and H. Lee’s [8] experiments also deliver insights into sound source positioning with $\pm 30^\circ $ two-channel stereophony using level differences instead of time differences.

As opposed to Fig. 2.1, in which auditory image panning with time differences was characterized by statistical spreads of up to $15^\circ $, level-difference-based panning exhibits a spread of perceived directions that is clearly smaller than $10^\circ $, Fig. 2.2.

Signal dependency. Wendt [3] described the signal dependency of panning curves on various transient and band-limited sounds, and Lee [8] for musical sounds. A new comprehensive investigation on frequency dependency was carried out by Helm and Kurz [9]. With level differences $\{0,\,3,\,6,\,9,\,12\}$ dB and third-octave filtered pulsed pink noise at $\{125,250,500,1\mathrm {k},2\mathrm {k},4\mathrm {k}\}$ Hz, they showed that the perceived angle pointed at by the listeners using a motion-tracked pointer was similar between the broad-band case and third-octave bands below 2 kHz. In bands above 2 kHz, smaller level differences cause a larger lateralization, see interpolated curves in Fig. 2.3.

2.2.3 Level Differences on Horizontally Surrounding Pairs

Successive pairwise panning on neighboring loudspeaker pairs is typically used to pan auditory events freely along the loudspeakers of a horizontally surrounding loudspeaker ring. The classical research specifically targeted at such applications was contributed by Theile and Plenge in 1977 [4]. They used a mobile reference loudspeaker with a reference sound that could be moved to match the perceived direction of a loudspeaker pair playing pink noise with level differences at different orientations with respect to the listener’s head. There is also the experiment of Pulkki [15] using a level-adjustment task, in which levels were adjusted so as to match the auditory event to that of a reference loudspeaker at three different reference directions and for different head orientations. A comprehensive experiment was done by Simon et al. [5], who used a graphical user interface displaying the floor plan of a $45^\circ $-spaced loudspeaker ring to have the listeners specify the perceived direction. Martin et al. in 1999 [16] used a graphical user interface showing the floor plan of a 5.1 ring in their experiment, and last but not least, Matthias Frank used a direct pointing method to enter the perceived direction [10] in one of his experiments.

As the experiments did not seem to yield consistent results, a comprehensive level-difference adjustment experiment with 24 loudspeakers arranged as a horizontal ring was done in [17] and partially repeated later in [11], see results in Fig. 2.4. In the repeated experiment [11] it became clear that in the anechoic room, a large amount of the differently pronounced localization biases can be avoided by encouraging the listeners to do front-back and left-right head motion by a few centimeters whenever there is doubt.

2.2.4 Level Differences on Frontal, Horizontal to Vertical Pairs

T. Kimura investigated the localization of auditory events between frontal, vertical $\pm 13.5^\circ $ loudspeaker pairs quite extensively in 2012 [6, 18]. The work of F. Wendt in 2013 [7, 19] also investigated a slanted and a vertical loudspeaker pair, Fig. 2.5. Kimura used pulsed white noise, Wendt used pulsed pink noise.

Obviously, the horizontal spread is always smaller than the vertical spread and the spread does not align with the direction of the loudspeaker pair. The largest vertical spread appears for the vertical loudspeaker pair.

2.2.5 Vector Models for Horizontal Loudspeaker Pairs

A weighted sum of the loudspeakers’ direction vectors $\varvec{\uptheta }_1$, $\varvec{\uptheta }_2$ could be conceived as a simple linear model of the perceived direction, using a linear blending parameter $0\le q\le 1$

$$\begin{aligned} \varvec{r}&=(1-q)\,\varvec{\uptheta }_1+q\,\varvec{\uptheta }_2. \end{aligned}$$

(2.4)

The parameter q adjusts where the resulting vector $\varvec{r}$ is located on the connecting line between $\varvec{\uptheta }_1$ and $\varvec{\uptheta }_2$. On frontal loudspeaker pairs, localization curves typically run through the middle direction $q=\frac{1}{2}$ for level differences of 0 dB. If only one loudspeaker is active, the result is either of the loudspeaker directions, thus the parameter is $q=0$ or $q=1$.

Classical definitions. As the simplest choice for q, one could insert $q=\frac{g_2}{g_1+g_2}$ or $q=\frac{g_2^2}{g_1^2+g_2^2}$ to get the vector definitions as weighted average using either the linear or squared gains according to [1]:

$$\begin{aligned} \varvec{r}_\mathrm {V}&=\frac{g_1\,\varvec{\uptheta }_1+g_2\,\varvec{\uptheta }_2}{g_1+g_2},&\varvec{r}_\mathrm {E}&=\frac{g_1^2\,\varvec{\uptheta }_1+g_2^2\,\varvec{\uptheta }_2}{g_1^2+g_2^2}. \end{aligned}$$

(2.5)

For both models, equal gains $g_1=g_2$ yield $q=\frac{1}{2}$, and also the endpoints with $g_2=0$ or $g_1=0$ correspond to $q=0$ or $q=1$, respectively. However, the slope of the $\varvec{r}_\mathrm {E}$ vector is steeper than the one of the $\varvec{r}_\mathrm {V}$. For instance, if $g_2=2\,g_1$, the vector $\varvec{r}_\mathrm {V}$ lies on $q=2/3$ of the line between $\varvec{\uptheta }_1$ and $\varvec{\uptheta }_2$, while $\varvec{r}_\mathrm {E}$ lies at $q=4/5$ of the connecting line.

The $\varvec{r}_\mathrm {V}$ vector for the $\pm \alpha $ loudspeaker pair at the directions $\varvec{\uptheta }_{1,2}^\mathrm {T}=(\cos \alpha ,\,\pm \sin \alpha )$ corresponds to the tangent law [20], whose formal origin lies in a model of summing localization based on a simple model of the ear signals, cf. Appendix A.7. The equivalence of this law to the vector model follows from the tangent $\tan \varphi $ as the ratio of the y component to the x component of the $\varvec{r}_\mathrm {V}$ vector, $\tan \varphi =\frac{g_1\,\sin (\alpha )+g_2\,\sin (-\alpha )}{g_1\,\cos (\alpha )+g_2\,\cos (\alpha )}=\frac{g_1-g_2}{g_1+g_2}\tan \alpha $.
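The equivalence can also be checked numerically. The sketch below assumes non-negative gains with a non-zero sum and a half angle below $90^\circ $; the function names are hypothetical.

```python
import math


def r_v_azimuth_deg(g1, g2, alpha_deg):
    """Azimuth of r_V for loudspeakers at +alpha (gain g1) and -alpha (gain g2)."""
    alpha = math.radians(alpha_deg)
    theta1 = (math.cos(alpha), math.sin(alpha))
    theta2 = (math.cos(alpha), -math.sin(alpha))
    total = g1 + g2
    x = (g1 * theta1[0] + g2 * theta2[0]) / total
    y = (g1 * theta1[1] + g2 * theta2[1]) / total
    return math.degrees(math.atan2(y, x))


def tangent_law_azimuth_deg(g1, g2, alpha_deg):
    """Azimuth from tan(phi) = (g1 - g2) / (g1 + g2) * tan(alpha)."""
    alpha = math.radians(alpha_deg)
    return math.degrees(math.atan((g1 - g2) / (g1 + g2) * math.tan(alpha)))


# Both expressions give the same angle for any admissible gain pair.
angle_vector = r_v_azimuth_deg(1.0, 0.5, 30.0)
angle_tangent = tangent_law_azimuth_deg(1.0, 0.5, 30.0)
```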

Adjusted slope. Differently steep curves were fitted by an adjustable-slope model [17]

$$\begin{aligned} \varvec{r}_\gamma&=\frac{|g_1|^\gamma \,\varvec{\uptheta }_1+|g_2|^\gamma \,\varvec{\uptheta }_2}{|g_1|^\gamma +|g_2|^\gamma }, \end{aligned}$$

(2.6)

which uses $\gamma =1$ for $\varvec{r}_\mathrm {V}$ and $\gamma =2$ for $\varvec{r}_\mathrm {E}$. Figure 2.6 compares the prediction by $\varvec{r}_\mathrm {V}$, $\varvec{r}_\mathrm {E}$, and $\varvec{r}_\gamma $ to frequency-dependently perceived directions in frontal horizontal pairs, to perceived directions in a lateral stereo pair, and to perceived directions in a frontal pair that is either horizontal or vertical, using various studies mentioned above.

Practical choice $\varvec{r}_\mathrm {E}$. While a specific exponent $\gamma $ closely fitting the experimental data may vary, a constant value is preferable. Figure 2.6 indicates that in most cases focusing on $\varvec{r}_\mathrm {E}$ is reasonable and sufficiently precise, see also [11].
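As a minimal sketch of Eqs. (2.4) and (2.6), the blending parameter and the resulting direction can be computed for an arbitrary exponent $\gamma $. The function names are hypothetical; the example reuses the case $g_2=2\,g_1$ on a $\pm 30^\circ $ pair.

```python
import math


def blend_parameter(g1, g2, gamma):
    """q of Eq. (2.4) implied by the adjustable-slope model of Eq. (2.6)."""
    w1 = abs(g1) ** gamma
    w2 = abs(g2) ** gamma
    return w2 / (w1 + w2)


def r_gamma(g1, g2, theta1, theta2, gamma):
    """Weighted average of the two loudspeaker direction vectors."""
    q = blend_parameter(g1, g2, gamma)
    return tuple((1.0 - q) * a + q * b for a, b in zip(theta1, theta2))


def azimuth_deg(vec):
    """Azimuth of a 2D vector, with x pointing to the front and y to the left."""
    return math.degrees(math.atan2(vec[1], vec[0]))


alpha = math.radians(30.0)
theta1 = (math.cos(alpha), math.sin(alpha))    # loudspeaker 1 at +30 degrees
theta2 = (math.cos(alpha), -math.sin(alpha))   # loudspeaker 2 at -30 degrees

q_v = blend_parameter(1.0, 2.0, gamma=1.0)     # r_V case, q = 2/3
q_e = blend_parameter(1.0, 2.0, gamma=2.0)     # r_E case, q = 4/5
az_v = azimuth_deg(r_gamma(1.0, 2.0, theta1, theta2, gamma=1.0))
az_e = azimuth_deg(r_gamma(1.0, 2.0, theta1, theta2, gamma=2.0))
```

For the same gain ratio, the $\gamma =2$ prediction lies further towards the louder loudspeaker than the $\gamma =1$ prediction, which is the steeper slope mentioned above.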

2.2.6 Level Differences on Frontal Loudspeaker Triangles

V. Pulkki [21] and F. Wendt [7, 19] investigated localization properties for frontal loudspeaker triplets with level differences, see Fig. 2.7. Both used pulsed pink noise in their experiments.

While V. Pulkki used an indirect adjustment task to evaluate VBAP control angles to obtain auditory events directionally matching the respective reference loudspeakers, F. Wendt used a direct pointing method. Wendt’s experiments indicate that loudspeaker triplets with three different azimuthal positions yield a smaller spread in the indicated direction than those with vertical loudspeaker pairs (not the case in Pulkki’s experiments).

2.2.7 Level Differences on Frontal Loudspeaker Rectangles

F. Wendt [7, 19] moreover presents experiments about frontal loudspeaker rectangles, again using a pointer method and pulsed pink noise, Fig. 2.8.

Again it seems that arrangements avoiding vertical loudspeaker pairs exhibit a smaller statistical spread in the responses.

2.2.8 Vector Model for More than 2 Loudspeakers

For more than two active loudspeakers and in 3D, a vector model based on the exponent $\gamma =2$ yields the $\varvec{r}_\mathrm {E}$ vector [1]

$$\begin{aligned} \varvec{r}_\mathrm {E}&=\frac{\sum _{l=1}^\mathrm {L}g_l^2\,\varvec{\uptheta }_l}{\sum _{l=1}^\mathrm {L}g_l^2}. \end{aligned}$$

(2.7)
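A minimal sketch of the $\varvec{r}_\mathrm {E}$ vector for an arbitrary number of loudspeakers in 3D, normalizing the sum of squared-gain-weighted direction vectors by the summed squared gains. The helper names and the azimuth/elevation convention (x front, y left, z up) are assumptions made for illustration.

```python
import math


def unit_vector(azimuth_deg, elevation_deg=0.0):
    """Unit direction vector; x points to the front, y to the left, z up."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (math.cos(az) * math.cos(el),
            math.sin(az) * math.cos(el),
            math.sin(el))


def energy_vector(gains, directions):
    """r_E: average of the direction vectors, weighted by the squared gains."""
    energy = sum(g * g for g in gains)
    if energy == 0.0:
        raise ValueError("at least one gain must be non-zero")
    return tuple(
        sum(g * g * d[k] for g, d in zip(gains, directions)) / energy
        for k in range(3)
    )


def vector_length(vec):
    return math.sqrt(sum(c * c for c in vec))


# Equal gains on a frontal +-30 degree pair: r_E points to the front.
pair = [unit_vector(30.0), unit_vector(-30.0)]
r_e_pair = energy_vector([1.0, 1.0], pair)
r_e_length = vector_length(r_e_pair)
```

The length of the resulting vector is 1 when a single loudspeaker is active and shrinks as the energy is spread over more widely separated loudspeakers.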

2.2.9 Vector Model for Off-Center Listening Positions

At off-center listening positions, the distances to the loudspeakers are no longer equal, resulting in additional attenuation and delay for each loudspeaker depending on the position. For stationary sounds, this effect can be incorporated into the energy vector by additional weights $w_{\mathrm {r},l}$ and $w_{\uptau ,l}$

$$\begin{aligned} \varvec{r}_\mathrm {E}&=\frac{\sum _{l=1}^\mathrm {L}(w_{\mathrm {r},l}\, w_{\uptau ,l}\, g_l)^2\,\varvec{\uptheta }_l}{\sum _{l=1}^\mathrm {L}(w_{\mathrm {r},l}\, w_{\uptau ,l}\, g_l)^2}. \end{aligned}$$

(2.8)

The weight $w_{\mathrm {r},l}$ models the attenuation $\frac{1}{r}$ of a point-source-like propagation. The reference distance is the distance to the closest loudspeaker at the evaluated listening position; as such a common normalization cancels in the quotient of Eq. (2.8), the weight of each loudspeaker results in

$$\begin{aligned} w_{\mathrm {r},l} = \frac{1}{r_l}. \end{aligned}$$

(2.9)

The incorporation of delays into the energy vector requires a transformation that yields the weights $w_{\uptau ,l}$ for each loudspeaker. It is reasonable that these weights attenuate the lagging signals in order to reduce their influence on the predicted direction. An attenuation of $\frac{1}{4}\frac{\mathrm {dB}}{\mathrm {ms}}$ is known from the echo threshold in [22], similarly [23], and has successfully been applied for the prediction of localization in rooms [24]. The weight of each loudspeaker is calculated from its propagation delay $\tau _l=\frac{r_l}{c}$ in seconds, with the speed of sound $c$, at the listening position under test

$$\begin{aligned} w_{\uptau ,l} = 10^{\frac{-1000}{4\cdot 20}\tau _l}. \end{aligned}$$

(2.10)

Further weights can be applied in order to model the precedence effect in more detail, as proposed by Stitt [25, 26]. Listening test results in [27] compared the differently complex extensions of the energy vector and revealed that the simple weighting with $w_{\mathrm {r},l}$ and $w_{\uptau ,l}$ is sufficient for a rough prediction of the perceived direction in typical playback scenarios.
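The weighted energy vector can be sketched as follows for a horizontal layout. Several choices here are assumptions made for illustration and are not stated in the text: the speed of sound of 343 m/s, the delay taken as distance divided by speed of sound, distances and delays taken relative to the closest loudspeaker (a common factor cancels in the quotient anyway), and direction vectors evaluated as seen from the listening position.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not given in the text


def distance_delay_weights(distances):
    """Combined weight w_r * w_tau per loudspeaker, relative to the closest one."""
    r_min = min(distances)
    if r_min <= 0.0:
        raise ValueError("listener must not coincide with a loudspeaker")
    weights = []
    for r in distances:
        w_r = r_min / r                                  # 1/r attenuation
        tau = (r - r_min) / SPEED_OF_SOUND               # lag in seconds
        w_tau = 10.0 ** (-1000.0 / (4.0 * 20.0) * tau)   # 1/4 dB per ms
        weights.append(w_r * w_tau)
    return weights


def off_center_energy_vector(gains, speakers_xy, listener_xy):
    """Energy vector with distance and delay weights at a listening position."""
    lx, ly = listener_xy
    offsets = [(sx - lx, sy - ly) for sx, sy in speakers_xy]
    distances = [math.hypot(dx, dy) for dx, dy in offsets]
    weights = distance_delay_weights(distances)
    num_x = num_y = den = 0.0
    for g, (dx, dy), r, w in zip(gains, offsets, distances, weights):
        e = (w * g) ** 2
        num_x += e * dx / r   # direction vector as seen from the listener
        num_y += e * dy / r
        den += e
    if den == 0.0:
        raise ValueError("at least one gain must be non-zero")
    return (num_x / den, num_y / den)


# Standard stereo pair on a 2.5 m radius, same signal on both loudspeakers.
radius, alpha = 2.5, math.radians(30.0)
stereo = [(radius * math.cos(alpha), radius * math.sin(alpha)),
          (radius * math.cos(alpha), -radius * math.sin(alpha))]
r_e_center = off_center_energy_vector([1.0, 1.0], stereo, (0.0, 0.0))
r_e_shifted = off_center_energy_vector([1.0, 1.0], stereo, (0.0, 0.5))
```

At the central position both weights equal 1 and the prediction reduces to the plain energy vector; away from the center, the farther loudspeaker contributes less.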

The left side of Fig. 2.9 shows the predicted directions by the energy vector for various listening positions when playing back the same signal on a standard stereo loudspeaker pair with a radius of 2.5 m. The absolute localization error can be calculated from the difference of the predicted direction and the desired panning direction. The right side of Fig. 2.9 depicts areas with localization errors within 4 ranges: $0^\circ \dots 10^\circ $ (white, perfect localization), $10^\circ \dots 30^\circ $ (light gray, plausible localization), $30^\circ \dots 90^\circ $ (gray, rough localization), and $>\!\!90^\circ $ (dark gray, poor localization).

Concerning a single playback scenario, i.e. a single panning direction on a loudspeaker setup, the perceptual sweet area for plausible playback can be estimated by the area with localization errors below $30^\circ $. For the prediction of a more general sweet area, the absolute localization errors can be computed for all possible panning directions in a fine grid of $1^\circ $ and averaged at each listening position as shown in Fig. 2.10.
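The error ranges and the averaging over panning directions can be sketched as below. The function names are hypothetical, and the handling of values lying exactly on a range boundary is an assumption.

```python
def angular_error_deg(predicted_deg, desired_deg):
    """Absolute angular difference, wrapped into the range 0..180 degrees."""
    diff = (predicted_deg - desired_deg + 180.0) % 360.0 - 180.0
    return abs(diff)


def localization_class(error_deg):
    """Map an absolute localization error to one of the four ranges."""
    if error_deg <= 10.0:
        return "perfect"
    if error_deg <= 30.0:
        return "plausible"
    if error_deg <= 90.0:
        return "rough"
    return "poor"


def mean_error_deg(predicted_list_deg, desired_list_deg):
    """Average absolute error over a set of panning directions."""
    errors = [angular_error_deg(p, d)
              for p, d in zip(predicted_list_deg, desired_list_deg)]
    return sum(errors) / len(errors)


def in_sweet_area(predicted_list_deg, desired_list_deg, limit_deg=30.0):
    """A position counts as inside the sweet area if the mean error is below the limit."""
    return mean_error_deg(predicted_list_deg, desired_list_deg) < limit_deg
```

Evaluating `in_sweet_area` on a grid of listening positions, with predictions for panning directions in $1^\circ $ steps, would give a map of the kind described above.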

2.3 Width

M. Frank [10] investigated the auditory source width for frontal loudspeaker pairs with 0 dB level difference and various aperture angles, as well as the influence of an additional center loudspeaker on the auditory source width. The response was given by reading numbers off a left-right symmetric scale written on the loudspeaker arrangement (Fig. 2.11).

Figure 2.11 (right) shows the statistical analysis of the responses. Obviously the additional center loudspeaker decreases the auditory source width.

Auditory source width is difficult to compare for different directions, and even single loudspeakers yield auditory source widths that vary with direction. Still, a relatively constant auditory source width is desirable for moving auditory events. For static auditory events, the narrowest-possible extent can be desirable.

2.3.1 Model of the Perceived Width

The angle $2\arccos \Vert \varvec{r}_\mathrm {E}\Vert $ describes the aperture of a cap cut off the unit sphere perpendicular to the $\varvec{r}_\mathrm {E}$ vector, at its tip, from the origin, see Fig. 2.12. As the $\varvec{r}_\mathrm {E}$ vector length is between 0 (unclear direction) and 1 (only one loudspeaker active), this angle stays between $180^\circ $ and $0^\circ $.

M. Frank’s experiments about the auditory source width [10, 28] showed that stereo pairs of larger half angles $\alpha $ were also heard as wider. The length of the $\varvec{r}_\mathrm {E}$ vector decreases as the half angle $\alpha $ increases. In a symmetrical loudspeaker pair $\varvec{\uptheta }_{1,2}^\mathrm {T}=(\cos \alpha ,\,\pm \sin \alpha )$ with $g_1=g_2=1$, the y coordinate of the $\varvec{r}_\mathrm {E}$ vector cancels and its length is

$$\begin{aligned} \Vert \varvec{r}_\mathrm {E}\Vert =r_\mathrm {E,x}&=\cos \alpha . \end{aligned}$$

The corresponding spherical cap has the same size as the loudspeaker pair, $2\arccos \Vert \varvec{r}_\mathrm {E}\Vert =2\alpha $. However, only $\frac{5}{8}$ of the size was indicated by the listeners of the experiments, which yields the following estimator of the perceived width:

$$\begin{aligned} ASW&=\textstyle \frac{5}{8}\cdot \frac{180^\circ }{\pi }\cdot 2\arccos \Vert \varvec{r}_\mathrm {E}\Vert . \end{aligned}$$

(2.11)

For an additional center loudspeaker $g_3=1$, $\varvec{\uptheta }^\mathrm {T}=(1,0)$, the estimator yields

$$\begin{aligned} \Vert \varvec{r}_\mathrm {E}\Vert =r_\mathrm {E,x}&=\frac{1}{3}+\frac{2}{3}\cos \alpha , \end{aligned}$$

an increase in vector length, and thus a narrower predicted width that matches the experiments, as $\arccos \Vert \varvec{r}_\mathrm {E}\Vert <\alpha $, see Figs. 2.12 and 2.13.
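The width estimator of Eq. (2.11) can be evaluated for both cases with a short sketch; the function name and the $\pm 30^\circ $ example are chosen for illustration only.

```python
import math


def auditory_source_width_deg(r_e_length):
    """ASW estimate of Eq. (2.11) in degrees from the length of r_E."""
    clamped = min(1.0, max(0.0, r_e_length))  # guard against rounding errors
    return 5.0 / 8.0 * math.degrees(2.0 * math.acos(clamped))


alpha = math.radians(30.0)

# Pair at +-alpha with equal gains: length of r_E is cos(alpha).
pair_width = auditory_source_width_deg(math.cos(alpha))

# Pair plus center loudspeaker, all gains equal: 1/3 + 2/3 * cos(alpha).
triplet_width = auditory_source_width_deg(1.0 / 3.0 + 2.0 / 3.0 * math.cos(alpha))
```

For the $\pm 30^\circ $ pair this gives $37.5^\circ $, and adding the center loudspeaker lowers the estimate to roughly $30.5^\circ $.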

2.4 Coloration

Although research primarily focuses on the spatial fidelity of multi-loudspeaker playback, the overall quality of surround sound playback was found to be largely determined by timbral fidelity ($70\%$) [29]. Loudspeakers in a studio or performance space are often characterized by different colorations that are caused by different reflection patterns (most often from the wall behind the loudspeaker). When changing the active loudspeakers, or their number, these differences become audible. On the one hand, static coloration, e.g. the frequency responses of the loudspeakers, can typically be equalized. On the other hand, changes in coloration during the movement of a source cannot be equalized easily and yield annoying comb filters.

Although coloration is often assessed verbally [30], we employ a simple technical predictor based on the composite loudness level (CLL) by Ono [31, 32]. The CLL spectrum predicts the perceived coloration and is calculated from the sum of the loudnesses of both ears in each third-octave band. Studies about loudspeaker and headphone equalization show that differences in third-octave band levels of less than 1 dB are inaudible to most listeners [33, 34]. This criterion can also be applied for the perception of coloration, i.e., differences between CLL spectra of less than 1 dB are assumed to be inaudible.

Pairwise panning between loudspeakers results in a single active loudspeaker for source directions that coincide with the direction of a loudspeaker and two equally loud loudspeakers for source directions exactly between two neighboring loudspeakers, cf. Fig. 2.14. In the second case, the different propagation paths from the two loudspeakers to the ears create a comb filter. This comb filter is not present for sources played from a single loudspeaker. Thus, moving a source between the two directions yields noticeable coloration. This is in contrast to static sources, for which Theile’s experiments [35] indicated that they are perceived without coloration.

The actual shape of the aforementioned comb filter depends on the angular distance between the loudspeakers. The frequency of the first notch and its depth both decrease with this distance. This implies that coloration increases for playback with higher loudspeaker densities.
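The basic mechanism can be illustrated with an idealized two-path comb filter at one ear. This is only a sketch under simplifying assumptions: the arrival-time difference and the relative gain of the second path are free inputs, and their dependence on the loudspeaker spacing and on head shadowing is not modeled. The function names are hypothetical.

```python
import math


def comb_magnitude_db(frequency_hz, delay_s, relative_gain=1.0):
    """Magnitude of 1 + g * exp(-j * 2 * pi * f * delay) in dB."""
    phase = 2.0 * math.pi * frequency_hz * delay_s
    re = 1.0 + relative_gain * math.cos(phase)
    im = -relative_gain * math.sin(phase)
    magnitude = math.hypot(re, im)
    return 20.0 * math.log10(max(magnitude, 1e-12))  # avoid log of zero


def first_notch_hz(delay_s):
    """Lowest frequency at which the two paths arrive in phase opposition."""
    return 1.0 / (2.0 * delay_s)


# Example: 0.5 ms arrival-time difference, second path 6 dB weaker.
delay = 0.0005
notch = first_notch_hz(delay)                         # 1000 Hz
depth_db = comb_magnitude_db(notch, delay, relative_gain=0.5)
peak_db = comb_magnitude_db(0.0, delay, relative_gain=0.5)
```

A longer arrival-time difference moves the first notch to a lower frequency, and a weaker second path makes the notch shallower.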

A similar comb filter is created when using a triplet of loudspeakers with the same loudspeaker density as the pair, e.g. L, C, R compared to C, R. In order to avoid a strong increase in source width or annoying phasing effects, the outermost loudspeakers L and R are strongly reduced in their level, typically to around $-12$ dB compared to loudspeaker C. In doing so, the similarity of the comb filters yields barely any coloration when moving a source between the two directions, cf. Fig. 2.15.

Judging from what is shown above, it appears beneficial to always activate a few loudspeakers to stabilize the coloration, as opposed to using just one loudspeaker and moving the playback to another one. Keeping the number of simultaneously active loudspeakers more or less constant not only prevents coloration of source movements, it also yields a more constant source width. Because of this relation between coloration and source width, the fluctuation of $\Vert \varvec{r}_\mathrm {E}\Vert $ is also a simple predictor of panning-dependent coloration.

In general, the strongest coloration is perceived under anechoic listening conditions. In reverberant rooms, the additional comb filters introduced by reflections help to conceal the comb filters due to multi-loudspeaker playback.

2.5 Open Listening Experiment Data

Experimental data from azimuthal localization in frontal and lateral loudspeaker pairs (Figs. 2.3 and 2.4), azimuthal/elevational localization in horizontal, skew, and vertical frontal pairs (Fig. 2.5), triangles (Fig. 2.7), and quadrilaterals (Fig. 2.8) are available online at https://opendata.iem.at in the listening experiment data project, as well as the data of the width experiment in Fig. 2.11.

The opendata.iem.at listening experiment data project contains evaluation routines to analyze the 95%-confidence intervals: symmetrically, based on means, standard deviations, and the inverse Student’s t-distribution (CIMEAN.m); more robustly, based on medians, inter-quartile ranges, and Student’s t-distribution (CI2.m); or for two-dimensional data (robust_multivariate_confidence_region.m). The MATLAB script plot_gathered_data.m reads the formatted listening experiment data, and its exemplary code generates figures like the ones above.
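For orientation, the textbook form of a symmetric 95%-confidence interval of the mean based on the inverse Student’s t-distribution is given below. Whether CIMEAN.m implements exactly this expression is an assumption; the symbols $\bar{x}$ (mean of the responses), $s$ (their sample standard deviation), and $N$ (the number of responses) are introduced here for illustration only.

$$\begin{aligned} \bar{x}\pm t_{0.975,\,N-1}\,\frac{s}{\sqrt{N}}, \end{aligned}$$

where $t_{0.975,\,N-1}$ denotes the $97.5\%$ quantile of Student’s t-distribution with $N-1$ degrees of freedom.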

In order to support others in providing their own listening experiment data, the MATLAB functions write_experimental_data.m and read_experimental_data.m are provided on the website.

Notes

1. ITU-R BS.1770-4, Algorithms to measure audio programme loudness and true-peak audio level (10/2015), https://www.itu.int/rec/R-REC-BS.1770-4-201510-I/en

References

1. M. Gerzon, General metatheory of auditory localization, in prepr. 3306, Convention Audio Engineering Society (1992)
2. D.M. Leakey, E.C. Cherry, Influence of noise upon the equivalence of intensity differences and small time delays in two-loudspeaker systems. J. Acoust. Soc. Am. 29, 284–286 (1957)
3. K. Wendt, Das Richtungshören bei Überlagerung zweier Schallfelder bei Intensitäts- und Laufzeitstereophonie, Ph.D. Thesis, RWTH-Aachen (1963)
4. G. Theile, G. Plenge, Localization of lateral phantom sources. J. Audio Eng. Soc. 25(4), 196–200 (1977)
5. L. Simon, R. Mason, F. Rumsey, Localisation curves for a regularly-spaced octagon loudspeaker array, in prepr. 7015, Convention Audio Engineering Society (2009)
6. T. Kimura, H. Ando, 3D audio system using multiple vertical panning for large-screen multiview 3D video display. ITE Trans. Media Technol. Appl. 2(1), 33–45 (2014)
7. F. Wendt, M. Frank, F. Zotter, Panning with height on 2, 3, and 4 loudspeakers, in Proceedings of the ICSA (Erlangen, 2013)
8. H. Lee, F. Rumsey, Level and time panning of phantom images for musical sources. J. Audio Eng. Soc. 61(12), 978–988 (2013)
9. J.M. Helm, E. Kurz, M. Frank, Frequency-dependent amplitude-panning curves, in 29th Tonmeistertagung (Köln, 2016)
10. M. Frank, Phantom sources using multiple loudspeakers in the horizontal plane, Ph.D. Thesis, Kunstuni Graz (2013)
11. M. Frank, F. Zotter, Extension of the generalized tangent law for multiple loudspeakers, in Fortschritte der Akustik - DAGA (Kiel, 2017)
12. V. Pulkki, Virtual sound source positioning using vector base amplitude panning. J. Audio Eng. Soc. 45(6), 456–466 (1997)
13. G. Soulodre, Evaluation of objective measures of loudness, in prepr. 6161, 116th AES Convention (Berlin, 2004)
14. M.-V. Laitinen, J. Vilkamo, K. Jussila, A. Politis, V. Pulkki, Gain normalization in amplitude panning as a function of frequency and room reverberance, in Proceedings 55th AES Conference (Helsinki, 2014)
15. V. Pulkki, Compensating displacement of amplitude-panned virtual sources, in 22nd Conference Audio Engineering Society (2003)
16. G. Martin, W. Woszczyk, J. Corey, R. Quesnel, Sound source localization in a five-channel surround sound reproduction system, in prepr. 4994, Convention Audio Engineering Society (1999)
17. F. Zotter, M. Frank, Generalized tangent law for horizontal pairwise amplitude panning, in Proceedings of the 3rd ICSA, Graz (2015)
18. T. Kimura, H. Ando, Listening test for three-dimensional audio system based on multiple vertical panning, in Acoustics-12, Hong Kong (2012)
19. F. Wendt, Untersuchung von Phantomschallquellen vertikaler Lautsprecheranordnungen, M. Thesis, Kunstuni Graz (2013)
20. H. Clark, G. Dutton, P. Vanerlyn, The ’stereosonic’ recording and reproduction system. Repr. J. Audio Eng. Soc. Proc. Inst. Electr. Eng. 19576(2), 102–117 (1958)
21. V. Pulkki, Localization of amplitude-panned virtual sources II: two- and three-dimensional panning. J. Audio Eng. Soc. 49(9), 753–767 (2001)
22. B. Rakerd, W.M. Hartmann, J. Hsu, Echo suppression in the horizontal and median sagittal planes. J. Acoust. Soc. Am. 107(2) (2000)
23. P.W. Robinson, A. Walther, C. Faller, J. Braasch, Echo thresholds for reflections from acoustically diffusive architectural surfaces. J. Acoust. Soc. Am. 134(4) (2013)
24. F. Zotter, M. Frank, M. Kronlachner, J. Choi, Efficient phantom source widening and diffuseness in ambisonics, in EAA Symposium on Auralization and Ambisonics (Berlin, 2014)
25. P. Stitt, S. Bertet, M. van Walstijn, Extended energy vector prediction of ambisonically reproduced image direction at off-center listening positions. J. Audio Eng. Soc. 64(5), 299–310 (2016)
26. P. Stitt, S. Bertet, M. van Walstijn, Off-center listening with third-order ambisonics: dependence of perceived source direction on signal type. J. Audio Eng. Soc. 65(3), 188–197 (2017)
27. E. Kurz, M. Frank, Prediction of the listening area based on the energy vector, in Proceedings of the 4th International Conference on Spatial Audio (ICSA) (Graz, 2017)
28. M. Frank, Source width of frontal phantom sources: perception, measurement, and modeling. Arch. Acoust. 38(3), 311–319 (2013)
29. F. Rumsey, S. Zieliński, R. Kassier, S. Bech, On the relative importance of spatial and timbral fidelities in judgments of degraded multichannel audio quality. J. Acoust. Soc. Am. 118(2), 968–976 (2005), http://link.aip.org/link/?JAS/118/968/1
30. S. Choisel, F. Wickelmaier, Evaluation of multichannel reproduced sound: scaling auditory attributes underlying listener preference. J. Acoust. Soc. Am. 121(1), 388–400 (2007)
31. K. Ono, V. Pulkki, M. Karjalainen, Binaural modeling of multiple sound source perception: methodology and coloration experiments. Audio Eng. Soc. Conv. 111, 11 (2001), http://www.aes.org/e-lib/browse.cfm?elib=9884
32. K. Ono, V. Pulkki, M. Karjalainen, Binaural modeling of multiple sound source perception: coloration of wideband sound. Audio Eng. Soc. Conv. 1124 (2002), http://www.aes.org/e-lib/browse.cfm?elib=11331
33. A. Ramsteiner, G. Spikofski, Ermittlung von Wahrnehmbarkeitsschwellen für Klangfarbenunterschiede unter Verwendung eines diffusfeldentzerrten Kopfhörers (Fortschritte der Akustik, DAGA, 1987)
34. M. Karjalainen, E. Piirilä, A. Järvinen, J. Huopaniemi, Comparison of loudspeaker equalization methods based on DSP techniques. J. Audio Eng. Soc. 47(1/2), 14–31 (1999), http://www.aes.org/e-lib/browse.cfm?elib=12117
35. G. Theile, Über die Lokalisation im überlagerten Schallfeld, Ph.D. dissertation, Technische Universität (Berlin, 1980)


Author information

Authors and Affiliations

Institute of Electronic Music and Acoustics, University of Music and Performing Arts, Graz, Austria
Franz Zotter & Matthias Frank

Authors

Franz Zotter
Matthias Frank

Corresponding author

Correspondence to Franz Zotter .

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.


About this chapter

Cite this chapter

Zotter, F., Frank, M. (2019). Auditory Events of Multi-loudspeaker Playback. In: Ambisonics. Springer Topics in Signal Processing, vol 19. Springer, Cham. https://doi.org/10.1007/978-3-030-17207-7_2


DOI: https://doi.org/10.1007/978-3-030-17207-7_2
Published: 01 May 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-17206-0
Online ISBN: 978-3-030-17207-7
eBook Packages: Engineering, Engineering (R0)
