The auditory system takes the rapidly changing pitch percepts that arise from sounds in the natural environment and reconstructs from them a useful representation of reality. Bregman has described this processing as 'auditory scene analysis' (Bregman, 1990, 2002). The early Gestalt psychologists proposed that the brain groups auditory elements into configurations using simple rules of proximity, similarity, good continuation and common fate (Wertheimer, 1923, cited in Deutsch, 1999; see Table 1.3 and Figure 1.7).
Principle | Description |
---|---|
Proximity | Closer sounds are grouped together in preference to those spaced further apart. |
Similarity | Sounds resembling one another are grouped together as likely from the same source. |
Symmetry | Related sounds exhibit symmetrical auditory properties. |
Good continuation | Sounds that continue each other are perceptually linked. |
Common fate | Sounds moving together are likely to be connected. |
Gestalt principles facilitate the grouping of components in an auditory scene (Bregman, 1990, 2002). Distributed areas of the secondary auditory association cortices are thought to determine which Gestalt patterns are perceived from perceptual inputs (Patterson, Uppenkamp, Johnsrude, & Griffiths, 2002; refer to Figure 1.8). Gestalt principles are particularly relevant for degraded or ambiguous stimuli, which are common in everyday listening environments. In reconstructing sound events, perceptual grouping mechanisms allow linkages to be formed between some elements and inhibit linkages between others (Deutsch, 1999). For instance, when a listener tries to follow a single stream of events, such as the spoken word 'shoe', the sound energy produced by this auditory event is mixed with frequency components arising from other concurrent events in the environment, such as a violin playing or a car passing. This concept is illustrated in Figures 1.9 and 1.10. For the brain to build separate perceptual descriptions of sound-generating events, it must identify which combination of frequency components has arisen from a particular sound source (Bregman, 1990). Only by combining the right set of complex pitches, or frequency components, into the correct pattern can the identity of the signal be recognised.
Auditory scene analysis requires the processing of a sound object's location in space. The spatial properties of sound play a crucial role in pattern organisation, because they help to combine the complex pitch patterns that belong to one source (Bregman, 2002). Deficits in auditory scene analysis can take the form of impaired detection of sound movement (Bisiach, Cornacchia, Sterzi, & Vallar, 1984; Griffiths et al., 1997; Pinek & Brouchon, 1992). Such deficits have been demonstrated in studies of right hemisphere lesions, suggesting the involvement of areas outside the primary auditory cortex, such as the insula (Griffiths, Bench, & Frackowiak, 1994) and parietal cortex (Anderson, 1995).
Auditory scene analysis relies on sequential streaming to track the components of a single sound source (Bregman, 1990; Deutsch, 1975). Sequential streaming connects events that have arisen at different times from the same source into a perceptual stream of sound (see Figure 1.11). Whether this process occurs is governed by the Gestalt grouping rules of continuity, proximity, and similarity as they apply to sequentially heard tones. Functional magnetic resonance imaging has shown that anterior auditory areas provide a mechanism for tracking pitch patterns from one sound source (Warren, Uppenkamp, Patterson, & Griffiths, 2003).
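The proximity rule in sequential streaming can be illustrated with a toy computation. The sketch below is not a model taken from Bregman or Deutsch; it is a deliberately simple greedy grouping in which each tone joins the stream whose most recent tone is closest in pitch. The function name `assign_streams` and the 4-semitone threshold `max_semitone_gap` are assumptions chosen purely for illustration.

```python
import math


def assign_streams(tone_freqs_hz, max_semitone_gap=4.0):
    """Toy illustration of sequential streaming by pitch proximity.

    Each tone is appended to the stream whose last tone is nearest in
    pitch, provided the gap is at most max_semitone_gap (an assumed,
    illustrative threshold). Otherwise the tone opens a new stream.
    """
    streams = []  # each stream is a list of frequencies in order of arrival
    for freq in tone_freqs_hz:
        best_stream, best_gap = None, None
        for stream in streams:
            # Pitch distance in semitones between this tone and the
            # most recent tone of the candidate stream.
            gap = abs(12.0 * math.log2(freq / stream[-1]))
            if best_gap is None or gap < best_gap:
                best_stream, best_gap = stream, gap
        if best_stream is not None and best_gap <= max_semitone_gap:
            best_stream.append(freq)
        else:
            streams.append([freq])
    return streams


# Alternating low and high tones separate into two streams.
print(assign_streams([400, 1600, 420, 1650, 440, 1700]))
# [[400, 420, 440], [1600, 1650, 1700]]

# Tones that stay close in pitch remain in one stream.
print(assign_streams([400, 420, 410, 430]))
# [[400, 420, 410, 430]]
```

Under these assumptions, widely separated alternating tones split into a low and a high stream, whereas tones close in pitch are heard as a single sequence. Real streaming also depends on presentation rate, timbre, and attention, none of which this sketch represents.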
Auditory scene analysis requires simultaneous streaming to determine which components from several sound sources belong together. Simultaneous streaming takes acoustic inputs that occur at the same time, but at different places in the auditory scene, and treats them as properties of a single sound (see Figure 1.12). This process is more likely to occur when sound components follow the Gestalt grouping rules of similarity, symmetry, common fate, and proximity (Bregman, 1990).
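One way to picture simultaneous grouping is to treat a shared onset time as a simple stand-in for common fate: components that start together are assigned to the same sound. The sketch below is a hypothetical illustration under that assumption, not a procedure described in the sources cited here. The function name `group_by_onset` and the 30 ms tolerance are invented for the example.

```python
def group_by_onset(components, tolerance_ms=30.0):
    """Toy illustration of simultaneous grouping by shared onset.

    components: list of (frequency_hz, onset_ms) pairs.
    Components whose onsets fall within tolerance_ms (an assumed,
    illustrative value) of the first onset in a group are treated as
    parts of the same sound. Returns a list of frequency groups.
    """
    ordered = sorted(components, key=lambda c: c[1])  # sort by onset time
    groups = []
    group_onset = None  # onset of the first component in the current group
    for freq, onset in ordered:
        if groups and onset - group_onset <= tolerance_ms:
            groups[-1].append(freq)
        else:
            groups.append([freq])
            group_onset = onset
    return groups


# Three components starting near 0 ms and two starting near 250 ms.
mixture = [(200, 0), (400, 5), (600, 10), (330, 250), (660, 255)]
print(group_by_onset(mixture))
# [[200, 400, 600], [330, 660]]
```

The example separates the mixture into two candidate sounds purely on onset timing. Other cues named in the text, such as similarity and proximity, would need their own measures and are left out to keep the sketch short.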