Gestalt principles
From Scholarpedia
Gestalt principles, or gestalt laws, are rules of the organization of perceptual scenes. When we look at the world, we usually perceive complex scenes composed of many groups of objects on some background, with the objects themselves consisting of parts, which may be composed of smaller parts, etc. How do we accomplish such a remarkable perceptual achievement, given that the visual input is, in a sense, just a spatial distribution of variously colored individual points? The beginnings and the direction of an answer were provided by a group of researchers early in the twentieth century, known as Gestalt psychologists. Gestalt is a German word meaning 'shape' or 'form'. Gestalt principles aim to formulate the regularities according to which the perceptual input is organized into unitary forms, also referred to as (sub)wholes, groups, groupings, or Gestalten (the plural form of Gestalt). These principles mainly apply to vision, but there are also analogous aspects in auditory and somatosensory perception. In visual perception, such forms are the regions of the visual field whose portions are perceived as grouped or joined together, and are thus segregated from the rest of the visual field. The Gestalt principles were introduced in a seminal paper by Wertheimer (1923/1938), and were further developed by Köhler (1929), Koffka (1935), and Metzger (1936/2006; see review by Todorović, 2007). For a modern textbook presentation, including more recent contributions, see Palmer (1999).
Figure-ground articulation
If the visual field is homogeneous throughout, a situation labeled as Ganzfeld (German for 'whole field'), it has no consistent internal organization. A simple case of an inhomogeneous field is a display with a patch of one color surrounded by another color, as in Figure 1.
In such cases the visual field is perceived as articulated into two components, the figure (patch) on the ground (surround). This figure-ground articulation may seem obvious, but it is not trivial. This type of field organization has a number of remarkable features, first described in the work of Rubin (1915/1921), predating Wertheimer's publication. The two components are perceived as two segments of the visual field differing not only in color, but in some other phenomenal characteristics as well. The figure has an object-like character, whereas the ground has less perceptual saliency and appears as 'mere' background. The areas of the figure and the ground usually do not appear juxtaposed in a common plane, as in a mosaic, but rather as stratified in depth: there is a tendency to see the figure as positioned in front, and the ground at a further depth plane and continuing to extend behind the figure, as if occluded by it. Furthermore, the border separating the two segments is perceived as belonging to the figure rather than to the ground, and as delineating the figure's shape as its contour, whereas it is irrelevant to the shape of the ground. Certain displays are bi-stable, in that what is perceived as figure can also be perceived as ground and vice-versa. However, in displays structured such as Figure 1, in which a smaller region is wholly surrounded by a larger region, it is usually the former that appears as figure (although it may also be seen as a hole), and the latter as ground.
The described organization of the display into the figure and the ground is not its only conceivable segmentation. To illustrate this, consider that Figure 1, as presented on the computer screen, is a set composed of a certain number of pixels, and that the segmentation into figure and ground corresponds to a particular partition of this set into two subsets. However, this same set may be partitioned into a huge number of other pairs of subsets (such as the subset of pixels in the left half of the figure and the subset in the right half, or the subset at one side of any arbitrary line meandering through the display and the subset at the other side, or the subset consisting of even pixels in odd rows plus odd pixels in even rows and the complementary subset), or into any conceivable three subsets, or four subsets etc. Nevertheless, while an enormous number of such alternative partitions are conceivable, none of them is perceivable, save one or very few. The partition that is actually seen is not a matter of geometric combinatorics and attention to arbitrarily selected subsets: the natural, and often the only way that we can perceive such a display, given the structure of the visual input, is as segmented into the figure and the ground. Such articulation, in which a virtual infinity of geometrical possibilities is pruned down to a single or only a couple of perceptual realizations, is a very basic feature of the working of the visual system.
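To give a rough sense of how large this space of conceivable partitions is, the following minimal sketch counts the ways in which a set of n pixels can be split into k non-empty subsets (the Stirling numbers of the second kind). The function name and the tiny 4 x 4 display used in the usage lines are illustrative assumptions, not part of the original demonstration.

```python
def count_partitions(n: int, k: int) -> int:
    """Count the ways to split n distinct pixels into k non-empty, unlabeled subsets.

    Uses the recurrence S(m, j) = j * S(m - 1, j) + S(m - 1, j - 1).
    """
    if n < 0 or k < 0:
        raise ValueError("n and k must be non-negative")
    # row[j] holds S(m, j) for the current m; start with m = 0, where S(0, 0) = 1.
    row = [1] + [0] * k
    for m in range(1, n + 1):
        # Update from high j to low j so that row[j - 1] still holds S(m - 1, j - 1).
        for j in range(min(m, k), 0, -1):
            row[j] = j * row[j] + row[j - 1]
        row[0] = 0  # S(m, 0) = 0 for every m > 0
    return row[k]


# A hypothetical 4 x 4 display has only 16 pixels, yet:
print(count_partitions(16, 2))  # 32767 two-subset partitions, i.e. 2**15 - 1
print(count_partitions(16, 3))  # already in the millions for three subsets
```

Even for this toy display the number of conceivable two-part segmentations runs into the tens of thousands, while only one or very few of them are perceived.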
Although figure-ground articulation is a fundamental aspect of field organization, it is not usually itself referred to as a Gestalt law or principle of grouping. Rather, such terms are mostly used for describing the rules of the organization of somewhat more complex visual fields. There is no definitive list of Gestalt principles, but some of the most commonly discussed are listed and described below, illustrated with examples mainly based on Wertheimer (1923/1938) and Metzger (1936/2006). As demonstrated by these examples, the perceptual groupings are in some cases strong and unambiguous, but in other cases they are better described as tendencies, especially when different factors compete with each other.
Proximity principle
Organizational principles include not only single figures but also sets of non-adjacent regions. Such a set is depicted in Figure 2a. Each of the six patches is perceived as a visual unit, a figure on a common ground. However, they are also collectively the elements of a higher-order visual unit, the horizontal row. According to Gestalt theory, this integration of individual components into a superordinate whole can be accounted for by the proximity principle: elements tend to be perceived as aggregated into groups if they are near each other.
The effect of varying proximity is illustrated in Figure 2b. Due to the change of distance between some of the components, here the patches are perceived not just collectively as a sextuple, but also as being subdivided into a triple of doublets, an organization that in Wertheimer's notation is designated as 12/34/56.
A number of other potential partitions of the set in Figure 2b exist, such as into a doublet of triples (123/456), or into a quartet and a pair (1234/56), or even into combinations of non-adjacent items such as 16/25/34, or 135/246 etc. However, it is extremely hard, if not impossible, to actually perceive such groupings in this figure. On the other hand, it is not impossible to see some subdivisions in Figure 2a. For example, with deliberate effort and concentrated attention one may eventually succeed in mentally partitioning the row of patches into three pairs. However, such a percept is usually only partially and locally successful (one clearly sees only one or two segregated pairs), appears contrived, and is fleeting. In contrast, perceiving the same partition in Figure 2b is spontaneous and effortless, and the percept is global and stable. Attention may contribute to figural perception, but, except in special cases, its role is usually limited: generally, it is not attention that creates the forms, but rather the forms, organized in accord with Gestalt principles, that draw attention.
With a different spatial distribution of the six components, such as in Figure 2c, another naturally perceived partition into sub-wholes arises (1/23/45/6). In this display the above clustering into three groups of pairs (12/34/56), although arguably simpler and more regular, is hard to perceptually realize, as it would involve grouping together some elements across relatively larger distances, but assigning other, relatively near elements, into different groups.
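The groupings described for Figures 2a, 2b, and 2c can be mimicked by a very simple rule: split the row wherever a gap is clearly larger than the smallest gap. The sketch below is only a toy illustration of that rule, not a model of the visual system. The function name, the threshold `ratio`, and the coordinates standing in for the three figures are all assumptions chosen for the example.

```python
def group_by_proximity(positions, ratio=1.5):
    """Group a row of elements by proximity and return Wertheimer-style notation.

    A boundary is placed between two neighbors when their gap exceeds
    `ratio` times the smallest gap in the row (an assumed, arbitrary threshold).
    Labels are single digits, so this is meant for rows of at most nine elements.
    """
    xs = sorted(positions)
    if not xs:
        return ""
    gaps = [right - left for left, right in zip(xs, xs[1:])]
    smallest = min(gaps, default=0)
    parts = ["1"]
    for label, gap in enumerate(gaps, start=2):
        if gap > ratio * smallest:
            parts.append("/")  # a relatively large gap starts a new group
        parts.append(str(label))
    return "".join(parts)


# Hypothetical coordinates standing in for the three displays:
print(group_by_proximity([0, 1, 2, 3, 4, 5]))  # 123456     (equal spacing, as in Figure 2a)
print(group_by_proximity([0, 1, 3, 4, 6, 7]))  # 12/34/56   (as in Figure 2b)
print(group_by_proximity([0, 2, 3, 5, 6, 8]))  # 1/23/45/6  (as in Figure 2c)
```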
Common fate principle
The common fate principle states that elements tend to be perceived as grouped together if they move together. Thus, if some of the elements in Figure 2 began to move in unison, for example up and down, they would be perceived as a group, even across larger distances.
Similarity principle
Another rule involving sets of discrete figures is the similarity principle: elements tend to be integrated into groups if they are similar to each other.
The similarity principle is illustrated in Figure 3, in which proximity is held constant, since the individual figures are at (approximately) the same distance from each other, as in Figure 2a. Nevertheless, they are perceptually partitioned into three adjacent pairs, due to similarity of color (3a), size (3b), orientation (3c), or shape (3d). Compounding the within-group similarities and between-group differences, by making the doublets similar / different in color and in size and in shape etc, would make these sub-wholes still more salient. Note also that by varying the proximity between the elements, this factor may be put into competition or co-operation with similarity. Thus by increasing the distance between elements 2 and 3, and elements 4 and 5, as in Figure 2b, the salience of the 12/34/56 grouping in the examples in Figure 3 would be increased, whereas by changing the distances as in Figure 2c, it would be decreased. Such manipulations, pointed out already by Wertheimer (1923), can be used to quantify the effects of Gestalt principles.
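The competition and co-operation between proximity and similarity can be made concrete with a toy score. In the sketch below, the link between two neighbors is weakened by their distance and strengthened by a fixed bonus when they share a feature, and the row is split at every link weaker than the strongest one. The additive score, the bonus values, and the coordinates are assumptions made purely for illustration. They are not taken from Wertheimer or from any model cited in this article.

```python
def group_by_link_strength(positions, features, similarity_bonus):
    """Group a left-to-right row of elements using a toy proximity-plus-similarity score.

    positions: coordinates in left-to-right order.
    features:  one feature value (e.g. a color code) per element.
    Link strength between neighbors = -gap (+ similarity_bonus if features match).
    Every link weaker than the strongest link becomes a group boundary.
    """
    if not positions:
        return ""
    strengths = []
    for i in range(len(positions) - 1):
        gap = positions[i + 1] - positions[i]
        same = features[i] == features[i + 1]
        strengths.append(-gap + (similarity_bonus if same else 0.0))
    strongest = max(strengths, default=0.0)
    parts = ["1"]
    for label, strength in enumerate(strengths, start=2):
        if strength < strongest - 1e-9:
            parts.append("/")
        parts.append(str(label))
    return "".join(parts)


colors = "AABBCC"  # hypothetical color codes for elements 1 to 6
# Equal spacing, so similarity alone decides (as in Figure 3a):
print(group_by_link_strength([0, 1, 2, 3, 4, 5], colors, 1.0))  # 12/34/56
# Spacing as in Figure 2c, weak similarity: proximity wins.
print(group_by_link_strength([0, 2, 3, 5, 6, 8], colors, 0.5))  # 1/23/45/6
# Same spacing, strong similarity: similarity wins.
print(group_by_link_strength([0, 2, 3, 5, 6, 8], colors, 2.0))  # 12/34/56
```

Varying the bonus against the spacing in this way is a crude analogue of the manipulations described above for measuring the relative strength of the two factors.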
Continuity principle
The display in Figure 4a can be described as consisting of a number of elements arranged in three sub-wholes or branches, converging at X. According to the principle of proximity, one would expect branch BX to group with branch CX, but instead it groups with branch AX, forming the sub-whole AXB.
This grouping is an instance of the continuity principle: oriented units or groups tend to be integrated into perceptual wholes if they are aligned with each other. The principle applies in the same way for elements arranged along lines (4a) as well as for patterns built from corresponding lines themselves (4b). Note that the balance between continuity and proximity in the formation of salient sub-wholes may be shifted by varying similarity, which can be accomplished by coloring different branches differently. Thus coloring BX the same as AX but differently from CX would make AXB a more salient unit, whereas coloring BX the same as CX but differently from AX would make CXB more salient.
Closure principle
Figure 5 is constructed by adding some appropriate elements to Figure 4. Whereas in Figures 4a and 4b the component BX is grouped with AX, in Figures 5a and 5b there is a tendency for this component to group instead with CX, both BX and CX being sides of shape BCX. This is an instance of the closure principle: elements tend to be grouped together if they are parts of a closed figure. However, in this particular example, continuity is still relatively effective, enabling alternate perceptual partitions.
Note that the patterns in Figures 4a and 4b, although physically contained in Figures 5a and 5b, are hard to see there: they can be sought out with directed attention, but do not appear spontaneously as natural visual wholes. The reason for this is not simply that more elements are added in the display. This is demonstrated in Figure 6, in which the pattern in a is readily discernible in b in spite of many added elements, but is practically invisible in c, d, and e, although geometrically it is just as present there (and in the same place) as in a and b. The loss of the visual identity of the pattern is due to the effectiveness of the Gestalt principles, mainly continuity and closure, according to which its elements are perceptually integrated with other, added elements, and assigned to other, new visual wholes. One way in which its visual identity can be recovered is by simply changing its color to make it dissimilar from the surround. For a demonstration, position the cursor anywhere within the area of Figure 6. Note also that when the cursor is removed from the figure and the pattern again assumes the same color as the added elements, it quickly (though not necessarily instantaneously) fades from view, and no effort of attention can restore it to a salient visual whole. For a further demonstration, hold the left mouse button depressed while positioned within the area of the figure, which will remove the pattern and reveal only the added elements. A classical study of such 'hidden figure' effects was reported by Gottschaldt (1926).
Figure 6: Camouflage.
These examples are instances of camouflage, the phenomenon in which objects are hidden from view, not by being occluded but by being perceptually subdivided and repartitioned, that is, being broken up internally and their parts being grouped with parts of the surrounding environment. As used by animals in the struggle for survival and by humans in warfare, the power of Gestalt principles makes it possible for organisms and things which are in plain sight to become effectively invisible and therefore undetectable by adversaries. This illustrates the fact that to visually exist it is not enough for a physical object to be optically present, but that in addition it has to conform with certain perceptual laws.
Good gestalt principle
The pattern in Figure 7a is readily partitioned into two components, a straight line that crosses a wavy line. The alternative decomposition depicted in Figure 7b is unlikely because it would violate the continuity principle. However, an appeal to continuity does not explain why the partition in Figure 7c does not spontaneously arise easily either, given that both components are continuous lines, just as in Figure 7a.
In another, related example, Figure 8a spontaneously decomposes into a semi-wheel with curved cogs and a rectangular 'snake'. However, this perceptual outcome actually violates the continuity principle, because at the point at which these two components touch this decomposition involves angles, instead of following the directions of the crossing continuous lines, as indicated in Figure 8b.
According to Gestalt theorists, such examples are instances of the good gestalt principle: elements tend to be grouped together if they are parts of a pattern which is a good Gestalt, meaning as simple, orderly, balanced, unified, coherent, regular, etc as possible, given the input. In this sense, the straight line and the wavy line perceived in Figure 7a are better forms than the pairs of lines in 7b and 7c, and in Figure 8a the cog wheel and the snake are better forms than the hybrid shapes in Figure 8b, that would be generated by conforming to the continuity principle at the crossing point. In such cases global regularity takes precedence over local relations. This principle is also called the 'law of good form' or the 'law of Prägnanz', a German word that translates roughly as salience, incisiveness, conciseness, impressiveness, or orderliness.
Past experience principle
In some cases the visual input is organized according to the past experience principle: elements tend to be grouped together if they were together often in the past experience of the observer. For example, we tend to perceive the pattern in Figure 9a as a meaningful word, and see it as built up from particular individual letters, as in Figure 9b, although it has many other possible partitions, such as in Figure 9c. This specific perceived organization is plausibly mainly due to our familiarity with words and letters as written in script form of the Roman alphabet, and might not occur for observers lacking such familiarity.
Although thus acknowledged by the gestaltists, the experience-based principle was deemed of secondary importance, compared with the other, stimulus-based principles, and easily dominated by them. As an example, in the pattern in Figure 9d, in which a slightly overlapping inverted version is added, the original stimulus is much harder to see, due to the appearance of numerous new salient sub-patterns, generated by continuity and closure. Note also that in writing perhaps the most potent Gestalt principle is proximity, which, by employing the simple device of blank spaces (which were not used in antiquity), helps group together the letters that form words. This is demonstrated by the difficulty wehavewhenreadingtextnotseparatedbyblanks an dev enmor ew henbl an kspa cesap pea rinwr ongpl aces.
Auditory Gestalten
As in vision, issues of organization, grouping, and segmentation arise in the auditory domain as well (Bregman, 1990; Kubovy & van Valkenburg, 2001). The acoustic input is just a one-dimensional temporally varying air pressure waveform, but based on it we can perceive an auditory scene involving multiple sources of human speech, vocal and instrumental music, animal sounds and other nature noises, occasionally all occurring at the same time, each with its own sub-phrasing and structure. Some visual Gestalt principles directly apply in the acoustic domain, but mainly in a temporal rather than spatial form. For example, silence or background noise, interrupted by a loud sound, followed again by silence or noise, is an auditory analogue of a figure on a ground. Similarly, a regular series of identical short clicks is an analogue of Figure 2a, with equal temporal intervals between sound events playing the role of equal spatial distances. With deliberate attention, one can mentally superimpose a structure on this sequence, such as hearing consecutive pairs of clicks, as in 12/34/56. However, such a phenomenal segmentation is achieved much more naturally and easily by simply increasing the intervals between some clicks, analogously to Figure 2b. This is an instance of an auditory temporal analogue of the visual spatial proximity principle; there is also a spatial auditory variant, involving pairs of identical sounds separated by equal intervals, but coming from different directions, such as left, left/in front, in front/right, right. Auditory analogues of instances of the visual similarity principle, as illustrated in Figure 3, are also readily established, but with differences and similarities of color, size etc being replaced by differences and similarities of loudness, pitch, and timbre of sounds. Auditory analogues of some other Gestalt principles may also be constructed.
Contemporary work
The principles described above, together with others not illustrated here, such as the symmetry principle (symmetrical components will tend to group together), the convexity principle (convex rather than concave patterns will tend to be perceived as figures), and others, are part of the classical heritage of perception studies. In contemporary research, of which only a few examples will be noted below, the seminal insights and issues raised by the gestaltists are developed and extended in various directions.
For example, contrary to the classical views, more recent research has indicated that even such a basic feature as figure-ground articulation may in some instances be based on experience (Peterson & Skow-Grant, 2003). In displays with two homogeneous regions, neither of which surrounds the other, assignment to figure and ground is often ambiguous. However, in some cases in which one region resembles an object, such as a tree in Figure 10, that region tends to be perceived as figure.
Palmer and colleagues have developed some new principles of visual field organization. For example, Palmer (1992) has proposed the common region principle: elements tend to be grouped together if they are located within the same closed region. An illustration is provided in Figure 11a. It depicts the same spatial distribution of elements which, in Figure 2c, elicited the grouping 1/23/45/6; however, with superimposed closed contours the preferred grouping becomes 12/34/56.
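As a toy illustration of grouping by common region, the sketch below assigns each element of a one-dimensional row to the closed region (here simply an interval) that contains it. The function name, the interval representation of regions, and the coordinates standing in for Figure 11a are assumptions made for the example. The sketch deliberately ignores proximity, so elements outside every region are each left on their own.

```python
def group_by_common_region(positions, regions):
    """Group a left-to-right row of elements by the region that encloses them.

    positions: element coordinates in left-to-right order.
    regions:   list of (left, right) intervals standing in for closed contours.
    Elements not enclosed by any region form singleton groups.
    """
    groups = {}  # insertion order follows the left-to-right element order
    for label, x in enumerate(positions, start=1):
        enclosing = next(
            (i for i, (left, right) in enumerate(regions) if left <= x <= right),
            None,
        )
        key = enclosing if enclosing is not None else ("alone", label)
        groups.setdefault(key, []).append(str(label))
    return "/".join("".join(members) for members in groups.values())


# Hypothetical spacing as in Figure 2c, with three enclosing contours:
positions = [0, 2, 3, 5, 6, 8]
contours = [(-0.5, 2.5), (2.5, 5.5), (5.5, 8.5)]
print(group_by_common_region(positions, contours))  # 12/34/56
```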
Palmer & Rock (1994) proposed the element connectedness principle: elements tend to be grouped together if they are connected by other elements. This principle is illustrated in Figure 11b. Like Figure 11a, Figure 11b is also based on Figure 2c, but, due to some elements being connected, the preferred perceived grouping is 12/34/56.
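Grouping by element connectedness can be sketched as finding the connected components of a graph whose nodes are the elements and whose edges are the connecting bars. The following union-find sketch is an illustrative assumption about how one might compute such groups, not a procedure given by Palmer and Rock. The function name and the list of links standing in for Figure 11b are hypothetical.

```python
def group_by_connectedness(n_elements, links):
    """Group elements 1..n_elements into connected components.

    links: pairs (a, b) of element labels joined by a connecting element.
    Returns the grouping in Wertheimer-style notation, e.g. '12/34/56'.
    """
    parent = list(range(n_elements + 1))  # index 0 is unused

    def find(i):
        # Follow parent pointers to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller label as the root of the merged component.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    groups = {}
    for label in range(1, n_elements + 1):
        groups.setdefault(find(label), []).append(str(label))
    return "/".join("".join(members) for members in groups.values())


# Hypothetical connections standing in for Figure 11b:
print(group_by_connectedness(6, [(1, 2), (3, 4), (5, 6)]))  # 12/34/56
```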
Researchers have also presented computational models of some Gestalt principles (Kubovy & van den Berg, 2008), studied their possible neural bases (Sasaki, 2007; Han et al., 2005; Qiu & von der Heydt, 2005; Roelfsema, 2006), and attempted to relate them to natural image statistics (Geisler et al., 2001; Elder & Goldberg, 2002).
Unresolved issues
As formulated by Wertheimer, Gestalt principles involve a 'ceteris paribus' (all other things being equal) clause (Palmer, 1999). That is, each principle is supposed to apply given that the other principles do not apply or are being held constant. In case two (or more) principles apply for the same input, and they favor the same grouping, it will tend to become strengthened; however, if they disagree, usually one wins or the organization of the percept is unclear. Several examples of the domination of one principle over another are presented above. However, although it has been addressed to some extent in the literature (see, e.g., Kubovy & van den Berg, 2008), the significant theoretical problem of how to predict which principle will win in which circumstances remains to be worked out in much more detail.
Gestalt principles are usually illustrated with rather simple drawings, such as those above. Ideally, it should be possible to apply them to an arbitrarily complex image and, as a result, produce a hierarchical parsing of its content that corresponds to our perception of its wholes and sub-wholes. This ambitious goal is yet to be accomplished.
It has been suggested that most Gestalt principles are special instances of the overarching Good Gestalt principle, in the sense that being continuous, closed, similar etc are ways of being maximally good, ordered, simple etc. However, although this idea achieves some explanatory economy and unity, it does so at the cost of clarity and operationalizability: whereas it may be relatively simple to point out the presence of continuity, closure, etc, it is more difficult to establish what exactly makes a pattern visually good, simple, unified etc.
One important issue which was not discussed much in classical literature is the origin of Gestalt principles. Why is it that the perceptual input is organized in accord with proximity, continuity, closure etc? The gestaltists tended to favor the notion that these principles are among the fundamental properties of the perceptual system, providing the basis of our ability to make sense of the sensory signals. An opposed view is that the Gestalt principles are heuristics derived from some general features of the external world, based on experience with objects (Rock, 1975): objects in the world are usually located in front of some background (figure-ground articulation), have a texture different from the background (similarity), consist of parts which are near each other (proximity), move as a whole (common fate), and have closed contours which are continuous. In sum, although these principles have been discussed for more than 80 years and are presented in most perception textbooks, there are still a number of issues about them that need to be resolved.
References
- Bregman, A. (1990). Auditory Scene Analysis: the perceptual organization of sound. Cambridge, MA: MIT Press.
- Elder, J. H. & Goldberg, R. M. (2002). Ecological statistics of Gestalt laws for the perceptual organization of contours. Journal of Vision, 2, 324-353, http://journalofvision.org/2/4/5/, doi:10.1167/2.4.5.
- Geisler, W. S., Perry, J. S., Super, B. J., & Gallogly, D. P. (2001). Edge co-occurrence in natural images predicts contour grouping performance. Vision Research, 41, 711-724.
- Gottschaldt, K. (1926). Über den Einfluss der Erfahrung auf die Wahrnehmung von Figuren: I. Psychologische Forschung, 8, 261-317.
- Han, S., Jiang, Y., Mao, L., Humphreys, G. W., & Gu, H. (2005). Attentional modulation of perceptual grouping in human visual cortex: functional MRI studies. Human Brain Mapping, 25, 424-432.
- Köhler, W. (1929). Gestalt Psychology. New York: Liveright.
- Koffka, K. (1935). Principles of Gestalt Psychology. New York: Harcourt, Brace.
- Kubovy, M. & van den Berg, M. (2008). The whole is equal to the sum of its parts: A probabilistic model of grouping by proximity and similarity in regular patterns. Psychological Review, 115, 131-154.
- Kubovy, M. & van Valkenburg, D. (2001). Auditory and visual objects. Cognition, 80, 97-126.
- Metzger, W. (2006). Laws of Seeing. Cambridge, MA: MIT Press. (Original work published in German in 1936).
- Palmer, S. (1992). Common region: a new principle of perceptual grouping. Cognitive Psychology, 24, 436-447.
- Palmer, S. & Rock, I. (1994). Rethinking perceptual organization: The role of uniform connectedness. Psychonomic Bulletin & Review, 1, 29-35.
- Palmer, S. (1999). Vision Science. Photons to Phenomenology. Cambridge, MA: MIT Press.
- Peterson, M. A., Harvey, E. H., & Weidenbacher, H. L. (1991). Shape recognition contributions to figure-ground organization: Which route counts? Journal of Experimental Psychology: Human perception & performance, 17, 1075-1089.
- Qiu, F. T. & von der Heydt, R. (2005). Figure and Ground in the Visual Cortex: V2 Combines Stereoscopic Cues with Gestalt Rules. Neuron, 47, 155-166.
- Rock, I. (1975). An Introduction to Perception. New York: Macmillan.
- Roelfsema, P. R. (2006). Cortical Algorithms for Perceptual Grouping. Annual Review of Neuroscience, 29, 203-227.
- Rubin, E. (1921). Visuell wahrgenommene Figuren. Copenhagen: Gyldendalske Boghandel (original work published in Danish in 1915).
- Sasaki, Y. (2007). Processing local signals into global patterns. Current Opinion in Neurobiology, 17, 132-139.
- Todorović, D. (2007). W. Metzger: Laws of Seeing. Gestalt Theory, 28, 176-180.
- Wertheimer, M. (1923/1938). Untersuchungen zur Lehre von der Gestalt II. Psychologische Forschung, 4, 301-350. (Excerpts translated into English as 'Laws of organization in perceptual forms' in W. D. Ellis (Ed.), A source book of Gestalt psychology. New York: Harcourt, Brace and Co., and as 'Principles of perceptual organization' in D. C. Beardslee & Michael Wertheimer (Eds.), Readings in Perception, Princeton, NJ: D. Van Nostrand Co., Inc.).
Recommended reading
- Abridged translation of Wertheimer (1923), from Ellis (1938), including many illustrations of Gestalt principles. Note how Wertheimer starts the essay with the laconic sentence 'I stand at the window and see a house, trees, sky', and proceeds to use such simple phenomenological observations to lucidly criticize the notion that our percept of a scene consists of the set of point-like sensations of local color, and in this way motivates the need for laws of perceptual organization.
External links
- Max Wertheimer page in Wikipedia
- Max Wertheimer page on geocities
- Wolfgang Metzger page in Wikipedia
- Journal Gestalt theory
See also
Figure-ground perception, Visual search, Binding by synchrony, Vision
