Depiction

Depiction is the act of representing or portraying persons, objects, actions, or events through visual images or other pictorial media, particularly within artistic or communicative contexts.^[1] This process involves creating a structured arrangement of perceptible qualities, such as colors, shapes, or lines, that evokes recognition or understanding of the subject in the viewer's mind.^[2]

In the philosophy of art and aesthetics, depiction has been analyzed through competing theories that address how such representations generate meaning and perceptual engagement. Resemblance theories, which trace back to ancient accounts of mimesis and were taken up by early modern philosophers like John Locke, maintain that depictions function by mimicking the visible properties of their subjects, such as form and color, to produce a likeness that facilitates recognition.^[3] In contrast, conventionalist accounts emphasize symbolic systems over mere similarity; for instance, Nelson Goodman's Languages of Art (1968) argues that pictorial representation operates through denotation within syntactically dense and relatively replete symbol systems, much as descriptions denote in language, so that resemblance is neither necessary nor sufficient for depiction.^[4]^[3] Experiential theories further highlight the viewer's active role, positing that depiction elicits a specific kind of "seeing-in," where one perceives depicted elements as embedded within the pictorial surface itself, as developed by Richard Wollheim in works like Painting as an Art (1987).^[3]

More recent approaches, such as the artifactual theory proposed in 2025, integrate these perspectives by defining pictures as intentional artifacts designed to provoke perceptual experiences of their subjects through spatial structures of qualities, extending to modern media like virtual reality while reconciling functional and structural explanations.^[2] These theories underscore depiction's evolution from historical techniques, including Renaissance perspective methods, to contemporary analyses of representation across visual arts and digital forms.^[3]

Fundamentals

Definition and Characteristics

Pictorial depiction refers to a form of representation in which two-dimensional artifacts express contents as pictorial spaces (three-dimensional, viewpoint-centered arrangements of objects and properties) through mappings onto a picture plane based on principles of descriptive geometry.^[5] The picture plane serves as the surface onto which projections transpose three-dimensional scenes, using methods such as linear perspective, where receding parallel lines converge toward a vanishing point, or parallel projection, where such lines remain parallel, to convey spatial relations and depth.^[5] This geometric foundation distinguishes pictorial depiction as a visual mode of reference that relies on structured projections rather than arbitrary conventions alone.

Common examples of depictions include paintings, such as Leonardo da Vinci's Mona Lisa, which employs linear perspective to represent a seated figure in a landscape; photographs, which capture scenes via optical projection; and mosaics, like those in ancient Roman villas, assembled from tiles to form coherent images of figures and environments. Borderline cases arise with ambiguous forms, where it is unclear whether the marks were intended to represent anything or whether a subject can be recognized in them; such cases raise questions about the roles of intentionality and recognizability in depiction.

Unlike linguistic or notational systems, which operate through discrete, syntactic elements like words or musical scores, depictions are non-verbal and visual, functioning within dense symbol systems where every mark contributes to the overall reference without clear boundaries between relevant and irrelevant features.^[4] A basic prerequisite for depiction is reference asymmetry: the pictorial symbol denotes its object directionally, such that the image refers to the depicted entity, but the entity does not reciprocate, highlighting depiction's one-way referential nature.^[6] While resemblance has traditionally been invoked as a criterion for depiction, it proves insufficient due to its symmetry, as entities resemble themselves and each other reciprocally without necessitating representation.^[6]
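
The two projection methods named in this section can be illustrated with a minimal sketch. It assumes an idealized pinhole model with the viewpoint at the origin and the picture plane at distance `focal` along the z-axis; the function names and the "rail" scene are hypothetical choices made for illustration, not part of any cited account.

```python
from typing import Callable, Tuple

Point3D = Tuple[float, float, float]
Point2D = Tuple[float, float]


def perspective_project(p: Point3D, focal: float = 1.0) -> Point2D:
    """Central (linear-perspective) projection onto the plane z = focal.

    Assumes the viewpoint sits at the origin and looks along +z.
    """
    x, y, z = p
    if z <= 0:
        raise ValueError("point must lie in front of the viewpoint (z > 0)")
    return (focal * x / z, focal * y / z)


def parallel_project(p: Point3D) -> Point2D:
    """Parallel (orthographic) projection: depth is simply dropped."""
    x, y, _ = p
    return (x, y)


def rail_separation(project: Callable[[Point3D], Point2D], depth: float) -> float:
    """Picture-plane distance between two parallel rails at a given depth.

    The rails run along the z-axis at x = -1 and x = +1, one unit below eye level.
    """
    left = project((-1.0, -1.0, depth))
    right = project((1.0, -1.0, depth))
    return right[0] - left[0]


if __name__ == "__main__":
    for depth in (1.0, 10.0, 100.0):
        print(
            depth,
            rail_separation(perspective_project, depth),
            rail_separation(parallel_project, depth),
        )
```

Under the perspective mapping the projected gap between the rails shrinks in proportion to 1/depth, so the two lines head toward a common vanishing point. Under the parallel mapping the gap is the same at every depth.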

Historical Development

The concept of depiction originated in ancient Greek philosophy, where mimesis (imitation) was central to discussions of art's nature and value. In Plato's Republic (circa 380 BCE), particularly Book X, he critiqued mimetic arts like painting and poetry as deceptive copies of the physical world, which itself is merely a shadow of ideal Forms, rendering art a third remove from truth and potentially harmful to the soul by encouraging illusion over rational insight.^[7] This view positioned depiction as inferior to philosophy, influencing subsequent Western thought on representation's epistemological limits. Aristotle, in his Poetics (circa 335 BCE), offered a more affirmative take, praising mimesis as a natural human instinct that evokes pleasure through recognition and catharsis, though he focused more on dramatic than visual arts.

During the early Renaissance, Italian humanists revived and transformed mimetic theory, emphasizing depiction's potential for naturalistic illusion and moral instruction. Leon Battista Alberti's Della pittura (1435) reconceived painting as a "window on the world," advocating linear perspective and proportional imitation to achieve verisimilitude, drawing on classical sources while adapting them to empirical observation and mathematical precision.^[8] This shift marked a departure from medieval symbolism toward realistic representation, as seen in works by artists like Masaccio, where depiction served to harmonize art with nature's laws.^[9] Alberti's framework influenced the period's emphasis on historia, narrative scenes rendered with lifelike detail, to engage viewers intellectually and emotionally.

In the 20th century, theories of depiction evolved through interdisciplinary lenses of psychology and semiotics, challenging pure resemblance models. E.H. Gombrich's Art and Illusion (1960) argued that depiction arises not from direct copying but from perceptual schemata and viewer expectations, shaped by cultural traditions and cognitive processes like the "beholder's share," where illusion emerges from interpretive matching rather than mechanical imitation.^[10] This work integrated Gestalt psychology and information theory, highlighting how styles evolve experimentally across art history.

Post-1960s developments further blended philosophy, art history, and cognitive science; for instance, Richard Wollheim's concept of "seeing-in" (1980) described depiction as a twofold experience, perceiving the picture surface while envisioning depicted content, and drew on psychoanalytic insights into imaginative projection.^[1] By the early 21st century, cognitive approaches, such as those exploring neural mechanisms of visual recognition, underscored depiction's roots in evolved perceptual invariants, though seminal integrations like Dominic Lopes's Understanding Pictures (1996) emphasized cultural variability in pictorial understanding.^[1]

Despite these advances, the historical narrative of depiction remains incomplete, with limited attention to pre-20th-century non-Western traditions, revealing a persistent Eurocentric bias in philosophical and art-historical scholarship. Traditions like Chinese landscape painting (shanshui), which prioritized symbolic harmony over illusionistic depth from the Tang dynasty onward, or Mesoamerican codices' narrative glyphs, offer alternative mimetic logics centered on cosmology and ritual rather than perceptual realism, yet they are often marginalized as "decorative" in Western analyses.^[11] This oversight stems from colonial-era frameworks that privileged European linear perspective as universal, underscoring the need for decolonized perspectives to fully trace depiction's global evolution.^[12]

Core Theories of Resemblance

Mimetic Approaches

Mimetic approaches to depiction conceive of pictorial representation as a form of imitation, or mimesis, in which an image copies or resembles the visible properties of its subject to evoke recognition of that subject.^[13] This theory posits that the success of a depiction depends on perceptual or structural similarity between the picture and what it represents, such that viewers identify the subject through shared features like shape, color, and proportion.^[13] At its core, mimesis draws from the ancient Greek understanding of art as a natural replication of reality, and it treats resemblance as symmetric: if picture A resembles subject B, then B resembles A to the same degree.^[14]

The concept of mimesis originates with Plato, who in works like the Republic critiqued art as an imitation of the physical world, which itself imitates ideal forms, viewing it as thrice removed from truth.^[15] Aristotle emerged as a key proponent of mimetic theory; in his Poetics he described poetry and the arts as inherently imitative, rooted in humanity's innate instinct to mimic from childhood. Aristotle argued that imitation distinguishes humans from other animals and serves educational and pleasurable purposes, as viewers delight in recognizing the likeness between the artwork and the real world.^[14] Early art theory extended this emphasis on natural likeness, viewing depiction as a faithful rendering of appearances to achieve verisimilitude, as seen in classical treatises that praised artists for their ability to mirror observable forms.

Basic conditions for resemblance in mimetic theory require that the depiction shares relevant visible properties with its subject, enabling direct perceptual matching without reliance on arbitrary symbols.^[13] However, this approach encounters the symmetry problem: while resemblance is bidirectional—a portrait resembles its sitter, and the sitter resembles the portrait—depiction is unidirectional, as the sitter does not represent the portrait.^[13] Such issues, along with challenges from fictional subjects like dragons that lack real counterparts to resemble, highlight limitations in pure mimetic accounts.^[13] Pure resemblance theories of depiction, which posit that a picture depicts an object by resembling it, encounter fundamental challenges that undermine their explanatory power. A primary limitation is the asymmetry inherent in depiction: while a picture may resemble its subject, the subject does not thereby depict the picture, yet resemblance relations are symmetric—if A resembles B, then B resembles A to the same degree. This symmetry fails to account for the unidirectional nature of representation, as noted by Max Black in his analysis of pictorial logic, where he argues that such reciprocity would absurdly imply that any tree could depict any picture of a tree without contextual distinction.^[16] Another key problem arises with abstract or fictional subjects, where pure resemblance proves insufficient. For instance, a picture of a dragon cannot resemble any real dragon, as none exist, yet it successfully depicts one; similarly, depictions of abstract concepts like justice lack concrete counterparts for resemblance. Nelson Goodman critiqued this in Languages of Art, asserting that representation involves denotation rather than imitation, and that pictures can denote fictional entities with "zero denotation" without requiring any actual similarity. 
These issues reveal that resemblance alone cannot explain the full scope of what pictures represent.^[4] To address these shortcomings, philosophers have proposed refinements that qualify resemblance with perceptual or intentional conditions. One such refinement is experienced resemblance, which emphasizes perceived likeness under specific viewing conditions rather than objective similarity; for example, a viewer experiences a portrait as resembling its subject through outline shape or contour, even if the picture distorts other features. Robert Hopkins develops this in his analysis of depiction, arguing that the relevant resemblance is one experienced in the act of viewing, thereby resolving issues with fictional subjects by tying representation to perceptual psychology rather than metaphysics. These refinements draw briefly on cognitive recognition processes to explain how viewers interpret resemblances.^[17] A complementary approach incorporates artist intent, positing that depiction occurs when a picture bears resemblances intended by the creator to capture the subject's appearance. Catharine Abell advances this in her "canny resemblance" theory, where intentionality breaks the symmetry of resemblance by directing it toward the artist's purpose—thus, a picture of a unicorn depicts it through intended visual cues, even absent a real referent. Ben Blumson further integrates intent with resemblance in his defense of mediated depiction, reducing the metaphysical role of pure similarity while preserving its epistemic function in recognition.^[18]^[19] Philosophically, these critiques and refinements establish resemblance as a necessary but insufficient condition for depiction, necessitating hybrid theories that combine mimetic elements with experiential or semiotic factors. 
Gregory Fuller revisits the theory to argue that while resemblance serves as a perceptual prerequisite, it requires contextual and intentional supplementation to function as a full account of representation. This evolution highlights the complexity of pictorial meaning, prompting ongoing debates in aesthetics about the balance between likeness and interpretation.^[16]
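
The symmetry objection discussed in this section can be set out as a short derivation. The notation is introduced here purely for illustration: $R(x, y)$ is assumed to mean "x resembles y" and $D(x, y)$ "x depicts y", with $p$ a portrait and $s$ its sitter.

```latex
\begin{align*}
&\text{(1) Symmetry of resemblance:} && \forall x\,\forall y\;\bigl(R(x,y) \rightarrow R(y,x)\bigr)\\
&\text{(2) Supposed sufficiency:}    && \forall x\,\forall y\;\bigl(R(x,y) \rightarrow D(x,y)\bigr)\\
&\text{(3) The portrait resembles the sitter:} && R(p,s)\\
&\text{(4) From (1) and (3):}        && R(s,p)\\
&\text{(5) From (2) and (4):}        && D(s,p)\\
&\text{(6) But the sitter does not depict the portrait:} && \lnot D(s,p)
\end{align*}
```

Steps (5) and (6) contradict each other, so if (1), (3), and (6) are accepted, assumption (2) has to be given up: resemblance alone cannot be sufficient for depiction. The refinements described above can be read as adding conditions to (2), such as experienced resemblance or the maker's intention, that hold in one direction only.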

Perceptual Mechanisms

Illusion-Based Explanations

Illusion-based explanations of depiction posit that pictures function by inducing perceptual illusions that simulate the experience of seeing real objects, leveraging the viewer's innate visual mechanisms to create a sense of resemblance without actual presence. Ernst Gombrich, in his seminal 1960 work Art and Illusion, argued that pictorial representation exploits psychological processes akin to optical illusions, where artists manipulate schemas, pre-existing mental templates derived from visual instincts, to evoke illusionistic effects that mimic reality. Gombrich emphasized that viewers do not passively copy nature but actively interpret images through trial-and-error corrections, much like hypothesis testing in perception, leading to a deceptive but compelling sense of depth and form.^[20]

Psychological research supports this view by highlighting how perceptual cues in pictures trigger illusionary responses similar to those in three-dimensional environments. Richard Gregory's constructivist theory describes perception as an inferential process where ambiguous cues, such as linear perspective or shading in depictions, prompt the brain to hypothesize and "fill in" missing information, often resulting in illusory depth that deceives the eye.^[21] In contrast, James J. Gibson's ecological approach identified optical information in pictures, such as texture gradients, that supports the direct pickup of invariants, yet even he acknowledged that under certain conditions such information contributes to illusionistic interpretations rather than veridical seeing.^[22] Cross-cultural studies, such as those by Jan B. Deregowski, reveal variability in susceptibility to pictorial illusions; for instance, some non-Western groups initially struggle with depth cues in two-dimensional images due to differing experiential schemas, though adaptation occurs with exposure.
Similarly, child development research indicates that children aged 3-6 interpret pictures primarily based on their visual appearance, though they increasingly consider artistic intention in ambiguous cases, underscoring the learned aspects of pictorial understanding.^[23] In art history, illusion-based depiction aligns closely with traditions of realism and naturalism in Western art, where artists aimed to replicate optical effects for deceptive verisimilitude. Trompe l'oeil techniques, originating in ancient Roman murals and peaking in Renaissance and Baroque periods, exemplify this by using foreshortening, chiaroscuro, and precise modeling to create hyper-realistic illusions that fool viewers into reaching for depicted objects, as seen in works by artists like Cornelis Gijsbrechts.^[24] This approach influenced naturalistic movements, such as 19th-century academic painting, where illusionistic resemblance served to evoke emotional and perceptual immersion akin to direct encounter.^[25] Unlike invariant-based theories that emphasize objective structural matches, illusion explanations highlight the subjective, deceptive thrill of perceptual trickery in such art forms.^[22]

Invariant Structures

In ecological psychology, James J. Gibson proposed an ecological framework emphasizing direct perception for understanding depiction, where pictures re-present invariants—stable structures in the visual information available to perceivers—rather than mere resemblances or sensory deceptions. According to Gibson, perception involves the direct pickup of invariants from the ambient optic array, which consists of light rays structured by the environment's layout, surfaces, and objects. A picture, as an "arrested optic array," is a treated surface that projects a subset of these invariants through reflected light, allowing perceivers to detect persistent informational structures without mental mediation.^[26] Central to this approach is the notion of dual invariants: one set specifying the picture's surface (e.g., its flatness, edges, and texture as a physical object) and another re-presenting the invariants of the depicted scene (e.g., the relative sizes or occluding contours of objects within it). For instance, in a photograph of a cat, the light rays convey both the paper's uniform reflectance and the gradient of sizes indicating the cat's form and posture, enabling simultaneous perception of the surface and what it depicts. This dual structure arises because pictures preserve certain optical invariants, such as size-distance gradients or horizon ratios, from the original scene's projection onto a plane.^[26]^[27] Gibson explicitly rejected illusion-based accounts of depiction, arguing that pictures do not trick the visual system into mistaking a flat surface for three-dimensional reality; instead, they provide genuine, albeit limited, informational access to the depicted world's structure. Unlike illusions, which disrupt the optic array's coherence, pictures maintain detectable invariants that specify layout without ambiguity, as seen in line drawings where occluding edges directly afford the perception of depth and occlusion. 
This direct access aligns with Gibson's broader theory, where perception is attunement to environmental information rather than inference from retinal images.^[26] The implications of this framework include a universal competence in recognizing depictions, grounded in the species-wide ability to detect invariants, which addresses earlier cross-cultural studies suggesting profound gaps in picture understanding (e.g., among non-Western groups) as inconclusive or methodologically flawed due to unfamiliar stimuli or assumptions of cultural mediation. Gibson's view posits that such recognition emerges early in development and persists across contexts, as invariants are ecologically basic and not dependent on learned conventions, contrasting with resemblance theories that struggle to explain consistent perception without subjective interpretation.^[26]^[28]
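
The horizon ratio mentioned above gives a concrete case of an invariant that survives projection onto a plane. The following is a minimal sketch under stated assumptions: a pinhole viewpoint at height `eye_height` above flat ground, a level line of sight (so the horizon projects to image height 0), and an upright object standing on the ground. All function and parameter names are hypothetical.

```python
def project_height(world_height: float, eye_height: float,
                   distance: float, focal: float = 1.0) -> float:
    """Image-plane height of a point `world_height` above the ground.

    Heights are measured relative to the horizon line (image height 0),
    assuming a level line of sight from `eye_height` above flat ground.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return focal * (world_height - eye_height) / distance


def image_extent(object_height: float, eye_height: float,
                 distance: float, focal: float = 1.0) -> float:
    """Projected size of the object, from its base to its top."""
    top = project_height(object_height, eye_height, distance, focal)
    base = project_height(0.0, eye_height, distance, focal)
    return top - base


def horizon_ratio(object_height: float, eye_height: float,
                  distance: float, focal: float = 1.0) -> float:
    """Projected object size divided by the projected base-to-horizon span."""
    if eye_height <= 0:
        raise ValueError("eye_height must be positive")
    base = project_height(0.0, eye_height, distance, focal)
    # The base lies below the horizon, so (0 - base) is positive.
    return image_extent(object_height, eye_height, distance, focal) / (0.0 - base)


if __name__ == "__main__":
    for distance in (2.0, 5.0, 50.0):
        print(distance,
              image_extent(3.0, 1.5, distance),
              horizon_ratio(3.0, 1.5, distance))
```

The projected size of the object shrinks as distance grows, but the ratio stays at `object_height / eye_height` at every distance. In this simplified model, that ratio is the kind of structure that is preserved in the picture and specifies the object's size relative to the viewpoint.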

Twofoldness and Seeing-In

The theory of twofoldness, as articulated by Richard Wollheim in his 1987 work Painting as an Art, posits that the perceptual experience of depiction entails a dual structure of awareness. This twofoldness consists of a configurational aspect, in which the viewer remains cognizant of the pictorial medium—such as the canvas, pigments, or lines—and a recognitional aspect, wherein the depicted subject is visually identified within that medium. Wollheim characterized seeing-in as a unique visual capacity distinct from ordinary seeing or illusion, enabling the viewer to experience the subject as emerging from, yet not obscuring, the surface.^[29] This framework emphasizes that pictorial perception preserves the flatness of the medium while simultaneously projecting three-dimensional content, avoiding the total immersion typical of illusions.^[30] The disposition to engage in seeing-in arises from the viewer's sensitivity to resemblances between the configured medium and potential subjects, modulated by the artist's intentional design. Wollheim argued that this disposition is not automatic but contextually triggered, requiring the marks or forms to be arrayed in a manner that invites recognitional uptake, often aligned with the creator's aim to evoke specific content. For instance, in representational artworks like Titian's Bacchus and Ariadne (1520–1523), the disposition manifests as viewers detect human figures and mythological scenes amid the layered oils, where the brushstrokes' arrangement both signifies and sustains awareness of the painted surface.^[29] Such examples illustrate how seeing-in operates in traditional oil paintings, fostering a perceptual interplay that respects the artwork's material presence.^[30] Philosophically, twofoldness and seeing-in serve as a bridge between raw perceptual experience and representational function, elucidating why depictions show their subjects in a manner that transcends mere denotation or symbolic reference. 
By integrating medium and content without conflating them, Wollheim's account explains the distinctive phenomenology of pictures, distinguishing them from linguistic or abstract signs that lack this visual duality.^[29] This theory has influenced subsequent debates on how depictions elicit direct, yet mediated, encounters with represented entities, underscoring the intentional and perceptual foundations of pictorial meaning.^[30]

Psychological Dimensions

Recognition and Experience

In Robert Hopkins' theory, pictorial experience constitutes a distinctive form of recognition whereby viewers experience a resemblance between the picture's design and the depicted object, thereby simulating the perceptual encounter with that object without engendering a complete illusion of its presence. This recognitional process relies on the viewer's awareness of the picture surface while simultaneously engaging visual mechanisms akin to those in direct object perception, allowing for an experiential grasp of the depicted content that is neither purely conceptual nor deceptive. Hopkins argues that such experience is essential to depiction, as it grounds the picture's representational success in the phenomenology of recognition rather than arbitrary conventions or causal traces. Flint Schier's recognition capacity approach posits that depictions succeed by activating viewers' pre-existing abilities to recognize objects in the world, extending these capacities to interpret novel pictorial designs without requiring learned codes or illusions. According to Schier, a picture depicts an object if it naturally elicits recognition of that object from viewers familiar with it, emphasizing the causal role of visuopsychological processes in bridging picture and referent. However, critics note limitations in this view, particularly its challenge in explaining how depictions achieve specific reference to absent or particular objects beyond generic recognitional triggering. Empirical research underscores the pivotal role of prior knowledge in pictorial recognition, with studies showing that familiarity with representational conventions enhances viewers' ability to interpret ambiguous or stylized depictions accurately. For instance, exposure to similar images or cultural artifacts improves recognition speed and precision by activating relevant schemas, facilitating the integration of pictorial cues with stored visual knowledge. 
Cross-cultural investigations reveal persistent gaps in recognition, such as lower accuracy among indigenous groups in Africa and Papua New Guinea when processing perspective-based pictures, attributed to limited prior exposure to such formats rather than innate perceptual deficits; these findings highlight how experiential factors shape recognitional success across diverse populations.^[31] This recognitional framework aligns briefly with seeing-in as a disposition toward object identification in pictures.

Imaginative Engagement

In Kendall Walton's influential theory of representation, depictions operate by prescribing imaginative activities, where the picture serves as a prop that authorizes viewers to imagine perceiving the depicted object or scene as present. This approach posits that the content of a depiction is not derived from perceptual similarity alone but from the fictional truths generated within a game of make-believe, analogous to children's play where objects like sticks become swords. By treating pictures as props, Walton explains how viewers engage imaginatively, transforming the act of seeing the picture into an imagined seeing of what it represents.^[32] The prop-oriented nature of this make-believe framework accounts for the representational power of diverse depictions, including those of fictional entities like dragons or abstract forms that lack straightforward perceptual resemblance to real-world objects. In such games, the principles of generation—rules implicit in the picture's design—dictate what is to be imagined, allowing even a simple line drawing to evoke a complex scene through authorized pretense. For example, a minimalist sketch might prescribe imagining a full landscape, extending beyond literal visual cues to encompass narrative or emotional elements. This versatility highlights how imagination bridges the gap between the prop and the represented, enabling depictions to function across artistic media.^[32] Despite its explanatory strengths, Walton's theory faces limitations in fully addressing the role of perceptual resemblance in depiction. While it emphasizes prescriptions for imagination, it does not adequately explain why certain visual features of pictures—such as shape, color, or texture—evoke recognition through direct perceptual matching rather than solely through fictional authorization. 
Critics contend that this oversight diminishes the theory's account of why props are effective, as imaginative engagement alone may not capture the involuntary perceptual pull of resemblance.^[33] Furthermore, the framework intersects with transparency debates, where it suggests viewers imagine seeing the depicted but struggles to reconcile this with claims that pictures, especially photographs, allow literal seeing-through to the subject, potentially conflating perceptual and imaginative modes.^[34]

Semiotic Frameworks

Denotation and Notation

In Nelson Goodman's seminal work Languages of Art (1968), depiction is conceptualized not as a matter of literal resemblance between image and object, but as a form of denotation within symbolic systems, where pictures function as labels or notations that refer to their subjects through conventional rules rather than perceptual similarity.^[35] Goodman argues that representation in pictorial systems operates via analogue schemes characterized by syntactic density (there are no discrete, finitely differentiated characters, and even infinitesimal differences in marks can potentially alter compliance) and semantic density, where the denoted objects can vary continuously without discrete boundaries.^[35] This contrasts sharply with systems such as verbal languages, whose symbols are syntactically articulate, that is, finitely differentiated.^[35]

Central to Goodman's theory are the properties of differentiation and articulation in notational systems. Differentiation refers to the finite division of compliant marks into distinct characters, while articulation involves syntactic rules that define how these characters combine exhaustively and disjointly to form meaningful expressions; pictorial systems, however, lack both, rendering them dense and continuous rather than discrete.^[35] Relative repleteness further distinguishes pictures, as nearly all visual features, such as shape, color, and texture, contribute to denotation, making the system relatively full in its interpretive scope compared to more selective systems like diagrams.^[35] Goodman posits that any apparent resemblance in depiction arises secondarily from the conventional use of these systems, not as a foundational mechanism.^[35]

Unlike digital notations, which permit multiple equivalent realizations (allographic arts like literature or musical scores), pictorial depictions are typically autographic, with their identity tied to the specific history of production, ensuring that forgeries undermine authenticity even if visually identical.^[35] This framework underscores depiction's role as a symbolic practice governed by syntactic and semantic structures, prioritizing referential function over mimetic illusion.^[35]

Iconicity in Pictures

In Charles Sanders Peirce's semiotic framework, signs are categorized into three fundamental types: icons, which signify through resemblance or analogy to their objects; indexes, which indicate through a direct existential connection; and symbols, which denote by virtue of convention or habit.^[36] Within depictions, particularly pictures, the iconic mode predominates, as the visual form mimics qualities of the represented object, such as shape, color, or spatial arrangement, enabling recognition based on similarity rather than arbitrary linkage.^[36] Peirce further subdivided hypoicons, his term for iconic signs, into images, which resemble through simple qualities (Firstness); diagrams, which represent through relational structures; and metaphors, which represent through a parallelism with something else. This division allows pictures to encompass a spectrum of resemblance beyond mere likeness.

However, applying iconicity to pictorial elements raises definitional challenges, particularly for abstract components like lines or contours, which may not directly resemble objects but instead function diagrammatically by mapping relations, such as proportion or direction, rather than overall appearance. Peirce himself emphasized diagrams over static images in early writings, highlighting how lines in a portrait or sketch often blend iconic resemblance with structural analogy, complicating claims of pure iconicity and prompting debates on whether such elements truly "resemble" or merely abstractly correspond.

Subsequent developments in semiotics extended Peirce's ideas to specific media like photography. Roland Barthes, in his analysis of the photographic image, posited that photographs achieve a high degree of iconicity through an uncoded denotation, where the image automatically resembles the real referent via mechanical reproduction, distinguishing it from more conventional signs.
Conversely, Umberto Eco critiqued this resemblance-based view, arguing in his theory of signs that even photographic iconicity is mediated by cultural codes and interpretive habits, rendering pure resemblance illusory and all depictions partially symbolic.^[37] These perspectives fuel ongoing debates about whether all depictions qualify as iconic; for instance, abstract or stylized representations may prioritize indexical pointing or symbolic convention over resemblance, challenging the universality of iconicity in visual media.^[38] Efforts to integrate pictorial iconicity with linguistic semiotics have treated pictures as incomplete semiotic systems, analogous to language in generating meaning but differing in their reliance on visual resemblance rather than arbitrary signifiers. Systemic-functional approaches, for example, model images as resource selections for meaning-making, yet note limitations in non-iconic depictions like diagrams, where relational iconicity (e.g., arrows indicating flow) introduces incompleteness by omitting full perceptual details, thus blending with notational elements for clarity.^[39] This hybridization underscores how pictures, while iconic at core, often incorporate linguistic-like structures to compensate for visual ambiguities.

Deictic Elements

In semiotic analysis of depictions, the concept of deictic elements draws from linguistics to examine how visual representations imply the viewer's position, spatial orientation, and narrative involvement, akin to deictic words like "here" or "now" that anchor context. Art historian Norman Bryson adapted this framework in his 1983 work Vision and Painting: The Logic of the Gaze to distinguish between two modes of visual address in pictures: the "Gaze" and the "Glance." The Gaze embodies an objective, transcendent perspective with an absent narrator, suppressing deictic markers that would reference the painter's or viewer's bodily presence, time, or process of creation to achieve a universal, atemporal view.^[40] In contrast, the Glance involves a subjective, embodied engagement with a present narrator, incorporating deictic elements such as visible brushstrokes or implied viewpoints that highlight contingency and the viewer's active role in the scene.^[41] This binary underscores medium-specific qualities of depictions, where pictures convey implied stances without relying on linguistic cues, positioning the viewer as either an external observer or an integrated participant.^[42] Bryson's model applies particularly to genres like portraiture and landscape art, revealing how deictic elements foster viewer engagement. 
In portraiture, a subject's direct gaze toward the implied viewer introduces deictic pointing, aligning with the Glance by simulating interpersonal encounter and subjective intimacy, as seen in Renaissance works where the sitter's eyes anchor the viewer's presence in the depicted space.^[43] Landscape depictions, conversely, often employ the Gaze through panoramic compositions that erase traces of human scale or temporality, offering an objective survey that distances the viewer and emphasizes eternal, universal scenery, such as in classical European vistas.^[40] These applications demonstrate how deictic structures in non-verbal depictions elicit imaginative involvement, bridging the image's narrative stance with the audience's perceptual experience without explicit words.^[42]

Despite its influence, Bryson's framework has limitations. It concentrates on Western art traditions, where the Gaze dominates, and only briefly contrasts them with Eastern practices such as Chinese ink landscapes, which more readily embrace the Glance's deictic vitality.^[42] Critics argue that the binary opposition is reductionist, potentially oversimplifying visual textures and aesthetic qualities by treating images primarily as textual signs.^[41] Because the model was developed for static paintings, it also leaves a potential gap in addressing interactive digital depictions, where user-driven navigation dynamically alters deictic positions and extends beyond the fixed narrator-viewer dynamics of traditional media.^[44]

Interpretive Practices

Iconographic Analysis

Iconographic analysis provides a systematic framework for interpreting the content and meaning of visual depictions by examining their formal, cultural, and contextual layers. Developed primarily by art historian Erwin Panofsky in the early 20th century, this method distinguishes between surface-level recognition and deeper symbolic significance, enabling scholars to decode how images convey narratives and ideas beyond their apparent forms.^[45]

Panofsky outlined three progressive levels of analysis to uncover meaning in depictions. The first level, pre-iconographic or natural, involves identifying basic forms and motifs through immediate visual perception, such as recognizing a figure holding a staff as a person with a walking aid, without cultural inference.^[45] The second level, iconographic or conventional, interprets these elements within established cultural or artistic traditions, for instance, identifying the staff as a shepherd's crook symbolizing pastoral care in Christian iconography.^[46] The third level, iconological or intrinsic, synthesizes these into the artwork's broader ideological or synthetic meaning, considering the historical, philosophical, and social context to reveal underlying worldviews, such as themes of humility in Renaissance religious art.^[45]

This method has been widely applied in art history, particularly to Renaissance depictions where symbolic density requires layered decoding. For example, in Jan van Eyck's Arnolfini Portrait (1434), the first level notes everyday objects like a mirror and chandelier; the second identifies them as marital and fertility symbols drawn from Flemish customs; and the third interprets the scene as a testament to bourgeois piety and humanism in 15th-century Burgundy.^[47] Similarly, Giotto's frescoes in the Scrovegni Chapel (c. 1305) use standardized Christian motifs, such as the Crucifixion with specific attributes like the Veronica cloth, to convey theological narratives accessible to medieval viewers, aiding in the interpretation of devotional intent.^[46] These applications highlight iconographic analysis's role in unraveling symbolic content, transforming opaque images into legible expressions of cultural values.^[46]

Over time, Panofsky's structured approach has evolved from a relatively static, author- and context-centered model to more dynamic interpretations that emphasize viewer dependency. Later scholarship, influenced by reception theory, incorporates how audiences' personal and temporal contexts shape meaning, acknowledging that interpretations vary across diverse viewers while building on Panofsky's foundational layers.^[48]

Cultural and Ideological Influences

W.J.T. Mitchell posits that depictions are profoundly tied to ideological shifts, functioning not as neutral representations but as active participants in historical and social dynamics. In his analysis, realism emerges as a quintessential bourgeois ideology, naturalizing class domination and the arbitrary mechanisms of power by presenting historical conditions as inevitable and objective.^[49] This ideological embedding is evident in colonial art, where European artists depicted colonized subjects to affirm imperial superiority; for instance, Eugène Delacroix's 1834 painting Women of Algiers in their Apartment exoticizes and sexualizes North African women, reducing them to passive objects that reinforce Western fantasies of dominance and otherness.^[50] Such portrayals, as Mitchell further elaborates, conceal the constructed nature of images, allowing them to serve as tools for ideological hegemony.^[49] Cultural variances in depiction underscore the limitations of Eurocentric frameworks, which often marginalize non-Western iconographies by imposing a singular, detached notion of the image. 
In many non-Western traditions, depictions integrate performatively with the body and ritual, as seen in African masks that embody spirits during ceremonies or East Asian calligraphy that fuses text and visual expression in dynamic, participatory ways, contrasting the Western emphasis on static, observational viewing.^[51] Critiques of Eurocentrism highlight this gap, arguing that art historical canons exclude such forms due to biases rooted in colonial legacies, thereby perpetuating an outdated hierarchy that privileges European perspectives over diverse cultural realities.^[12] For example, indigenous Nahua concepts of artistry as a "dialogue between head and heart" reveal depiction as a holistic cultural practice, challenging the Western separation of form from socio-spiritual context.^[12] On a broader scale, depictions reinforce stereotypes and entrenched power structures by embedding socio-political ideologies that shape collective perceptions and sustain inequalities. Colonial-era artworks, such as those depicting Native Americans as "noble savages" or "violent primitives," justified land appropriation and cultural erasure by framing indigenous peoples as temporally and developmentally inferior to Europeans.^[52] This reinforcement extends to gender and racial hierarchies, where images like Delacroix's Moroccan scenes perpetuate notions of Eastern chaos and subjugation, aligning with Edward Said's observation that such representations construct the "Orient" as a site of Western control and desire.^[50] By naturalizing these ideologies, depictions not only reflect but actively propagate power imbalances, influencing ongoing cultural narratives and institutional practices.

Emerging and Broader Issues

Modern Technological Depictions

Modern technological depictions encompass digital forms such as photography, computer-generated imagery (CGI), and virtual reality (VR), which extend and challenge longstanding theories of pictorial representation rooted in resemblance and perception. Digital photography, which captures light through electronic sensors, blurs the line between mechanical reproduction and artistic depiction, often achieving levels of detail that surpass traditional analog methods and prompting reevaluations of how images denote reality. CGI, used in film and animation to create synthetic visuals, enables the depiction of impossible or enhanced scenes, such as the fantastical environments produced by effects studios like Industrial Light & Magic, and thereby questions conventional notions of iconicity by prioritizing algorithmic simulation over physical resemblance. VR further complicates these theories by immersing users in interactive 3D environments, where depictions are not static but dynamically responsive, fostering a sense of presence that traditional two-dimensional pictures cannot replicate, as explored in studies of cinematic VR narratives. These technologies also introduce hyperrealism: AI-enhanced images can appear more lifelike than actual photographs, producing perceptual illusions that outstrip human-made depictions in realism judgments.
In the realm of artificial intelligence, depictions emerge from machine learning models trained on vast datasets to generate images from textual prompts. A prominent example is OpenAI's DALL·E, introduced in 2021, whose original version autoregressively models text and image tokens to produce novel visuals without domain-specific fine-tuning; the line was advanced through DALL·E 3 in 2023.^[53] Recent developments as of 2025 include tools like Google Gemini 2.5 Flash, which improve photorealism, anatomical accuracy, and prompt adherence for co-creative applications.^[54]

Such AI outputs raise philosophical questions about authorship, as the generated images result from human prompts interacting with algorithmic processes, positioning the user as a director rather than a sole creator, akin to guiding a computational brush. Resemblance in synthetic art becomes contested: AI-generated faces are often perceived as more human-like than real ones due to averaged features from training data, a phenomenon termed "AI hyperrealism" that challenges traditional depiction criteria by exploiting perceptual biases rather than direct mimicry. Empirical assessments confirm that viewers frequently prefer art generated by DALL·E 2 or mistake it for human-made work, highlighting how these depictions evoke emotional and aesthetic responses comparable to conventional art while lacking intentional human origination.

Theoretical frameworks for depiction have also been extended to interactive media. Richard Wollheim's "seeing-in" concept, which posits a twofold experience of both the picture surface and the represented content, has been applied to VR, where users "see-in" dynamic scenes amid real-time navigation, enhancing imaginative engagement beyond static viewing. Similarly, Nelson Goodman's theory of symbol systems, which characterizes pictorial representation by syntactic density and contrasts it with discrete notation, has been applied to CGI and VR: the code behind a digital render is treated as a notational scheme that specifies complex spatial relations rather than relying on mere resemblance.
Recent cross-cultural studies, including those comparing perceptions in the United States, Japan, and China as of 2024-2025, have begun addressing gaps in AI image perception, though research remains limited outside Western contexts and underexplored in relation to local visual traditions.^[55]

Ethical and Practical Concerns

The creation and interpretation of depictions are shaped by several factors. The artist's intent can embed moral or ideological perspectives into the work, thereby affecting how viewers perceive its content.^[56] Equipment limitations, whether those of traditional tools like brushes or of digital software, often force artists to adapt their techniques, potentially altering the fidelity of the representation. Viewer context also plays a critical role, as individual backgrounds and emotional states influence the ethical assessment and aesthetic appeal of depictions, sometimes leading to divergent interpretations of the same image.^[57]

Debates persist regarding the differences between mediated seeing in depictions and direct face-to-face perception, with some arguing that artistic portrayals introduce interpretive layers absent in real encounters, potentially distorting emotional authenticity.^[58] In portraits, for instance, hand-crafted depictions may evoke a sense of sustained attention and perceptual depth that mechanical reproductions lack, raising questions about the ethical implications of substituting one for the other in interpersonal contexts.^[59]

Ethical concerns in depictions prominently include misrepresentation in media, where altered or selective visuals can perpetuate stereotypes, particularly of marginalized groups, undermining social equity.^[60] The post-2020 rise of deepfakes, driven by advancements in generative AI, has amplified these issues, enabling non-consensual manipulations that erode trust in visual evidence and pose risks to privacy and public discourse.^[61] By 2023, deepfake incidents had surged tenfold from 2022 levels, with a 3,000% increase in related phishing and fraud; the trend continued, with a 244% rise in attacks reported in 2024 and the number of deepfake files increasing from 500,000 in 2023 to 8 million in 2025.^[62]^[63]^[64] Ethical analyses highlight the potential of deepfakes to spread misinformation and harm individuals through fabricated depictions, prompting new regulations such as the EU AI Act (effective 2024), the U.S. TAKE IT DOWN Act (2025), and 64 state-level laws enacted in 2025 addressing deepfakes in elections and non-consensual content.^[65]^[66]

Cultural appropriation in visual depictions further complicates ethics, as artists from dominant cultures adopting elements from marginalized ones without context or permission can trivialize sacred symbols and reinforce power imbalances.^[67] Practical debates center on the nature of realism in depictions, questioning whether photorealistic accuracy equates to truthful representation or is merely a stylistic convention that overlooks subjective human experience.^[68] A further gap in AI ethics for depictions lies in the incomplete frameworks for mitigating bias in image generation models, which often reproduce societal stereotypes because of unrepresentative training data; regulatory and moral guidelines remained underdeveloped as of 2024.^[69] This shortfall highlights the need for more robust standards to ensure equitable and transparent AI-driven visual practices.^[70]
