Perspective on Persian Internet Memes: Exploring Multimodal Meaning Construction and Viewpoint Dynamics

Document Type : Research Paper

Author

Associate Professor, Department of Linguistics, Faculty of Persian Literature and Foreign Languages, Payame Noor University, Tehran, Iran

Abstract

This paper investigates the construction of meaning in Persian Internet memes, particularly focusing on image macros—prevalent multimodal memes characterized by a fixed image paired with textual commentary. By analyzing four popular image macros that convey humorous socio-cultural messages, this study demonstrates how these bimodal (verbo-pictorial) constructions work together to generate cohesive humorous meanings and convey distinct viewpoints. The theoretical framework is grounded in Construction Grammar and Conceptual Integration Theory, two cognitive linguistics approaches that enrich the analysis. Central to this study is the concept of Discourse Viewpoint Space, which emerges in each meme and highlights the significance of perspective in meaning creation. Through qualitative examination, the research reveals that new textual elements are often woven into iterative visual materials, forming the core content of the memes. The textual components typically fulfill constructional roles, such as adverbial clauses of reason and condition, thereby shaping the inventive aspects of meaning, often presented through fictive direct speech. The humorous interpretations arise from the systematic interplay between the images and text, as well as the integration of various conceptual spaces. The phrases and images invoke frame spaces derived from shared cultural references, such as sitcoms or interviews, reflecting community familiarity. This intersubjectivity plays a crucial role in shaping meaning. Consequently, Persian image macros serve as a medium through which meme creators articulate and reconstruct their viewpoints—including ideas, emotions, and stereotypes—by harnessing diverse discursive, textual, frame metonymy spaces, and contextual information.



 

  1. Introduction

Social media provide different modes of communication for communities and individuals, and internet memes are among their most convenient, replicable and widespread forms. Indisputably, internet memes can be considered an integral part of internet discourse that deserves profound study by linguists. Dancygier and Vandelanotte (2017: 565) suggest that memes are "emerging multimodal constructions" which simultaneously rely on both text and image.

Meme theory originated in evolutionary biology. The biologist Richard Dawkins (1976) claimed that "while biological selection works at the level of individuals, genes are the replicators that enable variation, which is the precondition for evolution. Genes contain the genetic information material and at the same time serve as the transmission channel for them" (Csordás et al., 2017: 248). According to Dawkins (1976), the complexity of the human mind is enabled by the fact that genes are complemented by other information replicators in its evolution. These latter are called memes. Memes are therefore conceptualized as units of cultural transmission and replicators of the human cultural environment (Dawkins, 1976: 192).

Although these biological analyses lay the foundations of the term, they shed only weak light on the concept and function of internet memes. In their diffusion patterns and evolutionary traits, memes work like genes, while in form they are memorable units of the human mind whose evolution and diffusion take place through interpersonal communication and various human artifacts as channels (e.g. pictures, books, narratives, written and digital data storage, etc.). In general, the term meme is popularly applied to a variety of "catchy and widely propagated ideas or phenomena" on the Internet (Knobel & Lankshear, 2007: 201). In other words, "while memes themselves are ideas and therefore are invisible, meme-carriers, i.e. the “physical” tools that contribute to spreading them, are observable. Their forms and representations can be most varied: a jingle[1], a thought, a password, fashion, a procedure or an architectural style can become a meme, as well as blind faith" (Csordás et al., 2017: 249).

Shifman (2014: 41) defines internet memes as "groups of digital items or texts created and shared separately by many individuals but in awareness of one another and having common characteristics of content, form, and/or stance". They have a kind of "intertextual nature by which they take images from dominant media structures, juxtaposing and remixing them to create new layers of meaning" (Miltner, 2011).

Today, the use of memes as a socio-cultural phenomenon and communication channel is widespread in social media, and various types can be observed whose carriers are textual, visual, audial and gestural materials. According to Shifman (2014), memes may take different forms, among which the basic still-image memes are the image macro and the reaction Photoshop. In image macros, defined as "non-moving images with super-imposed text" (Dancygier and Vandelanotte, 2017: 566), both text and image play their roles in meaning construction. "In this genre, the particular background image tends to remain fairly constant within the meme; it is the text script that users continually modify" (Brideau & Berret, 2014). How verbal and visual modes are integrated to create coherent (mostly humorous or at least gratifying) effects requires linguistic investigation, since the picture and its textual complement carry the burden of meaning making.

This paper investigates four image macros as Persian internet memes to see how text is embedded in the image to create the emergent meaning intended by meme makers in different situations. This will show how such verbo-pictorial constructs are relevant to linguistics, since their analysis can explain features of meaning construction. Moreover, the paper studies how the integration of discourse spaces, expressed in texts and depicted in image macros, creates the final humor.

The paper is organized as follows. In Section 2, the related literature, including Conceptual Integration Theory, Multimodality, and Viewpoint, is briefly reviewed. Section 3 is devoted to a general overview of the model presented by Dancygier and Vandelanotte (2017), including the main claims of Construction Grammar, Discourse Viewpoint Space, and Frame Metonymy. In Section 4, meaning construction and the formation of discourse viewpoint are discussed in four subsections devoted to the four selected image macros: Why did you kill him? (4-1), Mokhtar (4-2), Who has lost? Her or me? (Habib) (4-3), and finally, No Difficulty (Dushvari) (4-4). Section 5 is devoted to the findings of the research.

 

  2. Review of Literature

Cognitive Semantics investigations focus on "the conceptual frameworks and how language use reflects them" (Saeed, 1997: 301). The cognitive linguists Fauconnier and Turner (2002) show that conceptual blending is at the root of the cognitively modern human mind. They expanded the two-domain model of metaphor into a multi-domain network, suggesting that meaning construction in more creative situations involves a more complicated interrelationship. In simpler words, they proposed that human creativity may be modeled by their theory of conceptual blending (conceptual integration). They developed a model of conceptualization in which some components of the input spaces (sources and targets) are projected into a novel emerging space, called the blend. Vital relations manage the mappings between the input spaces, and a generic space holds the general structure shared by all spaces. Background knowledge and immediate context play vital roles in the running of the iterative outputs created in this way.

In the present research, following Dancygier and Vandelanotte (2017), it is assumed that meme meaning results from the interrelationship of several spaces in the final discourse viewpoint space. Huntington (2015) considers the centrality of the intertextual nature of memes as a unique form of visual rhetoric in activist contexts and contributes to the literature on user-generated and activist rhetoric. Having adapted Blakesley’s (2004: 3-4) formula for a definition of film rhetoric, she claimed that memes present "a rhetorical situation in which interactions among an anonymous creator, the meme and meme iteration, and the viewer combine to create “the total act of making meaning”". Meme elements are thus more than the usual communication elements (sender, message, and receiver). According to Huntington (2015: 4), a meme also involves "the context of events to which the meme responds and source texts from which the meme appropriates and remixes," and is "an assemblage of these elements."

Moreover, memes are multimodal phenomena. Multimodality is viewed from different perspectives, among which are Semiotics and Social Semiotics, which study the relationship between image and text. The study of the language of images seems to have started with Roland Barthes' (1977) semiotic approach. Having studied press photographs, he mentioned that "the photograph is not simply a product or a channel, but also an object endowed with a structural autonomy" (Barthes, 1977: 15). He continued that "the structure of the photograph is not an isolated structure; it is in communication with at least one other structure, namely the text – title, caption or article – accompanying every press photograph." Social semiotics is another approach to the language of images. Kress and van Leeuwen (1996) demonstrated the utilization of modal resources beyond verbal communication within social practices. They articulated a visual design grammar that examines how elements such as color, perspective, and composition convey meanings similarly to the functional grammar found in language. Kress (2009) presents a social semiotic approach to multimodality. On Persian, Amouzadeh and Tavangar (2008) studied sociolinguistic aspects of Persian advertising in post-revolutionary Iran. Studies like Amouzadeh and Tavangar (2004) attempted to investigate verbo-pictorial multimodality for critical discourse purposes; that study was an attempt to decode pictorial metaphor and discover ideologies in Persian commercial advertising.

The study of multimodality continued with another recent linguistic approach, i.e. the cognitive approach. Forceville had a prominent role in introducing and emphasizing the role of cognitive mechanisms like conceptual metaphors conveyed by non-verbal modes such as pictures. Pourebrahim (2015) studied verbo-pictorial metaphors in some informative Persian posters based on Forceville's cognitive approach to multimodal metaphors. Pourebrahim (2016) investigated the metaphorical realization of ideology in political caricatures and studied the role of pictorial metaphors in CDA. Ansarian et al. (2022) studied two political Persian cartoons on the subject of the trigger mechanism within the framework of three theories: Conceptual Metaphor, Multimodal Metaphor and Conceptual Blending. Ghaderi Nezhad et al. (2022) investigated the role of pictorial metaphors in creating and conveying the message of citizen cartoons in Persian. Though there are some studies on the meaning-making of images, cartoons, advertisements and posters in Persian, Internet memes have not been considered from any multimodal perspective.

The Internet memes to be analyzed here are image macros, which are represented in the verbal and pictorial modes simultaneously. What is meant by “mode”? Forceville (2006: 382) believed that "what is labeled a mode is a complex of various factors". He defined it as "a sign system interpretable because of a specific perception process." Since there are five senses, "we would arrive at the following list: (1) the pictorial or visual mode; (2) the aural or sonic mode; (3) the olfactory mode; (4) the gustatory mode; and (5) the tactile mode." He then revised this "crude categorization" because of the overlaps between modes: "for instance, the sonic mode under this description lumps together spoken language, music, and non-verbal sound." He stated that the situation is more complicated, and that it is "impossible to give either a satisfactory definition of “mode,” or compile an exhaustive list of modes." However, Forceville postulated that there are different modes and that these include, at least, the following: (1) pictorial signs; (2) written signs; (3) spoken signs; (4) gestures; (5) sounds; (6) music; (7) smells; (8) tastes; and (9) touch (Forceville, 2006: 384). In image macros, the first two modes, pictorial and written signs, are presented simultaneously, though they contribute to meaning construction to different degrees; both are perceived visually. Such memes can be called verbo-pictorial memes. In this research, humorous memes will be studied.

From a linguistic perspective, humor often involves some incongruence between two images (Kövecses, 2020: 132). It is claimed that humour can be analyzed based on mental spaces theory and its extension into Conceptual Integration Theory by Fauconnier and Turner. Jabłońska-Hood (2015: 108-109), trying to consider all linguistic and nonlinguistic factors in humour creation and incorporating the principles of cognitive linguistics, presents a set of assumptions to define humour. For this study, two of them are especially important when we try to place memes in the category of humour. First, "humour is a cognitive, superordinate category with its range of subordinate members" (p. 108). Second, "humour is any instantiation of language; whether verbal, written, gesture-oriented or graphic; that provides a stimulus, in its linguistic form; to which a response follows in the form of humour appreciation" (p. 108). Furthermore, "humour appreciation can range from a smile, through to laughter, including even a nod of the head, as long as it is accompanied by comprehension of the humorous", whether such understanding is visible or purely cognitive (p. 109). "Humour exists in a particular context, and suggests the presence of a humorist and an audience", and it is "subjective, as well as context-oriented" (Jabłońska-Hood, 2015: 109).

 

  3. Theoretical Framework

Dancygier and Vandelanotte's (2017) approach has been selected to analyse the memes. This approach combines two cognitive approaches, Construction Grammar and Conceptual Integration. They believed that "Internet memes are emerging pairings of form – specifically multimodal configurations of forms – with identifiable meanings, thereby qualifying as constructions in various understandings of construction grammar" (Dancygier and Vandelanotte, 2017: 567). They emphasized that "constructional meaning can be signaled even when some of the formal features of the full construction are missing" (Dancygier and Vandelanotte, 2017: 567). Having assumed this framework, they studied image macro Internet memes and attempted to explain their meaning construction process. They asserted that internet memes are emerging multimodal constructions, and thus presented an analytic model based on Blending Theory that attempts to find interactions between their linguistic and visual forms.

According to Dancygier and Vandelanotte (2017: 568) "Internet memes cannot be studied in isolation, as individual artifacts. They participate in complex networks not only of previous examples of the same meme and new blended combinations of existing memes, but also of various nonmemetic artifacts (photos, films, iconic figures, ads, etc.). These artifacts are well-known in a given discourse community and provide instant access to rich frames whose contribution to the emerging meaning is central." They argued that the final interpretation which emerges out of this network can best be understood as being resolved at the level of a supervisory mental space which they have labeled the Discourse Viewpoint Space (Dancygier and Vandelanotte, 2016). Dancygier and Vandelanotte (2017: 568) assumed that "memes are viewpoint-driven multimodal constructions which rely on pre-existing attitudes and beliefs. Shift or manipulation of viewpoint leads to a new viewpointed construal, often an ironic one". They pointed out that the "various types of text or discourse spaces in memes involve different participants as speakers, addressees or other agents: the meme maker, the character represented, etc." These participants "profile different viewpoints". The viewpoint is "the driving force behind exchanging, commenting on and responding to memes." (Dancygier and Vandelanotte, 2017: 568)

Dancygier and Vandelanotte referred to two important interrelated aspects of meme viewpoint: "the viewpoints expressed take into account and respond to attitudes, beliefs, stereotypes, clichés and the like assumed to be shared by or at least known by the addressee … [and] memes tend to form chains of successive responses and refashioning, recycling initial combinations to refer to new developments, current events, fashions, fads, and the like" (Dancygier and Vandelanotte, 2017: 568). What their analysis highlights, and I think distinguishes it from previous approaches, is that "constructional meaning can be signaled even when some of the formal features of the full construction are missing. This is a particular constructional version of frame metonymy, by which characteristic parts of a frame are sufficient to call up whole frames" (Dancygier and Vandelanotte, 2017: 567).

To clarify their methodology, I briefly present one of their meme analyses here. They analyzed cases such as the meme in Fig. (1) in terms of a multi-level viewpoint structure and represented its levels of viewpoint in Fig. (2). In Fig. (2), the box on the left represents the utterance as it could be initially understood, i.e., as spoken by the character depicted in the meme (MC). They dubbed it Discourse Space 1.

"The addressee here could be taken to be the meme maker in the meme-making space, if we analyze the meme maker’s process; if we analyze the meme viewer’s interpretive process, the addressee is the meme viewer. This space then is embedded (the big arrow represents the process) in Discourse Space 2 by means of the “faux” reporting clause (said no one ever), producing a clash between the initially assumed claim and the said no one ever faux-reporting clause, indicating that the apparent “quote” is in fact addressed by no one to no one. In order to meaningfully resolve this incongruous clash, a unifying higher space is set up, represented by the outer box, which they have labeled the Discourse Viewpoint Space." (Dancygier and Vandelanotte, 2017: 570-1).

 

 

Fig 1- The Said no one ever Meme

(discussed in Dancygier and Vandelanotte, 2017, 570)

Fig 2- The schematic representation of its viewpoint. (Dancygier and Vandelanotte, 2017, 571)

 

The black arrows in Fig. (2) show how "the faux-direct statement (in quotation marks) is affected by being embedded in the said no one ever clause, and then re-interpreted by seeking coherence at the higher (Discourse Viewpoint Space) level" (Dancygier and Vandelanotte, 2017: 571).

The details of viewpoint formation differ from one meme to another, though the overall interaction process seems to be the same. The authors argued that "the final interpretation which emerges out of this network can best be understood as being resolved at the level of a supervisory mental space called the Discourse Viewpoint Space" (Dancygier and Vandelanotte, 2017: 568). In the following section, the Persian memes will be analyzed using this model.

 

  4. Persian Image Macros

In this paper, from the vast number of Persian Internet memes trending on social media, image macros with local socio-cultural themes and humorous effects have been selected purposefully. The selected memes carry humorous socio-cultural themes, are understood by audiences in most social groups, and comply with moral and religious considerations. The four Persian memes presented here will be analyzed from the cognitive perspective, which holds that the various existing spaces of an internet meme are merged into each other so that a new meaning, often humorous, emerges in the form of a verbo-pictorial artifact. The analysis of meaning construction focuses on the spaces and the mechanism of their blending into one message, the structure of each space, and the grammatical construct conveyed by the embedded text, the picture, or both.

4-1. "Why did you kill him?"  Meme

The Persian meme "Why did you kill him?"[2] went viral in Persian social media in recent years.[3] In this image macro, several spaces and relevant frames are activated to create a humorous effect at the end. First, there is an Image Frame of Murder evoked by the picture, a scene like those of the epic battles of Shahname[4] or similar pictures. The Image Frame of Murder contains the murderer, in red clothes, holding a sword and standing above the corpse of the victim, whose body is divided into two parts (see Fig. 3). Moreover, there may be some other people as witness(es) in the frame. In some memes, only parts of this frame (like the murderer and the murder weapon) are used (Fig. 4), or the image may be dropped altogether (Fig. 5).

 

            

Fig 3- Why did you kill him?

- He said, "Summer 5 Toomans or winter ones?"

Fig 4- Why did you kill him?

- He called anime a cartoon!

- So you are free!

Fig 5- The judge: Why did you kill him?

-He was as old as an ass, but said Aleh[5] instead of Areh.

 

This frame is mapped onto Discourse Space 1, the space in which the utterance appears. The speakers are the meme characters in the meme-making space. Here there is a faux or "fictive direct speech" of someone who asks, "Why did you kill him?" In other words, some people, among whom there may be a judge or an unknown person, ask the murderer: "Why did you kill him? Damn you!" The meme maker thus stages a faux conversation between the witness/judge and the murderer in this space. The same question is raised in Discourse Space 1 in all versions of the meme, and it calls for a logical answer. The answer is provided by the meme character, the murderer, and the reason it gives for the murder evokes another space, which can be dubbed Text Space. This space is the novel part of the meme: it distinguishes each meme from other similar ones and creates a new version. The Text Space contains the reason for the murder, which differs from meme to meme; the answer functions as an adverbial clause of reason introduced by the conjunction because, and attributes a saying, a behaviour or an action to the victim. As noted before, in some cases the picture is replaced by other similar pictures or drawings (e.g. Fig. 4), or it may be omitted (e.g. Fig. 5). However, the question remains the same in all memes: "Why did you kill him?" The participants of Discourse Space 1 are the meme character (MC) as the speaker and the meme maker (MM) as the addressee. The information taken from the Image Frame of Murder plays an important role in creating Discourse Space 1: it completes the discourse space by evoking the murder scene, in which a murderer, a victim and a murder weapon are represented. This frame is not directly presented in Fig. 5, although the repetition of this image macro allows meme viewers to retrieve it mentally. Text Space is embedded in Discourse Space 1 to add the reason for the murder.

The merging of the main clause and the adverbial clause of reason (in the form of question and answer constituents) in a humorous way creates the viewpoint of the image macro in the higher space, the Discourse Viewpoint Space. However, these two spaces are not enough to create the humorous effect. The overall humorous effect depends on the combination of all spaces plus contextual factors, as well as the attitudes, points of view and opinions of the meme maker in the corresponding context, which are represented in neither the text nor the image mode. World knowledge of murder and its ordinary motives enables meme viewers to understand the joke. In all these image macros, what causes humour is the disproportion between the mistake and the punishment. Two constituents play roles: "Why did you kill him?" as an interrogative sentence, and the "because he has…" construction, in which the reason for the killing is inserted. This reason is logically out of proportion with normal cases in real life and consequently leads to a ridiculous effect.

Fig. 6 shows the configuration of the different spaces which are blended to construct the final humorous viewpoint of the "Why did you kill him?" meme. As illustrated there, Frame Spaces 1 and 2 evoke the terminology of money (5 Toomans) and time (winter time vs. summer time). Twice a year the clocks change: one hour forward on the first day of Farvardin[6] and one hour back on the last day of Shahrivar.[7] In this meme, the terms are mistakenly combined as "Summer 5 Toomans or winter 5 Toomans?" In everyday conversation, Persian speakers may confuse summer and winter time, especially in the first days after the change (e.g. "Is it summer 5 or winter?"[8]).

Another frame is that of the Iranian currency, which is losing its value because of the intense sanctions of Western governments against Iran. The dramatic fall in the value of the Iranian Rial and Tooman provides a context in which time change and money change can be interwoven ironically, since the value of 5 Toomans[9] changes over short periods.

Knowledge of these frame spaces is essential to understanding the humour of the image macro in Fig. 3. Fig. 6 represents the integration of the spaces and their role in the final discourse viewpoint.

 

 

Fig 6- Viewpoint in "Why did you kill him?" Meme

 

4-2. Mokhtar Memes

Mokhtarnameh (lit. The Book of Mokhtar) is an Iranian epic historical television series directed by Davood Mirbagheri, based on the life of Mokhtar Thaqafi, a Shiite Muslim leader based in Kufa, who led a rebellion against the Umayyad Caliphate in 685 to avenge the martyrdom of Imam Hussain (the grandson of the Islamic Prophet Muhammad, PBUH). Different scenes of this TV series have been remixed and recycled as Internet memes and propagated in Persian social media. Two of them are shown in Figures 7 and 8. The image macro in Fig. 7 uses a scene of Mokhtarnameh in which Mokhtar, distressed and upset, is talking with his friend Kian. In this Internet meme, a blended verbo-pictorial space is constructed. The image shows Mokhtar, disappointed about worldly events and the people of Kufa, complaining to Kian, his loyal friend. The function of the image in meaning construction is to demonstrate disappointment and uneasiness about a subject matter. The image is taken from the Frame Space of Mokhtarnameh.

Discourse Space 1 is formed where there is an utterance between the two meme characters, directed toward the meme addressee. The faux conversation is added in the form of a text: seemingly direct speech, a question and an answer, attributed to the meme characters, who are called up from the Frame Space of Mokhtarnameh. The Text Space is integrated into the discourse space so closely that it is difficult to distinguish their roles in forming the final message. In other words, there is no clear-cut boundary between the discourse and text spaces. Two Frame Spaces, University Exam and Snapp Driving, are evoked in the text, and understanding the details of each frame helps meme viewers to get the overall viewpoint. The interaction of the spaces, together with background contextual knowledge, sets up a configuration of the Discourse Viewpoint Space, in which incongruous meanings are formed and resolved. The background knowledge makes the seemingly irrelevant question and answer relevant: those who fail the University Entrance Exam should be worried because they must choose low-level jobs.

There is a cause-and-effect relation between the textual material and what is illustrated in the image, especially what appears visually in the facial expression of Mokhtar, who metonymically stands for any experiencer of that emotion. It seems that two clauses play roles and integrate so that a new meaning emerges: X is upset (in the image), because something disappointing has happened (in the text).

 

 

Fig 7- Mokhtar Meme1

- What was the result of the university entrance exam, Abu Ishaq?

- What are the conditions of Snapp employment?

Fig 8- Mokhtar Meme2

- Why are you happy, Mokhtar?

- Didn’t you hear, Kian? Messi wants to leave Barsa.

 

Though the image is stable, various texts can be embedded in it, and novel and mostly ironic meanings can emerge repeatedly. Fictive direct speech is used to express the viewpoints here.

In most Mokhtar memes, the text includes a question by Kian. The question(s) attributed to Kian show, explicitly or implicitly, the cause of the problem which has upset Mokhtar. (Kian: What was the result of the university entrance exam, Abu Ishaq? Mokhtar answers: What are the conditions of Snapp employment?) This answer shows that the results are not satisfying and that he should think about another job, such as Snapp driver. The conversation evokes the dissatisfying situation of being rejected in the Entrance Exam. Many other displeasing situations can be put in this frame, depending on the immediate context, to convey meme makers' viewpoints. The viewpoint of the meme maker (his/her dissatisfaction at being rejected in the exam and trying to find another job) is reflected multimodally, both through the picture and through the textual content.

In another image macro, Fig. 8, with the same characters, Mokhtar is depicted smiling. This creates a space for expressing good news, which can be replicated again and again and filled with different political, social, cultural and sports events that are good from the perspective of the meme maker.

As Fig. 8 shows, in the conversation between the meme characters, which is about the famous soccer player Messi and his departure from the Barcelona club, Kian asks "Why are you happy, Mokhtar?" Mokhtar answers with a smile: "Didn’t you hear? Messi wants to leave Barsa." The contextual information helps the reader to understand the message of a meme maker who seems to be a fan of the Real Madrid soccer club and who believes that Messi's departure may weaken Barsa, an event which can consequently pave the way for Real Madrid's success.

Both Mokhtar memes, which are somewhat personal and convey the meme makers' personal attitudes toward something, involve one construction: X is happy/unhappy because…. This construction is composed of two clauses: the main clause, whose content is evoked by the image of Mokhtar's happy or unhappy face, and the subsequent adverbial clause, which conveys the causes of these emotional states and varies from one context to another. As represented in Fig. 9, Discourse Space 1 is composed of the speakers or meme characters/MC (Mokhtar and Kian), the addressee (meme maker/MM), and the theme of being unhappy because of a bad event (the X is unhappy construction is evoked through this image space). The image shows the psychological state of the MC. The text, which adds another space (Text Space), consists of an interrogative sentence about the result of the University Entrance Exam and another interrogative sentence in response (asking about the conditions of Snapp employment), which ironically implies disappointing results. The intriguing point, which shows space blending, is that the overall meaning of the meme is semantically more than the combination or integration of the input spaces and is accessed through contextual factors, such as the addressee's knowledge of the University Entrance Exam in Iran, the problems of uneducated young people, the conditions of Snapp driving, etc. The emergent overall meaning of the meme in Fig. 7 is that those who fail the University Entrance Exam have to choose socially low-level jobs like Snapp driving. Snapp driving metonymically evokes other low-status professions. The cause of being unhappy, as a meaning construction, is represented in the Discourse Viewpoint Space (X is unhappy because of Y). This can be replicated over and over by inserting different cause-and-effect relationships in various experienced contexts. The same cognitive integrative procedure occurs when X is happy in the Mokhtar meme in Fig. 8.

 

Fig 9- Viewpoint in Mokhtar Meme7

 

4-3. Who has lost? Her or Me? Meme

The Who has lost? Her or me? (or Habib) meme is derived from a scene in Lisanse-ha (Bachelors)[10]. In one humorous scene, which has gone viral on social media, Habib talks with his friend Mas'oud about the outcome of his marriage proposal to a girl called Laleh and the bride's sudden refusal at the very moment of the marriage contract ceremony.

Listening to the first lines of the conversation, viewers expect to hear about the misery of Laleh, who did not marry Habib. However, just the opposite occurs: Mas'oud says that she has married well and is now studying abroad. What makes the situation humorous is Habib's question, "Who has lost? Her or me?", although he is apparently the one who has lost. The opposition between what is expected to happen and what really happens creates a humorous effect in the Frame Space. This contrast is reflected in the question "Who has lost? Her or me?"

Fig. 10 is one of these memes. The image macro is composed of verbal and pictorial modes. The picture is that of Habib, taken from the comedy-drama. This image metonymically calls up the scene in the series and contributes to Discourse Space 1, in which the utterance emerges in the meme-making space. Discourse Space 1 is set up by the speaker (the meme character, Habib), the addressee (the meme maker), and the phrase "Who has lost? Her or me?"

Embedded in this space is a novel space that distinguishes the different versions of the meme and, in most cases, is in textual mode, so it can be dubbed the Text Space. In other words, a text is added that introduces a new controversial situation and ends with the same question (Who has lost?), while the first-person and non-first-person deictic elements change (for example, into us and you) depending on the participants in that situation. Information from other frames feeds the Text Space: in Fig. 10, the Frame Space includes knowledge of Twitter and its owner, whereas in Fig. 11 it consists of knowledge of social relations and of sending and receiving messages. The phrase "Who has lost? Her or me?" functions like a catchphrase and is repeated in all these memes, whether they are monomodal jokes with no picture of Habib or verbo-pictorial cases like Figures 10 and 11. The overall meaning emerges as Discourse Space 1 and the Text Space interact. To understand the complete message, the meme viewer must therefore zoom out of these spaces toward a higher-level, broader space, the Discourse Viewpoint Space, in which the final construal results from resolving the incongruent, contradictory elements of the embedded spaces.

 

            

Fig 10- Habib Meme1

- Mr. Elon Musk! If we leave Twitter, who lost? You or us?

Fig 11- Habib Meme2

- Now that you don't see my messages, who lost? You or me?

 

Most of these memes have the following textual structure: a declarative sentence that describes the situation, which may have a real or fictive addressee, followed by the mimetic interrogative sentences "Who lost? You or us?" Both constituents appear in the text. The conditional propositions created in the lower space have this construction: If X, then who lost, her or me? The final resolution of the situation, in which the speaker, not the hearer, is the one who loses, occurs in the Discourse Viewpoint Space.

Dancygier (2021), relying on Mental Spaces Theory and the apparatus it makes available for a close analysis of viewpoint networks, analyzed examples from a range of textual, visual, and multimodal discourse genres, such as literature, political campaigns, internet memes, and storefront signs. These discourse contexts use Direct Discourse Constructions but usually lack a fully profiled Deictic Ground. She claimed that in such cases the Deictic Ground is not a pre-existing conceptual structure but is set up ad hoc to construe non-standard uses of Direct Discourse; she referred to such construals as Fictive Deictic Grounds. In that context, she proposed a reconsideration of the concept of Direct Discourse to explain its tight correlation with the concept of deixis. She also argued for treating the Deictic Ground as a composite structure, which may not be fully profiled in each case while still participating in the construction of viewpoint configurations. In the Habib meme, fictive direct discourse is what enables the meme to be reproduced again and again. Fig. 12 shows the configuration of the Discourse Viewpoint Space of the Habib meme.

 

 

Fig 12-Viewpoint in "who lost? Her or me?" (Habib) Meme

 

4-4. No Difficulty Meme

"No Difficulty"[11] is a pseudonym for an old Turkish-Persian bilingual man who was interviewed for In Province[12], a program broadcast on the Tehran Province television channel. The interviewer asked him to explain the problems of the district and the demolition of houses planned by the Tehran municipality. Although the man was living in catastrophically bad conditions, he answered that he had no difficulty. When the interviewer explained that the municipality was going to demolish their houses, he became embarrassed and asked the reporter to show him the municipality. When the interviewer explained that he was not from the municipality and just wanted to know his difficulty, the old man again asked which difficulty, insisting that he had no difficulty at all. This interview (the Interview Frame Space) provides much of the information required to understand this type of meme:[13]

 

 

Fig 13- No Difficulty Meme1.

Your Honor! You raised the price of the Pride to 20 million! Who's Saipa? First show it to me!

Fig 14- No Difficulty Meme2.

Iran's former Prince: We have no difficulty in collapsing IRGC.

Fig 15- No Difficulty Meme 3.

Iran's former Minister of Roads & Urban Development in the Islamic Consultative Assembly: "No difficulty! I've already insured it!"

The video of this interview trended on Persian social media and created a space that can be called the Interview Space. The interviewee (the old man) became popular as a symbol of a seemingly courageous man who, despite his bad conditions, pretended that everything was fine and that he had no problem at all. This is an important input that feeds Discourse Space 1, a meme-making space with the meme character and the meme maker as its participants. In No Difficulty memes, the image macro may contain the photo of the old man or some part of his body (mostly his head or hand) in the pictorial mode, or the phrase "No difficulty" in the verbal (textual) mode. The construction "no difficulty in" emerges in this space. In other words, some parts of this frame can metonymically evoke the entire space. The elements of the interview Frame Space are mapped onto Discourse Space 1: the photo of the old man, his head or hand, and phrases like "No difficulty" and "Show me …". These elements metonymically call up the details of the interview. The meme maker presupposes the intersubjectivity of this frame and of the new one embedded later in Discourse Space 1.

The space embedded in Discourse Space 1 arises from the sociocultural problems discussed in society and from contextual factors. It constitutes another discourse space (Discourse Space 2) and is presented by a picture, a text, or a combination of both. Again, fictive direct speech is used to convey the content. Discourse Space 2 involves sociocultural problems that may be evoked via picture and text, though mostly through textual material. The lack of difficulties pretended in Discourse Space 1 is denied by the evidence presented in Discourse Space 2. For example, in Fig. 13, the problems of an Iranian automobile corporation are established through the ridiculous picture of one of its products, the Pride, and a faux interview with a man who has the head of the No Difficulty man and criticizes the high price of this car. The Frame Space of the Saipa Corporation helps the meme viewer to easily retrieve all the problems of this product.

However, there is a contradiction between the lack of difficulty claimed in Space 1 and the reported difficulties (of the Saipa Corporation) that emerge in Space 2. This incongruity unfolds in the Discourse Viewpoint Space. Fig. 16 shows the interrelationship between these spaces in the emergence of the final viewpoint of Fig. 13: although Saipa products have problems, people despairingly pretend to have no difficulty.

In Figures 14 and 15, the pictures play more influential roles, and Discourse Space 2 has different characters. In Fig. 14, the hand of the No Difficulty man is copied onto the portrait of Iran's former Prince, and the background is kept. The text consists of fictive direct speech attributed to him: "We have no difficulty in collapsing IRGC". The meme viewers' political knowledge about the IRGC, evoked by this frame, is embedded in Discourse Space 2. In Fig. 15, the former Minister of Roads & Urban Development is addressing the Islamic Consultative Assembly while carrying an airplane on his shoulders, saying: "No difficulty! I've already insured it!" In this figure, only the phrase "no difficulty" calls up the interview frame, while the picture evokes the frame of the problems of Iran's airline industry. The highest space, the Discourse Viewpoint Space, combines all these embedded spaces so that the meme makers' overall intended content is conveyed.

 

 

Fig 16- Viewpoint in No Difficulty Meme 13

 

5. Conclusion

In contemporary forms of communication, especially on social media and in online communities, creative combinations of image and text are prevalent. As the selected examples above illustrate, internet memes are multimodal constructions, since linguistic and visual forms interact to create the overall meaning. They are "emerging pairings of form and meaning".

The results of the analysis show that although some of the formal features of the full construction are missing, the meaning is easily conveyed, as predicted in Dancygier and Vandelanotte (2017: 267). According to Dancygier and Vandelanotte, "this is a particular constructional version of frame metonymy, by which characteristic parts of a frame are sufficient to call up whole frame."

This research explored how text is embedded in the image. The assemblage of text and image simultaneously observes the rules of Construction Grammar and the processes of blending theory. Texts and pictures commonly evoke spaces. This study showed that the verbo-pictorial constructions in Persian image macros result from the recurring integration of different evoked spaces, which yields the overall rhetorical, humorous meaning or viewpoint. In this blending process, novel phrases, sentences, or textual materials in general are superimposed on images, which in most cases are taken from television programs and have gone viral on the Internet. Adding novel spaces (mostly by inserting the meme maker's comments, by changing pictures, or both) to the preexisting ones enables these verbo-pictorial constructions to be replicated in new contexts with new meanings. Mostly, new textual materials are embedded in iterative visual materials, and the main message is thereby expressed. In most of the cases illustrated above, the texts are fictive direct speech integrated into the image macros, expressing the meme maker's viewpoints, which are assumed to be intersubjectively shared by meme makers and meme viewers.

Furthermore, in these meaning-making processes, the roles apportioned to texts and pictures differ somewhat. Although both play influential roles in forming the final viewpoint, texts seem to present the more novel parts of meaning (the meme makers' viewpoints, as noted above), while pictures carry the preexisting pieces of meaning, though this does not hold in every case. In most cases, images are constructions that metonymically call up abundant details of a preexisting frame. This affirms the idea of frame metonymy introduced by Dancygier and Vandelanotte (2017). Once memes are established as multimodal constructions, they may move toward monomodal linguistic constructions (e.g., in Fig. 5). These monomodal versions may appear in some standard contexts. In these contexts, it is assumed that the addressee's mind retrieves the pictorial content, although it is not visually accessible, because of "recurring symbolic phrases" that are metonymically linked to the whole frame and because of the intersubjectivity between speaker and addressee in the final Discourse Viewpoint Space. The discourse spaces created in the meme-making spaces compress new discourse spaces within themselves. Moreover, the texts imposed on the images mostly fill constructional slots such as adverbial clauses of reason and condition, and they create the novel part of meaning by using fictive direct discourse and deictic reference.

The final humorous meaning is profiled in the Discourse Viewpoint Space and is constructed through the molding of image macros in different but regular ways as well as the integration of several discourse, textual, and frame spaces. The textual spaces, along with the images, call up the frame spaces of sitcoms, interviews, or serials aired on TV or circulated online and assumed to be shared by the community. To these spaces, which are made visible by text and image, and to the corresponding metonymically evoked frames, we can add other information from context. All of these are evoked in the Discourse Viewpoint Space. Persian image macros, in this vein, represent and reconstruct viewpoints, ideas, affects, and stereotypes and share them with viewers.

Incongruity, which arises between the input discourse or textual spaces of these memes (mostly Discourse Spaces 1 and 2), is resolved in the highest space, the Discourse Viewpoint Space, in which the humorous effect emerges. Whether the resolution takes place depends on the background knowledge of the addressee and other contextual, nonlinguistic factors.

In general, this study may contribute to the idea of space integration and to the understanding of the role of mental (here, discourse) spaces in the construction of novel, attractive viewpoints in internet memes, and it may shed light on their refashioning and recycling capacities, which emerge in different modes. Moreover, as Dancygier and Vandelanotte (2016) argued, viewpoints in discourse are organized in networks which can "take on many different forms but are all governed or supervised by a top-level, unifying viewpoint space where the different lower-level viewpoints are reconciled and understood in final interpretation."

 

[1]. Jingle: "[countable] a short song or tune that is easy to remember and is used in advertising" e. g. I wrote a song which they’re thinking of using as a jingle. (Oxford Learner's Dictionary,  Second Definition, URL: https://www.oxfordlearnersdictionaries.com/definition/english/jingle_1)

[2]. /tʃera koʃtiʃ/?

[3]. The selected memes can all easily be found via Internet search engines. 

[4]. The Persian Book of Kings by Abo-Al Qasem Ferdowsi, between c. 977 and 1010 CE

[5]. /Aleh/ in Persian is a childish pronunciation of /Areh/, meaning Yes.

[6]. Changeover to summer time; in Iran it is called Sa'at-e Jadid (New Time).

[7]. Changeover to winter time/normal time; in Iran it is called /sa'æt-e qædim/, meaning: the old time.

[8]. /pænj-e jædid ja ghædim?/

[9]. /pænj tooman-e jædid ja ghædim?/

[10]. Lisanse-ha is an Iranian comedy-drama television series directed by Soroush Sehhat (aired on IRIB 3 from December 25, 2016, to January 27, 2018, over three seasons). In the scene, Habib is drinking tea, sitting on a chair with his leg on the table. The dialogue is as follows:

Habib: Was she Narges's friend? Laleh? Whom, for the hell of it, my family tried to make me marry? She lost me. Where did she end up? Mas'oud: Nowhere! She became miserable. Habib: Didn’t she wear the wedding dress? Mas'oud: Yes, she did. Habib: Didn’t we go to the marriage contract ceremony? Mas'oud: Yes, you did. Habib: Well, was it my fault that it was broken off? Mas'oud: No! Habib: What did she gain from that? Mas'oud: Nothing! Habib: Who’s more miserable? Her or me? Mas'oud: Her. Habib: What's she doing now? Mas'oud: She is married to a rich guy, studying abroad. Habib: Who has lost? Her or me? Mas'oud: Her, of course! Habib: See! She ruined herself, wandering in a foreign country.

[11]. /dushvari nædarim/

[12]. /dær 'ostan/

[13]. Unfortunately, the formal Persian-English translation given here cannot show the stylistic details of an utterance produced by a bilingual speaker whose first language is Turkish. Some of the humorous effect comes from the fact that the old man cannot speak Persian fluently and correctly.

The Interviewer: How many years have you been living here, sir? The old man: 34 years… 24 years. The Interviewer: And how many years have you been living with this difficulty? The old man: Difficulty? Very nice place is here! The Interviewer: So it's good? The old man: Excellent! We don’t have any difficulty here! Since the day I've been here, no one has picked up a stake from here! No one had to do with me. I didn’t have to do with others. The Interviewer: Now, aren’t your houses in the municipality's plans? Didn’t they say that they will destroy your houses? The old man: You destroy them? The Interviewer: Didn’t the municipality tell you that it is going to destroy your house? The old man: Who said? Show me them. I don’t know. First show me them. Is it you? Your honor! The Interviewer: No, sir! What would you like to tell the officials? The old man: Didn’t you say that you come from municipality? The Interviewer: I'm not from the municipality, sir! I came to ask about your difficulty. The old man: Difficulty? I have no difficulty! The Interviewer: Thank you very much!

Amouzadeh, M., & Tavangar, M. (2004). Decoding pictorial metaphors: Ideologies in Persian advertisements. International Journal of Cultural Studies 7 (2), 147-174.
Amouzadeh, M., & Tavangar, M. (2008). Sociolinguistic aspects of Persian advertising in post-revolutionary Iran. In Semati, M. (Ed.) Media, culture and society in Iran: Living with globalization and the Islamic state. pp. 130-151. London & New York: Routledge.
Ansarian, S., Davari Ardakani, N., & Bahrami, F. (2022). Beggary behind the closed door of United Nations Security Council; conceptual metaphor, metonymy and blending in political cartoons. Persian Language and Iranian Dialects 6(2), 109-129.
Barthes, R. (1977). Image, music, and text. Selected and Trans. by Stephen Heath, London: Fontana/Collins.
Blakesley, D. (2004). Defining film rhetoric: The case of Hitchcock’s Vertigo. In C. A. Hill & M. Helmers (Eds.), Defining Visual Rhetorics, pp. 111–133. Mahwah, NJ: Lawrence Erlbaum.
Brideau, K., & Berret, C. (2014). A brief introduction to Impact: “The meme font.” Journal of Visual Culture 13, 307–313.
Csordás, T., Horváth, D., Mitev, A., & Markos-Kujbus, É. (2017). 4.3 User-generated internet memes as advertising vehicles: Visual narratives as special consumer information sources and consumer tribe integrators. Commercial Communication in the Digital Age: Information or Disinformation, pp. 247-266.
Dancygier, B. (2021). Fictive deixis, direct discourse, and viewpoint networks. Frontiers in Communication 6, 1-16. https://doi.org/10.3389/fcomm.2021.624334
Dancygier, B., & Vandelanotte, L. (2016). Discourse Viewpoint as Network. In Barbara Dancygier, Lu Wei-Lun & Arie Verhagen (eds.), Viewpoint and the fabric of meaning: Form and use of viewpoint tools across languages and modalities. Cognitive Linguistics Research 55, 13–40.
Dancygier, B., & Vandelanotte, L. (2017). Internet Memes as Multimodal Constructions. Cognitive Linguistics 28 (3), 565-598.
Dawkins, R. (1976). The selfish gene. Oxford: Oxford University Press.
Fauconnier, G., & Turner, M. (2002). The way we think: Conceptual blending and the mind's hidden complexities. New York: Basic Books.
Forceville, C. (2006). Non-verbal and multimodal metaphor in a cognitivist framework: Agendas for research. In: Gitte Kristiansen, Michel Achard, René Dirven and Francisco Ruiz de Mendoza Ibáñez (eds.), Cognitive linguistics: Current applications and future perspectives. Berlin/New York: Mouton de Gruyter, pp. 379-402.
Ghaderi nezhad, B., Karbalaei Sadegh, M., & Ameri, H. (2022). Pictorial metaphor and metonymy in citizen cartoons within the cognitive approach: A case study on book and book-reading. Journal of Sociolinguistics 5(3), 81-95.
Huntington, H. E. (2015). Pepper spray cop and the American dream: Using synecdoche and metaphor to unlock internet memes’ visual political rhetoric. Communication Studies 67(1), 77-93. https://doi.org/10.1080/10510974.2015.1087414
Jabłońska-Hood, J. (2015). A conceptual blending theory of humour. Selected British Comedy Productions in Focus. Łódź Studies in Language, 36, edited by Barbara Lewandowska-Tomaszczyk & Łukasz Bogucki, Frankfurt: Peter Lang GmbH.
Knobel, M., & Lankshear, C. (2007). Online memes, affinities, and cultural production. A New Literacies Sampler 29, 199-227.
Kövecses, Z. (2020). Extended conceptual metaphor theory. Cambridge: Cambridge University Press.
Kress, G. (2009). Multimodality: A social semiotic approach to contemporary communication. London: Routledge.  
Kress, G., & van Leeuwen, T. (1996). Reading images. The grammar of visual design. London and New York: Routledge.
Miltner, K. (2011). Srsly Phenomenal: An Investigation into the Appeal of Lolcats. (Unpublished Master’s Dissertation). London School of Economics, London, UK, Retrieved from http://katemiltner.com/
Pourebrahim, S. (2015). A study of verbo-pictorial metaphors in some Persian informative posters. Journal of Researches in Linguistics 6(2), 19-35. https://dor.isc.ac/dor/20.1001.1.20086261.1393.6.11.2.2
Pourebrahim, S. (2016). Metaphorical realization of ideology in political caricatures: The role of pictorial metaphors in critical discourse analysis. Journal of Researches in Linguistics 8(1), 37-52. doi: 10.22108/jrl.2017.21260
Saeed, J. (1997). Semantics. Oxford: Blackwell.
Shifman, L. (2014). Memes in digital culture. Cambridge, MA: The MIT Press.