Conference Program

Three Days of
Arabic Linguistics

March 27–29, 2026 · Indiana University Bloomington · Radio-TV Building, Room 251

8:00–8:40 AM ☕ Registration and Coffee
8:40–8:45 AM 🎙️ Opening Remarks
Syntax Session 8:45–10:45 AM
8:45–9:15
The syntax of ditransitives in Najdi Arabic.
Reem Alshammeri
Introduction: This paper investigates the syntactic structure of ditransitive constructions in Najdi Arabic (NA), focusing on how recipients and themes are expressed across different verb classes. The central research question is whether the two surface patterns of ditransitives in NA, Double Object Constructions (DOCs) and Prepositional Dative Constructions (PDCs), are derivationally related, as claimed by Larson (1988) and Baker (1996), or reflect distinct underlying structures, as advocated by Marantz (1993), Pesetsky (1995), Harley (2002), Hallman (2015, 2018), Al-Janabi (2019), and Hallman and Al-Balushi (2022), among others. I propose that the DOC and PDC in NA are base-generated independently, with the DOC represented by (1) and (3) and the PDC by (2) and (4), and that verbs in NA fall into two lexical classes depending on which ditransitive strategy they permit. This distinction is not only structurally grounded but also semantically and morphosyntactically significant, with broader implications for typological debates on argument structure. NA exhibits two core strategies for ditransitive constructions. Verbs like ‘give’ license DOCs, where the recipient precedes the theme, as in (3) with applicative structure, while verbs like ‘put’ require PDCs involving a prepositional goal, as in (4).
DOC: (1) ʔaʕaṭ-it Faris ʔəl-ʒawal
gave.1SG Faris the-phone
‘I gave Faris the phone.’
PDC: (2) ḥaṭṭēt ʔəl-ʒawal ʕal-tʔawlah
put.1SG the-phone on-table
‘I put the phone on the table.’
[Tree diagrams: DOC structure (3); PDC structure (4)]
Analysis: I argue that DOCs and PDCs in NA are derivationally unrelated. For DOC verbs (Class 1: give, award), the underlying order is DPIO–DPDO (DP indirect object, DP direct object), and the attested surface forms (DPIO–DPDO, DPDO–DPIO, li-DPIO–DPDO, and DPDO–li-DPIO) all derive from the DPIO–DPDO base, with li- analyzed as a dative case marker.
Within Class 1, a subclass of send-type verbs (send, return, buy) expresses transfer of possession but obligatorily requires li- before the recipient, while remaining DOC-like in structure. By contrast, PDC verbs (Class 2: put, take) select a true DP–PP base, where the surface PP–DP order derives from PDC syntax and denotes location. Animacy constraints on the indirect object also play an essential role in DOCs but not in PDCs.
Both pronominalizations in (7–8) are grammatical; thus, li- in (8) is semantically vacuous (Hallman, 2024). Using anaphor-, pronoun-, and quantifier-binding diagnostics, I show that li-DPIO–DPDO is a morphological variant of the DOC, consistent with Oehrle’s (1976) Generalization and its strong transfer-of-possession interpretation.
(7) Sarah ʕaṭ-it-uh_j ʒawal-uh_j (DPIO–DPDO)
Sarah gave-3FS-3MS phone-his
‘Sarah gave him the phone.’
(8) Sarah ʕaṭ-it-l-uh_j ʒawal-uh_j (li-DPIO–DPDO)
Sarah gave-3FS-DAT-3MS phone-his
‘Sarah gave him the phone.’
The send-type verb subclass semantically denotes the transfer of possession and syntactically represents a derived DOC form. Still, it requires li- before the recipient, as shown in (9), and is ungrammatical without li-, as in (10).
(9) Sarah ʔrsal-it-l-uh ʔəl-ʒawal
Sarah send-3FS-to-him the-phone
‘Sarah sent him the phone.’
(10) *Sarah ʔrsal-it-uh ʔəl-ʒawal
Sarah sent-3FS-him the-phone
‘Sarah sent him the phone.’
Although ʔarsal ‘send’ allows only the DPDO–li-DPIO and li-DPIO–DPDO orders, the purpose-clause diagnostic shows that the recipient (Fāris) controls PRO in both cases, as in (11–12). This matches the behavior of give-type DOCs in NA, where the recipient also controls PRO.
Hence, send aligns with a DOC-like, recipient-oriented configuration and supports analyzing li- in DPIO as an argumental dative, not a PP adjunct.
(11) ʔarsalt risālah li-Fāris_i [PRO_i yigrā-hā]
sent.1SG letter DAT-Faris [PRO read.3SG.M-it]
‘I sent a letter to Faris to read.’
(12) ʔarsalt li-Fāris_i risālah [PRO_i yigrā-hā]
sent.1SG DAT-Faris letter [PRO read.3SG.M-it]
‘I sent a letter to Faris to read.’
With Class 1 verbs, all derived forms remain DOCs ((li)DPIO–DPDO ⇄ DPDO–(li)DPIO): they express transfer of possession, exhibit IO>Theme in binding and scope, and show recipient control in purpose clauses. Li- here functions as dative morphology. By contrast, Class 2 verbs are genuine PDCs (DP–PP → PP–DP) with PP-internal goals; they lack IO-type binding and PRO control and allow only theme-passivization. These encode location or path.
Conclusion: These findings have broader implications for cross-linguistic theories of argument structure. First, they support the existence of distinct, base-generated DOC and PDC constructions, challenging transformational accounts that derive one from the other. Second, they demonstrate that verb-specific lexical properties govern the syntactic realization of arguments, reinforcing a lexicalist perspective on argument licensing. Ultimately, this study enriches the empirical understanding of Arabic syntax and contributes to ongoing typological and theoretical debates on ditransitives.
9:15–9:45
The Syntax of Exceptional Exceptive Constructions in Tunisian Arabic (TA)
Mohamed Jlassi
In TA, exceptives are expressed both through the standard marker ʔilla, as in Modern Standard Arabic (MSA), and, distinctively, via the copula verb ka:n, as shown in (1) and (2).
(1) a. ʔilla ʕli: ħḍar
only Ali attend.PRF.3SM
‘Only Ali attended.’
b. ħḍar-u: kull-hum ʔilla ʕli:
attend-PRF.3P all-them except Ali
‘Everybody attended except / but Ali.’
(2) a. ka:n ʕli: ħḍar
only Ali attend.PRF.3SM
‘Only Ali attended.’
b. kull-hum ħḍar-u: ka:n ʕli
all-them attend-PRF.3P except Ali
‘Everybody attended except / but Ali.’
The ka:n construction in TA complicates von Fintel and Iatridou’s (2007) typology, which divides languages into two types: those using exclusive focus (e.g., Finnish, Spanish), and those using exceptive marker + quantifier combinations (e.g., MSA, Greek). TA exhibits traits of both. For instance, type one languages allow exceptives only in affirmative contexts, as in (2), while type two languages permit them in negative ones; TA permits both, as shown in (3).
(3) a. ka:n ʕli: ma:-ħḍar-ʃ
only Ali not-attend.PRF.3SM-not
‘Only Ali did not attend.’
b. ma: ħḍar ka:n ʕli:
NEG attend.PRF.3SM except Ali
‘Only Ali did attend.’
c. ma: ħḍar ħatta ħadd ka:n ʕli:
NEG attend.PRF.3SM no one except Ali
‘No one attended except / but Ali.’
d. ma:-ħaḍar-u-ʃ kull-hum ka:n ʕli:
NEG-attend.PRF-3PM-NEG all-them except Ali
‘Not everybody attended except / but Ali.’
Crosslinguistically, exception constructions remain debated, especially since semantic analyses dominate, often reducing exceptions to restriction readings (inclusion/exclusion) (von Fintel 1993; Hoeksema 1995; Moltmann 1995; Vostrikova 2019). The distinction between free and connected exceptives (Hoeksema 1987) and their categorial status (e.g., conjunctions, prepositions, adverbs, postpositions) remain unresolved. AlBataineh (2021), building on this, proposes a functional head analysis, treating ʔilla as the head of an Exceptive Phrase (ExP).
While Arabic studies focus on ʔilla in MSA (AlBataineh 2021; Soltan 2016; Saeed 2023), dialectal data, especially from TA, remain understudied. This paper addresses that gap by analyzing ka:n-marked exceptives in TA. I show that ka:n, despite its verbal origin and morphological invariance, is a grammaticalized form that yields two distinct readings: (i) restrictive exclusive focus (2a, 3a–b), and (ii) restrictive subtraction (2b, 3c–d). These challenge existing categorizations and align ka:n more closely with a distinct functional head. Adopting and refining AlBataineh’s ExP analysis, I argue that TA’s ka:n exceptive constructions derive both readings from a base-generated big DP structure, where ka:n establishes an internal exception relation R with its NP complement. This follows the relational architecture proposed by Hornstein et al. (1994), as adapted in Uriagereka (1995) and Belletti (2005) for doubling constructions. In restrictive focus readings, ka:n values its restriction features with the excepted NP, then sequentially values exception in Ex and exclusive focus in C, as in (4a). In the subtractive reading, domain subtraction occurs entirely within the big DP: a universal quantifier occupies [Spec, DP], and ka:n raises to Ex after valuing its features against the excepted NP, which remains in situ (4b).
(4) a. [CP C ka:n [TP Ali T ħḍar [ExP Ex ka:n [VP V [DP D [exP ex ka:n NP Ali ]]]]]]
b. [CP C [TP kullhum T ħḍaru: [ExP Ex ka:n [VP V [DP kullhum D [exP ex ka:n NP Ali ]]]]]]
This study contributes to the syntax of exception by offering a novel analysis of TA’s ka:n, situating it as a grammaticalized morphosyntactic hybrid that resists traditional categorial labels and invites broader reconsideration of exception typologies in Semitic syntax.
9:45–10:15
Anybody Can, But Him & You Cannot: A New PCC Type in Malki Arabic and Its Implications
Fahad Almalki
Crosslinguistically, a significant number of analytical studies have examined the restrictions on combinations of weak pronominal object clitics in ditransitive constructions based on their person features, captured in the Person Case Constraint (PCC; e.g., Bonet 1991, 1994; Nevins 2007; Deal 2023). One PCC type that has been influential in these studies comes from Classical Arabic, where the I(ndirect) and D(irect) O(bject) pronominal clitics must obey the person hierarchy 1>2>3, the so-called Ultrastrong PCC (Nevins 2007; Fassi-Fehri 1988). This PCC variety bans pronominal object clitics in the combinations 2>1, 3>1 and 3>2, as shown in (1). This paper presents a new PCC pattern observed in Malki Arabic (a Tihami Arabic dialect spoken in southwestern Saudi Arabia) and examines its implications for PCC theories. This pattern is unique among the attested PCC patterns in that it allows almost all combinations of weak object clitics, even the ubiquitously banned 3IO>1DO pronominal clitic cluster, but specifically bans the combination of a third person IO and a second person DO (*3>2). I call this pattern the No Him&You PCC; it is illustrated with the ditransitive verb in (2) from Malki Arabic. In contrast to the Ultrastrong PCC in Classical Arabic (and other attested PCC types across languages), the object clitic clusters 3>1 and 2>1 are allowed in this PCC pattern. However, the person combination 3>2 is banned in both PCC varieties.
(1) Classical Arabic (Nevins 2007; Fassi-Fehri 1988)
a. ʔaʕṭa-ni:-ka gave.3SBJ-1DAT-2ACC ‘He gave you to me.’ 1>2
b. *ʔaʕṭa:-ka-ni: gave.3SBJ-2DAT-1ACC ‘He gave me to you.’ *2>1
c. ʔaʕṭay-ta-ni:-hi: gave-2SBJ-1DAT-3ACC ‘You gave him to me.’ 1>3
d. *ʔaʕṭay-ta-hu:-ni: gave-2SBJ-3DAT-1ACC ‘You gave me to him.’ *3>1
e. ʔaʕṭay-tu-ka-hi: gave-1SBJ-2DAT-3ACC ‘I gave him to you.’ 2>3
f. *ʔaʕṭay-tu-hu:-ka: gave-1SBJ-3DAT-2ACC ‘I gave you to him.’ *3>2
(2) Malki Arabic
a. ʔaʕṭa:-ni:-k gave.3SG.M.SBJ-me-you ‘He gave you to me.’ 1>2
b. ʔaʕṭa:-ka-ni:(h) gave.3SG.M.SBJ-you.SG.M-me ‘He gave me to you.’ 2>1
c. ʔaʕṭay-ta-ni:-h gave-2SG.M.SBJ-me-him ‘You gave him to me.’ 1>3
d. ʔaʕṭay-ta-h-ni:(h) gave-2SG.M.SBJ-him-me ‘You gave me to him.’ 3>1
e. ʔaʕṭay-tu-ka:-h gave-1SG.SBJ-you.SG.M-him ‘I gave him to you.’ 2>3
f. *ʔaʕṭay-tu-h-a:k gave-1SG.SBJ-him-you.SG.M ‘I gave you to him.’ *3>2
In the PCC literature, the Ultrastrong PCC in Classical Arabic has been explained in the syntax via Agree (e.g., Nevins 2007; Deal 2023). Nevins offers a PCC analysis based on Multiple Agree (e.g., Hiraiwa 2001, 2005) and a binary feature system in which the third person pronoun has a person feature (contra Harley and Ritter 2002; Anagnostopoulou 2005; Béjar and Rezac 2003). In this analysis, the PCC results from Multiple Agree establishing an Agree relation between the object clitics in the DO and IO positions and a relativized probe (v) in one domain. The PCC effect in Classical Arabic arises when Multiple Agree fails because of an intervening feature that differs from the feature on the probe and the DO goal. In particular, the illicit object clitic combinations are due to intervention effects triggered by an IO goal containing a person feature whose value differs from that of the probe [+A(uthor), +P(articipant)] and the DO goal:
Probe [+A,+P] ...
(i) *IO[-A,+P] ... DO[+A,+P] *2>1 (cf. 1b)
(ii) *IO[-A,-P] ... DO[+A,+P] *3>1 (cf. 1d)
(iii) *IO[-A,-P] ... DO[-A,+P] *3>2 (cf. 1f)
(iv) IO[+A,+P] ... DO[-A,+P] 1>2 (cf. 1a)
I focus on Nevins’ analysis rather than Deal’s because the former utilizes Multiple Agree, a well-established and independently motivated Agree mechanism in the literature. The No Him&You PCC necessitates a modification of Nevins’ (2007) analysis, which cannot account for the No Him&You PCC type in Malki Arabic at face value.
The challenge posed by this PCC type is that the 3>1 combination is banned across the other PCC types but allowed in the No Him&You PCC. Therefore, this paper offers an analysis of the No Him&You PCC type in which (partially following Nevins 2007 but departing from his analysis) I assume that the person features [Hearer] and [Participant] comprise the feature system of pronouns in Malki Arabic, an expected variation across languages (Nevins 2007:288). This feature system furnishes a generalization that the No Him&You PCC effect arises only if none of the person features is licensed on the DO goal, and it does not arise if at least one person feature of the DO goal is licensed by Agree with the relativized probe v specified for [+Hearer, +Participant]. I show that this analysis captures not only the ban on the *3>2 combination but also the other, licit pronominal clitic combinations of the No Him&You PCC type. The data from Malki Arabic first contribute to the empirical domain of the PCC literature with a novel, attested PCC pattern. Malki Arabic thus serves as a case study demonstrating the importance of Arabic dialects to theoretical linguistics, since it poses challenges to current analyses of PCC effects. It also contributes to our understanding of the PCC in general and its comparative syntax, underpinning the universal and parametric aspects of the PCC. Moreover, to the best of my knowledge, PCC effects remain underexplored in the spoken Arabic varieties, and the current paper is one of the first to investigate PCC patterns in an Arabic dialect, furnishing a path for future research to explore PCC effects in other spoken Arabic varieties.
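The two descriptive patterns contrasted in this abstract can be restated compactly as predicates over IO>DO person pairs. The following is only an illustrative sketch of the surface generalizations in (1) and (2), not the Agree-based analysis the paper proposes; the function names are hypothetical, and only clusters of two distinct persons are considered, since those are the only ones the abstract reports.

```python
from itertools import permutations

# Person values: 1 = first, 2 = second, 3 = third.
PERSONS = (1, 2, 3)


def ultrastrong_ok(io: int, do: int) -> bool:
    """Ultrastrong PCC as described for Classical Arabic:
    the IO clitic must outrank the DO clitic on the hierarchy 1 > 2 > 3."""
    return io < do


def no_him_and_you_ok(io: int, do: int) -> bool:
    """No Him&You PCC as described for Malki Arabic:
    every cluster is allowed except a third person IO with a second person DO."""
    return not (io == 3 and do == 2)


def licit_clusters(rule):
    """List the IO>DO clusters of distinct persons that a rule allows."""
    return [f"{io}>{do}" for io, do in permutations(PERSONS, 2) if rule(io, do)]


if __name__ == "__main__":
    print("Ultrastrong:", licit_clusters(ultrastrong_ok))
    print("No Him&You: ", licit_clusters(no_him_and_you_ok))
```

Under these assumptions, the only cluster rejected by both predicates is 3>2, which matches the observation that this combination is banned in both varieties.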
10:15–10:45
A Narrative Function for the Historical Present in Egyptian Arabic
Michael White
What motivates the use of the historical present (HP) in narratives? The traditional view holds that the HP, a present tense verb which functions syntactically and semantically as a past tense verb (P), is used in narratives to make events more vivid to the listener. Wolfson (1979) argues against this view, claiming that many vivid narratives are told without the inclusion of the HP. Instead, she introduces data suggesting that the HP in English is used to mark the beginning of new events within a story. Schiffrin (1981) supports this view; however, she finds that only a switch from HP to P marks the occurrence of a new event. In all other cases, use of the HP appears consistent with the traditional view.
In Arabic, the HP is similarly described. Studies of Standard (Ḥasan, 1974), Kuwaiti (Tsukanova, 2008), and Negev Bedouin Arabic (Henkin, 1996, 1998) ascribe a dramatizing function to the HP. In Negev Bedouin Arabic, the HP also fulfils a second narrative function of marking the initial boundary of events (Henkin, 1996, 1998). Event structure was also studied in Classical Arabic narratives; however, Marmorstein (2016) found that the HP in Classical Arabic does not introduce a new event, but rather describes the consequence of an immediately preceding P. In contrast to these findings, Holes (2004) emphasizes that the function of HP may simply be to break the redundancy of verb forms in stories. Narratives have also been analyzed in sedentary Palestinian Arabic (Henkin, 1996) and Egyptian Arabic (Brustad, 2000); however, these studies did not investigate the narrative function of HP as it was only observed in background clauses.
For Egyptian Arabic, understanding of the HP has recently expanded due to the availability of larger Egyptian Arabic narrative corpora whose data show use of the HP in narrative foreground clauses.
As these foreground HP have yet to be analyzed, this study examines their use to determine whether they fulfill a narrative function similar to those found in other Arabic dialects or whether they simply break up the redundancy of P. To do so, the transcripts of 76 oral narratives, given by speakers residing in eight Egyptian provinces, were studied. Relying on the definitions of foregrounding established by Dry (1983) and Hopper (1979), foreground HP were identified and categorized according to whether use is consistent with previous research on event structure (van Ess-Dykema, 1984), dramatization (Ritz & Engel, 2008), consequential relation (Marmorstein, 2016), or textual segmentation (Carruthers, 2005).
Results of the study suggest that inclusion of the HP is motivated by event structure, as all 20 of the foregrounded HP in the corpus introduce a new event regardless of preceding tense. However, an analysis of the lexical aspect of these verbs reveals that dramatization may also be a factor. Of the 20 HP, 16 are either accomplishment or activity verbs, two classes which have the feature [+durative], which allows the audience to feel as if they are witnessing the unfolding of events. These data suggest that although the HP introduces new events into the narrative, it also functions to increase the drama and suspense of the climax. This study provides evidence of the systematic use of the HP in the foreground of Egyptian narratives, refuting claims that its inclusion is due to a lapse in speaker concentration. Furthermore, these findings suggest that use of the HP to mark event-initial actions is not confined to Negev Bedouin Arabic, motivating future analyses of the HP in the many dialects whose narratives have yet to be studied.
10:45–11:00 AM ☕ Coffee Break
Keynote I
Usama Soltan
Middlebury College, United States
On the Morphosyntax of Adnominal Demonstratives in Egyptian Arabic
11:00 AM – 12:00 PM 📍 Radio-TV 251
12:00–1:00 PM 🍽️ Lunch Break
Friday Poster Session 1:00–2:15 PM
📍 Global Studies Hallway
Online Misogyny in Saudi Arabia: A Linguistic Analysis of X Discourse
Arwa Alquayb
This study examines how misogynistic discourse is linguistically and culturally constructed on X (formerly Twitter) in the Saudi context. Although progress has been made in detecting misogyny on various online platforms, there is still a significant gap in addressing these issues in non-Western contexts, particularly in the Arabic-speaking world. As far as the existing literature indicates, no research has yet examined how misogynistic language appears implicitly in Saudi Arabia, where cultural norms and religious beliefs can shape this discourse in distinctive ways. Therefore, this study aims to bridge that gap by focusing on how misogynistic language against Saudi women operates on social media platforms like X.
Drawing on critical discourse analysis (CDA), the research investigates how language, through syntax, figurative devices, and cultural references, reinforces dominant gender ideologies. It addresses two primary questions: (1) What are the linguistic features that characterize misogynistic discourse against Saudi women on X, including the use of figurative language and syntactic structures that contribute to dehumanization? and (2) How do cultural norms and religious beliefs shape this discourse? A dataset of 11,944 Saudi Arabic posts was collected using a mixed-methods approach combining automated scraping and geolinguistic filtering. Posts were initially selected based on a culturally grounded lexicon of misogynistic terms and expressions. Two trained native Arabic-speaking raters assessed all tweets on a 5-point Likert scale for degrees of misogyny, and a third rater reviewed 25% of the dataset to ensure inter-rater reliability.
The analysis reveals that misogyny is expressed through a spectrum of linguistic strategies. Figurative language, such as metaphors portraying women as instinct-driven creatures, commodities, or threats, was frequently used to naturalize gendered hierarchies.
Euphemisms softened the expression of oppressive ideas, framing control and obedience as moral virtues. Syntactic patterns, such as the frequent use of copular/adjectival predicates, imperatives, and generalized plural noun phrases, reinforced discourses of control, hierarchy, and uniformity. Posts also weaponized religious discourse, such as references to women’s “deficiency in mind and faith,” to legitimize patriarchal norms. Cultural and religious ideologies were deeply embedded in the discourse, with tweets frequently referencing Islamic texts, traditional morality, or social norms to legitimize gender inequality. Unlike other languages, where dehumanization is often enacted through misgendering via neutral pronouns, Arabic’s gender-marked system leads to what this study conceptualizes as hyper-gendering: an intensified marking of gender through overt morphology and plural forms that exaggerate women’s difference and reinforce binary hierarchies. This hyper-gendering magnifies women’s perceived moral or intellectual deficiencies rather than erasing their gender identity and positions them as hyper-visible yet devalued subjects.
By employing a mixed-methods approach that incorporates both computational tools and qualitative analysis, this research provides deeper insight into the implicit forms of misogyny that automated detection methods alone have struggled to uncover. The study contributes to a growing body of work on digital misogyny by contextualizing it within Arabic linguistic and cultural frameworks and offers a more nuanced understanding of how gendered power operates in online spaces.
Between Cultures, Between Words: Politeness and Request Strategies Among Heritage Speakers of Arabic in the US
Duaa Makhoul
This study examines the request speech act strategies and levels of directness produced by Heritage Speakers (HSs) of Levantine Arabic in a heritage language context in the United States. It hypothesizes that English interactional experience will affect speakers’ use of directness in framing an impositional move. The study focuses on directness, politeness markers, and mitigation employed across different levels of imposition. Prior research on HSs of Arabic has mainly addressed morphosyntax and phonology (Benmamoun et al., 2013; Montrul, 2021). It has also investigated the speech act of request in Arabic L1 contexts (Al-Ageel 2016; Al-Masaeed 2023). However, there has been limited investigation of pragmatic competence, particularly in face-threatening acts like requests (Brown & Levinson, 1987; Blum-Kulka & Olshtain, 1984). Moreover, little is known about how Arabic pragmatic norms are maintained or adapted among HSs in bilingual settings. This study addresses this gap by investigating how HSs formulate requests in Arabic, how direct their strategies are, and how politeness and mitigation devices are employed in different social conditions.
The study uses online oral role-play scenarios with six pairs of female participants who were born in the US or arrived before the age of 5. The role-play scenarios vary in both imposition and social distance. The participants’ conversational turns are analyzed for both directness and politeness. Directness is evaluated using the coding scheme of the Cross-Cultural Speech Act Realization Project (CCSARP), which categorizes speech acts into three main groups: (1) direct, (2) conventionally indirect, and (3) nonconventionally indirect strategies. Politeness is analyzed using the frameworks of Brown and Levinson’s (1987) politeness theory and Culpeper (2011). The analysis also measures the use of mitigation devices such as hedges, apologies, politeness markers, and appeals to solidarity.
Results reveal a strong preference for Conventional Indirect strategies, primarily Query Preparatory forms, i.e., referencing willingness, ability, or possibility (e.g., momken taʿṭīnī al-kitāb? “Can you hand me the book?”), affirming Arabic norms of politeness even in L2 contexts. Also, the HSs use indirect request strategies more in contexts of high imposition and social distance, reflecting both Arabic cultural norms of deference and bilingual pragmatic adjustment. The results also show that politeness and mitigation markers (law samaḥt, min faḍlak) and softeners (apologies, justifications, and excuses) appear twice as often in high-imposition scenarios as in lower-imposition ones, serving to mitigate face-threatening acts.
These findings highlight how Arabic politeness conventions persist yet evolve in heritage contexts, supporting the view that pragmatic competence is a key site of cultural continuity. By emphasizing the pragmatic aspects of heritage language use and Arabic politeness within bilingual pragmatics, this study broadens the scope of Arabic linguistics to include heritage speaker pragmatics, offering new insights into how Arabic communicative norms are negotiated, preserved, and transformed across generations. It also helps expand the understanding of Arabic heritage pragmatics and highlights its implications for pedagogy, particularly in fostering sociocultural awareness and pragmatic competence.
Acoustic Correlates to Laryngeal Articulation in Pharyngeal Consonants in Modern Standard Arabic
Benjamin Lang
Pharyngeal consonants /ħ ʕ/ are traditionally described in the Handbook of the IPA (International Phonetic Association, 1999) as having a constriction in the back of the throat between the tongue body and the pharyngeal wall. Previous studies identify vowel formant changes, such as raising of F1 and lowering of F2 on adjacent vowels and through pharyngeal segments, as primary acoustic correlates to this pharyngeal articulation (Khattab et al., 2018; Heselwood, 2007). In contrast, the Laryngeal Articulator Model (LAM) (Esling et al., 2019; Esling, 2005) proposes that pharyngeal consonants are realized as a raising and narrowing of the epilaryngeal tube, and are thus articulated within the larynx, with co-occurring glottal states. In the present study, we examine recordings of L1 Arabic speakers producing /h ʔ ħ ʕ w j/ in Modern Standard Arabic to assess the contribution of laryngeal articulation during the production of pharyngeal consonants.
If pharyngeal consonants arise from laryngeal articulations rather than tongue root retraction, then acoustic correlates such as Residual H1* (correlated with glottal spreading/constriction) and SoE (correlated with voicing intensity), which are typically used to assess varying states of the glottis (Garellek, 2019), are predicted to additionally characterize pharyngeal articulation. According to the LAM, /ħ/ is expected to be voiceless and spread glottis, predicting lower SoE (weak to no voicing) and higher Residual H1* (spread glottis), while /ʕ/ is expected to be voiced and modal, predicting higher SoE (stronger voicing) and lower Residual H1* (constricted glottis). For both /ħ ʕ/, F1 is expected to be higher relative to /h ʔ/ due to laryngeal raising.
Thirty-one speakers of L1 Arabic (Mean Age: 24, Range: 19–46; Gender: 21 women, 9 men, 1 non-binary person) were recorded in a sound booth producing 128 target words with target consonants /h ʔ ħ ʕ w j/ in word-initial, word-medial, and word-final positions in Modern Standard Arabic.
Target words were embedded in one of two nearly identical carrier phrases, e.g., Phrase A قال أمل واحد وخمسين مرة [qɑl ʔamal wæħɪd wa xamsin mara] “He said ‘hope’ fifty-one times” (16 speakers) or Phrase B قُل أمل واحد وخمسين مرة [qʊl ʔamal wæħɪd wa xamsin mara] “Say (m.) ‘hope’ fifty-one times” (15 speakers). The carrier phrase changed midway through the experiment due to the tendency of speakers to insert a vowel following [qɑl]. Target consonants were segmented by hand using Praat (Boersma and Weenink, 2022). Acoustic measurements for F1, H1*, SoE, and Energy were extracted from each millisecond of target consonants using VoiceSauce (Shue et al., 2011). Residual H1*, the remaining amplitude of f0 when factoring out overall energy, was additionally calculated to differentiate phonation types (Chai and Garellek, 2024). Linear mixed-effects regression models were fit predicting average, speaker-normalized Residual H1*, SoE, and F1 with segment as a fixed effect and speaker and target word as random effects using lme4 in R (Bates et al., 2015).
For /ħ/, higher F1 and lower SoE vary as predicted by the LAM, indicating laryngeal raising and voicelessness, but Residual H1* is significantly lower relative to other consonants, indicating a degree of constriction opposite the predicted glottal spreading. For /ʕ/, higher SoE, lower Residual H1*, and higher F1 all vary as predicted by the LAM, indicating laryngeal raising, voicing, and slightly constricted glottis relative to modal /j w/.
Overall, more than tongue root retraction characterizes production of these sounds, providing evidence for the connected states of articulation between pharyngeal and laryngeal sounds proposed in the LAM.
Following these results, it is worth considering how correlates to laryngeal articulation may contribute to any pharyngeal articulation in Arabic, given extensive work highlighting traditional tongue root retraction for pharyngealized consonants (Kulikov et al., 2021; Hermes et al., 2015; Israel et al., 2012; Al-Tamimi et al., 2009; Khattab et al., 2006; Asher and Laufer, 1988), and the phonetic and phonological descriptions relating pharyngeal articulation to the broader guttural or “emphatic” consonant class (McCarthy, 1994). Further analyses will investigate the temporal extent and changes in acoustic correlates to laryngeal articulation across pharyngeal consonants and adjacent segments.
Spatial Prepositions in Najdi Arabic
Hammad Alshammari
Recent treatment of prepositional phrases (PPs) in the literature has shifted from a single-projection model to a dual-projection model that reflects their syntactic and semantic properties. However, the dual-projection model still requires refinement regarding overlaps among prepositions, the ambiguity between possessive and locative constructions, and the interaction between locative prepositions and adverbial nouns ("dˤuruf"). Therefore, this study investigates and analyzes the internal structure of spatial PPs in Najdi Arabic to refine current cartographic analyses of the PP domain and fill gaps in the syntax of PPs. Building on proposals that decompose the PP into multiple functional layers (Koopman 2000; Svenonius 2006; Den Dikken 2010; Cinque 2010), the study examines how Najdi spatial prepositions encode distinctions between Place and Path, how certain prepositions alternate between locative and possessive interpretations, and how adverbial spatial nouns behave in relation to prepositions. These empirical findings raise important theoretical questions about: a. whether prepositions are lexical or functional in nature, b. where arguments are introduced in the structure, and c. how the architecture of extended PPs should be represented.
Najdi Arabic provides a rich testing ground for these theoretical issues because many spatial prepositions in the language are homophonous with possessive markers, undergo systematic polysemy, and exhibit alternations tied directly to the syntactic environment. For example, the preposition /ʕala/ ‘on’ can function as a stative locative P, a scalar adjective, or a verbal preposition expressing a change of state, as illustrated in (1), as proposed by Fassi Fehri and Alrawi (2023-2024). Similarly, /fi:/ ‘in’ alternates between spatial and abstract-state interpretations, as in (2).
(1) a. ʔal-miftaħ ʕala ʔatˤawlah
DEF-key on DEF-table
“The key is on the table”
b. ʔifari:st ʔaʕla min tˤwaq
Everest higher from tuwaiq
“Everest is higher than tuwaiq”
c. ʔaħmad ʕala-ah ʔal-ħizan
Ahmed overwhelmed-3rd.S.M DEF-sadness
“Ahmed was overwhelmed with sadness”
(2) a. ʔana: fii ʔal-ʒa:miʕah
I in DEF-university
“I’m at the university” (spatial)
b. xa:lid fii muʃkilah
Khalid in trouble
“Khalid is in trouble” (a change of state)
Regarding /ʕala/, which has been argued to show categorial flexibility and to be possibly derived from a locative root, I argue that this analysis is problematic. There is no morphological or semantic evidence that /ʕala/ originates from a root with a [+loc] feature. Instead, I assume that /ʕala/ functions as a fixed functional head whose interpretation shifts according to its syntactic position rather than through root-based flexibility. By contrast, I argue that the preposition /fi:/ ‘in’ in Najdi Arabic alternates between spatial and abstract-state meanings, captured through a layered PP structure. In (2a), /fi:/ merges as the head of Place⁰, which expresses physical containment, as in (3a). In (2b), on the other hand, /fi:/ moves to or lexicalizes a higher functional head, State⁰, which denotes non-spatial containment, as in (3b). The syntactic layering could account for the alternation, suggesting that the different interpretations are determined by the structural position at which /fi:/ is merged.
(3) a. [PlaceP DP [Place⁰ fi: [KP DP]]]
b. [StateP DP [State⁰ fi: [PlaceP [KP DP]]]]
(4) *ʔal-wa:lid wagaf bi-gada:m ʔal-bayt
DEF-boy stand at-front DEF-house
“The boy is standing in front of the house”
(5) maʃi:t min gadam ʔal-bayt
walked.1st.S from front DEF-house
“I walked from in front of the house”
A further empirical puzzle concerns the co-occurrence of locative prepositions and adverbial nouns such as /gadam/ "front" and /wara/ "back". Cross-linguistically, the combination of these two elements is permitted (e.g., in front of), forming a PlaceP + AxPartP structure.
In Najdi, however, this sequence is rendered ungrammatical, asshown in (4). At the same time, the co-occurrence of Path prepositions and adverbial nouns remains acceptable, as in(5). The unexpected behavior of adverbial nouns with Place prepositions, given cross-linguistic evidence, suggeststhat these nouns may have undergone partial grammaticalization as independent locative heads, carrying a [locative]feature that eliminates the need for an overt Place head, contrary to assumptions in existing cartographic models.Evidence from argument structure, P-compatibility with stative vs. motion predicates, and structuralalternations supports a hierarchical decomposition of the PP into PathP, PlaceP, AxPartP, KP, and PossP. Eachprojection contributes a distinct semantic function and is rigidly ordered.The outcomes of this research provide a new fine-grained account of spatial PPs in Najdi, clarify theinterplay between location and possession, and motivate a revision of how axial nouns and location heads interactwithin PP structure.
Emphatic Imperatives in Iraqi Arabic
Mahmood Al Fkaiki
The syntax and semantics of the imperative have received a great deal of attention in the linguistic literature (Zanuttini 1997, Benmamoun 2000, Aloni and Ciardelli 2011, Potsdam 1998 and 2007, Haddad 2020, Rupp 2003, Portner 2016, Zanuttini, Pak and Portner 2012, Portner 2007, Kaufmann 2012). Zanuttini (1997) proposes that imperatives are derived via V-to-C movement. In Arabic, Benmamoun (2000) adopts Lasnik’s (1981) analysis to represent the syntax of imperatives, and he assumes that a functional projection in imperative sentences is headed by an imperative feature. Crosslinguistically, languages tend to use the imperative in either a regular or an emphatic syntactic structure, and emphatic imperatives are often achieved through specific linguistic features such as intonation, word choice, or repetition. For example, English imperatives appear either in a normal form (Sit down!) or in an emphatic form with the auxiliary "do" (Do sit down!), which conveys a strong sense of urgency or insistence (the speaker strongly desires that the addressee perform the action, or shows anger, emotion, or urgency).
In Arabic varieties such as Iraqi Arabic (IA), affirmative imperatives appear in two different structures. The first is a normal imperative construction, as in (1a). The second, in (1b), appears with a prefix /də/ that has previously gone unnoticed. In this paper, I argue that the prefix /də/ attached to imperatives is an emphatic marker that requests immediate action and is the lexical realization of an “imperative” feature, as shown in (1b).
(1) a. iktb wa:dʒb-k (normal imperative)
write homework-your
“Do your homework!”
b. d-iktb wa:dʒb-k (emphatic imperative)
Em-write homework-your
“Do your homework!”
I propose that the emphatic prefix /də/ heads a functional projection that I call Emphatic Phrase (EmP). The imperative form is generated in the head of VP and moves in multiple steps, in accordance with the Head Movement Constraint (Travis 1984), until it finally merges with the head of EmP, as shown in (2).
This marker also occurs with conjoined imperatives in different syntactic structures. When imperatives are linked by the connector /w/, the emphatic marker may appear on the first imperative only, as illustrated in (3a); the emphatic prefix is illicit when it appears on each imperative, as in (3b).
(3) a. də-kul w ʃrab!
Em-eat and drink
“Do eat and drink!”
b. ??də-kul w də-ʃrab!
Em-eat and Em-drink
This description suggests the following: (i) a single emphatic marker allows one imperative to be conjoined with another, as in (3a), where IA uses a single emphatic operator across two imperatives linked by the connector “w”, whereas IA does not permit two emphatic operators on imperatives linked by a connector, as in (3b); (ii) with a single emphatic operator, the conjoined imperatives form a single compound clause; (iii) the single operator in (3a) applies to both actions together; (iv) the emphatic marker takes wide scope over both actions in (3a). Further details are given in the full paper.
Phonological Transfer and Markedness Effects in Saudi Arabic Speakers' English Obstruent Production
Amirah Alruwais
It is well-documented that L2 learners often transfer phonological patterns from their first language (L1) to their L2. For instance, German speakers, whose native language lacks an obstruent voicing contrast in final positions, typically transfer this rule to their L2 (English), leading to the devoicing of all voiced codas in English (Dmitrieva, 2014). By contrast, Arabic speakers, who have a voicing contrast in final positions, should theoretically maintain this in English. However, there is also evidence suggesting that markedness plays a significant role in L2 acquisition. The markedness hypothesis proposes that less common structures (such as voiced obstruents in coda position) are harder to acquire than more common ones (such as voiceless obstruents in coda position) (Eckman, 1977). Broselow et al. (1998), for example, observed that Mandarin speakers learning English often devoice final voiced obstruents, even though their L1 lacks this rule. This finding suggests that markedness complicates L2 acquisition beyond what is predicted by simple L1 transfer.
In the current study, we test this hypothesis in Arabic, which provides an interesting case for two reasons. First, although Arabic has a voicing contrast in coda position, sporadic information from previous studies suggests that Arabic speakers do occasionally devoice obstruents in coda position (Olsen, 2007, p. 21). Second, Arabic lacks the phoneme /p/, and speakers typically realize this sound as [b], regardless of syllable position (Flege & Port, 1981).
We analyzed a corpus of pre-existing data from the Speech Accent Archive, focusing on six Saudi Arabic speakers learning English. The target English sentences provided by the Archive include 10 obstruents in initial positions and 11 obstruents in final positions. Of these, 7 are voiceless sounds, and 6 are voiced sounds, resulting in a total of 21 measurements for each participant. Participants were selected based on their age range (20-38 years) and their beginner level of English proficiency, ensuring a representative sample of Saudi Arabic dialect. By inspecting waveforms and spectrograms for evidence of voicing, we coded each target consonant as correct or incorrect. Target /p/ was excluded from the main analysis to prevent confounding effects, as Arabic lacks this phoneme.
For initial position, results indicate accuracy rates of 94.3% for voiced and 98.7% for voiceless obstruents. Meanwhile, for final position, results indicate accuracy rates of 31.4% for voiced and 95.8% for voiceless obstruents. This strongly suggests the influence of markedness on their L2 phonological adaptation. Additionally, /p/ was realized as /b/ 72.2% of the time in initial position and 83.3% of the time in final position, reinforcing the role of L1 transfer. Thus, while L1 transfer clearly manifests itself, it crucially interacts with the effects of universal markedness.
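The accuracy rates reported above are percent-correct figures per position and voicing category. A minimal sketch of that tabulation follows, assuming each coded token is stored as a (position, voicing, correct) triple. The function name and the toy tokens are hypothetical and are not the study's data.

```python
from collections import defaultdict

def accuracy_by_condition(tokens):
    """Return percent-correct for each (position, voicing) cell."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [n_correct, n_total]
    for position, voicing, correct in tokens:
        cell = counts[(position, voicing)]
        cell[0] += int(correct)
        cell[1] += 1
    return {cell: 100.0 * c / n for cell, (c, n) in counts.items()}

# Hypothetical coded tokens: (position, target voicing, produced correctly?)
toy_tokens = [
    ("final", "voiced", False),
    ("final", "voiced", False),
    ("final", "voiced", True),
    ("final", "voiceless", True),
    ("initial", "voiced", True),
    ("initial", "voiceless", True),
]

rates = accuracy_by_condition(toy_tokens)
for cell, pct in sorted(rates.items()):
    print(cell, round(pct, 1))
```

Keeping position and voicing as separate keys makes it straightforward to compare the final-voiced cell, where markedness effects are expected, against the other three.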
A Semantic-Syntactic Typology of Medieval Arabic Proverbial Structures
Hedi Majdoub
Despite a rich philological tradition focused on medieval Arabic amṯāl, a comprehensive linguistic typology of their structures is notably absent. While Western paremiology has established structural classifications (e.g., Gómez-Jordana Ferary, 2012), the study of Arabic proverbs has remained largely confined to thematic or moral categorizations. Consequently, no systematic, linguistically-grounded methodology has distinguished the proverb proper (a genuine sententious statement) from the many other phraseological units grouped under the traditional label of maṯal. This paper directly addresses this theoretical and methodological gap.
To bridge this gap, this study implements a robust analytical framework grounded in Maurice Gross’s (1986) Lexicon-Grammar. This approach provides the necessary tools to test and confirm the crucial property of fixedness (figement), a defining characteristic of proverbs. By analyzing distributional properties and applying transformational tests (e.g., passivization, nominalization), we can empirically demonstrate that proverbial structures systematically block manipulations that comparable free sequences would otherwise permit. This method allows us to operationalize the three core definitional criteria used to isolate our corpus: (1) genericity, the expression of a timeless, universal truth (Anscombre, 2005); (2) syntactic autonomy, the status of the proverb as a self-contained micro-text (Schapira, 1999); and (3) fixedness, confirmed via the Lexicon-Grammar diagnostics.
Applying this methodology to eight major medieval collections (from al-Mufaḍḍal al-Ḍabbī to al-Maydānī), this research validates the central hypothesis: authentic Arabic proverbs are not infinitely variable but are organized around a finite set of twelve recurrent structural matrices. This typology, ranked by frequency, empirically confirms for Arabic the universalist claim that a "relatively stable number of proverbial structures" exists across languages (Gómez-Jordana Ferary, 2012). The most prominent matrices include:
1. The Averbal Predicative:
o al-ḫayru ʿādatun wa-š-šarru laǧāǧatun
o ‘Good is a habit and evil is obstinacy.’
2. The Antecedentless Relative (with man):
o man yaʾkul bi-yadayni yanfad.
o ‘He who eats with both hands gets exhausted.’
3. The Conditional:
o iḏā lam yanfaʿ-ka l-bāzī fa-ntif rīša-hu
o ‘If the falcon is of no use to you, then pluck its feathers.’
4. Quantifier kullu:
o kullu ḏāti ḏaylin taḫtālu
o ‘Every tail-bearer struts’
This study makes a threefold contribution. It (a) delivers the first semantic-syntactic typology of medieval Arabic "proverbs proper," grounded in explicit, replicable selection diagnostics; (b) provides a formal explanation for how these twelve structural molds produce stable, sententious interpretive effects; and (c) offers robust templates for future corpus annotation and computational detection, laying the groundwork for large-scale comparative paremiology.
Bilingual and Diglossic Code-Switching in Egyptian, SA, and English: A Comparative MEG Study
Zainab Hermes
Recent work in bilingualism and executive control (Blanco-Elorrieta and Caramazza, 2020) suggests that a single language system is in use for bilinguals, governed by mechanisms such as word frequency and communicative context between languages. Within these mechanisms is an assumption of a shared language space between each of a speaker’s languages, in which all words compete with each other depending on linguistic context and communicative need. In order to evaluate the interaction between languages in a shared space, previous studies have focused on code-switching between languages such as Emirati Arabic and English (Blanco-Elorrieta and Pylkkänen, 2017) or Korean and English (Phillips and Pylkkänen, 2021). However, Arabic exists on a continuum between standard and dialectal varieties, leading to the phenomenon known as diglossia (Ferguson, 1959), in which two registers of the same language exist side by side, each associated with specific functions and exhibiting substantial structural and lexical differences.
Arabic maintains a highly codified formal register, Modern Standard Arabic (MSA), and a more familiar spoken register, Colloquial Arabic (CA). Speakers of Arabic are exposed to both varieties daily in different contexts and to achieve different functions. This provides an opportunity to further characterize the mechanism of bilingual language control, as bilingual speakers of Arabic and English need to manage switching not only between languages (bilingual code-switching) but also between varieties of one language (diglossic code-switching) which share significant representational space.
Using magnetoencephalography (MEG), we examine brain activity elicited by switching by comparing sentence-internal switches between MSA, Egyptian Arabic (a variety of CA), and American English. This study describes work in progress (n=4). Participants listened to 164 pairs of sentences which contained a single-word code-switch and 164 sentences with no code-switch. Switch materials included utterances in which the matrix language was either MSA or Egyptian Arabic, while frequency-matched code-switched words were either MSA, Egyptian Arabic, or English. Computational models were constructed to analyze the data (Brodbeck et al. 2023), with models whose potential predictors of brain activity were (a) the presence of a switch of any kind (diglossic or bilingual) and (b) the distinction between a diglossic and a bilingual switch. A time window of up to one second after the switch was analyzed for its relationship to the predictors, and the predictive power of each model to explain the recorded brain data was assessed.
Preliminary results show that model (b), which modeled diglossic code-switching, did not outperform model (a), which modeled all types of code-switching, bilingual and diglossic. This is consistent with contemporary models of bilingualism that posit a shared language space with activity at the switch modulated by mechanisms such as frequency and communicative context. Further analyses will explicitly examine the contribution of these mechanisms at the moment of switching. Our results have important implications for second language pedagogy, suggesting the current practice of integrating MSA and CA in the same classroom is comparable cognitively to integrating two distinct languages (Arabic and English, for example).
Computational Computational Linguistics Session 2:30–4:30 PM
2:30–3:00
Computational Modeling of Semantic Drift in Qur'anic Arabic and its English Translations: A Hybrid Embedding Approach
Haq Nawaz
This paper explores the linguistic and computational challenges of modeling semantic drift between Qur’anic Arabic and its English translations through hybrid retrieval architectures. While the Qur’an has been translated into English by numerous scholars, linguistic variation among translations often reflects subtle theological, contextual, and cultural nuances. Computational systems trained on modern Arabic corpora typically fail to preserve these nuances, resulting in misalignment between the Arabic source and its translated interpretations. The present research aims to bridge this gap by integrating information retrieval and neural embedding models to measure and visualize semantic divergence across translations.
The dataset comprises over 6,000 verse–translation pairs drawn from multiple English translators, including Pickthall, Sahih International, Wahiduddin, and Daryabadi. Each verse was preprocessed and embedded using the E5-base-v2 model, which encodes semantic intent rather than surface similarity. BM25 was concurrently applied to index the corpus lexically, producing a comparative retrieval layer. Queries such as “rules about breastfeeding,” “adultery punishment,” and “interest/usury” were used to evaluate the interaction between lexical recall and embedding-based contextual relevance.
Empirical analysis revealed that lexical models like BM25 outperform embeddings in queries with explicit keywords (e.g., “interest” → “riba”), whereas embedding models excel in capturing semantically broad or paraphrased queries (e.g., “punishment for adultery”). However, embeddings sometimes generate conceptually related but doctrinally irrelevant verses, demonstrating the risk of overgeneralization. This linguistic drift highlights how neural models, though contextually rich, may dilute theological precision essential for Qur’anic interpretation. To mitigate this, a hybrid reranking pipeline using CrossEncoder (MiniLM-L-6-v2) was introduced, yielding improved semantic grounding.
From a computational linguistics perspective, this study demonstrates that embedding-based Qur’anic retrieval benefits from linguistic constraints inspired by tafsīr and Arabic semantic fields. The hybrid approach balances contextual abstraction with doctrinal specificity, offering a model that is both linguistically interpretable and computationally scalable. This work contributes to Arabic computational linguistics by establishing a reproducible methodology for cross-lingual semantic stability analysis in sacred texts, thus aligning quantitative modeling with interpretive sensitivity.
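To illustrate how a lexical ranking and an embedding-based ranking can be combined before reranking, here is a minimal, dependency-free sketch. It implements the common Okapi BM25 formula directly and merges two rankings with reciprocal rank fusion. The fusion method, the toy documents (invented glosses, not quotations from any translation), and the placeholder dense scores are all assumptions made for illustration. The abstract does not specify how its pipeline merges candidates, and the sketch does not call E5 or a cross-encoder.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each whitespace-tokenized document for the query."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n
    df = Counter(term for d in tokenized for term in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        s = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            s += idf * tf[term] * (k1 + 1) / norm
        scores.append(s)
    return scores

def rank(scores):
    """Document indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])

def reciprocal_rank_fusion(rankings, k=60):
    """Merge several rankings; k=60 is a conventional default, assumed here."""
    fused = Counter()
    for ranking in rankings:
        for position, doc in enumerate(ranking, start=1):
            fused[doc] += 1.0 / (k + position)
    return [doc for doc, _ in fused.most_common()]

# Invented toy documents, for illustration only.
docs = [
    "those who consume riba will not prosper",
    "mothers may nurse their children for two full years",
    "establish prayer and give charity",
]
lexical = bm25_scores("riba", docs)
dense = [0.81, 0.40, 0.35]  # placeholder similarity scores, not model output
fused = reciprocal_rank_fusion([rank(lexical), rank(dense)])
print(fused)
```

In a fuller pipeline, the fused candidate list would be the input to a reranking step such as the cross-encoder the abstract mentions.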
3:00–3:30
Maskuk: Leveraging LLMs and the Web in Building an Arabic Collocation Dictionary
Khaled Elghamry, Attia Youseif, Muhammad Abdo & Saad Yousef
Collocations are essential for understanding natural, idiomatic language use and for improving language learning and NLP performance (Semiyeva, 2025; Yucedal and Kara, 2023; Sabanashvili and Garibashvili, 2022; Shin 2007; Bui, 2021; Hua et al., 2021; Nesselhauf, 2005; Nation et al., 2001; Schmitt, 2000). Existing Arabic resources remain limited in coverage, domain diversity, and methodological transparency (e.g., Albalwi, 2023; Abu Ghazalah, 2007, 2014; Abd Al-Salam, 2004). This study presents a semi-automated, data-driven approach that combines the power of Large Language Models (LLMs) and the scale of the Web as a corpus with the linguistic intuition of native Arabic speakers to identify high-quality Modern Standard Arabic (MSA) collocations.
First, Gemini and ChatGPT are used to generate an initial list of semantic domains and subdomains, which are then refined by native speakers, resulting in seven thematic domains (Politics, Environment, Society and Culture, Economy, Arts, Science and Technology, and Sports) and 21 subdomains. For each domain, LLMs produce seed nouns, verbs, adjectives, and phrases, which are reviewed for naturalness and semantic coherence to ensure structural diversity (subject–verb, verb–object, idafa, noun–adjective).
For each domain in the finalized list, LLMs are utilized again to produce a preliminary set of seed nouns, verbs, and adjectives, as well as phrases that are likely to be representative within that domain. Items in this list undergo a second round of evaluation by native speakers, who assess them for naturalness, semantic coherence, and relevance. To guarantee that collocations of different structural types are represented, human evaluators make sure that the initial seeds include subject-verb, verb-object, idafa (noun-noun), and noun-adjective collocation candidates.
The vetted terms are then used as search queries across 206 Arabic websites and international Arabic-language outlets (e.g., BBC, CNN, Euronews), building a large, regionally representative corpus. New collocation candidates are extracted using Pointwise Mutual Information (PMI) and related measures (normalized PMI, LLR, Dice, Jaccard).
The output of applying these measures is a list of sequences of two or more Arabic words in the search results, sorted by their frequency and collocatability scores. Finally, the 1,000 most frequent candidate collocations in the resulting list are validated by native speakers to remove noise and non-collocational items.
This hybrid methodology, which combines LLM-assisted domain structuring, large-scale web corpus collection, rigorous statistical extraction, and iterative human validation, produces a high-quality, domain-diverse Arabic collocation dictionary suitable for language learning, lexicography, and natural language processing applications. This dictionary contains 2,261 unique collocations, each with a real-world example showing how it is used, in addition to its English translation.
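As an illustration of the association measures named above, the following minimal sketch scores adjacent word pairs with PMI, normalized PMI, and the Dice coefficient. It is a sketch under stated assumptions: probabilities are plain relative frequencies, only adjacent bigrams are considered, and the toy token list is hypothetical. The abstract does not describe the study's own estimation details, and LLR and Jaccard are omitted for brevity.

```python
import math
from collections import Counter

def association_scores(tokens):
    """PMI, normalized PMI, and Dice for every adjacent word pair."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = len(tokens) - 1
    scores = {}
    for (w1, w2), c12 in bigrams.items():
        p12 = c12 / n_bi
        p1 = unigrams[w1] / n_uni
        p2 = unigrams[w2] / n_uni
        pmi = math.log2(p12 / (p1 * p2))
        # NPMI rescales PMI by -log2 p(w1, w2); guard the degenerate p12 == 1 case.
        npmi = pmi / -math.log2(p12) if p12 < 1 else 1.0
        dice = 2 * c12 / (unigrams[w1] + unigrams[w2])
        scores[(w1, w2)] = {"freq": c12, "pmi": pmi, "npmi": npmi, "dice": dice}
    return scores

# Hypothetical toy corpus (English placeholders standing in for Arabic tokens).
tokens = ["heavy", "rain", "heavy", "rain", "light", "wind"]
scores = association_scores(tokens)

# Sort candidates by frequency, then by PMI, highest first.
ranked = sorted(scores, key=lambda pair: (-scores[pair]["freq"], -scores[pair]["pmi"]))
print(ranked[0], scores[ranked[0]])
```

Sorting by frequency first and association score second mirrors the description of a candidate list ordered by frequency and collocatability before human validation.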
3:30–4:00
Natural Language Inference: Lost in Translation? Label Stability in Arabic Machine Translation
Muhammad S. Abdo
Natural Language Inference (NLI), formerly known as recognizing textual entailment, refers to the problem of determining the logical relationship between a premise and a hypothesis. These relationships are typically categorized into three types: Entailment (the hypothesis logically follows from the premise), Contradiction (the hypothesis directly conflicts with the premise), and Neutral (the premise provides information that neither supports nor rules out the hypothesis), as shown in Table 1 below (MacCartney, 2009; Camburu et al., 2018). Although NLI has been widely used to benchmark large language models’ reasoning abilities (Madaan et al., 2024), little is known about how inference relations behave when NLI datasets are machine-translated into syntactically and morphologically rich languages such as Arabic. While prior work has examined the roles of named entity recognition (Al Deen et al., 2023), monotonicity (Hu et al., 2020), and label variation in monolingual NLI datasets (Jiang et al., 2023), the cross-lingual stability of inference labels remains largely untested. In this paper, we investigate whether inference relations are preserved once English NLI pairs are translated into Arabic using OpenAI’s gpt-4o-mini API.
Table 1. Example Sentence Pairs with their NLI relations
Premise | Hypothesis | Relation
The weather condition increases the suffering of displaced Palestinians inside the tents. | The weather makes the lives of displaced Palestinians more difficult. | Entailment
Al-Ahram newspaper is the most widespread newspaper in Egypt. | Akhbar El Yom is the most popular Egyptian newspaper. | Contradiction
The Sudanese army announces the regaining of control over one of the areas. | What is happening in Sudan is an absolute disaster. | Neutral
To this end, a native Arabic linguist manually annotated an evaluation dataset of 1,000 English–Arabic sentence pairs, which were drawn from widely used NLI resources, including ANLI (Nie et al., 2020) and MultiNLI (Williams et al., 2018), among others. Preliminary experiments conducted on a subset of the data revealed substantial translation-induced variation. In comparison to the manually annotated English gold labels, Arabic annotations achieved an accuracy of 63.3%. Notably, Contradiction was the least stable relation, with 42% of Contradiction cases being labeled as Neutral. Entailment exhibited greater resilience, with a recall of 61%, and Neutral achieved the highest recall (77%). Further qualitative investigation indicates that many mismatches arise not from random noise or translation artifacts, but rather from systematic linguistic transformations introduced by Arabic morphosyntax and lexical choices. To better understand these mechanisms, we conduct a targeted analysis on subsets of pairs involving conditionality, intensionality, modality, and comparative structures, all of which are known to introduce cross-lingual semantic ambiguity and affect entailment directionality.
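The label-stability figures above are an overall accuracy and per-relation recall computed against the English gold labels. A minimal sketch of those two measures follows, assuming gold and Arabic labels are stored as parallel lists. The function name and the toy labels are hypothetical and do not reproduce the study's data.

```python
def accuracy_and_recall(gold, predicted):
    """Overall accuracy plus recall for each gold label."""
    matches = sum(g == p for g, p in zip(gold, predicted))
    accuracy = matches / len(gold)
    recall = {}
    for label in sorted(set(gold)):
        indices = [i for i, g in enumerate(gold) if g == label]
        kept = sum(predicted[i] == label for i in indices)
        recall[label] = kept / len(indices)
    return accuracy, recall

# Hypothetical labels: English gold vs. labels assigned to the Arabic translations.
gold = ["Entailment", "Entailment", "Contradiction", "Contradiction", "Neutral", "Neutral"]
arabic = ["Entailment", "Neutral", "Neutral", "Contradiction", "Neutral", "Neutral"]

accuracy, recall = accuracy_and_recall(gold, arabic)
print(round(accuracy, 3), recall)
```

Recall is the relevant per-class measure here because the question is how many pairs with a given gold relation keep that relation after translation.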
4:00–4:30
When Syntax and Semantics Diverge: Semantic Agreement in Arabic and its NLP Implications [Cancelled Talk]
Khaled Elghamry
Semantic agreement arises when an agreement target aligns with the conceptual or referential features of its controller, such as its meaning or referent, rather than with its morphosyntactic form (Corbett, 2023). This interplay between syntax and semantics, where agreement is driven by meaning rather than grammatical structure, is illustrated by the examples below. In the examples in (A), target verbs show both types of agreement with their named-entity controllers, whereas in the examples in (B) they show only semantic agreement with their official-title controllers. Despite its significance for NLP tasks, semantic agreement in Arabic remains largely underexplored from a computational perspective. This presentation pursues five main objectives: (1) to investigate instances of semantic agreement in Arabic, with particular attention to named entities (NEs) and official or occupational titles, arguing for an extended version of logical metonymy to explain these agreement patterns (Zarcone, 2014); (2) to demonstrate how understanding these agreement patterns is important for Arabic syntactic parsing, anaphora and coreference resolution, and machine translation, among other downstream applications; (3) to review existing computational approaches for modeling and handling semantic agreement (Kuryanov et al. 2024, García et al. 2025); (4) to evaluate the suitability and effectiveness of these approaches for Arabic; and (5) to introduce a curated corpus of Arabic texts annotated for semantic agreement phenomena, supporting empirical analysis and the development of NLP models for this challenging aspect of Arabic grammar.
A. 
NEs examples:ةناسرتلازافعكلامزلا11ةرملوطھخیرات)23-30-2024Twitter(ʔal-tarˈsaːna faːz ʕa ʔaz-zaˈmaːlek ʔiħdaː ʕaʃara marra tuːl taːˈriːxihAl-Tersana(F.SG) won(3SG.M) against Al-Zamalek eleven times throughout his-historyAl-Tersana won against Al-Zamalek eleven times throughout its history.ةنس27ىفوىئاھنسأكرصمھناسرتلازتافع...ىلھلأا)03-09-2024Twitter(sanah ʕaʃraːn wa sabʕa wa fiː nihaːji kaʔs maṣr ʔal-tarˈsaːna faːzat ʕa ʔal-ʔahliːyear(F.SG) 1927 and-in final cup Egypt Al-Tersana(F.SG) won(3SG.F) against Al-AhlyIn 1927, in the Egypt Cup final, Al-Tersana won against Al-Ahly.ةدعاقلانلعتاھتیلوؤسمنعموجھىلعةدعاق.ةیركسع)DW 2010(al-qɑːʕida tuʕlin masʔuːliyyatahɑ ʕan hujum ʕala qɑːʕida ʕaskariyya muriːtaːniyyaAl-Qaeda(F.SG) announces(3SG.F) its-responsibility(F.SG) for attack on base military‘Al-Qaeda announces its responsibility for an attack on a military base.’ةدعاقلانلعیھتیلوؤسمنعموجھلاىلعركسعم).شیجللMandabpress 2017(al-qɑːʕida yuʕlin masʔuːliyyatuh ʕan al-hujum ʕala muʕaskar lil-jayšAl-Qaeda(F.SG) announces(3SG.M) its-responsibility(M.SG) for the-attack on camp for-the-army‘Al-Qaeda announces its responsibility for the attack on an army camp.’"لیفلازرقلأا"ردصتیكابشامنیسلا.ةیرصملا)Alkhaleejonline 2015(al-fiːl al-ʔazraq yattaṣaddar shubbak al-sinema al-miṣriyyaThe-elephant(M.SG) the-blue tops(3SG.M) box-office the-cinema the-Egyptian‘“The Blue Elephant” tops the Egyptian cinema box office.’"لیفلازرقلأا"ردصتتةمئاقبتكلارثكلأااعیبميف2013.)Akhbarelyom 2013(al-fiːl al-ʔazraq tattaṣaddar qāʔimat al-kutub al-ʔakthar mabīʕan fi 2013The-elephant(M.SG) the-blue tops(3SG.F) list the-books the-most sold in 2013‘“The Blue Elephant” tops the list of best-selling books in 2013.’B. 
Official Titles examples with female referentsبئانظفاحمطایمدثحبتعمریفستوك...اروفید)Youm7 2025(nāʔib muḥāfiẓ dumyāṭ tabḥaθ maʕ safīr kuːt dīvwār al-taʕāwun..Deputy(M.SG) Governor Dumyat discusses(3SG.F) with Ambassador Côted’Ivoire‘The Deputy Governor of Dumyat discusses with the Ambassador of Côte d’Ivoire...’ریوزةفاقثلادھشتةیلافتحا.اربولأا)2018Almasryalyoum(wazīr al-thaqāfa tashhad iḥtifāliyya al-ūbrāMinister(M.SG) of-culture witnesses(3SG.F) celebration the-opera‘The Minister of Culture attends the opera celebration.’
4:30–4:45 PM ☕ Coffee Break
Keynote II
Muhammad Abdul-Mageed
The University of British Columbia, Canada
Measuring Arabic Competence in Modern Models: Variation, Cultural Meaning, and Contextual Appropriateness Across Varieties
4:45–5:45 PM
8:00–9:00 AM ☕ Registration and Coffee
Phonetics Phonetics / Phonology Session 9:00–10:30 AM
9:00–9:30
Prevoicing is perceptually redundant: evidence from one Arabic and two Iranian languages
Nawal Bahrani
Introduction. This is a study of how prevoicing and aspiration affect the perception of /d-t/ in Khuzestani Arabic (KhA), Hawrami (Haw), and Balochi (Bal). Depending on the VOT categories used, two-way laryngeal systems are divided into three types: aspiration (short lag vs long lag), true voicing (prevoicing vs short lag), and over-specified (prevoicing vs long lag). We examined two over-specified languages, Khuzestani Arabic (Bahrani & Kulikov, 2023) and Hawrami (Kulikov & Bahrani, in press), and one true voicing language, Balochi (Kulikov & Bahrani, in press). We show that prevoicing plays no role in this perceptual distinction. Although speakers of these languages consistently produce phonologically voiced stops with prevoicing, the results reported here show that this feature is redundant perceptually, as listeners instead rely only on the duration of aspiration. To learn more about the perception of prevoicing in these languages, especially in over-specified languages, the following questions were answered:
1) Do listeners rely on prevoicing to classify a stimulus as /d/?
2) Where is the perceptual /d/-/t/ VOT boundary?
3) What is the effect of increasing the intensity of VOTs on listeners’ perception?
4) What are the possible theoretical implications of our findings?
Methods. Fifteen native speakers per language, between 20-45 years old, participated in this experiment in-person in Iran. Stimuli were made by manipulating natural speech in Praat. VOT intensity was left unchanged or multiplied by 4. Value 4 was selected after observing no difference between the original intensity and doubling it in a pilot perception study. An identification task was run on PCIbex using a Visual Analogue Scale task instead of two-alternative forced choice, as it can better capture the gradient effects of prevoicing in responses. Participants were presented with a scale on the screen with the ‘d’ word on the right end and the ‘t’ word on the left end and heard tokens along the /d/-/t/ VOT continuum (Figure 1). They were told to move the circle in the middle closer to the left end or the right end depending on how similar each stimulus sounds to either word. Data were analyzed with linear mixed-effects regression models with main effects of VOT, intensity, and their interaction. The response data were empirical-logit transformed.
Conclusion. In our test languages the perceptual boundary was greater than zero in both intensity conditions, although the increase in VOT intensity moved the boundary closer to zero (Figure 2 for KhA). Furthermore, the percentage of /d/ responses to negative VOTs was almost 100%, with no gradiency in listeners’ responses. The redundancy of prevoicing in perception supports the abstract underlying representation of voicing contrast. In addition, it poses an issue for Laryngeal Realism, which uses the word-initial VOT categories as the underlying representation of voicing contrast: ‘does the failure of prevoicing to influence perception in all three languages undermine representing /d/ as [voice] phonologically?’. Using prevoicing in production can be a way of marking sociophonetic differences among different dialects/languages (e.g. Herd, 2020).
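The empirical-logit transformation mentioned in the Methods can be sketched as follows. This is a minimal illustration of the standard formula log((y + 0.5) / (n - y + 0.5)), assuming the slider position is recorded on a 0-100 scale with 100 at the ‘d’ end. The scale range, its orientation, and the function name are assumptions, since the abstract does not state how responses were coded.

```python
import math

def empirical_logit(y, n):
    """Empirical logit of y out of n; the 0.5 offsets keep it finite at 0 and n."""
    return math.log((y + 0.5) / (n - y + 0.5))

# Hypothetical slider positions on an assumed 0-100 scale (100 = 'd' end).
for slider in (0, 25, 50, 75, 100):
    print(slider, round(empirical_logit(slider, 100), 3))
```

The transformed value is zero at the midpoint of the scale and symmetric around it, which suits a linear model with VOT and intensity as predictors.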
9:30–10:00
Toward Customizable Forced-Alignment for Dialectal Arabic Speech
Rachel Meyer
Forced alignment is a staple tool for research in phonetics and phonology, offering precise word- and phone-level alignment of audio signals with pre-written transcriptions. Transcriptions, although time-consuming to produce, ensure high levels of accuracy. Even when they require hand-correction, mediocre alignments can greatly speed up data processing for phonetic analysis. Only one of the most used forced aligners, MAUS (Kisler et al., 2017), has any Arabic support at all, and it only provides word-level alignment and transcription for Modern Standard Arabic (MSA). This study presents first steps toward word- and phone-level forced alignment for dialectal Arabic.
The forced aligner is trained through the Montreal Forced Aligner (MFA) framework, which allows users to create their own models (McAuliffe et al., 2017). MFA requires four inputs to align data: audio, transcripts, a pronunciation dictionary, and an acoustic model. The audio and transcripts originate from the Massive Arabic Speech Corpus (MASC), a 1200-hour corpus of subtitled Arabic-language YouTube videos (Al-Feytani et al., 2023). Preliminary results come from a subset of twenty-four hours of MSA audio and eight hours of Egyptian (EG) audio, classified as “clean” (versus “noisy”) by the authors of MASC.
The pronunciation dictionary was generated by modifying a large MSA pronunciation dictionary (Doherty, 2016). Each of the over 857,000 Arabic headwords in the dictionary (often morphologically complex) is accompanied by one or more phonetic transcriptions. This dictionary was modified in Python to accommodate Egyptian phones in a manner that allows for future customization for any dialect. The preliminary focus is on consonants, so the modifications allow users to specify which phone or phones each Arabic letter corresponds to. For example, for an MSA dataset, users can specify that ج corresponds to /dʒ/, for an Egyptian dataset it may correspond to /g/, and for a Levantine dataset it may correspond to /ʒ/ (Youssef, 2021). 
For a dataset with multiple dialects, all three phones may be included. The script then generates new dictionary entries. For example, the word جميل “beautiful” could have any combination of the pronunciations [dʒamiːl, ʒamiːl, gamiːl] etc., depending on what phones ج was specified for.
The pronunciation dictionary was generated with multiple phones for five target consonants that differ between MSA and EG: ث (MSA [θ], EG [t]), ج (MSA [dʒ], EG [g]), ذ (MSA [ð], EG [d]), ظ (MSA [ðˁ], EG [dˁ]), and ق (MSA [q], EG [ʔ]) (Watson, 2002; Haddad, 2023). Other pronunciations in the dataset were ignored for simplicity in the early stages of developing the aligner. Once the dictionary had been generated, an acoustic model was trained using MFA’s train function and then applied to the data. To analyze training performance, counts of each transcription were tabulated (Table 1).
Table 1. Counts (and percentage) of MSA and Egyptian Arabic phones for the five target letters.
Contrary to expectations, only for ق did the Egyptian data have a larger number of the expected phone than the MSA data. These results suggest that the alignment training was not very successful. However, performance can be improved in future models by retraining the model with some corrected reference alignments, increasing the dataset size, and performing additional cleaning/normalization prior to training. In spite of the unsuccessful model demonstrated here, the ability to create custom, dialect-specific pronunciation dictionaries overcomes one major hurdle in the creation of accurate forced alignment.
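The dictionary expansion described in this abstract can be illustrated with a short Python sketch. It is not the author's script: the names `PHONE_OPTIONS`, `expand_pronunciations`, and `dictionary_lines` are hypothetical, the mapping is keyed by MSA phone as a simplifying assumption (the abstract describes a letter-to-phone setting), and the tab-separated output layout is assumed for illustration.

```python
from itertools import product

# Hypothetical user setting: for the MSA phone of each target letter, the
# phones allowed in the dataset. Here ج may surface as the MSA, Egyptian,
# or Levantine phone, following the example in the abstract.
PHONE_OPTIONS = {"dʒ": ["dʒ", "g", "ʒ"]}

def expand_pronunciations(msa_phones, phone_options):
    """Return every variant of an MSA transcription (a list of phones)."""
    # Phones without an entry in phone_options keep their single MSA value.
    slots = [phone_options.get(phone, [phone]) for phone in msa_phones]
    return [list(variant) for variant in product(*slots)]

def dictionary_lines(word, msa_phones, phone_options):
    """Format each variant as a 'word<TAB>space-separated phones' line."""
    variants = expand_pronunciations(msa_phones, phone_options)
    return [f"{word}\t{' '.join(variant)}" for variant in variants]

for line in dictionary_lines("جميل", ["dʒ", "a", "m", "iː", "l"], PHONE_OPTIONS):
    print(line)
```

Taking the Cartesian product over per-phone options means a word containing two target consonants gets every combination of their variants, which matches the "any combination of the pronunciations" wording above.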
10:00–10:30
Sonority as a design principle in Arabic quadriliteral root architecture
Mashaell Almuhawes and Ali Idrissi
Despite extensive work on Arabic triliteral roots, the internal organization of quadriliteral roots (QRs) remains underanalyzed. Greenberg (1950) mentions them only briefly, essentially noting that they obey the same cooccurrence regularities as triconsonantal roots. Also, in both early Arab grammars and modern scholarship, QRs are viewed as extensions of bi- or triconsonantal roots formed through reduplication or augmentation and are often consigned to the onomatopoeic periphery (Bohas, 1997; El Zarka, 2005; Kuryłowicz, 1973). For example, reduplicated /zlzl/ ‘shake repeatedly’ is traced back to the biradical verb /zl/ ‘slip’, while an augmented root such as /šmxr/ ‘be proud’ may result from affixation, radical insertion, or the fusion of two triliteral roots. Many studies also note that QRs appear in old and recent loanwords (e.g., /sndn/ ‘anvil’ (Persian); /tlfn/ ‘telephone’ (French)) (Ferguson, 1959; Heath, 1989; Mahadin, 1996). Beyond these descriptive claims, what the literature lacks is a systematic typology of QRs and, crucially, a quantitative account of their structure and the principles (if any) that shape it.
We address this gap by examining relative sonority as a potential organizing principle in the structure of Arabic QRs. We propose three types of QRs: augmented, reduplicated, and primitive. Augmented QRs involve radical insertion as in /šmx-r/ ‘be proud’ from /šmx/ ‘be high’ (Wright, 1933:49), a process that continues to be productive (as in /rqm-n/ ‘digitize’ from /rqm/ ‘digit’). Reduplicated QRs typically feature repetition of primitive biliteral roots (e.g., Moroccan Arabic /žržr/ ‘drag’ from /žr/ ‘pull’) but may also derive from etymological triliteral roots (e.g., /hmhm/ ‘to murmur’ from /hms/ ‘whisper’), often expressing iterative, plural, or mimetic meanings (e.g., /zqzq/ ‘bird tweet’). 
Primitive QRs, by contrast, are morphologically non-derived and semantically independent.
To examine the phonotactics of Arabic QRs, we compiled a corpus of 1,575 primitive quadriconsonantal roots (C1C2C3C4) from Arabic dictionaries and modern usage sources. We calculated the sonority transitions within each bigram, C1-C2, C2-C3, and C3-C4, and found that QRs adhere to a distinct sonority frame. A repeated-measures ANOVA over our data revealed a significant effect of position on sonority (F(2, 2650) = 6.42, p = .003), suggesting that while sonority tends to increase from C1 to C2 and from C3 to C4, it decreases from C2 to C3. We argue that this rise-fall-rise pattern reflects an active sonority-based constraint that extends beyond the surface CaCCaC prosodic template associated with Arabic QRs (see /zalzal/, /šamxar/, /raqman/), and further propose that C1C2 and C3C4 clusters form distinct phonological domains, or sonority ‘shells’, reflecting the likely historical evolution of QRs through reduplication of some minimal lexical unit consisting of a sonority wave/contour with a sonority onset and a sonority peak/head.
We show that, in addition to its diachronic reality, this sonority frame has always been active in the mental grammar of Arabic speakers: it explains why both old and recent loan QRs tend to conform to this sonority pattern. A comparison of native primitive and loan QRs in our corpus reveals a striking parallel, suggesting that loan QRs that make it into the lexicon exhibit the same frame (e.g., Moroccan Arabic /srkl/ ‘walk around’ from French cercle ‘circle’; Gulf Arabic /bnšr/ ‘flat tire’ from English puncture). A non-parametric Friedman test comparing sonority distances across consonant positions indicated a significant effect of position on sonority distance in borrowed QRs (χ²(2) = 142.88, p < .001). 
Interestingly, while both primitive and borrowed QRs displayed a rise-fall-rise sonority pattern across positions, the effect was markedly stronger in borrowed roots (χ²(2) = 142.88, p < .001, W = .27) than in primitive roots (χ²(2) = 11.57, p = .003, W = .009), indicating a stronger positional sonority-based constraint in synchronic grammar. Additionally, the data show that borrowed QRs undergo phonological adaptation so that they fit the expected sonority frame. For instance, in sankar ‘to scan’, /snkr/ is extracted from scanner but underwent a /k/-/n/ metathesis; and in darzan ‘dozen’, /drzn/ was obtained through insertion of a sonorant /r/ in C2 (not in C3).
We discuss the broader implications of our findings for the synchronic nature and genesis of Arabic quadriconsonantal roots and for the broader models of phonological templates, abstract phonological constraints on lexical units, and loanword adaptation in modern Arabic.
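The bigram sonority transitions described in this abstract can be sketched in Python. The abstract does not state which sonority scale the authors used, so the five-step scale below is a hypothetical stand-in, as are the names `SONORITY_BY_CLASS`, `transitions`, and `is_rise_fall_rise`; the roots are the loan examples cited above.

```python
# Hypothetical five-step sonority scale (higher = more sonorous); this is
# an assumption for illustration, not the scale used in the study.
SONORITY_BY_CLASS = {
    1: ["b", "t", "d", "k", "q"],        # stops
    2: ["s", "z", "š", "ž", "x", "h"],   # fricatives
    3: ["m", "n"],                       # nasals
    4: ["l", "r"],                       # liquids
    5: ["w", "y"],                       # glides
}
SONORITY = {c: level for level, cs in SONORITY_BY_CLASS.items() for c in cs}

def transitions(root):
    """Sonority change across the C1-C2, C2-C3, and C3-C4 bigrams."""
    values = [SONORITY[c] for c in root]
    return [later - earlier for earlier, later in zip(values, values[1:])]

def is_rise_fall_rise(root):
    """True when sonority rises, then falls, then rises again."""
    first, second, third = transitions(root)
    return first > 0 and second < 0 and third > 0

# Loan roots cited in the abstract, written as tuples of consonants.
loan_roots = [
    ("s", "r", "k", "l"),
    ("b", "n", "š", "r"),
    ("s", "n", "k", "r"),
    ("d", "r", "z", "n"),
]
for root in loan_roots:
    print("".join(root), transitions(root), is_rise_fall_rise(root))
```

Under this assumed scale all four cited loan roots show the rise-fall-rise frame; a different scale could change individual transition sizes, so the sketch illustrates the procedure rather than reproducing the reported statistics.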
10:30–10:45 AM ☕ Coffee Break
Keynote III
Jalal Al-Tamimi
Université Paris Cité, France
On the Role of Fine Phonetic Detail in Arabic Phonology
10:45–11:45 AM
11:45 AM – 1:00 PM 🍽️ Catered Lunch & Business Meeting
L2 / Heritage L2 / Heritage Session 1:00–2:00 PM
1:00–1:30
Arabic Heritage Speakers’ Perception of Emphatic–Plain Contrasts: The Influence of Vowel Context and Consonant Position
Maaly Al Omary
Heritage speakers (HS) are individuals who grow up in households where a minority language is spoken and acquire the majority community language during early childhood (Montrul, 2016). Research on a variety of languages, including Hindi, Mandarin, Spanish, and Korean, demonstrates that heritage speakers often outperform second language (L2) learners in perceiving phonemic contrasts, a benefit attributed to early exposure to heritage language input (Tees & Werker, 1984; Au et al., 2002; Knightly et al., 2003; Oh et al., 2003; Godson, 2004). In Arabic, emphatic consonants are a defining feature of the phonological system. These sounds are produced with a secondary constriction in the pharyngeal or velar region, which distinguishes them from their plain counterparts (Watson, 2002). Research on vowel context and consonant position has shown that these factors affect the perception of these contrasts, as coarticulatory effects and emphasis spread can modulate their perceptual salience (Jongman et al., 2011; Al-Masri & Jongman, 2004; Hayes-Harb & Durham, 2016). Recognizing these factors, the present study investigates how vowel context and consonant position influence the perception of emphatic–plain contrasts in Arabic heritage speakers of Levantine descent compared to English-speaking L2 learners. We hypothesized that heritage speakers would outperform L2 learners in accuracy due to early exposure to the heritage language.
Participants included eighteen Arabic heritage speakers (of Jordanian, Syrian, or Palestinian descent) and eighteen L2 learners (American), who were undergraduates enrolled in intermediate or advanced university Arabic courses. The stimuli consisted of monosyllabic Arabic words sampled from university-level textbooks, containing emphatic (/dˤ/, /tˤ/, /sˤ/, /ðˤ/) and plain (/d/, /t/, /s/, /ð/) consonants in both word-initial and word-final positions, across three short vowel contexts (/æ/, /u/, /i/). A female native speaker of Levantine Arabic recorded all the words. 
During the auditory forced-choice identification task, participants were instructed to respond as quickly and accurately as possible. In each trial, they heard a single word and responded to the prompt “Which word did you hear?” For example, they distinguished between [sˤæb] ‘he poured’ (PST.3MSG) and [sæb] ‘he pulled’ (PST.3MSG).
Data were analyzed using logistic regression, with Group (HS vs. L2), Consonant type (emphatic vs. plain), Vowel type (/æ/, /u/, /i/), and Word position (initial vs. final) as predictors, along with their interactions. Heritage speakers demonstrated significantly higher identification accuracy (M = 74.7%, SD = 9.2) than L2 learners (M = 59.5%, SD = 8.7; Group (L2 – HS): β = -1.2041, SE = 0.299, z = -4.023, p < .001), indicating an advantage conferred by early exposure. Consonant type had a marginally significant effect (β = 0.7152, SE = 0.373, z = 1.916, p = 0.055), with better performance for plain (70.0%) than emphatic (63.2%) consonants. A significant interaction between group and word position was observed (β = 0.5734, SE = 0.292, z = 1.965, p = 0.049): heritage speakers showed similar accuracy in initial (75.5%) and final (73.8%) positions, while L2 learners’ accuracy declined from initial (65%) to final (53.9%) positions. There was also a significant interaction between consonant type and vowel context (/u/ – /æ/, β = -1.6426, SE = 0.432, z = -3.800, p < .001), with plain consonants identified most accurately in the /æ/ context and emphatic consonants more accurately in the /u/ context. An interaction between vowel context and word position indicated that accuracy differences between initial and final positions were more pronounced for /æ/ than for /u/. Reaction times were examined using linear regression with the same predictors. 
While the main group effect was not significant (β = -920.6, SE = 1111, t = -0.83, p = 0.407), a significant interaction emerged between group and consonant type (β = 5001.8, SE = 1571, t = 3.18, p = 0.001): heritage speakers showed a smaller reaction time difference between emphatic and plain consonants, whereas L2 learners had a larger disparity. Furthermore, significant three-way interactions among group, consonant type, and vowel context were found ((i – æ): β = -5153.4, SE = 1924, t = -2.68, p = 0.007; (u – æ): β = -5183.0, SE = 1924, t = -2.69, p = 0.007), indicating that heritage speakers’ reaction times were stable across vowel contexts, while L2 learners’ varied according to both consonant and vowel.
These findings highlight the perceptual advantages of early exposure for heritage speakers, while also suggesting that continuous input is crucial for developing native-like processing. Neither group achieved the accuracy level reported for native speakers in Jongman et al. (2011). The absence of a word position effect among heritage speakers suggests the use of more generalized perceptual strategies than those seen in other heritage language contexts (cf. Oh et al., 2003, for Korean). This study presents new evidence of how linguistic experience influences Arabic speech perception and offers implications for second language acquisition, heritage phonology, and curriculum development.
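Because the accuracy model in this abstract is a logistic regression, its coefficients are on the log-odds scale, and exponentiating one gives an odds ratio. The sketch below applies this to the Group coefficient reported in the abstract; the helper name `odds_ratio` is hypothetical, and the reading of the result assumes the other predictors are held at their reference levels, since the model includes interactions.

```python
import math

def odds_ratio(beta: float) -> float:
    """Convert a log-odds coefficient into an odds ratio."""
    return math.exp(beta)

group_beta = -1.2041  # Group (L2 - HS), as reported in the abstract
print(round(odds_ratio(group_beta), 2))
```

On this reading, the odds of a correct identification for L2 learners are roughly 0.3 times those for heritage speakers at the reference levels of the other predictors.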
1:30–2:00
Crosslinguistic Influence and Protracted Instability in L2: Generic Plurals in Arabic
Mahmoud Azaz
Background. The Interface Hypothesis explains why certain constructions are more susceptible to crosslinguistic influence and protracted instability in advanced L2 learning (Sorace, 2005; Sorace & Filiaci, 2006; Sorace, 2011; White, 2011). Constructions in which syntax interfaces with other internal modules (the syntax-semantics interface) are differentiated from external interfaces, in which syntax interfaces with other “higher” modules (the syntax-pragmatics/discourse interface). The focus on the L2 acquisition of interface properties has recently extended to explore crosslinguistic influence and the learnability conditions that modulate the protracted instability of these properties (Sorace, 2005; Sorace & Serratrice, 2009; Valenzuela, 2006).
Interface Case in Question. The cases in question are the definite and bare plurals in MSA and English. Work on the syntax-semantics interface of definiteness in MSA and English has shown clear differences in how definite and bare nominals are interpreted in preverbal and postverbal positions (Fehri, 2004). Three interpretative possibilities are discussed: generic, specific, and existential. For genericity or kind reference, English uses three noun phrase types in the preverbal position, one of which is the bare plural. More importantly, a generic reading is not available for definite plurals as illustrated in (1). In MSA, for generic reference to types, definite singular and plural nouns are used interchangeably in the preverbal position (Fehri 2004) as illustrated in (2) and (3). Interestingly, in MSA bare plurals are grammatical in the preverbal position, but they are assigned an existential reading as illustrated in (4). This case is commonly construed to be an instance of an internal interface.
Research Questions. (1) To what extent do three English-speaking learner groups of Arabic exhibit crosslinguistic transfer in the production of generic definite plurals? 
This question will be addressed by comparing production data of three groups of L2 Arabic learners; and (2) does input exposure as a learnability condition modulate protracted instability of generic definite plurals? This question will be addressed by comparing production data for advanced learners who studied Arabic at home and abroad. A native speaker group was used as a control group in both questions.
Data and Results. Data were collected using a prompted sentence completion task that asked the participants to read each sentence to understand the meaning established and provide the single missing word (spoken and written) with the help of a picture. This was in addition to a prompted oral narrative task that elicited the generic reading and the specific reading at the sentence level. Both tasks are commonly held to provide spontaneous data in L2 studies. Accuracy rates for the suppliance of the definite article for the first question for the two tasks are provided in the tables below. A mixed-design ANOVA was conducted on the group results. It returned a significant main effect for group: F(3,47) = 46.039, p < .001, ηp² = .746. There was also a significant main effect for the reading condition: F(1,47) = 121.083, p < .001, ηp² = .720, as well as a significant interaction between group and the reading condition: F(3,47) = 14.189, p < .001, ηp² = .475. Overall, these results suggest considerable crosslinguistic influence from English in the generic reading, in the form of bare plurals that persisted until advanced learning. This effect resulted in protracted instability of definite plurals. Only the high-advanced group participants were able to overcome this instability and attain a high rate.
This pattern was also found in the results of the oral narrative task. The low-advanced group clearly fluctuated between definite and bare plurals in the generic condition. 
The high-advanced participants steadily opted for definite plurals and did not show fluctuation between definite and bare plurals. Accuracy rates for the suppliance of the definite article for the second research question are provided in the tables below.
For the advanced at-home (advanced-AH) group, performance overall was far from target-like. The average score of target definite plurals was surprisingly at chance (49.66%; SD: 0.31), and the average score of non-target bare plurals was slightly higher (50.33%; SD: 0.31), with no difference between the two averages in a paired-samples t-test: t(9) = 0.9692, p = 0.3608. This pattern of fluctuation between bare and definite plurals showed that the advanced at-home participants were exhibiting protracted instability as a manifestation of persistent L1 effects. In comparison, the advanced study-abroad participants demonstrated an entirely different pattern. They demonstrated a considerable degree of stability, as the average score of their definite plurals was high (93.28%; SD: 0.14), and the average score of their bare plurals dropped strikingly (6.72%; SD: 0.14), with a significant difference between the two averages: t(9) = -7.4703, p = 0.0002. The same pattern was found in the oral narrative task.
Discussion and Conclusion. The fluctuation patterns that the low-advanced group exhibited in the generic reading condition are taken as a manifestation of the persistent effects of L1 English. They are also consistent with the predictions of the original proposal of the Interface Hypothesis. The integration of semantic knowledge (generic vs. specific) with syntactic knowledge (bare vs. definite plurals) remained far from optimal and gave rise to protracted instability. This instability was caused by competition with the bare plurals in their L1 English. 
Ionin and Montrul (2010) and Montrul and Ionin (2012) concluded that the complexity of mapping definite and bare plural nouns to their meanings (whether specific or generic) is one of the reasons this grammatical property is more vulnerable to persistent L1 effects in sentence interpretation even at higher levels of proficiency. The results of the beginning and low-advanced groups are consistent with the tenets of the Bottleneck Hypothesis (Slabakova, 2019), which seeks to explain what is hard and what is easy to acquire in a second language. It proposes that functional morphology (such as definiteness) is the bottleneck of L2 acquisition. The results of the study-abroad group support the role of input exposure in stabilizing this particular interface property. It is possible that this group utilized the non-occurrence of generic bare plurals in the preverbal position as a cue in the input to map a generic reading only to definite plurals. The question of which properties are harder to acquire than others will be further discussed.
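The partial eta squared values reported with the ANOVA in this abstract can be recovered from the F statistics and their degrees of freedom using the standard identity ηp² = (F · df_effect) / (F · df_effect + df_error). The sketch below is a consistency check under that identity, with a hypothetical helper name, and is not part of the author's analysis.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its dfs."""
    explained = f_value * df_effect
    return explained / (explained + df_error)

# (label, F, df_effect, df_error) as reported in the abstract
reported = [
    ("group", 46.039, 3, 47),
    ("reading condition", 121.083, 1, 47),
    ("group x reading condition", 14.189, 3, 47),
]
for label, f_value, df_effect, df_error in reported:
    print(label, round(partial_eta_squared(f_value, df_effect, df_error), 3))
```

The three recomputed values round to .746, .720, and .475, which agree with the effect sizes given in the abstract.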
2:00–2:15 PM ☕ Coffee Break
Pragmatics Pragmatics / Discourse 2:15–4:30 PM
2:15–2:45
Stylistic Terms of Address among Couples in Egyptian TV Series
Hasnaa Essam Farag
“Oh, my beloved,” “chum,” “oh woman,” “my girls’ father,” “oh sheikha,” “my soul,” “oh my unfortunate outcome”! The art of address terms among lovers and spouses is a profoundly social practice shaped by societal perceptions. Performative interactions in Egyptian TV social dramas provide a rare public window into these intimate linguistic practices during private moments, reflecting the societal perceptions of their creators, who both mirror and reshape their communities’ linguistic practices in everyday interactions. This sociopragmatic study provides an original analysis of the nuances of Egyptian address terms from a stylistic perspective, focusing on how TV social dramas depict these terms among couples from intersecting social backgrounds (sociolinguistics) and their discursive social goals (pragmatics), comparing them with existing stereotypes about gender. The study demonstrates how 588 tokens of nominal address terms by 24 couples in six Egyptian TV social dramas (2016-2024) construct significant styles associated with specific social groups. The data capture 144 interactions (scenes), totalling 192 minutes, with six interactions lasting a total of eight minutes per couple, where each spouse has a balanced amount of speaking time, across 95 episodes, written by 18 authors (9 men and 9 women). The data maintain a balanced representation of couples who are protagonists with leading roles, belonging to two gender groups (husbands and wives), two age groups (young, 20-39, and middle-aged, 40-59), and three social classes (upper, middle, and working).
The concept of ‘style’ was first introduced in sociolinguistics by Labov (1973), then expanded by Coupland (1980, 2007), along with many linguists (e.g., Mejdell, 2006), to identify the distinctive ways of speaking associated with social groups by the frequent co-occurrence of a linguistic form with a specific group more than others, serving social functions. 
I argue that address terms are often a stylistic choice connecting lexical meanings to particular social groups and discursive goals, and that their social use on screen is not arbitrary but governed by social rules. Terms of address have not been explicitly or deeply explored from a stylistic perspective in the Arab world and beyond, primarily due to the emphasis on their lexical meanings, whether with reference to power and solidarity (e.g., Brown and Gilman, 1960), power and politeness (e.g., Brown and Levinson, 1978), sociolinguistics including social variables (e.g., Qin, 2008), or pragmatics including their general or specific functions (e.g., Mühleisen, 2011). Even sociopragmatic studies on address terms integrating social variables and overall or specific functions demonstrate these aspects separately without explicitly or thoroughly examining terms of address as a stylistic choice (e.g., Al-Balqa’, 2021; Parkinson, 1985). This work takes a step further by going beyond binary approaches and comprehensively combining previously isolated frameworks under the lens of style in the performative landscape, adding to our sociopragmatic understanding of the art of address terms in intimate relationships in Egypt.
The study reveals that love and kinship terms are perceived as the most significant stylistic forms of address among couples in Egyptian Arabic. Social identities, such as gender, age, and social class, interact to influence love and kinship terms, with no single identity being the most significant. The Egyptian TV series construct these intersecting social identities among couples through these terms. Love terms of address, like ‘/ya ḥabebti/’ (my beloved), distinguish young upper-class men when addressing their wives. In contrast, kinship terms of address, such as ‘/ya xōya/’ (chum, literally brother), characterize middle-aged working-class women when addressing their husbands. 
Both young upper-class men and middle-aged working-class women instrumentalize these styles to achieve discursive social goals and manage their marital relationships rather than simply expressing love or referring to actual familial relationships. Such portrayals both negotiate and reinforce broader claims about gender. For example, young upper-class men’s style of love terms of address negotiates the common perception of men as emotionally unlettered (e.g., Cameron, 2008). Meanwhile, the style of middle-aged working-class women in using kinship terms, along with their primary social goal of blaming, criticizing, and complaining, reinforces the common perception that women often express negative emotions and complaints (e.g., Talbot, 2010). The study encourages further research on how styles of address terms are constructed and reconstructed, both in spontaneous and performed speech, across various relationships in the Arab world and beyond.
2:45–3:15
From Khilāfah to Dawlah Madaniyyah: A CDA of Ideological Rearticulation in the Muslim Brotherhood's Political Language
Mohamed Sayed
This paper employs Critical Discourse Analysis to examine the linguistic evolution within the Egyptian Muslim Brotherhood's political discourse, specifically its shift from advocating for the Khilāfah to promoting a Dawlah Madaniyyah (civil state). By analyzing key Arabic texts and statements from figures like Ḥasan al-Bannā, Sayyid Quṭb, ʿAbd al-Qādir ʿUda, and Muḥammad Mahdī ʿĀkif, the study investigates how changes in word choice, modality, and intertextual references reflect broader ideological transformations.
Using Fairclough’s three-dimensional CDA model, the analysis highlights the discursive strategies employed by Brotherhood leaders to construct notions of legitimacy, governance, and religious authority. The findings indicate that while the term Khilāfah became less frequent, its core concepts of unity, shūrā (consultation), and Sharīʿah were recontextualized within the more adaptable framework of the Dawlah Madaniyyah. This linguistic change, I argue, represents not a break from previous ideology but rather a rephrasing of Islamic political ideals to suit the demands of modern political discourse.
This study contributes to Arabic discourse studies by illustrating how political-religious movements navigate modernity through linguistic adaptation, offering insights into the ideological functions of Arabic political terminology in post-Islamist settings.
3:15–3:45
Vocatives as Attitudinal Markers: The Tunisian Arabic Particle ha:
Amel Khalfaoui
Vocatives are expressions used to attract the hearer’s attention to the proposition of an utterance (Lambrecht 1996). This study examines the Tunisian Arabic particle ha:, a lesser-known vocative counterpart to ya:, as in ya: Sonia / ha: Sonia. While ya: has been analyzed in previous work (e.g., Al-Bataineh 2020; Haddad 2020; Shormani & Qarabesh 2018), ha: has not been investigated in any capacity. I argue that although both particles introduce vocative phrases, only ha: serves an additional pragmatic function: it functions as an attitudinal marker, guiding the hearer to recognize the speaker’s emotional stance toward the addressee.
Corpus evidence shows that ha: is more constrained than ya: in frequency and distribution. Its use is limited to utterances in which the speaker expresses a negative or critical attitude toward the addressee. For example, in (1), a YouTube commentator uses ha: to mock an actress who changed her Arabic name, Jalila, to the Latinate name, Julia, conveying disapproval and sarcasm.
(1) hhhh ha: ʒali:la rak men tu:nis. esm-ik ʒalilaaaaa
    hhhh. VOC Jalila EMPH from Tunisia name-your Jalilaaaaa
    ‘hhhh Jalila, you are from Tunisia, [and] your [real] name is Jalilaaaaa’
In contrast, when the speaker expresses a positive or neutral (balanced) attitude toward the addressee, only ya: is felicitous, whereas ha: results in infelicity, as shown in (2) and (3). In (2), the speaker expresses support for the president by urging him not to reverse his decision to suspend the parliament. Here, ya: appropriately conveys respect and solidarity, while ha: would convey an unintended interpretation of criticism or mockery. Similarly, in (3), a blogger asks other participants for advice on how to add friends on a social media platform. The context provides no evidence of a positive or negative stance toward the addressees. 
Using ha: in this context would therefore introduce an unnecessary evaluative meaning such as blaming others for being unhelpful and could lead to miscommunication.
(2) la: ruʒu:ʕ ya:/#ha: siya:dit r-raʔi:s
    NEG going back VOC Mr. the-president
    ‘There is no going back, Mr. President.’
(3) Ya:/#ha: ʒma:ʕa ʃniyya l-ħall mtɛ:ʕha l-ħka:ya haði
    VOC folks what the-solution POS the-matter this
    ‘Folks, what is the solution to this matter?’
Building on research on attitudinal particles, this study adopts Relevance Theory (Sperber & Wilson 1986/1995) to argue that the Tunisian Arabic particle ha: semantically encodes a procedural instruction directing the hearer to recognize the speaker’s emotional or attitudinal stance toward the addressee. In doing so, ha: serves as a guarantee of relevance, explicitly prompting the hearer to seek additional contextual assumptions to arrive at the intended interpretation. Thus, while ya: can occur in contexts similar to (1), only ha: explicitly instructs the hearer to interpret the utterance as conveying an emotional attitude.
3:45–4:15
Discourse of Manipulation in Egyptian Presidential Interviews (2012): A Pragma-Critical Analysis
Maged Nofal
Wodak & Meyer (2001) state that “language is not powerful on its own – it gains power by the use powerful people make of it.” In line with this view, the present study investigates the language of manipulation employed by the most influential presidential candidates in their TV interviews during the 2012 Egyptian presidential election. At that time, Egypt had never experienced a democratic regime, and the 2012 election was deemed the first free, competitive presidential election in its history. This led many candidates to employ manipulative strategies in an attempt to influence public opinion and gain electoral support.
This study examines the intersection of pragmatics and critical discourse analysis (CDA) to uncover manipulative strategies used by Egyptian presidential candidates following the 2011 revolution. TV interviews with five leading candidates from the 2012 election (Mohamed Morsi, Ahmed Shafiq, Amr Moussa, Hamdeen Sabahi, and Abdelmonem Aboulfotoh) were analyzed using a pragma-critical framework that integrates conversational implicature, presupposition, and speech acts with Van Dijk’s socio-cognitive model and Aristotle’s rhetorical appeals (ethos, pathos, logos).
By merging micro-level pragmatic insights with macro-level ideological critique, the analysis reveals how candidates strategically used language to construct credible personas, trigger emotions, and achieve manipulative discourse aimed at getting elected. The study also explores how positive self-representation and negative othering were used to manipulate public perception.
This interdisciplinary approach demonstrates the value of integrating pragmatics and CDA in political discourse research. It highlights how applied linguistics can engage with real-world challenges. Finally, it positions political discourse as a space where language evolves, ideologies are shaped, and people’s values and beliefs can be influenced.
4:15–4:30 PM ☕ Coffee Break
Keynote IV
Nihal Nagi Sarhan
Ain Shams University, Egypt
Scripts of Power: Decoding Cairo's Linguistic Landscape
4:30–5:30 PM
8:15–8:45 AM ☕ Coffee
Morphology Morphology and Lexicon Session 8:45–10:45 AM
8:45–9:15
Arabic Binominal Compound Constructions & Word Formation
Anthony Bucco
This paper investigates how two binominal constructions in Standard Arabic (bare noun-noun compounds and construct state nominals) pattern with and diverge from their well-studied Hebrew counterparts, what these similarities and differences imply for theories of word formation, and whether word formation is a distinct, sequential module within the generative process. Taking as its starting point Borer’s (1988) proposal that word formation is a module that operates at multiple levels of generative grammar, the paper shows that the Arabic data, especially in the domain of idiomatic and calqued expressions, call for an enriched account that incorporates insights from Construction Morphology (Booij 2010).
Borer’s analysis of Hebrew distinguishes between bare compounds (e.g. beyt xolim [house sicks] ‘hospital’) and construct state nominals (e.g. ca’if ha-yalda [scarf the-girl] ‘the girl’s scarf’) on several dimensions: definiteness marking, the locus of primary stress, the availability of internal modification, and the behavior of number features on the head and complement nouns. In her account, bare compounds are formed lexically and exhibit semantic opacity consistent with the Lexical Integrity Hypothesis, while construct state nominals are syntactically formed and phonologically unified, with features such as plurality and definiteness percolating via a process she terms Secondary Percolation. These patterns motivate the view that word formation is not confined to a single pre- or post-syntactic component, but can apply at multiple points in the derivation.
Arabic features an analogous pair of binominal constructions. As in Hebrew, both can mark plurality and definiteness on the head, and both show genitive marking of the complement noun. However, a closer look at the semantic behavior of the Arabic constructions complicates a straightforward adoption of Borer’s architecture. Arabic exhibits (i) compounds with relatively transparent, often modern calque-like meanings (e.g. ālat zamān [machine time] ‘time machine’), and (ii) construct state nominals with non-compositional, often lexicalized or metaphorical meanings (e.g. ’eid al-mīlad [festival the-birth] ‘Christmas’; rijl al-qanṭur [foot the-centaur] ‘(the constellation) Alpha Centauri’). This distribution contrasts with the strongly lexical-idiomatic character of Hebrew compounds that underpins Borer’s division of labor between lexical and syntactic word formation.
A survey of Arabic binominal expressions shows that idiomatic or non-compositional meanings in Arabic can be systematically associated with particular constructional schemas. Non-compositional construct state nominals tend to denote named categories and specific entities (epithets, species, toponyms, astronomical objects), while many compounds function as calques for technical tools, artifacts, or abstract concepts modeled on European source languages. These patterns suggest that, in Arabic, idiomaticity is not tightly tied to a single structural configuration (compound vs. construct) but to higher-level constructional schemas that cut across the traditional lexical/syntactic divide.
I therefore propose that Borer’s insights about feature percolation, opacity, and word-formation strata can be maintained, but that the interpretive component must be enriched by constructional schemas of the sort posited in Construction Morphology. For Arabic, we can posit at least two such schemas: (i) a “calque-compound” schema that licenses N–N sequences denoting instantiations of tools, implements, or conceptual types, typically with an indefinite genitive complement; and (ii) a “non-compositional construct” schema that licenses definite construct state nominals interpreted as metaphorical exemplars of a category possessed or modified by an attribute. Genitive case and (in)definiteness then conspire with these schemas to yield the observed contrast between specific, lexically anchored referents and more general, instantiated concepts.
The broader theoretical implication is that a purely generative, structure-only account of word formation is insufficient once we consider cross-linguistic variation in idiomaticity and calquing strategies. Arabic shows that the same structural template (e.g. the construct state nominal) can host both compositional and non-compositional meanings, depending on its constructional schema. Semitic word formation can be better understood through a multi-level word-formation module with a construction-based semantics. Future directions for this proposal include a more rigorous corpus-based investigation of colloquial varieties, as well as further research into the full range of Arabic calquing strategies and how a more complex calquing system might inform the outlined schemas.
9:15–9:45
Evaluating Arabic Morphological Analyzers on Moroccan Arabic Diminutives
Fedoua Rahmaouy
In Arabic NLP, most work on morphological processing has focused on Modern Standard Arabic (MSA) and a few major dialects such as Egyptian, Levantine, and Gulf Arabic. Researchers initially developed MSA-based morphological analyzers, including early systems such as the Buckwalter Arabic Morphological Analyzer (BAMA; Buckwalter, 2002), the Standard Arabic Morphological Analyzer (SAMA; Maamouri et al., 2010), and later Farasa (Abdelali et al., 2016).
Subsequent tools expanded coverage to include dialectal morphology, especially for Egyptian, Levantine, and Gulf Arabic. MADAMIRA (Pasha et al., 2014) was among the first analyzers to incorporate dialectal variants, while the more recent CAMeL Tools (Obeid et al., 2020) integrates advanced analyzers such as CALIMA-Star. Despite these developments, coverage of many dialects remains limited.
Moroccan Arabic (MA) is one such dialect that remains underrepresented in tool development and evaluation. MA exhibits distinctive and complex morphological phenomena, one of which is diminutive formation. Diminutives are frequent in everyday speech and play an important role in highlighting the morphological richness of MA. Their complexity arises from irregular phonological processes that the base form undergoes, such as reduplication, syllable insertion, gemination, and degemination (Elmdari, 1999; Boudlal, 2001; Lahrouchi & Ridouane, 2016). These irregularities pose serious challenges for existing Arabic NLP tools.
Despite their frequency and linguistic significance, diminutive forms have received little to no computational attention. This lack of coverage reveals a significant gap in how current morphological analyzers process Moroccan Arabic. Hence, this study provides the first systematic evaluation of Arabic morphological analyzers on Moroccan Arabic diminutives. 
The dataset consists of approximately 200–300 Moroccan Arabic sentences, each containing one or more diminutive forms, manually annotated for lemma, part of speech (POS), and gloss. The analyzers under evaluation include MADAMIRA and CAMeL Tools. Analyzer performance will be evaluated using coverage (the proportion of diminutive tokens recognized), accuracy (for lemma, POS, and gloss), and precision, recall, and F1 (to measure the accuracy of morphological analyses against a gold-standard annotation). The evaluation is currently in progress, and preliminary observations suggest limited recognition of diminutive forms by existing analyzers. An error analysis will also be conducted to identify patterns of under-recognition and misanalysis. Full results and analysis will be presented at the conference. The results are expected to highlight systematic gaps in current analyzers, underscoring the need for developing morphological tools with broader dialectal coverage.
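The evaluation measures named above can be sketched as follows. This is a minimal illustration, not the study's evaluation script: the `Analysis` record, the `evaluate` function, the token identifiers, and the placeholder annotations are all hypothetical, and the exact definitions of precision and recall (a recognized token counts as correct only if lemma, POS, and gloss all match the gold annotation) are assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Analysis:
    """One morphological analysis of a diminutive token (hypothetical record)."""
    lemma: str
    pos: str
    gloss: str


def evaluate(gold: Dict[str, Analysis],
             predicted: Dict[str, Optional[Analysis]]) -> Dict[str, float]:
    """Score analyzer output against gold annotations.

    `predicted` maps a token id to an Analysis, or to None when the
    analyzer returned nothing for that token.
    """
    total = len(gold)
    if total == 0:
        raise ValueError("gold annotation is empty")

    # Tokens for which the analyzer produced some analysis.
    recognized = {tok: ana for tok, ana in predicted.items()
                  if tok in gold and ana is not None}
    # Assumption: fully correct means lemma, POS, and gloss all match.
    correct = sum(1 for tok, ana in recognized.items() if ana == gold[tok])

    coverage = len(recognized) / total
    precision = correct / len(recognized) if recognized else 0.0
    recall = correct / total
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)

    def field_accuracy(name: str) -> float:
        # Per-field accuracy over all gold tokens (unrecognized tokens count as wrong).
        hits = sum(1 for tok, ana in recognized.items()
                   if getattr(ana, name) == getattr(gold[tok], name))
        return hits / total

    return {
        "coverage": coverage,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "lemma_accuracy": field_accuracy("lemma"),
        "pos_accuracy": field_accuracy("pos"),
        "gloss_accuracy": field_accuracy("gloss"),
    }


# Placeholder data: four gold tokens with made-up annotations.
gold = {
    "t1": Analysis("lemma_a", "NOUN", "gloss_a"),
    "t2": Analysis("lemma_b", "NOUN", "gloss_b"),
    "t3": Analysis("lemma_c", "ADJ", "gloss_c"),
    "t4": Analysis("lemma_d", "NOUN", "gloss_d"),
}
predicted = {
    "t1": Analysis("lemma_a", "NOUN", "gloss_a"),   # fully correct
    "t2": Analysis("lemma_x", "NOUN", "gloss_b"),   # wrong lemma only
    "t3": None,                                     # not recognized
    "t4": Analysis("lemma_d", "NOUN", "gloss_d"),   # fully correct
}
scores = evaluate(gold, predicted)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Keeping coverage separate from precision and recall makes it possible to tell an analyzer that returns nothing for a diminutive apart from one that returns a wrong analysis, which is the distinction the planned error analysis relies on.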
9:45–10:15
Permeability and stratification in the Moroccan Arabic diglossic lexicon
Ali Idrissi and Ali Nirheche
The total assimilation of the Arabic definite marker l- is triggered by stem-initial coronal consonants, except for the fricative [ž] (an affricate in some dialects), which never assimilates in Standard Arabic (StA) but may or may not assimilate in spoken varieties. This pattern is illustrated by the StA and Moroccan Arabic (MA) cognates in (1). In MA, assimilation applies in (1b, d) but fails in (1f). Freeman (2016), Harrell (1962), and Heath (1987) propose that forms such as (1f) are borrowings from StA and therefore lexically specified as exceptions to the assimilation process. In a recent corpus study, however, Nirheche (2025) shows that the behavior of [ž] in MA is gradient rather than categorical: assimilation occurs in 96% of cases when [ž] precedes a consonant (1b), 84% before a schwa (1d), and 37% before a full vowel (1f). He proposes that in MA, l-ž assimilation is phonologically rather than lexically conditioned.
(1)  Standard Arabic                          Moroccan Arabic
     a. /l-žamal/ [lžamal] *[žžamal]          b. /l-žməl/ [žžməl] *[lžməl]       ‘camel (def.)’
     c. /l-žarab/ [lžarab] *[žžarab]          d. /l-žərba/ [žžərba] *[lžərba]    ‘scabies (def.)’
     e. /l-žanuub/ [lžanuub] *[žžanuub]       f. /l-žanub/ *[žžanub] [lžanub]    ‘south (def.)’
In this paper, we depart from both accounts and propose that MA l-ž assimilation is best captured within a stratified yet permeable lexicon, organized into a nativized MA ‘core’ and a StA ‘periphery’ stratum. This view situates MA within the broader theory of lexical stratification (Becker & Gouskova, 2016; Hsu & Jesney, 2018; Itô & Mester, 1995, 1999; Jurgec, 2010; Smith, 2018) while adapting it to a diglossic architecture in which strata, part of the overall grammatical system, are non-nested but crucially interact through porous boundaries (see Idrissi et al., 2021). This approach finds strong and hitherto undiscussed support in near-minimal cognates such as those listed in (2): core stems systematically undergo assimilation (2a,c), while peripheral ones resist it (2b,d). Whether an item undergoes assimilation is thus determined neither by simple variety affiliation (borrowed vs. non-borrowed) nor by its phonological environment (i.e., the makeup of the stem-initial syllable), but by lexical organization within a diglossic grammar.
(2)  Core items (MA)                                      Periphery items (StA)
     a. /l-žəbha/ [žžəbha] (pl. žbuh) ‘forehead’          b. /l-žabha/ [lžabha] (pl. žabah-aat) ‘frontline’
     c. /l-žayħa/ [žžayħa] (pl. žayħ-at) ‘misfortune’     d. /l-žaaʔiħa/ [lžaaʔiħa] (pl. žawaaʔiħ) ‘pandemic’
Further evidence for our permeable-stratified lexicon is found in the behavior of cognates such as those in (2) across different grammatical domains. This includes: (i) semantic shift: cognates often diverge in meaning (see (2a,c) vs. (2b,d)); (ii) morphological classes: cognates may select different morphological processes (e.g., sound plural (2b,c) vs. broken plural (2a,d)); (iii) phonemic inventory: MA core vocabulary lacks the glottal stop (StA /muʔmin/ vs. MA /mumən/ ‘believer’); and (iv) syntactic distribution: periphery participles avoid the MA circumfixal negative marker ma...š and select the particle maši (maši ħaaʔir ‘not confused’ but *ma-ħaʔir-š). Notably, borrowings tend to enter the core and behave accordingly (e.g., French jaquette > /l-žakita/ = [žžakita] ‘jacket’ and journal > /l-žurnal/ = [žžurnal] ‘newspaper’). These patterns show that stratification in MA extends beyond phonology to morphology, semantics, and syntax.
Crucially, we argue that the stratification defended in our model is not static: the core and peripheral strata remain parts of a single ‘converging’ synchronic grammar, where the MA and StA grammatical subsystems may be distinct yet, because their borders are fluid, interact dynamically at all levels of structure and processing (Idrissi et al., 2021). Categorical patterns of the type in (1) are characteristic of lexical items unambiguously housed either in the core or the periphery. The appeal of the proposed model, however, lies in the fact that it also predicts cases where lexical items, be they native or borrowed, can show dual/ambiguous behavior whereby they can easily shift between core and periphery depending on extragrammatical factors such as speaker literacy level, register or context, language attitudes, age, gender, or even subdialect (e.g., Arabic /l-žil/ ‘generation’ can be lžil or žžil; and French /l-žinirik/ ‘credits’ can be lžinirik or žžinirik), reflecting partial interaction between the two systems of the speaker’s mental grammar.
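The core/periphery proposal can be pictured with a toy lookup, given here only as an illustration of the logic and not as the authors' formal model. The `Entry` record, the stratum labels `"core"`, `"periphery"`, and `"dual"`, and the function `definite_forms` are hypothetical names; the stems and surface forms are taken from the examples in the abstract, and the sketch covers ž-initial stems only.

```python
import unicodedata
from dataclasses import dataclass
from typing import List


def nfc(text: str) -> str:
    """Normalize to NFC so that ž compares equal however it was typed."""
    return unicodedata.normalize("NFC", text)


@dataclass(frozen=True)
class Entry:
    stem: str      # ž-initial stem, without the definite marker l-
    stratum: str   # hypothetical labels: "core", "periphery", or "dual"


def definite_forms(entry: Entry) -> List[str]:
    """Return the predicted surface form(s) of l- + stem in this toy model."""
    stem = nfc(entry.stem)
    if not stem.startswith(nfc("ž")):
        raise ValueError("this sketch only covers ž-initial stems")
    assimilated = nfc("ž") + stem   # l-ž surfaces as žž
    unassimilated = "l" + stem      # l-ž surfaces as lž
    if entry.stratum == "core":
        return [assimilated]
    if entry.stratum == "periphery":
        return [unassimilated]
    if entry.stratum == "dual":
        # Items that shift between strata show both variants.
        return [unassimilated, assimilated]
    raise ValueError(f"unknown stratum: {entry.stratum!r}")


lexicon = [
    Entry("žəbha", "core"),         # 'forehead'
    Entry("žabha", "periphery"),    # 'frontline'
    Entry("žakita", "core"),        # 'jacket', a French borrowing in the core
    Entry("žil", "dual"),           # 'generation', variable
]
for entry in lexicon:
    print(entry.stem, entry.stratum, definite_forms(entry))
```

The point of the sketch is that the surface form is read off the stratum label rather than off the item's etymology or the shape of its first syllable, which is why a French borrowing and a native MA stem can pattern together.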
10:15–10:45
Differential Semantic Retention Patterns of Standard Arabic Lexicon in Regional Dialects
Attia Youseif
Arabic dialects exhibit remarkable variation in how they retain lexical items from Standard Arabic (SA). While some regional dialects preserve SA lexemes with their original meanings, others demonstrate significant semantic shifts, complete replacement, or specialized usage (Versteegh, 2014; Holes, 2004). This study presents a systematic investigation of differential semantic retention patterns across major Arabic dialects, examining how and why identical SA lexemes evolve differently across linguistic communities.
We introduce a comprehensive, multi-dialectal dataset that documents semantic retention patterns for 2,283 SA lexical items across 19 regional dialects, including Egyptian, Tunisian, Gulf, Levantine, Moroccan, and Yemeni varieties (Brustad, 2000). For each lexeme, we document: (1) the SA meaning, (2) dialectal meanings, and (3) frequency of usage. We leverage the data provided by the online dialectal dictionary mo3jam.com, which contains words and expressions from 19 regional dialects. Each linguistic item has received multiple votes from users, indicating whether they agree or disagree with its inclusion in the respective dialect. This dialect verification process is further vetted in two additional ways: [1] against multiple standard datasets for Arabic dialects, and [2] by eliciting judgements from native speakers of each dialect.
Our comparative analysis reveals four primary retention patterns: (1) Shared Retention – lexemes maintaining identical semantic content across dialects (e.g., akala "to eat"); (2) Differential Shift – lexemes exhibiting distinct semantic evolution in different regions (e.g., bakhīl: "stingy" in Egyptian vs. "lazy" in Tunisian) (Traugott & Dasher, 2001); (3) Synonymous Retention – different dialects selecting a lexeme from SA near-synonyms to express a given concept (e.g., ‘sleep’ is ‘yana:m’, ‘yarqudh’, and ‘yan3as’ in the Egyptian, Tunisian, and Moroccan dialects, respectively); and (4) Selective Retention – SA lexemes with multiple meanings where different dialects retain only one of those meanings (e.g., shaḥīḥ: "scarce" in Egyptian vs. "stingy" in Tunisian) (Nerlich & Clarke, 2003; Geeraerts, 2010).
This research contributes to Arabic dialectology by providing: (1) the first systematic cross-dialectal semantic retention dataset; (2) evidence-based patterns of semantic divergence; and (3) a framework for understanding why certain lexical items remain stable while others undergo rapid semantic change (Owens, 2013). The dataset is designed to support future research in comparative dialectal studies, sociolinguistic variation, and Arabic natural language processing (Habash, 2010).
10:45–11:00 AM ☕ Coffee Break
Grammar Grammar Issues Session 11:00 AM – 1:00 PM
11:00–11:30
The discourse function of second-person clitics in Khuzestani Arabic
Seyyed Hatam Tamimi Sa'd
Khuzestani Arabic (KhA) is a gΩlΩt dialect of South Mesopotamian Arabic spoken by Iranian Arabs in the southwestern province of Khuzestan. Despite its typological importance, KhA has remained largely understudied due to restricted academic access and sociopolitical marginalization (Leitner, 2022; Matras & Sakel, 2007). This dialect employs several light verbs (e.g., gɑʕΩd ‘standing’, gɑm ‘stood’, rɑħ ‘went’, dˤæɫ ‘stayed’, and ɣΩdæh ‘outpaced’) that denote various aspectual meanings (e.g., inceptive, durative) and that take lexical verbs as complements which denote the event itself (1).
(1) gɑm yԥ-rkԥdˤ
    stood.3sg.masc.pfv impf-run.3sg
    ‘He started to run.’
Light verbs may optionally host second-person dative clitics læ-k / lΩ-tʃ (singular masculine and feminine, respectively) or l-kæm / l-tʃæn (plural masculine and feminine, respectively) (2), which may also be hosted by the main verb (3). Attaching the clitic to the light verb is the unmarked pattern.
(2) gɑm-læ-k yԥ-rkԥdˤ          (3) gɑm yԥ-rkԥdˤ-læ-k
    stood.pfv-for-2sg.masc impf-run.3sg
    ‘He started to run.’
I argue that these clitics (i) are not genuine benefactive or dative markers, but (ii) serve a pragmatic-discourse function, namely, addressee involvement and event relevance. Four lines of evidence support this claim. First, the optionality of these clitics and their free occurrence with transitive, intransitive, and ditransitive verbs indicate that they are not selected arguments. Second, whether they attach to the light verb or to the main verb does not affect the truth conditions of the clause, leaving its core meaning largely unaffected. If they were genuine benefactives, their attachment site would necessarily correspond to the verb whose semantics they modify. Instead, their positional flexibility indicates a pragmatic or discourse-driven function. Nevertheless, a slight semantic shift is observed: attachment to the light verb emphasizes aspectual information such as event initiation or actualization, while attachment to the main verb emphasizes the event itself. Third, unlike in other Arabic dialects, KhA clitics are restricted to second-person forms, to the exclusion of first- and third-person forms, reflecting an addressee-oriented pragmatic function centered on addressee involvement. The infelicity of (3), which bears a non-second-person clitic, thus follows naturally from this restriction:
(3) * gɑm-ԥl-hæh yԥ-rkԥdˤ
      stood.3sg.pfv-for-3sg.fem impf.run.3sg
Final supporting evidence comes from negation: when these clitics co-occur with negated predicates, the result is semantically infelicitous (4):
(4) # mɑ-gɑm-læ-k yԥ-rkԥdˤ
      neg-stood.pfv-for-2sg.masc impf.run.3sg
      ‘He didn’t start to run.’
I argue that this infelicity arises because these clitics presuppose the initiation or actualization of the event, which conflicts with sentential negation asserting non-realization. This clash thus results in pragmatic presupposition failure. Negation becomes felicitous only when it contrasts with a presupposed affirmative, as in Givón’s (2001, p. 370) example, where negation conveys a contrary-to-expectation inference:
(5) A: What’s new?
    B: My wife isn’t pregnant.
Negation thus presupposes that the corresponding affirmative proposition (“My wife is pregnant”) was contextually expected. Similarly, in KhA, the second-person clitic construction presupposes an initiated or real event; when negation denies that event, the presupposition fails, yielding pragmatic anomaly.
KhA contributes new empirical evidence to the growing cross-dialectal literature on clitics (e.g., Bar Moshe, 2021; Haddad 2020a; Haddad 2020b), showing that KhA dative clitics serve as markers of addressee involvement and event salience rather than as argument-structure markers. These findings broaden our understanding of the syntax-pragmatics interface in Arabic through discourse-oriented morphology.
11:30–12:00
Evidence for Indexical Shift in Moroccan Arabic
Hafida Iguir
Indexicals in the complements of attitude verbs can shift their reference from the utterance context to the reported speech context in many languages (e.g. Amharic, Schlenker 1999; Zazaki, Anand and Nevins 2004; Anand 2006; Nez Perce, Deal 2014, 2019; Korean, Park 2016). The goal of this paper is to establish that d:a:ri:dZa (Moroccan Arabic, MA) also exhibits indexical shift (IS) in complement clauses, as shown in (1) with time and locative indexical adverbs, a phenomenon that to our knowledge has not been reported before.
(1) a. On Monday, June 3rd, Sana says: “I will leave tomorrow.” On June 10th, I tell Ayoub:
       ttnin   lli:  fa:t    sana  ga:l-t        li-jja:  b@lli:  Ga-tmSi:      Gdda:
       monday  C     passed  sana  say.PAST-F3S  to-me    C°      FUT-F3S.walk  tomorrow
       ‘Last Monday, Sana told me that she would leave the next day.’
    b. In Mirleft, Ayoub says: “people are nice here”. In Nantes I utter:
       m@lli:  ka:n         f-mi:rl@ft  ga:l          b@lli:  nnas    mzja:n-in  Hna:ja:
       when    be.PAST.M3S  in-mirleft  say.PAST.M3S  C°      people  nice-PL    here
       ‘When he was in Mirleft, Ayoub said that people are nice there.’
In (1a), the indexical time adverb Gdda: ‘tomorrow’ refers to the day after the day of the reported speech event (June 4th), not the day following the day of speech (June 10th). In (1b), Hnaja: ‘here’ designates the location of the attitude holder (Mirleft), not that of the speech act (Nantes). We offer six arguments to establish that this perspective shift has the hallmark properties of IS and is not reducible to alternative analyses, in particular (partial) quotational analyses (arguments iii-vi):
i. It is restricted to complements of speech report verbs (’say’, ’tell’) and thus unavailable in other subordinate contexts.
ii. Shifted adverbs are true indexicals: they cannot be analyzed as containing a bindable anaphor responsible for their shifted/anaphoric reading, since they cannot be bound by/covary with a quantifier, unlike t@mma ‘there’ or lG@dliH ‘the next day’ (cf. Kaplan 1989).
iii. Material in embedded clauses appears in nonverbatim form, even in clauses containing a shifted indexical (Shklovsky and Sudo 2014).
iv. Shifted indexicals are possible in embedded environments where an argument is interpreted de re (Deal 2019).
v. Long-distance extraction is available from an embedded clause containing a shifted indexical (cf. Sudo 2012), as shown in (2).
vi. Shifted indexicals obey the Shift-Together Constraint (Anand and Nevins 2004), which requires all indexicals in the same embedded domain to shift together: the two embedded adverbs in (2) can shift together (a) or both remain unshifted (b), whereas mixed readings (c-d) are ungrammatical. Crucially, the shift cannot be due to direct quotation, as the wh-phrase has been moved long-distance out of the embedded clause and quotations are islands for movement (Sudo 2012).
(2) Shifted context: Speaker asks this question on a Saturday in Nantes.
    Sku:n  ga:l-t        sana  ttnin   lli:  fa:t    f-mi:rl@ft  b@lli:  Ga-jSu:f-u:      ajju:b  Hna:ja:  Gdda:?
    Who    say.PAST-F3S  sana  monday  C     passed  in-mirleft  C°      FUT-see.3SM-him  ayoub   here     tomorrow
    Lit: ‘Who did she say, last Monday, that Ayoub would see here, tomorrow?’
    a. ‘here’ = Nantes, ‘tomorrow’ = Sunday
    b. ‘here’ = Mirleft, ‘tomorrow’ = Tuesday
    *c. ‘here’ = Mirleft, ‘tomorrow’ = Sunday
    *d. ‘here’ = Nantes, ‘tomorrow’ = Tuesday
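The Shift-Together pattern in (2) can be checked mechanically with a small enumeration. The sketch below is only an illustration: the function name `readings` and the dictionary encoding of contexts are hypothetical, and the context values are read off the scenario in (2) (question asked on a Saturday in Nantes; reported speech on a Monday in Mirleft).

```python
from itertools import product
from typing import Dict, List, Tuple


def readings(indexicals: List[str],
             utterance_ctx: Dict[str, str],
             reported_ctx: Dict[str, str]) -> List[Tuple[Dict[str, str], bool]]:
    """Enumerate every way of resolving each indexical against one of the
    two contexts, and flag whether the result obeys Shift-Together
    (all indexicals in the domain resolved against the same context)."""
    results = []
    for choice in product(("utterance", "reported"), repeat=len(indexicals)):
        values = {
            ix: (utterance_ctx if ctx == "utterance" else reported_ctx)[ix]
            for ix, ctx in zip(indexicals, choice)
        }
        obeys_shift_together = len(set(choice)) <= 1
        results.append((values, obeys_shift_together))
    return results


utterance = {"here": "Nantes", "tomorrow": "Sunday"}    # Saturday, in Nantes
reported = {"here": "Mirleft", "tomorrow": "Tuesday"}   # Monday, in Mirleft

all_readings = readings(["here", "tomorrow"], utterance, reported)
for values, ok in all_readings:
    mark = " " if ok else "*"
    print(f"{mark} here = {values['here']}, tomorrow = {values['tomorrow']}")
```

Of the four logically possible readings, only the two uniform ones pass the check, matching the unstarred (a) and (b) against the starred (c) and (d).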
12:00–12:30
Answering systems and proposition salience: Evidence from Hijazi Arabic
Aisha Fuddah
There is a well-known distinction between two answering systems, often referred to as the truth-based system and the polarity-based system (Jones, 1999; Holmberg, 2016). These systems differ in how they answer yes-no questions like (1).
(1) Q: Does John not drink coffee?
In truth-based languages like Japanese, the negative proposition (John doesn’t drink coffee) is confirmed by ‘yes’ and the positive one (John drinks coffee) by ‘no’. In polarity-based languages like Swedish, ‘no’ confirms the negative proposition, and a special affirmative particle ‘yes.REV’ confirms the positive proposition. Holmberg (2016) links this contrast to the position of negation (i.e., NEG is low inside vP in truth-based languages and high outside vP in polarity-based ones). Based on novel data from Hijazi Arabic, we propose that answer particles target the semantically salient proposition, not negation. In truth-based languages, this is the entire proposition (including negation), whereas in polarity-based languages, it is the positive proposition even when the question is negative, as illustrated in (1'). Following Krifka (2013), we treat answer particles as operating on propositions introduced by the question, but show that in Hijazi Arabic, salience is fixed as the positive alternative, independent of discourse.
(1') a. Truth-based: Salient proposition is John doesn’t drink coffee
        A1: ‘yes’ (= he doesn’t drink coffee)        A2: ‘no’ (= he drinks coffee)
     b. Polarity-based: Salient proposition is John drinks coffee
        A1: ‘yes’ / ‘yes.REV’ (= he drinks coffee)   A2: ‘no’ (= he doesn’t drink coffee)
Hijazi Arabic, like many polarity-based languages (Holmberg 2016), has a three-particle answering system: ʔiwa ‘yes’ confirms the positive proposition, laʔ ‘no’ the negative, and ʔilla ‘yes.REV’ reverses the negative polarity to confirm the positive proposition. While ʔiwa and laʔ are felicitous with neutral questions (2), ʔilla is licensed only under negation (3).
(2) Q:  saːg xaːlid ʔas-sajjaːra?
        drove.3SM Khalid the-car
        ‘Did Khalid drive the car?’
    A1: ʔiwa.    yes       ‘Yes.’ (= he drove the car)
    A2: laʔ.     no        ‘No.’ (= he didn’t drive the car)
    A3: #ʔilla.  yes.REV   ‘Yes, he did.’
(3) Q1: xaːlid kaːn maː ji-ʒri fi-l-leːl?
        Khalid was.3SM NEG 3SM-run at-the-night
        ‘Was Khalid not running at night?’
    Q2: xaːlid maː kaːn ji-ʒri fi-l-leːl?
        Khalid NEG was.3SM 3SM-run at-the-night
        ‘Was Khalid not running at night?’
    A1: #ʔiwa.   yes       ‘Yes.’
    A2: laʔ.     no        ‘No.’ (= he wasn’t running at night)
    A3: ʔilla.   yes.REV   ‘Yes, he was.’
The questions in (3) also show that negation in Hijazi Arabic may occur inside vP (3Q1) or outside it (3Q2). In both cases, only laʔ and ʔilla are felicitous, while ʔiwa is not, indicating that the syntax-based account cannot explain the distribution of answer particles. Following Krifka (2013), we analyse answer particles as propositional anaphors referring to discourse referents introduced by the question. In neutral questions, LF introduces a positive proposition d, which ʔiwa asserts and laʔ rejects (¬d). In negative questions, LF introduces d and its negative counterpart d′; ʔilla asserts d and laʔ asserts ¬d. The observation that ʔilla is licensed only under negation indicates that d′ is introduced in such contexts, while the infelicity of ʔiwa suggests it is blocked when d′ is present. Crucially, this does not make d′ the salient alternative. Since ʔilla still asserts d, the positive proposition remains semantically salient. If d′ were salient, we would expect the particle system to shift accordingly: ʔiwa should then be able to assert d′ (= ‘Yes, Khalid wasn’t running at night’), and laʔ should be able to reject it (= ‘No, he was’). Discourse-biased contexts like (4) further support this analysis: regardless of bias, ʔilla asserts d, laʔ asserts ¬d, and ʔiwa remains infelicitous. Thus, in Hijazi Arabic (and perhaps in other polarity-based languages), the answering system is not sensitive to discourse-level salience, contrary to what we would expect in truth-based languages.
(4) Context: The speaker knows the hearer drinks coffee every morning, yet the cup remains untouched at noon.
    Q:  maː ʃəribt gahwa lissa?
        NEG drank.3SM coffee yet
        ‘Did you not drink coffee yet?’
    A1: laʔ.     ‘No.’ (= I didn’t drink coffee)
    A2: ʔilla.   ‘Yes, I did’
    A3: #ʔiwa.   ‘Yes.’
These observations suggest that answer particles are determined not by structure (Holmberg, 2016) or discourse (Krifka, 2013), but by how languages encode propositional salience. The Hijazi Arabic pattern thus shows that cross-linguistic variation in answer particles reflects differences in which proposition the grammar selects as semantically salient.
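The distribution of the three Hijazi Arabic particles described in this abstract can be summarized as a small lookup. This is a hedged sketch of the reported pattern only, not the authors' formal semantics: the function name `asserted` and the string labels `"d"` and `"not-d"` (standing for the positive proposition and its negation) are hypothetical.

```python
from typing import Optional

PARTICLES = ("ʔiwa", "laʔ", "ʔilla")


def asserted(particle: str, negative_question: bool) -> Optional[str]:
    """Return what the particle asserts as an answer, or None if the
    particle is infelicitous for that question type.

    "d" stands for the positive proposition and "not-d" for its negation;
    the positive proposition stays the salient one in both question types.
    """
    if particle == "laʔ":
        # laʔ yields the negative answer after neutral and negative questions alike.
        return "not-d"
    if particle == "ʔiwa":
        # ʔiwa asserts d, but is blocked once the negative counterpart is introduced.
        return None if negative_question else "d"
    if particle == "ʔilla":
        # ʔilla asserts d, and is licensed only under negation.
        return "d" if negative_question else None
    raise ValueError(f"unknown particle: {particle!r}")


for negative in (False, True):
    label = "negative question" if negative else "neutral question"
    for particle in PARTICLES:
        result = asserted(particle, negative)
        print(f"{label}: {particle} -> {result if result else '# infelicitous'}")
```

Note that no branch ever returns an assertion of the negative proposition by a 'yes' particle, which is the sense in which the positive alternative is fixed as salient in this system.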
12:30 PM 🎓 Closing Remarks — End of Symposium