Arabic Heritage Speakers’ Perception of Emphatic–Plain Contrasts: The Influence of Vowel Context and Consonant Position
Heritage speakers (HS) are individuals who grow up in households where a minority language is spoken and acquire the majority community language during early childhood (Montrul, 2016). Research on a variety of languages, including Hindi, Mandarin, Spanish, and Korean, demonstrates that heritage speakers often outperform second language (L2) learners in perceiving phonemic contrasts, a benefit attributed to early exposure to heritage language input (Tees & Werker, 1984; Au et al., 2002; Knightly et al., 2003; Oh et al., 2003; Godson, 2004). In Arabic, emphatic consonants are a defining feature of the phonological system. These sounds are produced with a secondary constriction in the pharyngeal or velar region, which distinguishes them from their plain counterparts (Watson, 2002). Research on vowel context and consonant position has shown that these factors affect the perception of these contrasts, as coarticulatory effects and emphasis spread can modulate their perceptual salience (Jongman et al., 2011; Al-Masri & Jongman, 2004; Hayes-Harb & Durham, 2016). Recognizing these factors, the present study investigates how vowel context and consonant position influence the perception of emphatic–plain contrasts in Arabic heritage speakers of Levantine descent compared to English-speaking L2 learners. We hypothesized that heritage speakers would outperform L2 learners in accuracy due to early exposure to the heritage language.

Participants included eighteen Arabic heritage speakers (of Jordanian, Syrian, or Palestinian descent) and eighteen L2 learners (American), who were undergraduates enrolled in intermediate or advanced university Arabic courses. The stimuli consisted of monosyllabic Arabic words sampled from university-level textbooks, containing emphatic (/dˤ/, /tˤ/, /sˤ/, /ðˤ/) and plain (/d/, /t/, /s/, /ð/) consonants in both word-initial and word-final positions, across three short vowel contexts (/æ/, /u/, /i/). A female native speaker of Levantine Arabic recorded all the words.
During the auditory forced-choice identification task, participants were instructed to respond as quickly and accurately as possible. In each trial, they heard a single word and responded to the prompt “Which word did you hear?” For example, they distinguished between [sˤæb] ‘he poured’ (PST.3MSG) and [sæb] ‘he pulled’ (PST.3MSG).

Data were analyzed using logistic regression, with Group (HS vs. L2), Consonant type (emphatic vs. plain), Vowel type (/æ/, /u/, /i/), and Word position (initial vs. final) as predictors, along with their interactions. Heritage speakers demonstrated significantly higher identification accuracy (M = 74.7%, SD = 9.2) than L2 learners (M = 59.5%, SD = 8.7; Group (L2 – HS): β = -1.2041, SE = 0.299, z = -4.023, p < .001), indicating an advantage conferred by early exposure. Consonant type had a marginally significant effect (β = 0.7152, SE = 0.373, z = 1.916, p = 0.055), with better performance for plain (70.0%) than emphatic (63.2%) consonants. A significant interaction between group and word position was observed (β = 0.5734, SE = 0.292, z = 1.965, p = 0.049): heritage speakers showed similar accuracy in initial (75.5%) and final (73.8%) positions, while L2 learners’ accuracy declined from initial (65%) to final (53.9%) positions. There was also a significant interaction between consonant type and vowel context (/u/ – /æ/, β = -1.6426, SE = 0.432, z = -3.800, p < .001), with plain consonants identified most accurately in the /æ/ context and emphatic consonants more accurately in the /u/ context. An interaction between vowel context and word position indicated that accuracy differences between initial and final positions were more pronounced for /æ/ than for /u/. Reaction times were examined using linear regression with the same predictors.
While the main group effect was not significant (β = -920.6, SE = 1111, t = -0.83, p = 0.407), a significant interaction emerged between group and consonant type (β = 5001.8, SE = 1571, t = 3.18, p = 0.001): heritage speakers showed a smaller reaction time difference between emphatic and plain consonants, whereas L2 learners had a larger disparity. Furthermore, significant three-way interactions among group, consonant type, and vowel context were found ((i – æ): β = -5153.4, SE = 1924, t = -2.68, p = 0.007; (u – æ): β = -5183.0, SE = 1924, t = -2.69, p = 0.007), indicating that heritage speakers’ reaction times were stable across vowel contexts, while L2 learners’ varied according to both consonant and vowel.

These findings highlight the perceptual advantages of early exposure for heritage speakers, while also suggesting that continuous input is crucial for achieving native-like processing. Neither group achieved the accuracy level reported for native speakers in Jongman et al. (2011). The absence of a word position effect among heritage speakers suggests the use of more generalized perceptual strategies than those seen in other heritage language contexts (cf. Oh et al., 2003, for Korean). This study presents new evidence of how linguistic experience influences Arabic speech perception and offers implications for second language acquisition, heritage phonology, and curriculum development.
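
As a reading aid for the accuracy model reported above, the following minimal sketch recomputes the Wald z statistic (β / SE), a two-sided normal p-value, and the odds ratio (exp β) for four of the reported terms. It is an illustration, not the authors' analysis script: the names `REPORTED_TERMS`, `wald_summary`, and `summarize` are hypothetical, and the odds-ratio reading assumes a standard logit link with treatment-coded predictors, which the abstract does not state. Because the model includes interactions, each coefficient would apply at the reference levels of the other predictors.

```python
import math

# (label, beta, SE, reported z) copied from the accuracy model in the text.
# The labels are informal; the coding scheme is an assumption (see lead-in).
REPORTED_TERMS = [
    ("Group (L2 - HS)", -1.2041, 0.299, -4.023),
    ("Consonant type", 0.7152, 0.373, 1.916),
    ("Group x Word position", 0.5734, 0.292, 1.965),
    ("Consonant type x Vowel (u - ae)", -1.6426, 0.432, -3.800),
]


def wald_summary(beta, se):
    """Return (z, two-sided p, odds ratio) for one logit coefficient."""
    z = beta / se
    # Two-sided p-value under a standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    odds_ratio = math.exp(beta)
    return z, p, odds_ratio


def summarize(terms=REPORTED_TERMS):
    """Recompute z, p, and odds ratio for each reported term."""
    rows = []
    for label, beta, se, reported_z in terms:
        z, p, odds_ratio = wald_summary(beta, se)
        rows.append(
            {
                "label": label,
                "z": z,
                "reported_z": reported_z,
                "p": p,
                "odds_ratio": odds_ratio,
            }
        )
    return rows


if __name__ == "__main__":
    for row in summarize():
        print(
            f"{row['label']}: z = {row['z']:.3f} "
            f"(reported {row['reported_z']:.3f}), "
            f"p = {row['p']:.4f}, OR = {row['odds_ratio']:.2f}"
        )
```

Under these assumptions, the Group coefficient corresponds to an odds ratio of roughly 0.30, meaning the odds of a correct identification for L2 learners would be about 30% of those for heritage speakers at the reference levels of the other predictors. The recomputed z values differ from the reported ones only in the third decimal place, which is consistent with the coefficients and standard errors having been rounded.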