Keynote
Muhammad Abdul-Mageed
The University of British Columbia, Canada
Talk Title
Measuring Arabic Competence in Modern Models: Variation, Cultural Meaning, and Contextual Appropriateness Across Varieties
Arabic is best understood not as a single language target but as a structured space of variation across geolects, registers, genres, and culturally anchored meaning. This keynote argues that Arabic linguistics and Arabic NLP share a methodological bottleneck: we discuss “competence” while relying on resources and evaluations that often under-specify what they measure. This has become more visible with the rise of large language models, whose fluent outputs can mask systematic gaps in dialectal control, cultural inference, and context-appropriate choice. I propose treating corpora, tasks, and benchmarks as measurement instruments designed to make variation variables explicit, controllable, and comparable, so computational results can speak to linguistic hypotheses and linguistic theory can constrain computational claims. I present a unified program around three increasingly stringent notions of Arabic competence. (i) Variation competence: recognizing and preserving dialectal signals under controls that reduce spurious correlates such as topic, named entities, and geography-as-content. (ii) Cultural–pragmatic competence: interpreting conventionalized, non-compositional meaning and felicity conditions, illustrated through multidialectal proverb understanding and other culturally saturated constructions. (iii) Situated competence: choosing appropriate forms when meaning and appropriateness depend on grounded context, including visually and socially specified settings, interactional roles, and culturally salient cues. Across these layers, I show how community benchmarks and diagnostic protocols expose failure modes that matter to both fields. I conclude with a joint agenda for cumulative science: phenomenon-driven evaluations that encode linguistic variables by design, analyses that connect model behavior to Arabic generalizations, and inclusive data practices that treat Arabic diversity as the scientific object.
Speaker Bio
Muhammad Abdul-Mageed is the Canada Research Chair in Natural Language Processing and Machine Learning and an Associate Professor at the University of British Columbia. As Director of the UBC Deep Learning & NLP Group, Co-Director of the SSHRC I Trust Artificial Intelligence partnership, and Co-Lead of the SSHRC Ensuring Full Literacy initiative, he develops multilingual, multimodal, and cross-cultural large language models that are culturally sensitive, equitable, efficient, and socially aware. These models advance applications across speech, language, and vision, supporting improved human health, more engaging learning, safer social networking, and reduced information overload. His work has been supported by the Gates Foundation (through Clear Global), NSERC, and the Canada Foundation for Innovation, with additional contributions from Google, AMD, and Amazon. A recipient of the 2025 Abdul-Hameed Shoman Award for AI and Arabic and more than 10 best paper awards, Dr. Abdul-Mageed has authored more than 200 peer-reviewed publications. He has advised the Government of Canada on generative AI policy and delivered invited lectures, keynotes, and panel presentations in more than 25 countries. His work has been featured in outlets such as MIT Technology Review, The Globe and Mail, Euronews, and Libération.
📍 Radio-TV Building, Room 251 · Friday, March 27 · 4:50 PM – 5:50 PM