Section 01
[Main Conference] MultiPun: Can Large Vision-Language Models Understand Multimodal Puns?
This article introduces MultiPun, a paper accepted to the ACL 2026 main conference. The work examines whether large vision-language models (LVLMs) can understand puns that combine images and text, revealing the limitations of current models in capturing cross-modal humor and ambiguity.