Section 01
SGREC: A New Method for Zero-Shot Referring Expression Comprehension Based on Query-Driven Scene Graphs (Introduction)
SGREC builds a query-driven scene graph that serves as a structured bridge between vision and language. By combining the strengths of vision-language models (VLMs) and large language models (LLMs), it achieves interpretable zero-shot referring expression comprehension and reports leading performance on multiple benchmarks.
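To make the idea of a scene graph acting as a bridge more concrete, here is a minimal sketch of how detected objects and their relations might be stored and then serialized into text for an LLM to reason over. It is an illustration under stated assumptions, not SGREC's actual pipeline: the class names (`SceneObject`, `Relation`, `SceneGraph`), the `to_prompt` method, the prompt wording, and the example boxes are all hypothetical, and the graph is assumed to have already been restricted to objects relevant to the query.

```python
from dataclasses import dataclass, field


@dataclass
class SceneObject:
    """One detected object (hypothetical structure, for illustration only)."""
    obj_id: int
    category: str
    box: tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
    attributes: list[str] = field(default_factory=list)


@dataclass
class Relation:
    """A directed edge: subject --predicate--> object."""
    subject_id: int
    predicate: str
    object_id: int


@dataclass
class SceneGraph:
    objects: list[SceneObject]
    relations: list[Relation]

    def to_prompt(self, query: str) -> str:
        """Serialize the graph as plain text that an LLM could read."""
        lines = [f"Query: {query}", "Objects:"]
        for o in self.objects:
            attrs = ", ".join(o.attributes) or "none"
            lines.append(f"  [{o.obj_id}] {o.category} (attributes: {attrs})")
        lines.append("Relations:")
        for r in self.relations:
            lines.append(f"  [{r.subject_id}] {r.predicate} [{r.object_id}]")
        lines.append("Answer with the id of the object the query refers to.")
        return "\n".join(lines)


def build_example() -> SceneGraph:
    """Hand-written toy graph; a real system would fill this from a VLM."""
    objects = [
        SceneObject(0, "dog", (40.0, 120.0, 160.0, 260.0), ["brown"]),
        SceneObject(1, "bench", (200.0, 100.0, 420.0, 280.0), ["wooden"]),
        SceneObject(2, "dog", (450.0, 130.0, 560.0, 270.0), ["white"]),
    ]
    relations = [
        Relation(0, "left of", 1),
        Relation(2, "right of", 1),
    ]
    return SceneGraph(objects, relations)


if __name__ == "__main__":
    graph = build_example()
    print(graph.to_prompt("the dog left of the bench"))
```

Because the graph is explicit text, the LLM's choice of object id can be traced back to specific attributes and relations, which is one way to read the interpretability claim above.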