Section 01
Overview / Main Post: FashionMV: Multi-View Product-Level Image Retrieval Redefines E-Commerce Visual Search
FashionMV introduces the first large-scale multi-view fashion dataset and proposes the ProCIR framework, which lifts composed image retrieval (CIR) from the image level to the product level. With only 0.8B parameters, the model outperforms general-purpose embedding models ten times its size, a result that points to the central role of dialogue alignment in visual understanding.