Section 01
Introduction / Main Post: FrameFinder: A Local VLM-Based Multimodal Video RAG System
FrameFinder is an open-source multimodal Retrieval-Augmented Generation (RAG) system for video. It pairs two encoders, OpenCLIP ViT-H-14 and TimeSformer, to support semantic retrieval and question answering over video content.
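To make the retrieval step concrete, here is a minimal sketch of ranking video segments by cosine similarity between a query embedding and stored segment embeddings. It is an illustration only, not FrameFinder's actual code: the names `l2_normalize`, `cosine_top_k`, and `toy_index` are hypothetical, the 3-dimensional vectors are toy stand-ins for real encoder outputs, and the post does not say here how the OpenCLIP and TimeSformer embeddings are stored or combined.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def cosine_top_k(query, index, k=3):
    """Return the k (segment_id, score) pairs most similar to the query.

    `index` maps a segment id to its embedding. Both sides are normalized,
    so the dot product equals cosine similarity.
    """
    q = l2_normalize(query)
    scored = []
    for segment_id, emb in index.items():
        e = l2_normalize(emb)
        score = sum(a * b for a, b in zip(q, e))
        scored.append((segment_id, score))
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors standing in for real encoder outputs (assumption).
toy_index = {
    "clip_000": [1.0, 0.0, 0.0],
    "clip_001": [0.0, 1.0, 0.0],
    "clip_002": [0.7, 0.7, 0.0],
}
toy_query = [0.9, 0.1, 0.0]

top_hits = cosine_top_k(toy_query, toy_index, k=2)
for segment_id, score in top_hits:
    print(f"{segment_id}: {score:.3f}")
```

In a RAG pipeline, the top-ranked segments would then be handed to the language model as context for answering the question. A real deployment would likely replace the linear scan with a vector index.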