Lexical Search (Exact Matching)
Principle: Matches query terms exactly against document terms, scoring with algorithms such as TF-IDF and BM25 on factors like term frequency and document length. Advantages: fast, precise on exact matches, and easy to interpret. Limitations: does not recognize synonyms and is sensitive to spelling errors.
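To make the BM25 scoring concrete, here is a minimal sketch. The function name `bm25_scores`, the whitespace tokenizer, the defaults `k1=1.5` and `b=0.75`, and the non-negative IDF variant are all assumptions chosen for illustration, not a reference implementation.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with a simple BM25 variant.

    Assumptions: lowercase whitespace tokenization, non-empty `docs`,
    and the IDF form log(1 + (N - n + 0.5) / (n + 0.5)).
    """
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))

    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue  # exact matching: unseen terms contribute nothing
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: longer-than-average documents are penalized.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = [
    "the cat sat on the mat",
    "a dog chased the cat",
    "automobile repair manual",
]
scores = bm25_scores("cat car", docs)
```

In this toy corpus the third document scores zero for the query term "car" even though it is about automobiles, which illustrates the synonym limitation. The second document outscores the first because it contains "cat" equally often but is shorter.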
Semantic Search (Semantic Understanding)
Principle: Encodes text into vectors with a pre-trained model (e.g., BERT) and retrieves by cosine similarity. Advantages: handles synonyms, is robust to variation in wording, and supports cross-language retrieval. Limitations: high resource consumption and weaker exact matching.
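The retrieval step can be sketched as follows. The three-dimensional vectors are hand-written stand-ins for model embeddings (a real encoder such as BERT produces hundreds of dimensions), and the names `cosine_similarity`, `semantic_search`, and the document ids are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def semantic_search(query_vec, doc_vecs, top_k=2):
    """Return the top_k (doc_id, similarity) pairs, most similar first."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Assumed toy embeddings; in practice these come from the encoder model.
doc_vecs = {
    "car_repair": [0.9, 0.1, 0.0],
    "automobile_maintenance": [0.8, 0.2, 0.1],
    "cooking_recipes": [0.0, 0.1, 0.9],
}
query_vec = [0.85, 0.15, 0.05]  # stand-in embedding for a query about cars

results = semantic_search(query_vec, doc_vecs)
```

Both car-related documents land in the top two even though they share no exact terms, because their vectors point in nearly the same direction as the query vector.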
Hybrid Search (Complementing Strengths)
Principle: Runs lexical and semantic retrieval in parallel and fuses the two result lists. Advantages: combines precision with flexibility and adapts to diverse scenarios. Fusion strategies: RRF (Reciprocal Rank Fusion) and weighted summation.
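Both fusion strategies can be sketched in a few lines. The function names, the RRF constant `k=60` (a commonly used default), the min-max normalization, and the weight `alpha` are assumptions for illustration; the sample rankings and scores are made up.

```python
from collections import defaultdict

def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: each list adds 1 / (k + rank) per document."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

def weighted_fuse(lexical, semantic, alpha=0.5):
    """Weighted sum of min-max normalized scores; alpha weights the lexical side."""
    def normalize(scores):
        lo, hi = min(scores.values()), max(scores.values())
        span = hi - lo
        # If all scores are equal, treat every document as a full match.
        return {d: (s - lo) / span if span else 1.0 for d, s in scores.items()}

    lex, sem = normalize(lexical), normalize(semantic)
    fused = {
        d: alpha * lex.get(d, 0.0) + (1 - alpha) * sem.get(d, 0.0)
        for d in set(lex) | set(sem)
    }
    return sorted(fused, key=fused.get, reverse=True)

# Made-up results from the two retrievers.
lexical_ranking = ["d1", "d2", "d3"]
semantic_ranking = ["d2", "d4", "d1"]
rrf_order = rrf_fuse([lexical_ranking, semantic_ranking])

lexical_scores = {"d1": 3.0, "d2": 2.0, "d3": 1.0}
semantic_scores = {"d2": 0.9, "d4": 0.8, "d1": 0.5}
weighted_order = weighted_fuse(lexical_scores, semantic_scores, alpha=0.5)
```

RRF uses only rank positions, so it needs no score normalization and works even when the two retrievers score on incompatible scales. Weighted summation uses the score magnitudes, which makes it tunable through `alpha` but dependent on how the scores are normalized.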