AIR: Integrating Semantic Capabilities of Large Language Models into Industrial Cross-Domain Recommendation Systems

The Kuaishou E-commerce team proposed the AIR framework, which moves LLM reasoning offline and constructs dynamic intent representations online. This yields a 400x inference acceleration, and the framework delivered a 3.446% increase in gross merchandise value (GMV) in real business scenarios.

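cross-domain recommendation, large language models, industrial deployment, Kuaishou E-commerce, recommendation intent reasoning, offline-online separation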
Published 2026-06-09 11:13 | Recent activity 2026-06-10 09:18 | Estimated read 6 min

Section 01

Introduction: Kuaishou E-commerce's AIR Framework—LLM Semantic Capabilities Applied to Industrial Cross-Domain Recommendation

The Kuaishou E-commerce team proposed the AIR (Atomic Intent Reasoning) framework. Through an innovative architecture combining offline LLM reasoning and online dynamic intent representation, it addresses the semantic gap, data noise, and inference latency issues in cross-domain recommendation. It achieves a 400x inference acceleration and delivers a significant 3.446% increase in GMV in real business scenarios.

Section 02

Background and Challenges: Three Core Problems in Cross-Domain Recommendation

Cross-domain recommendation is a core problem in content e-commerce scenarios. Its goal is to infer users' e-commerce purchase intentions from their content domain interactions, but it faces three major challenges:

  1. Semantic Gap: Lack of direct semantic correlation between content domain and e-commerce domain behaviors;
  2. Data Scale and Noise: Cross-domain behavior sequences are large and noisy, making it difficult for traditional models to capture key signals;
  3. Inference Latency: LLMs have strong semantic capabilities, but their inference latency is too high for the millisecond-level response budget of online recommendation, which prevents direct online deployment.

Section 03

Core Design of the AIR Framework: Offline-Online Separation and Atomic Intent Construction

Offline-Online Separation Architecture

AIR moves LLM reasoning to the offline phase and performs only efficient retrieval and combination online:

  • Offline Phase: Use an LLM to analyze users' historical behaviors, extract atomic intent representations, and store them in an intent knowledge base;
  • Online Phase: Retrieve relevant atomic intents and dynamically construct user intent through a lightweight combination module, significantly reducing latency.
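
The split between the two phases can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `fake_llm_extract_intents` is a keyword-based stand-in for a real LLM call, and `IntentKnowledgeBase` is a hypothetical in-memory store, not the structure used by the AIR authors. The point is only that the expensive step runs in `build_offline`, while `lookup_online` is a plain dictionary read.

```python
from dataclasses import dataclass, field


def fake_llm_extract_intents(behaviors: list[str]) -> list[str]:
    """Hypothetical stand-in for an LLM call: maps behavior text to intent labels."""
    keyword_to_intent = {
        "camping": "outdoor_gear",
        "hiking": "outdoor_gear",
        "recipe": "kitchenware",
        "makeup": "cosmetics",
    }
    intents: list[str] = []
    for text in behaviors:
        for keyword, intent in keyword_to_intent.items():
            # Keep each intent once, in first-seen order.
            if keyword in text.lower() and intent not in intents:
                intents.append(intent)
    return intents


@dataclass
class IntentKnowledgeBase:
    """Hypothetical in-memory intent store keyed by user id."""
    store: dict[str, list[str]] = field(default_factory=dict)

    def build_offline(self, histories: dict[str, list[str]]) -> None:
        # Offline phase: the slow "LLM" step runs here, ahead of serving time.
        for user_id, behaviors in histories.items():
            self.store[user_id] = fake_llm_extract_intents(behaviors)

    def lookup_online(self, user_id: str) -> list[str]:
        # Online phase: a cheap lookup, no LLM call on the request path.
        return self.store.get(user_id, [])


kb = IntentKnowledgeBase()
kb.build_offline({
    "u1": ["Watched a camping vlog", "Liked a hiking trail video"],
    "u2": ["Saved a pasta recipe"],
})
print(kb.lookup_online("u1"))
print(kb.lookup_online("unknown_user"))
```

In a production setting the store would more plausibly be a key-value or vector index refreshed on a schedule, but the source does not describe that layer.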

Dynamic Construction of Intent Representation

AIR introduces the concept of atomic intent: complex intents are decomposed into reusable units, which are dynamically selected and combined online according to the current context, balancing representational richness and efficiency.
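
One way to picture context-dependent combination is a softmax-weighted sum of atomic intent vectors. This is a sketch under stated assumptions: the source does not specify how the lightweight combination module works, so the dot-product scoring, the softmax weighting, the function name `combine_intents`, and the two-dimensional toy vectors are all hypothetical.

```python
import math


def combine_intents(context: list[float],
                    atomic_intents: dict[str, list[float]]):
    """Weight each atomic intent by softmax(dot(context, intent)) and sum them.

    Assumed mechanism for illustration only; not the AIR authors' module.
    """
    names = list(atomic_intents.keys())
    vectors = [atomic_intents[name] for name in names]

    # Relevance score of each atomic intent to the current context.
    scores = [sum(c * v for c, v in zip(context, vec)) for vec in vectors]

    # Numerically stable softmax over the scores.
    max_score = max(scores)
    exp_scores = [math.exp(s - max_score) for s in scores]
    total = sum(exp_scores)
    weights = [e / total for e in exp_scores]

    # Weighted sum of the atomic intent vectors.
    combined = [0.0] * len(context)
    for weight, vec in zip(weights, vectors):
        for i, value in enumerate(vec):
            combined[i] += weight * value

    return dict(zip(names, weights)), combined


# Toy atomic intents (hypothetical embeddings).
atomic = {
    "outdoor_gear": [1.0, 0.0],
    "kitchenware": [0.0, 1.0],
}
weights, user_intent = combine_intents([2.0, 0.0], atomic)
print(weights)
print(user_intent)
```

Because the atomic vectors are computed once and reused, the online cost here is a handful of dot products per request, which is the efficiency argument the text makes.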

Section 04

Performance: 400x Acceleration and Significant Improvement in Business Metrics

  • Inference Acceleration: Moving LLM reasoning offline yields approximately 400x inference acceleration while maintaining semantic consistency;
  • Offline Experiments: AIR achieves state-of-the-art (SOTA) performance on public datasets;
  • Online A/B Testing: GMV increases by 3.446% in Kuaishou E-commerce's real scenarios, with stable and significant improvements in core business metrics.
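
For readers unfamiliar with how such figures are usually computed, the sketch below shows the standard definitions of a speedup ratio and a relative lift. The input numbers are invented purely to make the arithmetic concrete; the source reports only the final ratios (about 400x and 3.446%), not the underlying latencies or GMV totals.

```python
def speedup(baseline_latency_ms: float, new_latency_ms: float) -> float:
    """Speedup as the ratio of baseline latency to new latency."""
    return baseline_latency_ms / new_latency_ms


def relative_lift_pct(control: float, treatment: float) -> float:
    """Relative lift of treatment over control, in percent."""
    return (treatment - control) / control * 100.0


# Hypothetical latencies, NOT from the source: 2000 ms vs 5 ms gives a 400x ratio.
print(speedup(2000.0, 5.0))

# Hypothetical GMV totals, NOT from the source, chosen to match a 3.446% lift.
print(round(relative_lift_pct(100_000.0, 103_446.0), 3))
```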

Section 05

Technical Insights and Industry Value

  1. Offline-Online Separation: Effectively solves the LLM inference latency problem; precomputation combined with efficient retrieval balances effectiveness and latency;
  2. Atomic Intent Representation: Retains the LLM's semantic capabilities and supports flexible online combination;
  3. Industrial Application: Demonstrates how LLM technology can be applied in high-concurrency, low-latency production environments.

Section 06

Summary: Significance and Application Value of the AIR Framework

Through its innovative architecture, the AIR framework successfully integrates LLM semantic capabilities into industrial cross-domain recommendation systems. While achieving a 400x inference acceleration, it brings significant business value, promotes the development of cross-domain recommendation technology, and provides a feasible solution for the application of LLMs in the recommendation field.