Zing Forum

Otter Streams: Seamlessly Integrating Machine Learning Models into Apache Flink Stream Processing Pipelines

An open-source framework designed to simplify the integration of machine learning models with the Apache Flink stream processing engine, supporting real-time feature engineering, model inference, and online learning scenarios.

Tags: Apache Flink · Stream Processing · Machine Learning · Real-Time Inference · MLOps · Feature Engineering
Published 2026-05-02 21:15 · Recent activity 2026-05-02 21:19 · Estimated read: 8 min

Section 01

Introduction / Main Floor

An open-source framework designed to simplify the integration of machine learning models with the Apache Flink stream processing engine, supporting real-time feature engineering, model inference, and online learning scenarios.


Section 02

Trends in the Convergence of Stream Processing and Machine Learning

Enterprise demand for real-time data processing is growing rapidly. Traditional batch processing struggles to meet the low-latency requirements of scenarios such as fraud detection, recommendation systems, and IoT monitoring. As a leading stream processing engine, Apache Flink has become the preferred infrastructure for real-time data processing thanks to its exactly-once processing semantics, low latency, and high throughput.

At the same time, machine learning is shifting from offline training to online serving. More and more applications require models to respond to streaming data in real time: computing features, running inference, and even learning incrementally online. However, integrating machine learning models into stream processing pipelines is far from trivial; it involves challenges such as model serialization, feature consistency, and inference latency optimization.

The Otter Streams project was created to address this pain point: it provides an abstraction layer that lets developers connect existing ML models to Flink stream processing pipelines with minimal changes.


Section 03

Framework-Agnostic Model Support

Otter Streams is designed using an adapter pattern and supports multiple mainstream machine learning frameworks:

  • TensorFlow: Integration via TensorFlow Serving or TF-Java API
  • PyTorch: Supports loading of TorchScript serialized models
  • scikit-learn: Conversion via PMML or ONNX formats
  • XGBoost/LightGBM: Native support for gradient boosting tree models
  • Custom models: Provides an SPI interface to allow integration of any Java/Scala-implemented models

This design ensures that users can migrate to the stream processing environment without rewriting their models, protecting existing technical investments.
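The adapter pattern described above can be sketched as follows. Note that the class and method names here are illustrative stand-ins, not the actual Otter Streams SPI; the point is that the pipeline depends only on a uniform inference interface, never on a specific ML framework:

```python
from abc import ABC, abstractmethod

class ModelAdapter(ABC):
    """Uniform inference interface; each ML framework gets its own adapter."""
    @abstractmethod
    def predict(self, features: list[float]) -> float: ...

class ConstantModelAdapter(ModelAdapter):
    """Stand-in for a trivial baseline model wrapped behind the interface."""
    def __init__(self, score: float):
        self.score = score

    def predict(self, features: list[float]) -> float:
        return self.score

class LinearModelAdapter(ModelAdapter):
    """Stand-in for, say, a scikit-learn model exported via ONNX or PMML."""
    def __init__(self, weights: list[float], bias: float = 0.0):
        self.weights, self.bias = weights, bias

    def predict(self, features: list[float]) -> float:
        return sum(w * x for w, x in zip(self.weights, features)) + self.bias

def run_pipeline(adapter: ModelAdapter, stream: list[list[float]]) -> list[float]:
    """The pipeline sees only the adapter interface, never the framework."""
    return [adapter.predict(event) for event in stream]
```

Swapping TensorFlow for PyTorch then means writing (or reusing) one adapter class, while the stream topology itself stays untouched.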


Section 04

Feature Engineering and Data Consistency

Feature engineering in stream processing scenarios faces unique challenges: handling out-of-order events, maintaining time window states, and ensuring feature consistency between training and inference phases. Otter Streams provides:

  • Time window abstraction: Supports multiple semantics such as tumbling windows, sliding windows, and session windows
  • State management: Uses Flink's Checkpoint mechanism to ensure feature state fault tolerance
  • Feature storage integration: Integrates with KV stores like Redis and Cassandra to support feature lookup and backfilling
  • Feature version control: Tracks feature definition changes to ensure compatibility between models and features
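The tumbling-window abstraction in the list above can be illustrated with a minimal sketch in plain Python. This assigns out-of-order events to windows by event time and emits per-key count/sum features; in a real Flink job, watermarks, late-event handling, and checkpointed state would be managed by the engine rather than by this hand-rolled grouping:

```python
from collections import defaultdict

def tumbling_window_features(events, window_ms):
    """Group (timestamp_ms, key, amount) events into tumbling event-time
    windows and emit per-key count/sum features for each window.

    Sketch only: events may arrive out of order in the input list; we
    simply bucket by event time, which is what makes the result
    deterministic regardless of arrival order.
    """
    windows = defaultdict(lambda: {"count": 0, "sum": 0.0})
    for ts, key, amount in events:
        window_start = (ts // window_ms) * window_ms
        bucket = windows[(window_start, key)]
        bucket["count"] += 1
        bucket["sum"] += amount
    return dict(windows)
```

Because features are derived purely from event time and key, the same function can be reused offline to build training data, which is one way to keep training and inference features consistent.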

Section 05

Low-Latency Inference Optimization

Real-time inference is extremely sensitive to latency. Otter Streams has made several optimizations at the architectural level:

  • Model caching: Caches models locally in Flink TaskManagers to avoid remote calls
  • Batch inference: Automatically aggregates micro-batch requests to fully utilize GPU/TPU parallel computing
  • Asynchronous IO: Non-blocking feature acquisition and model calls to maximize throughput
  • Model hot update: Supports non-stop model version updates to ensure service continuity
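The batch-inference idea can be sketched in a few lines: single requests are aggregated into micro-batches so that one model call amortizes per-call overhead (and, on GPU/TPU, fills the hardware batch dimension). This is a standalone illustration, not the Otter Streams batching implementation, which would also flush on a timeout to bound latency:

```python
def micro_batch(requests, batch_predict, max_batch=32):
    """Aggregate single inference requests into micro-batches.

    batch_predict takes a list of feature vectors and returns a list of
    scores; calling it once per batch instead of once per request is
    what reduces per-call overhead.
    """
    results = []
    for i in range(0, len(requests), max_batch):
        results.extend(batch_predict(requests[i:i + max_batch]))
    return results
```

In a streaming setting the batch would be closed either when `max_batch` is reached or when a small deadline (e.g. a few milliseconds) expires, trading a bounded amount of latency for throughput.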

Section 06

Real-Time Fraud Detection

Financial transaction fraud detection is a classic use case for stream processing ML. Otter Streams can:

  • Real-time aggregation of users' recent transaction behavior features (frequency, amount, geographic location)
  • Call risk control models for millisecond-level risk scoring
  • Trigger real-time interception or manual review workflows
  • Feed detection results back to the model for online learning
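The first two steps of this workflow — aggregating a user's recent transactions and scoring the incoming one — can be sketched with a simple heuristic scorer. The real risk model would sit behind the same interface; here the "score" is just the ratio of the current amount to the user's recent mean, with names chosen for illustration:

```python
from collections import deque

class FraudScorer:
    """Keep a per-user deque of recent transactions inside `window_ms`
    and score a new transaction against the user's recent mean amount.

    A stand-in for a trained risk model: downstream logic would flag the
    transaction when score(...) exceeds some threshold (e.g. 5.0).
    """
    def __init__(self, window_ms=3_600_000):
        self.window_ms = window_ms
        self.history = {}  # user -> deque of (timestamp_ms, amount)

    def score(self, ts, user, amount):
        dq = self.history.setdefault(user, deque())
        # Evict transactions that fell out of the time window.
        while dq and dq[0][0] < ts - self.window_ms:
            dq.popleft()
        mean = sum(a for _, a in dq) / len(dq) if dq else amount
        dq.append((ts, amount))
        return amount / mean if mean > 0 else 1.0
```

The per-user deque plays the role of Flink keyed state; in production it would be checkpointed so that a restart does not lose the behavioral features.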

Section 07

Personalized Recommendation Systems

In e-commerce and content platforms, real-time personalized recommendations directly affect user experience and business conversion:

  • Capture users' real-time behaviors (clicks, browsing, favorites)
  • Update user profiles and context features
  • Call recommendation models to generate candidate sets and sort them
  • Support A/B testing and multi-armed bandit online optimization
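The multi-armed bandit step above can be sketched with the classic epsilon-greedy strategy: with probability epsilon pick a random recommendation strategy ("arm") to explore, otherwise exploit the arm with the best observed click-through rate. This is a generic standalone sketch, not Otter Streams code:

```python
import random

class EpsilonGreedyBandit:
    """Epsilon-greedy selection over recommendation strategies (arms)."""
    def __init__(self, n_arms, epsilon=0.1, rng=None):
        self.counts = [0] * n_arms     # pulls per arm
        self.values = [0.0] * n_arms   # running mean reward (e.g. CTR) per arm
        self.epsilon = epsilon
        self.rng = rng or random.Random()

    def select(self):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))  # explore
        return max(range(len(self.counts)), key=lambda a: self.values[a])  # exploit

    def update(self, arm, reward):
        """Fold an observed reward (click=1.0, no click=0.0) into the arm's mean."""
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```

In a streaming setting, `update` would be driven by real-time click events flowing back through the pipeline, which is what makes the online optimization loop close.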

Section 08

Industrial IoT Predictive Maintenance

Manufacturing equipment monitoring requires processing high-frequency sensor data streams:

  • Real-time parsing of equipment sensor data (temperature, vibration, current)
  • Calculation of sliding window statistical features (mean, variance, frequency spectrum)
  • Running anomaly detection models to identify potential failure signs
  • Triggering maintenance work orders and predicting remaining useful life