Zing 论坛

正文

AI Robotics News Bot:自托管的 Telegram 新闻策展机器人,用 LLM 智能筛选高质量内容

一个完全自托管、自动化的 Telegram 新闻机器人,利用 Prefect 3 编排工作流、PostgreSQL + pgvector 存储向量化数据、OpenRouter 调用大语言模型进行内容策展,从 newsdata.io 抓取 AI 和机器人领域新闻并自动发布到 Telegram 频道。

Telegram机器人新闻策展PrefectpgvectorOpenRouterLLM自动化工作流Docker
发布时间 2026/05/26 22:15最近活动 2026/05/26 22:19预计阅读 6 分钟
AI Robotics News Bot:自托管的 Telegram 新闻策展机器人,用 LLM 智能筛选高质量内容
1

章节 01

AI Robotics News Bot: Self-hosted Telegram News Curation Bot Overview

AI Robotics News Bot is a fully self-hosted, automated Telegram news curation bot focused on AI and robotics fields. It uses Prefect 3 for workflow orchestration, PostgreSQL + pgvector for vector data storage, OpenRouter to call LLMs for content curation, and newsdata.io as the news source. Its core mission is to filter high-quality, interesting, and unique news from massive information and auto-publish to specified Telegram channels.

2

章节 02

Background & Project Context

The bot aims to solve the problem of information overload in AI and robotics domains by providing an automated way to curate and deliver high-quality content.

3

章节 03

Technical Architecture Breakdown

Workflow Orchestration

Uses Prefect 3 (lightweight, easy-to-use) for scheduling daily tasks (news crawling, curation, publishing). Deployed in headless mode via Docker Compose to reduce resource usage.

Data Storage

PostgreSQL 16 with pgvector extension: stores both Prefect internal state (workflow definitions, run records) and newsbot data (articles, vector embeddings, metadata). Supports vector similarity search directly in SQL.

Content Curation

OpenRouter provides unified access to LLMs (e.g., Mistral Large) for:

  • Relevance check (AI/robotics topic)
  • Quality scoring (depth, originality, value)
  • Deduplication (vector similarity)

Configurable prompts in config/prompts/ allow adjusting curation standards.

News Source

Uses newsdata.io API with a whitelist (config/sources_whitelist.yml) to control allowed sources and set priority weights.

4

章节 04

Deployment & Configuration Details

Docker Compose Setup

  • Server Profile: Starts PostgreSQL, Redis, Prefect API services.
  • Worker Profile: Starts Prefect worker to execute tasks.

Configuration Management

All settings are in YAML files under config/:

  • sources_whitelist.yml: Allowed news sources and priorities
  • settings.yml: Crawling interval, thresholds, model names
  • prompts/: LLM prompts for curation and categorization

Security

Sensitive info (API keys, Telegram token, DB password) is stored in .env (Git-ignored) and injected into containers via Docker Compose.

5

章节 05

Application Scenarios & Value

  1. Personal Knowledge Management: Researchers/developers can deploy private instances to build custom AI/robotics information streams.

  2. Community Content Operation: Tech community operators can use it to automate content pushing, reducing manual filtering costs and ensuring consistent quality.

  3. Learning Resource: Demonstrates modern data engineering best practices (workflow orchestration, vector DB, LLM integration, containerization) for learners.

6

章节 06

Key Highlights & Conclusion

Project Highlights

  1. Clear architecture: Prefect + PostgreSQL + LLM layered design
  2. Config-driven: Adjust behavior via YAML without code changes
  3. Vector deduplication: Efficient similar content detection using pgvector
  4. Open-source friendly: MIT license, clean code for secondary development
  5. Operation-friendly: One-click deployment via Docker Compose, ready-to-use backup scripts

Conclusion

The bot uses a simple, stable tech stack to build a practical automated system. Its "good enough" engineering approach (avoiding overcomplicated microservices) is worth referencing for independent developers and small teams.