Zing Forum

PPT-Agent: A Multi-Agent Collaborative Automated Presentation Generation System

A cross-platform PPT generation tool built on a multi-agent LLM workflow. It supports Gemini-based review, SVG output, and 17 preset styles, and runs on host platforms such as Claude Code and OpenCode.

Tags: multi-agent, LLM workflow, presentation generation, SVG, Bento Grid, Gemini, Claude Code, OpenCode, MCP, automation
Published 2026-05-03 23:45 | Recent activity 2026-05-03 23:47 | Estimated read 6 min

Section 01

PPT-Agent: Guide to the Multi-Agent Collaborative Automated Presentation Generation System

PPT-Agent is a cross-platform PPT generation tool built on a multi-agent LLM workflow. It supports Gemini-based review, SVG output, and 17 preset styles, and runs on host platforms such as Claude Code and OpenCode. Agents with clearly divided responsibilities automate the process end to end, from requirement research to final delivery, and follow a professional design workflow so that decks are produced faster and look more polished.

Section 02

Project Background and Positioning

Traditional PPT production is time-consuming and depends on manual design, while simple AI tools produce rough output that lacks professional design sense. PPT-Agent addresses both problems with a multi-agent architecture that combines LLM capabilities with a professional design workflow. It runs in multiple AI coding host environments, such as Claude Code and OpenCode, and automates the process end to end.

Section 03

Core Architecture: 7-Stage Workflow

  1. Initialization and Parameter Parsing: Receive requirements, parse parameters such as style (17 presets) and brand colors;
  2. Requirement Research: The research-core agent collects information and supports user confirmation;
  3. Material Collection: Parallel search and aggregation of materials such as images and data;
  4. Outline Planning: The content-core agent builds the outline using the pyramid principle and supports user approval;
  5. Draft Planning: Generate a simplified SVG layout framework;
  6. Design Draft Generation and Review: The slide-core agent generates Bento Grid layout SVG; the review-core agent calls Gemini for multi-dimensional review (layout, readability, etc.), with up to 2 rounds of fixes;
  7. Delivery: Output SVG, HTML preview, and speaker notes.
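
The seven stages above can be sketched as an ordered pipeline, with stage 6 expanded into a generate, review, fix loop capped at two fix rounds. This is a minimal illustration, not PPT-Agent's actual code: the stage names, `run_pipeline`, `design_with_review`, and the callback signatures are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stage names paraphrasing the seven steps; not PPT-Agent identifiers.
STAGES: List[str] = [
    "init", "research", "materials", "outline",
    "draft", "design_review", "delivery",
]

MAX_FIX_ROUNDS = 2  # the article states "up to 2 rounds of fixes"


def design_with_review(
    generate: Callable[[], str],
    review: Callable[[str], List[str]],
    fix: Callable[[str, List[str]], str],
    max_fix_rounds: int = MAX_FIX_ROUNDS,
) -> Tuple[str, List[str], int]:
    """Generate a slide, then review and fix it until clean or out of rounds."""
    svg = generate()
    issues = review(svg)
    rounds = 0
    while issues and rounds < max_fix_rounds:
        svg = fix(svg, issues)
        rounds += 1
        issues = review(svg)
    # Any remaining issues are returned so the caller can decide what to do.
    return svg, issues, rounds


def run_pipeline(handlers: Dict[str, Callable[[dict], dict]], state: dict) -> dict:
    """Run every stage in order, recording which stages have completed."""
    for name in STAGES:
        state = handlers[name](state)
        state.setdefault("completed", []).append(name)
    return state
```

Capping the loop keeps a slide that never passes review from stalling the run; the unresolved issues travel with the result instead of being silently dropped.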

Section 04

Technical Highlights and Innovations

  1. Bento Grid Layout Engine: Outputs 1280×720 SVG and arranges information blocks in a visually balanced grid;
  2. Gemini-Driven Review: Multi-dimensional scoring with a minimum score of 7.0; falls back to technical verification when Gemini is unavailable;
  3. Brand Customization: A YAML configuration injects the brand color system;
  4. Cross-Platform Compatibility: Works with multiple models on platforms such as OpenCode and Claude Code;
  5. Checkpoint Resume: State is persisted, so an interrupted run can resume from where it stopped.
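
The review gate with its fallback might look roughly like the sketch below. It rests on several assumptions the article does not confirm: that the per-dimension scores are averaged, that "technical verification" means checking that the SVG is well-formed and declares the 1280×720 canvas, and that the reviewer is passed in as a callable. `review_slide`, `technical_check`, and `gemini_score` are hypothetical names, and no real Gemini API call is shown.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Optional

MIN_SCORE = 7.0            # minimum score cited in the article
WIDTH, HEIGHT = 1280, 720  # canvas size cited in the article


def technical_check(svg_text: str) -> bool:
    """Assumed fallback: well-formed XML, <svg> root, expected width/height."""
    try:
        root = ET.fromstring(svg_text)
    except ET.ParseError:
        return False
    if not root.tag.endswith("svg"):  # tolerate a namespaced tag
        return False
    return root.get("width") == str(WIDTH) and root.get("height") == str(HEIGHT)


def review_slide(
    svg_text: str,
    gemini_score: Optional[Callable[[str], Dict[str, float]]] = None,
) -> dict:
    """Score with the model reviewer if it works, else fall back to static checks."""
    if gemini_score is not None:
        try:
            scores = gemini_score(svg_text)  # e.g. {"layout": 8.0, "readability": 7.5}
            overall = sum(scores.values()) / len(scores)  # averaging is an assumption
            return {"mode": "gemini", "score": overall, "passed": overall >= MIN_SCORE}
        except Exception:
            pass  # reviewer unavailable or failed: fall through to the fallback
    return {"mode": "technical", "score": None, "passed": technical_check(svg_text)}
```

Recording the `mode` alongside the verdict makes it visible later whether a slide passed a model review or only the weaker structural check.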

Section 05

Practical Effects and Application Scenarios

Practical Effects: In the Xiaomi SU7 case, outputs from different models and platforms were compared (for example, GPT-5.4 received a quality score of 8.53/10, and MiMo V2 Pro used Xiaomi's brand orange). Across these runs, the workflow maintained stable output quality.

Application Scenarios: Enterprise marketing teams generating branded decks, consultants creating analysis reports, educators converting courseware, entrepreneurs preparing roadshow business plans, and researchers preparing conference presentations.

Section 06

Limitations and Future Directions

Limitations: PPT-Agent only supports SVG/HTML output, relies on Gemini being available, and has limited support for complex animation.

Future Directions: Improve the MCP Server encapsulation, enhance Headless mode to support CI/CD integration, add more output formats, and introduce richer animation generation.

Section 07

Summary

PPT-Agent simulates the workflow of a professional team through multi-agent collaboration, balancing professional output with customizability. For users who create presentations frequently, it is an open-source project worth watching, and it represents a new direction in AI-assisted content creation.