# PPT-Agent: A Multi-Agent Collaborative Automated Presentation Generation System

> A cross-platform PPT generation tool built on a multi-agent LLM workflow. It supports Gemini review, SVG output, and 17 preset styles, and runs on multiple host platforms such as Claude Code and OpenCode.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-05-03T15:45:24.000Z
- Last activity: 2026-05-03T15:47:46.452Z
- Popularity: 155.0
- Keywords: multi-agent, LLM workflow, presentation generation, SVG, Bento Grid, Gemini, Claude Code, OpenCode, MCP, automation
- Page link: https://www.zingnex.cn/en/forum/thread/ppt-agent
- Canonical: https://www.zingnex.cn/forum/thread/ppt-agent

---

## PPT-Agent: Guide to the Multi-Agent Collaborative Automated Presentation Generation System

PPT-Agent is a cross-platform presentation generator built on a multi-agent LLM workflow. It supports Gemini review, SVG output, and 17 preset styles, and it runs on host platforms such as Claude Code and OpenCode. Agents with clearly divided responsibilities automate the process end to end, from requirement research to final delivery, and follow a professional design workflow to make PPT production faster and more polished.

## Project Background and Positioning

Traditional PPT production is time-consuming and depends on manual design, while simple AI tools tend to produce rough output with little design polish. PPT-Agent addresses both problems with a multi-agent architecture that pairs LLM capabilities with a professional design workflow. It runs inside AI coding hosts such as Claude Code and OpenCode and automates the process end to end.

## Core Architecture: 7-Stage Workflow

1. Initialization and Parameter Parsing: Receives the request and parses parameters such as the style (one of 17 presets) and brand colors.
2. Requirement Research: The research-core agent collects background information, with support for user confirmation.
3. Material Collection: Searches for and aggregates materials such as images and data in parallel.
4. Outline Planning: The content-core agent builds the outline using the Pyramid Principle, with support for user approval.
5. Draft Planning: Generates a simplified SVG layout framework.
6. Design Draft Generation and Review: The slide-core agent generates SVG slides in a Bento Grid layout; the review-core agent calls Gemini for a multi-dimensional review (layout, readability, and so on), with up to 2 rounds of fixes.
7. Delivery: Outputs the SVG files, an HTML preview, and speaker notes.
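
The generate, review, and fix cycle in stage 6 can be pictured as a bounded loop. The following is only a minimal sketch, not PPT-Agent's actual code: the `Review` type and the `generate`, `review`, and `fix` callables are hypothetical stand-ins for the slide-core and review-core agents. The cap of 2 fix rounds comes from the stage description, and the 7.0 threshold is the minimum score the article lists under its technical highlights. Using that score as the loop's exit condition is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

PASS_SCORE = 7.0      # minimum score quoted in the article; using it as the gate is an assumption
MAX_FIX_ROUNDS = 2    # "up to 2 rounds of fixes"


@dataclass
class Review:
    """Hypothetical review result: an overall score out of 10 plus a list of issues."""
    score: float
    issues: List[str] = field(default_factory=list)


def design_with_review(
    generate: Callable[[], str],
    review: Callable[[str], Review],
    fix: Callable[[str, List[str]], str],
) -> Tuple[str, Review, int]:
    """Generate a slide, then review and fix it at most MAX_FIX_ROUNDS times."""
    svg = generate()
    result = review(svg)
    rounds = 0
    while result.score < PASS_SCORE and rounds < MAX_FIX_ROUNDS:
        svg = fix(svg, result.issues)   # apply the reviewer's feedback
        result = review(svg)            # re-score the fixed slide
        rounds += 1
    # The last version is returned even if it still scores below PASS_SCORE;
    # what the real tool does in that case is not stated in the article.
    return svg, result, rounds


if __name__ == "__main__":
    # Stub agents: the scores improve a little after each fix round.
    scores = iter([5.5, 6.8, 7.4])
    svg, result, rounds = design_with_review(
        generate=lambda: "<svg/>",
        review=lambda s: Review(next(scores), ["text too small"]),
        fix=lambda s, issues: s + "<!-- fixed -->",
    )
    print(rounds, result.score)
```

Passing the agents in as plain callables keeps the loop independent of any particular model or host platform, which matches the cross-platform goal described in the article.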

## Technical Highlights and Innovations

1. Bento Grid Layout Engine: Slides are 1280×720 SVG, with information organized into visually balanced blocks.
2. Gemini-Driven Review: Multi-dimensional scoring with a minimum score of 7.0 points; falls back to technical verification when Gemini is unavailable.
3. Brand Customization: A YAML configuration injects the brand color system.
4. Cross-Platform Compatibility: Supports multiple models on platforms such as OpenCode and Claude Code.
5. Resume from Breakpoint: State is persisted, so an interrupted run can resume from where it stopped.
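
To make the first and third points concrete, here is a rough sketch of how cards on a grid could be turned into a 1280×720 SVG with brand colors applied. Only the canvas size comes from the article. The 4×3 grid, the 24 px gap, the `Card` type, and the `brand` dictionary with its keys are all illustrative assumptions. The dictionary stands in for values that would be loaded from the YAML configuration, whose real schema is not described here.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape

WIDTH, HEIGHT = 1280, 720   # canvas size stated in the article
COLS, ROWS = 4, 3           # assumed grid resolution
GAP = 24                    # assumed gutter and outer margin, in pixels


@dataclass
class Card:
    """Hypothetical content block placed on the grid."""
    col: int        # zero-based column of the top-left cell
    row: int        # zero-based row of the top-left cell
    col_span: int
    row_span: int
    title: str


def card_rect(card: Card) -> tuple:
    """Return (x, y, width, height) in pixels for a card on the grid."""
    cell_w = (WIDTH - GAP * (COLS + 1)) / COLS
    cell_h = (HEIGHT - GAP * (ROWS + 1)) / ROWS
    x = GAP + card.col * (cell_w + GAP)
    y = GAP + card.row * (cell_h + GAP)
    w = card.col_span * cell_w + (card.col_span - 1) * GAP
    h = card.row_span * cell_h + (card.row_span - 1) * GAP
    return x, y, w, h


def render_slide(cards, brand) -> str:
    """Render cards as rounded rectangles with titles, colored from the brand palette."""
    bg, primary, text = brand["background"], brand["primary"], brand["text"]
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{WIDTH}" '
        f'height="{HEIGHT}" viewBox="0 0 {WIDTH} {HEIGHT}">',
        f'<rect width="{WIDTH}" height="{HEIGHT}" fill="{bg}"/>',
    ]
    for card in cards:
        x, y, w, h = card_rect(card)
        parts.append(
            f'<rect x="{x:g}" y="{y:g}" width="{w:g}" height="{h:g}" '
            f'rx="16" fill="{primary}"/>'
        )
        parts.append(
            f'<text x="{x + 24:g}" y="{y + 48:g}" font-size="28" '
            f'fill="{text}">{escape(card.title)}</text>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


if __name__ == "__main__":
    # Placeholder palette; in the real tool these values would come from the YAML config.
    brand = {"background": "#FFFFFF", "primary": "#E4572E", "text": "#1B1B1B"}
    cards = [
        Card(0, 0, 2, 2, "Headline"),
        Card(2, 0, 2, 1, "Key metric"),
        Card(2, 1, 2, 1, "Chart"),
        Card(0, 2, 4, 1, "Takeaways & next steps"),
    ]
    print(render_slide(cards, brand))
```

Keeping the geometry in `card_rect` and the colors in a separate palette means the same layout can be re-rendered for a different brand by swapping the palette alone.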

## Practical Effects and Application Scenarios

**Practical Effects**: In the Xiaomi SU7 case, outputs from different models and platforms were compared (for example, GPT-5.4 received a quality score of 8.53/10, and MiMo V2 Pro used Xiaomi's brand orange), and the workflow maintained stable output quality across them.

**Application Scenarios**: Enterprise marketing teams generating branded PPTs, consultants creating analysis reports, educators converting courseware into slides, entrepreneurs preparing roadshow business plans, and researchers preparing conference presentations.

## Limitations and Future Directions

**Limitations**: Output is limited to SVG and HTML, the review step relies on Gemini availability, and support for complex animations is limited.

**Future Directions**: Improve the MCP Server packaging, enhance headless mode to support CI/CD integration, add more output formats, and introduce richer animation generation.

## Summary

PPT-Agent simulates the workflow of a professional team through multi-agent collaboration, balancing professional output with customizability. For users who create presentations frequently, it is an open-source project worth following, and it points to a new direction in AI-assisted content creation.
