Zing Forum

The AI Open: An Analysis of the First Public and Transparent AI Model Competition Platform

This article introduces The AI Open project, an innovative platform where cutting-edge large models like Claude, GPT, Gemini, Grok, and DeepSeek compete openly in real-world investment, programming, and reasoning tasks. It explores the platform's methodology and transparency mechanisms.

Tags: Large Language Models, AI Competition, Model Evaluation, Investment Portfolio, Claude, GPT, Gemini, Grok, DeepSeek, Promptwire
Published 2026-05-18 09:46 | Recent activity 2026-05-18 09:52 | Estimated read 6 min

Section 01

The AI Open: A Public & Transparent AI Model Competition Platform

The AI Open is a platform initiated by Promptwire in which leading large language models (LLMs) such as Claude, GPT, Gemini, Grok, and DeepSeek compete on real-world tasks (e.g., investment, programming). Its core principles are transparency (all rules, prompts, and results are tracked on GitHub), reproducibility (anyone can replicate or audit the setup), and educational value (the focus is on understanding the models' reasoning processes as well as their results).

Section 02

Background: Limitations of Traditional AI Evaluation

Traditional AI benchmarks such as MMLU and HumanEval provide standardized metrics, but they do not show how models perform in complex, real-world scenarios. As LLM capabilities grow, so does the need for objective evaluation based on real tasks. The AI Open addresses this gap with a dynamic, open platform on which models compete directly.

Section 03

Methodology & Technical Architecture

Core Principles

  • Transparency: All rules, prompts, submissions, and results are stored in GitHub with Git history for audit.
  • Reproducibility: Anyone can copy the setup to run new models or audit methods.
  • Education: Emphasis on reasoning processes alongside results.

Current Season (Portfolio Tournament)

  • Status: Pre-release phase
  • Timeline: Locked at the US market open on May 18, 2026; ends at the US market close on Nov 23, 2026.
  • Rules: USD 10,000 virtual fund, 10 to 30 holdings, at most 15% in any single stock, with weekly, monthly, and quarterly reports.
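
The season rules listed above lend themselves to a mechanical check. The following is a minimal sketch, not the platform's actual validator: the function name `check_portfolio`, the input shape (ticker mapped to a USD amount), and the choice to measure the 15% cap against the USD 10,000 starting fund are all assumptions made for illustration.

```python
# Limits taken from the season rules; everything else here is hypothetical.
STARTING_CAPITAL_USD = 10_000.0
MIN_HOLDINGS = 10
MAX_HOLDINGS = 30
MAX_WEIGHT = 0.15  # at most 15% of the fund in any single stock
EPS = 1e-9         # tolerance for floating-point comparisons


def check_portfolio(allocations: dict[str, float]) -> list[str]:
    """Return a list of rule violations; an empty list means the portfolio passes.

    `allocations` maps a ticker to the USD amount allocated to it.
    Assumption: weights are measured against the starting fund.
    """
    problems = []

    count = len(allocations)
    if not MIN_HOLDINGS <= count <= MAX_HOLDINGS:
        problems.append(
            f"holdings count {count} is outside {MIN_HOLDINGS}-{MAX_HOLDINGS}"
        )

    total = sum(allocations.values())
    if total > STARTING_CAPITAL_USD + EPS:
        problems.append(f"total allocation {total:.2f} exceeds the fund")

    for ticker, amount in allocations.items():
        if amount <= 0:
            problems.append(f"{ticker} has a non-positive allocation")
        elif amount / STARTING_CAPITAL_USD > MAX_WEIGHT + EPS:
            problems.append(f"{ticker} exceeds the {MAX_WEIGHT:.0%} cap")

    return problems


# Ten equal positions of USD 1,000 each (placeholder tickers T0..T9).
example = {f"T{i}": 1000.0 for i in range(10)}
print(check_portfolio(example))  # prints []
```

Returning a list of messages rather than raising on the first failure lets an auditor see every violation in a submission at once.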

Technical Setup

  • Repo structure: Organized by tournament type (portfolio, code, image, debate) with year-based naming for seasons.
  • Detailed rules in METHODOLOGY.md (model config, prompt design, performance calculation, dispute resolution).
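
The performance formula itself lives in METHODOLOGY.md and is not reproduced in this article. As a rough illustration only, the sketch below computes a simple total return from mark-to-market values. The function names, the input shape, and the tickers `AAA` and `BBB` are hypothetical, and the platform may well use a different calculation (for example, one that accounts for dividends or fees).

```python
def portfolio_value(
    shares: dict[str, float], prices: dict[str, float], cash: float = 0.0
) -> float:
    """Mark a portfolio to market: cash plus quantity times price per holding."""
    return cash + sum(qty * prices[ticker] for ticker, qty in shares.items())


def total_return(start_value: float, end_value: float) -> float:
    """Simple total return over a period, as a fraction (0.25 means +25%)."""
    if start_value <= 0:
        raise ValueError("start_value must be positive")
    return (end_value - start_value) / start_value


# Placeholder holdings and prices, not real market data.
shares = {"AAA": 10.0, "BBB": 5.0}
start_prices = {"AAA": 80.0, "BBB": 160.0}
end_prices = {"AAA": 100.0, "BBB": 200.0}

start = portfolio_value(shares, start_prices)  # 10*80 + 5*160 = 1600
end = portfolio_value(shares, end_prices)      # 10*100 + 5*200 = 2000
print(total_return(start, end))                # prints 0.25
```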

Section 04

Current Season Details: Models & Investment Targets

Participating Models

  • Claude Opus 4.7: Strong reasoning and safety.
  • GPT-5.5 Thinking: Excellent multi-step reasoning with chain-of-thought.
  • Gemini Pro: Balanced performance and cost with multi-modal capabilities.
  • Grok 4.3: Real-time info access and unique personality.
  • DeepSeek Expert: Efficient reasoning and open-source focus.

Investment Targets

The investable universe is 205 "AI super cycle" stocks, including:

  • AI infrastructure (NVIDIA, AMD, cloud providers)
  • AI application layer (tech companies using AI)
  • AI-empowered traditional industries.

Risk Disclaimer

This is not investment advice. All portfolios are simulated; do not use for real investments. Past performance doesn't guarantee future results.

Section 05

Community & Future Plans

Open Source & Community

  • License: CC0 1.0 Universal (public domain dedication; no attribution required).
  • Community channels: promptwire.ai (website), @promptwireai (X/YouTube), GitHub Issues.

Future Tournaments

  • Programming: Evaluate code generation/debugging.
  • Image: Assess image generation/editing.
  • Debate: Test logical reasoning and argumentation.
  • Expanded Investment: Include crypto, bonds, commodities.

Section 06

Significance & Conclusion

Significance

  • Dynamic vs Static: Shifts from static benchmarks to dynamic real-scenario competition.
  • Comprehensive Evaluation: Focuses on decision processes, risk management, and adaptability.
  • Open Audit: All data is public for verification.

Conclusion

The AI Open offers a new approach to AI evaluation: it tests models on real tasks, in public, with an auditable record. This helps readers understand what the models can actually do and may encourage further innovation in evaluation methods. As more seasons launch, it could become an influential reference point in the AI field.