# WhatCanIRun: An MCP-based LLM Inference Budget Planning Tool

> WhatCanIRun is a practical tool that turns large language model (LLM) inference budgets into actionable plans via the Model Context Protocol (MCP), helping users choose a model configuration strategy under budget constraints.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-05-26T01:45:45.000Z
- Last activity: 2026-05-26T01:53:48.475Z
- Popularity: 159.9
- Keywords: MCP, LLM budget, model selection, cost optimization, API pricing, local deployment, inference planning, large language models
- Page link: https://www.zingnex.cn/en/forum/thread/whatcanirun-mcp-llm
- Canonical: https://www.zingnex.cn/forum/thread/whatcanirun-mcp-llm
- Markdown source: floors_fallback

---

## Introduction

WhatCanIRun is an open-source project maintained by maheshbabugorantla (GitHub link: https://github.com/maheshbabugorantla/whatcanirun, release date: 2026-05-26T01:45:45Z). It is an MCP-based planning tool for LLM inference budgets, aimed at developers and enterprises weighing the cost of LLM deployment. Drawing on an integrated model database, it converts budget constraints into actionable model configuration plans and supports scenarios such as API budget planning and local deployment evaluation. Its core value is shortening the path from a budget figure to a concrete plan.

## Project Background: Cost Dilemmas in LLM Deployment

As large language models grow more capable, developers and enterprises face complex cost decisions: which API calling strategy fits a given budget, what hardware a local deployment requires, and how to balance capability against latency. Estimating from experience or by trial and error is inefficient. WhatCanIRun offers a systematic way to convert a budget into specific configuration plans.

## Core Features and Technical Architecture

### MCP Protocol Integration
WhatCanIRun runs as an MCP server, so MCP-compatible clients such as Claude Desktop and Cursor can call its tools directly.
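
As a rough illustration of what such a client call looks like on the wire: MCP messages are JSON-RPC 2.0, and a tool is invoked with the `tools/call` method. The sketch below only builds that request payload. The tool name `plan_budget` and its argument keys are hypothetical, since the post does not list the tools that WhatCanIRun actually exposes.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical tool name and argument keys, for illustration only.
request = build_tool_call(
    1,
    "plan_budget",
    {"monthly_budget_usd": 500, "requests_per_day": 2000, "tokens_per_request": 500},
)
print(request)
```

In practice the MCP client (Claude Desktop, Cursor, and so on) builds and sends this message itself; the sketch is only meant to show the shape of the exchange.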
### Budget Conversion Logic
The tool maintains a comprehensive model database covering dimensions such as model specifications (parameter count, context window), performance benchmarks, cost data (API pricing), hardware requirements, and latency characteristics. It generates and ranks candidate plans based on this data.
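
A minimal sketch of how such a database record and a first filtering pass might look. The field names, the `candidates` helper, and every value in the sample records are assumptions made for illustration; they are not taken from the project's schema or from real pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRecord:
    """One database entry covering the dimensions named above (hypothetical fields)."""
    name: str
    params_billions: float
    context_window: int
    benchmark_score: float         # normalized to 0-100
    usd_per_million_tokens: float  # API pricing
    min_vram_gb: int               # hardware requirement for local deployment
    p50_latency_ms: int


def candidates(records, monthly_tokens, budget_usd, min_score):
    """Return (name, monthly cost) pairs that fit the budget and score floor, cheapest first."""
    fits = []
    for record in records:
        cost = monthly_tokens / 1_000_000 * record.usd_per_million_tokens
        if cost <= budget_usd and record.benchmark_score >= min_score:
            fits.append((record.name, round(cost, 2)))
    return sorted(fits, key=lambda pair: pair[1])


# Placeholder records with made-up numbers.
records = [
    ModelRecord("model-a", 20, 16_000, 92.0, 14.0, 48, 400),
    ModelRecord("model-b", 200, 128_000, 96.0, 20.0, 320, 900),
    ModelRecord("model-c", 7, 8_000, 80.0, 5.0, 16, 150),
]
result = candidates(records, monthly_tokens=30_000_000, budget_usd=500, min_score=90)
print(result)  # [('model-a', 420.0)]
```

Here `model-b` is dropped for exceeding the budget and `model-c` for missing the score floor, which leaves a single candidate to rank.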

## Use Cases and Practical Examples

### Use Case 1: API Budget Planning
A startup team has a $500/month budget, handles 2,000 requests/day (500 tokens per request), and requires 90% accuracy. The tool returns plans such as a cost-effective option (GPT-3.5, $420/month, 92% accuracy) and a balanced option (a mix of GPT-3.5 and GPT-4, $480/month, 95% accuracy).
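
The arithmetic behind this example can be checked with a few lines. The sketch assumes a 30-day month, which the post does not state. The per-million-token rates it prints are simply what the example's monthly figures imply; they are not actual provider prices.

```python
DAYS_PER_MONTH = 30  # assumption: the post does not say how a month is counted


def monthly_tokens(requests_per_day, tokens_per_request, days=DAYS_PER_MONTH):
    """Total tokens processed in one month."""
    return requests_per_day * tokens_per_request * days


def implied_rate_per_million(monthly_cost_usd, tokens):
    """Blended USD price per one million tokens implied by a monthly cost."""
    return monthly_cost_usd / (tokens / 1_000_000)


BUDGET_USD = 500
tokens = monthly_tokens(2000, 500)  # 30,000,000 tokens per month

for label, cost in [("cost-effective", 420), ("balanced", 480)]:
    rate = implied_rate_per_million(cost, tokens)
    headroom = BUDGET_USD - cost
    print(f"{label}: ${cost}/month -> ${rate:.2f} per 1M tokens, ${headroom} headroom")
```

Both plans fit under the $500 ceiling, with $80 and $20 of headroom respectively.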
### Use Case 2: Local Deployment Evaluation
An enterprise wants to deploy Llama 3 70B privately. The tool reports the minimum configuration (2x A100 80GB), the hardware cost ($15,000 one-time), the monthly operating cost ($500), and a comparison with the equivalent API cost.
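
One way to frame the comparison with API costs is a break-even calculation. The sketch below uses the hardware and operating figures from the example; the $2,000/month API figure is a hypothetical stand-in, because the post does not give the equivalent API cost.

```python
import math


def break_even_months(hardware_usd, ops_usd_per_month, api_usd_per_month):
    """First whole month in which cumulative API spend is at least the
    cumulative self-hosting cost. Returns None if self-hosting never pays off."""
    monthly_saving = api_usd_per_month - ops_usd_per_month
    if monthly_saving <= 0:
        return None
    return math.ceil(hardware_usd / monthly_saving)


# $15,000 hardware and $500/month operations come from the example above;
# the $2,000/month API cost is an assumed value for illustration.
months = break_even_months(15_000, 500, 2_000)
print(months)  # 10
```

If the equivalent API bill were at or below the $500 operating cost, the function would return `None`, meaning the hardware purchase would never be recovered.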
### Use Case 3: Capacity Planning
For an AI writing assistant, the tool suggests a phased strategy: cold start (API only), growth (API plus caching), and scale (hybrid deployment or a self-built cluster).

## Technical Implementation Details

### Model Database Maintenance
The database is kept current by automatically scraping official pricing pages, integrating data from Hugging Face and Papers With Code, referencing cloud vendor hardware costs, and accepting community contributions.
### Ranking Algorithm
The ranking algorithm scores plans on four criteria: cost compliance, performance satisfaction, reliability, and complexity cost. Users can adjust the weight of each criterion.
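
A weighted sum is one simple way to implement adjustable ranking of this kind. The sketch below is an assumption about the general approach, not the project's actual algorithm. Each criterion is taken to be pre-normalized to the range 0 to 1 with higher meaning better, so complexity cost is represented as a `simplicity` score. The plan names and scores are invented.

```python
def rank_plans(plans, weights):
    """Sort plans by the weighted average of their criterion scores, best first."""
    total_weight = sum(weights.values())

    def score(plan):
        weighted = sum(weights[key] * plan["scores"][key] for key in weights)
        return weighted / total_weight

    return sorted(plans, key=score, reverse=True)


# Invented plans and scores, for illustration only.
plans = [
    {"name": "api-only",
     "scores": {"cost": 0.9, "performance": 0.6, "reliability": 0.8, "simplicity": 0.9}},
    {"name": "hybrid",
     "scores": {"cost": 0.7, "performance": 0.9, "reliability": 0.7, "simplicity": 0.4}},
]

equal = {"cost": 1, "performance": 1, "reliability": 1, "simplicity": 1}
performance_first = {"cost": 1, "performance": 5, "reliability": 1, "simplicity": 1}

print([p["name"] for p in rank_plans(plans, equal)])              # ['api-only', 'hybrid']
print([p["name"] for p in rank_plans(plans, performance_first)])  # ['hybrid', 'api-only']
```

Raising the performance weight flips the order, which is the kind of effect user-adjustable weights are meant to produce.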
### Traceable Sources
Each plan cites its data sources, so every figure can be traced back to a benchmark test, a pricing page, or a community discussion.
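
A sketch of how source references could be attached to a plan and checked before the plan is shown. The class names, fields, and the `example.com` URL are all hypothetical; the post does not describe the project's data model.

```python
from dataclasses import dataclass, field


@dataclass
class SourceRef:
    kind: str       # "benchmark", "pricing_page", or "community"
    url: str
    retrieved: str  # ISO date on which the data was collected


@dataclass
class Plan:
    name: str
    monthly_cost_usd: float
    sources: list = field(default_factory=list)


def missing_sources(plans):
    """Names of plans that cannot be traced back to any source."""
    return [plan.name for plan in plans if not plan.sources]


plans = [
    Plan("api-only", 420.0,
         [SourceRef("pricing_page", "https://example.com/pricing", "2026-05-26")]),
    Plan("hybrid", 480.0),  # no sources attached
]
print(missing_sources(plans))  # ['hybrid']
```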

## Limitations and Notes

- **Data Timeliness**: The LLM field changes rapidly; verify the latest pricing and benchmark data before making decisions.
- **Scenario Coverage**: The tool currently focuses on text generation; support for multimodal and domain-specific workloads is limited.
- **Actual Performance Differences**: Latency and throughput figures assume typical scenarios; validate at small scale before going to production.

## Practical Application Recommendations

1. Clarify constraints: List hard requirements such as budget, performance, and latency.
2. Compare multiple plans: Understand the trade-offs behind each option.
3. Small-scale verification: Run proof-of-concept (PoC) tests on candidate plans.
4. Continuous monitoring: Establish a cost tracking mechanism.
5. Feedback and contribution: Share usage experience with the community.
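
For the continuous monitoring step, a cost tracking mechanism can start as a simple linear projection of month-end spend. This is a minimal sketch under stated assumptions: spend accrues evenly through a 30-day month, and the function names are invented for illustration rather than taken from the tool.

```python
def projected_month_end_spend(spend_to_date_usd, day_of_month, days_in_month=30):
    """Extrapolate month-end spend from the average daily spend so far."""
    return spend_to_date_usd / day_of_month * days_in_month


def over_budget(spend_to_date_usd, day_of_month, budget_usd, days_in_month=30):
    """True if the current run rate would exceed the monthly budget."""
    projected = projected_month_end_spend(spend_to_date_usd, day_of_month, days_in_month)
    return projected > budget_usd


# $200 spent by day 10 projects to $600, which exceeds a $500 budget.
print(projected_month_end_spend(200, 10))  # 600.0
print(over_budget(200, 10, 500))           # True
```

A real tracker would read spend from provider billing data and handle uneven traffic, but even this run-rate check flags a budget overrun well before the month ends.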

## Summary and Future Development Directions

### Summary
WhatCanIRun simplifies LLM budget decisions and narrows the set of options to consider, but its output should be validated against real workloads and does not replace human judgment.
### Future Directions
Planned additions include broader multimodal support, cost calculation for fine-tuning, carbon footprint estimation, and contract negotiation assistance.