With the rapid development of Large Language Model (LLM) agents, efficiently managing and allocating computing resources has become a key challenge. Agentic workflows typically involve multi-step chained calls, parallel execution, and conditional branches, and each step may differ in how much time and compute cost it consumes.
In practical deployment, agent systems often face two core constraints:
- Time constraint (deadline): the task must complete within a specified time limit.
- Budget constraint: the total cost of executing the task must not exceed a preset upper limit.
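To make the two constraints concrete, here is a minimal sketch of how they might be tracked during execution. The names `Constraints`, `Step`, `remaining`, and `is_feasible` are hypothetical and chosen only for illustration; they are not part of the MCPP framework. The sketch also assumes steps run sequentially, so their durations add up; for parallel branches the elapsed time would be governed by the slowest branch instead.

```python
from dataclasses import dataclass


@dataclass
class Constraints:
    deadline_s: float  # maximum total time allowed for the task, in seconds
    budget: float      # maximum total cost allowed for the task


@dataclass
class Step:
    name: str
    duration_s: float  # time consumed by this step, in seconds
    cost: float        # cost incurred by this step


def remaining(constraints, steps):
    """Return (time_left, budget_left) after the given sequential steps."""
    time_used = sum(s.duration_s for s in steps)
    cost_used = sum(s.cost for s in steps)
    return constraints.deadline_s - time_used, constraints.budget - cost_used


def is_feasible(constraints, steps):
    """True if neither the deadline nor the budget has been exceeded."""
    time_left, budget_left = remaining(constraints, steps)
    return time_left >= 0 and budget_left >= 0


# Illustrative values only (assumed, not taken from any real workload).
limits = Constraints(deadline_s=10.0, budget=1.0)
done = [Step("plan", 4.0, 0.25), Step("retrieve", 5.0, 0.5)]

print(remaining(limits, done))    # (1.0, 0.25)
print(is_feasible(limits, done))  # True

# One more step that takes 2 s would miss the deadline, even though
# the budget would still be sufficient.
print(is_feasible(limits, done + [Step("answer", 2.0, 0.125)]))  # False
```

A static allocation would fix each step's share of time and cost in advance; checking the remaining headroom after every step is the kind of runtime signal an online allocator can react to.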
Traditional resource allocation methods usually adopt static strategies and cannot adjust dynamically to the real-time execution status. The MCPP framework proposes an online resource allocation method based on Active Inference and Bayesian memory evolution, designed to maximize the task success rate while satisfying both constraints.