The function-calling capability of large language models (LLMs) is a key technology for building AI agents and automated workflows. However, traditional prompting methods face serious challenges when generating structured output: even large models with billions of parameters frequently produce syntax errors, mismatched parameter types, or non-standard formatting when emitting JSON-formatted function calls.
According to the project author's tests, small models without special handling (such as Qwen3-0.6B) achieve only about 30% validity when generating JSON function calls directly. In other words, roughly two out of every three calls fail due to formatting issues, and this unreliability severely limits the use of LLMs in real-world production environments.
The call-me-maybe project, created by 42-course developer rogard-antoine, proposes a fundamentally different solution: instead of relying on the model to "learn" to generate correct JSON from training data, it "forces" the model to produce valid output through constraint mechanisms applied during the decoding phase. This shift in thinking allows a lightweight model with only 0.6B parameters to achieve 100% JSON validity.
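To illustrate the principle (not the project's actual code), here is a minimal, hypothetical sketch of constrained decoding: at each step, the decoder masks out every candidate character that would break validity, so only well-formed outputs can ever be produced, regardless of how unreliable the underlying model's scores are.

```python
import json
import random

def build_trie(valid_outputs):
    """Map each prefix to the set of characters allowed to follow it."""
    trie = {}
    for s in valid_outputs:
        for i in range(len(s)):
            trie.setdefault(s[:i], set()).add(s[i])
    return trie

def constrained_decode(score_fn, valid_outputs):
    """Decode one character at a time; the constraint keeps only characters
    that extend a valid prefix, so the result is valid by construction."""
    trie = build_trie(valid_outputs)
    out = ""
    while out not in valid_outputs:
        allowed = trie[out]                 # legal continuations at this prefix
        scores = {c: score_fn(out, c) for c in allowed}
        out += max(scores, key=scores.get)  # pick the highest-scoring legal char
    return out

# A deliberately "unreliable model": purely random scores. The valid call
# templates below are illustrative placeholders, not part of the project.
valid = ['{"name": "get_weather", "args": {"city": "Paris"}}',
         '{"name": "get_time", "args": {}}']
result = constrained_decode(lambda prefix, ch: random.random(), valid)
assert result in valid
assert json.loads(result)  # always parses: validity is guaranteed
print(result)
```

Real implementations apply the same idea at the token level, typically by masking logits against a JSON schema or grammar rather than a fixed list of strings, but the guarantee is identical: invalid output is unreachable.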