Zing Forum


IDE-integrated AI Development Toolkit: Enabling Non-ML Engineers to Build AI Features

This article introduces a JetBrains IDE plugin that directly integrates AI feature tracking and evaluation into the development workflow, lowering the barrier to AI development.

Tags: AI development tools, IDE plugin, agent debugging, JetBrains, LLM engineering, tracing, evaluation
Published 2026-05-14 17:28 · Recent activity 2026-05-15 12:51 · Estimated read 6 min

Section 01

IDE-integrated AI Development Toolkit: Enabling Non-ML Engineers to Build AI Features (Introduction)

This article introduces an AI Toolkit plugin for JetBrains IDEs, designed to lower the barrier for non-ML engineers to build AI features. The plugin directly integrates AI feature tracking and evaluation capabilities into the IDE environment familiar to developers, allowing non-ML experts to adopt standardized AI development practices without frequent tool switching or learning an entirely new workflow.


Section 02

Hidden Barriers to AI Development: Challenges Faced by Non-ML Engineers

AI development has hidden barriers for non-ML engineers: AI feature outputs are nondeterministic and hard to explain, agent decision-making is difficult to trace, and evaluation criteria are subjective and vague. Traditional testing methods offer little systematic reliability here, debugging feels like groping inside a black box, and developers are reluctant to switch environments frequently just to use AI tooling. Together, these issues make AI features hard to build, painful to debug, and nearly impossible to reproduce.


Section 03

AI Toolkit Plugin: IDE-Native AI Development Workflow

Based on research into developer needs (standardized evaluation, visibility into execution traces, minimal context switching), the team built the AI Toolkit plugin specifically for JetBrains IDEs. Its core innovation is integrating the full AI development lifecycle into the familiar Run/Debug loop through two main components: the AI Agents Debugger (tracing and visualizing agent execution) and AI Evaluation (a unit-test-like evaluation framework). The design philosophy respects existing software engineering practice and reuses familiar metaphors to reduce learning cost.
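The hierarchical execution trace that the AI Agents Debugger visualizes can be thought of as a tree of spans, one per agent decision or tool call. A minimal sketch of such trace capture (all names here are hypothetical illustrations, not the plugin's actual API):

```python
import time
from dataclasses import dataclass, field

@dataclass
class Span:
    """One node in the execution trace: an agent step or tool call."""
    name: str
    inputs: dict
    output: object = None
    children: list = field(default_factory=list)
    started: float = 0.0
    elapsed: float = 0.0

class Tracer:
    """Records agent activity as a tree, mirroring the hierarchical
    view a debugger could present for layer-by-layer inspection."""
    def __init__(self):
        self.root = Span("run", {})
        self._stack = [self.root]

    def step(self, name, **inputs):
        # Open a new span nested under whatever span is currently active.
        span = Span(name, inputs, started=time.time())
        self._stack[-1].children.append(span)
        self._stack.append(span)
        return span

    def finish(self, output):
        # Close the current span, recording its output and duration.
        span = self._stack.pop()
        span.output = output
        span.elapsed = time.time() - span.started
        return output

# Usage: wrap each decision and tool call in step()/finish()
tracer = Tracer()
tracer.step("choose_tool", query="capital of France?")
tracer.step("search", term="capital of France")
tracer.finish("Paris")                # closes the nested "search" span
tracer.finish("answer: Paris")        # closes "choose_tool"
```

Because every span keeps its inputs, output, and children, drilling into the decision tree is just walking `root.children`, which is what the debugger's real-time inspection view amounts to conceptually.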


Section 04

Analysis of AI Toolkit's Core Features

The plugin's core features include:

1. Run-triggered trace capture: automatically records agent decisions, tool calls, parameters, and intermediate outputs, presented in a hierarchical structure.
2. Real-time hierarchical inspection: interactively explore a trace, drilling into the decision tree layer by layer to locate issues quickly.
3. One-click addition to a dataset: save interesting cases (including input, output, and intermediate state) to the evaluation dataset.
4. Unit-test-like evaluation: write cases that define metrics (from string matching to semantic similarity), run batch validation, and generate test reports.
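The unit-test-like evaluation described above can be sketched as follows. This is not the plugin's actual API; it is a minimal illustration of the pattern, using `difflib.SequenceMatcher` as a cheap stand-in for a real semantic-similarity metric:

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable

@dataclass
class Case:
    """One evaluation case: an input and the expected answer."""
    input: str
    expected: str

def exact_match(expected: str, actual: str) -> float:
    """Strict string-matching metric: 1.0 on match, else 0.0."""
    return 1.0 if expected.strip() == actual.strip() else 0.0

def similarity(expected: str, actual: str) -> float:
    """Fuzzy metric; a real framework would use embedding similarity."""
    return SequenceMatcher(None, expected, actual).ratio()

def evaluate(agent: Callable[[str], str], dataset: list,
             metric: Callable[[str, str], float], threshold: float = 0.8):
    """Run every case through the agent; report (input, score, passed)."""
    results = []
    for case in dataset:
        actual = agent(case.input)
        score = metric(case.expected, actual)
        results.append((case.input, score, score >= threshold))
    return results

# Usage with a stub agent standing in for a real LLM-backed one
dataset = [Case("2+2?", "4"), Case("capital of France?", "Paris")]
stub = lambda q: {"2+2?": "4", "capital of France?": "Paris"}.get(q, "")
report = evaluate(stub, dataset, exact_match, threshold=1.0)
```

Swapping `exact_match` for `similarity` with a lower threshold is how such a framework would move along the spectrum from string matching to semantic comparison.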


Section 05

Early Adoption Data: Validating the Plugin's Practical Value

Early adoption data shows positive signals:

1. High conversion rate: developers are eager to try trace capture when prompted during a run.
2. Sustained use: once they start capturing traces, developers tend to keep doing so.
3. Low churn rate: adopters rarely abandon the feature.

These data support the hypothesis that IDE-native observability lowers the activation energy for AI development.


Section 06

Limitations and Future Development Directions

The current plugin has limitations: it mainly supports AI frameworks in the Python ecosystem (such as LangChain and LlamaIndex), so support for other languages still needs to be built out, and the performance and experience of large-scale evaluation (hundreds or thousands of cases) need optimization. Future directions include broader framework and language support, richer evaluation features, and team collaboration for sharing datasets and traces.


Section 07

Conclusion: Implications for Democratization and Engineering of AI Development

AI Toolkit advances the democratization of AI development: it makes AI features manageable, debuggable, and evaluable like traditional software, lowering the barrier so more engineers can participate. For AI engineering, the implications are that tool integration beats standalone environments, observability is central, and lowering barriers broadens who can contribute. AI development should not be the exclusive domain of ML experts; it should be a capability available to every software engineer.