Zing Forum


IDE-integrated AI Development Toolkit: Enabling Non-ML Engineers to Build AI Features

This article introduces a JetBrains IDE plugin that directly integrates AI feature tracking and evaluation into the development workflow, lowering the barrier to AI development.

Tags: AI development tools, IDE plugin, agent debugging, JetBrains, LLM engineering, tracing, evaluation
Published 2026-05-14 17:28 · Recent activity 2026-05-15 12:51 · Estimated read 6 min

Section 01

IDE-integrated AI Development Toolkit: Enabling Non-ML Engineers to Build AI Features (Introduction)

This article introduces an AI Toolkit plugin for JetBrains IDEs, designed to lower the barrier for non-ML engineers to build AI features. The plugin directly integrates AI feature tracking and evaluation capabilities into the IDE environment familiar to developers, allowing non-ML experts to adopt standardized AI development practices without frequent tool switching or learning an entirely new workflow.


Section 02

Hidden Barriers to AI Development: Challenges Faced by Non-ML Engineers

AI development has hidden barriers for non-ML engineers: AI feature outputs are nondeterministic and hard to explain, agent decision-making is difficult to trace, and evaluation criteria are subjective and vague. Traditional testing methods offer little systematic reliability here, debugging feels like groping inside a black box, and developers are reluctant to switch environments frequently just to use AI tooling. Together, these issues make AI features hard to build, painful to debug, and nearly impossible to reproduce.


Section 03

AI Toolkit Plugin: IDE-Native AI Development Workflow

Based on research into developer needs (standardized evaluation, visibility into execution traces, minimal context switching), the team built the AI Toolkit plugin specifically for JetBrains IDEs. Its core innovation is integrating the full AI development lifecycle into the familiar Run/Debug loop through two main components: the AI Agents Debugger (tracing and visualizing agent execution) and AI Evaluation (a unit-test-like evaluation framework). The design philosophy respects existing software engineering practice and reuses familiar metaphors to reduce learning cost.
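The hierarchical execution trace that the AI Agents Debugger visualizes can be thought of as a tree of spans, one per agent decision or tool call. A minimal sketch of such trace capture (all names here are hypothetical illustrations, not the plugin's actual API):

```python
import time
from dataclasses import dataclass, field

@dataclass
class Span:
    """One node in the execution trace: an agent step or tool call."""
    name: str
    inputs: dict
    output: object = None
    children: list = field(default_factory=list)
    started: float = 0.0
    elapsed: float = 0.0

class Tracer:
    """Records agent activity as a tree, mirroring the hierarchical
    view a debugger could present for layer-by-layer inspection."""
    def __init__(self):
        self.root = Span("run", {})
        self._stack = [self.root]

    def step(self, name, **inputs):
        # Open a new span nested under whatever span is currently active.
        span = Span(name, inputs, started=time.time())
        self._stack[-1].children.append(span)
        self._stack.append(span)
        return span

    def finish(self, output):
        # Close the current span, recording its output and duration.
        span = self._stack.pop()
        span.output = output
        span.elapsed = time.time() - span.started
        return output

# Usage: wrap each decision and tool call in step()/finish()
tracer = Tracer()
tracer.step("choose_tool", query="capital of France?")
tracer.step("search", term="capital of France")
tracer.finish("Paris")                # closes the nested "search" span
tracer.finish("answer: Paris")        # closes "choose_tool"
```

Because every span keeps its inputs, output, and children, drilling into the decision tree is just walking `root.children`, which is what the debugger's real-time inspection view amounts to conceptually.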


Section 04

Analysis of AI Toolkit's Core Features

The plugin's core features include:

1. Run-triggered trace capture: automatically records agent decisions, tool calls, parameters, and intermediate outputs, presented in a hierarchical structure.
2. Real-time hierarchical inspection: interactively explore a trace, drilling into the decision tree layer by layer to locate issues quickly.
3. One-click addition to a dataset: save interesting cases (including input, output, and intermediate state) to the evaluation dataset.
4. Unit-test-like evaluation: write cases that define metrics (from string matching to semantic similarity), run batch validation, and generate test reports.
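The unit-test-like evaluation described above can be sketched as follows. This is not the plugin's actual API; it is a minimal illustration of the pattern, using `difflib.SequenceMatcher` as a cheap stand-in for a real semantic-similarity metric:

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable

@dataclass
class Case:
    """One evaluation case: an input and the expected answer."""
    input: str
    expected: str

def exact_match(expected: str, actual: str) -> float:
    """Strict string-matching metric: 1.0 on match, else 0.0."""
    return 1.0 if expected.strip() == actual.strip() else 0.0

def similarity(expected: str, actual: str) -> float:
    """Fuzzy metric; a real framework would use embedding similarity."""
    return SequenceMatcher(None, expected, actual).ratio()

def evaluate(agent: Callable[[str], str], dataset: list,
             metric: Callable[[str, str], float], threshold: float = 0.8):
    """Run every case through the agent; report (input, score, passed)."""
    results = []
    for case in dataset:
        actual = agent(case.input)
        score = metric(case.expected, actual)
        results.append((case.input, score, score >= threshold))
    return results

# Usage with a stub agent standing in for a real LLM-backed one
dataset = [Case("2+2?", "4"), Case("capital of France?", "Paris")]
stub = lambda q: {"2+2?": "4", "capital of France?": "Paris"}.get(q, "")
report = evaluate(stub, dataset, exact_match, threshold=1.0)
```

Swapping `exact_match` for `similarity` with a lower threshold is how such a framework would move along the spectrum from string matching to semantic comparison.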


Section 05

Early Adoption Data: Validating the Plugin's Practical Value

Early adoption data shows positive signals:

1. High conversion rate: developers are eager to try trace capture when prompted during a run.
2. Sustained use: once they start capturing traces, developers tend to keep doing so.
3. Low churn rate: adopters rarely abandon the feature.

These data support the hypothesis that IDE-native observability lowers the activation energy for AI development.


Section 06

Limitations and Future Development Directions

The current plugin has limitations: it mainly supports AI frameworks in the Python ecosystem (such as LangChain and LlamaIndex), so support for other languages still needs to be built out, and the performance and experience of large-scale evaluation (hundreds or thousands of cases) need optimization. Future directions include broader framework and language support, richer evaluation features, and team collaboration for sharing datasets and traces.


Section 07

Conclusion: Implications for Democratization and Engineering of AI Development

AI Toolkit advances the democratization of AI development: it makes AI features manageable, debuggable, and evaluable like traditional software, lowering the barrier so more engineers can participate. For AI engineering, the implications are that tool integration beats standalone environments, observability is central, and lowering barriers broadens who can contribute. AI development should not be the exclusive domain of ML experts; it should be a capability available to every software engineer.