# DeepSparkInference: A Comprehensive Analysis of the Open-Source Library with 216 AI Inference Models on Domestic GPUs

> DeepSparkInference is a core project of the DeepSpark open-source community, offering 216 inference model examples running on domestic Iluvatar CoreX GPUs. It covers multiple domains including CV, NLP, speech synthesis, and large language models, supports mainstream inference frameworks like vLLM, TGI, and LMDeploy, and provides crucial support for the ecosystem development of domestic AI chips.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-04-23T15:14:46.000Z
- Last activity: 2026-04-23T15:55:44.390Z
- Popularity: 143.3
- Keywords: domestic GPU, Iluvatar CoreX, AI inference, large language models, vLLM, open source, DeepSpark, model library, domestic chips
- Page link: https://www.zingnex.cn/en/forum/thread/deepsparkinference-gpu216ai
- Canonical: https://www.zingnex.cn/forum/thread/deepsparkinference-gpu216ai
- Markdown source: floors_fallback

---

## DeepSparkInference Project Guide


## Project Background and Significance

In the development of artificial intelligence, hardware support for model inference is a key constraint. For a long time the high-end AI chip market has been dominated by foreign vendors, and domestic GPUs have lagged in software ecosystem and model support. DeepSparkInference was open-sourced in March 2024 to fill this gap: by providing abundant model inference examples and a complete toolchain, it injects momentum into the domestic AI chip ecosystem.

## Technical Architecture and Core Engines

The project revolves around two inference engines from Iluvatar CoreX:
- **IGIE**: a high-performance inference engine built on TVM, supporting multi-framework model import, INT8 quantization, graph optimization, adaptation to multiple operator libraries and backends, and automatic operator tuning; suited to production deployment.
- **ixRT**: a self-developed high-performance engine focused on extracting the full performance of Iluvatar CoreX GPUs, supporting dynamic-shape inference, a plugin mechanism, and mixed-precision computation; suited to scenarios with strict latency and throughput requirements.
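The INT8 quantization that IGIE supports rests on a simple idea: map floating-point tensors onto 8-bit integers via a scale factor. The sketch below illustrates symmetric per-tensor quantization in plain Python; it is an engine-agnostic illustration of the technique, not IGIE's actual API (which is not documented here).

```python
# Minimal illustration of symmetric per-tensor INT8 quantization,
# the general technique behind INT8 inference. This is NOT IGIE's API.

def quantize_int8(values):
    """Map floats to int8 codes using a single symmetric scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs else 1.0
    # Round to nearest integer and clamp to the int8 range.
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from int8 codes."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.0, 1.27]
q, scale = quantize_int8(weights)      # q == [50, -127, 0, 127]
approx = dequantize(q, scale)          # close to the original weights
```

The quantization error here is bounded by half a scale step per element, which is why INT8 inference typically needs calibration data to pick good scales in practice.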

## Model Coverage and Classification

The 216 models are categorized by domain:
- **Computer Vision**: Includes ResNet, YOLO, etc., covering tasks like image classification and object detection, supporting scenarios such as security and industrial quality inspection.
- **Natural Language Processing**: Includes BERT, GPT series, covering tasks like text classification, with special optimizations for Chinese models.
- **Speech Recognition and Synthesis**: Such as CosyVoice2-0.5B, supporting scenarios like intelligent customer service.
- **Large Language Models**: Supports series like Baichuan, ChatGLM, DeepSeek, Llama, Qwen, enabling efficient inference via mainstream frameworks.
- **Multimodal Models**: Such as Qwen-VL, GLM-4V, etc., meeting complex scenarios like image-text understanding.
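For the large-language-model entries above, inference is typically served through vLLM, which exposes an OpenAI-compatible HTTP API. The stdlib-only sketch below builds a request against such a server; the host, port, and model name are illustrative assumptions, not values from the project.

```python
# Sketch of a client request to vLLM's OpenAI-compatible
# /v1/chat/completions endpooint. Host, port, and model name are
# assumptions for illustration; adjust them to your deployment.
import json
import urllib.request

def build_chat_request(model, prompt, base_url="http://localhost:8000"):
    """Construct (but do not send) a chat-completion HTTP request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_chat_request("Qwen/Qwen2-7B-Instruct", "Hello")
# To actually send it, a vLLM server must be running:
#     resp = urllib.request.urlopen(req)
```

Because the API is OpenAI-compatible, the same request shape works unchanged whether the backend is vLLM, TGI, or LMDeploy in OpenAI-compatibility mode.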

## Community Activities and Practical Application Value

- **Community Activities**: Co-hosted a hackathon with Baidu PaddlePaddle from March to June 2025, with check-in, advanced, and open-source-contribution tracks to lower the barrier to participation.
- **Application Value**:
  1. Lowers the barrier to enterprise AI deployment by providing verified models and deployment documentation.
  2. Supports the construction of independent, controllable domestic computing infrastructure.
  3. Promotes industry-university-research collaboration and accelerates the application of research results.

## Future Outlook and Conclusion

### Future Plans
1. Expand the model library to more sub-fields.
2. Deepen support for large models and multimodal models.
3. Optimize inference performance.
4. Improve the toolchain.
5. Strengthen community building.

### Conclusion
This project marks a milestone in domestic GPUs moving from "usable" to "easy to use": it gives AI developers a window for evaluation, offers enterprises an independent, controllable computing-power option, and advances the domestic AI industry.
