Zing Forum

Research on LLM Usage Efficiency: How to Reduce Resource Consumption Through Prompt Design Optimization

An empirical study on the usage efficiency of large language models (LLMs), which reveals how user behavior and prompt design affect resource consumption through analysis of real datasets and controlled experiments, and provides actionable optimization recommendations.

Tags: LLM, resource efficiency, prompt engineering, sustainability, token optimization, machine learning, data analysis
Published 2026-06-04 14:44 | Recent activity 2026-06-04 14:54 | Estimated read 5 min

Section 01

Introduction to LLM Usage Efficiency Research

Thoericht published this study on GitHub on May 20, 2026. It examines LLM usage efficiency: by analyzing real conversation datasets and running controlled experiments, it explores how prompt design and user interaction patterns affect resource consumption, and it offers actionable optimization recommendations. The core goal is to make LLM use more resource-efficient, lowering both costs and environmental burden.

Section 02

Research Background and Core Questions

Background: LLMs are now part of everyday workflows, but users differ widely in how efficiently they use them. Inefficient usage patterns cause unnecessary computational overhead, higher costs, and a heavier environmental burden.

Core Questions:

  1. How does prompt structure affect token consumption and response length?
  2. Are some topics or task types inherently more efficient than others?
  3. Can machine learning be used to model usage efficiency?

Section 03

Data Sources and Research Methods

Data Collection: The study combines two sources: real conversation datasets (in a ShareGPT-like style) and synthetic prompt experiments that allow controlled comparison.

Analysis Framework: A four-stage process: Exploratory Data Analysis (statistics + novelty embedding) → Topic Modeling (Sentence Transformer + BERTopic) → Efficiency Modeling (regression model + SHAP analysis) → Controlled Experiments (quantify efficiency-quality trade-off).
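The efficiency-modeling stage can be illustrated with a minimal sketch. It assumes synthetic data, hypothetical feature names (`prompt_tokens`, `num_constraints`, `has_example`), and an ordinary least-squares fit in NumPy standing in for whichever regression model the study actually uses. For a linear model with independent features, the SHAP value of feature j on a sample reduces to the coefficient times the feature's deviation from its mean, so the attribution is computed in closed form here rather than with the shap library.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical prompt features (not taken from the study): prompt length in
# tokens, number of explicit constraints, and whether an example is included.
feature_names = ["prompt_tokens", "num_constraints", "has_example"]
n = 200
X = np.column_stack([
    rng.integers(5, 200, size=n),
    rng.integers(0, 5, size=n),
    rng.integers(0, 2, size=n),
]).astype(float)

# Synthetic target: response token count from an invented rule plus noise.
y = 50 + 1.5 * X[:, 0] - 20 * X[:, 1] - 30 * X[:, 2] + rng.normal(0, 5, size=n)

# Ordinary least squares with an explicit intercept column.
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
intercept, weights = coef[0], coef[1:]

def predict(X_new):
    """Predicted response token count for one sample or a batch."""
    return intercept + X_new @ weights

def linear_shap(x):
    """Closed-form SHAP values for a linear model under feature independence:
    phi_j = w_j * (x_j - mean(x_j))."""
    return weights * (x - X.mean(axis=0))

sample = X[0]
phi = linear_shap(sample)
baseline = predict(X).mean()  # average prediction over the data set

for name, value in zip(feature_names, phi):
    print(f"{name}: {value:+.1f} tokens relative to baseline")
```

The attributions are additive: the baseline plus the sum of the per-feature values equals the model's prediction for that sample, which is what makes this kind of breakdown readable as "which prompt property drove the cost".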

Key Metrics: target_success (whether the first response requires no clarification), target_cost (minimum number of tokens for the first response).
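One possible reading of these two metrics is sketched below. The `Turn` structure, the keyword heuristic for spotting a clarification turn, and the whitespace tokenizer are all assumptions made for illustration, and the study's own definitions may differ. In particular, `target_cost` is interpreted here simply as the token count of the first response.

```python
from dataclasses import dataclass

@dataclass
class Turn:
    role: str  # "user" or "assistant"
    text: str

# Hypothetical phrases suggesting the user had to restate the request.
CLARIFICATION_MARKERS = ("i meant", "to clarify", "that's not what")

def count_tokens(text: str) -> int:
    # Whitespace split as a stand-in for a real tokenizer such as tiktoken.
    return len(text.split())

def first_response_index(conversation) -> int:
    return next(i for i, t in enumerate(conversation) if t.role == "assistant")

def target_cost(conversation) -> int:
    """Token count of the first assistant response (assumed interpretation)."""
    return count_tokens(conversation[first_response_index(conversation)].text)

def target_success(conversation) -> bool:
    """True if the user turn after the first response is not a clarification."""
    idx = first_response_index(conversation)
    follow_up = next((t for t in conversation[idx + 1:] if t.role == "user"), None)
    if follow_up is None:
        return True
    text = follow_up.text.lower()
    return not any(marker in text for marker in CLARIFICATION_MARKERS)

clear = [
    Turn("user", "Summarize the attached paragraph in one sentence."),
    Turn("assistant", "The paragraph argues that caching reduces latency."),
    Turn("user", "Thanks!"),
]
vague = [
    Turn("user", "Fix it"),
    Turn("assistant", "Could you share the code and the error you are seeing?"),
    Turn("user", "I meant the login bug in auth.py"),
]

for name, conv in [("clear", clear), ("vague", vague)]:
    print(name, target_success(conv), target_cost(conv))
```

A keyword heuristic like this is crude; a labeled sample or a classifier would be a more reliable way to decide whether a follow-up turn is a clarification.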

Tool Stack: Python with pandas, NumPy, scikit-learn, matplotlib, seaborn, sentence-transformers, and tiktoken.
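Since tiktoken appears in the tool stack, the token cost of two prompt styles could be compared with a sketch like the one below. The `cl100k_base` encoding and both example prompts are assumptions chosen for illustration, and the regex fallback is only a rough stand-in for cases where tiktoken or its encoding files are unavailable.

```python
import re

def make_token_counter():
    """Return a function mapping text to a token count."""
    try:
        import tiktoken
        encoding = tiktoken.get_encoding("cl100k_base")  # assumed encoding
        return lambda text: len(encoding.encode(text))
    except Exception:
        # Fallback: count words and punctuation marks as rough "tokens".
        return lambda text: len(re.findall(r"\w+|[^\w\s]", text))

count_tokens = make_token_counter()

# Two hypothetical prompts asking for the same thing.
verbose_prompt = (
    "Hello! I hope you are doing well today. I was wondering whether you could "
    "possibly help me out by writing a short summary of the following text, "
    "if that is not too much trouble. Thank you so much in advance!"
)
concise_prompt = "Summarize the following text in two sentences."

verbose_cost = count_tokens(verbose_prompt)
concise_cost = count_tokens(concise_prompt)
savings = 1 - concise_cost / verbose_cost

print(f"verbose={verbose_cost} concise={concise_cost} saved={savings:.0%}")
```

This measures only the prompt side. Response length, which the study also tracks, would need actual model outputs to count.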

Section 04

Expected Outcomes and Practical Significance

Expected Outcomes:

  1. Identify inefficient usage patterns;
  2. Establish a prompt efficiency prediction framework;
  3. Provide actionable prompt optimization guidelines.

Practical Significance: The results should help development teams and users reduce operating costs, minimize environmental impact, and treat resource efficiency as an engineering constraint.

Section 05

Limitations and Future Directions

Limitations: The study uses token count and interaction complexity as proxies for resource consumption and does not measure energy use directly, so its estimates may deviate from the actual carbon footprint.

Future Directions:

  1. Integrate real energy consumption monitoring data;
  2. Expand to more LLM providers and model architectures;
  3. Develop real-time prompt optimization tools;
  4. Explore cumulative efficiency optimization for multi-turn conversations.

Section 06

Research Conclusion

Now that LLM applications are widespread, resource efficiency has become an engineering constraint that teams need to account for. This study offers a systematic, data-driven framework for understanding and optimizing LLM usage efficiency, which can serve as a reference for reducing costs and environmental impact.