Zing Forum


Recipe Nutrient Estimation: A Systematic Comparison Between Traditional Methods and LLMs

This paper systematically compares TF-IDF, DeBERTa-v3, and LLMs on the recipe nutrient estimation task. It finds that LLMs achieve the highest accuracy under the strict EU 1169/2011 standard, at the cost of a significant efficiency-accuracy trade-off.

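Tags: Recipe nutrient estimation | Dietary monitoring | LLM applications | TF-IDF | DeBERTa | EU 1169/2011 | Food knowledge | Accuracy-efficiency trade-off
Published 2026-04-28 23:41 | Recent activity 2026-04-29 10:58 | Estimated read 5 min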

Section 01

[Introduction] Core Summary of Recipe Nutrient Estimation: A Systematic Comparison Between Traditional Methods and LLMs

This paper systematically compares the performance of TF-IDF, DeBERTa-v3, LLMs, and hybrid pipelines on the recipe nutrient estimation task. Key findings: LLMs and hybrid pipelines achieve the highest accuracy under the strict EU 1169/2011 standard, but there is a significant efficiency-accuracy trade-off; traditional methods are fast but have limited accuracy. The study provides guidance for model selection in different scenarios (real-time/precision-first/hybrid) and points out current limitations and future directions.


Section 02

Task Background: Two Core Challenges of Nutrient Estimation in Dietary Monitoring

Accurately estimating nutrient content from unstructured recipe text is a key challenge in dietary monitoring. The difficulty has two main sources:

  1. Ambiguous ingredient terms: Phrases like "a handful of spinach" or "moderate salt" are not standardized, and the processing state of an ingredient (fresh vs. canned tomatoes) also changes its nutrient content;
  2. Variable quantity expressions: Quantities appear as volumes (cups/spoons), weights (grams/ounces), counts (pieces/slices), or vague descriptions ("a little"), so converting them to a common scale requires complex reasoning.
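
To make the standardization problem concrete, here is a minimal sketch of a quantity normalizer. It is not taken from the paper. The function name `normalize_quantity` and the gram values assigned to vague phrases are hypothetical assumptions for illustration; the unit factors are standard US customary conversions. Note that it stops at millilitres for volumes, because converting a volume to grams needs an ingredient-specific density, which is exactly the kind of food knowledge the task demands.

```python
import re

# Standard conversion factors: US customary volumes to millilitres, weights to grams.
VOLUME_ML = {"cup": 236.588, "tbsp": 14.787, "tsp": 4.929, "ml": 1.0}
WEIGHT_G = {"g": 1.0, "gram": 1.0, "kg": 1000.0, "oz": 28.3495, "ounce": 28.3495}

# ASSUMPTION: gram equivalents for vague phrases are invented for illustration only.
VAGUE_G = {"a handful": 30.0, "a little": 2.0}

# A leading number followed by a unit word, e.g. "2 cups" or "3oz".
QUANTITY_RE = re.compile(r"^(\d+(?:\.\d+)?)\s*([a-z]+)")


def normalize_quantity(text):
    """Return (amount, unit) with unit 'g' or 'ml', or None if not parseable."""
    lowered = text.lower().strip()
    for phrase, grams in VAGUE_G.items():
        if lowered.startswith(phrase):
            return (grams, "g")
    match = QUANTITY_RE.match(lowered)
    if match is None:
        return None
    amount = float(match.group(1))
    unit = match.group(2)
    # Try the unit as written, then with a trailing plural "s" removed.
    for candidate in (unit, unit.rstrip("s")):
        if candidate in WEIGHT_G:
            return (amount * WEIGHT_G[candidate], "g")
        if candidate in VOLUME_ML:
            return (amount * VOLUME_ML[candidate], "ml")
    # Counts such as "2 eggs" need per-item weights, which this sketch omits.
    return None


print(normalize_quantity("2 cups flour"))
print(normalize_quantity("a handful of spinach"))
print(normalize_quantity("2 eggs"))
```

Even this small sketch shows why rule-based parsing runs out quickly: counts, densities, and vague phrases all require knowledge that a lookup table cannot supply in general.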

Section 03

Study Design: Systematic Evaluation Scheme for Four Models

Four types of models were evaluated:

  1. TF-IDF + Ridge Regression: A lexical baseline. It is fast, simple, and interpretable, but cannot capture semantic similarity or handle complex quantity expressions;
  2. DeBERTa-v3: A deep semantic model. It was expected to benefit from strong semantic capabilities, but performs poorly because of data scarcity and a lack of food-specific knowledge;
  3. LLM Generative Reasoning: Draws on world knowledge to parse ambiguous terms and normalize units, exposes its reasoning chain, and achieves high accuracy;
  4. Hybrid LLM Refinement Pipeline: Uses TF-IDF for a fast initial estimate and an LLM to correct it, balancing efficiency and accuracy.
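
The summary does not say how the hybrid pipeline passes the initial estimate to the LLM, so the following is only one plausible arrangement. The names `hybrid_estimate`, `fast_model`, and `refiner` are hypothetical, and the two models are plain callables so that any regressor or LLM client could be plugged in.

```python
from typing import Callable, Dict

Nutrients = Dict[str, float]


def hybrid_estimate(
    recipe_text: str,
    fast_model: Callable[[str], Nutrients],
    refiner: Callable[[str, Nutrients], Nutrients],
) -> Nutrients:
    """Produce a fast draft, then let a slower model correct it."""
    draft = fast_model(recipe_text)
    try:
        corrections = refiner(recipe_text, draft)
    except Exception:
        # ASSUMPTION: if the refiner fails, fall back to the draft estimate.
        return draft
    # Keep draft values for any nutrient the refiner did not return.
    return {**draft, **corrections}


# Toy stand-ins for the two stages; the values are invented for illustration.
def toy_fast_model(text: str) -> Nutrients:
    return {"protein_g": 10.0, "fat_g": 5.0}


def toy_refiner(text: str, draft: Nutrients) -> Nutrients:
    return {"fat_g": 7.5}


print(hybrid_estimate("pasta with butter", toy_fast_model, toy_refiner))
```

Merging corrections over the draft means a partial or failed LLM response still yields a complete estimate, which is one way to keep the fast baseline as a safety net.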

Section 04

Evaluation Criteria: Adopting the Strict Tolerance Standard of EU 1169/2011

The study uses the strict tolerance standard defined by EU Regulation 1169/2011 for evaluation. This regulation specifies accuracy requirements for food nutrition labels, providing a realistic and strict benchmark for the task.
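
A tolerance-based metric of this kind can be sketched as a banded check: depending on the size of the reference value, the allowed deviation is either an absolute amount or a relative percentage. The band values below are placeholders chosen for illustration. They are not quoted from the regulation or from the paper, and the function names are hypothetical; consult the official EU text for the actual tolerances per nutrient.

```python
# Each band is (exclusive upper bound on the reference value, kind, value),
# where kind is "abs" (same unit as the nutrient) or "rel" (a fraction).
# ASSUMPTION: placeholder numbers for illustration only.
PLACEHOLDER_BANDS = [
    (10.0, "abs", 2.0),
    (40.0, "rel", 0.20),
    (float("inf"), "abs", 8.0),
]


def within_tolerance(reference, estimate, bands=PLACEHOLDER_BANDS):
    """Return True if the estimate is inside the tolerance for its band."""
    for upper, kind, value in bands:
        if reference < upper:
            allowed = value if kind == "abs" else reference * value
            return abs(estimate - reference) <= allowed
    raise ValueError("no band covers the reference value")


def compliance_rate(pairs, bands=PLACEHOLDER_BANDS):
    """Fraction of (reference, estimate) pairs that fall within tolerance."""
    hits = sum(1 for ref, est in pairs if within_tolerance(ref, est, bands))
    return hits / len(pairs)


pairs = [(5.0, 6.5), (20.0, 25.0), (50.0, 57.0)]
print(compliance_rate(pairs))
```

Unlike mean absolute error, a compliance rate of this kind gives no partial credit: an estimate either meets the labelling tolerance or it does not.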


Section 05

Key Findings: Clear Trade-off Between Accuracy and Efficiency

Accuracy ranking (under the EU standard): Hybrid LLM Pipeline > Few-shot LLM > TF-IDF > DeBERTa-v3. LLMs perform best in all nutrient categories.

Efficiency comparison:

Method           | Inference Latency         | Accuracy
TF-IDF           | Millisecond level         | Medium
DeBERTa-v3       | Hundred-millisecond level | Low
LLM              | Second level              | Highest
Hybrid Pipeline  | Second level              | Highest

Conclusion: Higher accuracy comes at the cost of real-time performance.

Section 06

Practical Implications: Model Selection Recommendations for Different Scenarios

  1. Real-time applications (for example, real-time scanning on mobile devices): choose TF-IDF for its millisecond-level response;
  2. Precision-first applications (for example, medical nutrition monitoring): choose an LLM, which meets the regulatory accuracy standard;
  3. Hybrid deployment: return a TF-IDF estimate immediately, then let an LLM refine it in the background, so the result improves progressively.
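
The hybrid deployment pattern can be sketched with Python's standard `concurrent.futures` module: the caller gets the fast estimate at once and a future for the refined result. This is an illustrative arrangement, not the paper's implementation; `progressive_estimate` and both toy models are hypothetical names with invented values.

```python
from concurrent.futures import ThreadPoolExecutor


def progressive_estimate(recipe_text, fast_model, slow_model, executor):
    """Return (quick_result, future_of_refined_result)."""
    quick = fast_model(recipe_text)
    # The slow model runs in the background; the caller can display `quick`
    # right away and swap in the refined values when the future completes.
    pending = executor.submit(slow_model, recipe_text, quick)
    return quick, pending


# Toy stand-ins; a real slow model would call an LLM here.
def toy_fast(text):
    return {"energy_kcal": 400.0}


def toy_slow(text, draft):
    return {"energy_kcal": draft["energy_kcal"] + 35.0}


with ThreadPoolExecutor(max_workers=1) as pool:
    quick, pending = progressive_estimate("lentil soup", toy_fast, toy_slow, pool)
    print("shown immediately:", quick)
    print("shown after refinement:", pending.result(timeout=5))
```

A thread pool suits this case because an LLM call is typically I/O-bound; the timeout on `result` keeps a stalled backend from blocking the interface indefinitely.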

Section 07

Study Limitations and Future Research Directions

Current limitations: The models were evaluated only on FoodBench-QA, LLM costs were not analyzed, and multilingual input is not supported.

Future directions: Domain-adaptive fine-tuning of LLMs, evaluation of efficient small LLMs, integration with structured nutrition databases, and design of interactive systems.