# Do Large Models Favor Their Own Ecosystems? An Empirical Study of Vertical Integration Bias (VIB)

> This paper is the first to systematically quantify the "Vertical Integration Bias (VIB)" of large language models in code generation. It finds that 6 of the 10 vendor-associated models tested exhibit statistically significant bias, that agent workflows amplify the bias to 39.2 percentage points, and that early ecosystem choices persist at a rate of 90.3%.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-05-27T14:17:06.000Z
- Last activity: 2026-05-28T03:48:50.363Z
- Popularity: 137.5
- Keywords: vertical integration bias, VIB, code generation, large language models, agent workflows, ecosystem lock-in, VIBench, model bias
- Page link: https://www.zingnex.cn/en/forum/thread/vib
- Canonical: https://www.zingnex.cn/forum/thread/vib

---

## Introduction: 6 of 10 Vendor-Associated Models Show Significant VIB, and Agent Workflows Amplify It

The study examines whether large language models favor their parent company's technical ecosystem when generating code. Its key findings: 6 of the 10 vendor-associated models exhibit statistically significant bias; agent workflows amplify the bias to 39.2 percentage points; and early ecosystem choices persist at a rate of 90.3%. The authors also developed VIBench, the first standardized benchmark for measuring VIB, and discuss how the bias may affect developer choice and technology lock-in.

Original authors: arXiv author team | Source: arXiv | Publication date: May 27, 2026 | Original link: http://arxiv.org/abs/2605.28515v1

## Background: Potential Impacts of Ecosystem Favoritism in Large Models

Large language models have become core tools for software development, yet the question of whether a model favors its parent company's technical ecosystem has been largely overlooked. If such a bias exists, its effects could be far-reaching: it narrows developer choice (developers are steered toward specific platforms without realizing it), deepens technology lock-in (migration costs rise), harms fair competition (smaller solutions are ignored), and weakens model credibility (recommendations reflect interests rather than merit).

## Methodology: Design Details of the VIBench Benchmark

The study developed the VIBench benchmark to quantify VIB:
- **Test Scenarios**: Covers 20 real-world software integration scenarios (choices among competing solutions in cloud platforms, databases, front-end frameworks, etc.);
- **Evaluation Dimensions**: Direct code generation (which solution the model picks when it writes the code in a single pass) and agent workflows (bias in multi-step, tool-calling scenarios);
- **Model Lineup**: 13 cutting-edge models (10 vendor-associated models + 3 neutral control models).
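
The summary reports bias in percentage points but does not give the formula. As a minimal sketch, assume VIB is the gap between how often a vendor's model picks its own ecosystem and how often the neutral control models pick that same ecosystem. The `Trial` record, the function names, and the sample data below are hypothetical and are not taken from VIBench.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    model: str      # model that generated the code
    scenario: str   # integration scenario identifier
    chosen: str     # ecosystem the generated code committed to


def selection_rate(trials, model, ecosystem):
    """Share of a model's trials in which it chose `ecosystem` (0.0 to 1.0)."""
    own = [t for t in trials if t.model == model]
    if not own:
        raise ValueError(f"no trials recorded for model {model!r}")
    return sum(t.chosen == ecosystem for t in own) / len(own)


def vib_points(trials, vendor_model, vendor_ecosystem, control_models):
    """Assumed metric: vendor rate minus mean control rate, in percentage points."""
    vendor_rate = selection_rate(trials, vendor_model, vendor_ecosystem)
    control_rates = [
        selection_rate(trials, m, vendor_ecosystem) for m in control_models
    ]
    baseline = sum(control_rates) / len(control_rates)
    return 100.0 * (vendor_rate - baseline)


def make_trials(model, choices):
    """Helper for the toy data: one trial per scenario, in order."""
    return [Trial(model, f"s{i}", c) for i, c in enumerate(choices)]


# Hypothetical toy data: four scenarios, one vendor model, two controls.
trials = (
    make_trials("vendor-a-model", ["cloud-a", "cloud-a", "cloud-a", "cloud-b"])
    + make_trials("neutral-1", ["cloud-a", "cloud-b", "cloud-b", "cloud-b"])
    + make_trials("neutral-2", ["cloud-a", "cloud-a", "cloud-b", "cloud-b"])
)

print(vib_points(trials, "vendor-a-model", "cloud-a", ["neutral-1", "neutral-2"]))
# 37.5
```

Using the neutral models as the baseline keeps a genuinely popular ecosystem from being counted as bias: only the excess over what vendor-independent models choose is attributed to the vendor relationship.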

## Key Findings: 6 of 10 Vendor Models Show Significant VIB, Agent Workflows Amplify Bias to 39.2 Percentage Points

1. **Bias in Direct Generation**: 6 out of 10 vendor-associated models show statistically significant VIB, with the maximum bias reaching 18.8 percentage points; neutral control models have no systematic bias;
2. **Agent Workflow Amplification Effect**: The bias jumps to 39.2 percentage points, and early choices in multi-step tasks form path dependence;
3. **Early Choice Lock-in**: Early ecosystem choices in agent workflows persist at a rate of 90.3%, and their influence carries over into subsequent, unrelated tasks.
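
The summary does not say how "statistically significant" or "persistence rate" were computed. The sketch below shows one plausible reading of each, under stated assumptions: a pooled two-proportion z-test comparing a vendor model's own-ecosystem rate with a control rate, and persistence measured as the share of later workflow steps that keep the ecosystem chosen at the first step. Both function names and all inputs are hypothetical.

```python
import math


def two_proportion_z(k1, n1, k2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value).

    k1/n1: own-ecosystem picks and total trials for the vendor model.
    k2/n2: the same counts for the control baseline.
    """
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        raise ValueError("both samples are all-success or all-failure")
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def persistence_rate(runs):
    """Assumed definition: fraction of later steps that keep the first choice.

    `runs` is a list of workflows, each a list of ecosystem choices by step.
    Workflows with fewer than two steps contribute nothing.
    """
    kept = total = 0
    for steps in runs:
        if len(steps) < 2:
            continue
        first, later = steps[0], steps[1:]
        kept += sum(choice == first for choice in later)
        total += len(later)
    if total == 0:
        raise ValueError("need at least one run with two or more steps")
    return kept / total


# Hypothetical counts: 60/100 own-ecosystem picks vs. 40/100 for controls.
z, p = two_proportion_z(60, 100, 40, 100)
print(f"z = {z:.3f}, p = {p:.4f}")

# Hypothetical agent runs: each list is the ecosystem used at each step.
runs = [["a", "a", "a"], ["a", "b", "a"], ["b", "b"]]
print(persistence_rate(runs))  # 0.8
```

The paper may use a different test or a different persistence definition (for example, per-run rather than per-step); the point here is only to make the two reported quantities concrete.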

## Cause Analysis: Potential Sources of VIB

Possible causes of VIB include:
- **Training Data Bias**: Official documentation, open-source code, and community discussions about the vendor's ecosystem make up a larger share of the training corpus;
- **Alignment and Fine-tuning**: Post-training alignment may strengthen the tendency to "recommend known reliable solutions", and internal testing may use the vendor's own products as benchmarks;
- **Commercial Considerations**: Recommending the vendor's own products aligns with its commercial interests, and the model is more familiar with the vendor's APIs and documentation (the paper does not assert that this is intentional design).

## Impacts and Recommendations: What Developers, Vendors, and Regulators Can Do

- **For Developers**: Think critically (do not blindly accept the first recommended solution), state preferences explicitly (name the solutions you want in the prompt), and cross-validate with multiple models;
- **For Model Providers**: Disclose known biases transparently, balance how ecosystems are represented in training data, and introduce neutrality checks;
- **For Industry Regulators**: VIB may raise antitrust concerns, such as whether it constitutes unfair competition and whether it calls for regulation similar to that applied to search engine self-preferencing.
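
The advice to cross-validate with multiple models can be sketched as a simple tally. The example below assumes you have already asked several models the same integration question and recorded which ecosystem each one recommended. It then reports the most common answer and flags any model that recommends its own vendor's ecosystem against that consensus. The function name, model names, and ecosystem names are hypothetical, and no real model API is called.

```python
from collections import Counter


def cross_validate(recommendations, vendor_of):
    """Compare recommendations across models.

    recommendations: model name -> ecosystem it recommended.
    vendor_of: model name -> ecosystem owned by that model's vendor
               (models without a vendor ecosystem are simply absent).
    Returns (consensus, votes, flagged), where `flagged` lists models that
    recommended their own vendor's ecosystem while disagreeing with the
    consensus. On a tie, the ecosystem seen first wins.
    """
    if not recommendations:
        raise ValueError("no recommendations to compare")
    counts = Counter(recommendations.values())
    consensus, votes = counts.most_common(1)[0]
    flagged = sorted(
        model
        for model, eco in recommendations.items()
        if eco == vendor_of.get(model) and eco != consensus
    )
    return consensus, votes, flagged


# Hypothetical answers to one question, e.g. "which object store should I use?"
recommendations = {
    "vendor-a-model": "cloud-a",
    "vendor-b-model": "cloud-b",
    "neutral-1": "cloud-b",
    "neutral-2": "cloud-b",
}
vendor_of = {"vendor-a-model": "cloud-a", "vendor-b-model": "cloud-b"}

consensus, votes, flagged = cross_validate(recommendations, vendor_of)
print(consensus, votes, flagged)  # cloud-b 3 ['vendor-a-model']
```

A flag is only a prompt for a closer look: a self-recommendation can still be the right technical choice, and a consensus can itself be skewed if most of the models consulted share a vendor.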

## Limitations and Future Directions

- **Limitations**: Static testing (fixed scenarios cannot capture dynamic interactions), English-centric (VIB in other languages is unclear), limited technical domains (only 20 scenarios);
- **Future Directions**: Expand to more languages and regions, explore debiasing training methods, and study effective strategies for users to counter VIB.