Zing Forum

Open-Source Large Model Supply Chain Risk Assessment: New Challenges for Enterprise AI Security

A supply chain risk assessment report on three open-source large language models on the HuggingFace platform reveals key risk dimensions in AI model deployment, including access control, file format security, and publisher traceability.

Tags: AI Security · Supply Chain Risk · Open-Source Large Models · HuggingFace · CISO · Risk Assessment · Safetensors · Model Provenance
Published 2026-04-29 08:44 · Recent activity 2026-04-29 10:22 · Estimated read: 8 min

Section 01

Introduction: Core Insights from Open-Source Large Model Supply Chain Risk Assessment

This article, based on a supply chain risk assessment of three open-source large language models on the HuggingFace platform, reveals key risk dimensions in AI model deployment, including access control, file format security, and publisher traceability. The rapid adoption of open-source large models has introduced new attack surfaces, and their supply chain security differs fundamentally from traditional software—providing critical references for CISOs and security decision-makers.


Section 02

Background: Real-World Cases of AI Supply Chain Attacks

AI supply chain risks are not theoretical assumptions—real incidents have occurred:

  1. PyTorch Dependency Confusion Attack (December 2022): Attackers published a malicious package with a higher version number on PyPI, causing Linux nightly builds to install it automatically and exfiltrate sensitive data.
  2. Malicious Models Found on HuggingFace (2024): Researchers discovered over 100 malicious models using the Pickle format (which executes arbitrary code when loaded). As of April 2025, Protect AI's Guardian scanner had flagged over 352,000 unsafe versions across more than 50,000 models. These cases highlight the need for a dedicated assessment framework for AI supply chain security.
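
To make the Pickle risk concrete, the sketch below is an illustrative proof of concept (not taken from the report): the `__reduce__` deserialization hook lets a tampered model file run arbitrary code the moment it is loaded, before any weights are touched.

```python
import os
import pickle
import tempfile

marker = os.path.join(tempfile.gettempdir(), "pickle_poc_marker.txt")

class MaliciousCheckpoint:
    # __reduce__ tells pickle how to rebuild the object on load; a
    # tampered model file can return any callable here (os.system,
    # exec, ...). This deliberately benign payload just drops a
    # marker file to prove code execution happened.
    def __reduce__(self):
        return (exec, (f"open({marker!r}, 'w').write('executed')",))

blob = pickle.dumps(MaliciousCheckpoint())  # what an attacker would ship
pickle.loads(blob)                          # merely loading runs the payload

print(os.path.exists(marker))  # → True, with no tensor data involved
```

Real payloads hide inside multi-gigabyte `.bin`/`.pkl` checkpoint files, which is why scanning and format policy, not manual review, are the practical defenses.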

Section 03

Assessment Objects and Risk Dimension Framework

Assessment Objects: Three representative models on the HuggingFace platform:

google/gemma-2-2b | Publisher: Google (US large tech company) | Access control: authentication required | License: Gemma custom license | Overall risk: Low-Medium
TheBloke/Mistral-7B-Instruct-v0.2-GGUF | Publisher: TheBloke (individual community contributor) | Access control: fully open | License: Apache 2.0 | Overall risk: High
tiiuae/falcon-7b | Publisher: TII (UAE government-affiliated research institution) | Access control: fully open | License: Apache 2.0 | Overall risk: Medium-High

Risk Dimension Framework: Five core dimensions:

  1. Access Control: Whether authentication is required for download; whether a vulnerability notification mechanism exists
  2. License Compliance: Whether the license is a standard type, whether commercial use is permitted, and whether repackager licenses are accurate
  3. File Format Security: Whether Safetensors (no code execution on load) is used rather than legacy formats like Pickle
  4. Publisher Traceability: Builder identity, responsible entity, geopolitical risks, single points of failure
  5. Version Timeliness: Whether the model is up to date, security patch frequency, and security issue notification mechanism
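
These five dimensions lend themselves to a simple scorecard. The sketch below is an illustrative scoring scheme; the boolean checks, thresholds, and rating labels are my assumptions for demonstration, not values from the report.

```python
from dataclasses import dataclass

@dataclass
class ModelRisk:
    gated_download: bool            # 1. access control
    standard_license: bool          # 2. license compliance
    safetensors_only: bool          # 3. file format security
    organizational_publisher: bool  # 4. publisher traceability
    recently_updated: bool          # 5. version timeliness

    def score(self) -> int:
        """Count unmet controls: 0 = lowest risk, 5 = highest."""
        checks = [self.gated_download, self.standard_license,
                  self.safetensors_only, self.organizational_publisher,
                  self.recently_updated]
        return sum(not ok for ok in checks)

    def rating(self) -> str:
        # assumed mapping from unmet-control count to a rating label
        return {0: "Low", 1: "Low-Medium", 2: "Medium",
                3: "Medium-High"}.get(self.score(), "High")

# e.g. a community GGUF repack: ungated, individual maintainer,
# non-Safetensors format, stale version
repack = ModelRisk(False, True, False, False, False)
print(repack.rating())  # → High
```

A real scorecard would weight the dimensions unevenly (file format failures arguably outweigh staleness), but even a flat count makes cross-model comparison repeatable.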

Section 04

Key Findings: Risk Analysis of the Three Models

Google Gemma 2 2B (Low-Medium Risk): Strengths include the authentication requirement, Safetensors format, and a clear responsible entity; issues include a custom license requiring legal review and an outdated version.

TheBloke Mistral 7B GGUF (High Risk): Risk points include zero access control, no organizational accountability behind the individual maintainer, no security scanning, and no patch cadence. With 57,669 monthly downloads, its impact is wide; enterprises are advised not to use it directly.

TII Falcon 7B (Medium-High Risk): Technically reliable but carries geopolitical risk; requires third-party risk analysis, CISO approval, and legal review. Its version is also outdated.


Section 05

Cross-Model Risk Themes: Common Issues in the Open-Source AI Ecosystem

Through analysis of the three models, five common risks were identified:

  1. Authentication Threshold Gap: Only one of the three models requires authentication for download; ungated models lack any security notification channel
  2. File Format Security Control: Safetensors is a core security measure; Pickle has code execution risks
  3. Single Point of Failure for Community Models: Compromised HuggingFace accounts can replace widely downloaded model files
  4. Geopolitical Traceability Risk: Publishers affiliated with foreign governments require third-party risk analysis
  5. Version Timeliness Obligation: Outdated models have no formal security announcements; updates need to be monitored
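
The Safetensors point is checkable mechanically: per the safetensors format specification, a file begins with an 8-byte little-endian header length followed by that many bytes of UTF-8 JSON, so it can be inspected without deserializing (and therefore without executing) anything. A minimal structural check, where the 100 MB header cap is an assumed guardrail rather than part of the spec:

```python
import json
import os
import struct
import tempfile

def looks_like_safetensors(path: str) -> bool:
    """Cheap structural check: a .safetensors file starts with an
    8-byte little-endian header length, then that many bytes of
    UTF-8 JSON. Nothing is executed during inspection, unlike
    Pickle, which runs opcodes as it reads."""
    with open(path, "rb") as f:
        prefix = f.read(8)
        if len(prefix) < 8:
            return False
        (header_len,) = struct.unpack("<Q", prefix)
        if header_len > 100_000_000:  # assumed cap on plausible headers
            return False
        try:
            json.loads(f.read(header_len).decode("utf-8"))
        except (UnicodeDecodeError, ValueError):
            return False
        return True

# demo: write a minimal valid header (metadata only, no tensors)
demo = os.path.join(tempfile.gettempdir(), "demo.safetensors")
header = json.dumps({"__metadata__": {"format": "pt"}}).encode()
with open(demo, "wb") as f:
    f.write(struct.pack("<Q", len(header)) + header)

print(looks_like_safetensors(demo))  # → True
```

A Pickle file fed to the same function fails the check, because its leading opcode bytes do not form a plausible header length followed by valid JSON.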

Section 06

Six Priority Recommendations for CISOs

  1. Establish a formal AI model review process; standardized security reviews are required before deployment
  2. Mandate the use of Safetensors or equivalent secure formats
  3. Prioritize model sources that require authentication
  4. Apply standard TPRM (Third-Party Risk Management) processes to AI publishers (including foreign government-affiliated entities)
  5. Establish a model version monitoring plan; treat outdated models as unpatched software
  6. Train AI teams on supply chain risk awareness
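
Recommendation 2 can be enforced as an automated pre-deployment gate. The sketch below audits a downloaded model snapshot for pickle-based weight files; the suffix lists are my assumptions and should be adapted to local policy.

```python
import pathlib

UNSAFE_SUFFIXES = {".bin", ".pkl", ".pt", ".pth", ".ckpt"}  # pickle-based
SAFE_SUFFIXES = {".safetensors", ".gguf"}                    # no code on load

def audit_snapshot(model_dir: str) -> list[str]:
    """Return paths of weight files that violate the format policy:
    an empty list means the snapshot passes the Safetensors gate."""
    root = pathlib.Path(model_dir)
    return sorted(str(p) for p in root.rglob("*")
                  if p.suffix.lower() in UNSAFE_SUFFIXES)

# usage sketch (path is hypothetical):
# violations = audit_snapshot("models/mistral-7b")
# if violations: block deployment and escalate for review
```

Wiring a check like this into CI or the model registry makes the policy self-enforcing, rather than relying on reviewers to notice a stray `.bin` among dozens of shards.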

Section 07

Conclusion: A New Paradigm for AI Supply Chain Security

Open-source large model supply chain security is a new frontier for enterprise security. Characteristics such as very large file sizes and opaque internal structure mean traditional software supply chain tools apply poorly. This report provides a reusable assessment framework; enterprises need to build dedicated AI supply chain risk assessment capabilities to secure their AI transformation.