Zing Forum


Automated Privacy Policy Evaluation Using Large Language Models: An Empirical Study Integrating LegalBERT and LLaMA

This article introduces a bachelor's thesis study that explores how to use LegalBERT and LLaMA 3 8B models for automated classification and scoring of privacy policies, providing a practical technical solution for privacy compliance reviews.

Tags: Privacy Policy · Large Language Models · LegalBERT · LLaMA · Text Classification · Privacy Compliance · LoRA Fine-tuning · Natural Language Processing
Published 2026-04-07 17:44 · Recent activity 2026-04-07 17:48 · Estimated read: 6 min

Section 01

[Introduction] Overview of the Empirical Study: Automated Privacy Policy Evaluation with LegalBERT and LLaMA

This article presents a bachelor's thesis study that explores the use of LegalBERT (a BERT variant pre-trained on legal text) and the LLaMA 3 8B model for automated classification and scoring of privacy policies. It aims to address the time-consuming, labor-intensive nature of manual reviews and to provide a practical technical solution for privacy compliance checks. The study compares three technical approaches (LegalBERT fine-tuning, LoRA fine-tuning of LLaMA 3 8B, and zero-shot inference) to assess each approach's effectiveness and generalization ability.


Section 02

Research Background and Problem Statement

In the digital age, privacy policies are often obscure and difficult to understand, and manual reviews are too slow to keep pace with the sheer volume of policies that need checking. With the introduction of regulations like the GDPR and CCPA, privacy compliance reviews have become increasingly important. How to evaluate privacy policies efficiently and accurately has therefore become a key issue in both academia and industry.


Section 03

Research Objectives and Technical Approaches

The core objective is to build an intelligent system that automatically analyzes privacy policies, identifies categories of privacy practices, and outputs quantitative scores. Three technical approaches are compared experimentally:
1. Fine-tuning LegalBERT on a privacy policy corpus;
2. Parameter-efficient fine-tuning of LLaMA 3 8B with LoRA;
3. Zero-shot inference with LLaMA 3 8B (no training required).
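To make the second approach concrete, a LoRA setup for LLaMA 3 8B can be declared with Hugging Face's peft library roughly as follows. This is a minimal sketch: the rank, alpha, dropout, and target modules shown here are common illustrative defaults, not the thesis's actual configuration.

```python
from peft import LoraConfig, TaskType

# Illustrative LoRA hyperparameters (assumptions, not the study's settings).
# Only small low-rank adapter matrices on the attention projections are
# trained; the 8B base weights stay frozen, which is what makes the
# fine-tuning "parameter-efficient".
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,        # LLaMA 3 8B is a causal language model
    r=8,                                 # rank of the low-rank adapter matrices
    lora_alpha=16,                       # scaling factor applied to the adapter output
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], # attach adapters to the query/value projections
)
# The config is then attached to a loaded base model via
# peft.get_peft_model(base_model, lora_config) before training.
```

In practice only a fraction of a percent of the model's parameters are trainable under such a configuration, which is why LoRA fine-tuning of an 8B model is feasible on a single GPU.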


Section 04

Dataset and Model Training Details

The OPP-115 corpus is used: 115 real website privacy policies labeled into 8 categories of privacy practices (first-party information collection and use, third-party information collection and use, information sharing and disclosure, user choice and access rights, data retention and deletion, security protection measures, policy change notification mechanisms, and child privacy protection). Preprocessing includes label frequency analysis, annotation integration (labels are kept only at a 0.75 annotator-agreement threshold), and dataset partitioning. LegalBERT serves as the baseline model, trained with supervised learning to recognize privacy practice features; LLaMA 3 8B is fine-tuned with LoRA, reaching its best performance at 0.5 epochs (checkpoint-100), where training is stopped to avoid overfitting.
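The thesis does not spell out how the 0.75-threshold annotation integration works, but a plausible majority-vote scheme over multiple annotators can be sketched in a few lines (category names here are abbreviated for illustration):

```python
from collections import Counter

def integrate_annotations(annotator_labels, threshold=0.75):
    """Keep a category label only if at least `threshold` of the
    annotators assigned it to this policy segment.

    annotator_labels: list of label sets, one set per annotator.
    Returns the sorted list of surviving labels.
    """
    n = len(annotator_labels)
    counts = Counter(label for labels in annotator_labels for label in set(labels))
    return sorted(label for label, count in counts.items() if count / n >= threshold)

# Example: three annotators label one policy segment.
segment_labels = [
    {"First Party Collection/Use", "Data Security"},
    {"First Party Collection/Use"},
    {"First Party Collection/Use", "Data Retention"},
]
# Only the label all three annotators agree on (3/3 = 1.0 >= 0.75) survives;
# the two 1/3 labels are discarded as noise.
print(integrate_annotations(segment_labels))  # → ['First Party Collection/Use']
```

A stricter threshold trades recall for label quality, which matters when the integrated labels then serve as supervised training targets.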


Section 05

Scoring Mechanism and Aggregation Strategy

A rule-based scoring pipeline is designed: first, identify the optimal attribute values for each privacy practice category, then synthesize them into an overall privacy friendliness score ranging from 0 to 10 (higher scores indicate more comprehensive protection and greater transparency). The aggregation strategy not only focuses on whether a category of practice is mentioned but also considers the specific way it is phrased (e.g., the clarity of data sharing clauses).
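The exact scoring rules are not given in this summary, so the following is a minimal sketch of one way such an aggregation could work: per-category scores in [0, 1] (reflecting how favorably each practice is phrased) are combined by a weighted mean and rescaled to 0-10. The category keys follow the eight OPP-115 practices described above; the weights are illustrative assumptions, not the thesis's values.

```python
# Illustrative weights (assumptions): collection, sharing, and user-control
# practices are weighted more heavily than procedural categories.
CATEGORY_WEIGHTS = {
    "first_party_collection_use": 1.5,
    "third_party_collection_use": 1.5,
    "sharing_disclosure": 1.5,
    "user_choice_access": 1.5,
    "retention_deletion": 1.0,
    "security_measures": 1.0,
    "policy_change_notice": 1.0,
    "child_privacy": 1.0,
}

def privacy_friendliness_score(category_scores, weights=CATEGORY_WEIGHTS):
    """Weighted mean of per-category scores in [0, 1], rescaled to 0-10.

    A category missing from `category_scores` (i.e. not addressed by the
    policy at all) contributes 0, penalizing incomplete policies.
    """
    total_weight = sum(weights.values())
    weighted_sum = sum(w * category_scores.get(cat, 0.0) for cat, w in weights.items())
    return round(10 * weighted_sum / total_weight, 1)

# A policy that fully addresses every practice scores 10.0;
# one that mentions nothing scores 0.0.
print(privacy_friendliness_score({cat: 1.0 for cat in CATEGORY_WEIGHTS}))  # → 10.0
```

Treating an unmentioned category as 0 rather than ignoring it is one concrete way to reward transparency, matching the idea that the score should reflect not just whether a practice is mentioned but how completely the policy covers it.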


Section 06

Zero-shot Reasoning and Generalization Ability Verification

Zero-shot inference requires no labeled data or training, making it highly practical for scenarios with limited resources or rapid verification. The generalization test applies the model to modern privacy policy texts to verify its performance in real-world settings (the OPP-115 dataset dates from 2016, so the model must adapt to how policies are written today). The generalization pipeline expects input texts to be pre-split into segments with the "|||" delimiter, along with accompanying annotation files.
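The "|||"-delimited input format described above is straightforward to prepare; a small helper like the following (a sketch, not the thesis's actual preprocessing code) splits a policy export into clean segments ready for per-segment classification:

```python
def segment_policy(raw_text):
    """Split a '|||'-delimited policy export into clean, non-empty segments."""
    return [seg.strip() for seg in raw_text.split("|||") if seg.strip()]

policy = (
    "We collect your email address. ||| "
    "We share data with advertising partners. ||| "
    "You may request deletion at any time."
)
# Each segment is then classified independently (e.g. by zero-shot prompting),
# and the per-segment labels feed the scoring pipeline.
print(segment_policy(policy))  # → three cleaned segments
```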


Section 07

Research Significance and Future Outlook

The study provides a systematic solution for automated privacy policy evaluation and highlights the trade-offs between domain-specific models and general-purpose large models, along with the scenarios each suits best. Application scenarios include enterprise compliance self-checks, regulatory review assistance, user rights protection, and academic research tooling. Future directions include multilingual evaluation, dynamic policy monitoring, and semantic reasoning over knowledge graphs.