Cornell CS 4782 Course Project: Reproduction and Validation of the LoRA Low-Rank Adaptation Method

This project fully reproduces the core experiments of the LoRA paper, validates the effectiveness of low-rank adaptation on the GPT-2 Small model and E2E NLG dataset, and demonstrates that using only 0.06% to 0.24% of trainable parameters can achieve performance close to full fine-tuning.

Tags: LoRA · Low-Rank Adaptation · Parameter-Efficient Fine-Tuning · GPT-2 · E2E NLG · Cornell · Text Generation · BLEU · ROUGE · Large Language Models
Published 2026-05-13 01:25 · Last activity 2026-05-13 01:32 · Estimated read: 6 min

Section 01

Cornell CS 4782 Course Project: Guide to Reproduction and Validation of the LoRA Low-Rank Adaptation Method

This is the final project for Cornell University's CS 4782 course. It reproduces the core experiments of the LoRA (Low-Rank Adaptation) paper and validates LoRA's effectiveness for parameter-efficient fine-tuning. Using the GPT-2 Small model and the E2E NLG dataset, the results show that LoRA achieves performance close to full fine-tuning with only 0.06% to 0.24% of the trainable parameters, providing empirical support for the efficient adaptation of large language models.


Section 02

Research Background: Demand for Parameter-Efficient Fine-Tuning and the Proposal of LoRA

Full fine-tuning of a large language model updates hundreds of millions or even billions of parameters, which is costly in both compute and storage. LoRA instead freezes the pre-trained model and trains only low-rank adaptation matrices injected into the attention layers, which can reduce the number of trainable parameters by orders of magnitude while, in principle, maintaining performance. This project reproduces the results of Table 3 in the original LoRA paper, comparing full fine-tuning against LoRA on GPT-2 for the E2E NLG text-generation task, to test whether the theoretical advantage holds in practice.
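For reference, the core formulation from the original LoRA paper: the pre-trained weight W_0 stays frozen, only the low-rank factors A and B are trained, and alpha is a scaling hyperparameter.

```latex
% LoRA forward pass: frozen pretrained path plus a trainable low-rank update
h = W_0 x + \Delta W\, x
  = W_0 x + \frac{\alpha}{r}\, B A\, x,
\qquad B \in \mathbb{R}^{d \times r},\ A \in \mathbb{R}^{r \times k},\ r \ll \min(d, k)
% A is initialized from a zero-mean Gaussian and B to zero, so \Delta W = 0
% and training starts exactly from the pretrained model.
```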


Section 03

Experimental Design and Methods

The experiments use GPT-2 Small as the base model and the E2E NLG Challenge (a data-to-text generation task) as the evaluation benchmark. For the LoRA implementation, trainable low-rank matrices are injected into GPT-2's attention query and value projections while the base model's parameters stay frozen; three rank values (r=2, r=4, r=8) are tested, and a Sequential LoRA variant that increases the rank in phases is also explored. Performance is evaluated with the standard text-generation metrics BLEU and ROUGE-L (reported as percentages).
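To make the setup concrete, below is a minimal sketch of such a LoRA wrapper in PyTorch. This is not the project's actual code (which also has to handle GPT-2's packed attention projection, discussed in Section 05); the class name, arguments, and initialization scale are illustrative.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen nn.Linear with a trainable low-rank update (sketch)."""
    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # freeze the pretrained projection
        d_out, d_in = base.weight.shape
        # A: small Gaussian init, B: zeros -> the update starts as a no-op
        self.lora_A = nn.Parameter(torch.randn(rank, d_in) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(d_out, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # frozen path plus the scaled low-rank correction (B @ A) x
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

# Illustrative usage: adapt a 768 -> 768 projection with rank 4
layer = LoRALinear(nn.Linear(768, 768), rank=4)
out = layer(torch.randn(1, 768))             # shape: (1, 768)
```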


Section 04

Analysis of Core Experimental Results

The experimental results show that LoRA offers substantial parameter-efficiency advantages:

  • Full fine-tuning: 124.44 million trainable parameters (100%), BLEU 64.76, ROUGE-L 68.24;
  • LoRA r=2: 73.7k parameters (0.0592%), BLEU 64.20, ROUGE-L 67.39;
  • LoRA r=4: 147.5k parameters (0.1184%), BLEU 64.34, ROUGE-L 67.87;
  • LoRA r=8: 294.9k parameters (0.2364%), BLEU 65.21 (slightly above full fine-tuning), ROUGE-L 67.23 (within 1 point of full fine-tuning).

Sequential LoRA was competitive but did not significantly outperform standard LoRA with a fixed r=8. The LoRA parameter counts follow directly from GPT-2 Small's architecture, as checked below.
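As a sanity check, the reported counts can be derived from GPT-2 Small's public architecture (12 layers, hidden size 768) under the setup in Section 03, where LoRA is applied only to the query and value projections:

```python
# Back-of-the-envelope check of the trainable-parameter counts above.
# Assumes LoRA on W_q and W_v only, each a 768 x 768 matrix, in all 12 layers.
hidden, layers, adapted = 768, 12, 2      # adapted = W_q and W_v per layer
for r in (2, 4, 8):
    # each adapted matrix adds A (r x hidden) and B (hidden x r): 2 * hidden * r
    params = layers * adapted * 2 * hidden * r
    print(f"r={r}: {params:,} trainable parameters")
# r=2: 73,728   r=4: 147,456   r=8: 294,912  -- matching 73.7k / 147.5k / 294.9k
```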

Section 05

Implementation Challenges and Reproducibility

The main reproduction challenges were correctly wrapping GPT-2's packed attention projection and verifying that the base model's parameters were actually frozen; the team guarded the correctness of the LoRA layers with unit tests such as test_lora.py. The implementation uses the Hugging Face Transformers library and PyTorch, with dependencies managed via requirements.txt, and experiments ran on an A100 GPU in Google Colab. The project layout is clear: code/ holds the core implementation, data/ the datasets, results/ the outputs, and poster/ and report/ the course poster and report. Experiments are driven by Jupyter notebooks, and the E2E dataset is downloaded automatically to aid reproducibility.
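The freezing check is straightforward to sketch. GPT-2 packs query, key, and value into a single c_attn projection (a Conv1D whose output width is 3 × the hidden size), which is what makes the wrapping step easy to get wrong. The snippet below is only in the spirit of the project's test_lora.py, whose contents are not shown in the source; the LoRA-injection step is elided.

```python
# Sketch of a freeze-verification check (hypothetical, modeled on test_lora.py).
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("gpt2")
for _, p in model.named_parameters():
    p.requires_grad = False               # freeze every pretrained weight

# ... inject LoRA modules here, e.g. by wrapping the packed c_attn projection ...

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
assert all("lora" in n for n in trainable), "a base parameter is still trainable"

total = sum(p.numel() for p in model.parameters())
tuned = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable: {tuned:,} / {total:,} ({100 * tuned / total:.4f}%)")
```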


Section 06

Conclusions and Implications

This project validates the effectiveness of LoRA: with fewer than 0.25% of the trainable parameters, LoRA matches full fine-tuning on the E2E NLG task. Pedagogically, it gives students a deep understanding of the LoRA mechanism and the practical details of parameter-efficient fine-tuning; for research, it offers the community an independently validated LoRA implementation, strengthening confidence in the original paper's results. Open questions worth further exploration include the optimal growth schedule for Sequential LoRA, how rank selection depends on the task, and whether the efficiency gains carry over to larger models.