# Text Generation Web UI Containerized Deployment Solution: One-Click Launch of Multi-Backend Large Model Inference Environment

> A complete Docker image based on Ubuntu 22.04 LTS and CUDA 12.8.1, integrating development tools like Text Generation Web UI, Jupyter Lab, and code-server, supporting multiple LLM inference backends, and optimized for the RunPod cloud platform.

- 板块: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- 发布时间: 2026-04-04T08:14:56.000Z
- 最近活动: 2026-04-04T08:19:41.564Z
- 热度: 159.9
- 关键词: Docker, LLM, Text Generation Web UI, 容器化, GPU推理, RunPod, 模型部署, Gradio
- 页面链接: https://www.zingnex.cn/en/forum/thread/text-generation-web-ui
- Canonical: https://www.zingnex.cn/forum/thread/text-generation-web-ui
- Markdown 来源: floors_fallback

---

## Introduction / Main Post: Text Generation Web UI Containerized Deployment Solution: One-Click Launch of Multi-Backend Large Model Inference Environment

A complete Docker image based on Ubuntu 22.04 LTS and CUDA 12.8.1, integrating development tools like Text Generation Web UI, Jupyter Lab, and code-server, supporting multiple LLM inference backends, and optimized for the RunPod cloud platform.

## Background Introduction

With the rapid development of Large Language Models (LLMs), more and more developers and researchers need to quickly deploy model inference environments locally or in the cloud. However, configuring GPU drivers, CUDA toolchains, Python environments, and various inference frameworks is often time-consuming and error-prone. Containerization technology provides an elegant solution to this pain point.

Today we introduce the **text-generation-docker** project, a complete Docker image solution maintained by community developer ashleykleynhans. Based on the mature Text Generation Web UI project, it packages the large model inference environment into a ready-to-use container, reducing the deployment process from hours to minutes.

## Project Overview

This Docker image is designed specifically for GPU cloud platforms like RunPod, but it also works in any environment that supports NVIDIA Docker. The image uses Ubuntu 22.04 LTS as the base system, pre-installed with CUDA 12.8.1 and Python 3.13, ensuring compatibility with the latest GPU hardware and software ecosystem.

The core tech stack includes:

- **Base Environment**: Ubuntu 22.04 LTS + CUDA 12.8.1 + Python 3.13
- **Deep Learning Framework**: PyTorch 2.9.1
- **Core Application**: Text Generation Web UI v4.3.3 (Gradio-based web interface)
- **Development Tools**: Jupyter Lab, code-server (VS Code web version)
- **Auxiliary Tools**: runpodctl, OhMyRunPod, rclone, croc, etc.

## Core Capabilities of Text Generation Web UI

As the core component of the image, Text Generation Web UI is a feature-rich open-source project that provides an intuitive web interface for interacting with large language models. Its biggest feature is support for multiple inference backends, allowing users to flexibly choose based on model type and hardware conditions.

Supported backends include:

- **Transformers**: Official Hugging Face implementation with the best compatibility
- **llama.cpp**: Quantized inference solution optimized for consumer hardware
- **ExLlama**: Efficient inference focused on Llama series models
- **AutoGPTQ** and **AutoAWQ**: Support for GPTQ and AWQ quantization formats
- **TensorRT-LLM**: High-performance inference on NVIDIA GPUs

This multi-backend support means users can load and switch between different types of models in the same interface without configuring a separate environment for each model.

## Key Features of the Image

In addition to core model inference capabilities, this Docker image also integrates a wealth of development and operation tools to form a complete workflow:

## Multi-Port Service Architecture

The image exposes multiple service ports simultaneously, each with a clear purpose:

- **Port 3000**: Text Generation Web UI main interface
- **Port 5000**: OpenAI/Anthropic-compatible API interface
- **Port 7777**: code-server web-based code editor
- **Port 8888**: Jupyter Lab interactive development environment
- **Port 2999**: RunPod file upload service

This design allows users to complete the entire workflow from model inference to code development in a browser without installing any software locally.

## Flexible Environment Configuration

The image provides multiple environment variables to adjust runtime behavior:

- `VENV_PATH`: Custom Python virtual environment path
- `JUPYTER_LAB_PASSWORD`: Set access password for Jupyter Lab
- `DISABLE_AUTOLAUNCH`: Disable automatic Web UI launch (suitable for custom startup processes)
- `HF_TOKEN`: Configure Hugging Face token to access restricted models

## Log Management

The running logs of Text Generation Web UI are uniformly output to `/workspace/logs/textgen.log`, making it easy for users to view them in real time using the `tail -f` command, allowing monitoring of the running status without interrupting the service.
