KWALLM: A Large Language Model-Based Qualitative Text Analysis Tool for Social Science Research

KWALLM is a qualitative text analysis application developed using R and Shiny, enabling non-technical users to perform analysis tasks such as text classification, topic extraction, and sentiment scoring using large language models.

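Tags: qualitative research, text analysis, large language models, R, Shiny, social science, topic modeling, human-AI collaboration, PII redaction, computational social science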

Published 2026-06-07 04:45 | Recent activity 2026-06-07 04:49 | Estimated read 5 min

Section 02

Original Author and Source

Original Author/Maintainer: Kennispunt Twente
Source Platform: GitHub
Original Project Name: KWALLM
Original Link: https://github.com/KennispuntTwente/KWALLM
Release Date: June 2026

Section 03

Project Overview

KWALLM is a text analysis application designed for qualitative research, developed by Kennispunt Twente (Knowledge Center Twente, Netherlands). Built with R and the Shiny framework, it wraps large language models (LLMs) in a web interface so that social science researchers can analyze text efficiently without a programming background.

Section 04

Classification Analysis

Users predefine a list of categories, and the model classifies each text against that list. For example, product reviews can be categorized as "positive", "negative", or "neutral". Because the categories are fixed in advance, this mode suits research that already has a clear analytical framework.
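As a rough illustration of classification against a fixed category list, here is a minimal Python sketch. It is not KWALLM's implementation (KWALLM is written in R). The prompt wording, the single-label assumption, and the names `build_classification_prompt`, `classify`, and `fake_llm` are all hypothetical. The point it shows is that the model's answer is checked against the predefined list before it is accepted.

```python
from typing import Callable, Sequence


def build_classification_prompt(text: str, categories: Sequence[str]) -> str:
    """Ask the model to answer with exactly one of the predefined categories."""
    options = ", ".join(f'"{c}"' for c in categories)
    return (
        f"Classify the text into exactly one of these categories: {options}.\n"
        "Answer with the category name only.\n\n"
        f"Text: {text}"
    )


def classify(text: str, categories: Sequence[str], llm: Callable[[str], str]) -> str:
    """Send the prompt to `llm` and accept only answers from the category list."""
    raw = llm(build_classification_prompt(text, categories))
    answer = raw.strip().strip('"').lower()
    lookup = {c.lower(): c for c in categories}
    if answer not in lookup:
        raise ValueError(f"Model returned an unknown category: {raw!r}")
    return lookup[answer]


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call: a trivial keyword rule."""
    text = prompt.split("Text: ", 1)[1].lower()
    if "love" in text:
        return "Positive"
    if "broke" in text:
        return "negative"
    return "neutral"


CATEGORIES = ["positive", "negative", "neutral"]
print(classify("I love this kettle", CATEGORIES, fake_llm))
```

In practice `fake_llm` would be replaced by a call to whichever model provider is configured. The validation step stays the same either way.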

Section 05

Feature Scoring

Users define a specific feature (e.g., "level of positive emotion"), and the model scores each text by how strongly it exhibits that feature. This yields a finer-grained quantitative indicator than classification alone, making it suitable for research questions that measure degree rather than category.
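One way to picture feature scoring is as a prompt that asks for a number, followed by a strict check on the reply. The Python sketch below works under stated assumptions: the 0 to 100 scale, the prompt wording, and the function names `build_scoring_prompt` and `parse_score` are hypothetical and are not taken from KWALLM.

```python
import re


def build_scoring_prompt(text: str, feature: str, low: int = 0, high: int = 100) -> str:
    """Ask the model for a single integer score on an assumed low..high scale."""
    return (
        f"On a scale from {low} to {high}, how strongly does the text show "
        f"this feature: {feature}?\n"
        "Answer with one integer only.\n\n"
        f"Text: {text}"
    )


def parse_score(reply: str, low: int = 0, high: int = 100) -> int:
    """Extract the first integer from the reply and reject out-of-range values."""
    match = re.search(r"-?\d+", reply)
    if match is None:
        raise ValueError(f"No number found in reply: {reply!r}")
    score = int(match.group())
    if not low <= score <= high:
        raise ValueError(f"Score {score} is outside {low}..{high}")
    return score


print(parse_score("Score: 72"))
```

Rejecting out-of-range replies, rather than silently clamping them, keeps malformed model output visible to the researcher.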

Section 06

Topic Extraction

Without predefined categories, the model identifies topics in the texts and assigns labels on its own. The method builds on the work of Wanrooij, Manhar & Yang (2024) and Pham et al. (2023), and is reported to outperform traditional methods such as BERTopic on small datasets.

Section 07

Text Tagging

For qualitative coding, the model can mark the text segments that relate to a specific code. For example, given the code "color", the model highlights every segment that mentions a color (such as "yellow" in "The sun is yellow"). Users can define their own codes or let the LLM generate codes from the text. This mode is particularly suitable for long texts such as interview transcripts or focus group discussions.
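To make the highlighting step concrete, the Python sketch below maps segments proposed by a model back onto character positions in the source text and drops any segment that does not occur verbatim. It is an illustrative assumption about how such a step could work, not KWALLM's code, and `locate_segments` is a hypothetical name.

```python
from typing import List, Tuple


def locate_segments(text: str, segments: List[str]) -> List[Tuple[int, int, str]]:
    """Return (start, end, segment) for every verbatim occurrence in `text`.

    Segments that do not appear verbatim (for example, paraphrased or
    invented by the model) produce no span and are dropped.
    """
    spans = []
    for segment in segments:
        if not segment:
            continue  # an empty string would match everywhere
        start = text.find(segment)
        while start != -1:
            end = start + len(segment)
            spans.append((start, end, segment))
            start = text.find(segment, end)
    return sorted(spans)


text = "The sun is yellow and the sky is blue."
# "green" stands in for a segment the model made up.
for start, end, segment in locate_segments(text, ["yellow", "blue", "green"]):
    print(start, end, segment)
```

Working with character offsets instead of the returned strings lets an interface highlight the exact passage, and it keeps hallucinated segments out of the coded data.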

Section 08

Automatic PII Redaction

To meet research ethics requirements and data protection regulations (e.g., GDPR), KWALLM has two built-in layers for identifying and redacting personal information:

Basic Detection: Uses regular expressions to identify common PII such as email addresses, phone numbers, and Dutch postal codes
Advanced Detection: Integrates the GLiNER model, which runs locally, for model-based PII detection, so sensitive data is never sent to external APIs

This design ensures the privacy of research participants is protected while not compromising the quality of analysis.
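The regular-expression layer can be sketched as follows. The patterns below are simplified assumptions written for illustration: they are not the expressions KWALLM uses, and they will miss some formats and flag some false positives. The GLiNER layer is not shown.

```python
import re

# Illustrative patterns only (assumptions, not KWALLM's actual expressions).
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    # Dutch numbers: a 0, +31, or 0031 prefix followed by nine digits,
    # with optional spaces or hyphens between digits.
    "PHONE": re.compile(r"(?<!\w)(?:\+31|0031|0)[\s-]?\d(?:[\s-]?\d){8}(?!\w)"),
    # Dutch postal codes: four digits (no leading zero) and two letters.
    # Uppercase only, to avoid matching phrases such as "2024 we".
    "POSTCODE": re.compile(r"\b[1-9]\d{3}\s?[A-Z]{2}\b"),
}


def redact(text: str) -> str:
    """Replace each match with a bracketed placeholder naming the PII type."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


print(redact("Mail jan@example.nl or call 06-12345678, postcode 7511 AB."))
# Mail [EMAIL] or call [PHONE], postcode [POSTCODE].
```

Typed placeholders, as opposed to deleting matches outright, keep sentences readable for later analysis while removing the identifying value.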
