Reading

social-inference: An Experimental Platform for Social Reasoning Games Based on Large Language Models

Introducing the social-inference project, a Python project that implements social reasoning games using LLM agents to explore AI performance in deception, reasoning, and social interaction.

社交推理LLM狼人杀AI游戏欺骗检测多智能体OpenAIPython项目

Published 2026-04-20 14:44Recent activity 2026-04-20 14:59Estimated read 8 min

Section 01

Introduction / Main Floor: social-inference: An Experimental Platform for Social Reasoning Games Based on Large Language Models

Introducing the social-inference project, a Python project that implements social reasoning games using LLM agents to explore AI performance in deception, reasoning, and social interaction.

Section 02

Background: The Intersection of AI and Social Reasoning

Social reasoning games such as Werewolf/Mafia and Avalon have long been important experimental scenarios for studying human social intelligence, deception detection, and strategic reasoning. These games require players to make decisions under incomplete information—they must hide their identities while inferring the truth through observing others' words and actions.

With the continuous improvement of large language model capabilities, researchers have begun to explore the performance of AI systems in tasks requiring complex social reasoning. Can LLMs understand the subtleties of deception? Can they infer others' true intentions through dialogue? Can they maintain strategic consistency under pressure? These questions not only concern the boundaries of AI capabilities but also provide a new perspective for us to understand intelligence itself.

The social-inference project was born in this context. It combines classic social reasoning game mechanisms with modern LLM technology to create a unique AI behavior research platform.

Section 03

Project Overview

social-inference is an open-source Python project that implements the core mechanisms of social reasoning games and uses large language models as game participants. The project allows users to run fully automated AI battles and observe the performance of different models in scenarios involving deception, reasoning, and collaboration.

This tool is particularly suitable for AI researchers, game designers, and LLM enthusiasts. It provides a standardized testing environment for evaluating and comparing the capabilities of different language models in social reasoning tasks.

Section 04

1. Game Character System

The project implements the classic role assignment mechanism of social reasoning games:

Impostor: A role that needs to hide its identity and mislead other players.
Detective: A role with special information acquisition capabilities.
Doctor: A role that can protect other players.
Crewmate: An ordinary player who needs to find the Impostor through reasoning.

Each role has specific goals and abilities, which require LLM agents to not only understand the game rules but also adjust their strategies based on their role identity.

Section 05

2. Day-Night Cycle Mechanism

The game adopts the classic day-night alternation mechanism:

Day Phase:

All surviving players participate in public discussion.
Players share information, raise suspicions, and defend themselves.
Decide which player to eliminate through voting.

Night Phase:

Different roles perform their respective special actions.
The Impostor chooses an attack target.
The Detective investigates a player's identity.
The Doctor chooses a protection target.

This mechanism creates an environment of information asymmetry, which is the core source of tension in social reasoning games.

Section 06

3. LLM Agent Architecture

The project uses the OpenAI API to interact with language models, and each player is controlled by an independent LLM instance. Key designs include:

Independent Context: Each agent maintains its own dialogue history and reasoning state.
Role Prompts: Inject role identity and goals into the model through system prompts.
Memory Management: Agents need to remember previous discussion content and voting history.
Strategy Consistency: The model needs to maintain strategic coherence across multiple rounds of dialogue.

Section 07

4. Discussion and Voting System

The core interaction of the game occurs in the discussion and voting sessions:

Discussion Phase:

Agents generate natural language speeches.
Speeches can include accusations, defenses, information sharing, or strategic suggestions.
Other agents observe and update their trust assessments of other players.

Voting Phase:

Agents make voting decisions based on accumulated information.
The voting result determines which player is eliminated.
The true identity of the eliminated player is revealed.

Section 08

System Requirements

Running social-inference requires the following environment:

Python 3.x: The main programming language of the project.
OpenAI API Key: Used to interact with language models.
openai library: The official Python client.
Standard Library Dependencies: argparse, json, os, random, re, time, collections, concurrent.futures, datetime.

Continue Reading

Keep going with more reads from the same topic.

Nornir MCP Server: An Enterprise-Grade Bridge for Integrating Large Language Models into Network Automation

Nornir MCP Server is an enterprise-level server based on the Model Context Protocol (MCP). It seamlessly integrates large language models (such as Claude) with the Nornir network automation framework, supporting natural language orchestration for multi-vendor network devices (Cisco, Arista, Juniper, etc.), and providing production-grade features like a dual-engine architecture (NAPALM + Netmiko), intelligent filtering, and a secure sandbox.

Recent activity 2026-05-06 20:51

Bibliothèque Française LLM: A French Public Domain Literature Index System Optimized for Large Language Models

Bibliothèque Française LLM is a structured indexing and annotation project for French public domain literature designed specifically for large language models (LLMs). It integrates multiple authoritative sources such as DraCor, Common Corpus, and Wikisource, providing metadata indexing categorized by genre, author, and era, as well as in-depth annotations for dramatic texts (including characters, lines, stage directions, etc.). Its aim is to enable LLMs to efficiently read and understand classic French literary works.

Recent activity 2026-05-06 20:50

Splinter: A Lock-Free Zero-Copy Shared Memory KV and Vector Storage Library That Eliminates Socket and Memcpy Overhead for LLM Inference

Splinter is a minimalist, high-performance key-value (KV) and vector storage system enabling zero-latency inter-process communication via shared memory and atomic operations. With only 766 lines of core code, it supports millions of operations per second and 768-dimensional vector storage, offering a new architectural approach for local LLM inference and data-intensive applications.

Recent activity 2026-04-03 08:49

Building an AWS Generative AI Application from Scratch: EC2 + Bedrock Hands-On Tutorial

A complete cloud-native AI application development guide for beginners, building a simple generative AI chatbot using Amazon EC2, Apache, Python CGI, and Amazon Bedrock, covering architecture design, IAM permission configuration, security best practices, and cost optimization suggestions.

Recent activity 2026-06-02 19:49