Generative AI as Backup Productivity: Replication of Research on System-Level Benefits Under Labor Constraints

This article introduces the replication code package for a 2026 PNAS Nexus paper that examines the economic effects of using generative AI as a backup for human agents in telemarketing. The paper finds that although AI underperforms human agents on individual tasks, deploying it as a backup improves overall system benefits.

Tags: generative AI, labor economics, productivity, call center, human-AI collaboration, system-level analysis, replication study
Published 2026-06-01 09:41 | Recent activity 2026-06-01 09:56 | Estimated read 10 min

Section 01

Replicating the Generative AI as Backup Productivity Study: Core Insights

This article walks through the replication package for a 2026 PNAS Nexus study on the economic effects of using generative AI as a backup for human agents in telemarketing. Key finding: although AI performs worse than human agents on individual tasks, under labor constraints (e.g., manpower shortages during holidays) it improves overall system benefits as a "suboptimal but usable" backup. The research data comes from a Chinese online medical insurance brokerage company, and the code package has been open-sourced for replication.

Section 02

Research Background and Problem Statement

Generative AI is developing rapidly, but a key question remains: can AI still create value for an organization when it performs worse than humans on individual tasks? The traditional view holds that AI must outperform humans to be worth adopting, but this view ignores system-level effects: under labor constraints, AI used as a backup may still generate system-level benefits. This study uses telemarketing data from a Chinese online medical insurance brokerage company to quantify the economic value of AI as backup productivity.

Section 03

Research Design and Data Description

Data Source

The research data comes from the operational records of a Chinese online medical insurance brokerage company, covering millions of calls. The original data cannot be made public due to privacy concerns, but the analysis code and de-identified logs are provided. The dataset covers four dimensions:

  • Call level: connection status, duration, transaction success, etc.
  • Contract level
  • Time dimension: workdays vs. holidays
  • Agent characteristics: human/AI identity, performance, etc.

Core Variable Definitions

  • if_AI: Whether the call was handled by an AI agent
  • if_connected: Whether the call was answered
  • if_succeed: Whether an insurance transaction was generated
  • if_refund: Whether the transaction was refunded
  • bridge_duration_num: Call duration (seconds)
  • payment_amt_num: Transaction amount (RMB)
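To make the variable definitions concrete, here is a minimal sketch of how per-agent-type metrics could be computed from call-level records. The records are invented for illustration and are not from the study; the metric definitions (for example, conversion measured over connected calls) are assumptions, and the package's own scripts may define them differently.

```python
from collections import defaultdict

FIELDS = ("if_AI", "if_connected", "if_succeed", "if_refund",
          "bridge_duration_num", "payment_amt_num")

# Hypothetical call-level rows (NOT study data), one tuple per call.
RAW = [
    (0, 1, 1, 0, 310, 1200.0),
    (0, 1, 0, 0, 95, 0.0),
    (0, 0, 0, 0, 0, 0.0),
    (0, 1, 1, 1, 420, 800.0),
    (1, 1, 1, 0, 180, 600.0),
    (1, 1, 0, 0, 60, 0.0),
    (1, 1, 0, 0, 45, 0.0),
    (1, 0, 0, 0, 0, 0.0),
]
calls = [dict(zip(FIELDS, row)) for row in RAW]


def summarize(records):
    """Aggregate call-level records into metrics per agent type."""
    groups = defaultdict(list)
    for r in records:
        groups["AI" if r["if_AI"] == 1 else "human"].append(r)

    summary = {}
    for agent, rows in groups.items():
        connected = [r for r in rows if r["if_connected"] == 1]
        sold = [r for r in connected if r["if_succeed"] == 1]
        kept = [r for r in sold if r["if_refund"] == 0]
        summary[agent] = {
            "calls": len(rows),
            "connect_rate": len(connected) / len(rows),
            # Assumption: conversion is measured over connected calls.
            "conversion_rate": len(sold) / len(connected) if connected else 0.0,
            "refund_rate": (len(sold) - len(kept)) / len(sold) if sold else 0.0,
            "avg_duration": (sum(r["bridge_duration_num"] for r in connected)
                             / len(connected)) if connected else 0.0,
            # Revenue net of refunded transactions, in RMB.
            "net_revenue": sum(r["payment_amt_num"] for r in kept),
        }
    return summary


summary = summarize(calls)
for agent, metrics in sorted(summary.items()):
    print(agent, metrics)
```

Grouping on if_AI first keeps the human and AI metrics directly comparable, which is the comparison the individual-task findings rest on.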

Research Scenario

The analysis focuses on holidays and weekends, when human staffing is short and AI handles part of the call volume as backup productivity. This setting serves as a natural experiment.
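A minimal sketch of how calls might be tagged as falling on a labor-constrained day, and how the AI share of calls could then be compared across day types. The holiday list, the function names, and the sample records are all hypothetical. Note also that official make-up workdays can fall on weekends, which this simple rule does not handle.

```python
from datetime import date

# Hypothetical holiday list; a real analysis would use the official calendar.
HOLIDAYS = {date(2024, 1, 1), date(2024, 10, 1)}


def is_constrained_day(d, holidays=HOLIDAYS):
    """True for Saturdays, Sundays, and listed holidays (assumed low staffing)."""
    return d.weekday() >= 5 or d in holidays


def ai_share_by_day_type(records):
    """records: iterable of (call_date, if_AI). Returns the AI share per day type."""
    counts = {"constrained": [0, 0], "regular": [0, 0]}  # [AI calls, all calls]
    for call_date, if_ai in records:
        key = "constrained" if is_constrained_day(call_date) else "regular"
        counts[key][0] += if_ai
        counts[key][1] += 1
    return {k: (ai / total if total else 0.0) for k, (ai, total) in counts.items()}


# Invented sample: three calls on a Saturday, four on a Monday.
sample = [
    (date(2024, 3, 2), 1), (date(2024, 3, 2), 1), (date(2024, 3, 2), 0),
    (date(2024, 3, 4), 0), (date(2024, 3, 4), 0),
    (date(2024, 3, 4), 1), (date(2024, 3, 4), 0),
]
shares = ai_share_by_day_type(sample)
print(shares)
```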

Section 04

Key Research Findings

Finding 1: Poor Performance of AI in Individual Tasks

AI performs worse than experienced human agents in individual task metrics such as call success rate and transaction amount.

Finding 2: System-Level Benefit Improvement

Despite its lower per-call conversion rate, AI delivers:

  • Response Capacity Expansion: Avoid customer loss during manpower shortages
  • Opportunity Cost Reduction: Free up human agents to focus on high-value consultations
  • Overall Throughput Improvement: 24/7 operation, enhancing system processing capacity

Together, these factors increase the system's overall transaction volume and revenue.
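The logic behind this finding can be illustrated with a toy capacity model: when incoming calls exceed what human agents can handle, the overflow goes unanswered unless a backup takes it. This is a sketch under stated assumptions, not the paper's model. The function name and every number below are hypothetical.

```python
def system_outcome(incoming_calls, human_capacity, human_conv, ai_conv,
                   avg_amount, use_ai_backup):
    """Expected sales and revenue when AI only handles calls humans cannot take."""
    handled_by_humans = min(incoming_calls, human_capacity)
    overflow = incoming_calls - handled_by_humans
    handled_by_ai = overflow if use_ai_backup else 0
    expected_sales = handled_by_humans * human_conv + handled_by_ai * ai_conv
    return {
        "handled": handled_by_humans + handled_by_ai,
        "unanswered": overflow - handled_by_ai,
        "expected_sales": expected_sales,
        "expected_revenue": expected_sales * avg_amount,
    }


# Hypothetical holiday: 1,000 incoming calls, staff can take only 400.
# AI converts at 2% versus 5% for humans (made-up rates).
params = dict(incoming_calls=1000, human_capacity=400,
              human_conv=0.05, ai_conv=0.02, avg_amount=1000.0)
without_ai = system_outcome(use_ai_backup=False, **params)
with_ai = system_outcome(use_ai_backup=True, **params)
print(without_ai)
print(with_ai)
```

With these made-up inputs, expected sales rise from about 20 to about 32 even though the AI converts at less than half the human rate, because the 600 overflow calls would otherwise produce nothing. The gain disappears when human capacity covers all incoming calls, which matches the focus on labor-constrained periods.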

Finding 3: Optimal Configuration for Human-AI Collaboration

Scenarios where AI has the greatest value:

  • High-load periods (surge in incoming calls)
  • Standardized consultations (common questions, standard products)
  • Long-tail periods (nights, holidays)

Section 05

Methodological Contributions and Replication Code Package

Methodological Contributions

  • Causal Identification Strategies: Fixed effects model, instrumental variable method, matching method, robustness tests
  • Benefit Decomposition Framework: Decompose total benefits into direct effects (benefits from AI calls), indirect effects (benefits from freeing up human time), and network effects (improvement in customer satisfaction)
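The decomposition can be written as a simple additive identity. The notation below is an assumption made for illustration and is not taken from the paper, which may define or estimate each term differently.

```latex
\Delta B_{\text{total}}
  = \underbrace{\Delta B_{\text{direct}}}_{\text{benefits from AI calls}}
  + \underbrace{\Delta B_{\text{indirect}}}_{\text{benefits from freed human time}}
  + \underbrace{\Delta B_{\text{network}}}_{\text{improved customer satisfaction}}
```

Here $\Delta B_{\text{total}}$ denotes the change in system benefits relative to a scenario without the AI backup.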

Replication Code Package Structure

  • Software Dependencies: Stata 17+ (main analysis), Python 3.10+ (data processing/charts)
  • Code Organization: Follows the paper's structure; each figure/table has its own independent script
  • One-Click Run: After configuring the data path, execute ./run_all.sh; running time is 10-30 minutes

Section 06

Research Implications and Recommendations

Implications for Managers

Question the intuition that "AI must be better than humans to be adopted." Evaluate AI investments from a system perspective rather than on individual task metrics alone.

Implications for Policymakers

In this setting, AI and human agents act as complements rather than substitutes; policies should aim to improve human-AI collaboration rather than block AI applications.

Implications for Researchers

Economic methods (production functions, cost-benefit analysis) are useful for evaluating AI and can yield rich policy insights.

Section 07

Research Limitations and Future Directions

Limitations

  • External Validity: Data from a single industry/company; generalization requires caution
  • Data Limitations: Original data cannot be made public, affecting full replicability
  • Technology Iteration: AI technology evolves rapidly; current conclusions may change

Future Directions

  • Cross-industry comparative studies
  • Dynamic effects after AI capability improvements
  • Evolution of long-term human-AI collaboration models
  • Trends in consumer acceptance

Section 08

Research Conclusion

With its rigorous empirical design and counterintuitive findings, this study provides an important reference for the economic evaluation of generative AI. Technology evaluation should not be limited to binary comparisons of "who is better"; it should ask how technology reshapes the efficiency of production systems. Where labor constraints are prominent, "suboptimal but usable" AI solutions may be more valuable than commonly assumed.