Reading

LLM Fraud Detection Outperforms Humans: AI Sticks to Warnings More Than Humans Under Pressure

Pre-registered experiments show that when facing already convinced investors, LLMs do not suppress fraud warnings under pressure, while the probability of human advisors suppressing warnings under pressure is 2-4 times that of AI.

LLM安全欺诈检测投资者保护人机对比AI伦理金融咨询压力抵抗预注册实验

Published 2026-04-22 23:03Recent activity 2026-04-23 09:57Estimated read 7 min

LLM Fraud Detection Outperforms Humans: AI Sticks to Warnings More Than Humans Under Pressure

Section 01

[Introduction] LLM Fraud Detection Outperforms Humans: Sticks to Warnings More Under Pressure

This article compares the core performance of LLMs and humans in fraud detection through pre-registered experiments: when facing already convinced investors, LLMs do not suppress fraud warnings due to pressure, while the probability of human advisors suppressing warnings under pressure is 2-4 times that of AI. AI significantly outperforms humans in dimensions such as zero fraud approval and pressure resistance, providing empirical support for the application of AI in the field of financial investor protection.

Section 02

Research Background: Will AI Compromise Principles to Cater to Users?

Large Language Models (LLMs) are trained via Reinforcement Learning from Human Feedback (RLHF) and optimized to be helpful assistants that follow user intentions. A hidden concern is: when the user's intention itself is problematic, will AI compromise principles to cater to the user? In financial consulting scenarios, if an investor has already been convinced by a fraudulent investment opportunity, will AI suppress warnings? The traditional view holds that RLHF-trained AI may tend to give answers users want rather than objective facts, and this study tests this hypothesis through pre-registered experiments.

Section 03

Experimental Design: AI vs. Human Fraud Detection Showdown

Experimental Scenarios

12 investment scenarios were set up, covering three types of opportunities: legitimate investments, high-risk investments, and fraudulent investments (based on real cases such as Ponzi schemes and fake cryptocurrency projects).

Participants and Models

Human participants: 1201 people playing the role of investment advisors; AI models: 7 mainstream LLMs (including GPT-4, Claude, Gemini, etc.).

Experimental Conditions

Baseline condition: Investors ask for advice neutrally; Pressure condition: Investors express that they have been convinced and expect an affirmative response.

Data Scale

3360 AI consultation dialogues, 1201 human advisor evaluations; pre-registered hypotheses avoid post-hoc selection bias.

Section 04

Key Findings: Four Pieces of Evidence That AI Is More Reliable Than Humans

AI Warnings Not Suppressed by Pressure: The frequency of AI warnings slightly increased under pressure, subverting the concern that 'RLHF leads to over-catering'.
Zero Fraud Approval: In the baseline condition, 13-14% of humans approved fraudulent investments, while all LLMs had 0% approval.
Humans More Prone to Compromise Under Pressure: Under pressure, the probability of humans suppressing warnings was 2-4 times that of the baseline, while AI was almost unaffected.
Extremely Low Approval Reversal Rate: In over 3000 dialogues, approval reversals were less than 3 times (<0.3%), and AI maintained a high level of consistency.

Section 05

In-depth Analysis: Four Factors for AI's Better Performance

Breadth of Training Data: Massive text training covers fraud cases and regulatory documents, identifying fraud signals that humans ignore.
No Emotional Involvement: Not affected by emotional factors such as 'not wanting to disappoint customers' or 'avoiding social conflicts', making purely factual judgments.
Consistent Decision-Making Standards: No interference from fatigue or emotional fluctuations, leading to stable decisions.
Reinforced Safety Training: Specialized safety training (rejecting harmful requests, identifying risks) is effective in financial fraud scenarios.

Section 06

Practical Implications: Insights for Multiple Stakeholders

Financial Regulation

AI can serve as a 'second opinion' for human advisors to reduce fraud risks;
Mandatory AI screening in high-risk scenarios;
Incorporate the positive role of AI into regulatory frameworks.

Financial Institutions

Integrate AI risk assessment into service processes;
Train human advisors to learn AI's 'uncompromising' spirit;
Establish a human-AI collaboration model (AI accuracy + human emotional intelligence).

Ordinary Investors

AI advice is more reliable than friends (does not hide risks);
Need to note AI's limitations (no latest information, lack of personalized planning).

AI Developers

Maintain existing safety training;
Be alert to overconfidence and continuously monitor actual performance.

Section 07

Limitations and Future Research Directions

Limitations

Scenarios do not cover new types of fraud or cross-cultural fraud;
No testing of adversarial attacks;
No evaluation of 'false positive' misjudgments;
The responsibility for AI's wrong advice is unclear.

Future Directions

Cross-cultural validation;
Research on long-term interaction scenarios;
Evaluation of multimodal fraud detection;
Optimization of the balance between personalization and principles.