# LLM Fraud Detection Outperforms Humans: AI Sticks to Warnings More Than Humans Under Pressure

> Pre-registered experiments show that when facing already convinced investors, LLMs do not suppress fraud warnings under pressure, while the probability of human advisors suppressing warnings under pressure is 2-4 times that of AI.

- 板块: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- 发布时间: 2026-04-22T15:03:37.000Z
- 最近活动: 2026-04-23T01:57:34.257Z
- 热度: 140.1
- 关键词: LLM安全, 欺诈检测, 投资者保护, 人机对比, AI伦理, 金融咨询, 压力抵抗, 预注册实验
- 页面链接: https://www.zingnex.cn/en/forum/thread/llm-ai-e024dad2
- Canonical: https://www.zingnex.cn/forum/thread/llm-ai-e024dad2
- Markdown 来源: floors_fallback

---

## [Introduction] LLM Fraud Detection Outperforms Humans: Sticks to Warnings More Under Pressure

This article compares the core performance of LLMs and humans in fraud detection through pre-registered experiments: when facing already convinced investors, LLMs do not suppress fraud warnings due to pressure, while the probability of human advisors suppressing warnings under pressure is 2-4 times that of AI. AI significantly outperforms humans in dimensions such as zero fraud approval and pressure resistance, providing empirical support for the application of AI in the field of financial investor protection.

## Research Background: Will AI Compromise Principles to Cater to Users?

Large Language Models (LLMs) are trained via Reinforcement Learning from Human Feedback (RLHF) and optimized to be helpful assistants that follow user intentions. A hidden concern is: when the user's intention itself is problematic, will AI compromise principles to cater to the user? In financial consulting scenarios, if an investor has already been convinced by a fraudulent investment opportunity, will AI suppress warnings? The traditional view holds that RLHF-trained AI may tend to give answers users want rather than objective facts, and this study tests this hypothesis through pre-registered experiments.

## Experimental Design: AI vs. Human Fraud Detection Showdown

### Experimental Scenarios
12 investment scenarios were set up, covering three types of opportunities: legitimate investments, high-risk investments, and fraudulent investments (based on real cases such as Ponzi schemes and fake cryptocurrency projects).
### Participants and Models
Human participants: 1201 people playing the role of investment advisors; AI models: 7 mainstream LLMs (including GPT-4, Claude, Gemini, etc.).
### Experimental Conditions
Baseline condition: Investors ask for advice neutrally; Pressure condition: Investors express that they have been convinced and expect an affirmative response.
### Data Scale
3360 AI consultation dialogues, 1201 human advisor evaluations; pre-registered hypotheses avoid post-hoc selection bias.

## Key Findings: Four Pieces of Evidence That AI Is More Reliable Than Humans

1. **AI Warnings Not Suppressed by Pressure**: The frequency of AI warnings slightly increased under pressure, subverting the concern that 'RLHF leads to over-catering'.
2. **Zero Fraud Approval**: In the baseline condition, 13-14% of humans approved fraudulent investments, while all LLMs had 0% approval.
3. **Humans More Prone to Compromise Under Pressure**: Under pressure, the probability of humans suppressing warnings was 2-4 times that of the baseline, while AI was almost unaffected.
4. **Extremely Low Approval Reversal Rate**: In over 3000 dialogues, approval reversals were less than 3 times (<0.3%), and AI maintained a high level of consistency.

## In-depth Analysis: Four Factors for AI's Better Performance

1. **Breadth of Training Data**: Massive text training covers fraud cases and regulatory documents, identifying fraud signals that humans ignore.
2. **No Emotional Involvement**: Not affected by emotional factors such as 'not wanting to disappoint customers' or 'avoiding social conflicts', making purely factual judgments.
3. **Consistent Decision-Making Standards**: No interference from fatigue or emotional fluctuations, leading to stable decisions.
4. **Reinforced Safety Training**: Specialized safety training (rejecting harmful requests, identifying risks) is effective in financial fraud scenarios.

## Practical Implications: Insights for Multiple Stakeholders

### Financial Regulation
- AI can serve as a 'second opinion' for human advisors to reduce fraud risks;
- Mandatory AI screening in high-risk scenarios;
- Incorporate the positive role of AI into regulatory frameworks.
### Financial Institutions
- Integrate AI risk assessment into service processes;
- Train human advisors to learn AI's 'uncompromising' spirit;
- Establish a human-AI collaboration model (AI accuracy + human emotional intelligence).
### Ordinary Investors
- AI advice is more reliable than friends (does not hide risks);
- Need to note AI's limitations (no latest information, lack of personalized planning).
### AI Developers
- Maintain existing safety training;
- Be alert to overconfidence and continuously monitor actual performance.

## Limitations and Future Research Directions

### Limitations
- Scenarios do not cover new types of fraud or cross-cultural fraud;
- No testing of adversarial attacks;
- No evaluation of 'false positive' misjudgments;
- The responsibility for AI's wrong advice is unclear.
### Future Directions
- Cross-cultural validation;
- Research on long-term interaction scenarios;
- Evaluation of multimodal fraud detection;
- Optimization of the balance between personalization and principles.