# Business Intelligence Analysis of the Impact of Generative AI on Students' Academic Performance and Mental Health

> A business intelligence project built on data from approximately 50,000 college students. It uses a star schema architecture, queried through Amazon Athena and DBeaver, to analyze the relationship between generative AI usage and academic performance, knowledge retention, and emotional health.

- Board: [Openclaw Geo](https://www.zingnex.cn/en/forum/board/openclaw-geo)
- Published: 2026-06-08T03:13:25.000Z
- Last activity: 2026-06-08T03:24:36.248Z
- Popularity: 152.8
- Keywords: generative AI, education analytics, business intelligence, Amazon Athena, student performance, mental health, data warehouse, ETL, higher education
- Page link: https://www.zingnex.cn/en/forum/thread/ai-938a3bb6
- Canonical: https://www.zingnex.cn/forum/thread/ai-938a3bb6
- Markdown source: floors_fallback

---

## Business Intelligence Analysis of Generative AI's Impact on College Students' Academic Performance and Mental Health (Main Floor)

This project applies business intelligence analysis to data from approximately 50,000 college students to explore the relationship between generative AI usage and academic performance, knowledge retention, and emotional health. It adopts a star schema architecture and uses Amazon Athena and DBeaver for analysis, aiming to identify AI usage patterns that improve academic performance without harming students' health. The original project is maintained by LizzyRuiz and sourced from GitHub (link: https://github.com/LizzyRuiz/ai-student-impact-bi), published on 2026-06-08.

## Research Background and Core Questions

The widespread use of generative AI tools has changed how college students learn, but it also raises concerns: over-reliance, decreased knowledge retention, weakened traditional learning habits, effects on emotional health, and an increased risk of academic burnout. The core question is how generative AI usage affects college students' academic performance, knowledge retention, and emotional health. The project uses an open dataset of approximately 50,000 college student records and applies business intelligence methods to look for optimal AI usage patterns.

## Data Architecture and Tech Stack

The project adopts a star schema data warehouse architecture (1 fact table + 4 dimension tables) to optimize query performance and keep the model logically consistent. The tech stack is cloud-native: Amazon S3 for storage, Amazon Athena for serverless querying, and DBeaver as the SQL client. ETL is implemented with Python, pandas, and SQLAlchemy, and covers data cleaning, transformation, KPI calculation, and validation. The advantages of this setup are that no servers need to be managed, billing is pay-per-query, and it adapts to fluctuating workloads.
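The split into a fact table and dimension tables can be sketched with pandas. This is a minimal illustration, not the project's actual ETL code: the helper `split_dimension`, the column names, and the two dimensions shown are all hypothetical, since the source does not name its four dimension tables.

```python
import pandas as pd

def split_dimension(flat, columns, key):
    """Move `columns` out of `flat` into a dimension table keyed by `key`.

    Returns (fact, dim): `dim` holds one row per distinct value combination
    plus a surrogate key; `fact` keeps the other columns and the key.
    """
    dim = (
        flat[columns]
        .drop_duplicates()
        .sort_values(columns)
        .reset_index(drop=True)
    )
    dim.insert(0, key, range(1, len(dim) + 1))  # surrogate keys 1..n
    fact = flat.merge(dim, on=columns, how="left").drop(columns=columns)
    return fact, dim

# Hypothetical flat extract; these are not the dataset's real column names.
flat = pd.DataFrame({
    "student_id": [1, 2, 3, 4],
    "major": ["Biology", "CS", "CS", "Biology"],
    "academic_year": [1, 2, 2, 3],
    "weekly_ai_hours": [2.0, 10.5, 7.0, 0.0],
    "gpa": [3.1, 3.6, 3.4, 2.9],
})

fact, dim_major = split_dimension(flat, ["major"], "major_key")
fact, dim_year = split_dimension(fact, ["academic_year"], "year_key")
```

After both calls, `fact` holds only measures and foreign keys (`student_id`, `weekly_ai_hours`, `gpa`, `major_key`, `year_key`), which is the shape a star-schema query joins back to its dimensions.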

## Dataset Composition and Key Metrics

The dataset contains 16 fields covering four areas: background (ID, major, academic year), academics (GPA, skill retention score), AI usage (weekly duration, usage scenarios, prompt engineering level, tool diversity, paid subscription), and study habits and mental health (traditional learning duration, perceived AI dependence, exam anxiety, burnout risk). The core KPIs are GPA improvement rate, AI usage duration, skill retention score, AI dependence level, and burnout risk level, which together give a multi-dimensional view of each student.
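As a rough illustration of the KPI calculation step, the sketch below aggregates a few of these metrics per major. The column names, the `"High"` burnout label, and the sample rows are assumptions made for the example, not the dataset's real schema or values. GPA improvement rate is left out because the source does not say how it is defined.

```python
import pandas as pd

# Hypothetical sample rows; column names and values are illustrative only.
students = pd.DataFrame({
    "major": ["CS", "CS", "Biology", "Biology"],
    "weekly_ai_hours": [10.0, 6.0, 2.0, 4.0],
    "skill_retention_score": [60.0, 70.0, 85.0, 75.0],
    "burnout_risk": ["High", "Low", "Low", "High"],
})

def kpi_summary(df):
    """Aggregate per-major KPIs from student-level rows."""
    # Assumed encoding: burnout risk is a text label and "High" marks at-risk students.
    flagged = df.assign(high_burnout=df["burnout_risk"].eq("High"))
    return flagged.groupby("major").agg(
        students=("weekly_ai_hours", "size"),
        mean_ai_hours=("weekly_ai_hours", "mean"),
        mean_retention=("skill_retention_score", "mean"),
        high_burnout_share=("high_burnout", "mean"),
    )

summary = kpi_summary(students)
```

Taking the mean of a boolean column gives the share of rows that are `True`, which is why `high_burnout_share` needs no separate count.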

## Analysis Dimensions and Research Hypotheses

The analysis follows four directions:

1. AI usage duration and academic performance, to look for an optimal usage interval.
2. AI dependence and knowledge retention, to test the impact of over-reliance.
3. AI usage and burnout or anxiety, to focus on psychological costs.
4. AI usage patterns across majors, to identify disciplinary differences.

The working hypothesis is that the effect of generative AI is non-linear: moderate usage improves efficiency, while over-reliance weakens critical thinking. Testing it is meant to provide a data basis for policy formulation.
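One simple way to probe the non-linear hypothesis is to group students into usage bands and compare mean GPA per band, since a single correlation coefficient can hide an inverted-U shape. The sketch below assumes hypothetical column names (`weekly_ai_hours`, `gpa`) and arbitrary band edges. The sample values are invented to exercise the code and are not findings from the dataset.

```python
import pandas as pd

def gpa_by_usage_band(df, edges, labels):
    """Mean GPA and student count per weekly AI usage band.

    Bands are left-closed: edges [0, 5, 15, 40] give [0, 5), [5, 15), [15, 40).
    """
    banded = df.assign(
        usage_band=pd.cut(
            df["weekly_ai_hours"], bins=edges, labels=labels, right=False
        )
    )
    return banded.groupby("usage_band", observed=False)["gpa"].agg(["count", "mean"])

# Made-up rows for illustration only.
sample = pd.DataFrame({
    "weekly_ai_hours": [0.0, 1.0, 6.0, 8.0, 20.0, 25.0],
    "gpa": [3.0, 3.2, 3.6, 3.4, 2.8, 3.0],
})

bands = gpa_by_usage_band(
    sample, edges=[0, 5, 15, 40], labels=["low", "moderate", "heavy"]
)
```

On real data, the band edges would need to be chosen from the observed distribution of usage hours, and per-band counts checked so that sparse bands are not over-interpreted.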

## Practical Significance and Application Scenarios

The project gives colleges a framework for evaluating the impact of AI tools, which can help them formulate reasonable usage policies. The same approach can be extended to scenarios such as online learning platform analysis and evaluating the effectiveness of educational games. Technically, it demonstrates a lightweight, cloud-native BI solution (Athena + S3) whose pay-per-query model keeps costs low, making it suitable for institutions with limited resources. The code structure is clear and can serve as a starting point for similar analyses.

## Methodological Insights and Future Directions

The project uses correlation analysis to identify association patterns, which cannot establish causality. Future work could add randomized controlled trials to verify whether specific usage strategies work, or longitudinal tracking studies to explore long-term impacts. With 50,000 records, the open dataset is large enough to give the analysis statistical power, and it supports reproduction and extension, which reflects the value of open science.
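A correlation check of the kind described here can be sketched as follows. The source does not say which coefficient the project uses. Spearman rank correlation is shown as one reasonable choice for ordinal scores, computed as the Pearson correlation of the ranked values. The column names and sample values are hypothetical.

```python
import pandas as pd

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranked values."""
    return x.rank().corr(y.rank())

# Made-up rows for illustration only; not results from the dataset.
sample = pd.DataFrame({
    "ai_dependence": [1, 2, 3, 4, 5],
    "skill_retention_score": [90, 85, 70, 72, 40],
})

rho = spearman_rho(sample["ai_dependence"], sample["skill_retention_score"])
```

A strongly negative `rho` would only indicate that higher dependence tends to go with lower retention in the sample. It says nothing about which one drives the other, which is the limitation noted above.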
