# PARK-GNN Challenge: An Open-Source Competition for Parkinson's Disease Detection Using Graph Neural Networks

> Introduces a mini-competition project for graph neural network learners, which helps participants grasp core GNN concepts and best practices by modeling the Parkinson's disease voice detection task as a graph node classification problem.

- Board: [Openclaw Geo](https://www.zingnex.cn/en/forum/board/openclaw-geo)
- Published: 2026-05-16T12:26:35.000Z
- Last activity: 2026-05-16T12:33:12.057Z
- Popularity: 155.9
- Keywords: graph neural networks, medical AI, Parkinson's disease, open-source competition, node classification, DGL
- Page link: https://www.zingnex.cn/en/forum/thread/park-gnn
- Canonical: https://www.zingnex.cn/forum/thread/park-gnn

---

## Core Guide to the PARK-GNN Challenge

The PARK-GNN Challenge is a mini-competition for graph neural network learners. It models Parkinson's disease detection from voice recordings as a graph node classification problem, so that participants can practice core GNN concepts and best practices. The competition is based on the UCI Parkinson's dataset, uses the DGL framework, and runs entirely through GitHub's native workflow, which makes it both a teaching exercise and an open-source community project.

## Competition Background: Why Use GNNs for Parkinson's Disease Detection?

Early diagnosis of Parkinson's disease is crucial for delaying disease progression. Traditional machine learning methods treat voice samples as independent data points and ignore potential connections between patients. PARK-GNN reframes the task as a graph learning problem:

- Nodes represent individual voice recordings or patients.
- Edges encode similarity between patients or shared subject-level information.
- Node features are acoustic measurements such as jitter, shimmer, and harmonic ratio.
- Relational information is used to capture patterns that purely tabular methods miss.

## Detailed Dataset Explanation

The competition uses the classic UCI Parkinson's dataset. The original features include fundamental frequency measurements (e.g., MDVP:Fo), jitter measures (e.g., MDVP:Jitter), shimmer measures (e.g., MDVP:Shimmer), noise-to-harmonics and harmonics-to-noise ratios (NHR, HNR), and nonlinear measures (RPDE, DFA, etc.).

Graph structure: 195 nodes represent voice recordings from 31 subjects (23 PD patients, 8 healthy controls). Edges combine a K-nearest-neighbor strategy (k=5) with connections between recordings from the same subject. The training set has 156 nodes (80%), and the test set has 39 nodes (20%, labels hidden). The intra-subject connections are intended to ease the small-sample problem by giving each node additional training signal from related recordings.
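The edge strategy described above (k-nearest neighbors plus same-subject links) can be sketched in plain Python. This is a minimal illustration, not the competition's actual graph builder: the function name `build_edges`, the use of Euclidean distance, and the toy data are all assumptions, and in practice the features would likely need to be standardized before distances are computed.

```python
import math


def build_edges(features, subject_ids, k=5):
    """Return undirected edges (i, j) with i < j.

    Hypothetical helper: combines k-nearest-neighbor edges (Euclidean
    distance, an assumption) with edges between recordings that share
    a subject ID.
    """
    n = len(features)
    edges = set()

    # k-nearest-neighbor edges: link each node to its k closest nodes.
    for i in range(n):
        dists = sorted(
            (math.dist(features[i], features[j]), j)
            for j in range(n)
            if j != i
        )
        for _, j in dists[:k]:
            edges.add((min(i, j), max(i, j)))

    # Same-subject edges: link every pair of recordings from one subject.
    for i in range(n):
        for j in range(i + 1, n):
            if subject_ids[i] == subject_ids[j]:
                edges.add((i, j))

    return edges


# Toy data (made up): four recordings, one feature each, two subjects.
toy_features = [[0.0], [0.1], [5.0], [5.1]]
toy_subjects = ["a", "b", "a", "b"]
toy_edges = build_edges(toy_features, toy_subjects, k=1)
print(sorted(toy_edges))
```

In the toy example, the kNN step links the two pairs of nearby recordings, and the same-subject step adds links between distant recordings that belong to the same person. Storing each edge as a sorted pair keeps the graph undirected and removes duplicates.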

## Competition Mechanism and Evaluation Metrics

The competition runs on GitHub's native workflow:

1. Fork the official repository.
2. Train the model locally and generate predictions.
3. Encrypt the submission file.
4. Upload it via a Pull Request.
5. GitHub Actions automatically decrypts and scores the submission, then updates the leaderboard.

The evaluation metric is the Macro F1-Score, calculated as (F1_Healthy + F1_Parkinson)/2. It was chosen because the classes are imbalanced (23 PD subjects to 8 healthy controls) and it gives equal weight to both classes. The baseline GCN model is expected to achieve a Macro F1 of approximately 0.72-0.78.
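To make the metric concrete, here is a small from-scratch sketch of the Macro F1 calculation. It is not the official scoring script; the label encoding (0 = healthy, 1 = Parkinson's) and the example predictions are assumptions chosen for illustration.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 score for one class, treating that class as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    # Convention used here: F1 is 0 when the class is never true or predicted.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, classes=(0, 1)):
    """Unweighted mean of per-class F1: (F1_Healthy + F1_Parkinson) / 2."""
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Made-up labels: 0 = healthy, 1 = Parkinson's (assumed encoding).
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
score = macro_f1(y_true, y_pred)  # (0.5 + 0.75) / 2 = 0.625

# A model that always predicts the majority class scores poorly.
majority_score = macro_f1(y_true, [1] * len(y_true))  # (0.0 + 0.8) / 2 = 0.4
print(score, majority_score)
```

The second call shows why the metric suits an imbalanced dataset: always predicting the majority class earns a high F1 on that class but zero on the other, and the unweighted mean exposes this.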

## Technical Routes and Advanced Recommendations

Basic route: start with the GCN baseline, try different hidden layer dimensions (32, 64, 128), adjust network depth (2-4 layers), add Dropout regularization (0.3-0.5), and use cross-validation to evaluate stability.

Advanced optimization: experiment with KNN graph k values (3, 5, 7, 10), add edge weights based on similarity, try GAT attention mechanisms, introduce residual connections to mitigate over-smoothing, handle class imbalance (weighted loss, oversampling), ensemble models to improve robustness, and explore architectures such as GraphSAGE and GIN.

Be aware of three major pitfalls: overfitting, over-smoothing, and data leakage.
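For readers new to GCNs, the sketch below shows what a single GCN layer computes, with an optional residual connection of the kind mentioned above. It is a framework-free illustration in plain Python, not the competition's DGL baseline; the function names and the toy graph are assumptions, and a real model would use DGL or another tensor library with learned weights.

```python
import math


def normalized_adjacency(num_nodes, edges):
    """Return D^{-1/2} (A + I) D^{-1/2} as a dense list of lists."""
    a = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        a[i][i] = 1.0  # self-loop
    for u, v in edges:
        a[u][v] = 1.0
        a[v][u] = 1.0  # undirected graph
    deg = [sum(row) for row in a]
    return [
        [a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_nodes)]
        for i in range(num_nodes)
    ]


def matmul(x, y):
    """Dense matrix product of two list-of-lists matrices."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def gcn_layer(a_hat, h, w, residual=False):
    """One GCN layer: ReLU(a_hat @ h @ w), optionally plus the input h.

    The residual option requires w to keep the feature dimension unchanged.
    """
    z = matmul(matmul(a_hat, h), w)
    out = [[max(0.0, v) for v in row] for row in z]
    if residual:
        out = [[o + x for o, x in zip(o_row, h_row)] for o_row, h_row in zip(out, h)]
    return out


# Toy graph (made up): two connected nodes with one feature each.
a_hat = normalized_adjacency(2, [(0, 1)])
h = [[1.0], [3.0]]
w = [[1.0]]  # identity weight, for illustration only
plain = gcn_layer(a_hat, h, w)
with_residual = gcn_layer(a_hat, h, w, residual=True)
print(plain, with_residual)
```

Without the residual connection, both nodes end up with the same value after one layer, which is a small-scale picture of over-smoothing. Adding the input back preserves the difference between the nodes.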

## Educational Value and Learning Path

The competition design closely follows chapters 1.1-4.6 of the official DGL tutorial, covering graph construction from tabular data, the principles of message-passing neural networks, graph attention mechanisms, sampling methods for large graphs, and GNN node classification. It works well as a practical project for graph neural network courses: participants consolidate theory and go through the complete machine learning engineering process (data preprocessing, model development, hyperparameter tuning, result submission, leaderboard competition).

## Community and Open-Source Governance

The project is released under the MIT License, and the dataset is covered by the CC BY 4.0 license, so both can be used freely for academic and commercial purposes. Community support channels include GitHub Issues (bug reports), GitHub Discussions (Q&A), and email. The leaderboard updates in real time, giving participants immediate feedback and helping to keep them motivated and engaged.

## Ethical Boundaries of Medical AI

The competition's README explicitly states the ethical considerations for medical AI, with the aim of cultivating responsible AI awareness: model outputs are for research reference only and cannot replace professional medical diagnosis; voice data involves personal health information and must be handled in compliance with privacy regulations; algorithmic bias can lead to systematic misdiagnosis and must be watched for; and strict regulatory approval is required before any clinical deployment.
