Triagegeist: A Practical Project for Predicting Emergency Triage Severity Using Machine Learning

A machine learning project that predicts the Emergency Severity Index (ESI) level of emergency department patients from structured clinical data, using LightGBM, XGBoost, and neural networks. It reports a Macro F1 score of 0.973 in the Kaggle competition.

Tags: machine learning, healthcare, emergency department, triage, LightGBM, XGBoost, clinical data, ESI, feature engineering

Published 2026-06-04 11:46 | Recent activity 2026-06-04 11:50 | Estimated read 7 min

Section 02

Original Author and Source

Original Author/Maintainer: Pyxis567
Source Platform: GitHub
Original Title: triagegeist
Original Link: https://github.com/Pyxis567/triagegeist
Publication Time: June 2026
Related Competition: Kaggle Triagegeist Competition

Section 03

Project Background and Significance

Emergency triage is one of the most critical steps in hospital operations. When patients flood into the emergency department, nurses have only minutes to decide who needs immediate treatment and who can wait. Traditional triage relies on individual experience, and under high patient volume it is difficult to keep those judgments accurate and consistent.

The Triagegeist project addresses this problem by using machine learning models to assist, or even replace, the manual triage process. The project was entered in the Triagegeist competition on Kaggle, where the goal is to predict the assigned emergency severity level (ESI 1-5) from structured clinical data collected from patients at the point of triage.

Section 04

Introduction to the ESI Classification System

ESI (Emergency Severity Index) is a five-level triage system widely used in the field of emergency medicine in the United States:

Level	Label	Description
1	Resuscitation	Immediately life-threatening, requires immediate life-saving intervention
2	Emergent	High risk, should not wait
3	Urgent but stable	Stable but requires multiple resources
4	Semi-urgent	Stable, requires only one resource
5	Non-urgent	Stable, no resources needed

The assigned level determines the order in which patients are seen and therefore directly affects treatment outcomes.

Section 05

Dataset Composition and Feature Engineering

The competition dataset consists of the following raw data:

Training Set: 80,000 labeled patient records (40 features + target variable)
Test Set: 20,000 unlabeled records for submission
Chief Complaint Text: Original free-text chief complaint of each patient
Medical History Records: 25 binary comorbidity markers

Section 06

Core Feature Groups

The original features cover various aspects of emergency triage:

Vital Signs: Blood pressure, heart rate, oxygen saturation, body temperature, respiratory rate
Demographics: Age, gender, insurance type
Clinical Scores: NEWS2 score, GCS score, pain score
Arrival Context: Arrival method, time, shift
Past Utilization: Number of emergency visits and hospitalizations in the past 12 months

Section 07

Highlights of Feature Engineering

The project constructed 297 features, including:

Missing-Value Indicators: Binary flags for missing key measurements such as blood pressure, respiratory rate, and body temperature, since missingness itself is related to triage level
Median Imputation: Fitted only on the training set to avoid data leakage
Time Features: Flags for daytime (hours 8-17), evening (hours 18-22), and night periods
Age Binning: Age divided into 8 groups (infant to elderly) and one-hot encoded
Vital Sign Interactions: Derived features such as pulse pressure ratio and the product of mean arterial pressure (MAP) and heart rate
Comorbidity Burden: Sum of the 25 medical history markers
Past Utilization Ratio: Number of admissions / (number of emergency visits + 1)
NEWS2 Risk Markers: High NEWS2 score (≥7), medium NEWS2 score (5-6)
qSOFA Score: Simplified score for sepsis screening
Cross-score Interactions: Pain × NEWS2, GCS × NEWS2, comorbidity × NEWS2, etc.
Age-stratified Features: Child/elderly markers, PALS-adjusted heart rate and respiratory rate thresholds
TF-IDF Text Features: 100 unigram and bigram features extracted from the chief complaint text
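Several of the tabular features above can be sketched in a few lines of pandas. This is an illustrative sketch, not the project's actual code: every column name (systolic_bp, arrival_hour, n_admissions_12m, and so on) is hypothetical, since the article does not give the dataset schema. The night window (hours 23-7) and the pulse pressure ratio definition (pulse pressure divided by systolic pressure) are likewise assumptions. MAP uses the common approximation of diastolic pressure plus one third of pulse pressure, and qSOFA uses its standard criteria (respiratory rate ≥ 22, systolic pressure ≤ 100, GCS < 15).

```python
import numpy as np
import pandas as pd

# Hypothetical column names; the real dataset schema is not given in the article.
VITALS = ["systolic_bp", "diastolic_bp", "heart_rate", "resp_rate", "temperature"]


def fit_medians(train: pd.DataFrame) -> pd.Series:
    """Compute imputation medians on the training set only (avoids leakage)."""
    return train[VITALS].median()


def build_features(df: pd.DataFrame, medians: pd.Series) -> pd.DataFrame:
    out = df.copy()

    # Record missingness before imputation overwrites the gaps.
    for col in VITALS:
        out[f"{col}_missing"] = out[col].isna().astype(int)
    out[VITALS] = out[VITALS].fillna(medians)

    # Time-of-day flags from the arrival hour (0-23).
    hour = out["arrival_hour"]
    out["is_daytime"] = hour.between(8, 17).astype(int)
    out["is_evening"] = hour.between(18, 22).astype(int)
    out["is_night"] = ((hour >= 23) | (hour <= 7)).astype(int)  # assumed window

    # Vital-sign interactions.
    pulse_pressure = out["systolic_bp"] - out["diastolic_bp"]
    out["pulse_pressure_ratio"] = pulse_pressure / out["systolic_bp"]  # assumed definition
    out["map"] = out["diastolic_bp"] + pulse_pressure / 3.0
    out["map_x_hr"] = out["map"] * out["heart_rate"]

    # Past utilization ratio; the +1 avoids division by zero.
    out["admit_ratio"] = out["n_admissions_12m"] / (out["n_ed_visits_12m"] + 1)

    # NEWS2 risk bands.
    out["news2_high"] = (out["news2"] >= 7).astype(int)
    out["news2_medium"] = out["news2"].between(5, 6).astype(int)

    # qSOFA: one point each for RR >= 22, SBP <= 100, GCS < 15.
    out["qsofa"] = (
        (out["resp_rate"] >= 22).astype(int)
        + (out["systolic_bp"] <= 100).astype(int)
        + (out["gcs"] < 15).astype(int)
    )
    return out


# Tiny made-up example with three patients.
train = pd.DataFrame({
    "systolic_bp": [120.0, np.nan, 90.0],
    "diastolic_bp": [80.0, 70.0, 60.0],
    "heart_rate": [70.0, 110.0, np.nan],
    "resp_rate": [16.0, 24.0, 18.0],
    "temperature": [36.8, 38.5, 37.0],
    "arrival_hour": [9, 20, 2],
    "n_admissions_12m": [0, 2, 1],
    "n_ed_visits_12m": [1, 3, 0],
    "news2": [1, 7, 5],
    "gcs": [15, 14, 15],
})
medians = fit_medians(train)
features = build_features(train, medians)
print(features[["systolic_bp_missing", "is_night", "admit_ratio", "qsofa"]])
```

At prediction time the same medians would be passed to build_features for the test set, so that no test-set statistics influence the imputation.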

Section 08

Model Architecture and Experimental Results

The project evaluated multiple models. Each was validated with 5-fold stratified cross-validation, then retrained on the complete training set to generate test predictions.
