Zing Forum

CrayPulse: A Real-Time Watershed Anomaly Monitoring System Based on Temporal Graph Neural Networks

The CrayPulse project applies graph neural networks to real-time health monitoring of urban water systems. It combines meteorological data from Lawrence Berkeley National Laboratory with sensor telemetry to provide anomaly detection and early warning.

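Tags: graph neural networks, environmental monitoring, anomaly detection, aquatic ecology, real-time systems, Internet of Things, machine learning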
Published 2026-05-19 07:45 | Recent activity 2026-05-19 07:51 | Estimated read 10 min
Section 01

Introduction to the CrayPulse Project

CrayPulse is a real-time watershed anomaly monitoring system built on temporal graph neural networks and aimed at health monitoring of urban water systems. It combines meteorological data from Lawrence Berkeley National Laboratory with sensor telemetry to detect anomalies and issue early warnings. Applied to the Strawberry Creek watershed near the University of California, Berkeley, it supports real-time assessment of stream health and early warning of abnormal events.

Section 02

Project Background and Significance

Health monitoring of urban aquatic ecosystems is a long-standing topic in environmental science. Traditional water quality monitoring relies on manual sampling and laboratory analysis, so results arrive late and cover only a limited number of sites. The spread of IoT sensor networks and advances in machine learning have made real-time, automated monitoring of aquatic ecosystems possible.

The CrayPulse project focuses on the Strawberry Creek watershed near the University of California, Berkeley, a typical urban water system under the ecological pressures of urbanization. The goal of the project is to deploy an intelligent monitoring system that assesses stream health in real time and gives early warning of abnormal events.

Section 03

Core Technical Architecture

CrayPulse uses a temporal graph neural network as its core algorithmic framework. Compared with traditional time series analysis, a graph neural network can capture the spatial correlation between monitoring points in the sensor network, while its temporal modeling component identifies how water quality parameters evolve over time.
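
To make the spatial and temporal ingredients concrete, here is a minimal sketch; it is not the project's model. It assumes a hypothetical chain of three stations along the creek and combines one neighbor-averaging step per time step (the spatial part) with an exponential moving average across time steps (the temporal part). The station names, the edge list, and the smoothing factor `alpha` are all illustrative assumptions.

```python
# Hypothetical station names and upstream -> downstream edges; not from the project.
STATIONS = ["upstream", "midstream", "downstream"]
EDGES = [(0, 1), (1, 2)]


def build_neighbors(num_nodes, edges):
    """Return an undirected neighbor list that includes a self-loop per node."""
    neighbors = [{i} for i in range(num_nodes)]
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)
    return [sorted(n) for n in neighbors]


def spatial_step(values, neighbors):
    """Mean-aggregate each node's value with its neighbors (one message-passing step)."""
    return [sum(values[j] for j in nbrs) / len(nbrs) for nbrs in neighbors]


def temporal_graph_embedding(series, neighbors, alpha=0.5):
    """Apply a spatial step per time step, then blend with the previous state (EMA)."""
    state = None
    for values in series:
        mixed = spatial_step(values, neighbors)
        if state is None:
            state = mixed
        else:
            state = [alpha * m + (1 - alpha) * s for m, s in zip(mixed, state)]
    return state


neighbors = build_neighbors(len(STATIONS), EDGES)
# Two time steps of one feature per station (assumed: water temperature in deg C).
series = [[12.0, 14.0, 16.0], [12.0, 14.0, 22.0]]
embedding = temporal_graph_embedding(series, neighbors)
```

In this toy run the jump at the downstream station also shifts the midstream value, which is the kind of spatial coupling a purely per-sensor time series model would miss. A trained network would replace the plain averages with learned weights.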

The system has two main data sources. The first is a physical sensor network deployed along the stream, which collects key indicators such as water temperature, pH, dissolved oxygen, and conductivity in real time. The second is meteorological data from Lawrence Berkeley National Laboratory, including rainfall, air temperature, and humidity. Fusing the two sources lets the system distinguish natural weather-driven changes from real pollution events.
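
One way to picture the fusion step is to attach, to every sensor sample, the most recent weather reading taken at or before it. The sketch below does this with the standard library only. The field names (`time`, `conductivity`, `rain_mm`) and the sample values are assumptions made for illustration, not the project's schema or data.

```python
from bisect import bisect_right
from datetime import datetime


def fuse(sensor_rows, weather_rows):
    """Attach to each sensor row the latest weather row at or before its timestamp."""
    weather_rows = sorted(weather_rows, key=lambda r: r["time"])
    weather_times = [r["time"] for r in weather_rows]
    fused = []
    for row in sensor_rows:
        idx = bisect_right(weather_times, row["time"]) - 1
        weather = weather_rows[idx] if idx >= 0 else {}
        merged = dict(row)
        # None when no weather reading precedes the sensor sample.
        merged["rain_mm"] = weather.get("rain_mm")
        fused.append(merged)
    return fused


# Illustrative values only.
sensor_rows = [
    {"time": datetime(2026, 5, 19, 6, 55), "conductivity": 415.0},
    {"time": datetime(2026, 5, 19, 7, 5), "conductivity": 410.0},
    {"time": datetime(2026, 5, 19, 7, 50), "conductivity": 290.0},
]
weather_rows = [
    {"time": datetime(2026, 5, 19, 7, 0), "rain_mm": 0.0},
    {"time": datetime(2026, 5, 19, 7, 45), "rain_mm": 4.2},
]
fused = fuse(sensor_rows, weather_rows)
```

With rainfall sitting next to each water quality reading, a drop in conductivity that coincides with rain can be treated differently from the same drop in dry weather.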

Section 04

System Operation Modes

CrayPulse provides three operation modes for different usage scenarios:

Training Mode (Train): The system fetches the past 30 days of historical data from the Strawberry Creek API, constructs the graph structure, and trains the GNN model from scratch. After training, the model weights are saved locally for later inference. This mode suits system initialization or cases where the model needs to be recalibrated.

Update Mode (Update): Starting from the existing model weights, the model is fine-tuned on newly collected data. This mode adapts to data distribution drift caused by seasonal changes or evolving environmental conditions while keeping the model stable.

Inference Mode (Inference): The system evaluates the latest 48 hours of data to detect potential abnormal events. This mode requires no training, starts quickly, and is suitable for running as a scheduled task.
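
The three modes could be exposed through a single command-line entry point, as in the sketch below. The argument name, the `plan` helper, and the returned dictionary are hypothetical. The 30-day and 48-hour windows come from the description above; whether update mode writes weights back to disk is an assumption.

```python
import argparse

# Look-back windows from the article; update mode has no stated window.
LOOKBACK_HOURS = {"train": 30 * 24, "inference": 48}


def parse_args(argv=None):
    """Parse a single positional argument selecting the operation mode."""
    parser = argparse.ArgumentParser(description="CrayPulse mode dispatcher (illustrative)")
    parser.add_argument("mode", choices=["train", "update", "inference"])
    return parser.parse_args(argv)


def plan(mode):
    """Describe what a run in the given mode would do, without doing it."""
    return {
        "mode": mode,
        "lookback_hours": LOOKBACK_HOURS.get(mode),
        # Assumption: update and inference both start from saved weights.
        "loads_weights": mode in ("update", "inference"),
        # Assumption: update saves the fine-tuned weights, as train does.
        "saves_weights": mode in ("train", "update"),
    }


train_plan = plan(parse_args(["train"]).mode)
inference_plan = plan(parse_args(["inference"]).mode)
```

Keeping the mode choice in one dispatcher makes it easy to run inference from a scheduler while reserving training and updating for manual or less frequent runs.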

Section 05

Real-Time Monitoring and Early Warning Mechanism

Round-the-clock (24/7) monitoring is implemented by a standalone run_live.py script. This service runs a complete detection cycle every 15 minutes: it fetches the latest data from the API, runs graph neural network inference, generates a visual report, and evaluates anomaly risk.
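
The cycle described here can be sketched as a small loop that chains the four steps and sleeps between runs. This is not the content of run_live.py; the step functions are stand-ins with made-up values, and the `sleep` parameter exists only so the loop can be exercised without waiting.

```python
import time

CYCLE_SECONDS = 15 * 60  # one detection cycle every 15 minutes, per the article


def run_cycles(steps, cycles, sleep=time.sleep, interval=CYCLE_SECONDS):
    """Run each step in order once per cycle, sleeping between cycles."""
    log = []
    for cycle in range(cycles):
        payload = None
        for step in steps:
            payload = step(payload)
        log.append(payload)
        if cycle < cycles - 1:
            sleep(interval)
    return log


# Stand-in steps; the real ones would call the API and the trained model.
def fetch(_):
    return {"ph": [7.1, 7.0, 9.4]}


def infer(data):
    return {**data, "score": max(data["ph"]) - min(data["ph"])}


def report(data):
    return {**data, "report": "reports/latest_report.png"}


def evaluate(data):
    return {**data, "alert": data["score"] > 2.0}


slept = []
log = run_cycles([fetch, infer, report, evaluate], cycles=2, sleep=slept.append)
```

A production loop would also need error handling so that one failed API call does not stop the service; that is left out to keep the sketch short.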

When the anomaly score exceeds a dynamic threshold that is adjusted for rainfall, the system triggers an email alert. The adaptive threshold accounts for the legitimate effect of rainfall on water quality parameters and so avoids false alarms caused by natural rain. Alerts are sent over SMTP to preconfigured recipient addresses so that the relevant personnel can respond promptly.

After each monitoring cycle, the system generates a visual report, saved as reports/latest_report.png, showing a snapshot of current stream health. These reports serve real-time monitoring and also provide material for long-term trend analysis.
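
The article does not give the threshold formula, so the sketch below assumes a simple one: the base threshold grows linearly with recent rainfall up to a cap. The coefficients, function names, and message wording are all hypothetical. The mail part uses the standard library's `EmailMessage` and `smtplib.SMTP_SSL`; the host, port, and credentials would come from configuration.

```python
import smtplib
from email.message import EmailMessage


def rain_adjusted_threshold(base, rain_mm, per_mm=0.02, cap=2.0):
    """Raise the alert threshold linearly with recent rainfall, up to cap * base."""
    factor = min(1.0 + per_mm * max(rain_mm, 0.0), cap)
    return base * factor


def build_alert(score, threshold, sender, recipient):
    """Return an EmailMessage when the score exceeds the threshold, else None."""
    if score <= threshold:
        return None
    msg = EmailMessage()
    msg["Subject"] = "CrayPulse anomaly alert"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(f"Anomaly score {score:.2f} exceeded threshold {threshold:.2f}.")
    return msg


def send(msg, host, port, user, password):
    """Deliver the message over an SSL-wrapped SMTP connection."""
    with smtplib.SMTP_SSL(host, port) as smtp:
        smtp.login(user, password)
        smtp.send_message(msg)


dry = rain_adjusted_threshold(0.5, rain_mm=0.0)
wet = rain_adjusted_threshold(0.5, rain_mm=25.0)
alert = build_alert(0.6, dry, "monitor@example.org", "team@example.org")
no_alert = build_alert(0.6, wet, "monitor@example.org", "team@example.org")
```

The same score of 0.6 raises an alert in dry weather but not after 25 mm of rain, which is the behavior the adaptive threshold is meant to produce.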

Section 06

Technical Implementation Details

The project is written in Python, with dependencies listed in requirements.txt. Because OpenSeesPy is used for some low-level physical simulations, Apple Silicon devices need to run an x86 Python environment under Rosetta.
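
A quick way to check which kind of interpreter is running is the standard `platform` module. The helper below is a hypothetical convenience, not part of the project; it relies on `platform.machine()` reporting `arm64` for a native Apple Silicon interpreter and `x86_64` for one running under Rosetta.

```python
import platform


def needs_rosetta(system, machine):
    """True for a native Apple Silicon interpreter, where an x86_64 one is needed."""
    return system == "Darwin" and machine == "arm64"


if needs_rosetta(platform.system(), platform.machine()):
    print("Native arm64 Python detected; start an x86_64 interpreter under Rosetta instead.")
```

On macOS, a universal Python build can usually be started in x86_64 mode with `arch -x86_64 python3`, though the exact setup depends on how Python was installed.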

Configuration is managed through an environment variable file (.env), which holds two kinds of sensitive information: the API access token for the Strawberry Creek monitoring network, and the SMTP credentials used to send alert emails (Gmail app passwords are supported). This keeps configuration flexible and avoids hard-coding secrets in the code repository.
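
As an illustration of this pattern, the sketch below parses a .env-style file with the standard library and fails fast when a required value is missing. The variable names (`CREEK_API_TOKEN`, `SMTP_USER`, `SMTP_PASSWORD`) are hypothetical placeholders; the article does not list the project's actual keys, and the project may well use a dedicated library for this.

```python
import os

REQUIRED = ("CREEK_API_TOKEN", "SMTP_USER", "SMTP_PASSWORD")  # hypothetical names


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blanks and # comments; strips optional quotes."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_settings(text, environ=None):
    """Let real environment variables override file values; fail on missing keys."""
    environ = os.environ if environ is None else environ
    overrides = {k: environ[k] for k in REQUIRED if k in environ}
    settings = {**parse_env(text), **overrides}
    missing = [k for k in REQUIRED if not settings.get(k)]
    if missing:
        raise KeyError(f"missing settings: {', '.join(missing)}")
    return settings


# Placeholder values only; never commit real secrets.
example = """
# CrayPulse settings
CREEK_API_TOKEN=abc123
SMTP_USER="alerts@example.org"
SMTP_PASSWORD='app-password'
"""
settings = load_settings(example, environ={})
```

Adding .env to .gitignore keeps the file itself out of version control, which is what makes this approach safer than constants in source code.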

Section 07

Application Value and Insights

The CrayPulse project demonstrates the practical potential of graph neural networks in environmental monitoring. By modeling the physical sensor network as a graph, the system can use spatial correlation between stations to improve anomaly detection. The multi-source data fusion and the adaptive threshold show attention to the complexity of real deployments.

The open-source release of this project gives similar aquatic ecosystem monitoring efforts a technical framework to build on. For stream monitoring in other cities, and for environmental IoT applications more broadly, the architecture and technology choices of CrayPulse are a useful reference. As the climate changes, intelligent environmental monitoring systems are likely to play a growing role in ecological protection and public health.