Zing Forum

AuraDent: A Real-Time Voice-Driven Dental Clinical Documentation Automation Platform

AuraDent is a real-time documentation platform for dental clinics. It combines Deepgram speech recognition, AI-driven extraction, and asynchronous processing on AWS to automatically convert doctors' chairside dictation into structured medical records, treatment charts, and post-treatment guidelines.

Tags: Medical AI, Speech Recognition, Clinical Documentation, Dental, Deepgram, AWS Lambda, PII Desensitization
Published 2026-04-27 06:44 | Recent activity 2026-04-27 07:23 | Estimated read 5 min

Section 01

Introduction: AuraDent, a Real-Time Voice-Driven Dental Clinical Documentation Automation Platform

AuraDent is a real-time documentation platform for dental clinics. It combines Deepgram speech recognition, AI-driven extraction, and asynchronous processing on AWS to automatically convert doctors' chairside dictation into structured medical records, treatment charts, and post-treatment guidelines.

Section 02

Pain Points and Opportunities in Clinical Documentation

During dental treatment, doctors have to write medical records, update treatment charts, and prepare post-treatment guidelines while also treating the patient. This documentation work is both time-consuming and error-prone. The traditional approach is to fill in records from memory after treatment, which makes it hard to guarantee that the information is accurate and complete. AuraDent was created to address this pain point: doctors focus on treatment while AI handles the documentation.

Section 03

System Architecture Overview

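AuraDent is organized as a TypeScript monorepo that integrates real-time voice processing, AI-driven extraction, and asynchronous post-processing. The system is divided into five core modules, each covered in a section below: the real-time gateway, the intelligent agent core, the web frontend, the normalization layer, and the asynchronous worker.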

Section 04

Real-Time Gateway

The real-time gateway, built with Fastify and WebSocket, is the system's entry point. It receives audio streams from the browser front end and forwards them to Deepgram for speech recognition. The gateway manages session lifecycles, distinguishes between partial and final transcriptions, and performs PII (Personally Identifiable Information) desensitization, that is, redaction of identifying details, before any content is sent to the AI.
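
As a rough illustration of the desensitization step, the sketch below redacts a few PII patterns and only passes final transcriptions on to the agent. Everything here is an assumption made for illustration: the `TranscriptEvent` shape, the `redactPii` and `routeTranscript` names, the three regex patterns, and the choice to forward finals only. None of it is taken from AuraDent's code or from Deepgram's response schema.

```typescript
// Hypothetical transcript event; field names are assumptions, not Deepgram's schema.
interface TranscriptEvent {
  text: string;
  isFinal: boolean;
}

// Illustrative patterns only; real PII detection needs much broader coverage.
const PII_PATTERNS: Array<[RegExp, string]> = [
  [/\b\d{3}-\d{2}-\d{4}\b/g, "[SSN]"],
  [/\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b/g, "[PHONE]"],
  [/\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b/g, "[EMAIL]"],
];

// Apply every pattern in order, replacing each match with its label.
function redactPii(text: string): string {
  return PII_PATTERNS.reduce(
    (acc, [pattern, label]) => acc.replace(pattern, label),
    text,
  );
}

// Assumption: partial results only refresh the live display,
// while final results are redacted and forwarded to the agent.
function routeTranscript(
  event: TranscriptEvent,
): { display: string; forAgent: string | null } {
  return {
    display: event.text,
    forAgent: event.isFinal ? redactPii(event.text) : null,
  };
}
```

In this sketch a partial event yields `forAgent: null`, so nothing reaches the model until a segment is final and has been redacted.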

Section 05

Intelligent Agent Core

This is the system's "brain", built on the Vercel AI SDK. The agent receives the desensitized transcribed text and extracts structured clinical findings through typed tool calls and Zod validation. For example, when a doctor says "The patient's lower right second molar needs root canal treatment", the agent identifies the tooth position (#31 in the Universal Numbering System) and the diagnosis (needs root canal treatment), then updates the corresponding data structure.
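
To make the idea of a validated finding concrete, here is a minimal sketch of the kind of check a schema would enforce on the agent's output. It is a hand-rolled type guard in plain TypeScript standing in for the Zod validation mentioned above, not Zod itself, and the `ToothFinding` shape and `parseToothFinding` name are hypothetical; the post does not show AuraDent's actual schema.

```typescript
// Hypothetical shape of one extracted finding.
interface ToothFinding {
  tooth: number;     // Universal Numbering System, 1 to 32
  condition: string; // e.g. "needs root canal treatment"
}

// Hand-rolled guard standing in for a Zod schema.
// Returns a cleaned finding, or null when the input is not valid.
function parseToothFinding(input: unknown): ToothFinding | null {
  if (typeof input !== "object" || input === null) return null;
  const { tooth, condition } = input as Record<string, unknown>;
  if (
    typeof tooth !== "number" ||
    !Number.isInteger(tooth) ||
    tooth < 1 ||
    tooth > 32
  ) {
    return null;
  }
  if (typeof condition !== "string" || condition.trim() === "") return null;
  return { tooth, condition: condition.trim() };
}
```

For the dictation in the example above, a tool call carrying `{ tooth: 31, condition: "needs root canal treatment" }` would pass this check, while a tooth number outside 1 to 32 or a non-numeric tooth value would be rejected before it touches the chart.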

Section 06

Web Frontend

The clinical terminal interface built with React + Vite provides real-time feedback to doctors. The interface includes:

  • Waveform Visualization: Displays microphone activity status
  • Transcription Area: Shows partial and final transcribed text
  • Treatment Chart: Animates updates to tooth status
  • Tracking View: Displays the agent's thinking process, tool calls, and completion events
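
One common way to drive a transcription area like the one listed above is a small reducer in which partial text overwrites a single interim line and final text is appended to a committed list. The sketch below assumes that pattern; the `TranscriptState`, `TranscriptAction`, and `transcriptReducer` names are hypothetical and not taken from AuraDent's frontend.

```typescript
interface TranscriptState {
  committed: string[]; // final segments, in the order they arrived
  interim: string;     // latest partial text, replaced on every update
}

type TranscriptAction =
  | { type: "partial"; text: string }
  | { type: "final"; text: string };

// Pure reducer: returns a new state object and never mutates the old one.
function transcriptReducer(
  state: TranscriptState,
  action: TranscriptAction,
): TranscriptState {
  switch (action.type) {
    case "partial":
      // A newer partial replaces the previous one.
      return { ...state, interim: action.text };
    case "final":
      // A final segment is committed and the interim line is cleared.
      return { committed: [...state.committed, action.text], interim: "" };
  }
}
```

A function of this shape can be passed to React's `useReducer` hook, with the WebSocket message handler dispatching one action per transcription event.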

Section 07

Normalization Layer (Ingestion)

The normalization layer converts the raw structured data extracted by the agent into a record format suitable for persistence. This includes deduplication logic (merging multiple mentions of the same tooth) and source tracing (recording which voice segment each finding came from).
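
The deduplication and source tracing described above could look roughly like the following. This is a sketch under stated assumptions: the `RawFinding` and `ToothRecord` shapes, the `segmentId` field, and the `normalizeFindings` function are all hypothetical, and the merge rule (one record per tooth, with unique conditions and unique source segments) is just one plausible reading of "merging multiple mentions of the same tooth".

```typescript
interface RawFinding {
  tooth: number;
  condition: string;
  segmentId: string; // hypothetical ID of the transcript segment behind the finding
}

interface ToothRecord {
  tooth: number;
  conditions: string[];     // unique conditions mentioned for this tooth
  sourceSegments: string[]; // every segment that mentioned this tooth
}

// Merge all mentions of the same tooth into one record, keeping provenance.
function normalizeFindings(findings: RawFinding[]): ToothRecord[] {
  const byTooth = new Map<number, ToothRecord>();
  for (const f of findings) {
    const record =
      byTooth.get(f.tooth) ??
      { tooth: f.tooth, conditions: [], sourceSegments: [] };
    if (!record.conditions.includes(f.condition)) {
      record.conditions.push(f.condition);
    }
    if (!record.sourceSegments.includes(f.segmentId)) {
      record.sourceSegments.push(f.segmentId);
    }
    byTooth.set(f.tooth, record);
  }
  // Sort by tooth number so the output order is stable.
  return Array.from(byTooth.values()).sort((a, b) => a.tooth - b.tooth);
}
```

If tooth 31 is mentioned in two different segments with the same diagnosis, this produces a single record for tooth 31 with one condition and two source segments, which keeps the persisted record compact without losing the link back to the audio.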

Section 08

Asynchronous Worker

The asynchronous worker is a post-processing module that runs on AWS Lambda. When a session ends, the gateway sends the session data (desensitized transcriptions, structured findings, tracking records, and performance metrics) to an SQS queue. Each message triggers the worker, which generates the post-treatment guidelines as a PDF, simulates insurance pre-authorization, and writes the complete record to PostgreSQL.
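
The worker's entry point might be structured along these lines. The sketch defines minimal local types for the parts of the SQS event it touches (`Records`, `messageId`, `body`) and returns the `batchItemFailures` response shape that Lambda accepts when partial batch responses are enabled on the event source mapping. The `SessionPayload` fields and the `parseSession`, `handleBatch`, and `processSession` names are assumptions for illustration; PDF generation, pre-authorization, and the PostgreSQL write are left behind the callback.

```typescript
// Minimal local types covering only the SQS event fields used here.
interface SqsRecord { messageId: string; body: string; }
interface SqsEvent { Records: SqsRecord[]; }
interface BatchResponse {
  batchItemFailures: Array<{ itemIdentifier: string }>;
}

// Hypothetical session payload; field names are assumptions.
interface SessionPayload {
  sessionId: string;
  transcript: string[];
  findings: unknown[];
}

// Parse and minimally validate one message body; null means "unusable".
function parseSession(body: string): SessionPayload | null {
  try {
    const data: unknown = JSON.parse(body);
    if (typeof data !== "object" || data === null) return null;
    const { sessionId, transcript, findings } = data as Record<string, unknown>;
    if (
      typeof sessionId !== "string" ||
      !Array.isArray(transcript) ||
      !Array.isArray(findings)
    ) {
      return null;
    }
    return { sessionId, transcript: transcript.map(String), findings };
  } catch {
    return null; // body was not valid JSON
  }
}

// Process each record independently and report only the failed ones,
// so one bad message does not force the whole batch to be retried.
function handleBatch(
  event: SqsEvent,
  processSession: (session: SessionPayload) => void,
): BatchResponse {
  const batchItemFailures: Array<{ itemIdentifier: string }> = [];
  for (const record of event.Records) {
    const session = parseSession(record.body);
    if (session === null) {
      batchItemFailures.push({ itemIdentifier: record.messageId });
      continue;
    }
    try {
      processSession(session);
    } catch {
      batchItemFailures.push({ itemIdentifier: record.messageId });
    }
  }
  return { batchItemFailures };
}
```

Keeping the per-session work behind a callback makes the batch logic easy to test without AWS, and a real handler would typically wrap this in an async function that awaits the database write.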