Zing Forum

DefectBench: A Unified Benchmark Dataset for Multimodal Large Models in Building Facade Defect Detection

DefectBench is a multi-level dataset and benchmark framework specifically designed for building facade defect detection, aiming to promote the application and evaluation of large multimodal models in the field of architectural engineering.

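Tags: multimodal large models, building facades, defect detection, benchmark dataset, computer vision, architectural engineering, automated inspection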
Published 2026-04-09 23:22 | Recent activity 2026-04-09 23:56 | Estimated read 7 min

Section 01

[Overview] DefectBench: A Unified Benchmark Dataset for Multimodal Large Models in Building Facade Defect Detection

DefectBench is presented as the first multi-level dataset and benchmark framework designed specifically for building facade defect detection. It targets the low efficiency, high cost, and frequent misjudgments of traditional manual inspection, and aims to promote the application and fair evaluation of multimodal large models in architectural engineering. The open-source project combines a multi-level annotation system with multimodal data fusion, giving researchers and engineers a comprehensive evaluation platform.


Section 02

Background: Pain Points of Traditional Building Detection and the Birth of DefectBench

Building facade inspection is an important part of urban safety management, but traditional manual inspection is inefficient, costly, and prone to missed detections and subjective misjudgments. Advances in multimodal large models for computer vision have made automatic building defect detection feasible. Yet the field has long lacked a standardized, multi-level evaluation benchmark, which has held back research progress and made models hard to compare. DefectBench was created as the first unified benchmark framework for this scenario.


Section 03

Project Overview: Core Objectives and Design Philosophy of DefectBench

DefectBench is an open-source GitHub project developed and maintained by the Whitneyyyyy team. Its core objective is to establish an end-to-end benchmark system covering data collection, annotation standards, and evaluation metrics, in order to promote the practical adoption of multimodal large models. Unlike traditional single-granularity datasets, it adopts a multi-level design that records defect type, severity, and spatial location, providing richer supervision signals.


Section 04

Technical Details: Multi-level Annotation and Multimodal Data Fusion

Multi-level Data Annotation System

DefectBench contains at least four levels of annotation:

  1. Image level: Overall quality assessment and scene classification
  2. Defect level: Bounding box localization and pixel-level segmentation
  3. Semantic level: Fine-grained defect type classification (cracks, spalling, etc.)
  4. Attribute level: Meta-information such as severity, impact range, and priority
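One way to picture the four levels is as a nested record, with image-level fields at the top and a list of defects underneath. The sketch below is only an illustration: the class names, field names, and example values are assumptions, not the schema actually used by DefectBench.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class DefectAnnotation:
    # Defect level: bounding box as (x_min, y_min, x_max, y_max) in pixels.
    bbox: Tuple[float, float, float, float]
    # Defect level: optional path to a pixel-level segmentation mask.
    mask_path: Optional[str]
    # Semantic level: fine-grained defect type, e.g. "crack" or "spalling".
    defect_type: str
    # Attribute level: hypothetical meta-information fields.
    severity: str
    priority: int


@dataclass
class ImageAnnotation:
    # Image level: overall quality assessment and scene classification.
    image_id: str
    quality: str
    scene: str
    defects: List[DefectAnnotation] = field(default_factory=list)

    def defects_of_type(self, defect_type: str) -> List[DefectAnnotation]:
        """Return all defects in this image with the given semantic label."""
        return [d for d in self.defects if d.defect_type == defect_type]


# Made-up example record, purely for illustration.
record = ImageAnnotation(
    image_id="facade_0001",
    quality="good",
    scene="residential",
    defects=[
        DefectAnnotation((10, 20, 110, 60), None, "crack", "moderate", 2),
        DefectAnnotation((200, 150, 260, 230), None, "spalling", "severe", 1),
    ],
)
cracks = record.defects_of_type("crack")
```

Keeping the attribute-level fields on each defect, rather than on the image, lets a single photo carry defects of different severity and priority.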

Multimodal Data Fusion

DefectBench integrates multiple data sources:

  • Visible light images: High-resolution facade photos
  • Depth information: 3D geometric data
  • Infrared thermal imaging: Detection of internal hollowing and leakage
  • Text descriptions: Engineering reports and maintenance records

The multi-level structure improves detection accuracy and interpretability, while multimodal fusion simulates the comprehensive judgment process of professional engineers.
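In practice, not every facade photo will necessarily come with depth, thermal, and text data, so a loader has to tolerate missing modalities. The following is a minimal sketch under that assumption; `FacadeSample`, its field names, and the file paths are hypothetical and do not describe the project's real file layout.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FacadeSample:
    # Visible light image is treated as the mandatory modality (an assumption).
    rgb_path: str
    # Remaining modalities are optional and default to "not available".
    depth_path: Optional[str] = None
    thermal_path: Optional[str] = None
    report_text: Optional[str] = None

    def available_modalities(self) -> List[str]:
        """List the modalities present for this sample, in a fixed order."""
        names = ["rgb"]
        if self.depth_path is not None:
            names.append("depth")
        if self.thermal_path is not None:
            names.append("thermal")
        if self.report_text is not None:
            names.append("text")
        return names


# A sample with a photo and a thermal image, but no depth map or report.
sample = FacadeSample(rgb_path="img/0001.jpg", thermal_path="ir/0001.png")
modalities = sample.available_modalities()
```

A model or evaluation script can then branch on `available_modalities()` instead of assuming all four sources are always present.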


Section 05

Evaluation System: Multi-dimensional Metrics to Support Model Performance Verification

DefectBench establishes a systematic evaluation scheme with metrics including:

  • Detection quality: Precision and recall of defect localization
  • Classification accuracy: Share of defects whose type is recognized correctly
  • Severity assessment: Accuracy of urgency judgments
  • Inference efficiency: Response speed in actual deployment
  • Generalization ability: Transfer performance across building types and climate regions

These metrics help researchers fully understand the strengths and weaknesses of models and make targeted improvements.
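As an illustration of the localization metric, precision and recall for bounding boxes are commonly computed by matching predictions to ground truth with an intersection-over-union (IoU) threshold. The sketch below assumes a threshold of 0.5 and a simple greedy matching in prediction order. It is not DefectBench's official scoring code, and the function names are made up for this example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: List[Box], truths: List[Box], threshold: float = 0.5
) -> Tuple[float, float]:
    """Greedily match each prediction to its best unmatched ground-truth box."""
    matched = set()
    true_positives = 0
    for p in preds:
        best_j, best_iou = -1, 0.0
        for j, t in enumerate(truths):
            if j in matched:
                continue  # each ground-truth box may be matched only once
            value = iou(p, t)
            if value > best_iou:
                best_j, best_iou = j, value
        if best_j >= 0 and best_iou >= threshold:
            matched.add(best_j)
            true_positives += 1
    precision = true_positives / len(preds) if preds else 0.0
    recall = true_positives / len(truths) if truths else 0.0
    return precision, recall


# One prediction hits a defect exactly; the other hits nothing.
preds = [(0, 0, 10, 10), (50, 50, 60, 60)]
truths = [(0, 0, 10, 10), (100, 100, 110, 110)]
precision, recall = precision_recall(preds, truths)
```

Real benchmarks usually sort predictions by confidence before matching and report results over several IoU thresholds, which this sketch leaves out for brevity.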


Section 06

Application Value: Promoting Intelligent Building Detection and Industry-University-Research Integration

DefectBench matters to the industry in three ways:

  1. It provides a standardized research platform for the academic community, accelerating algorithm iteration.
  2. Validated models can be deployed in drone and robot inspection scenarios to improve efficiency and safety.
  3. Its open design encourages collaboration among industry, universities, and research institutes, along with data and experience sharing, accelerating the industry's intelligent transformation.

Section 07

Conclusion: A New Starting Point for the Intersection of AI and Architectural Engineering

DefectBench is an important step at the intersection of AI and traditional architectural engineering. By unifying a multi-level dataset and a benchmark framework, it provides a valuable resource for multimodal large model research and opens a new path toward automated, intelligent building facade inspection. We look forward to more researchers joining in and driving further progress in the field.