Comprehensive Study on Underwater Object Detection Based on Deep Neural Networks: Application of YOLOv8 and YOLOv9 in Marine Debris Recognition

This article introduces an in-depth study on underwater object detection, which uses YOLOv8 and YOLOv9 models trained on the TrashCan 1.0 dataset. Through 32 experimental configurations, it evaluates the impact of different model variants, category distributions, and learning rates on detection performance, providing technical references for marine environmental protection and underwater robot vision.

Underwater object detection, YOLOv8, YOLOv9, deep learning, marine debris detection, computer vision, autonomous underwater vehicles, IEEE Access

Published 2026-05-06 10:44 | Recent activity 2026-05-06 10:49 | Estimated read 6 min

Section 01

Introduction: Comprehensive Study of YOLOv8 and YOLOv9 in Underwater Marine Debris Detection

This article conducts an in-depth study on underwater object detection, using YOLOv8 and YOLOv9 models with 32 experimental configurations on the TrashCan 1.0 dataset. It evaluates the impact of model variants, category distributions, and learning rates on detection performance, providing technical references for marine environmental protection and underwater robot vision.

Section 02

Research Background and Significance

Oceans cover more than 70% of the Earth's surface, but marine debris pollution has become a global problem: an estimated 8 million tons or more of plastic waste enter the oceans each year, threatening ecosystems. Traditional underwater monitoring relies on manual diving or simple camera equipment, which is inefficient and costly. Underwater imagery also suffers from light attenuation and water turbidity, which make conventional computer vision algorithms less effective. Deep learning, and the YOLO family of detectors in particular, offers a new way to address these problems.

Section 03

Experimental Design and Methodology

The study runs 32 experiments across three variables: model variant, dataset structure, and learning rate. The count is consistent with a full grid of 8 model variants × 2 dataset configurations × 2 learning rates:

Model Selection: Multiple variants of YOLOv8 (n/s/m/l) and YOLOv9 (t/s/m/c);
Dataset Configuration: The TrashCan 1.0 dataset is organized in two ways, a three-category version (debris, animals, ROV) and a four-category version (which adds plants), each with training, validation, and test images;
Training Parameters: All models are trained for 100 epochs, with learning rates set to two levels: 0.01 and 0.0001.
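
The design above can be sketched as a plain parameter grid. This is a minimal illustration, not the authors' code: it assumes the 32 configurations are the full cross product of the listed variables, and the model names (`yolov8n`, `yolov9t`, and so on), the dataset labels (`three_class`, `four_class`), and the config keys are hypothetical names chosen for this example.

```python
from itertools import product

# Variant letters come from the article; the "yolov8n"-style names are an
# assumed naming convention, not something the article confirms.
YOLOV8_VARIANTS = ["n", "s", "m", "l"]
YOLOV9_VARIANTS = ["t", "s", "m", "c"]
MODELS = [f"yolov8{v}" for v in YOLOV8_VARIANTS] + [
    f"yolov9{v}" for v in YOLOV9_VARIANTS
]

# Hypothetical labels for the two dataset configurations.
DATASETS = {
    "three_class": ["debris", "animals", "ROV"],
    "four_class": ["debris", "animals", "ROV", "plants"],
}

LEARNING_RATES = [0.01, 0.0001]
EPOCHS = 100  # every model is trained for 100 epochs


def build_experiment_grid():
    """Return one config dict per (model, dataset, learning rate) combination."""
    grid = []
    for model, dataset, lr in product(MODELS, DATASETS, LEARNING_RATES):
        grid.append(
            {
                "model": model,
                "dataset": dataset,
                "classes": DATASETS[dataset],
                "learning_rate": lr,
                "epochs": EPOCHS,
            }
        )
    return grid


if __name__ == "__main__":
    experiments = build_experiment_grid()
    print(f"{len(experiments)} configurations")
    for cfg in experiments[:3]:
        print(cfg)
```

Each dictionary could then be handed to whatever training entry point is in use; keeping the grid separate from the training call makes it easy to check that no combination is missing or duplicated.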

Section 04

Key Findings and Result Analysis

Positive correlation between model capacity and performance: YOLOv9c performs best on the three-category dataset, while YOLOv8l leads on the four-category dataset;
Impact of category count on detection difficulty: Models perform better on the three-category dataset than on the four-category one, suggesting that a simpler category scheme can improve accuracy;
Evaluation metrics: High-capacity models have advantages in precision, recall, and mAP metrics, with YOLOv8l and YOLOv9c showing outstanding performance.
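
For readers unfamiliar with the metrics named above, the following is a minimal sketch of how precision and recall are typically computed for a detector on a single image. It assumes axis-aligned boxes in `(x1, y1, x2, y2)` form, an IoU threshold of 0.5, and greedy matching in order of confidence; the dictionary layout and function names are hypothetical, and the article does not describe the study's own evaluation code. mAP, which averages the area under the precision-recall curve over classes, is not computed here.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(predictions, ground_truths, iou_threshold=0.5):
    """Match predictions to ground truth one-to-one and return (precision, recall).

    Each prediction is {"label", "box", "score"}; each ground truth is
    {"label", "box"}. Higher-confidence predictions are matched first.
    """
    matched = set()
    true_positives = 0
    for pred in sorted(predictions, key=lambda p: p["score"], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truths):
            if idx in matched or gt["label"] != pred["label"]:
                continue
            overlap = iou(pred["box"], gt["box"])
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_positives += 1
    precision = true_positives / len(predictions) if predictions else 0.0
    recall = true_positives / len(ground_truths) if ground_truths else 0.0
    return precision, recall


if __name__ == "__main__":
    truths = [
        {"label": "debris", "box": (0, 0, 10, 10)},
        {"label": "ROV", "box": (20, 20, 30, 30)},
    ]
    preds = [
        {"label": "debris", "box": (0, 0, 10, 10), "score": 0.9},
        {"label": "animals", "box": (50, 50, 60, 60), "score": 0.8},
    ]
    print(precision_recall(preds, truths))
```

In the usage example, one of two predictions matches a ground-truth box and one of two ground-truth objects is found, so both precision and recall come out at one half.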

Section 05

Practical Application Value

Marine Environmental Protection: Automated detection systems can be deployed on AUVs/ROVs to achieve large-scale, long-term monitoring with low cost and wide coverage;
Underwater Robot Vision: Provides a technical foundation for autonomous navigation, target tracking, etc.;
Expanded Fields: Can be applied to underwater archaeology, marine biology research, underwater facility inspection, etc.

Section 06

Research Limitations and Future Directions

Limitations: The dataset is not included in the code repository, so users must obtain it independently, which raises the barrier to reproduction; model weights may also need to be obtained separately or retrained.

Future Directions: Explore lightweight models to adapt to embedded devices; study multi-modal fusion (sonar + optical images); develop online learning mechanisms to adapt to different underwater environments.

Section 07

Conclusion

This study comprehensively evaluates the performance of YOLOv8 and YOLOv9 in underwater object detection through systematic experiments. It finds that model capacity is positively correlated with detection performance, and controlling category complexity has important implications for practical applications. The research results provide references for the development of underwater computer vision systems and contribute to marine environmental protection.
