A CNN Plastic Detection Model for Embedded Platform of ROV

Plastic pollution has a negative influence on biodiversity, especially in aquatic ecosystems, and it has been labelled as one of the greatest dangers to biota. This paper proposes a Convolutional Neural Network (CNN) based plastic detection model for an embedded platform to identify different shapes of underwater plastics such as bags, bottles, containers, cups, nets, pipes, ropes, snack wrappers and tarps. The model is optimized for the Raspberry Pi using the OpenVINO framework, with the intention of producing a cost-effective edge system for a Remotely Operated Vehicle (ROV). The model is developed from a pre-trained YOLOv5 object detection model and is trained and tested on the TrashCan 1.0 dataset. The final model performs well, achieving more than 85% accuracy in the overall prediction, which indicates its reliability in detecting and classifying underwater plastic shapes. The results highlight the potential of real-time deep learning (DL) inference at the edge on a cost-effective embedded platform, rather than on a separate computer on land.


Introduction
Plastic waste poses a significant threat to aquatic life as well as human well-being. Most plastic waste, which takes many types and shapes such as food wrappers, straws, bottles, and bags, originates on land and ends up in the ocean through rivers and streams [1].
Identifying and classifying underwater plastic waste is a step towards devising effective strategies for reducing it in the underwater environment. Some of the challenges of underwater plastic waste identification and classification are highlighted in [2]. The precise quantification of plastic waste in aquatic ecosystems remains a challenge due to the lack of standardized methodologies and the complex nature of plastic debris [3]. Common identification methods, including manual sampling and visual inspection, are time-consuming and subjective [4]. Furthermore, these methods are prone to errors and inconsistencies, limiting the ability to accurately quantify plastic waste by type and shape [5]. This may hinder understanding of the overall quantity of plastic waste entering the ecosystems, consequently impeding the development of effective mitigation strategies [6].
To address these challenges, a system that leverages an embedded platform and DL algorithms to detect and classify underwater plastic waste [7] is needed. One of the aims of this work is to implement a cost-effective plastic detection model on an embedded platform to detect and classify different shapes of underwater plastic waste.

Related Works
ROVs have become a widely used technology for monitoring underwater marine debris [8]. One of the earliest works to use DL for underwater detection is based on sonar data and classifies objects as metal, plastic, rubber, background, glass, or cardboard with an accuracy of 71% [9].
For underwater image analysis, the study in [10] uses DL to identify marine debris in real-world conditions. Using actual RGB photographs and recordings from various locations to evaluate a CNN with regard to water clarity, years, and different types and sizes of litter, it yields promising results for real-time detection. Further research validates the result with accuracy rates between 88% and 95% [11]. Several research papers investigate the use of DL for detecting and addressing marine debris. One of the studies [13] achieves an 86% success rate using transfer learning with the VGG-16 model to detect airborne debris. The authors produced a dataset of 12,000 images, categorized into three types of floating trash, and were able to achieve 99% accuracy. The study in [13] compared object recognition algorithms and found that Faster R-CNN scores the highest accuracy, while YOLOv2 offers a balance between speed and accuracy.
Real-time detection of marine debris is possible using DL algorithms. In [14], YOLOv5 outperformed Faster R-CNN in detecting macro debris, although hyperparameter values might have influenced the results. In a comparison of a CNN with the "Bag of Features" approach [15], the CNN proved faster and more accurate in tracking and identifying floating trash. Furthermore, the project in [16] explored the use of LSTM and association rules to identify water contaminants, emphasizing the need to combine it with DL for accurate identification.

System Modelling and Development
The overall process of developing the system involves two stages. The first stage is underwater plastic detection (UwPD) model training, and the second stage is UwPD integration on the embedded platform for inference.

UwPD Model
In the first stage, images from the TrashCan 1.0 dataset are selected according to their relevance and suitability for training, validation, and testing of the UwPD model. The images undergo a preprocessing stage and are segregated into pre-determined classes as shown in Table 1. Then, the objects of interest in the images are carefully annotated with labels and bounding boxes using the Roboflow annotation tool. Afterwards, the annotated images are resized, and their orientations are corrected; this is the preprocessing step. The pre-processed dataset covers a diverse range of plastic waste, allowing the model to learn and differentiate between different shapes and characteristics of underwater plastics.
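The annotation step can be illustrated with a minimal sketch. It assumes that the bounding boxes are exported in the normalized text label format commonly used with YOLOv5 (one line per object: class index, box centre x, box centre y, width, height, all as fractions of the image size). The paper does not state the export format, and the function names and the example box below are hypothetical.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]


def to_yolo_box(x_min: float, y_min: float, x_max: float, y_max: float,
                img_w: int, img_h: int) -> Box:
    """Convert a pixel-space corner box to normalized (cx, cy, w, h)."""
    if not (0 <= x_min < x_max <= img_w and 0 <= y_min < y_max <= img_h):
        raise ValueError("box must lie inside the image and have positive area")
    cx = (x_min + x_max) / 2.0 / img_w   # box centre, fraction of image width
    cy = (y_min + y_max) / 2.0 / img_h   # box centre, fraction of image height
    w = (x_max - x_min) / img_w          # box width, fraction of image width
    h = (y_max - y_min) / img_h          # box height, fraction of image height
    return cx, cy, w, h


def yolo_label_line(class_id: int, box: Box) -> str:
    """Format one annotation as a single line of a label file."""
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in box)


if __name__ == "__main__":
    # Hypothetical box on a 400x500 image, class index 3 (illustrative only).
    print(yolo_label_line(3, to_yolo_box(100, 50, 300, 250, 400, 500)))
```

Because the coordinates are stored as fractions of the image size, a plain resize of the image (without padding or cropping) leaves the label values unchanged, which is one reason annotation can precede resizing.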
The dataset also includes observations of animals, plants, and different types of trash, providing a representation of real underwater environments. This enables the model to adapt to different settings and backgrounds for accurate underwater detection and classification. With a total of 4095 images, the dataset offers a substantial amount of data, which is crucial for the UwPD model to learn robust features and patterns related to different shapes of underwater plastic waste. The class distribution is relatively balanced, with sufficient representation of plastic trash in different categories, amounting to 1683 images. This helps to reduce bias during model training and supports accurate detection and classification across all categories of plastic waste.
Following the preprocessing task, the model is trained on the dataset based on the configuration files in the PyTorch model training framework. The model training task involves multiple steps, including choosing the right object detection model, modifying configuration scripts, and executing the scripts. The YOLOv5 algorithm employs a CNN to detect objects in real time. Thus, the YOLOv5 object detection model was chosen as the foundation for the UwPD model based on its balance of accuracy, efficiency, and ease of implementation as reported in previous works [13,14]. A minimum accuracy of 85% is set as the baseline for an efficient UwPD model.
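The class balance noted for the dataset can be checked before training. The sketch below counts annotated instances per class, again assuming text label files in which the first token of each line is the class index. The three class names are taken from the results discussed in this paper, while the label lines and the helper names are made up for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List


def count_instances(label_files: Iterable[List[str]],
                    names: List[str]) -> Dict[str, int]:
    """Count annotated objects per class across a set of label files.

    Each element of label_files is the list of lines of one label file.
    """
    counts: Counter = Counter()
    for lines in label_files:
        for line in lines:
            parts = line.split()
            if not parts:          # skip blank lines
                continue
            counts[names[int(parts[0])]] += 1
    return {name: counts.get(name, 0) for name in names}


def imbalance_ratio(counts: Dict[str, int]) -> float:
    """Ratio of the most to the least frequent non-empty class."""
    nonzero = [c for c in counts.values() if c > 0]
    if not nonzero:
        raise ValueError("no annotated instances found")
    return max(nonzero) / min(nonzero)


if __name__ == "__main__":
    names = ["trash_cup", "trash_net", "trash_rope"]
    # Toy label contents, not taken from the TrashCan 1.0 dataset.
    files = [
        ["0 0.5 0.5 0.2 0.2", "1 0.3 0.3 0.1 0.1"],
        ["0 0.4 0.6 0.2 0.3"],
        ["2 0.7 0.2 0.1 0.4", "0 0.1 0.1 0.1 0.1"],
    ]
    counts = count_instances(files, names)
    print(counts, imbalance_ratio(counts))
```

A ratio close to 1 indicates a balanced set. A large ratio flags classes with few examples, which may be worth reviewing before training.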

Embedded Platform
The next stage in the process is the optimization of the UwPD model on the embedded platform for inference. One of the goals of this study is to produce a cost-effective prototype with a camera module that can perform edge processing in the ROV. The embedded platform that meets this criterion is the Raspberry Pi.
The mechanical structure of an ROV may include a three-thruster frame arrangement for underwater propulsion. Electronic and propulsion control elements are included, such as a Raspberry Pi companion computer that runs the UwPD model, a Pixhawk flight controller, and various power supplies. The control system for the ROV utilizes open-source software and the MAVLink communication protocol. It is worth noting that the focus of this project is to develop the UwPD model and study its performance on the cost-effective embedded platform. It is not the intention of this project to develop a complete ROV.
Running the UwPD model for inference relies on the OpenVINO Runtime. As an embedded platform, the Raspberry Pi runs a Linux operating system, and the runtime provides the execution environment for the model. First, the model must be converted to Intermediate Representation (IR) files (.xml and .bin), which serve as input to the OpenVINO Runtime. This task is known as model optimization, and the IR is produced by one of the OpenVINO Development Tools known as Model Optimizer. Other tools such as the Post-Training Optimization Tool, Open Model Zoo Downloader and Benchmark Tool are provided as well in the development kit. Although pre-trained models are included in the development tools, developers have the option to develop their own models using a machine learning modelling framework. The UwPD model was developed with the latter option. Once the OpenVINO Runtime is installed on the embedded platform, a machine learning application that uses the UwPD model for inference can be developed.
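The loading step described above can be sketched as follows. This is an illustration, not the authors' application code. It assumes that the IR pair has already been produced, that the OpenVINO Python package exposes `openvino.runtime.Core` (the import path differs between releases), and that the target device is named "CPU". The file name `uwpd.xml` is hypothetical. The OpenVINO import is placed inside the function so that the path helper can be used on a machine without the runtime installed.

```python
from pathlib import Path
from typing import Tuple


def ir_pair(xml_path: str) -> Tuple[Path, Path]:
    """Return the (.xml, .bin) paths of an IR model given its .xml file.

    Assumes the weights file shares the base name of the .xml file.
    """
    xml = Path(xml_path)
    if xml.suffix != ".xml":
        raise ValueError("expected the .xml half of an IR pair")
    return xml, xml.with_suffix(".bin")


def load_model(xml_path: str, device: str = "CPU"):
    """Read the IR files and compile them for the given device."""
    # Imported lazily; requires the OpenVINO Runtime on the target platform.
    from openvino.runtime import Core

    xml, weights = ir_pair(xml_path)
    if not (xml.exists() and weights.exists()):
        raise FileNotFoundError(f"IR files not found: {xml}, {weights}")
    core = Core()
    model = core.read_model(model=str(xml), weights=str(weights))
    return core.compile_model(model, device)


def infer(compiled_model, batch):
    """Run one inference and return the first output tensor.

    batch is expected to be an array already shaped for the model input.
    """
    return compiled_model([batch])[compiled_model.output(0)]


if __name__ == "__main__":
    print(ir_pair("models/uwpd.xml"))  # hypothetical file name
```

Image preprocessing and decoding of the detection output depend on how the model was exported and are therefore left out of this sketch.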

Result and Analysis
The UwPD model is evaluated according to three criteria: Confusion Matrix, F1-confidence Curve, and challenging cases.

Confusion Matrix
The confusion matrix serves as a performance evaluation for the UwPD model based on different images from the dataset. As depicted in Fig. 1, the matrix shows predicted classes in columns and true classes in rows. The diagonal entries give the fraction of instances of each class that the model predicts correctly. Off-diagonal entries represent misclassifications, showing the rate at which one class is predicted as another. The 'background' row represents instances not classified in any of the classes. The model performs well in certain classes, such as 'trash_clothing', 'trash_cup', and 'trash_snack_wrapper', which show a 100% match. However, classes like 'trash_net' and 'trash_rope' show 71% and 79% respectively, implying lower precision or recall, or higher false positive rates, which might be due to the smaller number of images compared with other objects.
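To make the relation between the matrix and the per-class scores concrete, the sketch below derives precision and recall from a matrix of raw counts laid out as described here, with true classes in rows and predicted classes in columns. The counts are toy values chosen for illustration and are not the values of Fig. 1.

```python
from typing import Dict, List, Sequence


def per_class_metrics(matrix: Sequence[Sequence[int]],
                      names: List[str]) -> Dict[str, Dict[str, float]]:
    """Compute precision and recall per class from a count matrix.

    matrix[i][j] is the number of instances of true class i predicted as j.
    """
    n = len(names)
    metrics: Dict[str, Dict[str, float]] = {}
    for i, name in enumerate(names):
        tp = matrix[i][i]                                 # correct predictions
        row_total = sum(matrix[i])                        # all true instances of class i
        col_total = sum(matrix[r][i] for r in range(n))   # all predictions of class i
        recall = tp / row_total if row_total else 0.0
        precision = tp / col_total if col_total else 0.0
        metrics[name] = {"precision": precision, "recall": recall}
    return metrics


if __name__ == "__main__":
    names = ["trash_cup", "trash_net", "background"]
    toy = [
        [8, 0, 0],   # true trash_cup
        [1, 3, 0],   # true trash_net, one instance predicted as trash_cup
        [1, 0, 2],   # true background, one instance predicted as trash_cup
    ]
    for name, m in per_class_metrics(toy, names).items():
        print(name, m)
```

In this toy case 'trash_cup' has perfect recall but reduced precision, because instances of the other classes are predicted as cups. A single percentage per class therefore does not capture both quantities at once.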

F1-confidence Curve
The F1-confidence curve provides insights into the model's performance and confidence estimates. An ideal curve exhibits a steep increase from low to high confidence thresholds, reaching a peak that represents the maximum F1 score. The curve in Fig. 2 shows a similar pattern, indicating an optimal balance between precision and recall. The steepness of the curve indicates the ability of the model to differentiate between confident and uncertain predictions. According to the annotation "all classes 0.87 at 0.593" in the figure, the model achieved an F1 score of 0.87 over all classes at a confidence threshold of 0.593. This indicates a strong balance between precision and recall for all classes. Overall, the curve reflects the model's accuracy and completeness in its predictions, demonstrating its effectiveness across the range of classes.
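The F1 score is the harmonic mean of precision P and recall R, F1 = 2PR / (P + R), and the peak of the curve is the confidence threshold at which this value is largest. The following sketch selects that threshold from a list of (threshold, precision, recall) points. The points are toy values and do not reproduce the curve of this work.

```python
from typing import List, Tuple


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


def best_threshold(curve: List[Tuple[float, float, float]]) -> Tuple[float, float]:
    """Return (threshold, F1) at the maximum F1 over the curve points.

    Each point is (confidence_threshold, precision, recall).
    """
    if not curve:
        raise ValueError("curve must contain at least one point")
    thr, p, r = max(curve, key=lambda point: f1_score(point[1], point[2]))
    return thr, f1_score(p, r)


if __name__ == "__main__":
    # Toy points: low thresholds favour recall, high thresholds favour precision.
    toy_curve = [(0.2, 0.60, 0.95), (0.6, 0.90, 0.85), (0.9, 0.99, 0.40)]
    print(best_threshold(toy_curve))
```

The example shows the typical trade-off: raising the threshold improves precision at the cost of recall, so the maximum F1 lies at an intermediate threshold.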

Challenging Cases
The UwPD model exhibits good capabilities in detecting objects under several challenging underwater conditions. These conditions often involve objects blending with their surroundings, as depicted in Fig. 3, or being poorly illuminated, as in Fig. 4. Despite these obstacles, the model is able to distinguish objects of interest from the challenging background with 70% and 86% accuracies (Fig. 3), overcoming the camouflage effect caused by blending. The model also performs under low-light conditions (Fig. 4), detecting a plastic object that may be poorly illuminated or partially hidden at 86% accuracy. These results imply its robustness and effectiveness in different underwater scenarios, hence showing its potential in facilitating underwater exploration and environmental conservation efforts.

Conclusion and Future Work
A CNN based underwater plastic detection model has been implemented and optimized for the embedded platform to identify different shapes of plastic waste. The results indicate the ability of the model to detect various underwater plastic objects, including under challenging conditions. In the future, the system will be expanded by incorporating advanced DL techniques to further improve detection accuracy and to facilitate real-time monitoring on a real ROV.

Table 1. Summary of the dataset according to the classes