Project 3: Real-time object detection with YOLOv8

Introduction

Background

During the machine learning module, we learned how to build a Convolutional Neural Network (CNN) for object detection, as documented here. This project explores alternatives for object detection. Object detection is a computer vision task in which a model not only classifies the objects in an image or video but also localizes each one with a bounding box. A popular approach in this domain is the YOLO (You Only Look Once) family of models, known for their speed and accuracy. YOLOv8, the latest iteration at the time of writing, is designed to improve performance, simplify usage, and expand versatility. It uses a single-stage architecture that predicts bounding boxes and class probabilities in one pass through the network.
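To make "classifying and localizing" concrete, the sketch below models one detection as a class label, a confidence score, and a bounding box, then filters out low-confidence results. The `Detection` class, the `keep_confident` helper, the 0.5 threshold, and the sample values are all illustrative assumptions, not the output format of YOLOv8 or any specific library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One detected object: what it is and where it is (hypothetical structure)."""
    label: str         # predicted class name, e.g. "person"
    confidence: float  # class score in [0, 1]
    x1: float          # left edge of the bounding box, in pixels
    y1: float          # top edge
    x2: float          # right edge
    y2: float          # bottom edge

    @property
    def area(self) -> float:
        """Box area in square pixels (0 if the box is degenerate)."""
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)


def keep_confident(detections, threshold=0.5):
    """Return only detections whose confidence is at least `threshold`."""
    return [d for d in detections if d.confidence >= threshold]


# Illustrative values only, not real model output.
sample = [
    Detection("person", 0.91, 40, 30, 200, 380),
    Detection("dog", 0.32, 220, 250, 330, 370),
]
confident = keep_confident(sample)
print([d.label for d in confident])
```

A single-stage detector produces all such label, score, and box triples for a frame in one forward pass; the filtering step shown here is a common post-processing choice rather than part of the network itself.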

Scope

This project demonstrates a real-time object detection application using Python. The system leverages the YOLO object detection model and a simple tkinter user interface to allow users to start and stop a live webcam feed with object detection enabled.

Deliverables (High-Level Scope)

A simple frontend built with tkinter that allows the user to start and stop the live video feed and also enable and disable object detection.
A backend that uses YOLOv8 to detect objects in real time.
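One way to keep the two deliverables loosely coupled is to hold the start/stop and detection on/off state in a small controller that knows nothing about tkinter or YOLOv8. The following is a minimal sketch under that assumption: `FeedController`, its method names, and `fake_detector` are hypothetical and not taken from the project code. In the actual application, the tkinter buttons would call these methods and the detector would wrap the YOLOv8 model.

```python
from typing import Any, Callable, Optional


class FeedController:
    """Tracks feed and detection state independently of the GUI (hypothetical helper)."""

    def __init__(self, detector: Callable[[Any], Any]):
        self.detector = detector          # callable that annotates a frame
        self.running = False              # is the live feed active?
        self.detection_enabled = False    # should frames pass through the detector?

    def start(self) -> None:
        self.running = True

    def stop(self) -> None:
        self.running = False

    def toggle_detection(self) -> bool:
        """Flip detection on or off and return the new state."""
        self.detection_enabled = not self.detection_enabled
        return self.detection_enabled

    def process(self, frame: Any) -> Optional[Any]:
        """Return the frame to display, or None if the feed is stopped."""
        if not self.running:
            return None
        if self.detection_enabled:
            return self.detector(frame)
        return frame


def fake_detector(frame):
    """Stand-in detector: a real one would run the model and draw boxes."""
    return f"{frame}+boxes"


controller = FeedController(fake_detector)
controller.start()
print(controller.process("frame0"))  # feed on, detection off
controller.toggle_detection()
print(controller.process("frame1"))  # feed on, detection on
controller.stop()
print(controller.process("frame2"))  # feed stopped
```

Because the controller only depends on a callable, it can be exercised without a webcam or a model, which makes the start/stop behaviour easy to check in isolation.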

Out of Scope

Training the model. A pre-trained model will be used.
Any option for the user to adjust the model or the detection process via the user interface.
Any additional feature not described in the deliverables above.

High-Level Design