Objective
The primary objective of this project was to develop a machine learning model capable of detecting fish in fly fishing videos. The goal was to identify scenes where a fish is in hand, clip these scenes, and save the relevant segments for a TikTok channel.
Introduction
Fly fishing videos often contain valuable moments where anglers hold their catch. Identifying these moments manually can be time-consuming. This project leverages computer vision and deep learning to automate the detection of fish in these videos, enhancing the efficiency of content creation for social media platforms like TikTok.
Technical Setup
Tools and Technologies
• Python: Programming language used for scripting.
• YOLOv5: State-of-the-art object detection model.
• OpenCV: Library for computer vision tasks.
• LabelImg: Tool for annotating images.
• FFmpeg: Tool for processing video files.
• PyTorch (torch): Deep learning framework that YOLOv5 is built on.
Environment Setup
1. Install Required Libraries:
pip install torch torchvision opencv-python labelImg ffmpeg-python
2. Set Up YOLOv5:
git clone https://github.com/ultralytics/yolov5
cd yolov5
pip install -r requirements.txt
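Before training, one quick way to confirm the whole stack works is to load the stock yolov5s checkpoint through torch.hub and run it on a sample image that ships with the repo. A minimal smoke test, assuming it is run from the directory containing the yolov5 clone (this is a sanity check only, not part of the project pipeline):
import torch

# Pull the pretrained yolov5s model via torch.hub (downloads weights on first run)
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)

# Run inference on a sample image bundled with the YOLOv5 repo
results = model('yolov5/data/images/zidane.jpg')
results.print()  # summary of detected classes and confidences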
Data Collection and Preparation
Data Sources
• Collected various fishing images and videos from personal recordings and public datasets.

Data Annotation
• Used LabelImg to annotate images with bounding boxes for fish, hands, fishing rods, and other fishing equipment.
• Saved annotations in YOLO format (an example label file follows).
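In YOLO format, each image gets a matching .txt file with one line per object: class_id x_center y_center width height, all normalized to [0, 1]. A hypothetical label file for a frame containing one fish (class 0) and one hand (class 1) would look like:
0 0.512 0.430 0.220 0.180
1 0.610 0.550 0.150 0.200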
Data Augmentation
• Applied flipping, rotation, and brightness/contrast adjustments to make the dataset more robust to viewpoint and lighting variation (a minimal sketch follows).
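A minimal sketch of these augmentations with OpenCV (the function and parameter values here are illustrative, not taken from the project code):
import cv2

def augment(img):
    """Return simple augmented variants of one training image."""
    flipped = cv2.flip(img, 1)  # horizontal flip

    # Rotate 15 degrees around the image center
    h, w = img.shape[:2]
    M = cv2.getRotationMatrix2D((w / 2, h / 2), 15, 1.0)
    rotated = cv2.warpAffine(img, M, (w, h))

    # Brightness/contrast: each pixel becomes alpha * pixel + beta
    brightened = cv2.convertScaleAbs(img, alpha=1.2, beta=30)

    return flipped, rotated, brightened
Keep in mind that geometric transforms (flips, rotations) require the YOLO boxes to be transformed to match, while brightness/contrast changes leave the labels untouched.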
Model Training
Configuration
Created a dataset.yaml file to define the dataset:
train: /path/to/your/dataset/train/images
val: /path/to/your/dataset/val/images
nc: 3 # number of classes
names: ['fish', 'hand', 'fishing_rod']
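YOLOv5 locates the label files by substituting images with labels in each image path, so the dataset referenced by this file is expected to follow roughly this layout:
dataset/
    train/
        images/  # training frames (.jpg)
        labels/  # one YOLO-format .txt per image
    val/
        images/
        labels/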
Training Process
• Trained the model using the following command:
python train.py --img 640 --batch 16 --epochs 50 --data /path/to/dataset.yaml --cfg models/yolov5s.yaml --weights yolov5s.pt --name fish_detector
Testing and Evaluation
Evaluation Metrics
• Evaluated model performance using precision and recall at varying confidence thresholds (see the example invocation after this list).
• Adjusted the model based on test results and retrained with additional data as necessary.
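YOLOv5's bundled validation script reports precision, recall, and mAP on the val split; a typical invocation against the trained weights (the run name here follows from the training command above) would be:
python val.py --weights runs/train/fish_detector/weights/best.pt --data /path/to/dataset.yaml --img 640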
Results
• Initial results showed a high number of false positives, likely due to confusion between fish and other objects like hands and fishing rods.
Iterations and Improvements
Identified Issues
• High rate of false positives due to model confusion.
• Poor video quality affecting detection accuracy.
Solutions Implemented
• Collected and annotated more diverse images, including negative samples (images without fish); see the helper sketch after this list.
• Retrained the model to improve its ability to distinguish between fish and other objects.
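In YOLOv5, negative samples need no special annotation: an image with an empty label file (or none at all) is treated as pure background. A small hypothetical helper to register background-only images:
import os

def add_negative_labels(images_dir, labels_dir):
    """Create empty YOLO label files for background-only images."""
    os.makedirs(labels_dir, exist_ok=True)
    for name in os.listdir(images_dir):
        if name.lower().endswith((".jpg", ".png")):
            stem = os.path.splitext(name)[0]
            label_path = os.path.join(labels_dir, stem + ".txt")
            if not os.path.exists(label_path):
                open(label_path, "w").close()  # empty file = no objects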
Conclusion and Reflections
Learnings
• Importance of a diverse and well-annotated dataset in training accurate models.
• Need for high-quality data to enhance model performance.
• Iterative process of training, testing, and refining is crucial in developing robust machine learning models.
Future Work
• Further expand the dataset to include more variations of fish and different environments.
• Experiment with different model architectures and hyperparameters to improve detection accuracy.
• Explore real-time detection capabilities for live video feeds (a rough sketch follows).
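For the live-feed idea, the same custom weights drop into a standard OpenCV capture loop. A rough sketch (the webcam index, weights path, and confidence threshold are placeholders):
import cv2
import torch

model = torch.hub.load('ultralytics/yolov5', 'custom', path='best.pt')  # trained weights (placeholder path)
model.conf = 0.5  # confidence threshold (placeholder)

cap = cv2.VideoCapture(0)  # default webcam
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    results = model(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    annotated = cv2.cvtColor(results.render()[0], cv2.COLOR_RGB2BGR)  # boxes drawn in
    cv2.imshow('detections', annotated)
    if cv2.waitKey(1) & 0xFF == ord('q'):  # press q to quit
        break
cap.release()
cv2.destroyAllWindows()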
Visuals and Examples
Video Clips
Code Snippets
# reelhooked.py - Automated Fish Detection and Clipping for Fly Fishing Videos
import os
import subprocess

import cv2
import torch

# Directory containing the videos
video_dir = "/Users/XXXX/Desktop/fishing"
output_dir = video_dir  # Output to the same directory
temp_dir = os.path.join(video_dir, "temp")
detections_dir = os.path.join(video_dir, "detections")

os.makedirs(output_dir, exist_ok=True)
os.makedirs(temp_dir, exist_ok=True)
os.makedirs(detections_dir, exist_ok=True)

# Full path to FFmpeg
FFMPEG_PATH = "/opt/homebrew/bin/ffmpeg"

# Load the custom-trained YOLOv5 weights
print("Loading YOLO...")
model_path = '/Users/XXXX/Desktop/fishing/yolov5/runs/train/exp/weights/best.pt'
if not os.path.exists(model_path):
    raise FileNotFoundError(f"Model weights not found at {model_path}")
model = torch.hub.load('ultralytics/yolov5', 'custom', path=model_path)
model.eval()
print("YOLO loaded successfully.")


def remove_audio(video_path, temp_path):
    """Strip the audio track so clips can be published without the original sound."""
    command = [
        FFMPEG_PATH,
        "-i", video_path,
        "-c", "copy",
        "-an",  # Remove audio stream
        temp_path,
    ]
    print(f"Running ffmpeg command to remove audio: {' '.join(command)}")
    subprocess.run(command, check=True)


def process_video(video_path, output_dir):
    """Scan one video for fish-in-hand frames, then clip the 20 seconds leading up to each detection."""
    temp_video_path = os.path.join(temp_dir, os.path.basename(video_path))
    remove_audio(video_path, temp_video_path)

    print(f"Processing video: {temp_video_path}")
    cap = cv2.VideoCapture(temp_video_path)
    fps = int(cap.get(cv2.CAP_PROP_FPS))
    frame_count = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    duration = frame_count / fps if fps else 0  # total length in seconds

    fish_in_hand_times = []
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        current_time = cap.get(cv2.CAP_PROP_POS_MSEC) / 1000.0  # current time in seconds

        # Detect fish in hand
        fish_detected, confidence, bbox = detect_fish_in_hand(frame)
        if fish_detected:
            fish_in_hand_times.append(current_time)
            print(f"Fish detected at {current_time} seconds with confidence {confidence} and bounding box {bbox}.")
            # Save the frame with the detection drawn in
            save_detection_frame(frame, bbox, current_time)
    cap.release()

    # Clip video segments where fish was detected. Note: consecutive detection
    # frames each produce a clip, so overlapping clips are expected.
    for i, fish_in_hand_time in enumerate(fish_in_hand_times):
        start_time = max(0, fish_in_hand_time - 20)  # Go back 20 seconds
        end_time = fish_in_hand_time
        print(f"Clipping video from {start_time} to {end_time} seconds.")

        # Cut the video using ffmpeg
        input_filename = os.path.basename(video_path)
        output_filename = os.path.join(output_dir, f"clip_{i}_{input_filename}")
        command = [
            FFMPEG_PATH,
            "-i", temp_video_path,
            "-ss", str(start_time),
            "-to", str(end_time),
            "-c:v", "copy",  # Copy video stream (cuts snap to keyframes)
            output_filename,
        ]
        print(f"Running ffmpeg command: {' '.join(command)}")
        subprocess.run(command, check=True)
        print(f"Video clip saved: {output_filename}")


def detect_fish_in_hand(frame):
    """Run YOLO on one frame and return (detected, best_confidence, first_box)."""
    img = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    results = model(img)
    # Each row of xyxyn: x1, y1, x2, y2 (normalized), confidence, class
    labels = results.xyxyn[0][:, -1].numpy()
    coords = results.xyxyn[0][:, :-1].numpy()

    confidences = []
    boxes = []
    for i in range(len(labels)):
        if labels[i] == 0:  # 'fish' is class 0 (see dataset.yaml)
            x1, y1, x2, y2, conf = coords[i]
            if conf > 0.3:  # Lower confidence threshold
                height, width = frame.shape[:2]
                x1, y1, x2, y2 = int(x1 * width), int(y1 * height), int(x2 * width), int(y2 * height)
                boxes.append([x1, y1, x2 - x1, y2 - y1])  # convert to x, y, w, h
                confidences.append(conf)

    print(f"Detections: {len(boxes)}")
    if boxes:
        return True, max(confidences), boxes[0]  # Return the first detection for simplicity
    return False, None, None


def save_detection_frame(frame, bbox, current_time):
    """Draw the detection box on the frame and save it for later inspection."""
    x, y, w, h = bbox
    color = (0, 255, 0)  # Green bounding box
    thickness = 2
    cv2.rectangle(frame, (x, y), (x + w, y + h), color, thickness)
    output_path = os.path.join(detections_dir, f"detection_{current_time:.2f}.jpg")
    cv2.imwrite(output_path, frame)
    print(f"Saved detection frame at {output_path}")


# Process all videos in the directory
print("Starting video processing...")
for video_file in os.listdir(video_dir):
    if video_file.endswith((".mp4", ".avi")):
        video_path = os.path.join(video_dir, video_file)
        process_video(video_path, output_dir)
print("Video processing completed.")