Final Year Major Project

Vision-Guided
Fruit Collection
Robot

Overview

Background &
Objective

The vision-guided fruit collection robot is an intelligent robotics application for autonomous management of fallen fruit in orchards. The system is built around a Raspberry Pi 5 single-board computer with a Pi Camera Module 3, and detects and classifies fallen tomatoes with a mean average precision (mAP) of 89.1% using a custom-trained YOLO11 Nano model.

The project had two objectives. The first was to develop and benchmark three distinct architectural paradigms, MobileNetV2 (efficient CNN), YOLO11n (state-of-the-art CNN), and a ResNet50-based RT-DETR (Real-Time Detection Transformer), to find the best balance of precision and latency for edge-based orchard robots. The second was to design a 3-DOF servo-actuated robotic manipulator optimised for handling fruit gently, without damage.

The manipulator is a 3-DOF arm mounted on a four-wheel-drive mobile base. Picking is performed by a parallel-jaw gripper, positioned using geometric inverse kinematics derived from the Law of Cosines, with camera pixel coordinates converted to millimetre coordinates through homographic matrix calibration. The gripper approaches each detected fruit arcade-claw style, descending perpendicularly onto the target so that the fruit is collected reliably without sideways misalignment. Fallen fruits are sorted automatically into two bins by ripeness and quality: edible fruits (ripe and half-ripe tomatoes) go to the left bin, and the rest (diseased, overripe, rotten, and unripe tomatoes) go to the right bin.
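The bin routing described above can be sketched as a simple lookup. This is a minimal illustration only: the label strings and the `bin_for` helper are assumptions, not the names used by the trained model or the project code.

```python
# Label strings are assumed for illustration; the trained model's actual
# class names may differ.
EDIBLE = {"ripe", "half-ripe"}
REJECTED = {"diseased", "overripe", "rotten", "unripe"}


def bin_for(label: str) -> str:
    """Return 'left' for edible classes and 'right' for rejected classes."""
    name = label.strip().lower()
    if name in EDIBLE:
        return "left"
    if name in REJECTED:
        return "right"
    # Refuse to guess a bin for a label the sorter does not know.
    raise ValueError(f"unknown class label: {label!r}")
```

Raising on an unknown label, rather than defaulting to one bin, keeps a mislabelled detection from silently ending up among the edible fruit.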

Sorting during picking keeps the collected fruit free from contamination. An Arduino Uno acts as a slave controller: it coordinates the gripper motions one degree at a time and confirms each position over serial with a handshake, which prevents joint collisions and keeps the servo motors from drawing high current from the Li-ion power source simultaneously. The robot also serves an interactive diagnostic view in a web browser, showing the YOLO11 detections, classifications, and confidence scores for the fallen fruit targets in real time. The complete system is a practical, economical, and efficient way to reduce manual labour and fruit loss while enabling data-driven orchard monitoring.
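One possible shape for the degree-by-degree handshake is sketched below in Python for readability. The message format (`J<joint>:<angle>` answered by `OK J<joint>:<angle>`), the function names, and the `FakeArduino` stand-in are all assumptions made for illustration; the project's actual firmware and protocol are not shown here. The `port` argument only needs `write()` and `readline()`, which is the interface a pySerial `Serial` object also provides.

```python
class FakeArduino:
    """Stand-in for a serial port that acknowledges every command (demo only)."""

    def __init__(self):
        self.log = []        # commands received, in order
        self._pending = []   # acknowledgements waiting to be read

    def write(self, data: bytes) -> int:
        cmd = data.decode("ascii").strip()
        self.log.append(cmd)
        self._pending.append(f"OK {cmd}\n".encode("ascii"))
        return len(data)

    def readline(self) -> bytes:
        return self._pending.pop(0) if self._pending else b""


def stepped_moves(current, target):
    """Yield (joint, angle) steps: one joint at a time, one degree per step."""
    for joint, (start, goal) in enumerate(zip(current, target)):
        step = 1 if goal >= start else -1
        for angle in range(start + step, goal + step, step):
            yield joint, angle


def move_arm(port, current, target):
    """Send each step and wait for its acknowledgement before sending the next."""
    for joint, angle in stepped_moves(current, target):
        cmd = f"J{joint}:{angle}"
        port.write((cmd + "\n").encode("ascii"))
        reply = port.readline().decode("ascii").strip()
        if reply != f"OK {cmd}":
            raise RuntimeError(f"no valid acknowledgement for {cmd!r}: {reply!r}")
    return list(target)


port = FakeArduino()
pose = move_arm(port, [90, 45, 10], [92, 45, 8])
print(port.log)  # ['J0:91', 'J0:92', 'J2:9', 'J2:8']
```

Because only one joint moves per step and each step waits for confirmation, at most one servo is commanded at any moment, which is the property the handshake is meant to guarantee.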

Project Highlights

🏆
Patented Innovation Indian Patent Filed
📄
Research Publication Pending Review
  • Year: 2026
  • Role: Team of 4 · Design, Build, Train, Test
  • Hardware: Raspberry Pi 5 · Arduino Uno
  • Inference: YOLO11 Nano (custom trained)
  • Vision: Monocular Camera + Homography
  • Status: Prototype Complete
89.1%
mAP Score
3-DOF
Manipulator Axes
<200ms
Avg. Inference Time
2 classes
Classification Output
Process

How it
was built

01
Problem
Existing orchard robots rely on expensive LiDAR or stereo cameras. The challenge was to achieve reliable 3D fruit localisation using only a single monocular camera on an $80 SBC, without sacrificing real-time performance.
02
Approach
Trained a custom YOLO11 Nano model on a dataset of orchard floor images. Used homographic matrix calibration to translate 2D bounding box centres into real-world coordinates, feeding those directly to the manipulator's inverse kinematics solver.
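The mapping from a bounding box centre to ground-plane coordinates can be sketched as follows. The matrix values and function names are placeholders chosen for illustration (a uniform 0.5 mm per pixel scale with an offset); a real H comes from calibration and will generally include perspective terms.

```python
def bbox_centre(x1, y1, x2, y2):
    """Centre of an axis-aligned bounding box given in pixel coordinates."""
    return (x1 + x2) / 2.0, (y1 + y2) / 2.0


def pixel_to_ground(H, u, v):
    """Map pixel (u, v) to ground-plane (x, y) with a 3x3 homography H."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("pixel maps to a point at infinity")
    # Divide by the homogeneous coordinate to get real-world units.
    return x / w, y / w


# Placeholder matrix: 0.5 mm per pixel, origin at pixel (320, 240).
H = [[0.5, 0.0, -160.0],
     [0.0, 0.5, -120.0],
     [0.0, 0.0, 1.0]]

u, v = bbox_centre(300, 200, 340, 280)   # centre is (320.0, 240.0)
print(pixel_to_ground(H, u, v))          # (0.0, 0.0)
```

The division by `w` is what distinguishes a homography from a plain affine scale: it accounts for the camera viewing the ground at an angle.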
03
Results
Achieved 89.1% mAP across both classes. The robot successfully completed end-to-end pick-and-sort cycles in a controlled orchard environment, demonstrating that monocular vision can substitute for stereo depth perception in ground-plane manipulation tasks.
System Design

System
Architecture

The pipeline is split across two boards. The Raspberry Pi 5 handles all perception: it runs YOLO11 Nano inference, computes the homographic transform, and dispatches pick coordinates over serial UART.

The Arduino Uno receives coordinates and drives three servo motors via a custom inverse kinematics routine, moving the end-effector to the target position, triggering the gripper, and returning to a home pose.

A simple state machine governs the full cycle: Scan → Detect → Classify → Localise → Pick → Sort → Return.
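The cycle above can be sketched as a small enum-based state machine. The rule that a frame with no target falls back to scanning is an assumption added for illustration; the source only gives the nominal sequence.

```python
from enum import Enum


class State(Enum):
    SCAN = "Scan"
    DETECT = "Detect"
    CLASSIFY = "Classify"
    LOCALISE = "Localise"
    PICK = "Pick"
    SORT = "Sort"
    RETURN = "Return"


def next_state(state: State, target_found: bool = True) -> State:
    """Advance one step through the cycle, wrapping from RETURN back to SCAN."""
    if state is State.DETECT and not target_found:
        return State.SCAN  # assumed fallback: nothing detected, keep scanning
    order = list(State)    # Enum members iterate in definition order
    return order[(order.index(state) + 1) % len(order)]


state = State.SCAN
visited = [state.value]
for _ in range(7):
    state = next_state(state)
    visited.append(state.value)
print(" -> ".join(visited))
```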

01
Camera Frame Capture
Raspberry Pi 5 · OpenCV · 30 fps
02
YOLO11 Nano Inference
Custom trained · 2-class detection
03
Homographic Localisation
BBox centre → real-world XY via H matrix
04
Coordinate Dispatch
UART serial → Arduino Uno
05
Inverse Kinematics + Pick
3-DOF arm · servo control · gripper
06
Sort & Return Home
Edible / Rejected bin routing
Tech Stack

Built with

Raspberry Pi 5 · Arduino Uno · YOLO11 Nano · OpenCV · Python · PyTorch · Monocular Visual SLAM · Homographic Calibration · Inverse Kinematics · UART Serial · Servo Control · NumPy
Model Training
Custom dataset collected from real orchard conditions. Annotated with Roboflow, augmented for lighting variance, and trained for 150 epochs, converging stably at 89.1% mAP.
Homographic Calibration
Calibration board used to compute the perspective homography matrix H, mapping pixel coordinates to ground-plane real-world coordinates with sub-centimetre accuracy within the workspace.
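As a rough sketch of how H can be obtained, the code below solves for the eight unknown entries of a homography from four pixel-to-ground correspondences, fixing the bottom-right entry to 1. The point values are invented for illustration, and the pure-Python solver is only there to keep the example dependency-free; in practice OpenCV's `cv2.findHomography` is the usual tool and handles more than four points.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def homography_from_points(pixels, ground):
    """Estimate H (with H[2][2] = 1) from exactly four correspondences."""
    A, b = [], []
    for (u, v), (x, y) in zip(pixels, ground):
        # Each correspondence gives two linear equations in the 8 unknowns.
        A.append([u, v, 1, 0, 0, 0, -x * u, -x * v])
        b.append(x)
        A.append([0, 0, 0, u, v, 1, -y * u, -y * v])
        b.append(y)
    h = solve_linear(A, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def project(H, u, v):
    """Apply H to a pixel and return ground-plane (x, y)."""
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    return ((H[0][0] * u + H[0][1] * v + H[0][2]) / w,
            (H[1][0] * u + H[1][1] * v + H[1][2]) / w)


# Invented example: board corners in pixels and their positions in mm.
pixels = [(100, 100), (500, 120), (480, 400), (120, 380)]
ground = [(0, 0), (200, 0), (200, 150), (0, 150)]
H = homography_from_points(pixels, ground)
```

Re-projecting the calibration points through the estimated H and comparing against the known board positions is a quick sanity check on the calibration.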
Manipulator Design
3-DOF arm with three servo joints. IK solved analytically in 2D given the ground-plane constraint, enabling real-time position updates without a heavy solver library on the Arduino.
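A minimal sketch of an analytic solution using the Law of Cosines is given below, in Python for readability. The joint layout (base yaw plus shoulder and elbow in a vertical plane), the elbow-up choice, the link lengths, and the function names are assumptions for illustration, not the project's measured geometry or its Arduino routine.

```python
import math


def solve_ik(x, y, z, l1, l2):
    """Return (base, shoulder, elbow) in radians for a target at (x, y, z).

    x, y lie on the ground plane; z is the target height relative to the
    shoulder joint (negative when the target is below the shoulder).
    """
    base = math.atan2(y, x)          # rotate the arm plane toward the target
    r = math.hypot(x, y)             # horizontal reach within that plane
    d2 = r * r + z * z
    # Law of Cosines: d^2 = l1^2 + l2^2 + 2*l1*l2*cos(elbow)
    c = (d2 - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if not -1.0 <= c <= 1.0:
        raise ValueError("target out of reach")
    elbow = -math.acos(c)            # negative angle selects the elbow-up pose
    shoulder = math.atan2(z, r) - math.atan2(l2 * math.sin(elbow),
                                             l1 + l2 * math.cos(elbow))
    return base, shoulder, elbow


def forward(base, shoulder, elbow, l1, l2):
    """Forward kinematics, used here to check the IK solution."""
    r = l1 * math.cos(shoulder) + l2 * math.cos(shoulder + elbow)
    z = l1 * math.sin(shoulder) + l2 * math.sin(shoulder + elbow)
    return r * math.cos(base), r * math.sin(base), z


# Hypothetical link lengths (mm) and a target 60 mm below the shoulder.
angles = solve_ik(150.0, 80.0, -60.0, l1=120.0, l2=140.0)
print([round(math.degrees(a), 1) for a in angles])
```

Because the solution is closed-form, it needs only a handful of trigonometric calls per target, which is what makes it practical on a small microcontroller.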
Edge Deployment
The full detection loop runs at ~5 fps on the RPi 5, which is sufficient for the quasi-static orchard floor domain. No internet required. Boots and runs autonomously on power-on.
Source Code & Docs
See it
on GitHub
View on GitHub