Pan-Tilt Playground: Part 5 - Hand gesture controller


In the fifth part of this series, I would like to show you how to move the Pan-Tilt via computer vision. More precisely, a camera recognizes your hands and the servos are controlled by your hand gestures. You will need a PC/laptop with a camera for this, as this is not so easy to implement on the Unihiker itself. A Python UDP socket transmits the necessary data to the Unihiker.

 

Previous tutorials

 

- Pan-Tilt Playground: Part 1 - Build your own Pan-Tilt
- Pan-Tilt Playground: Part 2 - Servo driver creation
- Pan-Tilt Playground: Part 3 - Bluetooth gamepad controller
- Pan-Tilt Playground: Part 4 - Analog Joystick controller

STEP 1
Objective

Control the pan-tilt using hand gestures. Along the way you may gain new experience with computer vision (OpenCV/MediaPipe), UDP sockets and binary data.

HARDWARE LIST
1 Unihiker M10
1 micro:Driver - Driver Expansion Board
2 2Kg 300° Clutch Servos
1 PC/Laptop with camera
STEP 2
Change a few files and add a new one

There are just a few adjustments to the controllers.ini, main.py and __init__.py files, as well as a new file called udp_socket.py.

CODE
# change into project root directory
$ cd Pan-Tilt/

# create new file udp_socket.py
$ touch libs/controller/udp_socket.py

The structure of the project (for Unihiker M10) should now look like this:

CODE
# run tree command (optional)
$ tree .
.
|-- config
|   `-- controllers.ini
|-- libs
|   |-- configuration.py
|   |-- controller
|   |   |-- bluetooth.py
|   |   |-- __init__.py
|   |   |-- joystick.py
|   |   `-- udp_socket.py
|   `-- servo.py
`-- main.py

3 directories, 8 files
STEP 3
The new local project

As already mentioned, this time you also need a Python file on your computer which recognizes the hand gestures through your camera and then sends the important data to Unihiker. It is also worth creating a new project with two files for this. Create a folder called: HandGestures and inside it two files called: requirements.txt and main.py.

CODE
# create local project directory
$ mkdir HandGestures

# change into created project directory
$ cd HandGestures

# create empty file requirements.txt
$ touch requirements.txt

# create empty file main.py
$ touch main.py
STEP 4
Content of Unihiker M10 files

I won't go into most of the files and folders created in the previous tutorials. If you have questions or something is unclear, please refer to the previous parts.

 

controllers.ini

 

Only a new section for UDP communication is created in the controllers.ini file.

CODE
[bluetooth]
enabled = false
mac = e4:17:d8:12:ea:3b
name = 8BitDo Zero 2 gamepad
btn_a = 304
btn_b = 305
btn_x = 307
btn_y = 308
btn_tl = 310
btn_tr = 311

[joystick]
enabled = false
x_left = 100
x_right = 3995
y_up = 3995
y_down = 100

[udp]
enabled = true
interface = 0.0.0.0
port = 12345
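
The libs/configuration.py helper from the earlier parts is not repeated here. Assuming it is built on Python's standard configparser (an assumption, this part does not confirm it), every value in the [udp] section arrives as a string, so the port has to be converted to an integer before the socket is bound. A minimal sketch with the section inlined so it runs on its own:

```python
from configparser import ConfigParser

# Same [udp] section as in controllers.ini, inlined for this sketch.
EXAMPLE_INI = """
[udp]
enabled = true
interface = 0.0.0.0
port = 12345
"""

parser = ConfigParser()
parser.read_string(EXAMPLE_INI)

udp = parser["udp"]
enabled = udp.getboolean("enabled")   # "true" -> True
interface = udp["interface"]          # stays a string
port = udp.getint("port")             # "12345" -> 12345

print(enabled, interface, port)
```

The interface 0.0.0.0 makes the socket listen on all network interfaces of the Unihiker.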

__init__.py

 

A new import and entry for another module is added to the __init__.py file.

CODE
from .bluetooth import BluetoothControllerHandler
from .joystick import JoystickControllerHandler
from .udp_socket import UDPControllerHandler


__version__ = "1.1"
__all__ = ["BluetoothControllerHandler", "JoystickControllerHandler", "UDPControllerHandler"]

udp_socket.py

 

This new class is very similar in structure to the controller classes that have already been created. It has one method for initialization, one for receiving and processing the UDP data, one for returning the current status and one for closing the UDP socket at exit. If you take a closer look at the example code and read through the docstrings, you will quickly understand how it works.

CODE
from socket import socket, AF_INET, SOCK_DGRAM
from atexit import register
from threading import Thread, Lock
from struct import unpack
from time import sleep


class UDPControllerHandler:

    __VALUE: int = 30
    __DELAY: float = .0025

    def __init__(self, ip: str, port: int):
        """
        This class initializes a UDP socket server capable of handling events in a
        separate thread.

        :param ip: The IP address on which the UDP socket server should bind.
        :type ip: str
        :param port: The port number on which the UDP socket server should bind.
        :type port: int
        """
        self._btn_status = {'left': False, 'right': False, 'down': False, 'up': False}

        self._sock = socket(AF_INET, SOCK_DGRAM)
        self._sock.bind((ip, port))
        register(self._close)

        self._lock = Lock()
        self._thread = Thread(target=self._event_listener, daemon=True)
        self._thread.start()

    def _close(self) -> None:
        """
        Closes the underlying socket connection, if it exists.

        :return: None
        """
        if self._sock:
            print('[INFO] Closing UDP socket...')
            self._sock.close()

    def _event_listener(self) -> None:
        """
        Listens to events by receiving binary data through a socket, unpacking it into control
        values for `pan` and `tilt`, and updating the button status based on specified threshold
        conditions.

        :return: None
        """
        while True:
            try:
                binary_data, _ = self._sock.recvfrom(2)

                # unpack("2b", ...) needs exactly two bytes; shorter datagrams are skipped
                if len(binary_data) != 2:
                    print('[ERROR] Malformed message received. Skipping...')
                    continue

                pan_value, tilt_value = unpack("2b", binary_data)

                # hold the lock while writing, because get_status() reads from another thread
                with self._lock:
                    self._btn_status.update({
                        'right': pan_value > self.__VALUE,
                        'left': pan_value < -self.__VALUE,
                        'up': tilt_value > self.__VALUE,
                        'down': tilt_value < -self.__VALUE
                    })

            except TimeoutError:
                print('[WARNING] No data received. Continue looping...')
            except ValueError:
                print('[ERROR] Malformed message received. Skipping...')
                continue

            sleep(self.__DELAY)

    def get_status(self) -> dict:
        """
        Retrieves the current button status.

        :return: The current button status.
        :rtype: dict
        """
        with self._lock:
            return self._btn_status.copy()
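
To make the listener's logic easier to follow, here is a stand-alone sketch of what happens to a single datagram: two signed bytes are unpacked into pan and tilt, then compared against the threshold of 30. The function name decode_status exists only for this illustration and is not part of the project.

```python
from struct import pack, unpack

THRESHOLD = 30  # same value as __VALUE in UDPControllerHandler


def decode_status(datagram: bytes, threshold: int = THRESHOLD) -> dict:
    """Turn a 2-byte datagram into the left/right/down/up flags."""
    pan, tilt = unpack("2b", datagram)  # two signed chars, each -128..127
    return {
        'left': pan < -threshold,
        'right': pan > threshold,
        'down': tilt < -threshold,
        'up': tilt > threshold,
    }


# pan = 45 is beyond the threshold, tilt = -10 stays inside the dead zone
status = decode_status(pack("2b", 45, -10))
print(status)
```

Values between -30 and 30 set no flag at all, so a hand held roughly upright does not move the servos.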

main.py

 

The changes in the main.py file are minimal this time: one more entry in the existing controller import and a few new lines of code in the if/elif chain that selects the controller.

CODE
from sys import exit
from time import sleep
from typing import Optional
from pinpong.board import Board, Pin
from libs.configuration import load_configuration
from libs.controller import BluetoothControllerHandler, JoystickControllerHandler, UDPControllerHandler
from libs.servo import DFServo300


PAN_PIN: int = 2
PAN_END_ANGLE: int = 180
PAN_START_ANGLE: int = PAN_END_ANGLE // 2
PAN_SERVO_STEP: int = 4
TILT_PIN: int = 3
TILT_END_ANGLE: int = 90
TILT_START_ANGLE: int = TILT_END_ANGLE // 2
TILT_SERVO_STEP: int = 2
LONG_DELAY: float = .5
SHORT_DELAY: float = .015

JOYSTICK_X_PIN: Pin = Pin.P21
JOYSTICK_Y_PIN: Pin = Pin.P22

CONFIGURATION_PATH: str = 'config/controllers.ini'


def move_servo(servo: DFServo300, nth: int, step: int, clockwise: bool = True, delay: Optional[float] = None) -> None:
    """
    Moves the servo motor by a specified step in a given direction, ensuring the
    angle remains within the permissible range.

    :param servo: An instance of the targeted servo motor.
    :type servo: DFServo300
    :param nth: Maximum permissible angle for the servo's operation.
    :type nth: int
    :param step: Incremental angle to move the servo.
    :type step: int
    :param clockwise: Boolean indicating the direction of rotation (default: True).
    :type clockwise: bool
    :param delay: Optional time in seconds to wait after moving the servo (default: None).
    :type delay: Optional[float]
    :return: None
    """
    current_angle = servo.get_angle()
    new_angle = current_angle + step if clockwise else current_angle - step

    if 0 <= new_angle <= nth and new_angle != current_angle:
        servo.angle(new_angle)

        if delay:
            sleep(delay)


if __name__ == '__main__':
    Board("UNIHIKER").begin()

    config_data = load_configuration(path=CONFIGURATION_PATH)
    enabled_controller = list(config_data.keys())[0]

    if enabled_controller == "bluetooth":
        button_mappings = {
            key: int(value) for key, value in config_data["bluetooth"].items() if key.startswith("btn_")
        }

        controller = BluetoothControllerHandler(name=config_data['bluetooth']['name'],
                                                mac=config_data['bluetooth']['mac'],
                                                buttons=button_mappings)
    elif enabled_controller == "joystick":
        x_y_mappings = config_data['joystick']
        controller = JoystickControllerHandler(pin_x=JOYSTICK_X_PIN,
                                               pin_y=JOYSTICK_Y_PIN,
                                               mapping=x_y_mappings)
    elif enabled_controller == "udp":
        controller = UDPControllerHandler(ip=config_data['udp']['interface'],
                                          port=int(config_data['udp']['port']))
    else:
        raise ValueError(f"Invalid controller type: {enabled_controller}")

    pan = DFServo300(PAN_PIN)
    tilt = DFServo300(TILT_PIN)

    try:
        print('[INFO] Move to initial position...')
        pan.angle(PAN_START_ANGLE)
        tilt.angle(TILT_START_ANGLE)
        sleep(LONG_DELAY)

        while True:
            controller_status = controller.get_status()

            if controller_status['left'] and not controller_status['right']:
                move_servo(servo=pan, nth=PAN_END_ANGLE, step=PAN_SERVO_STEP)

            if controller_status['right'] and not controller_status['left']:
                move_servo(servo=pan, nth=PAN_END_ANGLE, clockwise=False, step=PAN_SERVO_STEP)

            if controller_status['down'] and not controller_status['up']:
                move_servo(servo=tilt, nth=TILT_END_ANGLE, clockwise=False, step=TILT_SERVO_STEP)

            if controller_status['up'] and not controller_status['down']:
                move_servo(servo=tilt, nth=TILT_END_ANGLE, step=TILT_SERVO_STEP)

            sleep(SHORT_DELAY)

    except KeyboardInterrupt:
        print('[INFO] KeyboardInterrupt triggered...')
    finally:
        print('[INFO] Move to reset position...')
        pan.reset()
        tilt.reset()
        sleep(LONG_DELAY)

        exit(0)
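
The range check in move_servo can be tried without any hardware. The sketch below replaces DFServo300 with a hypothetical FakeServo that only stores an angle; it assumes the real driver exposes get_angle() and angle() exactly as they are used in main.py above, and it leaves out the optional delay.

```python
class FakeServo:
    """Stand-in for DFServo300 that only remembers its angle (hypothetical helper)."""

    def __init__(self, start: int):
        self._angle = start

    def get_angle(self) -> int:
        return self._angle

    def angle(self, value: int) -> None:
        self._angle = value


def move_servo(servo, nth: int, step: int, clockwise: bool = True) -> None:
    # Same range check as in main.py: only move if the new angle stays in 0..nth
    current = servo.get_angle()
    new = current + step if clockwise else current - step
    if 0 <= new <= nth and new != current:
        servo.angle(new)


pan = FakeServo(start=90)
for _ in range(30):                  # 30 steps of 4 degrees would end at 210 ...
    move_servo(pan, nth=180, step=4)
print(pan.get_angle())               # ... but the servo stops at 178, the last step that fits
```

Because a step is rejected as a whole when it would leave the range, the servo stops at the last angle that still fits rather than exactly at the end angle.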
STEP 5
Content of local files

Of course, the still empty files on the local system also need content.

 

requirements.txt

 

This file contains all the necessary Python modules/packages that are required.

CODE
setuptools==75.8.0
wheel==0.45.1
numpy==2.2.2
opencv-python==4.11.0.86
mediapipe==0.10.20
pygame==2.6.1

main.py

 

This file contains the complete Python code for recognizing the gestures, drawing the effects on the video stream from the camera and transmitting the bytes. That makes the file a bit larger, but I didn't want to lose the actual focus. The important constants for you are WINDOW_SIZE, FPS and TARGET_IP. Depending on your camera model and the size of your monitor, you may need to adjust WINDOW_SIZE and FPS slightly. TARGET_IP must be the IP address of your Unihiker, and TARGET_PORT must match the port in controllers.ini.

CODE
from sys import exit
from signal import signal, SIGINT
from atexit import register
from math import radians, pi
from struct import pack
from socket import socket, AF_INET, SOCK_DGRAM
from typing import Optional, Tuple
from types import FrameType
import numpy as np
import mediapipe as mp
import pygame
import cv2


WINDOW_NAME: str = 'Pan-Tilt Controller'
WINDOW_SIZE: Tuple[int, int] = (1280, 720)
FPS: int = 30
TARGET_IP: str = '10.1.2.3'
TARGET_PORT: int = 12345
FRAME_THRESHOLD: int = 10
BORDER_DISTANCES: int = 25
BLUE_EFFECT_COLOR: tuple = (173, 216, 230)
TICK_COLOR: tuple = (255, 255, 255)


def signal_handler(sig: int, frame: Optional[FrameType]) -> None:
    """
    Handle a signal event and raise a KeyboardInterrupt.

    :param sig: Numeric value of the received signal.
    :type sig: int
    :param frame: Current stack frame when the signal is caught.
    :type frame: Optional[FrameType]
    :return: None
    """
    _ = frame

    print(f'Received signal {sig}. Exiting...')
    raise KeyboardInterrupt


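def send_udp_message(message: dict) -> None:
    """
    Packs the pan and tilt angles into two signed bytes and sends them as a UDP datagram
    to the target IP and port.

    :param message: Dictionary with the keys 'left' (pan) and 'right' (tilt), each an angle in degrees.
    :type message: dict
    :return: None
    """
    global sock

    # struct format "b" only accepts -128..127: treat None as 0 and clamp everything else
    pan_value, tilt_value = (max(-128, min(127, int(message[key] or 0))) for key in ('left', 'right'))
    binary_data = pack("2b", pan_value, tilt_value)
    print(f'[INFO] Sending UDP message: {binary_data}')

    sock.sendto(binary_data, (TARGET_IP, TARGET_PORT))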


def get_finger_coords(video_frame: np.array, hand_detector: mp.solutions.hands.Hands, hands_ref: mp.solutions.hands) -> dict:
    """
    Extracts coordinates of specific finger joints for detected hands in the given video
    frame using MediaPipe Hands.

    :param video_frame: A single video frame represented as a NumPy array.
    :type video_frame: np.array
    :param hand_detector: Hand detection pipeline from the MediaPipe library.
    :type hand_detector: mp.solutions.hands.Hands
    :param hands_ref: Reference model containing hand landmark definitions.
    :type hands_ref: mp.solutions.hands
    :return: A dictionary containing the pixel coordinates of specified finger joints.
    :rtype: dict
    """
    h, w = video_frame.shape[:2]
    results = hand_detector.process(video_frame)

    if not results.multi_hand_landmarks:
        return {'left': {}, 'right': {}}

    finger_joints = {
        'index_tip': hands_ref.HandLandmark.INDEX_FINGER_TIP,
        'index_dip': hands_ref.HandLandmark.INDEX_FINGER_DIP,
        'middle_tip': hands_ref.HandLandmark.MIDDLE_FINGER_TIP,
        'middle_dip': hands_ref.HandLandmark.MIDDLE_FINGER_DIP,
        'middle_mcp': hands_ref.HandLandmark.MIDDLE_FINGER_MCP,
        'ring_tip': hands_ref.HandLandmark.RING_FINGER_TIP,
        'ring_dip': hands_ref.HandLandmark.RING_FINGER_DIP,
        'pinky_tip': hands_ref.HandLandmark.PINKY_TIP,
        'pinky_dip': hands_ref.HandLandmark.PINKY_DIP
    }

    finger_coords = {'left': {}, 'right': {}}

    for idx, hand_landmarks in enumerate(results.multi_hand_landmarks):
        hand_label = results.multi_handedness[idx].classification[0].label.lower()

        if hand_label not in finger_coords:
            continue

        for key, landmark in finger_joints.items():
            lm = hand_landmarks.landmark[landmark]
            finger_coords[hand_label][key] = (int(lm.x * w), int(lm.y * h))

    return finger_coords


def get_angle(point_1: Optional[Tuple[int, int]], point_2: Optional[Tuple[int, int]]) -> Optional[int]:
    """
    Compute the angle in degrees between the upward vertical axis and the line from point_2
    to point_1 (0 means point_1 is directly above point_2).

    :param point_1: The first point represented as a tuple of two integers.
    :type point_1: Optional[Tuple[int, int]]
    :param point_2: The second point represented as a tuple of two integers.
    :type point_2: Optional[Tuple[int, int]]
    :return: The angle in degrees as an integer, or None.
    :rtype: Optional[int]
    """
    if not all(isinstance(p, tuple) for p in (point_1, point_2)):
        return None

    direction_vector = np.array(point_1) - np.array(point_2)
    if np.linalg.norm(direction_vector) == 0:
        return None

    return int(np.degrees(np.arctan2(direction_vector[1], direction_vector[0])) + 90)


def get_circle_points(radius: int, num_points: int, angle: int, center: tuple) -> list[tuple]:
    """
    Generates a list of points lying on a circle with a specified radius and number of points.
    The circle can be rotated by a given angle and translated to a specified center.

    :param radius: Radius of the circle.
    :type radius: int
    :param num_points: Number of points to generate on the circle.
    :type num_points: int
    :param angle: Rotation angle of the circle in degrees.
    :type angle: int
    :param center: Coordinates of the center of the circle as a tuple (x, y).
    :type center: tuple[int, int]
    :return: A list of tuples representing the coordinates of the points.
    :rtype: list[tuple]
    """
    angles = np.linspace(0, 2 * np.pi, num_points, endpoint=False)
    circle_points = np.array([[radius * np.cos(a), radius * np.sin(a)] for a in angles])

    rotation_angle = np.radians(angle)
    rotation_matrix = np.array([
        [np.cos(rotation_angle), -np.sin(rotation_angle)],
        [np.sin(rotation_angle), np.cos(rotation_angle)]
    ])

    rotated_points = np.dot(circle_points, rotation_matrix.T)
    translated_points = rotated_points + np.array(center)

    return translated_points.tolist()


def get_circle_lines(radius: int, length: int, scale: int, center: tuple) -> list[tuple]:
    """
    Generates a list of line segments forming a circle pattern, where each line
    is drawn radially outward from the center of the circle.

    :param radius: The radius of the circle (in units).
    :type radius: int
    :param length: The distance from the start point to the endpoint of each line.
    :type length: int
    :param scale: The number of lines to draw, evenly spaced around the circle.
    :type scale: int
    :param center: The coordinates (x, y) of the center of the circle.
    :type center: tuple[int, int]
    :return: A list of tuples, where each tuple contains the coordinates of a line.
    :rtype: list[tuple[float, float, float, float]]
    """
    angles = np.linspace(0, 2 * np.pi, scale, endpoint=False)

    lines = []
    for angle in angles:
        start_x = center[0] + radius * np.cos(angle)
        start_y = center[1] + radius * np.sin(angle)

        end_x = center[0] + (radius + length) * np.cos(angle)
        end_y = center[1] + (radius + length) * np.sin(angle)

        lines.append((start_x, start_y, end_x, end_y))

    return lines


def map_angle_to_y(angle: int, position: Tuple[int, int], height: int) -> int:
    """
    Maps an angle to a corresponding y-coordinate within a vertical space defined by height.

    :param angle: The angle to map to the vertical space from -90 to 90.
    :type angle: int
    :param position: The (x, y) coordinates of the base position within the target space.
    :type position: Tuple[int, int]
    :param height: The total vertical height of the space within which the mapping is performed.
    :type height: int
    :return: The calculated y-coordinate.
    :rtype: int
    """
    return int(position[1] + (height // 2) - (angle / 90) * (height // 2))


def is_fist_active(coords: dict) -> bool:
    """
    Determines whether the fist gesture is active based on the provided finger
    landmark coordinates.

    :param coords: Dictionary containing coordinates of finger landmarks.
    :type coords: dict
    :return: Return if the fist gesture is active.
    :rtype: bool
    """
    try:
        return all(coords[f"{finger}_tip"][1] < coords[f"{finger}_dip"][1]
                   for finger in ("index", "middle", "ring", "pinky"))
    except (KeyError, TypeError, IndexError):
        return False


def draw_arc(display: pygame.Surface, position: Tuple[int, int], radius: int, rotation: int) -> None:
    """
    Draws a 270-degree arc on the given display surface.

    :param display: The Pygame Surface on which the arc will be drawn.
    :type display: pygame.Surface
    :param position: The (x, y) coordinates of the center of the arc.
    :type position: Tuple[int, int]
    :param radius: The radius of the arc.
    :type radius: int
    :param rotation: The starting angle for the arc in degrees.
    :type rotation: int
    :return: None
    """
    rotation_radians = radians(rotation)
    start_angle = rotation_radians
    end_angle = start_angle + (3 * pi / 2)

    rect = pygame.Rect(position[0] - radius, position[1] - radius, radius * 2, radius * 2)
    pygame.draw.arc(display, BLUE_EFFECT_COLOR, rect, start_angle, end_angle, 1)


def draw_finger_information(display: pygame.Surface, position: Tuple[int, int], coords: dict) -> None:
    """
    Draws a visual representation of finger-related information on a Pygame display surface.

    :param display: The Pygame display surface where the information panel will be drawn.
    :type display: pygame.Surface
    :param position: The (x, y) position of the top-left corner.
    :type position: Tuple[int, int]
    :param coords: A dictionary containing finger-related data.
    :type coords: dict
    :return: None
    """
    width: int = 90
    height: int = 320
    corner: int = 15
    padding: int = 5

    rect_pos = (position[0], position[1], width, height)
    line_start_pos = (position[0], position[1] + height // 2 - 5)
    line_end_pos = (position[0] + width, position[1] + height // 2 - 5)
    values = "\n".join(str(v) for key in coords for v in coords[key].values())
    lines = values.split("\n")
    y_offset = position[1] + padding

    pygame.draw.rect(display, BLUE_EFFECT_COLOR, rect_pos, 1, border_radius=corner)
    pygame.draw.line(display, BLUE_EFFECT_COLOR, line_start_pos, line_end_pos, 1)

    for line in lines:
        text_surface = font.render(line, True, TICK_COLOR)
        display.blit(text_surface, (position[0] + padding, y_offset))
        y_offset += text_surface.get_height() + padding


def draw_angle_information(display: pygame.Surface, position: Tuple[int, int], degrees: dict) -> None:
    """
    Draws an overlay of information on the provided display using graphical representations of a bounding box
    and various elements based on the given position and angles.

    :param display: Pygame surface on which the graphical elements are drawn.
    :type display: pygame.Surface
    :param position: Tuple indicating the x, y coordinates where the graphical information will be drawn.
    :type position: Tuple[int, int]
    :param degrees: Dictionary containing 'left' and 'right' keys representing angles in degrees.
    :type degrees: dict
    :return: None
    """
    width: int = 350
    height: int = 150
    alpha: int = 100
    corner: int = 15
    factor: int = 25

    pan_angle = degrees['left']
    tilt_angle = degrees['right']

    line_start_pos = (position[0], position[1] + height // 2)
    pan_circle_center = (position[0] + 120, map_angle_to_y(pan_angle, position, height))
    tilt_circle_center = (position[0] + 240, map_angle_to_y(tilt_angle, position, height))
    line_end_pos = (position[0] + width, position[1] + height // 2)

    rect = pygame.Rect(position[0], position[1], width, height)
    rect_surface = pygame.Surface(rect.size, pygame.SRCALPHA)
    pygame.draw.rect(rect_surface, (*BLUE_EFFECT_COLOR, alpha), rect_surface.get_rect(), border_radius=corner)
    pygame.draw.rect(rect_surface, TICK_COLOR, rect_surface.get_rect(), 1, border_radius=corner)

    display.blit(rect_surface, rect.topleft)

    line_surface = pygame.Surface((width, height), pygame.SRCALPHA)

    for i in range(1, width // factor):
        v_start_pos = (i * factor, 0)
        v_end_pos = (i * factor, height)
        pygame.draw.line(line_surface, (*TICK_COLOR, alpha // 2), v_start_pos, v_end_pos, 1)

    for i in range(1, height // factor):
        h_start_pos = (0, i * factor)
        h_end_pos = (width, i * factor)
        pygame.draw.line(line_surface, (*TICK_COLOR, alpha // 2), h_start_pos, h_end_pos, 1)

    display.blit(line_surface, position)

    pygame.draw.line(display, TICK_COLOR, line_start_pos, pan_circle_center, 1)
    pygame.draw.circle(display, TICK_COLOR, pan_circle_center, 5)
    pygame.draw.circle(display, TICK_COLOR, pan_circle_center, 10, 1)
    pygame.draw.line(display, TICK_COLOR, pan_circle_center, tilt_circle_center, 1)
    pygame.draw.circle(display, TICK_COLOR, tilt_circle_center, 5)
    pygame.draw.circle(display, TICK_COLOR, tilt_circle_center, 10, 1)
    pygame.draw.line(display, TICK_COLOR, tilt_circle_center, line_end_pos, 1)


def draw_display_scale(display: pygame.Surface) -> None:
    """
    Draws a display scale with ticks and boundary lines on a given pygame surface.

    :param display: The pygame surface where the scale will be drawn.
    :type display: pygame.Surface
    :return: None
    """
    tick_spacing: int = 2
    base_tick_length: int = 5
    long_tick_increment: int = 5

    left_outer_line_start = (BORDER_DISTANCES, BORDER_DISTANCES)
    left_outer_line_end = (BORDER_DISTANCES, display.get_height() - BORDER_DISTANCES)
    left_small_line_end = (BORDER_DISTANCES * 4, BORDER_DISTANCES)
    right_outer_line_start = (display.get_width() - BORDER_DISTANCES, BORDER_DISTANCES)
    right_outer_line_end = (display.get_width() - BORDER_DISTANCES, display.get_height() - BORDER_DISTANCES)
    right_small_line_end = (display.get_width() - BORDER_DISTANCES * 4, BORDER_DISTANCES)
    start_x = BORDER_DISTANCES
    end_x = display.get_width() - BORDER_DISTANCES
    y_position = display.get_height() - BORDER_DISTANCES
    tick_x = start_x
    tick_index = 0

    while tick_x <= end_x:
        tick_length = base_tick_length + (long_tick_increment if tick_index % 5 == 0 else 0)
        start_pos = (tick_x, y_position - tick_length)
        end_pos = (tick_x, y_position)
        pygame.draw.line(display, TICK_COLOR, start_pos, end_pos)

        tick_x += tick_spacing
        tick_index += 1

    pygame.draw.line(display, TICK_COLOR, left_outer_line_start, left_outer_line_end, 1)
    pygame.draw.line(display, TICK_COLOR, left_outer_line_start, left_small_line_end, 1)
    pygame.draw.line(display, TICK_COLOR, right_outer_line_start, right_outer_line_end, 1)
    pygame.draw.line(display, TICK_COLOR, right_outer_line_start, right_small_line_end, 1)


def draw_hand_effect(display: pygame.Surface, center: Tuple[int, int], tip: Tuple[int, int], angle: int) -> None:
    """
    Draws a visual hand effect on the given display surface.

    :param display: The surface where the hand effect will be drawn.
    :type display: pygame.Surface
    :param center: The (x, y) coordinate of the hand center.
    :type center: Tuple[int, int]
    :param tip: The (x, y) coordinate of the tip of the hand, determining the radius.
    :type tip: Tuple[int, int]
    :param angle: The angle of the hand in degrees, affects rotation of the effect.
    :type angle: int
    :return: None
    """
    circle_center = center
    circle_radius = int(np.sqrt((tip[0] - center[0]) ** 2 + (tip[1] - center[1]) ** 2)) - 25
    circle_angle = angle
    angle_radians = radians(angle - 90)
    line_end_x = circle_center[0] + int(circle_radius * np.cos(angle_radians))
    line_end_y = circle_center[1] + int(circle_radius * np.sin(angle_radians))
    lines = get_circle_lines(circle_radius + 15, 14, 90, circle_center)
    points = get_circle_points(circle_radius - 20, 50, angle, circle_center)
    end_point = (circle_center[0], display.get_height() - BORDER_DISTANCES)

    pygame.draw.circle(display, BLUE_EFFECT_COLOR, circle_center, 5)
    pygame.draw.circle(display, BLUE_EFFECT_COLOR, circle_center, circle_radius - 40, 2)

    draw_arc(display, circle_center, circle_radius - 30, circle_angle - 45)

    for point in points:
        pygame.draw.circle(display, BLUE_EFFECT_COLOR, (int(point[0]), int(point[1])), 2)

    draw_arc(display, circle_center, circle_radius - 10, circle_angle + 135)

    pygame.draw.circle(display, BLUE_EFFECT_COLOR, circle_center, circle_radius, 2)
    pygame.draw.line(display, BLUE_EFFECT_COLOR, circle_center, (line_end_x, line_end_y), 2)

    pygame.draw.circle(display, BLUE_EFFECT_COLOR, (line_end_x, line_end_y), 5)

    for line in lines:
        start = (int(line[0]), int(line[1]))
        end = (int(line[2]), int(line[3]))
        pygame.draw.line(display, TICK_COLOR, start, end, 1)

    pygame.draw.line(display, BLUE_EFFECT_COLOR, circle_center, end_point, 2)


def main() -> None:
    """
    The `main` function serves as the primary execution loop for the application, handling
    video frame capturing, processing of hand gestures, rendering visual output to the display,
    and managing user input/events.

    :return: None
    """
    global screen, clock, hands, mp_hands

    frame_count = {'left': 0, 'right': 0}
    angles = {'left': 0, 'right': 0}
    running = True

    while running:
        for event in pygame.event.get():
            if event.type == pygame.QUIT:
                running = False
            if event.type == pygame.KEYDOWN:
                if event.key == pygame.K_ESCAPE:
                    running = False
                if event.key == pygame.K_q:
                    running = False

        ret, frame = camera.read()

        if not ret:
            print('[ERROR] Failed to grab video frame')
            break

        if frame is None or frame.size == 0:
            print('[WARNING] Empty video frame. Skipping...')
            continue

        frame = cv2.flip(frame, flipCode=1)
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

        if frame.dtype != np.uint8:
            frame = frame.astype(np.uint8)

        frame = frame.copy()
        hand_coords = get_finger_coords(video_frame=frame, hand_detector=hands, hands_ref=mp_hands)

        frame_surface = pygame.image.frombuffer(frame.tobytes(), frame.shape[1::-1], "RGB")
        screen.blit(frame_surface, dest=(0, 0))

        for hand_side in ('left', 'right'):
            if hand_coords[hand_side]:
                if is_fist_active(hand_coords[hand_side]):
                    if all(k in hand_coords[hand_side] for k in ['middle_tip', 'middle_dip', 'middle_mcp']):
                        frame_count[hand_side] += 1
                        tip_coords = hand_coords[hand_side]['middle_tip']
                        center_coords = hand_coords[hand_side]['middle_mcp']
                        angle = get_angle(
                            point_1=hand_coords[hand_side]['middle_tip'],
                            point_2=hand_coords[hand_side]['middle_dip']
                        )

                        # get_angle returns None when both points coincide; keep the last valid angle then
                        if angle is not None:
                            angles[hand_side] = angle

                            if frame_count[hand_side] >= FRAME_THRESHOLD:
                                draw_hand_effect(display=screen, center=center_coords, tip=tip_coords, angle=angle)
                    else:
                        frame_count[hand_side] = 0
                else:
                    angles[hand_side] = 0
                    frame_count[hand_side] = 0
            else:
                frame_count[hand_side] = 0

        draw_display_scale(display=screen)
        draw_finger_information(display=screen, position=(50, 50), coords=hand_coords)
        draw_angle_information(display=screen, position=(screen.get_width() - 400, 50), degrees=angles)

        send_udp_message(message=angles)

        pygame.display.flip()
        clock.tick(FPS)


if __name__ == '__main__':
    # Signal
    signal(SIGINT, signal_handler)

    # PyGame
    pygame.init()
    screen = pygame.display.set_mode(WINDOW_SIZE)
    pygame.display.set_caption(WINDOW_NAME)
    clock = pygame.time.Clock()
    font = pygame.font.SysFont('courier', 12)

    # OpenCV camera
    camera = cv2.VideoCapture(0)
    camera.set(cv2.CAP_PROP_FRAME_WIDTH, WINDOW_SIZE[0])
    camera.set(cv2.CAP_PROP_FRAME_HEIGHT, WINDOW_SIZE[1])

    if not camera.isOpened():
        print('[ERROR] Cannot open camera. Stopping program...')
        exit(1)

    # Mediapipe hands
    mp_hands = mp.solutions.hands
    hands = mp_hands.Hands(static_image_mode=False,
                           max_num_hands=2,
                           min_detection_confidence=0.75,
                           min_tracking_confidence=0.75)

    # UDP socket
    sock = socket(AF_INET, SOCK_DGRAM)

    # Ensure cleanup on exit
    register(lambda: (sock.close(), camera.release(), cv2.destroyAllWindows()))

    try:
        main()
    except KeyboardInterrupt:
        print('Received KeyboardInterrupt. Exiting...')
    finally:
        print('Stopping application...')
        sock.close()
        pygame.quit()
        camera.release()
        cv2.destroyAllWindows()
        exit(0)
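
The angle calculation in get_angle is easier to grasp with a few concrete points. The following sketch mirrors its formula with the standard math module instead of NumPy, and with round() instead of int() to sidestep floating-point truncation. The names angle_from_vertical and to_signed_byte are made up for this example. The result can range from -90 to 270, while the "2b" struct format only accepts -128 to 127, so a clamp like to_signed_byte is one possible way to keep every value packable.

```python
from math import atan2, degrees


def angle_from_vertical(tip: tuple, dip: tuple) -> int:
    """Angle of the line dip -> tip, 0 when the finger points straight up."""
    dx, dy = tip[0] - dip[0], tip[1] - dip[1]
    return round(degrees(atan2(dy, dx)) + 90)


def to_signed_byte(value: int) -> int:
    """Clamp a value into the -128..127 range of struct format 'b'."""
    return max(-128, min(127, value))


# Image coordinates: y grows downwards, so a smaller y means "higher up".
print(angle_from_vertical((100, 50), (100, 100)))   # finger straight up    -> 0
print(angle_from_vertical((150, 50), (100, 100)))   # tilted to the right   -> 45
print(angle_from_vertical((50, 50), (100, 100)))    # tilted to the left    -> -45
print(angle_from_vertical((50, 150), (100, 100)))   # pointing down-left    -> 225
print(to_signed_byte(225))                          # clamped for pack("2b") -> 127
```

With the receiver's threshold of 30, the 45 degree tilts in this example are enough to trigger a movement, while an upright finger is not.
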
STEP 6
Installation of the necessary packages

It is recommended to create a Python virtualenv, install the modules/packages in it and run the Python program from it.

CODE
# create virtualenv (only one time)
$ python3 -m venv .venv

# activate virtualenv
$ source .venv/bin/activate

# install modules/packages (only one time)
(.venv) $ pip3 install -r requirements.txt

# run Python application
(.venv) $ python3 main.py

Then start the application on the Unihiker M10 and control the Pan-Tilt with your hand gestures.
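
If the servos do not react, it can help to test the UDP path without the camera and MediaPipe. The sketch below sends one fixed pan/tilt pair in the same 2-byte format. It uses a loopback receiver so that it runs on a single machine; to test against the real device you would pass the Unihiker's address, for example ('10.1.2.3', 12345) from the constants above, as the target instead. send_angles is a name made up for this example.

```python
from socket import socket, AF_INET, SOCK_DGRAM
from struct import pack, unpack


def send_angles(sock: socket, target: tuple, pan: int, tilt: int) -> None:
    """Send one pan/tilt pair as two signed bytes, like main.py does."""
    sock.sendto(pack("2b", pan, tilt), target)


# Loopback receiver standing in for the Unihiker (port 0 = let the OS pick a free port)
receiver = socket(AF_INET, SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)

sender = socket(AF_INET, SOCK_DGRAM)
send_angles(sender, receiver.getsockname(), pan=45, tilt=-45)

data, _ = receiver.recvfrom(2)
received = unpack("2b", data)
print(received)

sender.close()
receiver.close()
```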

STEP 7
Annotations

Debug UDP traffic

 

You can also make the UDP transmission visible, for example with Wireshark or, in a terminal, with tcpdump. Here is an example for tcpdump:

CODE
# tcpdump UDP (optional)
$ sudo tcpdump -i [YOUR INTERFACE] udp port 12345 -X

15:25:28.000080 IP 10.1.2.101.57396 > 10.1.2.3.italk: UDP, length 2
	0x0000:  4500 001e 38a0 0000 4011 29c6 0a01 0265  E...8...@.)....e
	0x0010:  0a01 0203 e034 3039 000a d702 0000       .....409......

Did you find the bytes that carry the two integers of the transmission? I won't give the answer away, but since the values are not encrypted, you can read them directly from the hex dump.
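
As a hint for reading such a dump: the payload is the last two bytes of the datagram, and negative values appear in two's complement. The sketch below shows how a pair of signed bytes looks in hex and how to turn it back into integers. The payload 2dd3 is a made-up example and is not taken from the capture above.

```python
from struct import pack, unpack

# Two signed bytes as they would appear at the very end of a hex dump
payload = pack("2b", 45, -45)
print(payload.hex())                 # 45 -> 2d, -45 -> d3 (two's complement)

# The other direction: decode the last two bytes of a captured hex string
captured_hex = "2dd3"                # hypothetical payload for illustration
decoded = unpack("2b", bytes.fromhex(captured_hex))
print(decoded)
```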

 

I hope you've had a lot of fun with the tutorials so far and learned something new. You can find a reel about this project on Instagram. See you next time.

License
All Rights Reserved