1. Project Introduction
1.1 Project Overview
Is a long drive boring? Do you want a companion that can understand your emotions while you're driving and even "save your life" when you doze off? We've created a facial expression recognizer using the UNIHIKER K10! It detects facial keypoints in real time with the MediaPipe Face Mesh keypoint model, analyzing the keypoints of the mouth, eyes, and eyebrows to precisely identify your current expression. It will encourage you when you are happy, soothe you when you are angry, and remind you when you feel drowsy while driving, bringing you an interactive, fun, and safe driving experience.

The computer communicates with the UNIHIKER K10 via SIoT. When you face the K10's camera and make an expression such as a smile, the computer will recognize your current expression and display it. In addition, the UNIHIKER K10 will respond to your facial expression, delivering a full artificial-intelligence experience!
1.2 Project Functional Diagrams

1.3 Project Video
2. Materials List
2.1 Hardware List

2.2 Software
Mind+ Graphical Programming Software (Minimum Version Requirement: V1.8.1 RC1.0)

2.3 Basic Mind+ Software Usage
(1) Double-click to open Mind+
The following screen will appear.

Click and switch to offline mode.

(2) Load UNIHIKER K10
Next, click "Extensions", find the "UNIHIKER K10" module under the "Board" tab, and click to add it. After clicking "Back", you will find the UNIHIKER K10 blocks in the Command Area, which completes the loading of the UNIHIKER K10.

Then, you need to use a USB cable to connect the UNIHIKER K10 to the computer.

Then, click "Connect Device" and select "COM-UNIHIKER K10" to connect.

Note: The device name may vary between UNIHIKER K10 boards, but it always ends with K10.
In Windows 10/11, UNIHIKER K10 is driver-free. However, for Windows 7, manual driver installation is required: https://www.unihiker.com/wiki/K10/faq/#high-frequency-problem.
The interface you see next is the Mind+ programming interface. Let's look at what it consists of.

Note: For a detailed description of each area of the Mind+ interface, see the Knowledge Hub section of this lesson.
3. Construction Steps
The project is divided into three main parts:
(1) Task 1: UNIHIKER K10 Networking and Webcam Activation
Connect the UNIHIKER K10 to the network via IoT communication and enable its webcam function to establish a video channel to the computer for transmitting image data.
(2) Task 2: Visual Detection and Data Upload
The video frames containing faces captured by the UNIHIKER K10 camera are transmitted to the computer, where the real-time video stream is processed with the MediaPipe Face Mesh library. Meanwhile, the UNIHIKER K10 is connected to the MQTT platform, and the detection results are uploaded from the computer to the SIoT platform.
(3) Task 3: UNIHIKER K10 Receiving Results and Executing Control
UNIHIKER K10 remotely retrieves the inference results from SIoT and performs different functions according to the detected facial expression.
3.1 Task 1: UNIHIKER K10 Networking and Webcam Activation
(1) Hardware Setup
Confirm that the UNIHIKER K10 is connected to the computer via a USB cable.
(2) Software Preparation
Make sure that Mind+ is opened and the UNIHIKER board has been successfully loaded. Once confirmed, you can proceed to write the project program.

(3) Write the Program
UNIHIKER K10 Network Connection and Webcam Activation
To enable communication between the computer and UNIHIKER K10, first ensure both devices are connected to the same local network.

First, add the MQTT communication and Wi-Fi modules from the extension library. Refer to the diagram for the commands.

After the UNIHIKER K10 is connected to the network, its camera feed needs to be streamed over the local area network (LAN) so that any computer on the LAN can access it at any time. This allows the computer to use existing computer vision libraries for image recognition. Therefore, we next load the webcam library so that the images captured by the UNIHIKER K10 can be transmitted to the computer.
Click "Extensions" in the Mind+ toolbar and open the "User Ext" category. Enter https://gitee.com/yeezb/k10web-cam in the search bar, and click to load the K10 webcam extension library.
User library link: https://gitee.com/yeezb/k10web-cam

We need to use the "Wi-Fi connect to account (SSID, Password)" command in the network communication extension library to configure Wi-Fi for the UNIHIKER K10. Please ensure that the UNIHIKER K10 is connected to the same Wi-Fi network as your computer. We also need the "Webcam On" block to enable the webcam feature so that the images captured by the UNIHIKER K10 can be transmitted to the computer.

Once the network connection is established, the "Webcam On" block can be used to transmit the images captured by the UNIHIKER K10 to the computer.

Then, we need to read the IP address of the UNIHIKER K10 from the serial monitor. Use a repeat-5-times loop so that the IP address is printed five times: drag in the serial content output block, select string output and enable line wrapping, then take the acquired IP address from the Wi-Fi configuration and print it once every second.

Click the Upload button. When the burning progress reaches 100%, the program has been uploaded successfully.

Open the serial monitor, and you can see the IP address of the UNIHIKER K10.
Open <IP address>/stream in your browser to view the camera feed. For example, for the IP shown in the picture above, you would enter 192.168.11.72/stream in your browser.
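If you prefer to verify the stream from Python instead of the browser, the minimal sketch below fetches a few kilobytes from the stream and checks for a complete JPEG frame. It assumes the K10's IP address is 192.168.11.72; replace it with the address shown in your serial monitor.

import requests

url = 'http://192.168.11.72/stream'   # replace with your UNIHIKER K10 IP address
resp = requests.get(url, stream=True, timeout=10)
data = b''
for chunk in resp.iter_content(chunk_size=1024):
    data += chunk
    # Each MJPEG frame starts with FF D8 and ends with FF D9
    if data.find(b'\xff\xd8') != -1 and data.find(b'\xff\xd9') != -1:
        print("Received a complete JPEG frame from the K10 webcam")
        break
resp.close()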

3.2 Task 2: Vision Detection and Data Upload
Next, we need to connect the UNIHIKER K10 to the MQTT platform and create an SIoT topic to store the results. The computer will run facial keypoint detection and analyze expressions based on the keypoints, and the detection results will then be sent to SIoT for the UNIHIKER K10 to process.
(1) Prepare the Computer Environment
First, on the computer, we need to download the Windows version of SIoT_V2, extract it, and double-click start SIoT.bat to start SIoT. After it starts, a black window will pop up to initialize the server. Critical note: DO NOT close this window while the project is running, as doing so terminates the SIoT service immediately.

Note: For details on downloading SIoT_V2, please refer to: https://drive.google.com/file/d/1qVhyUmvdmpD2AYl-2Cijl-2xeJgduJLL/view?usp=drive_link
After starting SIoT.bat on the computer, initialize the MQTT parameters on the UNIHIKER K10: set the IP address to the computer's local IP, the username to siot, and the password to dfrobot.

We need to install the Python dependencies used to recognize and process the facial keypoint information. Open a new Mind+ window, navigate to the mode switch section, and select "Python Mode".

In Python Mode, click "Code" in the toolbar, then open "Library Management"; the library installation page will appear for dependency management.

Click "PIP Mode" and run the following commands in sequence to install the six required libraries, including mediapipe and ultralytics.
pip install mediapipe
pip install ultralytics
pip install numpy
pip install requests
pip install opencv-python
pip install opencv-contrib-python
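To confirm that the installation worked, you can run a quick import check in Mind+ Python mode. This is just a sanity-check sketch, not part of the project code.

import cv2
import mediapipe
import numpy
import requests
import ultralytics

# If all imports succeed, the libraries are installed correctly
print("mediapipe:", mediapipe.__version__)
print("numpy:", numpy.__version__)
print("opencv:", cv2.__version__)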


(2) Write the Program
STEP One: Create Topics
Access "your computer's IP address:8080", for example 192.168.11.41:8080, in a web browser on your computer.

Enter the username 'siot' and password 'dfrobot' to log in to the SIoT IoT platform.

After logging in to the SIoT platform, navigate to the Topic section and create a topic named 'emo' (used to store the control instructions for the UNIHIKER K10). Refer to the operations shown in the image below.
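If you want to confirm that the topic is reachable, you can optionally publish a test message to it from Python using the same siot library that the detection script uses later. This is a minimal sketch, assuming it is run on the same computer that hosts SIoT; the client_id is arbitrary.

import siot

# Connect to the SIoT server running locally on this computer
siot.init(client_id="test-client", server="127.0.0.1", port=1883, user="siot", password="dfrobot")
siot.connect()
siot.loop()

# Send one test command; it should appear under the emo topic in the SIoT web page
siot.publish(topic="siot/emo", data="a")
print("Test message sent to siot/emo")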

Next, we will write the code for facial expression detection. Its main functional module recognizes expressions based on calculations over the keypoint positions.
Create a new Python file named "visiondetect.py" in Mind+: in the "Files in Project" directory of the right sidebar, create a new .py file named "visiondetect".

STEP Two: Facial Expression Recognition Code
We use the MediaPipe Face Mesh model to detect facial keypoints. Based on the degree of mouth opening, the aspect ratio and closing time of the eyes, and the position of the eyebrows in the image, the script sends commands to the "emo" topic on the SIoT platform.
import cv2
import requests
import numpy as np
import mediapipe as mp
import time
import siot
import math
from collections import Counter

# ------ Initialize MediaPipe and SIoT --------
mp_face_mesh = mp.solutions.face_mesh
face_mesh = mp_face_mesh.FaceMesh(
    max_num_faces=1,
    refine_landmarks=True,
    min_detection_confidence=0.5,
    min_tracking_confidence=0.5
)
mp_drawing = mp.solutions.drawing_utils
mp_drawing_styles = mp.solutions.drawing_styles

# Initialize SIoT connection
siot.init(client_id="32867041742986824", server="127.0.0.1", port=1883, user="siot", password="dfrobot")
siot.connect()
siot.loop()

# ------ Parameter Configuration --------
url = 'http://192.168.11.72/stream'  # UNIHIKER K10 IP address
emotion_colors = {
    'Happy': (0, 255, 0),        # Green
    'Neutral': (200, 200, 200),  # Gray
    'Angry': (0, 0, 255),        # Red
    'Tired': (255, 255, 0),      # Yellow
}

# Facial landmark indices for eyes
LEFT_EYE = [362, 385, 387, 263, 373, 380]
RIGHT_EYE = [33, 160, 159, 158, 153, 144]

EMOTION_UPDATE_INTERVAL = 2            # Seconds
MAX_HISTORY = 10                       # Number of recent emotions to track
EYE_AR_THRESH = 0.25                   # Threshold for eye closure
EYE_CLOSED_DURATION_FOR_FATIGUE = 1.0  # Seconds
last_sent_emotion = None

# ------ Helper Functions --------
def eye_aspect_ratio(eye_points, landmarks, frame_shape):
    """Calculate Eye Aspect Ratio (EAR) to determine eye openness"""
    points = []
    for idx in eye_points:
        x = landmarks[idx].x * frame_shape[1]
        y = landmarks[idx].y * frame_shape[0]
        points.append((x, y))
    # Calculate vertical and horizontal distances
    vertical1 = math.dist(points[1], points[5])
    vertical2 = math.dist(points[2], points[4])
    horizontal = math.dist(points[0], points[3])
    # Compute EAR
    ear = (vertical1 + vertical2) / (2.0 * horizontal)
    return ear

def detect_fatigue(face_landmarks, frame_shape):
    """Detect fatigue based on eye closure"""
    left_ear = eye_aspect_ratio(LEFT_EYE, face_landmarks, frame_shape)
    right_ear = eye_aspect_ratio(RIGHT_EYE, face_landmarks, frame_shape)
    ear_avg = (left_ear + right_ear) / 2.0
    return ear_avg

def recognize_emotion(face_landmarks, frame_shape, ear_avg, eye_closed_duration):
    """Recognize emotion based on facial landmarks"""
    # Prioritize fatigue detection if eyes are closed long enough
    if eye_closed_duration >= EYE_CLOSED_DURATION_FOR_FATIGUE:
        return 'Tired', 0.95
    # Get key facial landmarks
    left_eye = (int(face_landmarks[159].x * frame_shape[1]), int(face_landmarks[159].y * frame_shape[0]))
    right_eye = (int(face_landmarks[386].x * frame_shape[1]), int(face_landmarks[386].y * frame_shape[0]))
    mouth_center = (int(face_landmarks[14].x * frame_shape[1]), int(face_landmarks[14].y * frame_shape[0]))
    # Calculate mouth openness
    mouth_openness = abs(face_landmarks[13].y - face_landmarks[14].y) * frame_shape[0]
    # Determine emotion based on facial features
    if mouth_openness > 30:
        emotion = 'Happy'
    elif mouth_openness < 8:
        if abs(face_landmarks[67].y - face_landmarks[63].y) * frame_shape[0] < 30:
            emotion = 'Angry'
        else:
            emotion = 'Neutral'
    else:
        if face_landmarks[13].y < face_landmarks[14].y + 0.01 * frame_shape[0]:
            emotion = 'Happy'
        else:
            emotion = 'Neutral'
    # Generate confidence score
    confidence = 0.85 + np.random.random() * 0.14
    return emotion, confidence

# ------ Main Processing Loop --------
emotion_history = []
last_emotion = None
last_emotion_time = 0
eye_closed_start_time = None

# Initialize display window
cv2.namedWindow("Facial Emotion & Fatigue Detection", cv2.WINDOW_NORMAL)
cv2.resizeWindow("Facial Emotion & Fatigue Detection", 800, 600)

try:
    # Connect to video stream
    response = requests.get(url, stream=True, timeout=10)
    print("Connected to UNIHIKER video stream")
    img_data = b''
    while True:
        # Read video stream chunks
        for chunk in response.iter_content(chunk_size=1024):
            if chunk:
                img_data += chunk
                # Find JPEG start and end markers
                a = img_data.find(b'\xff\xd8')
                b = img_data.find(b'\xff\xd9')
                if a != -1 and b != -1:
                    # Extract and decode JPEG frame
                    jpg = img_data[a:b+2]
                    img_data = img_data[b+2:]
                    img = cv2.imdecode(np.frombuffer(jpg, dtype=np.uint8), cv2.IMREAD_COLOR)
                    if img is not None:
                        # Mirror image and convert to RGB
                        frame = cv2.flip(img, 1)
                        rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
                        # Process frame with Face Mesh
                        results = face_mesh.process(rgb_frame)
                        current_emotion = None
                        confidence = 0
                        ear_avg = None
                        eye_closed_duration = 0.0
                        if results.multi_face_landmarks:
                            for face_landmarks in results.multi_face_landmarks:
                                # Draw facial landmarks
                                mp_drawing.draw_landmarks(
                                    image=frame,
                                    landmark_list=face_landmarks,
                                    connections=mp_face_mesh.FACEMESH_TESSELATION,
                                    landmark_drawing_spec=None,
                                    connection_drawing_spec=mp_drawing_styles.get_default_face_mesh_tesselation_style()
                                )
                                # Detect fatigue
                                landmarks_list = face_landmarks.landmark
                                ear_avg = detect_fatigue(landmarks_list, frame.shape[:2])
                                # Track eye closure duration
                                if ear_avg < EYE_AR_THRESH:
                                    if eye_closed_start_time is None:
                                        eye_closed_start_time = time.time()
                                    else:
                                        eye_closed_duration = time.time() - eye_closed_start_time
                                else:
                                    eye_closed_start_time = None
                                    eye_closed_duration = 0.0
                                # Recognize emotion
                                current_emotion, confidence = recognize_emotion(
                                    landmarks_list,
                                    frame.shape[:2],
                                    ear_avg,
                                    eye_closed_duration
                                )
                                # Calculate face bounding box
                                h, w, _ = frame.shape
                                x_min = min([lm.x for lm in landmarks_list]) * w
                                y_min = min([lm.y for lm in landmarks_list]) * h
                                x_max = max([lm.x for lm in landmarks_list]) * w
                                y_max = max([lm.y for lm in landmarks_list]) * h
                                # Add margin to bounding box
                                margin = 20
                                x_min = max(0, int(x_min) - margin)
                                y_min = max(0, int(y_min) - margin)
                                x_max = min(w, int(x_max) + margin)
                                y_max = min(h, int(y_max) + margin)
                                # Draw bounding box and emotion label
                                color = emotion_colors.get(current_emotion, (200, 200, 200))
                                cv2.rectangle(frame, (x_min, y_min), (x_max, y_max), color, 2)
                                label = f"{current_emotion} ({confidence*100:.1f}%)"
                                cv2.putText(frame, label, (x_min, y_min - 10),
                                            cv2.FONT_HERSHEY_SIMPLEX, 0.7, color, 2)
                        # Display fatigue information
                        if ear_avg is not None:
                            cv2.putText(frame, f"EAR: {ear_avg:.2f}", (10, 30),
                                        cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 255), 2)
                            if eye_closed_start_time is not None:
                                cv2.putText(frame, f"Closed: {eye_closed_duration:.1f}s", (10, 60),
                                            cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)
                        # Track emotion history
                        if current_emotion:
                            emotion_history.append(current_emotion)
                            if len(emotion_history) > MAX_HISTORY:
                                emotion_history.pop(0)
                            # Determine stable emotion from history
                            emotion_counts = Counter(emotion_history)
                            stable_emotion = emotion_counts.most_common(1)[0][0]
                            # Check if we should update the emotion
                            current_time = time.time()
                            if (current_time - last_emotion_time > EMOTION_UPDATE_INTERVAL) or \
                               (stable_emotion != last_emotion and current_time - last_emotion_time > 0.5):
                                print(f"[Emotion] Detected: {stable_emotion}")
                                last_emotion = stable_emotion
                                last_emotion_time = current_time
                                # Send command via SIoT
                                if stable_emotion in ['Happy', 'Angry', 'Tired']:
                                    if stable_emotion != last_sent_emotion:
                                        try:
                                            if stable_emotion == 'Happy':
                                                siot.publish(topic="siot/emo", data='a')
                                            elif stable_emotion == 'Angry':
                                                siot.publish(topic="siot/emo", data='b')
                                            elif stable_emotion == 'Tired':
                                                siot.publish(topic="siot/emo", data='c')
                                            print(f"[SIOT] Sent command for {stable_emotion}")
                                            last_sent_emotion = stable_emotion
                                        except Exception as e:
                                            print(f"[SIOT Error] {str(e)}")
                        # Display system status
                        status = f"Face Detection: {'Active' if results.multi_face_landmarks else 'Inactive'}"
                        cv2.putText(frame, status, (10, frame.shape[0] - 20),
                                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 255, 255), 2)
                        # Display emotion information
                        if current_emotion:
                            emotion_status = f"Current: {current_emotion} | Stable: {last_emotion if last_emotion else 'None'}"
                            cv2.putText(frame, emotion_status, (10, frame.shape[0] - 50),
                                        cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 255, 255), 2)
                        # Show processed frame
                        cv2.imshow("Facial Emotion & Fatigue Detection", frame)
                        # Check for exit key
                        if cv2.waitKey(1) & 0xFF == ord('q'):
                            response.close()
                            cv2.destroyAllWindows()
                            exit(0)
except requests.exceptions.RequestException as e:
    print(f"Stream connection error: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")
finally:
    # Clean up resources
    cv2.destroyAllWindows()
    try:
        response.close()
    except:
        pass
The code adopts multi-modal feature extraction combined with rule-based threshold decisions for emotion recognition, mainly using the following facial features:
1. In most cases the mouth opens when a person is happy, so a happy expression is detected when the distance between the upper and lower lips exceeds 30 pixels;
2. An angry expression is detected when the mouth is tightly closed (the distance between the upper and lower lips is less than 8 pixels) and the eyebrows are furrowed downward, with the vertical eyebrow distance less than 30 pixels;
3. If the eyes stay closed for more than one second, the driver is probably dozing off, and the program reports that they are tired.
The relevant helper functions are shown again below for reference:
# ------ Helper Functions --------
def eye_aspect_ratio(eye_points, landmarks, frame_shape):
    """Calculate Eye Aspect Ratio (EAR) to determine eye openness"""
    points = []
    for idx in eye_points:
        x = landmarks[idx].x * frame_shape[1]
        y = landmarks[idx].y * frame_shape[0]
        points.append((x, y))
    # Calculate vertical and horizontal distances
    vertical1 = math.dist(points[1], points[5])
    vertical2 = math.dist(points[2], points[4])
    horizontal = math.dist(points[0], points[3])
    # Compute EAR
    ear = (vertical1 + vertical2) / (2.0 * horizontal)
    return ear

def detect_fatigue(face_landmarks, frame_shape):
    """Detect fatigue based on eye closure"""
    left_ear = eye_aspect_ratio(LEFT_EYE, face_landmarks, frame_shape)
    right_ear = eye_aspect_ratio(RIGHT_EYE, face_landmarks, frame_shape)
    ear_avg = (left_ear + right_ear) / 2.0
    return ear_avg

def recognize_emotion(face_landmarks, frame_shape, ear_avg, eye_closed_duration):
    """Recognize emotion based on facial landmarks"""
    # Prioritize fatigue detection if eyes are closed long enough
    if eye_closed_duration >= EYE_CLOSED_DURATION_FOR_FATIGUE:
        return 'Tired', 0.95
    # Get key facial landmarks
    left_eye = (int(face_landmarks[159].x * frame_shape[1]), int(face_landmarks[159].y * frame_shape[0]))
    right_eye = (int(face_landmarks[386].x * frame_shape[1]), int(face_landmarks[386].y * frame_shape[0]))
    mouth_center = (int(face_landmarks[14].x * frame_shape[1]), int(face_landmarks[14].y * frame_shape[0]))
    # Calculate mouth openness
    mouth_openness = abs(face_landmarks[13].y - face_landmarks[14].y) * frame_shape[0]
    # Determine emotion based on facial features
    if mouth_openness > 30:
        emotion = 'Happy'
    elif mouth_openness < 8:
        if abs(face_landmarks[67].y - face_landmarks[63].y) * frame_shape[0] < 30:
            emotion = 'Angry'
        else:
            emotion = 'Neutral'
    else:
        if face_landmarks[13].y < face_landmarks[14].y + 0.01 * frame_shape[0]:
            emotion = 'Happy'
        else:
            emotion = 'Neutral'
    # Generate confidence score
    confidence = 0.85 + np.random.random() * 0.14
    return emotion, confidence
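For intuition, here is a small worked example (with hypothetical pixel values) of how the eye aspect ratio and the thresholds above lead to a decision.

import math

# Six (x, y) eye landmark points in pixels for a nearly closed eye (hypothetical values)
eye = [(100, 50), (110, 48), (120, 48), (130, 50), (120, 52), (110, 52)]
vertical1 = math.dist(eye[1], eye[5])    # 4 px
vertical2 = math.dist(eye[2], eye[4])    # 4 px
horizontal = math.dist(eye[0], eye[3])   # 30 px
ear = (vertical1 + vertical2) / (2.0 * horizontal)
print(f"EAR = {ear:.2f}")                # about 0.13, below the 0.25 threshold -> eyes counted as closed

# If the EAR stays below 0.25 for 1 second or longer, the script reports 'Tired'.
# Otherwise: mouth openness > 30 px -> 'Happy'; mouth openness < 8 px with the
# eyebrows pulled down (vertical distance < 30 px) -> 'Angry'; everything else -> 'Neutral'.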
Since we need to obtain the camera feed from the UNIHIKER K10, replace the value of the url variable near the top of the code (url = 'http://192.168.11.72/stream') with the IP address of your own UNIHIKER K10.

Copy the code into the "visiondetect.py" file created in Mind+. (Note: the complete "visiondetect.py" file is provided as an attachment for reference.)
Click Run to start the program.

When your face fully appears in the window, the program will detect the position of your face.
At this point, if you make different expressions in front of the camera, the results of the expression recognition will be displayed in the video frame.
Monitor the terminal output for real-time status updates, as shown in the figure below.

3.3 Task 3: UNIHIKER K10 Receives Results and Executes Control
Next, we will implement the final function: the UNIHIKER K10 performs the corresponding action based on the facial expression information received from SIoT, such as playing soothing music when you are angry, showing a smile when you are happy, and sounding an alarm when you are tired.
(1) Material Preparation
Use a TF card and a card reader to copy the music files to the TF card from a computer, as shown in the image below.
This way, the corresponding music can be played whenever a facial expression is recognized.

Confirm that the UNIHIKER K10 is connected to the computer via a USB cable, then insert the TF card into the card slot of the UNIHIKER K10.

Make sure that Mind+ is opened and the UNIHIKER board has been successfully loaded. Once confirmed, you can proceed to write the complete project program.
(2) Write the Program
STEP One: Subscribe to SIoT Topics on UNIHIKER K10
Add an MQTT subscribe block and in the topic input field, enter "siot/emo".

STEP Two: Match the corresponding picture
Use the "When MQTT message received from topic_0" block so that the UNIHIKER K10 can process the commands coming from siot/emo.

Use "if-then" blocks and comparison operators to set the functions of the UNIHIKER K10 based on the content of MQTT messages. If the MQTT message is equal to "a", the function for when you are happy will be implemented. If the MQTT message is equal to "b", the function for when you are angry will be implemented. If the MQTT message is equal to "c", the function for when you are tired will be implemented.

Then, use the "Cache Local Images" and "Show Cached Content" modules to display images on the screen of the UNIHIKER K10. When the user is happy, they can see a smiling photo to maintain their good mood; when angry, they can see the words "Anger is the devil" to stabilize their emotions; when tired, they can see the words "Please drive safely" to remind the user that it is time to rest.


Select the pictures in the attachment folder, choosing the pictures corresponding to "happy", "angry", and "tired" in order from top to bottom.

Then use the "Play TF card audio" and "Stop TF card audio" modules. When you are happy, it will say "It seems you're in a good mood today" and play cheerful music; when you are angry, it will say "Anger is the devil" and play music to soothe your mood; when you are tired, it will remind you "Please drive safely" and sound an alarm.

We have already transferred three audio files to the TF card. Enter the corresponding audio file names, "hap.wav", "ang.wav", and "tire.wav", in order from top to bottom, and set the playback duration to 5 seconds.
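The actual K10 program is built entirely from the Mind+ blocks described above, but if it helps to see the branching logic spelled out, the Python-style sketch below is equivalent. Here show_image and play_tf_audio are hypothetical stand-ins for the "Show Cached Content" and "Play TF card audio" blocks; they are not real K10 functions.

def show_image(name):
    print(f"[screen] show the {name} picture")          # stands in for "Show Cached Content"

def play_tf_audio(filename):
    print(f"[audio] play {filename} from the TF card")  # stands in for "Play TF card audio"

def on_emo_message(msg):
    # Dispatch on the command received from the siot/emo topic
    if msg == 'a':      # Happy
        show_image("happy")
        play_tf_audio("hap.wav")
    elif msg == 'b':    # Angry
        show_image("angry")
        play_tf_audio("ang.wav")
    elif msg == 'c':    # Tired
        show_image("tired")
        play_tf_audio("tire.wav")

on_emo_message('a')     # example: a "Happy" command arrives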

Below is the reference for the complete program.

Click the Upload button. When the burning progress reaches 100%, the program has been uploaded successfully.

STEP Three: Run and Test
(1) Click the Run button to start visiondetect.py.
(2) Face the camera:
- The UNIHIKER K10 will capture images, and the facial expression analysis results will be displayed in the video frame.
- Make different expressions (such as happy or tired) to trigger the corresponding functions.

4. Knowledge Hub
4.1 What is the Facial Keypoints Model and What are Its Applications?
A facial keypoints model is a computer vision model that detects and locates key feature points on the human face (such as the eyes, nose, mouth, and eyebrow contours), outputs the coordinates of these points in the image, and thereby quantitatively describes the shape and pose of the face.
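As a concrete illustration, the short sketch below runs MediaPipe Face Mesh (the same keypoint model used in this project) on a single image and prints the coordinates of one keypoint. It assumes a local photo named face.jpg that contains a face.

import cv2
import mediapipe as mp

img = cv2.imread("face.jpg")  # hypothetical test photo containing a face
with mp.solutions.face_mesh.FaceMesh(static_image_mode=True, max_num_faces=1) as fm:
    results = fm.process(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
if results.multi_face_landmarks:
    landmarks = results.multi_face_landmarks[0].landmark
    print(f"Detected {len(landmarks)} facial keypoints")   # 468 points for the basic mesh
    nose_tip = landmarks[1]                                 # each point has normalized x, y, z
    print(f"Nose tip at ({nose_tip.x:.2f}, {nose_tip.y:.2f})")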

These models serve as the foundation for:
- Facial Expression Analysis and Recognition: By tracking changes in the positions of keypoints (such as the corners of the mouth turning up, eyebrows furrowing), it judges the user's emotions (happiness, anger, tiredness, etc.). It can be applied to intelligent hardware interaction (e.g., expression control of XinaBox), affective computing, user experience optimization, and more.
- Human-Computer Interaction (HCI): It captures the dynamics of facial keypoints (such as eye blinking, mouth opening, head turning) and uses them as interaction commands. It can be used for contactless control (e.g., waking up a device by blinking, switching PPTs via head movements), assistive devices for people with disabilities, and so on.
- Medical and Healthcare Field: By analyzing long-term changes in keypoints (such as facial muscle asymmetry, the rate of wrinkle deepening), it assists in the diagnosis of certain neurological diseases (e.g., facial paralysis) or evaluates the degree of skin aging, sleep quality, and other aspects.
4.2 What are the Methods for Facial Expression Detection? What are their Applications?
The core of facial expression detection is to capture changes in facial features through technical means and determine emotion categories (such as happiness, anger, sadness, tiredness, etc.). It is mainly divided into three types: 1. Detection based on facial keypoints; 2. Detection based on image texture features; 3. End-to-end detection based on deep learning.

In daily life, facial expression detection models have a wide range of applications, and they are the basis for the following practical applications:
- Security and Behavior Monitoring: In-vehicle safety: Detect the driver's facial expressions (such as "yawning" or "frowning"), determine whether the driver is fatigued or distracted, and issue an alarm in a timely manner.
- Human-Computer Interaction and Intelligent Hardware: Capture the player's facial expressions to control the emotions of game characters, or adjust the game difficulty based on expressions (e.g., reduce the difficulty when the player is irritable).
- Education and Training: Detect students' facial expressions in class (such as "focused", "distracted", or "confused"), allowing teachers to adjust the teaching pace in real time or generate personalized learning reports.
5. Appendix of Materials
Google Drive: https://drive.google.com/file/d/1jBZx0CRnzqiVIG1pCG8jaS6rga0GZBSM/view?usp=sharing









