Introduction
This project connects an external USB camera to the UNIHIKER and uses it to detect a target and track its movement.
Project Objectives
Learn how to detect and track a target using OpenCV's Lucas-Kanade (LK) optical flow tracker.
Software
Mind+ (used to remotely connect to and program the UNIHIKER)
Practical Process
1. Hardware Setup
Connect the camera to the USB port of UNIHIKER.
Connect the UNIHIKER board to the computer via USB cable.
2. Software Development
Step 1: Open Mind+, and remotely connect to UNIHIKER.
Step 2: Locate the folder named "AI" under "Files in UNIHIKER", create a subfolder named "OpenCV lk_track Target Tracking based on UNIHIKER" inside it, and import the dependency files for this lesson into that subfolder.
Step 3: Create a new project file named "main.py" in the same directory as the files above.
Sample Program:
#!/usr/bin/env python
'''
Lucas-Kanade tracker
====================
Lucas-Kanade sparse optical flow demo. Uses goodFeaturesToTrack
for track initialization and back-tracking for match verification
between frames.
Usage
-----
lk_track.py [<video_source>]
Keys
----
ESC - exit
'''
# Python 2/3 compatibility
from __future__ import print_function

import numpy as np
import cv2 as cv

import video
from common import anorm2, draw_str

# Parameters for cv.calcOpticalFlowPyrLK (pyramidal Lucas-Kanade optical flow)
lk_params = dict(winSize=(15, 15),
                 maxLevel=2,
                 criteria=(cv.TERM_CRITERIA_EPS | cv.TERM_CRITERIA_COUNT, 10, 0.03))

# Parameters for cv.goodFeaturesToTrack (Shi-Tomasi corner detection)
feature_params = dict(maxCorners=500,
                      qualityLevel=0.3,
                      minDistance=7,
                      blockSize=7)


class App:
    def __init__(self, video_src):
        self.track_len = 10          # maximum number of points kept per track
        self.detect_interval = 5     # re-detect feature points every 5 frames
        self.tracks = []
        self.cam = video.create_capture(video_src)
        self.frame_idx = 0
        # Show the result in a full-screen window on the UNIHIKER display
        cv.namedWindow('lk_track', cv.WND_PROP_FULLSCREEN)
        cv.setWindowProperty('lk_track', cv.WND_PROP_FULLSCREEN, cv.WINDOW_FULLSCREEN)

    def run(self):
        while True:
            _ret, frame = self.cam.read()
            frame_gray = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)
            vis = frame.copy()

            if len(self.tracks) > 0:
                img0, img1 = self.prev_gray, frame_gray
                p0 = np.float32([tr[-1] for tr in self.tracks]).reshape(-1, 1, 2)
                # Track points forward, then backward, and keep only points that
                # return close to their original position (match verification)
                p1, _st, _err = cv.calcOpticalFlowPyrLK(img0, img1, p0, None, **lk_params)
                p0r, _st, _err = cv.calcOpticalFlowPyrLK(img1, img0, p1, None, **lk_params)
                d = abs(p0 - p0r).reshape(-1, 2).max(-1)
                good = d < 1
                new_tracks = []
                for tr, (x, y), good_flag in zip(self.tracks, p1.reshape(-1, 2), good):
                    if not good_flag:
                        continue
                    tr.append((x, y))
                    if len(tr) > self.track_len:
                        del tr[0]
                    new_tracks.append(tr)
                    cv.circle(vis, (int(x), int(y)), 2, (0, 255, 0), -1)
                self.tracks = new_tracks
                cv.polylines(vis, [np.int32(tr) for tr in self.tracks], False, (0, 255, 0))
                draw_str(vis, (20, 20), 'track count: %d' % len(self.tracks))

            if self.frame_idx % self.detect_interval == 0:
                # Detect new feature points, masking out areas around existing tracks
                mask = np.zeros_like(frame_gray)
                mask[:] = 255
                for x, y in [np.int32(tr[-1]) for tr in self.tracks]:
                    cv.circle(mask, (x, y), 5, 0, -1)
                p = cv.goodFeaturesToTrack(frame_gray, mask=mask, **feature_params)
                if p is not None:
                    for x, y in np.float32(p).reshape(-1, 2):
                        self.tracks.append([(x, y)])

            self.frame_idx += 1
            self.prev_gray = frame_gray
            cv.imshow('lk_track', vis)

            ch = cv.waitKey(1)
            if ch == 27:  # ESC key
                break


def main():
    import sys
    try:
        video_src = sys.argv[1]
    except:
        video_src = 0

    App(video_src).run()
    print('Done')


if __name__ == '__main__':
    print(__doc__)
    main()
    cv.destroyAllWindows()
3. Run and Debug
Step 1: Run the main program
Run the program "main.py". At first the screen shows the real-time image captured by the camera. Bring the toy car into view; it is detected and marked with green dots. Gently move the car and the green dots move with it, leaving a trajectory behind. Target tracking is achieved.
4. Program Analysis
In the above "main.py" file, we mainly use OpenCV library to call the camera and get the real-time video stream. Detect feature points in each frame of the video and then use the optical flow method to track these feature points and display the tracking results. The detection and tracking of feature points is done in an infinite loop until the user chooses to exit the program. The overall process is as follows.
1. Video source acquisition and processing: the program first opens the specified video source, which can be a camera or a video file. Then it reads the video in frames and processes each frame.
2. Feature point detection and tracking: for each frame, the program first checks whether feature point detection is needed. If it is needed, new feature points are detected in the current frame and these points are added to the tracking point list. Then, the program tracks the existing tracking points in the list, regardless of whether new feature points are detected or not. The process of tracking is to calculate the motion of these points between the current frame and the previous frame using the Lucas-Kanade optical flow method to get the position of these points in the current frame.
3. Result verification and display: after the tracking is completed, the program performs a verification using reverse optical flow to ensure the accuracy of the tracking. Then the positions of the tracked points are plotted on a copy of the original video frame and the result is displayed in a full-screen window.
4. User interaction: the program checks if the user has pressed the ESC key, and if so, exits the main loop and ends the program.
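To make this flow concrete, here is a minimal standalone sketch of the same detect-and-track loop. It uses cv.VideoCapture directly instead of the video and common helper modules imported by the sample program, and it omits the track history and the back-tracking verification, so it is a simplified illustration rather than a drop-in replacement for "main.py".

import numpy as np
import cv2 as cv

# LK optical flow and Shi-Tomasi parameters, kept close to the sample program
lk_params = dict(winSize=(15, 15), maxLevel=2,
                 criteria=(cv.TERM_CRITERIA_EPS | cv.TERM_CRITERIA_COUNT, 10, 0.03))
feature_params = dict(maxCorners=200, qualityLevel=0.3, minDistance=7, blockSize=7)

cap = cv.VideoCapture(0)            # 0 = first USB camera; adjust if needed
prev_gray, points = None, None
frame_idx = 0

while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)

    # Re-detect feature points every 5 frames, or when no points remain
    if frame_idx % 5 == 0 or points is None or len(points) == 0:
        points = cv.goodFeaturesToTrack(gray, mask=None, **feature_params)

    # Track existing points from the previous frame with LK optical flow
    if prev_gray is not None and points is not None:
        new_points, status, _ = cv.calcOpticalFlowPyrLK(prev_gray, gray, points, None, **lk_params)
        points = new_points[status.ravel() == 1].reshape(-1, 1, 2)
        for x, y in points.reshape(-1, 2):
            cv.circle(frame, (int(x), int(y)), 2, (0, 255, 0), -1)

    cv.imshow('lk_track_sketch', frame)
    prev_gray = gray
    frame_idx += 1
    if cv.waitKey(1) == 27:         # ESC to exit
        break

cap.release()
cv.destroyAllWindows()

As in the sample program, press ESC in the preview window to stop the loop.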
Knowledge analysis
1. OpenCV Lucas-Kanade (LK) Optical Flow
Lucas-Kanade (LK) Optical Flow is a classic algorithm used to estimate the motion of objects in an image sequence. It was proposed by Bruce D. Lucas and Takeo Kanade in their 1981 paper An Iterative Image Registration Technique with an Application to Stereo Vision.
The main idea behind LK optical flow is that within a small neighborhood, all pixels have the same motion velocity. This assumption allows us to establish a linear system within the neighborhood and then solve this system to obtain the motion velocities of pixels.
The basic steps of LK optical flow are as follows:
Feature Point Selection: Select some feature points in the first frame of the image sequence. These feature points are typically corners because corners have rich texture information, making them suitable for tracking.
Establish Linear System: For each feature point, choose a small neighborhood (e.g., a 5x5 or 7x7 window), and then establish a linear system within this neighborhood. This linear system is given by the following equation:
I_x * u + I_y * v = -I_t
This equation follows from the brightness constancy assumption (a pixel keeps its intensity as it moves) combined with a first-order Taylor expansion. I_x and I_y are the gradients of the image in the x and y directions, I_t is the gradient of the image in the time direction, and u and v are the motion velocities of the pixel in the x and y directions.
Solve the Linear System: Since the linear system is overdetermined (there are more equations than unknowns), it generally has no exact solution. Therefore, we use the least squares method to solve it and obtain the approximate motion velocities of the pixels.
Iterative Optimization: Since LK optical flow assumes that the motion velocity of pixels is constant within the neighborhood, this assumption may not hold true in practice. Therefore, we need to improve the estimation of motion velocities through iterative optimization. Specifically, we can move the feature points of the current frame according to the estimated motion velocities, and then repeat the above steps at the new positions until the motion velocities converge.
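To illustrate the least-squares step, here is a minimal NumPy sketch that estimates (u, v) for a single small window from two consecutive grayscale patches. The function name lk_step and the use of np.gradient for the spatial derivatives are illustrative choices for this sketch, not the implementation used inside OpenCV.

import numpy as np

def lk_step(prev_win, next_win):
    # prev_win, next_win: same-sized float grayscale patches (e.g. 7x7)
    # cut from the previous and current frame around one feature point.
    Iy, Ix = np.gradient(prev_win)        # spatial gradients (row/y first, then column/x)
    It = next_win - prev_win              # temporal gradient
    # One equation  Ix*u + Iy*v = -It  per pixel in the window
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    # Least-squares solution of the overdetermined system
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v

For example, calling lk_step on a smooth patch and the same patch shifted one pixel to the right should return u close to 1 and v close to 0.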
Through the above steps, LK optical flow can estimate the motion of objects in the image sequence. In computer vision, LK optical flow is widely used in various tasks such as video compression, video stabilization, and motion tracking.
2. OpenCV's goodFeaturesToTrack feature point detection
goodFeaturesToTrack is a function in OpenCV used for detecting corners in an image. It is based on the Shi-Tomasi corner detection method, which is an improvement over the Harris corner detection method.
Corners are features in an image characterized by intensity variations in all directions. They are commonly used as feature points in visual tasks due to their rich texture information.
The working principle of the goodFeaturesToTrack function is as follows:
- Calculate, for each pixel, the minimum eigenvalue of the local gradient covariance matrix (also known as the Shi-Tomasi score).
- Sort all pixel scores and select the highest ones.
- To ensure an even distribution of selected corners, the function removes corners that are too close together based on a minimum distance parameter.
The main parameters of the function are as follows:
- maxCorners: The maximum number of corners to be detected. If the actual number of detected corners exceeds this value, the function will only return the highest-scoring corners.
- qualityLevel: The threshold for the quality level of corners. This parameter is used to filter out corners with scores lower than the highest score multiplied by this threshold.
- minDistance: The minimum acceptable distance between corners.
With this function, we can quickly find representative feature points in an image for subsequent image processing tasks such as feature matching and optical flow tracking.
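As a simple usage sketch (the image path 'test.jpg' and the parameter values below are placeholder choices for illustration), the function can be called like this:

import cv2 as cv
import numpy as np

# Load any grayscale test image; replace 'test.jpg' with a real file path.
gray = cv.imread('test.jpg', cv.IMREAD_GRAYSCALE)
if gray is None:
    raise SystemExit('Could not read the test image')

# Detect up to 100 Shi-Tomasi corners that are at least 10 pixels apart.
corners = cv.goodFeaturesToTrack(gray, maxCorners=100, qualityLevel=0.3, minDistance=10)

if corners is not None:
    vis = cv.cvtColor(gray, cv.COLOR_GRAY2BGR)
    for x, y in np.float32(corners).reshape(-1, 2):
        cv.circle(vis, (int(x), int(y)), 3, (0, 255, 0), -1)   # mark each corner
    cv.imwrite('corners.jpg', vis)                              # save the result

The detected points can then be passed directly to cv.calcOpticalFlowPyrLK for tracking, which is exactly how the sample program uses them.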
Feel free to join our UNIHIKER Discord community! You can engage in more discussions and share your insights!