
1. Firmware Upgrade
My HuskyLens 2 was an early maker release, which required a firmware upgrade to enable the latest features. The process was perfectly detailed and documented on the DFRobot website. I must pause here to offer my genuine thanks: the time and effort DFRobot invests in thorough documentation is invaluable. As makers working on multiple projects, nothing is more frustrating than losing time on setup and configuration simply because hardware companies neglect essential documentation.
You will need to download the following files:
Firmware image: huskylensV2-v1.1.6.1031.img.7z
Burning tool: K230BurningTool.zip
Driver installation tool: Zadig - Driver Installation Tool
All necessary steps and details are available in the DFRobot documentation.
Once firmware version 1.1.6 is successfully installed, navigate to the settings, connect the HuskyLens to your local Wi-Fi router, and ensure the MCP Server is enabled.

2. LLM and API Configuration
Next, you'll need a Google Gemini API key. You can obtain one easily at the Google AI Studio website: https://aistudio.google.com/app/api-keys. Google offers a generous free tier, and the paid options remain highly affordable for more intensive usage.

3. Client Setup
Finally, with the camera connected to your Wi-Fi network, note the IP address assigned to the MCP Server. Open the Python client script (HuskyMCPChat.py) in a text editor and set your Gemini API key and the MCP Server IP address in the script variables.
Run with $ python HuskyMCPChat.py
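
Before running the client, it can help to sanity-check the two values you just edited. The sketch below is not taken from HuskyMCPChat.py: the variable names GEMINI_API_KEY and MCP_SERVER_IP, the placeholder string, and the example address are all assumptions made for illustration, so adapt them to whatever the script actually uses.

```python
import ipaddress

# Hypothetical names and values; the real script may use different ones.
GEMINI_API_KEY = "YOUR_GEMINI_API_KEY"
MCP_SERVER_IP = "192.168.1.50"  # example only, use the address shown on your HuskyLens


def check_config(api_key, server_ip):
    """Return a list of human-readable configuration problems (empty if none)."""
    problems = []
    # An empty or untouched placeholder key cannot authenticate against Gemini.
    if not api_key or api_key == "YOUR_GEMINI_API_KEY":
        problems.append("Gemini API key is missing or still the placeholder value")
    # ipaddress.ip_address raises ValueError for anything that is not IPv4/IPv6.
    try:
        ipaddress.ip_address(server_ip)
    except ValueError:
        problems.append(f"{server_ip!r} is not a valid IP address")
    return problems


if __name__ == "__main__":
    for problem in check_config(GEMINI_API_KEY, MCP_SERVER_IP):
        print("Config problem:", problem)
```

A check like this catches the two most common setup slips (an unedited key and a mistyped address) before any network call is attempted.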

Usage and Interactivity
With the Python client running, you can use a menu and natural language commands to interact with the camera via the LLM:
Change Algorithms: Switch algorithms (e.g., "switch to face recognition").
Take Photos: Capture images, which are stored on the internal memory ("take a picture").
Visual Query: Ask the LLM what the camera currently sees based on the active algorithm ("what do you see?").
Combined Reasoning: Combine the camera's recognition data with an LLM prompt for queries such as: "Is there anything dangerous on the table?"
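
To make the combined reasoning idea concrete, here is a minimal sketch of how recognition labels and a user question could be merged into a single prompt. The function name build_reasoning_prompt, the prompt wording, and the assumption that the recognition result can be reduced to a plain list of labels are all hypothetical; the actual client may hand the data to the LLM differently.

```python
def build_reasoning_prompt(labels, question):
    """Combine recognized labels with the user's question into one LLM prompt."""
    # Fall back to an explicit statement when the camera recognized nothing.
    if labels:
        seen = ", ".join(labels)
    else:
        seen = "nothing recognized"
    return (
        "The camera currently reports these objects: " + seen + ".\n"
        "Using only that information, answer the question: " + question
    )


if __name__ == "__main__":
    # Example labels are made up for illustration.
    print(build_reasoning_prompt(
        ["knife", "cup"],
        "Is there anything dangerous on the table?",
    ))
```

The point of the design is that the camera supplies facts and the LLM supplies judgment: the algorithm only says what is on the table, and the model decides whether any of it counts as dangerous.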



HuskyLens MCP Tools Overview
The following tools are exposed by the HuskyLens MCP Server and are callable by the LLM:
get_recognition_result
Obtains the real-time recognition result from HuskyLens, including image data and recognized labels (e.g., object type, person name). The primary operation is get_result. This is crucial for visual reasoning and generating natural-language descriptions of the camera's view.
manage_applications
Used to manage and query all internal applications (algorithms) of the HuskyLens. Supports operations like current_application, switch_application, and application_list.
multimedia_control
Provides control over the HuskyLens multimedia components, primarily the camera. The main operation is take_photo.
task_scheduler
Manages scheduled tasks. Call this tool when you need to create a timed or triggered action, such as: "Take a picture when you see the keyboard" or "Take a picture after 3 seconds". Supports create and list operations. Tasks are defined by a trigger (optional, e.g., 'tiger'), a handler (required, currently only supports take_photo), and an optional timestamp for the scheduled time.
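
As a rough illustration of what a call to these tools looks like on the wire, the sketch below builds MCP "tools/call" requests in JSON-RPC 2.0 form. The envelope (jsonrpc, id, method, params with name and arguments) follows the MCP specification; the argument key names used here (operation, trigger, handler, timestamp) are assumptions inferred from the descriptions above, not a confirmed schema, so check the tool definitions reported by the server before relying on them.

```python
import itertools
import json

# Simple incrementing request ids, as JSON-RPC expects a unique id per request.
_request_ids = itertools.count(1)


def tool_call(tool_name, **arguments):
    """Build an MCP 'tools/call' JSON-RPC 2.0 request as a dict."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def create_photo_task(trigger=None, timestamp=None):
    """Build a task_scheduler 'create' request (argument keys are assumed)."""
    # The handler is required and take_photo is the only one described.
    args = {"operation": "create", "handler": "take_photo"}
    # Trigger and timestamp are optional, so only include them when given.
    if trigger is not None:
        args["trigger"] = trigger
    if timestamp is not None:
        args["timestamp"] = timestamp
    return tool_call("task_scheduler", **args)


if __name__ == "__main__":
    requests = [
        tool_call("manage_applications", operation="application_list"),
        tool_call("get_recognition_result", operation="get_result"),
        tool_call("multimedia_control", operation="take_photo"),
        create_photo_task(trigger="keyboard"),
    ]
    for request in requests:
        print(json.dumps(request, indent=2))
```

In normal use you never write these requests by hand: the LLM picks the tool and fills in the arguments from your natural language command. Seeing the payloads is mostly useful when debugging why a command did not do what you expected.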
Source code
https://github.com/ronibandini/HuskyLens2MCP









