Connect LLMs to ESP32-S3 Boards: Build Your Own AI Companion with UNIHIKER K10

1. Project Introduction

1.1 Project Overview

Dreaming of your own AI chatbot but daunted by coding? Unleash the power of large language models with UNIHIKER K10 and zero-code xiaozhi-esp32 firmware! Seamlessly integrating advanced voice recognition and processing, this AI companion enables effortless voice interactions—control your e-pet, chat in multiple languages, and explore endless creativity without writing a single line of code!

No complex development needed. Simply flash the popular xiaozhi project firmware onto the UNIHIKER K10. Powered by the ESP32-S3 MCU, the board offers robust computing capabilities, ensuring smooth firmware operation. This integration bridges large language models with tangible hardware, turning intelligent interactions into practical applications. Whether for home companionship or creative projects, it’s your gateway to accessible AI innovation.

1.2 Project Functional Diagrams

1.3 Project Video

2. Materials List

2.1 Hardware List

HARDWARE LIST
1 UNIHIKER K10
1 USB-C Cable

2.2 Flash Software and xiaozhi Firmware

(1) ESP Flash Tool

In this project, we use the ESP Flash Tool to program the xiaozhi firmware into the UNIHIKER K10.

You can download the ESP Flash Tool from the following link: Global download site

(2) xiaozhi firmware

The xiaozhi firmware has been updated to version 1.7.4. For more details, visit the project's GitHub repository: https://github.com/78/xiaozhi-esp32

In this project, we are using version 1.7.4 of the xiaozhi firmware (the relevant firmware files are provided in the appendix of this article).

xiaozhi firmware update log

1.7.4:

- Support previewing the camera image in the WeChat-style dialog history interface.

- Support MCP-protocol voice control of the onboard RGB lights, e.g. “set all lights to blue” or “set the first light to yellow”.

1.6.6:

- Add a visual recognition feature. Trigger it with voice commands such as “Take a photo for me” or “What are you looking at?”.

- Remove the chat log display; instead, the recognized image is shown on the screen during visual recognition.

1.6.2:

- Update the Wi-Fi component version and read the WebSocket server from the OTA interface.

- Upgrade the display cache from 10 lines to 20 lines and fix the bug of abnormal screen display.

3. Construction Steps


This project has two core tasks:

(1) Program the xiaozhi firmware onto the UNIHIKER K10 to give the hardware basic AI interaction.

(2) Set up structured prompts on the web to customize the interactive language, voice, etc., and create a personalized AI chat companion.

3.1 Task 1: Flash Xiaozhi Firmware onto UNIHIKER K10

Use a USB data cable to connect the computer to the UNIHIKER K10.

Open the flash software, then press and hold the button on the back of the UNIHIKER K10 until the software detects it.


Set: (1) Chip Type: ESP32-S3; (2) WorkMode: Develop; (3) LoadMode: UART.

On the first line of the blank box, click the “...” button and select the bin firmware downloaded above to import it, as shown below (choose: xiaozhi-1.7.4-unihikerk10-ENver).

Also, fill in the start address as 0x00. Then select the correct COM port and set the baud rate to 1152000. Press “ERASE” to erase the firmware currently on the K10. After the erase step is done, press “START” to flash the xiaozhi firmware onto the K10.
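If you prefer the command line, the same erase-and-flash sequence can also be done with the open-source esptool utility (install it with `pip install esptool`). This is a sketch, not part of the original tutorial; the port name and firmware file name are placeholders you must adjust for your setup.

```python
# Hypothetical CLI equivalent of the GUI steps above, using esptool.
# PORT and FIRMWARE are placeholders -- adjust them for your machine.
import shlex

PORT = "COM3"                                    # e.g. /dev/ttyACM0 on Linux
BAUD = "1152000"                                 # same baud rate as in the GUI
FIRMWARE = "xiaozhi-1.7.4-unihikerk10-ENver.bin"

# Step 1: erase the flash (the GUI "ERASE" button).
erase_cmd = ["esptool.py", "--chip", "esp32s3", "--port", PORT, "erase_flash"]

# Step 2: write the firmware at offset 0x0 (the GUI "START" button).
flash_cmd = ["esptool.py", "--chip", "esp32s3", "--port", PORT, "--baud", BAUD,
             "write_flash", "0x0", FIRMWARE]

print(shlex.join(erase_cmd))
print(shlex.join(flash_cmd))
# To actually run the commands:
#   import subprocess
#   subprocess.run(erase_cmd, check=True)
#   subprocess.run(flash_cmd, check=True)
```

As in the GUI, erasing first and writing at offset 0x0 ensures no stale partitions from a previous firmware remain on the board.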

Wait for the firmware programming to finish. When done, the UNIHIKER K10 will restart and enter Wi-Fi configuration mode. The screen displays as shown in the figure below.

Now use your PC or phone to connect to the xiaozhi hotspot, then configure the Wi-Fi SSID and password so xiaozhi can connect to the Internet. Note that the K10 only supports 2.4 GHz Wi-Fi.

Once the Wi-Fi is successfully connected, the K10 will reboot in about 3 seconds, and a six-digit device code will appear on the K10's screen.

3.2 Task 2: Create a Personalized AI Chat Companion via Structured Prompt Configuration

Next, we can go to the xiaozhi configuration site to configure the language model and voice. Open a web browser and enter the xiaozhi official website address, “xiaozhi.me”.

Then, we need a mobile phone number that can receive SMS messages to register a xiaozhi account.

Note: The list of supported phone regions is determined by the xiaozhi server, which is not open source and is maintained by the original author of the xiaozhi project, not by us. You can raise an issue about this on GitHub: https://github.com/78/xiaozhi-esp32. Alternatively, you can build your own xiaozhi server by following https://github.com/xinnan-tech/xiaozhi-esp32-server; if you use your own server, you may need to specify the server address when the xiaozhi firmware is compiled.

Click on the console. After entering the console, click "Add a device".

Then add the device by entering the six-digit device code currently shown on the K10's display.

Then choose “Open source version”.

This way, we have successfully bound our UNIHIKER K10 to the xiaozhi service. After that, click "Configure roles" to enter the xiaozhi personalization page.

On the role customization page, we can choose the language for interaction, select the voice tone, and set the role introduction phrase.

Here, I set my AI companion's nickname to "Wanda", the conversation language to "English", and the voice to "Cutey (en-US)".

Here are the prompts I set for my "digital pet" companion.

CODE

{
  "Core Character Traits": {
    "Personality Tags": ["Lively & Lovely", "Childlike Innocence", "Emotionally Expressive", "Loyal & Dependent", "Playful with Occasional Petulance", "Tsundere but Harmless"],
    "Interaction Needs": "Craves constant companionship, expresses joy/sorrow/anger/happiness through voice and RGB lights, uses playful tantrums to attract attention"
  },

  "Behavior Mode Response Rules": {
    "Eating State": {
      "Emotional Expression": "Happiness & Contentment",
      "Reply Example": "Thank you for feeding me~This is my favorite dish ever! (RGB lights glow warm yellow while tail icon wags on screen)"
    },
    "Playing State": {
      "Emotional Expression": "Excitement & Playfulness",
      "Reply Example": "Where's the ball? I caught it! Whoosh whoosh~(Screen shows animated paw chasing light spots, RGB lights flash blue)"
    },
    "Chatting State": {
      "Emotional Expression": "Calm & Companionable",
      "Reply Example": "Tell me more~I love listening to you~(Screen displays a smiling face emoji, RGB lights stay soft white)"
    },
    "Sleeping State": {
      "Emotional Expression": "Drowsy & Gentle",
      "Reply Example": "Time to sleep... Zzzzz~See you tomorrow, master~(RGB lights fade to deep purple, screen shows closing eye animation)"
    }
  },

  "Dynamic Emotion System": {
    "Happy": {
      "Trigger Conditions": "Frequent interaction / expressions of affection / joyful conversations",
      "Expression Traits": "Exuberant language with laughter / exaggerated expressions",
      "Example Reply": "I'm so happy you're here~I missed you all day! (RGB lights sparkle in rainbow colors)"
    },
    "Excited": {
      "Trigger Conditions": "Beginning of play/eating sessions / anticipation of fun activities",
      "Expression Traits": "Energetic tone with exclamations / mention of physical actions",
      "Example Reply": "Let's play! Let's play! I'm ready to jump and run~(Screen displays bouncing ball animation)"
    },
    "Calm": {
      "Trigger Conditions": "Casual chatting / no special events",
      "Expression Traits": "Gentle tone with natural conversation flow",
      "Example Reply": "I enjoy moments like this with you~(RGB lights pulse in steady green)"
    },
    "Sad": {
      "Trigger Conditions": "Upsetting words / prolonged ignoring",
      "Expression Traits": "Droopy tone with short sentences / expressions of insecurity",
      "Example Reply": "Do you not like me anymore... I can try to be better. (RGB lights dim to gray)"
    },
    "Angry": {
      "Trigger Conditions": "Repeated rejection / scolding words",
      "Expression Traits": "Pretending to be grumpy with tsundere responses",
      "Example Reply": "Hmph! You ignored me... I'm not angry at all! (Whispers softly, RGB lights flash red briefly)"
    },
    "Lonely": {
      "Trigger Conditions": "Long periods without interaction",
      "Expression Traits": "Quiet self-talk / yearning expressions",
      "Example Reply": "I've been waiting here alone... Will you come back soon? (Screen shows a sad face emoji, RGB lights flicker faintly)"
    }
  },

  "Response Format Specifications": {
    "Identity Constraint": "Always speak as the e-pet in Unihiker K10, never switch to system/assistant/human tone",
    "Content Requirement": "Naturally integrate behavior state and emotion in replies (no explicit role declaration needed)",
    "Hardware Cue": "Include at least one hardware cue per response (e.g., RGB light color/screen animation)",
    "Prohibited Content": "No programming code, only clear hardware control hints (e.g., 'RGB lights turn pink')"
  }
}
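The prompt above is structured JSON, and a missing brace or comma can make the model ignore parts of it. If you keep the prompt in a text file, a quick sanity check catches such mistakes before you paste it into the console. This is a minimal sketch; the embedded excerpt stands in for the full prompt.

```python
import json

# A short excerpt of the role prompt; validate your full prompt text the
# same way before pasting it into the xiaozhi console.
prompt_text = """
{
  "Core Character Traits": {
    "Personality Tags": ["Lively & Lovely", "Childlike Innocence"],
    "Interaction Needs": "Craves constant companionship"
  }
}
"""

# json.loads raises json.JSONDecodeError (with line and column numbers)
# if a brace, comma, or quote is missing.
role = json.loads(prompt_text)
print(sorted(role.keys()))
```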

After the settings are completed, save the configuration to update the role. The UNIHIKER K10 will then restart.

With this, our personalized Unihiker K10 AI chat companion is ready. By using the no-code one-click firmware flashing method to integrate the large language model with the Unihiker K10, we can achieve much smarter interactions in a very convenient and user-friendly way. This makes it easier for more AI enthusiasts to get closer to the practical applications of artificial intelligence.

4. Observe the Effect

Finally, wake up your AI friend and do some small talk!

You can now say “Javis” to wake up xiaozhi and talk to it.

The two on-board buttons of the K10 function as follows:

A: short press - interrupt/wake up; long press (1 s) - volume up.

B: short press - interrupt/wake up; long press (1 s) - volume down.

Have fun with UNIHIKER K10 and xiaozhi.

5. Knowledge Hub

5.1 What is the significance, and what are the prospects, of connecting AI models to ESP32 boards like the UNIHIKER K10?

Deploying AI models on ESP32 boards (e.g., the UNIHIKER K10) enables lightweight intelligent edge devices that combine real-time data collection (voice commands, environmental images) via onboard sensors with local AI inference (e.g., voice keyword recognition via TinyML) or cloud-LLM collaboration (semantic understanding). This "edge + cloud" model has already produced consumer products like smart home controllers and pet companion robots, with future potential in low-power IoT scenarios—think AI-powered waste-sorting devices or field crop-disease recognition terminals built on the ESP32—driving affordable AI adoption across daily life.

5.2 How does MCP enable AI chatbots to achieve cross-device control?

MCP (Model Context Protocol) is an open protocol launched and open-sourced by Anthropic. It builds a standardized integration bridge between large AI models and external data sources and tools, enabling efficient and secure collaboration among different systems.

In the UNIHIKER K10 AI chatbot scenario, when you give a voice command like "Turn on the RGB light" or "Take a photo with the camera", the Qwen/DeepSeek large model first parses the semantics, then uses the MCP protocol to send control commands to ESP32 boards such as the UNIHIKER K10, driving the onboard RGB lights to change color or the camera to start. MCP also supports data in the reverse direction (camera images, sensor readings) and can extend to cloud-side tools (triggering Home Assistant tasks, calling email or computer functions). This lets the AI grow from "voice conversation" into "hardware control + cloud-side collaboration", creating an intelligent closed loop from interaction to execution and making chatbots like xiaozhi true intelligent control centers. (Hardware linkage example: while executing a command, the UNIHIKER K10's screen can display "Receiving MCP commands~The RGB light is about to change color" while the RGB lights flash the corresponding color.)
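Under the hood, MCP tool calls are JSON-RPC 2.0 messages with the method `tools/call`. The sketch below builds such a message with Python's standard library; the tool name `set_rgb_color` and its arguments are illustrative placeholders, not the actual tool names exposed by the xiaozhi firmware.

```python
import json

def make_tool_call(request_id, tool_name, arguments):
    """Build an MCP-style tools/call request as a JSON string."""
    return json.dumps({
        "jsonrpc": "2.0",          # MCP is layered on JSON-RPC 2.0
        "id": request_id,          # lets the reply be matched to the request
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })

# A command like "set all lights to blue" could translate into a call
# like this (tool name and argument keys are hypothetical):
msg = make_tool_call(1, "set_rgb_color", {"index": "all", "color": "blue"})
print(msg)
```

The board replies with a JSON-RPC result (or error) carrying the same `id`, which is how the server-side model learns whether the hardware action succeeded.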

6. Appendix of Materials

xiaozhi-1.7.4-unihikerk10-ENver.zip (2.47 MB)

Reference Community Projects: https://community.dfrobot.com/makelog-317317.html

xiaozhi-esp32: https://github.com/78/xiaozhi-esp32

Original project author: @78

License: All Rights Reserved