AI Smart Accessory with UNIHIKER: Creating an Immersive 'The Little Prince' Interactive Experience
1 Project Introduction
This is a smart accessory project made with UNIHIKER, designed to provide engaging interactions and express user personalities through AI role-play. The project borrows the character of "The Little Prince" from Le Petit Prince as the AI prototype.
Users can choose between two main interaction modes:
1. One-agent mode: The user becomes the Little Prince's only Rose. The AI Little Prince follows the user's conversation and offers emotional support.
2. Multi-agent mode: The Little Prince and the Rose engage in warm and playful conversations, narrating their interactions through storytelling.
This project is divided into two main parts: the construction of basic models and the creation of the final project. The first part focuses on technical implementation and modular reuse to help users build the project’s basic structure. The second part showcases how to combine technical models to realize creative ideas. The project is still being explored, with plans for future expansion to make the smart accessory even more interesting.
2 Video
Before starting, you can watch the video for a more intuitive understanding of the project.
3 Hardware and Materials
- USB Speaker x 1
- 5V rechargeable battery x 1
- Switch x 1
- USB Type-C cable x 1
- Type-C to Type-C cable x 1
- 3D-printed model x 1
- 80mm convex lens x 1
- Electrical tape x 1
- M2 self-tapping screws x 2
4 Basic Models
The basic models can be divided into five parts: environment setup, voice input, large model deployment, speech recognition, and speech synthesis.
4.1 Create and Manage Virtual Environments on UNIHIKER
The default environment on UNIHIKER runs Python 3.7, but some open-source models may require different Python versions to avoid compatibility issues. Therefore, this project uses Conda to create and configure the necessary environments.
Basic steps: Connect the UNIHIKER via USB > Log into the terminal > Install Miniforge Conda, initialize, and activate the environment > Create a new virtual environment.
- Reference materials can be found on the DFRobot forum: How to Install Multiple Python Versions on Unihiker in the Simplest Way?
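Once the new environment is activated, a quick check from Python can confirm that the interpreter is the one you expect. The sketch below is only an illustration: the helper name describe_environment is hypothetical, and it assumes Conda has set the CONDA_DEFAULT_ENV variable, which it normally does on activation.

```python
import os
import sys


def describe_environment(environ=os.environ, version=sys.version_info):
    """Return a one-line summary of the Python version and active Conda env."""
    # CONDA_DEFAULT_ENV is set by `conda activate`; it is absent otherwise.
    env_name = environ.get("CONDA_DEFAULT_ENV", "(no conda environment active)")
    return f"Python {version[0]}.{version[1]} in {env_name}"


print(describe_environment())
```

If the output still reports Python 3.7 with no Conda environment, the new environment has probably not been activated in the current terminal session.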
4.2 Automating Voice Input on UNIHIKER (Silero)
This project utilizes the Silero model for natural voice interaction. It includes automatic speech detection, recording, and sending audio data after the user finishes speaking. This is widely used in smart home control, interactive learning, and other hands-free operation scenarios.
Basic steps: Import necessary libraries > Set up Silero VAD > Load the Silero VAD model > Initialize audio stream > Audio saving function > Main recording loop > Cleanup and resource release.
- Reference materials: GitHub - Silero VAD
- Complete code: 01_silero_test.py [zip file at end of article]
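The core of the recording loop is deciding when an utterance starts and when the user has finished speaking. The sketch below shows that decision logic only, under stated assumptions: the per-frame speech probability would come from the Silero VAD model in the real script, and the class name, threshold, and silence length here are hypothetical illustration values, not taken from 01_silero_test.py.

```python
from dataclasses import dataclass, field


@dataclass
class UtteranceSegmenter:
    speech_threshold: float = 0.5  # probability at or above which a frame is speech
    max_silent_frames: int = 10    # trailing silent frames that end an utterance
    _frames: list = field(default_factory=list)
    _silent_run: int = 0
    _in_speech: bool = False

    def feed(self, frame, speech_prob):
        """Add one audio frame; return a finished utterance's frames, else None."""
        is_speech = speech_prob >= self.speech_threshold
        if not self._in_speech:
            if is_speech:
                # First speech frame: start buffering a new utterance.
                self._in_speech = True
                self._frames = [frame]
                self._silent_run = 0
            return None
        self._frames.append(frame)
        self._silent_run = 0 if is_speech else self._silent_run + 1
        if self._silent_run >= self.max_silent_frames:
            # Enough silence: drop the trailing silent frames and reset.
            utterance = self._frames[: len(self._frames) - self._silent_run]
            self._frames, self._silent_run, self._in_speech = [], 0, False
            return utterance
        return None
```

In a real loop, each frame read from the audio stream would be passed to feed together with the probability the VAD model reports for it, and any returned utterance would be saved to a file for the next stage.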
4.3 Local Deployment of Large Models on UNIHIKER (Ollama)
Ollama is a lightweight and extensible framework designed for building and running large language models (LLMs) on local machines. This project demonstrates how to install and configure Ollama on UNIHIKER to run efficiently. The process includes downloading necessary dependencies, configuring the local environment, and performing debugging and testing.
- Reference materials: GitHub - Ollama
- Complete code: 02_2_ollama_test.py [zip file at end of article]
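Once Ollama is running, a script can query it over its local HTTP API. The following is a minimal sketch, assuming Ollama listens on its default port 11434 and that a model has already been pulled; the helper names and the model name passed in are placeholders, not taken from 02_2_ollama_test.py.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_generate_request(model, prompt, url=OLLAMA_URL):
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_ollama(model, prompt, timeout=120):
    """Send the prompt and return the generated text (needs a running server)."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]
```

Calling ask_ollama("your-model-name", "Hello") would return the model's reply as a string. On a board with limited compute, a generous timeout is a sensible default.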
4.4 Continuous Speech Recognition on UNIHIKER (Silero & Whisper)
Building on the previous section, this part uses Whisper technology to achieve continuous speech-to-text functionality. Whisper is a powerful tool suitable for various languages and tasks, offering efficient speech-to-text services in diverse applications.
- Reference materials: GitHub - Whisper
- Complete code: 02_1_silero_whisper.py [zip file at end of article]
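One way to make recognition continuous is to let the recorder drop finished clips into a queue and have a separate step transcribe whatever is waiting. The sketch below assumes the openai-whisper package and its "tiny" model; the function names and the queue-based hand-off are illustrative assumptions rather than the structure of 02_1_silero_whisper.py.

```python
import queue


def load_whisper_transcriber(model_name="tiny"):
    """Return a function that maps an audio file path to recognized text."""
    import whisper  # third-party (openai-whisper); imported lazily

    model = whisper.load_model(model_name)

    def transcribe(path):
        return model.transcribe(path)["text"].strip()

    return transcribe


def drain_clips(clips, transcribe):
    """Transcribe every clip currently queued; skip clips with no text."""
    texts = []
    while True:
        try:
            path = clips.get_nowait()
        except queue.Empty:
            break
        text = transcribe(path)
        if text:
            texts.append(text)
    return texts
```

Passing the transcriber in as an argument keeps the queue logic testable without loading a model, and makes it easy to swap in a different recognizer later.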
4.5 Local Speech Synthesis on UNIHIKER (Edge-tts)
Edge-tts is a Python module that provides text-to-speech (TTS) on UNIHIKER. It calls Microsoft's online TTS service directly from the UNIHIKER environment, so the synthesis itself runs in the cloud and requires a network connection, while the resulting high-quality audio is played back on the board.
- Reference materials: GitHub - edge-tts
- Complete code: 02_5_edge_tts_stream.py [zip file at end of article]
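A minimal synthesis call with edge-tts could look like the sketch below. It assumes the edge-tts package is installed and uses the voice "en-US-AriaNeural" as an example; the function name synthesize and the optional communicate_cls parameter (added so the function can be exercised without network access) are hypothetical and not part of 02_5_edge_tts_stream.py.

```python
import asyncio


async def synthesize(text, out_path, voice="en-US-AriaNeural", communicate_cls=None):
    """Convert text to speech and save it to out_path; return the path."""
    if communicate_cls is None:
        import edge_tts  # third-party; imported lazily, needs network access

        communicate_cls = edge_tts.Communicate
    communicate = communicate_cls(text, voice)
    await communicate.save(out_path)  # writes the synthesized audio file
    return out_path


if __name__ == "__main__":
    # Example run; comment out on a machine without edge-tts or a network.
    # asyncio.run(synthesize("Hello, my Rose.", "hello.mp3"))
    pass
```

Because synthesize is a coroutine, it fits naturally into an asyncio-based program: other tasks can keep running while the audio is being fetched.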
5 Creation
5.1 Character Setup & Server Deployment
After completing the basic model construction mentioned above, we will implement interaction between AI character and user voice through two different methods: local deployment and cloud-based calling. To avoid conversation delays, we will adopt an asynchronous strategy for continuous conversation recording, ensuring that every response from the Little Prince is captured. The Little Prince can listen to all voice inputs and respond one by one, providing users with a sense of companionship through dialogue.
5.1.1 Local Computation Version
We first deploy the local modules on UNIHIKER to test its computational limits, using Deepseek as the base model. The following code integrates multiple functional modules to cover the complete process from voice recording to voice playback: voice input, speech recognition, large model calling, speech playback, and asynchronous processing.
1. Voice Input & Recognition: Use Silero & Whisper models to implement continuous speech recognition, capturing the user's voice input.
2. Large Model Calling & AI Conversation Generation: Use the Deepseek model, with Prompt Engineering to guide the AI to mimic the "Little Prince" character for interactive conversations.
3. Voice Playback: Use Edge-tts to convert the AI-generated text into speech, further enhancing user experience.
4. Asynchronous Processing: Achieved through asyncio module for multitasking concurrency, ensuring efficient operation of speech recognition and AI conversation generation.
- Complete code: Local Computation [zip file at end of article]
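The four points above can be pictured as a chain of asyncio tasks connected by queues, so that a new clip can be recognized while an earlier reply is still being generated or spoken. The sketch below is one possible arrangement under that assumption; the stage functions are stand-ins for Whisper, the Deepseek model, and Edge-tts, and all names are hypothetical rather than taken from the project's code.

```python
import asyncio


async def stage(inbox, outbox, work):
    """Run `work` on each queued item and forward the result; None means stop."""
    while True:
        item = await inbox.get()
        if item is None:
            if outbox is not None:
                await outbox.put(None)  # pass the stop signal downstream
            break
        result = await work(item)
        if outbox is not None:
            await outbox.put(result)


async def run_pipeline(clips, recognize, generate, speak):
    """Push clips through recognize -> generate -> speak; return spoken items."""
    heard, understood, replies = asyncio.Queue(), asyncio.Queue(), asyncio.Queue()
    spoken = []

    async def speak_and_log(reply):
        spoken.append(await speak(reply))

    tasks = [
        asyncio.create_task(stage(heard, understood, recognize)),
        asyncio.create_task(stage(understood, replies, generate)),
        asyncio.create_task(stage(replies, None, speak_and_log)),
    ]
    for clip in clips:
        await heard.put(clip)
    await heard.put(None)
    await asyncio.gather(*tasks)
    return spoken


# Stand-in stages; the real ones would call the recognizer, the model, and TTS.
async def fake_recognize(clip):
    await asyncio.sleep(0)
    return f"text({clip})"


async def fake_generate(text):
    return f"reply({text})"


async def fake_speak(reply):
    return f"audio({reply})"
```

Each stage has a single consumer, so replies come out in the same order the clips went in, which matches the idea of the Little Prince responding to inputs one by one.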
5.1.2 Cloud-Based Version
Testing showed that conversation in the local computation version was not smooth enough, so here we replace the Whisper and Edge-tts code with calls to Baidu's API, moving speech recognition and speech synthesis to the cloud.
Get Token: Use the get_baidu_access_token function to obtain an access token from Baidu.
Speech to Text: Replace Whisper model with the baidu_speech_to_text function to process audio files.
Text to Speech: Replace Edge-tts module with the baidu_text_to_speech function to generate audio files.
- Complete code: Cloud-Based Processing [zip file at end of article]
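As an illustration of the first step, the sketch below shows how a function like get_baidu_access_token might request a token. It assumes Baidu's client-credentials OAuth endpoint and an API key and secret key from your own Baidu account; the body is an assumption, not the project's actual implementation, and build_token_url is a hypothetical helper.

```python
import json
import urllib.parse
import urllib.request

TOKEN_URL = "https://aip.baidubce.com/oauth/2.0/token"  # assumed OAuth endpoint


def build_token_url(api_key, secret_key):
    """Build the token URL for the client-credentials grant."""
    query = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": api_key,
        "client_secret": secret_key,
    })
    return f"{TOKEN_URL}?{query}"


def get_baidu_access_token(api_key, secret_key, timeout=10):
    """Request an access token (needs network access and valid credentials)."""
    request = urllib.request.Request(build_token_url(api_key, secret_key), method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["access_token"]
```

The token would then be passed to the speech-to-text and text-to-speech calls. Since tokens are valid for some time, fetching one at startup and reusing it avoids an extra round trip per utterance.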
5.2 Multi-AI Communication
In this project, we recreated the loneliness and longing of the Little Prince on his planet. Every day, he sends messages to the Rose on Earth but receives no response. To ease his loneliness, we created a Cyber-Rose, enabling communication between two UNIHIKERs through a Flask server. The programming involves two Python files: one for the Little Prince and one for the Rose. The Little Prince's side is responsible for receiving and processing messages from the Rose and sending back responses. It acts as a client in the system, sending requests to other services and receiving responses. On the Rose’s side, the Flask portion creates a simple web server to handle received messages and perform corresponding operations.
Video: https://watch.wave.video/gg0FPwcxoIpfgjmo
- Save the two programs below in the /root/silero-vad-master/ directory on the two UNIHIKERs, naming them 04_5_deepseek_baidu_flask_prince.py and 04_6_deepseek_baidu_flask_rose.py.
- Little Prince code: 04_5_deepseek_baidu_flask_prince.py
- Rose code: 04_6_deepseek_baidu_flask_rose.py
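The exchange between the two boards can be sketched as a small Flask endpoint on the Rose's side and an HTTP client on the Little Prince's side. This is a simplified illustration: the /message route, the JSON field names, the port, and the helper names are all assumptions, and the actual files above may be organized differently.

```python
import json
import urllib.request


def create_app(on_message):
    """Rose side: build a Flask app that passes incoming text to on_message."""
    from flask import Flask, jsonify, request  # third-party; imported lazily

    app = Flask(__name__)

    @app.route("/message", methods=["POST"])
    def receive_message():
        text = (request.get_json(silent=True) or {}).get("text", "")
        return jsonify({"reply": on_message(text)})

    return app


def build_message_request(server_url, text):
    """Prince side: build the POST request that carries one message."""
    return urllib.request.Request(
        server_url,
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_message(server_url, text, timeout=30):
    """Prince side: send a message and return the Rose's reply."""
    request = build_message_request(server_url, text)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["reply"]
```

On the Rose's board, create_app(handler).run(host="0.0.0.0", port=5000) would start the server, where handler is whatever function produces the Rose's reply. On the Little Prince's board, send_message would be called with the Rose board's IP address in the URL.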
5.3 Creating Visual Faces for AI Characters
With the cyber Rose now able to answer the Little Prince's messages, the next step is to give each AI character a face. Within our self-built virtual environment, we use a GUI to display one or more GIF animations and switch between them with buttons, creating lively facial expressions for each character.
Code: Visual Faces [zip file at end of article]
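Independently of the GUI library used, switching between animations comes down to tracking which expression is active and which frame of its GIF to draw next. The sketch below shows only that bookkeeping; the class name, expression names, and frame counts are hypothetical, and the actual drawing would be done by the GUI code in the Visual Faces files.

```python
from dataclasses import dataclass


@dataclass
class FaceAnimator:
    frame_counts: dict  # expression name -> number of frames in its GIF
    expression: str     # currently displayed expression
    frame: int = 0      # index of the next frame to draw

    def switch(self, expression):
        """Change expression (for example from a button press); restart at frame 0."""
        if expression not in self.frame_counts:
            raise KeyError(expression)
        if expression != self.expression:
            self.expression, self.frame = expression, 0

    def next_frame(self):
        """Return (expression, frame index) to draw, then advance with wraparound."""
        current = (self.expression, self.frame)
        self.frame = (self.frame + 1) % self.frame_counts[self.expression]
        return current
```

A timer in the GUI would call next_frame at the GIF's frame rate and draw the returned frame, while each button callback would call switch with the expression it represents.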
5.4 Final Assembly & Conflict Resolution
5.4.1 Hardware Assembly
After completing the above steps, the remaining work is to resolve the logical conflicts between the GUI, Flask, and the original asyncio asynchronous processing. Before that, this section describes the design and assembly process in detail.
Here are the complete hardware and materials. To adapt them to the project, a few of them need simple modifications:
1. Modify the battery's power output cable to a thinner Type-C to Type-C data cable (Soldering).
2. Solder the switch onto the positive power line.
3. To keep the overall design as compact as possible, we need to cut off the two corners of the UNIHIKER's gold fingers (Handheld cutting machine).
Assembly
Connect wires as shown > Prepare the lower hemisphere case > Put the speaker into the lower hemisphere case > Fix the battery charging port
Insert the battery > then the UNIHIKER > Cover the upper hemisphere shell > Tighten the self-tapping screws on the back
5.4.2 Connect 2 UNIHIKERs
Since there are now two UNIHIKERs, we will use Wi-Fi to connect to them remotely.
1. Make sure the two UNIHIKERs are connected to the same network as your computer.
2. Check the IP address on the boards.
3. Open two terminal windows on your computer, one for each UNIHIKER.
4. In each window, log in to the corresponding board, replacing 10.1.2.3 with that board's IP address.
Password: dfrobot
5. Activate the chat_agents environment:
conda activate chat_agents
5.4.3 Run the program
The following code is the complete, modified version. The logical conflicts between the GUI, Flask, and the original asyncio asynchronous processing involve many details, and further instructions will be provided later.
- Code: 05_5_2_deepseek_baidu_flask_prince_gif_scale_tutorial
- Code: 05_6_deepseek_baidu_flask_rose_gif_tutorial
Save the above programs to the /root/silero-vad-master/ directory on the two boards, name them prince.py and Rose.py respectively, and run them.
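The article does not spell out how the conflicts are resolved, so the following is only one common arrangement, not necessarily the one used in the files above: keep the GUI on the main thread, run the asyncio event loop in a background thread, and let other threads (such as a Flask request handler) hand work to that loop and pass display updates back through a thread-safe queue. All names here are hypothetical.

```python
import asyncio
import queue
import threading


def start_background_loop():
    """Start an asyncio event loop in a daemon thread and return it."""
    loop = asyncio.new_event_loop()
    threading.Thread(target=loop.run_forever, daemon=True).start()
    return loop


async def handle_message(text, ui_events):
    """Stand-in for recognition, model call, and TTS; asks the GUI to update."""
    await asyncio.sleep(0)
    ui_events.put(("show_face", "talking"))  # queue.Queue is thread-safe
    return f"reply to: {text}"


def submit_from_any_thread(loop, text, ui_events):
    """Schedule handle_message on the loop from a non-asyncio thread."""
    return asyncio.run_coroutine_threadsafe(handle_message(text, ui_events), loop)
```

A Flask handler could call submit_from_any_thread and wait on the returned future's result, while the GUI thread periodically polls ui_events and switches the displayed face. This keeps each framework in its own thread so that none of them blocks the others.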
6 Done
Future Plan
- Add a USB camera so that multimodal large models can combine vision with interactive dialogue.
- Personalisation with a Dify custom agent workflow.
- Involve human users in the interaction between the two UNIHIKERs.