Compare commits

23 Commits

Author SHA1 Message Date
66d318774a fix: server error to disconnect 2026-06-17 15:19:31 +08:00
60df0fe196 feat: add tools to normal agent 2026-06-12 14:23:41 +08:00
2c4329fd84 fix: voice interrupt 2026-06-12 11:38:47 +08:00
9637e09aef feat: beaver 2026-06-04 15:48:10 +08:00
b92e6e1b07 feat: remove background cam every time 2026-05-29 14:53:58 +08:00
33ee598c21 feat: add icon beaver 2026-05-29 11:22:31 +08:00
37343ac0fe feat: icon first commit 2026-05-27 17:16:11 +08:00
fc6302661d feat: support camera capture to livekit 2026-05-25 17:21:11 +08:00
4953244c7c fix: voice interrupt 2026-05-22 10:20:00 +08:00
5223333418 fix: voice interrupt 2026-05-22 10:10:16 +08:00
61ad9dafd9 fix: text display 2026-05-21 17:05:09 +08:00
928d40826f feat: ws connect 2026-05-18 15:56:50 +08:00
417f52d759 perf(websocket): switch WiFi to performance mode before connecting (#1985)
* perf(websocket): switch WiFi to performance mode before connecting

Optimize WebSocket connection speed by switching WiFi to performance
mode before establishing the connection, instead of after.

This reduces network latency significantly:
- TCP connection: 1093ms → 88ms (92% faster)
- WebSocket handshake: 1035ms → 80ms (92% faster)
- Total network layer: 2128ms → 173ms (92% faster)

The issue was caused by WiFi power save mode (MAX_MODEM) which adds
significant latency to packet transmission.

* Adjust formatting
2026-05-14 14:36:14 +08:00
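The percentages quoted in the commit message above can be rechecked from the raw timings. The following is a minimal sketch for doing that; the helper name `ReductionPercent` is illustrative and does not come from the repository.

```cpp
#include <cassert>

// Percentage reduction when a latency drops from `before_ms` to `after_ms`.
// Hypothetical helper, used only to recheck the figures in the commit message.
double ReductionPercent(double before_ms, double after_ms) {
    assert(before_ms > 0.0);
    return (before_ms - after_ms) / before_ms * 100.0;
}
```

Applied to the three pairs in the message (1093 to 88, 1035 to 80, 2128 to 173), each result rounds to the stated 92%.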
67bf599149 fix(otto): WebSocket direct clients not receiving MCP responses (#1992)
* Enhance Otto Robot camera support by adding configuration for OV3660. Updated config.h to define camera types and GPIO settings, modified config.json to include new camera options, and refactored otto_robot.cc for improved camera detection and initialization logic.

* fix: remove the SetTheme call from the OttoEmojiDisplay constructor to fix a LoadProhibited crash

Made-with: Cursor

* refactor: improve audio service error handling and codec timeout management

- Updated AudioService to prevent input task termination on read timeout, introducing a delay instead.
- Enhanced NoAudioCodec to implement a read timeout for I2S channel reads.
- Adjusted WebSocketControlServer to set a control port for improved socket management.
- Added manufacturer information to the config.json for waveshare ESP32-Touch-LCD-3.5.

* fix(otto): WebSocket direct clients not receiving MCP responses

When a browser connects directly to the WebSocket control server (port
8080) and sends a JSON-RPC request, the MCP response was routed through
Application::SendMcpMessage -> protocol_->SendMcpMessage, which sends it
to the cloud protocol channel. As a result, the direct WebSocket client
never received the response, while the WeChat mini-program could because
it communicates via the cloud.

Fix:
- Add BroadcastMessage() to WebSocketControlServer, using
  httpd_queue_work + httpd_ws_send_frame_async to asynchronously
  send responses back to all connected clients on port 8080
- Add RegisterMcpBroadcastCallback() to Application, allowing an
  additional MCP send callback to be registered; SendMcpMessage()
  now invokes it alongside the cloud protocol
- Register the broadcast callback in OttoRobot after the WebSocket
  server starts successfully

Also add WebSocket direct-connect API documentation to README.md
with complete JSON-RPC 2.0 command examples.
2026-05-14 14:35:49 +08:00
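The routing change described in the commit above (send each MCP response to the cloud channel and also to an optional broadcast callback) can be sketched in plain C++. This is not the project's code: `McpSender`, `SetCloudSender` and `RegisterBroadcastCallback` are stand-in names chosen for illustration, and the real implementation sends frames asynchronously through the HTTP server.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Simplified stand-in for the application object; not the project's real class.
class McpSender {
public:
    using SendFn = std::function<void(const std::string&)>;

    // Primary route (the cloud protocol channel in the commit message).
    void SetCloudSender(SendFn fn) { cloud_sender_ = std::move(fn); }

    // Additional route, e.g. a broadcast to directly connected WebSocket clients.
    void RegisterBroadcastCallback(SendFn fn) { broadcast_ = std::move(fn); }

    // Deliver the payload on every route that has been configured.
    void SendMcpMessage(const std::string& payload) const {
        if (cloud_sender_) cloud_sender_(payload);
        if (broadcast_) broadcast_(payload);
    }

private:
    SendFn cloud_sender_;
    SendFn broadcast_;
};
```

Keeping the broadcast route optional means boards that never start a local WebSocket server behave exactly as before.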
ba27c12494 fix(no_audio_codec): replace kReadTimeoutTicks with kReadTimeoutMs for clarity and consistency 2026-05-07 22:11:42 +08:00
c1d520d700 Feat: Add battery support and small fixes for Freenove 2.8 board (#1976)
* feat(freenove-esp32s3): add battery level retrieval

* fix(freenove-esp32s3): add missing comma in config.json

* docs(freenove-esp32s3): note possible shared design with ES3C28P/ES3N28P
2026-05-07 20:51:58 +08:00
1847b58935 fix(mcp): remove unnecessary guard for self.assets.set_download_url tool registration
The guard around the registration of the self.assets.set_download_url tool has been removed, ensuring it is always available for configuration. This change addresses issues on 32MB flash devices where the tool was previously skipped due to partition validation checks.

Fixes #1962
2026-05-02 15:56:47 +08:00
2be3c2cb1a fix(mcp): always register self.assets.set_download_url tool for 32MB flash devices (#1971)
* fix(m5stack-tab5): remove stale esp_video==0.7.0 dependency instructions

The README previously instructed users to override esp_video to 0.7.0
and esp_ipa to 0.1.0, but this causes build failures because:
- esp_video 0.7.0 does not export esp_video_deinit(), resulting in
  linker errors ('MAP_FAILED' and 'esp_video_deinit' not declared)
- The project's main/idf_component.yml already pins the correct
  version (esp_video==1.3.1) that the source code expects

Users should now use the default dependency versions from idf_component.yml
without modification.

Fixes #1957

* fix(mcp): always register self.assets.set_download_url tool

On 32MB flash devices the assets partition layout differs from the
default, causing partition_valid() to return false and silently
skipping registration of the self.assets.set_download_url MCP tool.
Users see 'Unknown tool: self.assets.set_download_url' from their MCP
client.

The tool writes to Settings storage which works regardless of the
partition map, so the partition_valid() guard is unnecessary.
Move the AddUserOnlyTool call outside the guard so the tool is always
available for explicit configuration via MCP.

Fixes #1962

---------

Co-authored-by: Aayush Pratap Singh <aayushpratap.singh@gmail.com>
2026-05-02 06:23:25 +08:00
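The fix above amounts to moving one registration call out of a conditional. A minimal sketch of that shape follows, assuming a toy registry; `ToolRegistry`, `RegisterAssetTools` and the placeholder tool name are hypothetical and only `self.assets.set_download_url` is taken from the commit message.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical miniature of a tool registry; names are illustrative only.
struct ToolRegistry {
    std::set<std::string> tools;
    void Add(const std::string& name) { tools.insert(name); }
    bool Has(const std::string& name) const { return tools.count(name) > 0; }
};

// `partition_valid` plays the role of the guard described in the commit.
// The URL setter only writes to settings storage, so it is registered
// unconditionally; anything that needs the partition stays behind the guard.
void RegisterAssetTools(ToolRegistry& registry, bool partition_valid) {
    registry.Add("self.assets.set_download_url");
    if (partition_valid) {
        registry.Add("example.partition_dependent_tool");  // placeholder name
    }
}
```

With this layout a device whose partition check fails still exposes the URL setter, which is the behavior the commit describes for 32MB flash devices.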
e12e7351d9 Merge pull request #1958 from rymcu/main
feat(board): add rymcu-bigsmart board support
2026-04-30 17:16:40 +08:00
20175fa059 Move RYMCU BigSmart under manufacturer directory
Create main/boards/rymcu/bigsmart so future RYMCU boards can live under the same manufacturer directory. Update CMake to set MANUFACTURER to rymcu while preserving BOARD_NAME as rymcu-bigsmart, and adjust config.json so release output remains rymcu-bigsmart.
2026-04-30 16:04:30 +08:00
8cbbf3f357 chore: update dependencies in idf_component.yml
- Bump esp-wifi-connect version from ~3.1.2 to ~3.1.3
- Update uart-eth-modem version from ~0.3.4 to ~0.4.0
2026-04-30 13:29:36 +08:00
79a482a09e fix(blufi): GET_WIFI_LIST triggers real-time scan with guaranteed response (#1964)
* fix(blufi): GET_WIFI_LIST triggers real-time scan with guaranteed response

Previously, ESP_BLUFI_EVENT_GET_WIFI_LIST waited for any in-progress scan
to finish and then returned the cached result. When the cache was empty
(e.g. after a config-mode transition that stopped the Wi-Fi driver),
_send_wifi_list() returned silently with no response frame, leaving the
App waiting until timeout.

Changes:
- GET_WIFI_LIST now clears the cache and starts a fresh scan immediately.
- _wifi_scan_event_handler calls _send_wifi_list() after every scan
  triggered by a GET_WIFI_LIST request.
- start_wifi_scan() calls esp_wifi_start() before esp_wifi_scan_start()
  to handle the case where the driver was stopped during a mode
  transition (ESP_ERR_WIFI_STATE is treated as already-started).
- _send_wifi_list() sends ESP_BLUFI_WIFI_SCAN_FAIL when no APs are
  found, so the App always receives a terminal response.
- Redundant static_cast in _wifi_scan_event_handler replaced with the
  existing local `self` pointer.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(blufi): preserve cache fast-path, fall back to live scan only when needed

Address review feedback on always-rescan latency regression.

The GET_WIFI_LIST handler now distinguishes three cases:

1. Scan in flight: defer the response via m_send_list_after_scan; the
   scan-done handler dispatches when it fires. Removes the previous
   blocking `while (m_scan_in_progress) vTaskDelay(500)` which would
   stall the BluFi event task indefinitely if the scan never completed.

2. Cache populated: respond from cache immediately (~50 ms, no latency
   change vs original behavior). _send_wifi_list() still kicks off an
   async refresh scan as before to keep the cache fresh.

3. Cache empty and no scan running: trigger a live scan and dispatch
   from the scan-done handler. If start_wifi_scan() fails, send
   ESP_BLUFI_WIFI_SCAN_FAIL so the App exits its wait state.

State variables are also disentangled:

- m_scan_should_save_ssid keeps its original meaning (write scan results
  into m_ap_records). Cleared during connect-to-AP so the connect-time
  scan does not pollute the cache.
- m_send_list_after_scan is new and tracks "the next scan-done event
  should respond to a pending GET_WIFI_LIST request". The previous PR
  conflated these two responsibilities onto m_scan_should_save_ssid,
  which would have caused init-time scans to spuriously emit a wifi list
  to the App.

start_wifi_scan() now returns bool so the caller can distinguish
"scan started or already running" from "could not start a scan".

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Yixin Shi <shiyixin@qiniu.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-30 09:27:49 +08:00
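The three cases listed in the second BluFi commit can be read as a small decision function. The sketch below assumes the cases are checked in the order the message lists them; the enum and function names are invented for illustration, and `scan_can_start` stands in for the bool that `start_wifi_scan()` is said to return.

```cpp
#include <cassert>

enum class WifiListAction {
    kDeferUntilScanDone,  // case 1: a scan is already in flight
    kReplyFromCache,      // case 2: cached results are available
    kStartScanThenReply,  // case 3: empty cache, live scan started
    kReplyScanFail        // case 3 fallback: the scan could not be started
};

// Pure decision logic only; the real handler also sets flags and sends frames.
WifiListAction DecideWifiListAction(bool scan_in_progress,
                                    bool cache_populated,
                                    bool scan_can_start) {
    if (scan_in_progress) return WifiListAction::kDeferUntilScanDone;
    if (cache_populated) return WifiListAction::kReplyFromCache;
    return scan_can_start ? WifiListAction::kStartScanThenReply
                          : WifiListAction::kReplyScanFail;
}
```

Every branch ends in a response or a deferred response, which matches the stated goal that the App always receives a terminal reply.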
07e2a11253 feat(board): add rymcu-bigsmart board support 2026-04-23 20:44:35 +08:00
50 changed files with 3526 additions and 313 deletions

2
.gitignore vendored
View File

@ -10,6 +10,7 @@ sdkconfig
dependencies.lock
.env
releases/
vision_frames/
main/assets/lang_config.h
main/mmap_generate_emoji.h
.DS_Store
@ -18,3 +19,4 @@ main/mmap_generate_emoji.h
*.bin
mmap_generate_*.h
.clangd
background_frames/

View File

@ -158,6 +158,13 @@ elseif(CONFIG_BOARD_TYPE_LICHUANG_DEV_C3)
set(BUILTIN_TEXT_FONT font_puhui_basic_20_4)
set(BUILTIN_ICON_FONT font_awesome_20_4)
set(DEFAULT_EMOJI_COLLECTION twemoji_32)
elseif(CONFIG_BOARD_TYPE_RYMCU_BIGSMART)
set(MANUFACTURER "rymcu")
set(BOARD_TYPE "bigsmart")
set(BOARD_NAME "rymcu-bigsmart")
set(BUILTIN_TEXT_FONT font_noto_basic_20_4)
set(BUILTIN_ICON_FONT font_awesome_20_4)
set(DEFAULT_EMOJI_COLLECTION noto-emoji_128)
elseif(CONFIG_BOARD_TYPE_EDA_TV_PRO)
set(MANUFACTURER "lceda-course-examples")
set(BOARD_TYPE "eda-tv-pro")

View File

@ -6,6 +6,34 @@ config OTA_URL
help
The application will access this URL to check for new firmwares and server address.
config USE_DIRECT_WEBSOCKET
bool "Use direct WebSocket without OTA"
default n
help
Skip the OTA server check and use the WebSocket settings below directly.
config WEBSOCKET_URL
string "Default WebSocket URL"
depends on USE_DIRECT_WEBSOCKET
default "ws://172.19.0.240:8080"
help
The WebSocket server URL used when direct WebSocket mode is enabled.
config WEBSOCKET_TOKEN
string "Default WebSocket token"
depends on USE_DIRECT_WEBSOCKET
default ""
help
Optional Authorization token for the direct WebSocket server.
config WEBSOCKET_PROTOCOL_VERSION
int "Default WebSocket protocol version"
depends on USE_DIRECT_WEBSOCKET
range 1 3
default 1
help
Protocol-Version header and hello version used by the WebSocket protocol.
choice
prompt "Flash Assets"
default FLASH_DEFAULT_ASSETS if !USE_EMOTE_MESSAGE_STYLE
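In an ESP-IDF build, Kconfig options such as the ones added in this hunk become `CONFIG_*` macros in the generated `sdkconfig.h`. The sketch below shows one way firmware code might read them; the `DirectWebsocketDefaults` struct and its methods are hypothetical, and the `#ifndef` fallbacks exist only so the example compiles outside a firmware build.

```cpp
#include <cassert>
#include <string>

// Fallbacks repeat the Kconfig defaults from the hunk above; in a real build
// the values come from sdkconfig.h instead.
#ifndef CONFIG_WEBSOCKET_URL
#define CONFIG_WEBSOCKET_URL "ws://172.19.0.240:8080"
#endif
#ifndef CONFIG_WEBSOCKET_TOKEN
#define CONFIG_WEBSOCKET_TOKEN ""
#endif
#ifndef CONFIG_WEBSOCKET_PROTOCOL_VERSION
#define CONFIG_WEBSOCKET_PROTOCOL_VERSION 1
#endif

// Hypothetical settings holder; not a type from the repository.
struct DirectWebsocketDefaults {
    std::string url = CONFIG_WEBSOCKET_URL;
    std::string token = CONFIG_WEBSOCKET_TOKEN;
    int protocol_version = CONFIG_WEBSOCKET_PROTOCOL_VERSION;

    // Mirrors the Kconfig `range 1 3` constraint.
    bool HasValidVersion() const { return protocol_version >= 1 && protocol_version <= 3; }

    // An empty token means no Authorization value is configured.
    bool HasToken() const { return !token.empty(); }
};
```

Note that the default URL is a private-network address, so most deployments would be expected to override it in menuconfig.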
@ -215,6 +243,9 @@ choice BOARD_TYPE
config BOARD_TYPE_LICHUANG_DEV_C3
bool "立创·实战派 ESP32-C3"
depends on IDF_TARGET_ESP32C3
config BOARD_TYPE_RYMCU_BIGSMART
bool "RYMCU BigSmart"
depends on IDF_TARGET_ESP32S3
config BOARD_TYPE_EDA_TV_PRO
bool "EDA课程案例 EDA-TV-Pro"
depends on IDF_TARGET_ESP32S3
@ -711,7 +742,7 @@ choice DISPLAY_STYLE
config USE_EMOTE_MESSAGE_STYLE
bool "Emote animation style"
depends on BOARD_TYPE_ESP_BOX || BOARD_TYPE_ESP_BOX_3 \
|| BOARD_TYPE_ESP_VOCAT || BOARD_TYPE_LICHUANG_DEV_S3 \
|| BOARD_TYPE_ESP_VOCAT || BOARD_TYPE_LICHUANG_DEV_S3 || BOARD_TYPE_RYMCU_BIGSMART \
|| BOARD_TYPE_ESP_SENSAIRSHUTTLE
endchoice
@ -808,7 +839,7 @@ config USE_DEVICE_AEC
bool "Enable Device-Side AEC"
default n
depends on USE_AUDIO_PROCESSOR && (BOARD_TYPE_ESP_BOX_3 || BOARD_TYPE_ESP_BOX || BOARD_TYPE_ESP_BOX_LITE \
|| BOARD_TYPE_LICHUANG_DEV_S3 || BOARD_TYPE_ESP_KORVO2_V3 || BOARD_TYPE_WAVESHARE_ESP32_S3_TOUCH_AMOLED_1_75|| BOARD_TYPE_WAVESHARE_ESP32_S3_TOUCH_AMOLED_1_75C || BOARD_TYPE_WAVESHARE_ESP32_S3_TOUCH_LCD_1_83\
|| BOARD_TYPE_LICHUANG_DEV_S3 || BOARD_TYPE_RYMCU_BIGSMART || BOARD_TYPE_ESP_KORVO2_V3 || BOARD_TYPE_WAVESHARE_ESP32_S3_TOUCH_AMOLED_1_75|| BOARD_TYPE_WAVESHARE_ESP32_S3_TOUCH_AMOLED_1_75C || BOARD_TYPE_WAVESHARE_ESP32_S3_TOUCH_LCD_1_83\
|| BOARD_TYPE_WAVESHARE_ESP32_S3_TOUCH_AMOLED_2_06 || BOARD_TYPE_WAVESHARE_ESP32_S3_TOUCH_LCD_4B || BOARD_TYPE_WAVESHARE_ESP32_P4_WIFI6_TOUCH_LCD_4B || BOARD_TYPE_WAVESHARE_ESP32_P4_WIFI6_TOUCH_LCD_4_3 \
|| BOARD_TYPE_WAVESHARE_ESP32_P4_WIFI6_TOUCH_LCD_7B \
|| BOARD_TYPE_WAVESHARE_ESP32_P4_WIFI6_TOUCH_LCD_3_4C || BOARD_TYPE_WAVESHARE_ESP32_P4_WIFI6_TOUCH_LCD_4C || BOARD_TYPE_ESP_S3_LCD_EV_Board_2 || BOARD_TYPE_YUNLIAO_S3 \

View File

@ -1,25 +1,24 @@
#include "application.h"
#include "assets.h"
#include "assets/lang_config.h"
#include "audio_codec.h"
#include "board.h"
#include "display.h"
#include "system_info.h"
#include "audio_codec.h"
#include "mqtt_protocol.h"
#include "websocket_protocol.h"
#include "assets/lang_config.h"
#include "mcp_server.h"
#include "assets.h"
#include "mqtt_protocol.h"
#include "settings.h"
#include "system_info.h"
#include "websocket_protocol.h"
#include <cstring>
#include <esp_log.h>
#include <cJSON.h>
#include <driver/gpio.h>
#include <esp_log.h>
#include <arpa/inet.h>
#include <cJSON.h>
#include <font_awesome.h>
#include <cstring>
#define TAG "Application"
Application::Application() {
event_group_ = xEventGroupCreate();
@ -33,16 +32,16 @@ Application::Application() {
aec_mode_ = kAecOff;
#endif
esp_timer_create_args_t clock_timer_args = {
.callback = [](void* arg) {
Application* app = (Application*)arg;
xEventGroupSetBits(app->event_group_, MAIN_EVENT_CLOCK_TICK);
},
.arg = this,
.dispatch_method = ESP_TIMER_TASK,
.name = "clock_timer",
.skip_unhandled_events = true
};
esp_timer_create_args_t clock_timer_args = {.callback =
[](void* arg) {
Application* app = (Application*)arg;
xEventGroupSetBits(app->event_group_,
MAIN_EVENT_CLOCK_TICK);
},
.arg = this,
.dispatch_method = ESP_TIMER_TASK,
.name = "clock_timer",
.skip_unhandled_events = true};
esp_timer_create(&clock_timer_args, &clock_timer_handle_);
}
@ -54,9 +53,7 @@ Application::~Application() {
vEventGroupDelete(event_group_);
}
bool Application::SetDeviceState(DeviceState state) {
return state_machine_.TransitionTo(state);
}
bool Application::SetDeviceState(DeviceState state) { return state_machine_.TransitionTo(state); }
void Application::Initialize() {
auto& board = Board::GetInstance();
@ -81,6 +78,7 @@ void Application::Initialize() {
xEventGroupSetBits(event_group_, MAIN_EVENT_WAKE_WORD_DETECTED);
};
callbacks.on_vad_change = [this](bool speaking) {
vad_speaking_.store(speaking);
xEventGroupSetBits(event_group_, MAIN_EVENT_VAD_CHANGE);
};
audio_service_.SetCallbacks(callbacks);
@ -101,7 +99,7 @@ void Application::Initialize() {
// Set network event callback for UI updates and network state handling
board.SetNetworkEventCallback([this](NetworkEvent event, const std::string& data) {
auto display = Board::GetInstance().GetDisplay();
switch (event) {
case NetworkEvent::Scanning:
display->ShowNotification(Lang::Strings::SCANNING_WIFI, 30000);
@ -141,13 +139,16 @@ void Application::Initialize() {
display->SetStatus(Lang::Strings::DETECTING_MODULE);
break;
case NetworkEvent::ModemErrorNoSim:
Alert(Lang::Strings::ERROR, Lang::Strings::PIN_ERROR, "triangle_exclamation", Lang::Sounds::OGG_ERR_PIN);
Alert(Lang::Strings::ERROR, Lang::Strings::PIN_ERROR, "triangle_exclamation",
Lang::Sounds::OGG_ERR_PIN);
break;
case NetworkEvent::ModemErrorRegDenied:
Alert(Lang::Strings::ERROR, Lang::Strings::REG_ERROR, "triangle_exclamation", Lang::Sounds::OGG_ERR_REG);
Alert(Lang::Strings::ERROR, Lang::Strings::REG_ERROR, "triangle_exclamation",
Lang::Sounds::OGG_ERR_REG);
break;
case NetworkEvent::ModemErrorInitFailed:
Alert(Lang::Strings::ERROR, Lang::Strings::MODEM_INIT_ERROR, "triangle_exclamation", Lang::Sounds::OGG_EXCLAMATION);
Alert(Lang::Strings::ERROR, Lang::Strings::MODEM_INIT_ERROR, "triangle_exclamation",
Lang::Sounds::OGG_EXCLAMATION);
break;
case NetworkEvent::ModemErrorTimeout:
display->SetStatus(Lang::Strings::REGISTERING_NETWORK);
@ -166,19 +167,11 @@ void Application::Run() {
// Set the priority of the main task to 10
vTaskPrioritySet(nullptr, 10);
const EventBits_t ALL_EVENTS =
MAIN_EVENT_SCHEDULE |
MAIN_EVENT_SEND_AUDIO |
MAIN_EVENT_WAKE_WORD_DETECTED |
MAIN_EVENT_VAD_CHANGE |
MAIN_EVENT_CLOCK_TICK |
MAIN_EVENT_ERROR |
MAIN_EVENT_NETWORK_CONNECTED |
MAIN_EVENT_NETWORK_DISCONNECTED |
MAIN_EVENT_TOGGLE_CHAT |
MAIN_EVENT_START_LISTENING |
MAIN_EVENT_STOP_LISTENING |
MAIN_EVENT_ACTIVATION_DONE |
const EventBits_t ALL_EVENTS =
MAIN_EVENT_SCHEDULE | MAIN_EVENT_SEND_AUDIO | MAIN_EVENT_WAKE_WORD_DETECTED |
MAIN_EVENT_VAD_CHANGE | MAIN_EVENT_CLOCK_TICK | MAIN_EVENT_ERROR |
MAIN_EVENT_NETWORK_CONNECTED | MAIN_EVENT_NETWORK_DISCONNECTED | MAIN_EVENT_TOGGLE_CHAT |
MAIN_EVENT_START_LISTENING | MAIN_EVENT_STOP_LISTENING | MAIN_EVENT_ACTIVATION_DONE |
MAIN_EVENT_STATE_CHANGED;
while (true) {
@ -186,7 +179,8 @@ void Application::Run() {
if (bits & MAIN_EVENT_ERROR) {
SetDeviceState(kDeviceStateIdle);
Alert(Lang::Strings::ERROR, last_error_message_.c_str(), "circle_xmark", Lang::Sounds::OGG_EXCLAMATION);
Alert(Lang::Strings::ERROR, last_error_message_.c_str(), "circle_xmark",
Lang::Sounds::OGG_EXCLAMATION);
}
if (bits & MAIN_EVENT_NETWORK_CONNECTED) {
@ -233,6 +227,13 @@ void Application::Run() {
if (GetDeviceState() == kDeviceStateListening) {
auto led = Board::GetInstance().GetLed();
led->OnStateChanged();
if (vad_speaking_.load() && vision_text_mode_enabled_.load() &&
!vision_frame_sent_for_current_listen_.exchange(true)) {
if (!SendCurrentVisionFrame()) {
vision_frame_sent_for_current_listen_.store(false);
}
}
}
}
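The hunk above uses `exchange(true)` on an atomic flag so that a vision frame is sent at most once per listening session, and clears the flag again if the send fails. A self-contained sketch of that pattern is shown below; the `OncePerListen` class is an illustrative stand-in, not code from the repository.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

// Runs `send` at most once per listening session, but re-arms itself when the
// send fails so that a later event can retry.
class OncePerListen {
public:
    // Call when a new listening session begins.
    void Reset() { sent_.store(false); }

    // Returns true only if this call invoked `send` and it succeeded.
    bool TrySend(const std::function<bool()>& send) {
        // exchange(true) returns the previous value, so only the first caller
        // after Reset() (or after a failed send) sees false and proceeds.
        if (sent_.exchange(true)) return false;
        if (!send()) {
            sent_.store(false);  // allow a retry
            return false;
        }
        return true;
    }

private:
    std::atomic<bool> sent_{false};
};
```

Because the check and the set happen in one atomic operation, two tasks racing on the same flag cannot both decide to send.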
@ -249,7 +250,7 @@ void Application::Run() {
clock_ticks_++;
auto display = Board::GetInstance().GetDisplay();
display->UpdateStatusBar();
// Print debug info every 10 seconds
if (clock_ticks_ % 10 == 0) {
SystemInfo::PrintHeapStats();
@ -270,12 +271,14 @@ void Application::HandleNetworkConnectedEvent() {
return;
}
xTaskCreate([](void* arg) {
Application* app = static_cast<Application*>(arg);
app->ActivationTask();
app->activation_task_handle_ = nullptr;
vTaskDelete(NULL);
}, "activation", 4096 * 2, this, 2, &activation_task_handle_);
xTaskCreate(
[](void* arg) {
Application* app = static_cast<Application*>(arg);
app->ActivationTask();
app->activation_task_handle_ = nullptr;
vTaskDelete(NULL);
},
"activation", 4096 * 2, this, 2, &activation_task_handle_);
}
// Update the status bar immediately to show the network state
@ -286,7 +289,8 @@ void Application::HandleNetworkConnectedEvent() {
void Application::HandleNetworkDisconnectedEvent() {
// Close current conversation when network disconnected
auto state = GetDeviceState();
if (state == kDeviceStateConnecting || state == kDeviceStateListening || state == kDeviceStateSpeaking) {
if (state == kDeviceStateConnecting || state == kDeviceStateListening ||
state == kDeviceStateThinking || state == kDeviceStateSpeaking) {
ESP_LOGI(TAG, "Closing audio channel due to network disconnection");
protocol_->CloseAudioChannel();
}
@ -302,11 +306,15 @@ void Application::HandleActivationDoneEvent() {
SystemInfo::PrintHeapStats();
SetDeviceState(kDeviceStateIdle);
has_server_time_ = ota_->HasServerTime();
if (ota_ != nullptr) {
has_server_time_ = ota_->HasServerTime();
}
auto display = Board::GetInstance().GetDisplay();
std::string message = std::string(Lang::Strings::VERSION) + ota_->GetCurrentVersion();
display->ShowNotification(message.c_str());
if (ota_ != nullptr) {
std::string message = std::string(Lang::Strings::VERSION) + ota_->GetCurrentVersion();
display->ShowNotification(message.c_str());
}
display->SetChatMessage("system", "");
// Release OTA object after activation is complete
@ -321,6 +329,10 @@ void Application::HandleActivationDoneEvent() {
}
void Application::ActivationTask() {
#if CONFIG_USE_DIRECT_WEBSOCKET
CheckAssetsVersion();
InitializeProtocol();
#else
// Create OTA object for activation process
ota_ = std::make_unique<Ota>();
@ -332,6 +344,7 @@ void Application::ActivationTask() {
// Initialize the protocol
InitializeProtocol();
#endif
// Signal completion to main loop
xEventGroupSetBits(event_group_, MAIN_EVENT_ACTIVATION_DONE);
@ -352,7 +365,7 @@ void Application::CheckAssetsVersion() {
ESP_LOGW(TAG, "Assets partition is disabled for board %s", BOARD_NAME);
return;
}
Settings settings("assets", true);
// Check if there is a new assets need to be downloaded
std::string download_url = settings.GetString("download_url");
@ -362,27 +375,30 @@ void Application::CheckAssetsVersion() {
char message[256];
snprintf(message, sizeof(message), Lang::Strings::FOUND_NEW_ASSETS, download_url.c_str());
Alert(Lang::Strings::LOADING_ASSETS, message, "cloud_arrow_down", Lang::Sounds::OGG_UPGRADE);
Alert(Lang::Strings::LOADING_ASSETS, message, "cloud_arrow_down",
Lang::Sounds::OGG_UPGRADE);
// Wait for the audio service to be idle for 3 seconds
vTaskDelay(pdMS_TO_TICKS(3000));
SetDeviceState(kDeviceStateUpgrading);
board.SetPowerSaveLevel(PowerSaveLevel::PERFORMANCE);
display->SetChatMessage("system", Lang::Strings::PLEASE_WAIT);
bool success = assets.Download(download_url, [this, display](int progress, size_t speed) -> void {
char buffer[32];
snprintf(buffer, sizeof(buffer), "%d%% %uKB/s", progress, speed / 1024);
Schedule([display, message = std::string(buffer)]() {
display->SetChatMessage("system", message.c_str());
bool success =
assets.Download(download_url, [this, display](int progress, size_t speed) -> void {
char buffer[32];
snprintf(buffer, sizeof(buffer), "%d%% %uKB/s", progress, speed / 1024);
Schedule([display, message = std::string(buffer)]() {
display->SetChatMessage("system", message.c_str());
});
});
});
board.SetPowerSaveLevel(PowerSaveLevel::LOW_POWER);
vTaskDelay(pdMS_TO_TICKS(1000));
if (!success) {
Alert(Lang::Strings::ERROR, Lang::Strings::DOWNLOAD_ASSETS_FAILED, "circle_xmark", Lang::Sounds::OGG_EXCLAMATION);
Alert(Lang::Strings::ERROR, Lang::Strings::DOWNLOAD_ASSETS_FAILED, "circle_xmark",
Lang::Sounds::OGG_EXCLAMATION);
vTaskDelay(pdMS_TO_TICKS(2000));
SetDeviceState(kDeviceStateActivating);
return;
@ -398,7 +414,7 @@ void Application::CheckAssetsVersion() {
void Application::CheckNewVersion() {
const int MAX_RETRY = 10;
int retry_count = 0;
int retry_delay = 10; // Initial retry delay in seconds
int retry_delay = 10; // Initial retry delay in seconds
auto& board = Board::GetInstance();
while (true) {
@ -414,27 +430,30 @@ void Application::CheckNewVersion() {
}
char error_message[128];
snprintf(error_message, sizeof(error_message), "code=%d, url=%s", err, ota_->GetCheckVersionUrl().c_str());
snprintf(error_message, sizeof(error_message), "code=%d, url=%s", err,
ota_->GetCheckVersionUrl().c_str());
char buffer[256];
snprintf(buffer, sizeof(buffer), Lang::Strings::CHECK_NEW_VERSION_FAILED, retry_delay, error_message);
snprintf(buffer, sizeof(buffer), Lang::Strings::CHECK_NEW_VERSION_FAILED, retry_delay,
error_message);
Alert(Lang::Strings::ERROR, buffer, "cloud_slash", Lang::Sounds::OGG_EXCLAMATION);
ESP_LOGW(TAG, "Check new version failed, retry in %d seconds (%d/%d)", retry_delay, retry_count, MAX_RETRY);
ESP_LOGW(TAG, "Check new version failed, retry in %d seconds (%d/%d)", retry_delay,
retry_count, MAX_RETRY);
for (int i = 0; i < retry_delay; i++) {
vTaskDelay(pdMS_TO_TICKS(1000));
if (GetDeviceState() == kDeviceStateIdle) {
break;
}
}
retry_delay *= 2; // Double the retry delay
retry_delay *= 2; // Double the retry delay
continue;
}
retry_count = 0;
retry_delay = 10; // Reset retry delay
retry_delay = 10; // Reset retry delay
if (ota_->HasNewVersion()) {
if (UpgradeFirmware(ota_->GetFirmwareUrl(), ota_->GetFirmwareVersion())) {
return; // This line will never be reached after reboot
return; // This line will never be reached after reboot
}
// If upgrade failed, continue to normal operation
}
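The retry loop in this hunk starts with a 10-second delay and doubles it after each failed version check. The sketch below lists the delays such a schedule produces; `BackoffDelays` is a hypothetical helper, and it leaves out the early exit that the real loop takes when the device returns to the idle state.

```cpp
#include <cassert>
#include <vector>

// Delays (in seconds) waited before each retry when every attempt fails,
// starting at `initial_delay` and doubling after each failure.
std::vector<int> BackoffDelays(int initial_delay, int failures) {
    assert(initial_delay > 0 && failures >= 0);
    std::vector<int> delays;
    int delay = initial_delay;
    for (int i = 0; i < failures; ++i) {
        delays.push_back(delay);
        delay *= 2;
    }
    return delays;
}
```

For example, `BackoffDelays(10, 4)` yields 10, 20, 40 and 80 seconds.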
@ -477,6 +496,9 @@ void Application::InitializeProtocol() {
display->SetStatus(Lang::Strings::LOADING_PROTOCOL);
#if CONFIG_USE_DIRECT_WEBSOCKET
protocol_ = std::make_unique<WebsocketProtocol>();
#else
if (ota_->HasMqttConfig()) {
protocol_ = std::make_unique<MqttProtocol>();
} else if (ota_->HasWebsocketConfig()) {
@ -485,52 +507,63 @@ void Application::InitializeProtocol() {
ESP_LOGW(TAG, "No protocol specified in the OTA config, using MQTT");
protocol_ = std::make_unique<MqttProtocol>();
}
#endif
protocol_->OnConnected([this]() {
DismissAlert();
});
protocol_->OnConnected([this]() { DismissAlert(); });
protocol_->OnNetworkError([this](const std::string& message) {
last_error_message_ = message;
xEventGroupSetBits(event_group_, MAIN_EVENT_ERROR);
});
protocol_->OnIncomingAudio([this](std::unique_ptr<AudioStreamPacket> packet) {
if (GetDeviceState() == kDeviceStateSpeaking) {
if (accepting_tts_audio_.load() || GetDeviceState() == kDeviceStateSpeaking) {
audio_service_.PushPacketToDecodeQueue(std::move(packet));
}
});
protocol_->OnAudioChannelOpened([this, codec, &board]() {
board.SetPowerSaveLevel(PowerSaveLevel::PERFORMANCE);
if (protocol_->server_sample_rate() != codec->output_sample_rate()) {
ESP_LOGW(TAG, "Server sample rate %d does not match device output sample rate %d, resampling may cause distortion",
protocol_->server_sample_rate(), codec->output_sample_rate());
ESP_LOGW(TAG,
"Server sample rate %d does not match device output sample rate %d, "
"resampling may cause distortion",
protocol_->server_sample_rate(), codec->output_sample_rate());
}
});
protocol_->OnAudioChannelClosed([this, &board]() {
board.SetPowerSaveLevel(PowerSaveLevel::LOW_POWER);
accepting_tts_audio_.store(false);
Schedule([this]() {
if (GetDeviceState() == kDeviceStateConnecting) {
return;
}
auto display = Board::GetInstance().GetDisplay();
display->SetChatMessage("system", "");
SetDeviceState(kDeviceStateIdle);
});
});
protocol_->OnIncomingJson([this, display](const cJSON* root) {
// Parse JSON data
auto type = cJSON_GetObjectItem(root, "type");
if (strcmp(type->valuestring, "tts") == 0) {
auto state = cJSON_GetObjectItem(root, "state");
if (strcmp(state->valuestring, "start") == 0) {
if (strcmp(state->valuestring, "thinking") == 0) {
Schedule([this]() { SetDeviceState(kDeviceStateThinking); });
} else if (strcmp(state->valuestring, "start") == 0) {
audio_service_.ResetDecoder();
accepting_tts_audio_.store(true);
Schedule([this]() {
aborted_ = false;
SetDeviceState(kDeviceStateSpeaking);
});
} else if (strcmp(state->valuestring, "stop") == 0) {
accepting_tts_audio_.store(false);
Schedule([this]() {
if (GetDeviceState() == kDeviceStateSpeaking) {
auto state = GetDeviceState();
if (state == kDeviceStateSpeaking || state == kDeviceStateThinking) {
if (listening_mode_ == kListeningModeManualStop) {
SetDeviceState(kDeviceStateIdle);
} else {
@ -573,9 +606,7 @@ void Application::InitializeProtocol() {
ESP_LOGI(TAG, "System command: %s", command->valuestring);
if (strcmp(command->valuestring, "reboot") == 0) {
// Do a reboot if user requests a OTA update
Schedule([this]() {
Reboot();
});
Schedule([this]() { Reboot(); });
} else {
ESP_LOGW(TAG, "Unknown system command: %s", command->valuestring);
}
@ -585,7 +616,8 @@ void Application::InitializeProtocol() {
auto message = cJSON_GetObjectItem(root, "message");
auto emotion = cJSON_GetObjectItem(root, "emotion");
if (cJSON_IsString(status) && cJSON_IsString(message) && cJSON_IsString(emotion)) {
Alert(status->valuestring, message->valuestring, emotion->valuestring, Lang::Sounds::OGG_VIBRATION);
Alert(status->valuestring, message->valuestring, emotion->valuestring,
Lang::Sounds::OGG_VIBRATION);
} else {
ESP_LOGW(TAG, "Alert command requires status, message and emotion");
}
@ -594,9 +626,10 @@ void Application::InitializeProtocol() {
auto payload = cJSON_GetObjectItem(root, "payload");
ESP_LOGI(TAG, "Received custom message: %s", cJSON_PrintUnformatted(root));
if (cJSON_IsObject(payload)) {
Schedule([this, display, payload_str = std::string(cJSON_PrintUnformatted(payload))]() {
display->SetChatMessage("system", payload_str.c_str());
});
Schedule(
[this, display, payload_str = std::string(cJSON_PrintUnformatted(payload))]() {
display->SetChatMessage("system", payload_str.c_str());
});
} else {
ESP_LOGW(TAG, "Invalid custom message format: missing payload");
}
@ -605,7 +638,7 @@ void Application::InitializeProtocol() {
ESP_LOGW(TAG, "Unknown message type: %s", type->valuestring);
}
});
protocol_->Start();
}
@ -614,32 +647,27 @@ void Application::ShowActivationCode(const std::string& code, const std::string&
char digit;
const std::string_view& sound;
};
static const std::array<digit_sound, 10> digit_sounds{{
digit_sound{'0', Lang::Sounds::OGG_0},
digit_sound{'1', Lang::Sounds::OGG_1},
digit_sound{'2', Lang::Sounds::OGG_2},
digit_sound{'3', Lang::Sounds::OGG_3},
digit_sound{'4', Lang::Sounds::OGG_4},
digit_sound{'5', Lang::Sounds::OGG_5},
digit_sound{'6', Lang::Sounds::OGG_6},
digit_sound{'7', Lang::Sounds::OGG_7},
digit_sound{'8', Lang::Sounds::OGG_8},
digit_sound{'9', Lang::Sounds::OGG_9}
}};
static const std::array<digit_sound, 10> digit_sounds{
{digit_sound{'0', Lang::Sounds::OGG_0}, digit_sound{'1', Lang::Sounds::OGG_1},
digit_sound{'2', Lang::Sounds::OGG_2}, digit_sound{'3', Lang::Sounds::OGG_3},
digit_sound{'4', Lang::Sounds::OGG_4}, digit_sound{'5', Lang::Sounds::OGG_5},
digit_sound{'6', Lang::Sounds::OGG_6}, digit_sound{'7', Lang::Sounds::OGG_7},
digit_sound{'8', Lang::Sounds::OGG_8}, digit_sound{'9', Lang::Sounds::OGG_9}}};
// This sentence uses 9KB of SRAM, so we need to wait for it to finish
Alert(Lang::Strings::ACTIVATION, message.c_str(), "link", Lang::Sounds::OGG_ACTIVATION);
for (const auto& digit : code) {
auto it = std::find_if(digit_sounds.begin(), digit_sounds.end(),
[digit](const digit_sound& ds) { return ds.digit == digit; });
[digit](const digit_sound& ds) { return ds.digit == digit; });
if (it != digit_sounds.end()) {
audio_service_.PlaySound(it->sound);
}
}
}
void Application::Alert(const char* status, const char* message, const char* emotion, const std::string_view& sound) {
void Application::Alert(const char* status, const char* message, const char* emotion,
const std::string_view& sound) {
ESP_LOGW(TAG, "Alert [%s] %s: %s", emotion, status, message);
auto display = Board::GetInstance().GetDisplay();
display->SetStatus(status);
@ -659,21 +687,44 @@ void Application::DismissAlert() {
}
}
void Application::ToggleChatState() {
void Application::ToggleChatState() { ToggleChatStateForMode(kChatAgentModeNormal, false); }
void Application::ToggleChatStateWithVision() {
ToggleChatStateForMode(kChatAgentModeNormal, true);
}
void Application::ToggleChatStateForMode(ChatAgentMode agent_mode, bool vision_enabled) {
chat_agent_mode_.store(agent_mode);
vision_text_mode_enabled_.store(vision_enabled);
vision_frame_sent_for_current_listen_.store(false);
xEventGroupSetBits(event_group_, MAIN_EVENT_TOGGLE_CHAT);
}
bool Application::IsVisionTextModeEnabled() const { return vision_text_mode_enabled_.load(); }
const char* Application::GetChatAgentModeName() const {
return chat_agent_mode_.load() == kChatAgentModeBeaver ? "beaver" : "normal";
}
const char* Application::GetChatModeName() const {
bool vision_enabled = vision_text_mode_enabled_.load();
if (chat_agent_mode_.load() == kChatAgentModeBeaver) {
return vision_enabled ? "vision-beaver" : "beaver";
}
return vision_enabled ? "vision-normal" : "normal";
}
void Application::StartListening() {
vision_text_mode_enabled_.store(false);
vision_frame_sent_for_current_listen_.store(false);
xEventGroupSetBits(event_group_, MAIN_EVENT_START_LISTENING);
}
void Application::StopListening() {
xEventGroupSetBits(event_group_, MAIN_EVENT_STOP_LISTENING);
}
void Application::StopListening() { xEventGroupSetBits(event_group_, MAIN_EVENT_STOP_LISTENING); }
void Application::HandleToggleChatEvent() {
auto state = GetDeviceState();
if (state == kDeviceStateActivating) {
SetDeviceState(kDeviceStateIdle);
return;
@ -694,17 +745,22 @@ void Application::HandleToggleChatEvent() {
if (state == kDeviceStateIdle) {
ListeningMode mode = GetDefaultListeningMode();
if (!protocol_->IsAudioChannelOpened()) {
bool agent_mode_changed = chat_agent_mode_.load() != active_chat_agent_mode_.load();
bool vision_mode_changed =
vision_text_mode_enabled_.load() != active_vision_text_mode_enabled_.load();
if (!protocol_->IsAudioChannelOpened() || agent_mode_changed || vision_mode_changed) {
if (protocol_->IsAudioChannelOpened()) {
protocol_->CloseAudioChannel();
}
SetDeviceState(kDeviceStateConnecting);
// Schedule to let the state change be processed first (UI update)
Schedule([this, mode]() {
ContinueOpenAudioChannel(mode);
});
Schedule([this, mode]() { ContinueOpenAudioChannel(mode); });
return;
}
SetListeningMode(mode);
} else if (state == kDeviceStateSpeaking) {
} else if (state == kDeviceStateSpeaking || state == kDeviceStateThinking) {
AbortSpeaking(kAbortReasonNone);
SetListeningMode(GetDefaultListeningMode());
} else if (state == kDeviceStateListening) {
protocol_->CloseAudioChannel();
}
@ -716,18 +772,24 @@ void Application::ContinueOpenAudioChannel(ListeningMode mode) {
return;
}
// Switch to performance mode before connecting to reduce latency
auto& board = Board::GetInstance();
board.SetPowerSaveLevel(PowerSaveLevel::PERFORMANCE);
if (!protocol_->IsAudioChannelOpened()) {
if (!protocol_->OpenAudioChannel()) {
return;
}
}
active_chat_agent_mode_.store(chat_agent_mode_.load());
active_vision_text_mode_enabled_.store(vision_text_mode_enabled_.load());
SetListeningMode(mode);
}
void Application::HandleStartListeningEvent() {
auto state = GetDeviceState();
if (state == kDeviceStateActivating) {
SetDeviceState(kDeviceStateIdle);
return;
@ -741,18 +803,16 @@ void Application::HandleStartListeningEvent() {
ESP_LOGE(TAG, "Protocol not initialized");
return;
}
if (state == kDeviceStateIdle) {
if (!protocol_->IsAudioChannelOpened()) {
SetDeviceState(kDeviceStateConnecting);
// Schedule to let the state change be processed first (UI update)
Schedule([this]() {
ContinueOpenAudioChannel(kListeningModeManualStop);
});
Schedule([this]() { ContinueOpenAudioChannel(kListeningModeManualStop); });
return;
}
SetListeningMode(kListeningModeManualStop);
} else if (state == kDeviceStateSpeaking) {
} else if (state == kDeviceStateSpeaking || state == kDeviceStateThinking) {
AbortSpeaking(kAbortReasonNone);
SetListeningMode(kListeningModeManualStop);
}
@ -760,7 +820,7 @@ void Application::HandleStartListeningEvent() {
void Application::HandleStopListeningEvent() {
auto state = GetDeviceState();
if (state == kDeviceStateAudioTesting) {
audio_service_.EnableAudioTesting(false);
SetDeviceState(kDeviceStateWifiConfiguring);
@ -790,17 +850,14 @@ void Application::HandleWakeWordDetectedEvent() {
SetDeviceState(kDeviceStateConnecting);
// Schedule to let the state change be processed first (UI update),
// then continue with OpenAudioChannel which may block for ~1 second
Schedule([this, wake_word]() {
ContinueWakeWordInvoke(wake_word);
});
Schedule([this, wake_word]() { ContinueWakeWordInvoke(wake_word); });
return;
}
// Channel already opened, continue directly
ContinueWakeWordInvoke(wake_word);
} else if (state == kDeviceStateSpeaking || state == kDeviceStateListening) {
} else if (state == kDeviceStateSpeaking || state == kDeviceStateThinking ||
state == kDeviceStateListening) {
AbortSpeaking(kAbortReasonWakeWordDetected);
// Clear send queue to avoid sending residues to server
while (audio_service_.PopPacketFromSendQueue());
if (state == kDeviceStateListening) {
protocol_->SendStartListening(GetDefaultListeningMode());
@ -825,6 +882,10 @@ void Application::ContinueWakeWordInvoke(const std::string& wake_word) {
return;
}
// Switch to performance mode before connecting to reduce latency
auto& board = Board::GetInstance();
board.SetPowerSaveLevel(PowerSaveLevel::PERFORMANCE);
if (!protocol_->IsAudioChannelOpened()) {
if (!protocol_->OpenAudioChannel()) {
audio_service_.EnableWakeWordDetection(true);
@ -857,13 +918,14 @@ void Application::HandleStateChangedEvent() {
auto display = board.GetDisplay();
auto led = board.GetLed();
led->OnStateChanged();
switch (new_state) {
case kDeviceStateUnknown:
case kDeviceStateIdle:
vision_frame_sent_for_current_listen_.store(false);
display->SetStatus(Lang::Strings::STANDBY);
display->ClearChatMessages(); // Clear messages first
display->SetEmotion("neutral"); // Then set emotion (wechat mode checks child count)
display->ClearChatMessages(); // Clear messages first
display->SetEmotion("neutral"); // Then set emotion (wechat mode checks child count)
audio_service_.EnableVoiceProcessing(false);
audio_service_.EnableWakeWordDetection(true);
break;
@ -873,21 +935,19 @@ void Application::HandleStateChangedEvent() {
display->SetChatMessage("system", "");
break;
case kDeviceStateListening:
vad_speaking_.store(false);
vision_frame_sent_for_current_listen_.store(false);
display->SetStatus(Lang::Strings::LISTENING);
display->SetEmotion("neutral");
// Make sure the audio processor is running
if (play_popup_on_listening_ || !audio_service_.IsAudioProcessorRunning()) {
// For auto mode, wait for playback queue to be empty before enabling voice processing
// This prevents audio truncation when STOP arrives late due to network jitter
if (listening_mode_ == kListeningModeAutoStop) {
audio_service_.WaitForPlaybackQueueEmpty();
}
// Send the start listening command
protocol_->SendStartListening(listening_mode_);
audio_service_.EnableVoiceProcessing(true);
// Re-entering listening after an interrupt must restart the capture path even if the
// processor task is still marked running, otherwise realtime mode can show Listening
// while no fresh mic frames are sent.
if (listening_mode_ == kListeningModeAutoStop) {
audio_service_.WaitForPlaybackQueueEmpty();
}
protocol_->SendStartListening(listening_mode_);
audio_service_.EnableVoiceProcessing(true);
#ifdef CONFIG_WAKE_WORD_DETECTION_IN_LISTENING
// Enable wake word detection in listening mode (configured via Kconfig)
@ -896,13 +956,23 @@ void Application::HandleStateChangedEvent() {
// Disable wake word detection in listening mode
audio_service_.EnableWakeWordDetection(false);
#endif
// Play popup sound after ResetDecoder (in EnableVoiceProcessing) has been called
if (play_popup_on_listening_) {
play_popup_on_listening_ = false;
audio_service_.PlaySound(Lang::Sounds::OGG_POPUP);
}
break;
case kDeviceStateThinking:
vad_speaking_.store(false);
display->SetStatus(Lang::Strings::THINKING);
display->SetEmotion("thinking");
if (listening_mode_ != kListeningModeRealtime) {
audio_service_.EnableVoiceProcessing(false);
audio_service_.EnableWakeWordDetection(audio_service_.IsAfeWakeWord());
}
break;
case kDeviceStateSpeaking:
display->SetStatus(Lang::Strings::SPEAKING);
@ -911,7 +981,9 @@ void Application::HandleStateChangedEvent() {
// Only AFE wake word can be detected in speaking mode
audio_service_.EnableWakeWordDetection(audio_service_.IsAfeWakeWord());
}
audio_service_.ResetDecoder();
if (!accepting_tts_audio_.load()) {
audio_service_.ResetDecoder();
}
break;
case kDeviceStateWifiConfiguring:
audio_service_.EnableVoiceProcessing(false);
@ -923,6 +995,27 @@ void Application::HandleStateChangedEvent() {
}
}
bool Application::SendCurrentVisionFrame() {
if (!protocol_ || !protocol_->IsAudioChannelOpened()) {
return false;
}
auto camera = Board::GetInstance().GetCamera();
if (camera == nullptr) {
return false;
}
std::string jpeg_data;
if (!camera->CaptureToJpeg(jpeg_data, true)) {
ESP_LOGW(TAG, "Failed to capture vision frame");
return false;
}
protocol_->SendVisionFrame(jpeg_data);
ESP_LOGI(TAG, "Sent vision frame, size=%u bytes", static_cast<unsigned>(jpeg_data.size()));
return true;
}
void Application::Schedule(std::function<void()>&& callback) {
{
std::lock_guard<std::mutex> lock(mutex_);
@ -934,6 +1027,8 @@ void Application::Schedule(std::function<void()>&& callback) {
void Application::AbortSpeaking(AbortReason reason) {
ESP_LOGI(TAG, "Abort speaking");
aborted_ = true;
accepting_tts_audio_.store(false);
audio_service_.ResetDecoder();
if (protocol_) {
protocol_->SendAbortSpeaking(reason);
}
@ -941,6 +1036,8 @@ void Application::AbortSpeaking(AbortReason reason) {
void Application::SetListeningMode(ListeningMode mode) {
listening_mode_ = mode;
vad_speaking_.store(false);
vision_frame_sent_for_current_listen_.store(false);
SetDeviceState(kDeviceStateListening);
}
@ -975,7 +1072,8 @@ bool Application::UpgradeFirmware(const std::string& url, const std::string& ver
}
ESP_LOGI(TAG, "Starting firmware upgrade from URL: %s", upgrade_url.c_str());
Alert(Lang::Strings::OTA_UPGRADE, Lang::Strings::UPGRADING, "download", Lang::Sounds::OGG_UPGRADE);
Alert(Lang::Strings::OTA_UPGRADE, Lang::Strings::UPGRADING, "download",
Lang::Sounds::OGG_UPGRADE);
vTaskDelay(pdMS_TO_TICKS(3000));
SetDeviceState(kDeviceStateUpgrading);
@ -997,17 +1095,19 @@ bool Application::UpgradeFirmware(const std::string& url, const std::string& ver
if (!upgrade_success) {
// Upgrade failed, restart audio service and continue running
ESP_LOGE(TAG, "Firmware upgrade failed, restarting audio service and continuing operation...");
audio_service_.Start(); // Restart audio service
board.SetPowerSaveLevel(PowerSaveLevel::LOW_POWER); // Restore power save level
Alert(Lang::Strings::ERROR, Lang::Strings::UPGRADE_FAILED, "circle_xmark", Lang::Sounds::OGG_EXCLAMATION);
ESP_LOGE(TAG,
"Firmware upgrade failed, restarting audio service and continuing operation...");
audio_service_.Start(); // Restart audio service
board.SetPowerSaveLevel(PowerSaveLevel::LOW_POWER); // Restore power save level
Alert(Lang::Strings::ERROR, Lang::Strings::UPGRADE_FAILED, "circle_xmark",
Lang::Sounds::OGG_EXCLAMATION);
vTaskDelay(pdMS_TO_TICKS(3000));
return false;
} else {
// Upgrade success, reboot immediately
ESP_LOGI(TAG, "Firmware upgrade successful, rebooting...");
display->SetChatMessage("system", "Upgrade successful, rebooting...");
vTaskDelay(pdMS_TO_TICKS(1000)); // Brief pause to show message
vTaskDelay(pdMS_TO_TICKS(1000)); // Brief pause to show message
Reboot();
return true;
}
@ -1019,25 +1119,21 @@ void Application::WakeWordInvoke(const std::string& wake_word) {
}
auto state = GetDeviceState();
if (state == kDeviceStateIdle) {
audio_service_.EncodeWakeWord();
if (!protocol_->IsAudioChannelOpened()) {
SetDeviceState(kDeviceStateConnecting);
// Schedule to let the state change be processed first (UI update)
Schedule([this, wake_word]() {
ContinueWakeWordInvoke(wake_word);
});
Schedule([this, wake_word]() { ContinueWakeWordInvoke(wake_word); });
return;
}
// Channel already opened, continue directly
ContinueWakeWordInvoke(wake_word);
} else if (state == kDeviceStateSpeaking) {
Schedule([this]() {
AbortSpeaking(kAbortReasonNone);
});
} else if (state == kDeviceStateListening) {
} else if (state == kDeviceStateSpeaking || state == kDeviceStateThinking) {
Schedule([this]() { AbortSpeaking(kAbortReasonNone); });
} else if (state == kDeviceStateListening) {
Schedule([this]() {
if (protocol_) {
protocol_->CloseAudioChannel();
@ -1063,12 +1159,19 @@ bool Application::CanEnterSleepMode() {
return true;
}
void Application::RegisterMcpBroadcastCallback(std::function<void(const std::string&)> callback) {
mcp_broadcast_callback_ = std::move(callback);
}
void Application::SendMcpMessage(const std::string& payload) {
// Always schedule to run in main task for thread safety
Schedule([this, payload = std::move(payload)]() {
Schedule([this, payload]() {
if (protocol_) {
protocol_->SendMcpMessage(payload);
}
if (mcp_broadcast_callback_) {
mcp_broadcast_callback_(payload);
}
});
}
@ -1078,18 +1181,18 @@ void Application::SetAecMode(AecMode mode) {
auto& board = Board::GetInstance();
auto display = board.GetDisplay();
switch (aec_mode_) {
case kAecOff:
audio_service_.EnableDeviceAec(false);
display->ShowNotification(Lang::Strings::RTC_MODE_OFF);
break;
case kAecOnServerSide:
audio_service_.EnableDeviceAec(false);
display->ShowNotification(Lang::Strings::RTC_MODE_ON);
break;
case kAecOnDeviceSide:
audio_service_.EnableDeviceAec(true);
display->ShowNotification(Lang::Strings::RTC_MODE_ON);
break;
case kAecOff:
audio_service_.EnableDeviceAec(false);
display->ShowNotification(Lang::Strings::RTC_MODE_OFF);
break;
case kAecOnServerSide:
audio_service_.EnableDeviceAec(false);
display->ShowNotification(Lang::Strings::RTC_MODE_ON);
break;
case kAecOnDeviceSide:
audio_service_.EnableDeviceAec(true);
display->ShowNotification(Lang::Strings::RTC_MODE_ON);
break;
}
// If the AEC mode is changed, close the audio channel
@ -1099,9 +1202,7 @@ void Application::SetAecMode(AecMode mode) {
});
}
void Application::PlaySound(const std::string_view& sound) {
audio_service_.PlaySound(sound);
}
void Application::PlaySound(const std::string_view& sound) { audio_service_.PlaySound(sound); }
void Application::ResetProtocol() {
Schedule([this]() {
@ -1113,4 +1214,3 @@ void Application::ResetProtocol() {
protocol_.reset();
});
}


@ -10,6 +10,8 @@
#include <mutex>
#include <deque>
#include <memory>
#include <functional>
#include <atomic>
#include "protocol.h"
#include "ota.h"
@ -39,6 +41,11 @@ enum AecMode {
kAecOnServerSide,
};
enum ChatAgentMode {
kChatAgentModeNormal,
kChatAgentModeBeaver,
};
class Application {
public:
static Application& GetInstance() {
@ -90,6 +97,12 @@ public:
* Sends MAIN_EVENT_TOGGLE_CHAT to be handled in Run()
*/
void ToggleChatState();
void ToggleChatStateWithVision();
void ToggleChatStateForMode(ChatAgentMode agent_mode, bool vision_enabled);
bool IsVisionTextModeEnabled() const;
ChatAgentMode GetChatAgentMode() const { return chat_agent_mode_.load(); }
const char* GetChatAgentModeName() const;
const char* GetChatModeName() const;
/**
* Start listening (event-based, thread-safe)
@ -108,6 +121,7 @@ public:
bool UpgradeFirmware(const std::string& url, const std::string& version = "");
bool CanEnterSleepMode();
void SendMcpMessage(const std::string& payload);
void RegisterMcpBroadcastCallback(std::function<void(const std::string&)> callback);
void SetAecMode(AecMode mode);
AecMode GetAecMode() const { return aec_mode_; }
void PlaySound(const std::string_view& sound);
@ -136,10 +150,19 @@ private:
AudioService audio_service_;
std::unique_ptr<Ota> ota_;
std::function<void(const std::string&)> mcp_broadcast_callback_;
bool has_server_time_ = false;
bool aborted_ = false;
bool assets_version_checked_ = false;
bool play_popup_on_listening_ = false; // Flag to play popup sound after state changes to listening
std::atomic<ChatAgentMode> chat_agent_mode_ = kChatAgentModeNormal;
std::atomic<ChatAgentMode> active_chat_agent_mode_ = kChatAgentModeNormal;
std::atomic<bool> vision_text_mode_enabled_ = false;
std::atomic<bool> active_vision_text_mode_enabled_ = false;
std::atomic<bool> vad_speaking_ = false;
std::atomic<bool> vision_frame_sent_for_current_listen_ = false;
std::atomic<bool> accepting_tts_audio_ = false;
int clock_ticks_ = 0;
TaskHandle_t activation_task_handle_ = nullptr;
@ -155,6 +178,7 @@ private:
void HandleWakeWordDetectedEvent();
void ContinueOpenAudioChannel(ListeningMode mode);
void ContinueWakeWordInvoke(const std::string& wake_word);
bool SendCurrentVisionFrame();
// Activation task (runs in background)
void ActivationTask();
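Review note: `SendMcpMessage()` now copies the payload into a scheduled closure and, on the main task, forwards it both to the protocol and to the optional `mcp_broadcast_callback_`. The sketch below reproduces only the scheduling and broadcast part in portable C++. `MainLoop`, `RunPending` and `RegisterBroadcast` are hypothetical names, and the real `Schedule()` also signals a FreeRTOS event group, which is omitted here.

```cpp
#include <deque>
#include <functional>
#include <mutex>
#include <string>
#include <utility>

class MainLoop {
public:
    // Queue a task to be run later by the loop's owner.
    void Schedule(std::function<void()> task) {
        std::lock_guard<std::mutex> lock(mutex_);
        tasks_.push_back(std::move(task));
    }

    void RegisterBroadcast(std::function<void(const std::string&)> callback) {
        broadcast_ = std::move(callback);
    }

    // Copy the payload into the closure so it outlives the caller's string.
    void SendMessage(const std::string& payload) {
        Schedule([this, payload]() {
            if (broadcast_) {
                broadcast_(payload);
            }
        });
    }

    // Run everything queued so far; tasks run outside the lock.
    void RunPending() {
        std::deque<std::function<void()>> pending;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending.swap(tasks_);
        }
        for (auto& task : pending) {
            task();
        }
    }

private:
    std::mutex mutex_;
    std::deque<std::function<void()>> tasks_;
    std::function<void(const std::string&)> broadcast_;
};
```

The callback member is not protected by the mutex in this sketch, so it assumes registration happens before any message is sent. Whether the real code needs the same guarantee depends on when `RegisterMcpBroadcastCallback()` is called.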


@ -26,6 +26,7 @@
"CONNECTION_SUCCESSFUL": "Connection Successful",
"CONNECTED_TO": "Connected to ",
"LISTENING": "Listening...",
"THINKING": "Thinking...",
"SPEAKING": "Speaking...",
"SERVER_NOT_FOUND": "Looking for available service",
"SERVER_NOT_CONNECTED": "Unable to connect to service, please try again later",
@ -56,4 +57,4 @@
"LOADING_ASSETS": "Loading assets...",
"HELLO_MY_FRIEND": "Hello, my friend!"
}
}
}


@ -23,6 +23,7 @@
"CONNECTING": "连接中...",
"CONNECTED_TO": "已连接 ",
"LISTENING": "聆听中...",
"THINKING": "思考中...",
"SPEAKING": "说话中...",
"SERVER_NOT_FOUND": "正在寻找可用服务",
"SERVER_NOT_CONNECTED": "无法连接服务,请稍后再试",
@ -56,4 +57,4 @@
"FLIGHT_MODE_OFF": "飞行模式已关闭",
"FLIGHT_MODE_ON": "飞行模式已开启"
}
}
}


@ -579,6 +579,7 @@ void AudioService::EnableWakeWordDetection(bool enable) {
void AudioService::EnableVoiceProcessing(bool enable) {
ESP_LOGD(TAG, "%s voice processing", enable ? "Enabling" : "Disabling");
if (enable) {
bool was_running = IsAudioProcessorRunning();
if (!audio_processor_initialized_) {
audio_processor_->Initialize(codec_, OPUS_FRAME_DURATION_MS, models_list_);
audio_processor_initialized_ = true;
@ -586,7 +587,7 @@ void AudioService::EnableVoiceProcessing(bool enable) {
/* We should make sure no audio is playing */
ResetDecoder();
audio_input_need_warmup_ = true;
audio_input_need_warmup_ = !was_running;
// Reset input resampler to clear cached data from previous mode (e.g. WakeWord)
// This prevents buffer overflow when switching between different feed sizes
{


@ -2,6 +2,7 @@
#include <esp_log.h>
#include <cmath>
#include <cstdint>
#include <cstring>
#define TAG "NoAudioCodec"
@ -239,10 +240,10 @@ int NoAudioCodec::Write(const int16_t* data, int samples) {
int NoAudioCodec::Read(int16_t* dest, int samples) {
size_t bytes_read;
constexpr TickType_t kReadTimeoutTicks = pdMS_TO_TICKS(200);
constexpr uint32_t kReadTimeoutMs = 200;
std::vector<int32_t> bit32_buffer(samples);
if (i2s_channel_read(rx_handle_, bit32_buffer.data(), samples * sizeof(int32_t), &bytes_read, kReadTimeoutTicks) != ESP_OK) {
if (i2s_channel_read(rx_handle_, bit32_buffer.data(), samples * sizeof(int32_t), &bytes_read, kReadTimeoutMs) != ESP_OK) {
return 0;
}
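Review note: ESP-IDF declares the last parameter of `i2s_channel_read` as `uint32_t timeout_ms`, so passing a tick count shortens the timeout whenever the tick rate is below 1000 Hz. The sketch below reproduces the `pdMS_TO_TICKS` arithmetic with the tick rate as an explicit argument. `MsToTicks` is a hypothetical helper and the 100 Hz and 1000 Hz rates are example values, not taken from this project's configuration.

```cpp
#include <cstdint>

// Same arithmetic as FreeRTOS pdMS_TO_TICKS, but with the tick rate passed in
// instead of read from configTICK_RATE_HZ.
constexpr uint32_t MsToTicks(uint32_t ms, uint32_t tick_rate_hz) {
    return static_cast<uint32_t>(
        (static_cast<uint64_t>(ms) * tick_rate_hz) / 1000U);
}

// At 100 Hz, 200 ms is 20 ticks. Handing that 20 to a parameter that expects
// milliseconds gives a 20 ms timeout instead of 200 ms.
static_assert(MsToTicks(200, 100) == 20, "200 ms at 100 Hz is 20 ticks");

// At 1000 Hz the two units coincide, which can hide the mistake.
static_assert(MsToTicks(200, 1000) == 200, "200 ms at 1000 Hz is 200 ticks");
```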


@ -0,0 +1,177 @@
#include "background_capture_service.h"
#include "board.h"
#include "camera.h"
#include <algorithm>
#include <esp_heap_caps.h>
#include <esp_log.h>
#define TAG "BgCapture"
BackgroundCaptureService::BackgroundCaptureService() = default;
BackgroundCaptureService::~BackgroundCaptureService() {
Stop();
}
void BackgroundCaptureService::Start() {
#if CONFIG_BACKGROUND_CAPTURE_ENABLE
if (running_.exchange(true)) {
return;
}
auto result = xTaskCreate(
&BackgroundCaptureService::TaskEntry,
"bg_capture",
CONFIG_BACKGROUND_CAPTURE_TASK_STACK_SIZE,
this,
CONFIG_BACKGROUND_CAPTURE_TASK_PRIORITY,
&task_handle_);
if (result != pdPASS) {
running_.store(false);
task_handle_ = nullptr;
ESP_LOGE(TAG, "Failed to create background capture task");
}
#endif
}
void BackgroundCaptureService::Stop() {
#if CONFIG_BACKGROUND_CAPTURE_ENABLE
if (!running_.exchange(false)) {
return;
}
while (task_handle_ != nullptr) {
vTaskDelay(pdMS_TO_TICKS(20));
}
#endif
}
void BackgroundCaptureService::TaskEntry(void* arg) {
#if CONFIG_BACKGROUND_CAPTURE_ENABLE
auto* service = static_cast<BackgroundCaptureService*>(arg);
service->Run();
service->task_handle_ = nullptr;
#else
(void)arg;
#endif
vTaskDelete(nullptr);
}
void BackgroundCaptureService::Run() {
#if CONFIG_BACKGROUND_CAPTURE_ENABLE
ESP_LOGI(TAG, "Background capture task started");
while (running_.load()) {
if (!CaptureAndSendFrame()) {
consecutive_failures_++;
auto delay_ms = GetFailureDelayMs();
ESP_LOGW(TAG, "Background capture retry in %u ms, failures=%u",
delay_ms, consecutive_failures_);
vTaskDelay(pdMS_TO_TICKS(delay_ms));
continue;
}
consecutive_failures_ = 0;
vTaskDelay(pdMS_TO_TICKS(CONFIG_BACKGROUND_CAPTURE_FRAME_INTERVAL_MS));
}
ESP_LOGI(TAG, "Background capture task stopped");
#endif
}
bool BackgroundCaptureService::CaptureAndSendFrame() {
#if CONFIG_BACKGROUND_CAPTURE_ENABLE
const size_t free_internal_heap = heap_caps_get_free_size(MALLOC_CAP_INTERNAL);
if (free_internal_heap < CONFIG_BACKGROUND_CAPTURE_MIN_FREE_INTERNAL_HEAP) {
ESP_LOGW(TAG, "Skip background capture, low internal heap: free=%u threshold=%u",
static_cast<unsigned>(free_internal_heap),
static_cast<unsigned>(CONFIG_BACKGROUND_CAPTURE_MIN_FREE_INTERNAL_HEAP));
return false;
}
auto camera = Board::GetInstance().GetCamera();
if (camera == nullptr) {
ESP_LOGW(TAG, "No camera available for background capture");
return false;
}
std::string jpeg_data;
if (!camera->CaptureToJpeg(jpeg_data, false)) {
ESP_LOGW(TAG, "Failed to capture background frame");
return false;
}
if (jpeg_data.empty()) {
ESP_LOGW(TAG, "Captured empty background frame");
return false;
}
return UploadJpegFrame(jpeg_data);
#else
return false;
#endif
}
uint32_t BackgroundCaptureService::GetFailureDelayMs() const {
#if CONFIG_BACKGROUND_CAPTURE_ENABLE
const uint32_t base_delay_ms = CONFIG_BACKGROUND_CAPTURE_RETRY_INTERVAL_MS;
const uint32_t max_delay_ms = CONFIG_BACKGROUND_CAPTURE_MAX_BACKOFF_MS;
const uint32_t shift = std::min<uint32_t>(consecutive_failures_ - 1, 4);
return std::min<uint32_t>(base_delay_ms << shift, max_delay_ms);
#else
return 0;
#endif
}
bool BackgroundCaptureService::UploadJpegFrame(const std::string& jpeg_data) {
#if CONFIG_BACKGROUND_CAPTURE_ENABLE
const std::string url = CONFIG_BACKGROUND_CAPTURE_UPLOAD_URL;
if (url.empty()) {
ESP_LOGI(TAG, "Captured background frame: %u bytes", static_cast<unsigned>(jpeg_data.size()));
return true;
}
auto network = Board::GetInstance().GetNetwork();
if (network == nullptr) {
ESP_LOGW(TAG, "No network available for background upload");
return false;
}
const std::string boundary = "----XIAOZHI_BACKGROUND_CAPTURE_BOUNDARY";
auto http = network->CreateHttp(3);
http->SetHeader("Content-Type", "multipart/form-data; boundary=" + boundary);
if (!http->Open("POST", url)) {
ESP_LOGW(TAG, "Failed to open background upload URL: %s", url.c_str());
return false;
}
std::string file_header;
file_header += "--" + boundary + "\r\n";
file_header += "Content-Disposition: form-data; name=\"file\"; filename=\"frame.jpg\"\r\n";
file_header += "Content-Type: image/jpeg\r\n\r\n";
http->Write(file_header.c_str(), file_header.size());
http->Write(jpeg_data.data(), jpeg_data.size());
std::string footer;
footer += "\r\n--" + boundary + "--\r\n";
http->Write(footer.c_str(), footer.size());
http->Write("", 0);
const int status_code = http->GetStatusCode();
http->Close();
if (status_code < 200 || status_code >= 300) {
ESP_LOGW(TAG, "Background upload failed, status=%d", status_code);
return false;
}
ESP_LOGI(TAG, "Uploaded background frame: %u bytes", static_cast<unsigned>(jpeg_data.size()));
return true;
#else
(void)jpeg_data;
return false;
#endif
}
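Review note: `UploadJpegFrame()` streams a single-part `multipart/form-data` body in three writes (part header, JPEG bytes, closing boundary). The sketch below builds the same byte layout into one string so it can be inspected off-device. `BuildMultipartBody` and its parameters are hypothetical and not part of the change.

```cpp
#include <string>

// Assemble a one-part multipart/form-data body: part header, raw payload,
// then the closing boundary. The payload may contain NUL bytes.
std::string BuildMultipartBody(const std::string& boundary,
                               const std::string& field_name,
                               const std::string& filename,
                               const std::string& content_type,
                               const std::string& payload) {
    std::string body;
    body += "--" + boundary + "\r\n";
    body += "Content-Disposition: form-data; name=\"" + field_name +
            "\"; filename=\"" + filename + "\"\r\n";
    body += "Content-Type: " + content_type + "\r\n\r\n";
    body += payload;
    body += "\r\n--" + boundary + "--\r\n";
    return body;
}
```

Building the whole body in memory doubles the peak RAM used for a frame, so on the device the three separate `Write()` calls in the diff are likely the better fit. The helper is mainly useful for host-side checks of the framing.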


@ -0,0 +1,32 @@
#ifndef BACKGROUND_CAPTURE_SERVICE_H
#define BACKGROUND_CAPTURE_SERVICE_H
#include <freertos/FreeRTOS.h>
#include <freertos/task.h>
#include <atomic>
#include <cstdint>
#include <string>
class BackgroundCaptureService {
public:
BackgroundCaptureService();
~BackgroundCaptureService();
void Start();
void Stop();
bool IsRunning() const { return running_.load(); }
private:
TaskHandle_t task_handle_ = nullptr;
std::atomic<bool> running_ = false;
uint32_t consecutive_failures_ = 0;
static void TaskEntry(void* arg);
void Run();
bool CaptureAndSendFrame();
bool UploadJpegFrame(const std::string& jpeg_data);
uint32_t GetFailureDelayMs() const;
};
#endif // BACKGROUND_CAPTURE_SERVICE_H
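Review note: `GetFailureDelayMs()` doubles the retry delay per consecutive failure, caps the shift at 4 and clamps the result to a maximum. The standalone sketch below shows the same calculation with the Kconfig values passed in as parameters. `BackoffDelayMs` is a hypothetical name, and the zero-failure guard and 64-bit intermediate are additions that the original does not have (it is only called after the counter has been incremented).

```cpp
#include <algorithm>
#include <cstdint>

// Capped exponential backoff: base, 2x, 4x, 8x, 16x, then clamped to max_ms.
uint32_t BackoffDelayMs(uint32_t failures, uint32_t base_ms, uint32_t max_ms) {
    if (failures == 0) {
        return 0;  // Added guard: avoids unsigned wrap-around in failures - 1.
    }
    const uint32_t shift = std::min<uint32_t>(failures - 1, 4);
    // Widen before shifting so a large base cannot overflow 32 bits.
    const uint64_t delay = static_cast<uint64_t>(base_ms) << shift;
    return static_cast<uint32_t>(std::min<uint64_t>(delay, max_ms));
}
```

With a base of 1000 ms and a cap of 10000 ms this yields 1000, 2000, 4000, 8000 and then 10000 for every later failure.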


@ -214,6 +214,9 @@ public:
case kDeviceStateSpeaking:
ctrl_->SetStatusColor(64, 0, 0); // red
break;
case kDeviceStateThinking:
ctrl_->SetStatusColor(0, 0, 64); // blue
break;
default:
ctrl_->SetStatusColor(0, 0, 64); // blue
break;


@ -533,13 +533,13 @@ int Blufi::_get_softap_conn_num() {
return 0;
}
void Blufi::start_wifi_scan() {
bool Blufi::start_wifi_scan() {
ESP_LOGI(BLUFI_TAG, "Starting dedicated WiFi scan");
// Check if a scan is already in progress
// Already running: caller can rely on the in-flight scan and await its done event.
if (m_scan_in_progress) {
ESP_LOGW(BLUFI_TAG, "Scan already in progress, skipping");
return;
return true;
}
m_scan_in_progress = true;
@ -555,14 +555,14 @@ void Blufi::start_wifi_scan() {
if (err != ESP_OK) {
ESP_LOGE(BLUFI_TAG, "Failed to set WiFi mode to STA: %s", esp_err_to_name(err));
m_scan_in_progress = false;
return;
return false;
}
// Need to restart WiFi for mode change to take effect
err = esp_wifi_start();
if (err != ESP_OK) {
ESP_LOGE(BLUFI_TAG, "Failed to start WiFi after mode switch: %s", esp_err_to_name(err));
m_scan_in_progress = false;
return;
return false;
}
// Register scan event handler
esp_event_handler_instance_t scan_event_instance;
@ -575,28 +575,36 @@ void Blufi::start_wifi_scan() {
if (err != ESP_OK) {
ESP_LOGE(BLUFI_TAG, "Failed to start WiFi scan: %s", esp_err_to_name(err));
m_scan_in_progress = false;
return;
return false;
}
} else if (current_mode == WIFI_MODE_STA || current_mode == WIFI_MODE_APSTA) {
// Ensure WiFi driver is started (may have been stopped during config mode transition)
err = esp_wifi_start();
if (err != ESP_OK && err != ESP_ERR_WIFI_STATE) {
ESP_LOGE(BLUFI_TAG, "Failed to start WiFi before scan: %s", esp_err_to_name(err));
m_scan_in_progress = false;
return false;
}
} else if (current_mode == WIFI_MODE_STA) {
// Start scan
err = esp_wifi_scan_start(NULL, false);
if (err != ESP_OK) {
ESP_LOGE(BLUFI_TAG, "Failed to start WiFi scan: %s", esp_err_to_name(err));
m_scan_in_progress = false;
return;
return false;
}
} else {
ESP_LOGE(BLUFI_TAG, "Unexpected WiFi mode: %d", current_mode);
m_scan_in_progress = false;
return;
return false;
}
ESP_LOGI(BLUFI_TAG, "WiFi scan started");
return true;
}
void Blufi::_send_wifi_list() {
if (m_ap_records.empty()) {
ESP_LOGW(BLUFI_TAG, "No AP records available to send");
ESP_LOGW(BLUFI_TAG, "No AP records available, sending WiFi scan fail");
esp_blufi_send_error_info(ESP_BLUFI_WIFI_SCAN_FAIL);
return;
}
@ -631,7 +639,7 @@ void Blufi::_wifi_scan_event_handler(void* arg, esp_event_base_t event_base, int
ESP_LOGW(BLUFI_TAG, "No APs found");
self->m_ap_records.clear();
} else {
if (static_cast<Blufi*>(arg)->m_scan_should_save_ssid == true) {
if (self->m_scan_should_save_ssid) {
self->m_ap_records.resize(ap_num);
esp_wifi_scan_get_ap_records(&ap_num, self->m_ap_records.data());
@ -643,6 +651,11 @@ void Blufi::_wifi_scan_event_handler(void* arg, esp_event_base_t event_base, int
}
}
self->m_scan_in_progress = false;
// Dispatch a pending GET_WIFI_LIST response if one is waiting on this scan.
if (self->m_send_list_after_scan) {
self->m_send_list_after_scan = false;
self->_send_wifi_list();
}
}
}
@ -865,10 +878,29 @@ void Blufi::_handle_event(esp_blufi_cb_event_t event, esp_blufi_cb_param_t* para
break;
case ESP_BLUFI_EVENT_GET_WIFI_LIST: {
ESP_LOGI(BLUFI_TAG, "BLUFI get wifi list");
while (m_scan_in_progress) {
vTaskDelay(pdMS_TO_TICKS(500));
// Case 1: a scan is already in flight (init scan or refresh scan started by
// the previous _send_wifi_list()). Defer the response to its done handler
// instead of blocking the BluFi task.
if (m_scan_in_progress) {
m_send_list_after_scan = true;
break;
}
// Case 2: cache is populated. Respond immediately; _send_wifi_list() also
// kicks off an async refresh scan to keep the cache fresh.
if (!m_ap_records.empty()) {
_send_wifi_list();
break;
}
// Case 3: no cache (e.g. driver was stopped during a config-mode transition,
// init scan never completed). Trigger a real scan and dispatch from the
// scan-done handler. If the scan cannot start, return an error frame so the
// App exits its wait state instead of timing out.
m_scan_should_save_ssid = true;
m_send_list_after_scan = true;
if (!start_wifi_scan()) {
m_send_list_after_scan = false;
esp_blufi_send_error_info(ESP_BLUFI_WIFI_SCAN_FAIL);
}
_send_wifi_list();
break;
}
default:


@ -25,8 +25,9 @@ public:
* This method intelligently handles WiFi scanning based on current WiFi state:
* - If WiFi config mode is active, it uses the existing scan results from WifiConfigurationAp
* - Otherwise, it performs a dedicated scan without interfering with normal WiFi operations
* @return true if a scan was started (or was already in progress); false on failure.
*/
void start_wifi_scan();
bool start_wifi_scan();
/**
* @brief Initializes the Bluetooth controller, host, and Blufi profile.
@ -143,5 +144,13 @@ private:
// WiFi scan related
std::vector<wifi_ap_record_t> m_ap_records;
bool m_scan_in_progress = false;
// When true, scan results are stored in m_ap_records on scan completion.
// Cleared during connect-to-AP so that the connect-time scan does not
// overwrite the cache with results gathered for connection purposes.
bool m_scan_should_save_ssid = true;
// When true, the next scan-done event responds to a pending GET_WIFI_LIST
// request from the App. Set by the GET_WIFI_LIST handler when no cache is
// available or a scan is already in flight; cleared by the scan-done
// handler after dispatching the response.
bool m_send_list_after_scan = false;
};
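Review note: the `GET_WIFI_LIST` handler no longer blocks the BluFi task. It either defers to an in-flight scan, answers from the cache, or starts a scan and lets the scan-done handler reply. The sketch below models those three cases as plain state transitions. `ScanState`, `ListAction`, `OnGetWifiList` and `OnScanDone` are hypothetical names, and unlike the diff the sketch sets the in-progress flag in the handler rather than inside `start_wifi_scan()`.

```cpp
enum class ListAction { Deferred, SentFromCache, ScanStarted, ScanFailed };

struct ScanState {
    bool scan_in_progress = false;
    bool send_list_after_scan = false;
    bool cache_populated = false;
};

// start_scan is any callable returning true when a scan was started.
template <typename StartScan>
ListAction OnGetWifiList(ScanState& state, StartScan start_scan) {
    if (state.scan_in_progress) {
        // Case 1: reply from the scan-done handler instead of waiting here.
        state.send_list_after_scan = true;
        return ListAction::Deferred;
    }
    if (state.cache_populated) {
        // Case 2: reply immediately from cached results.
        return ListAction::SentFromCache;
    }
    // Case 3: nothing cached, so start a scan and reply when it finishes.
    state.send_list_after_scan = true;
    if (!start_scan()) {
        state.send_list_after_scan = false;
        return ListAction::ScanFailed;  // Caller reports the error right away.
    }
    state.scan_in_progress = true;
    return ListAction::ScanStarted;
}

// Returns true when the caller should send the list now.
bool OnScanDone(ScanState& state, bool found_aps) {
    state.scan_in_progress = false;
    state.cache_populated = found_aps;
    if (state.send_list_after_scan) {
        state.send_list_after_scan = false;
        return true;
    }
    return false;
}
```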


@ -7,6 +7,8 @@ class Camera {
public:
virtual void SetExplainUrl(const std::string& url, const std::string& token) = 0;
virtual bool Capture() = 0;
virtual bool CaptureBackground() { return Capture(); }
virtual bool CaptureToJpeg(std::string& jpeg_data, bool show_preview = false) { return false; }
virtual bool SetHMirror(bool enabled) = 0;
virtual bool SetVFlip(bool enabled) = 0;
virtual bool SetSwapBytes(bool enabled) { return false; } // Optional, default no-op


@ -24,6 +24,7 @@
#include "lvgl_display.h"
#include "mcp_server.h"
#include "system_info.h"
#include "esp_timer.h"
#ifdef CONFIG_XIAOZHI_ENABLE_CAMERA_DEBUG_MODE
#undef LOG_LOCAL_LEVEL
@ -55,6 +56,7 @@
#define TAG "EspVideo"
#define FOREGROUND_CAPTURE_PROTECTION_US (10 * 1000 * 1000)
#if defined(CONFIG_CAMERA_SENSOR_SWAP_PIXEL_BYTE_ORDER) || defined(CONFIG_XIAOZHI_ENABLE_CAMERA_ENDIANNESS_SWAP)
#warning \
@ -381,11 +383,47 @@ EspVideo::~EspVideo() {
}
void EspVideo::SetExplainUrl(const std::string& url, const std::string& token) {
std::lock_guard<std::mutex> lock(frame_mutex_);
explain_url_ = url;
explain_token_ = token;
}
bool EspVideo::Capture() {
return CaptureFrame(true);
}
bool EspVideo::CaptureBackground() {
return CaptureFrame(false);
}
bool EspVideo::CaptureToJpeg(std::string& jpeg_data, bool show_preview) {
jpeg_data.clear();
if (!CaptureFrame(show_preview)) {
return false;
}
std::lock_guard<std::mutex> lock(frame_mutex_);
if (frame_.data == nullptr || frame_.len == 0) {
return false;
}
uint16_t w = frame_.width ? frame_.width : 320;
uint16_t h = frame_.height ? frame_.height : 240;
return image_to_jpeg_cb(
frame_.data, frame_.len, w, h, frame_.format, 60,
[](void* arg, size_t index, const void* data, size_t len) -> size_t {
auto jpeg_data = static_cast<std::string*>(arg);
if (data != nullptr && len > 0) {
jpeg_data->append(static_cast<const char*>(data), len);
}
return len;
},
&jpeg_data);
}
bool EspVideo::CaptureFrame(bool show_preview) {
std::lock_guard<std::mutex> lock(frame_mutex_);
if (encoder_thread_.joinable()) {
encoder_thread_.join();
}
@ -394,6 +432,10 @@ bool EspVideo::Capture() {
return false;
}
if (!show_preview && esp_timer_get_time() < foreground_capture_protected_until_us_) {
return true;
}
for (int i = 0; i < 3; i++) {
struct v4l2_buffer buf = {};
buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
@ -729,9 +771,14 @@ bool EspVideo::Capture() {
}
}
// Show the preview image
auto display = dynamic_cast<LvglDisplay*>(Board::GetInstance().GetDisplay());
if (display != nullptr) {
if (show_preview) {
foreground_capture_protected_until_us_ = esp_timer_get_time() + FOREGROUND_CAPTURE_PROTECTION_US;
}
if (show_preview) {
// Show the preview image
auto display = dynamic_cast<LvglDisplay*>(Board::GetInstance().GetDisplay());
if (display != nullptr) {
if (!frame_.data) {
ESP_LOGE(TAG, "frame.data is null");
return false;
@ -836,6 +883,7 @@ bool EspVideo::Capture() {
auto image = std::make_unique<LvglAllocatedImage>(data, lvgl_image_size, w, h, stride, color_format);
display->SetPreviewImage(std::move(image));
}
}
return true;
}
@ -898,10 +946,16 @@ bool EspVideo::SetVFlip(bool enabled) {
* @warning Returns an error message if the camera buffer is empty or the network connection fails
*/
std::string EspVideo::Explain(const std::string& question) {
std::lock_guard<std::mutex> lock(frame_mutex_);
if (explain_url_.empty()) {
throw std::runtime_error("Image explain URL or token is not set");
}
if (frame_.data == nullptr || frame_.len == 0) {
throw std::runtime_error("No camera frame captured");
}
// Create a local JPEG queue; 40 entries hold about 512 * 40 = 20480 bytes of JPEG data
QueueHandle_t jpeg_queue = xQueueCreate(40, sizeof(JpegChunk));
if (jpeg_queue == nullptr) {

View File

@ -5,6 +5,8 @@
#include <thread>
#include <memory>
#include <vector>
#include <mutex>
#include <cstdint>
#include <freertos/FreeRTOS.h>
#include <freertos/queue.h>
@ -39,6 +41,10 @@ private:
std::string explain_url_;
std::string explain_token_;
std::thread encoder_thread_;
std::mutex frame_mutex_;
int64_t foreground_capture_protected_until_us_ = 0;
bool CaptureFrame(bool show_preview);
public:
EspVideo(const esp_video_init_config_t& config);
@ -46,6 +52,8 @@ public:
virtual void SetExplainUrl(const std::string& url, const std::string& token);
virtual bool Capture();
virtual bool CaptureBackground() override;
virtual bool CaptureToJpeg(std::string& jpeg_data, bool show_preview = false) override;
// Flip control functions
virtual bool SetHMirror(bool enabled) override;
virtual bool SetVFlip(bool enabled) override;

View File

@ -110,6 +110,8 @@ void Nt26Board::StartNetwork() {
case UartEthModem::UartEthModemEvent::InFlightMode:
ESP_LOGW(TAG, "Modem in flight mode");
break;
case UartEthModem::UartEthModemEvent::RequestingPdpContext:
break;
}
});

View File

@ -203,7 +203,10 @@ void WifiBoard::EnterWifiConfigMode() {
auto& app = Application::GetInstance();
auto state = app.GetDeviceState();
if (state == kDeviceStateSpeaking || state == kDeviceStateListening || state == kDeviceStateIdle) {
if (state == kDeviceStateSpeaking ||
state == kDeviceStateThinking ||
state == kDeviceStateListening ||
state == kDeviceStateIdle) {
// Reset protocol (close audio channel, reset protocol)
Application::GetInstance().ResetProtocol();

View File

@ -85,6 +85,13 @@ void ElectronEmojiDisplay::SetStatus(const char* status) {
lv_obj_add_flag(network_label_, LV_OBJ_FLAG_HIDDEN);
lv_obj_add_flag(battery_label_, LV_OBJ_FLAG_HIDDEN);
return;
} else if (strcmp(status, Lang::Strings::THINKING) == 0) {
lv_obj_set_style_text_font(status_label_, text_font, 0);
lv_label_set_text(status_label_, status);
lv_obj_clear_flag(status_label_, LV_OBJ_FLAG_HIDDEN);
lv_obj_add_flag(network_label_, LV_OBJ_FLAG_HIDDEN);
lv_obj_add_flag(battery_label_, LV_OBJ_FLAG_HIDDEN);
return;
} else if (strcmp(status, Lang::Strings::CONNECTING) == 0) {
lv_obj_set_style_text_font(status_label_, &OTTO_ICON_FONT, 0);
lv_label_set_text(status_label_, "\xEF\x83\x81"); // U+F0C1 link icon
@ -102,4 +109,4 @@ void ElectronEmojiDisplay::SetStatus(const char* status) {
lv_obj_clear_flag(status_label_, LV_OBJ_FLAG_HIDDEN);
lv_obj_clear_flag(network_label_, LV_OBJ_FLAG_HIDDEN);
lv_obj_clear_flag(battery_label_, LV_OBJ_FLAG_HIDDEN);
}
}

View File

@ -155,6 +155,8 @@ void EmojiWidget::SetStatus(const char* status)
if (player_) {
if (strcmp(status, Lang::Strings::LISTENING) == 0) {
player_->StartPlayer("asking", true, 15);
} else if (strcmp(status, Lang::Strings::THINKING) == 0) {
player_->StartPlayer("thinking", true, 15);
} else if (strcmp(status, Lang::Strings::STANDBY) == 0) {
player_->StartPlayer("wake", true, 15);
}

View File

@ -2,4 +2,6 @@
[product](https://store.freenove.com/products/fnk0104)
Official github: [freenove-esp32s3-display-2.8-lcd](https://github.com/Freenove/Freenove_ESP32_S3_Display)
Official github: [freenove-esp32s3-display-2.8-lcd](https://github.com/Freenove/Freenove_ESP32_S3_Display)
Likely the same hardware design as [LCD wiki ES3C28P/ES3N28P](https://www.lcdwiki.com/2.8inch_ESP32-S3_Display)

View File

@ -6,7 +6,7 @@
"sdkconfig_append": [
"CONFIG_ESPTOOLPY_FLASHSIZE_16MB=y",
"CONFIG_PARTITION_TABLE_CUSTOM_FILENAME=\"partitions/v2/16m.csv\"",
"CONFIG_LANGUAGE_EN_US=y"
"CONFIG_LANGUAGE_EN_US=y",
"CONFIG_SR_WN_WN9S_HIESP=y",
"CONFIG_SR_WN_WN9_HIESP=y"
]

View File

@ -10,6 +10,7 @@
#include "button.h"
#include "config.h"
#include "mcp_server.h"
#include "adc_battery_monitor.h"
#include <esp_log.h>
#include <driver/i2c_master.h>
@ -65,6 +66,11 @@ private:
LcdDisplay *display_;
i2c_master_bus_handle_t codec_i2c_bus_;
TouchDriver touch_;
AdcBatteryMonitor* adc_battery_monitor_;
void InitializeBatteryMonitor() {
adc_battery_monitor_ = new AdcBatteryMonitor(ADC_UNIT_1, ADC_CHANNEL_8, 200000, 200000, GPIO_NUM_NC);
}
static void TouchTask(void *arg) {
auto *self = static_cast<FreenoveESP32S3Display*>(arg);
@ -197,6 +203,7 @@ public:
FreenoveESP32S3Display(): boot_button_(BOOT_BUTTON_GPIO)
{
InitializeI2c();
InitializeBatteryMonitor();
InitializeSpi();
InitializeLcdDisplay();
InitializeTouch();
@ -224,6 +231,13 @@ public:
static PwmBacklight backlight(DISPLAY_BACKLIGHT_PIN, DISPLAY_BACKLIGHT_OUTPUT_INVERT);
return &backlight;
}
virtual bool GetBatteryLevel(int &level, bool& charging, bool& discharging) override {
charging = adc_battery_monitor_->IsCharging();
discharging = adc_battery_monitor_->IsDischarging();
level = adc_battery_monitor_->GetBatteryLevel();
return true;
}
};
DECLARE_BOARD(FreenoveESP32S3Display);

View File

@ -231,9 +231,9 @@ private:
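// If currently listening, switch to standby
ESP_LOGI(TAG, "Switching from listening to standby");
app.ToggleChatState(); // Switch to standby
} else if (current_state == kDeviceStateSpeaking) {
// If currently speaking, stop speaking and switch to standby
ESP_LOGI(TAG, "Switching from speaking to standby");
} else if (current_state == kDeviceStateSpeaking || current_state == kDeviceStateThinking) {
// If currently speaking or thinking, stop and switch to standby
ESP_LOGI(TAG, "Switching from speaking/thinking to standby");
app.ToggleChatState(); // Stop speaking
} else {
// In any other state, just wake the device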

View File

@ -1,21 +1,23 @@
#include "wifi_board.h"
#include "application.h"
#include "axp2101.h"
#include "config.h"
#include "cores3_audio_codec.h"
#include "display/lcd_display.h"
#include "application.h"
#include "config.h"
#include "power_save_timer.h"
#include "i2c_device.h"
#include "axp2101.h"
#include "power_save_timer.h"
#include "wifi_board.h"
#include <esp_log.h>
#include <driver/i2c_master.h>
#include <esp_lcd_ili9341.h>
#include <esp_lcd_panel_io.h>
#include <esp_lcd_panel_ops.h>
#include <esp_lcd_ili9341.h>
#include <esp_log.h>
#include <esp_timer.h>
#include "esp_video.h"
#define TAG "M5StackCoreS3Board"
#define BACKGROUND_VISION_INITIAL_DELAY_MS 8000
#define BACKGROUND_VISION_SAMPLE_INTERVAL_MS 100
class Pmic : public Axp2101 {
public:
@ -41,7 +43,7 @@ public:
class CustomBacklight : public Backlight {
public:
CustomBacklight(Pmic *pmic) : pmic_(pmic) {}
CustomBacklight(Pmic* pmic) : pmic_(pmic) {}
void SetBrightnessImpl(uint8_t brightness) override {
pmic_->SetBrightness(target_brightness_);
@ -49,7 +51,7 @@ public:
}
private:
Pmic *pmic_;
Pmic* pmic_;
};
class Aw9523 : public I2cDevice {
@ -89,16 +91,14 @@ public:
int x = -1;
int y = -1;
};
Ft6336(i2c_master_bus_handle_t i2c_bus, uint8_t addr) : I2cDevice(i2c_bus, addr) {
uint8_t chip_id = ReadReg(0xA3);
ESP_LOGI(TAG, "Get chip ID: 0x%02X", chip_id);
read_buffer_ = new uint8_t[6];
}
~Ft6336() {
delete[] read_buffer_;
}
~Ft6336() { delete[] read_buffer_; }
void UpdateTouchPoint() {
ReadRegs(0x02, read_buffer_, 6);
@ -107,9 +107,7 @@ public:
tp_.y = ((read_buffer_[3] & 0x0F) << 8) | read_buffer_[4];
}
inline const TouchPoint_t& GetTouchPoint() {
return tp_;
}
inline const TouchPoint_t& GetTouchPoint() { return tp_; }
private:
uint8_t* read_buffer_ = nullptr;
@ -137,9 +135,7 @@ private:
GetDisplay()->SetPowerSaveMode(false);
GetBacklight()->RestoreBrightness();
});
power_save_timer_->OnShutdownRequest([this]() {
pmic_->PowerOff();
});
power_save_timer_->OnShutdownRequest([this]() { pmic_->PowerOff(); });
power_save_timer_->SetEnabled(true);
}
@ -153,9 +149,10 @@ private:
.glitch_ignore_cnt = 7,
.intr_priority = 0,
.trans_queue_depth = 0,
.flags = {
.enable_internal_pullup = 1,
},
.flags =
{
.enable_internal_pullup = 1,
},
};
ESP_ERROR_CHECK(i2c_new_master_bus(&i2c_bus_cfg, &i2c_bus_));
}
@ -195,29 +192,37 @@ private:
void PollTouchpad() {
static bool was_touched = false;
static int64_t touch_start_time = 0;
static int touch_start_x = -1;
const int64_t TOUCH_THRESHOLD_MS = 500; // Touch duration threshold; a touch longer than 500 ms counts as a long press
ft6336_->UpdateTouchPoint();
auto& touch_point = ft6336_->GetTouchPoint();
// Detect touch start
if (touch_point.num > 0 && !was_touched) {
was_touched = true;
touch_start_time = esp_timer_get_time() / 1000; // Convert to milliseconds
}
touch_start_time = esp_timer_get_time() / 1000; // Convert to milliseconds
touch_start_x = touch_point.x;
}
// Detect touch release
else if (touch_point.num == 0 && was_touched) {
was_touched = false;
int64_t touch_duration = (esp_timer_get_time() / 1000) - touch_start_time;
// Only a short touch triggers
bool beaver_mode = touch_start_x >= DISPLAY_WIDTH / 2;
auto agent_mode = beaver_mode ? kChatAgentModeBeaver : kChatAgentModeNormal;
if (touch_duration < TOUCH_THRESHOLD_MS) {
auto& app = Application::GetInstance();
if (app.GetDeviceState() == kDeviceStateStarting) {
EnterWifiConfigMode();
return;
}
app.ToggleChatState();
ESP_LOGI(TAG, "Touch short: %s text-only mode", beaver_mode ? "beaver" : "normal");
app.ToggleChatStateForMode(agent_mode, false);
} else {
auto& app = Application::GetInstance();
ESP_LOGI(TAG, "Touch long: %s vision+text mode", beaver_mode ? "beaver" : "normal");
app.ToggleChatStateForMode(agent_mode, true);
}
}
}
@ -225,19 +230,20 @@ private:
void InitializeFt6336TouchPad() {
ESP_LOGI(TAG, "Init FT6336");
ft6336_ = new Ft6336(i2c_bus_, 0x38);
// Create a timer with a 20 ms interval
esp_timer_create_args_t timer_args = {
.callback = [](void* arg) {
M5StackCoreS3Board* board = (M5StackCoreS3Board*)arg;
board->PollTouchpad();
},
.callback =
[](void* arg) {
M5StackCoreS3Board* board = (M5StackCoreS3Board*)arg;
board->PollTouchpad();
},
.arg = this,
.dispatch_method = ESP_TIMER_TASK,
.name = "touchpad_timer",
.skip_unhandled_events = true,
};
ESP_ERROR_CHECK(esp_timer_create(&timer_args, &touchpad_timer_));
ESP_ERROR_CHECK(esp_timer_start_periodic(touchpad_timer_, 20 * 1000));
}
@ -276,7 +282,7 @@ private:
panel_config.rgb_ele_order = LCD_RGB_ELEMENT_ORDER_BGR;
panel_config.bits_per_pixel = 16;
ESP_ERROR_CHECK(esp_lcd_new_panel_ili9341(panel_io, &panel_config, &panel));
esp_lcd_panel_reset(panel);
aw9523_->ResetIli9342();
@ -285,23 +291,25 @@ private:
esp_lcd_panel_swap_xy(panel, DISPLAY_SWAP_XY);
esp_lcd_panel_mirror(panel, DISPLAY_MIRROR_X, DISPLAY_MIRROR_Y);
display_ = new SpiLcdDisplay(panel_io, panel,
DISPLAY_WIDTH, DISPLAY_HEIGHT, DISPLAY_OFFSET_X, DISPLAY_OFFSET_Y, DISPLAY_MIRROR_X, DISPLAY_MIRROR_Y, DISPLAY_SWAP_XY);
display_ = new SpiLcdDisplay(panel_io, panel, DISPLAY_WIDTH, DISPLAY_HEIGHT,
DISPLAY_OFFSET_X, DISPLAY_OFFSET_Y, DISPLAY_MIRROR_X,
DISPLAY_MIRROR_Y, DISPLAY_SWAP_XY);
}
void InitializeCamera() {
void InitializeCamera() {
static esp_cam_ctlr_dvp_pin_config_t dvp_pin_config = {
.data_width = CAM_CTLR_DATA_WIDTH_8,
.data_io = {
[0] = CAMERA_PIN_D0,
[1] = CAMERA_PIN_D1,
[2] = CAMERA_PIN_D2,
[3] = CAMERA_PIN_D3,
[4] = CAMERA_PIN_D4,
[5] = CAMERA_PIN_D5,
[6] = CAMERA_PIN_D6,
[7] = CAMERA_PIN_D7,
},
.data_io =
{
[0] = CAMERA_PIN_D0,
[1] = CAMERA_PIN_D1,
[2] = CAMERA_PIN_D2,
[3] = CAMERA_PIN_D3,
[4] = CAMERA_PIN_D4,
[5] = CAMERA_PIN_D5,
[6] = CAMERA_PIN_D6,
[7] = CAMERA_PIN_D7,
},
.vsync_io = CAMERA_PIN_VSYNC,
.de_io = CAMERA_PIN_HREF,
.pclk_io = CAMERA_PIN_PCLK,
@ -330,6 +338,42 @@ private:
camera_->SetHMirror(false);
}
void InitializeBackgroundVisionSampler() {
xTaskCreate(
[](void* arg) {
auto board = static_cast<M5StackCoreS3Board*>(arg);
bool has_logged_success = false;
bool has_logged_failure = false;
vTaskDelay(pdMS_TO_TICKS(BACKGROUND_VISION_INITIAL_DELAY_MS));
while (true) {
if (!Application::GetInstance().IsVisionTextModeEnabled()) {
vTaskDelay(pdMS_TO_TICKS(BACKGROUND_VISION_SAMPLE_INTERVAL_MS));
continue;
}
if (board->camera_ == nullptr) {
vTaskDelay(pdMS_TO_TICKS(BACKGROUND_VISION_SAMPLE_INTERVAL_MS));
continue;
}
if (board->camera_->Capture()) {
if (!has_logged_success) {
ESP_LOGI(TAG, "Vision preview sampler started");
has_logged_success = true;
}
} else if (!has_logged_failure) {
ESP_LOGW(TAG, "Vision preview sampler is waiting for camera");
has_logged_failure = true;
}
vTaskDelay(pdMS_TO_TICKS(BACKGROUND_VISION_SAMPLE_INTERVAL_MS));
}
},
"BgVisionSampler", 4096, this, 1, nullptr);
}
public:
M5StackCoreS3Board() {
InitializePowerSaveTimer();
@ -340,34 +384,24 @@ public:
InitializeSpi();
InitializeIli9342Display();
InitializeCamera();
InitializeBackgroundVisionSampler();
InitializeFt6336TouchPad();
GetBacklight()->RestoreBrightness();
}
virtual AudioCodec* GetAudioCodec() override {
static CoreS3AudioCodec audio_codec(i2c_bus_,
AUDIO_INPUT_SAMPLE_RATE,
AUDIO_OUTPUT_SAMPLE_RATE,
AUDIO_I2S_GPIO_MCLK,
AUDIO_I2S_GPIO_BCLK,
AUDIO_I2S_GPIO_WS,
AUDIO_I2S_GPIO_DOUT,
AUDIO_I2S_GPIO_DIN,
AUDIO_CODEC_AW88298_ADDR,
AUDIO_CODEC_ES7210_ADDR,
AUDIO_INPUT_REFERENCE);
static CoreS3AudioCodec audio_codec(
i2c_bus_, AUDIO_INPUT_SAMPLE_RATE, AUDIO_OUTPUT_SAMPLE_RATE, AUDIO_I2S_GPIO_MCLK,
AUDIO_I2S_GPIO_BCLK, AUDIO_I2S_GPIO_WS, AUDIO_I2S_GPIO_DOUT, AUDIO_I2S_GPIO_DIN,
AUDIO_CODEC_AW88298_ADDR, AUDIO_CODEC_ES7210_ADDR, AUDIO_INPUT_REFERENCE);
return &audio_codec;
}
virtual Display* GetDisplay() override {
return display_;
}
virtual Display* GetDisplay() override { return display_; }
virtual Camera* GetCamera() override {
return camera_;
}
virtual Camera* GetCamera() override { return camera_; }
virtual bool GetBatteryLevel(int &level, bool& charging, bool& discharging) override {
virtual bool GetBatteryLevel(int& level, bool& charging, bool& discharging) override {
static bool last_discharging = false;
charging = pmic_->IsCharging();
discharging = pmic_->IsDischarging();
@ -387,7 +421,7 @@ public:
WifiBoard::SetPowerSaveLevel(level);
}
virtual Backlight *GetBacklight() override {
virtual Backlight* GetBacklight() override {
static CustomBacklight backlight(pmic_);
return &backlight;
}

View File

@ -8,27 +8,9 @@
## Basic usage
* idf version: v6.0-dev
* idf version: v5.5.2 or above (recommended: v6.0-dev)
1. Adjust idf_component.yml
```yaml
espressif/esp_video:
version: ==1.3.1 # pinned for compatibility; a newer version may require changes to this project's code
rules:
- if: target not in [esp32]
```
Change it to
```yaml
espressif/esp_video:
version: '==0.7.0'
rules:
- if: target not in [esp32]
espressif/esp_ipa: '==0.1.0'
```
* idf version: v5.5.3
* No dependency override needed — the project already specifies the correct `esp_video` and `esp_ipa` versions in `main/idf_component.yml`. Do NOT change the dependency versions unless you are also modifying the source code to match the older API.
For ESP32-P4 Rev <3.0 users:
Make sure your sdkconfig.defaults contains:
@ -37,7 +19,7 @@ CONFIG_ESP32P4_SELECTS_REV_LESS_V3=y
Otherwise flashing fails with: 'bootloader/bootloader.bin' requires chip revision in range [v3.0 - v3.99] (this chip is revision v1.x)
2. Build with release.py
1. Build with release.py
```shell
python ./scripts/release.py m5stack-tab5
@ -45,7 +27,7 @@ python ./scripts/release.py m5stack-tab5
To build manually, set the corresponding menuconfig options as listed in `m5stack-tab5/config.json`
3. Build and flash the firmware
2. Build and flash the firmware
```shell
idf.py flash monitor
@ -62,5 +44,3 @@ idf.py flash monitor
1. listening... takes a few seconds before voice input is picked up???
2. Brightness adjustment is wrong
3. Volume adjustment is wrong
## TODO

View File

@ -205,3 +205,212 @@ otto 机器人具有丰富的动作能力,包括行走、转向、跳跃、摇
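**Note**: Xiaozhi controls robot actions by creating a new task that runs in the background, so new voice commands are still accepted while an action is executing. The "stop" voice command halts Otto immediately.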
---
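## WebSocket direct debugging interface
Otto has a built-in WebSocket server, so it can be debugged directly over the LAN without going through the cloud.
**Connection address:** `ws://<device IP>:8080/ws`
> Protocol format: JSON-RPC 2.0. Increment the `id` field yourself.
### How to connect
1. Make sure Otto is connected to WiFi and get its IP address (shown in the mini program or the serial log)
2. Open any WebSocket debugging tool (such as [websocket.org/echo](https://websocket.org/echo) or the browser console)
3. Connect to `ws://192.168.x.x:8080/ws` (note that the trailing `/ws` is required)
4. Send JSON commands; responses come back on the same connection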
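
The request format used by the commands in this section can also be produced programmatically. The following is a minimal C++ sketch, not code from the firmware: `RpcIdCounter` and `BuildToolCall` are hypothetical helper names, and the `arguments` value is assumed to be passed in as an already serialized JSON object.

```cpp
#include <string>

// Hypothetical helper (not part of the Otto firmware): hands out
// increasing JSON-RPC ids, starting from 1.
class RpcIdCounter {
public:
    int Next() { return next_id_++; }

private:
    int next_id_ = 1;
};

// Builds a JSON-RPC 2.0 "tools/call" request in the format shown in this README.
// Assumptions: `tool_name` contains no characters that need JSON escaping,
// and `arguments_json` is already a valid JSON object such as "{}".
std::string BuildToolCall(const std::string& tool_name,
                          const std::string& arguments_json,
                          int id) {
    return "{\"jsonrpc\":\"2.0\",\"method\":\"tools/call\",\"params\":{\"name\":\"" +
           tool_name + "\",\"arguments\":" + arguments_json + "},\"id\":" +
           std::to_string(id) + "}";
}
```

Calling `BuildToolCall("self.otto.stop", "{}", ids.Next())` with an `RpcIdCounter ids` yields a message that can be pasted into any WebSocket debugging tool connected to the `/ws` endpoint.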
---
### 1. Protocol initialization (recommended as the first message on a new connection)
```json
{"jsonrpc":"2.0","method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{}},"id":1}
```
---
### 2. Get the tool list
```json
{"jsonrpc":"2.0","method":"tools/list","params":{},"id":2}
```
---
### 3. Otto robot tool commands
#### Get servo trim values
```json
{"jsonrpc":"2.0","method":"tools/call","params":{"name":"self.otto.get_trims","arguments":{}},"id":3}
```
#### Set a single servo trim (saved permanently)
```json
{"jsonrpc":"2.0","method":"tools/call","params":{"name":"self.otto.set_trim","arguments":{"servo_type":"left_leg","trim_value":5}},"id":4}
```
`servo_type` options: `left_leg` / `right_leg` / `left_foot` / `right_foot` / `left_hand` / `right_hand`. `trim_value` range: `-50` to `50`.
#### Walk (forward, 3 steps)
```json
{"jsonrpc":"2.0","method":"tools/call","params":{"name":"self.otto.action","arguments":{"action":"walk","steps":3,"speed":700,"direction":1}},"id":5}
```
#### Walk backward
```json
{"jsonrpc":"2.0","method":"tools/call","params":{"name":"self.otto.action","arguments":{"action":"walk","steps":3,"speed":700,"direction":-1}},"id":6}
```
#### Turn left
```json
{"jsonrpc":"2.0","method":"tools/call","params":{"name":"self.otto.action","arguments":{"action":"turn","steps":3,"speed":700,"direction":-1}},"id":7}
```
#### Jump
```json
{"jsonrpc":"2.0","method":"tools/call","params":{"name":"self.otto.action","arguments":{"action":"jump","steps":1,"speed":500}},"id":8}
```
#### Swing
```json
{"jsonrpc":"2.0","method":"tools/call","params":{"name":"self.otto.action","arguments":{"action":"swing","steps":5,"speed":600,"amount":30}},"id":9}
```
#### Moonwalk
```json
{"jsonrpc":"2.0","method":"tools/call","params":{"name":"self.otto.action","arguments":{"action":"moonwalk","steps":3,"speed":800,"direction":1,"amount":30}},"id":10}
```
#### Sit
```json
{"jsonrpc":"2.0","method":"tools/call","params":{"name":"self.otto.action","arguments":{"action":"sit"}},"id":11}
```
#### Reset (home position)
```json
{"jsonrpc":"2.0","method":"tools/call","params":{"name":"self.otto.action","arguments":{"action":"home"}},"id":12}
```
#### Showcase
```json
{"jsonrpc":"2.0","method":"tools/call","params":{"name":"self.otto.action","arguments":{"action":"showcase"}},"id":13}
```
#### Hands up (requires hand servos)
```json
{"jsonrpc":"2.0","method":"tools/call","params":{"name":"self.otto.action","arguments":{"action":"hands_up","speed":500,"direction":1}},"id":14}
```
#### Wave hand (requires hand servos)
```json
{"jsonrpc":"2.0","method":"tools/call","params":{"name":"self.otto.action","arguments":{"action":"hand_wave","direction":1}},"id":15}
```
#### Stop all actions immediately
```json
{"jsonrpc":"2.0","method":"tools/call","params":{"name":"self.otto.stop","arguments":{}},"id":16}
```
#### Get motion status (returns `"moving"` or `"idle"`)
```json
{"jsonrpc":"2.0","method":"tools/call","params":{"name":"self.otto.get_status","arguments":{}},"id":17}
```
#### Get the IP address
```json
{"jsonrpc":"2.0","method":"tools/call","params":{"name":"self.otto.get_ip","arguments":{}},"id":18}
```
#### Get the battery level
```json
{"jsonrpc":"2.0","method":"tools/call","params":{"name":"self.battery.get_level","arguments":{}},"id":19}
```
---
### 4. General system tools
#### Get device status (volume, network, battery, etc.)
```json
{"jsonrpc":"2.0","method":"tools/call","params":{"name":"self.get_device_status","arguments":{}},"id":20}
```
#### Set the volume (0 to 100)
```json
{"jsonrpc":"2.0","method":"tools/call","params":{"name":"self.audio_speaker.set_volume","arguments":{"volume":70}},"id":21}
```
#### Reboot the device
```json
{"jsonrpc":"2.0","method":"tools/call","params":{"name":"self.reboot","arguments":{}},"id":22}
```
---
### 5. Custom servo sequences
#### Normal move mode (moves each servo step by step)
```json
{"jsonrpc":"2.0","method":"tools/call","params":{"name":"self.otto.servo_sequences","arguments":{"sequence":"{\"a\":[{\"s\":{\"ll\":110,\"rl\":70},\"v\":800},{\"s\":{\"ll\":90,\"rl\":90},\"v\":800}],\"d\":0}"}},"id":23}
```
#### Oscillator mode (both arms swinging)
```json
{"jsonrpc":"2.0","method":"tools/call","params":{"name":"self.otto.servo_sequences","arguments":{"sequence":"{\"a\":[{\"osc\":{\"a\":{\"lh\":30,\"rh\":30},\"o\":{\"lh\":90,\"rh\":90},\"ph\":{\"rh\":180},\"p\":500,\"c\":5.0}}]}"}},"id":24}
```
#### Oscillator mode (left-right swaying wave)
```json
{"jsonrpc":"2.0","method":"tools/call","params":{"name":"self.otto.servo_sequences","arguments":{"sequence":"{\"a\":[{\"osc\":{\"a\":{\"ll\":20,\"rl\":20},\"o\":{\"ll\":90,\"rl\":90},\"ph\":{\"rl\":180},\"p\":600,\"c\":5.0}}]}"}},"id":25}
```
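**Servo key names used in sequences:**
| Key | Servo | Description |
|------|------|------|
| `ll` | Left leg | 0 = fully outward, 90 = neutral, 180 = fully inward |
| `rl` | Right leg | 0 = fully inward, 90 = neutral, 180 = fully outward |
| `lf` | Left foot | 0 = fully up, 90 = level, 180 = fully down |
| `rf` | Right foot | 0 = fully down, 90 = level, 180 = fully up |
| `lh` | Left hand | 0 = fully down, 90 = level, 180 = fully up |
| `rh` | Right hand | 0 = fully up, 90 = level, 180 = fully down |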
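
In the `servo_sequences` requests, the `sequence` argument is a JSON string that itself contains JSON, which is why its inner quotes are backslash-escaped. A minimal C++ sketch of that escaping step follows. `EscapeJsonString` is a hypothetical helper, not a firmware function, and it assumes the inner JSON contains no control characters.

```cpp
#include <string>

// Escapes double quotes and backslashes so that `raw` can be embedded as a
// JSON string value. Control characters are not handled in this sketch.
std::string EscapeJsonString(const std::string& raw) {
    std::string out;
    out.reserve(raw.size() + 8);
    for (char c : raw) {
        if (c == '"' || c == '\\') {
            out.push_back('\\');  // prefix the character with a backslash
        }
        out.push_back(c);
    }
    return out;
}
```

The escaped result is then wrapped in double quotes and used as the value of `sequence` inside `arguments`.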
---
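### 6. Action parameter quick reference
| Parameter | Description | Range | Default |
|------|------|------|------|
| `steps` | Number of steps | 1~100 | 3 |
| `speed` | Speed (milliseconds; smaller is faster) | 100~3000 | 700 |
| `direction` | Direction (1 = forward/left, -1 = backward/right) | -1~1 | 1 |
| `amount` | Amplitude | 0~170 | 30 |
| `arm_swing` | Arm swing amplitude | 0~170 | 50 |
| `trim_value` | Servo trim | -50~50 | 0 |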
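
The ranges in this table can be enforced on the client side before a command is sent. The sketch below assumes that out-of-range values should be clamped rather than rejected; how the firmware itself treats out-of-range values is not stated here. `ActionParams` and `ClampActionParams` are hypothetical names.

```cpp
#include <algorithm>

// Hypothetical client-side container for the action parameters,
// initialised to the documented defaults.
struct ActionParams {
    int steps = 3;
    int speed = 700;  // milliseconds; smaller is faster
    int direction = 1;
    int amount = 30;
    int arm_swing = 50;
};

// Limits `value` to the inclusive range [lo, hi].
static int ClampInt(int value, int lo, int hi) {
    return std::max(lo, std::min(value, hi));
}

// Returns a copy of `p` with every field clamped to its documented range.
ActionParams ClampActionParams(ActionParams p) {
    p.steps = ClampInt(p.steps, 1, 100);
    p.speed = ClampInt(p.speed, 100, 3000);
    p.direction = ClampInt(p.direction, -1, 1);
    p.amount = ClampInt(p.amount, 0, 170);
    p.arm_swing = ClampInt(p.arm_swing, 0, 170);
    return p;
}
```

Clamping keeps a mistyped value from producing an invalid request, at the cost of silently changing what the caller asked for; rejecting the value with an error is an equally reasonable design.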

View File

@ -77,6 +77,13 @@ void OttoEmojiDisplay::SetStatus(const char* status) {
lv_obj_add_flag(network_label_, LV_OBJ_FLAG_HIDDEN);
lv_obj_add_flag(battery_label_, LV_OBJ_FLAG_HIDDEN);
return;
} else if (strcmp(status, Lang::Strings::THINKING) == 0) {
lv_obj_set_style_text_font(status_label_, text_font, 0);
lv_label_set_text(status_label_, status);
lv_obj_clear_flag(status_label_, LV_OBJ_FLAG_HIDDEN);
lv_obj_add_flag(network_label_, LV_OBJ_FLAG_HIDDEN);
lv_obj_add_flag(battery_label_, LV_OBJ_FLAG_HIDDEN);
return;
} else if (strcmp(status, Lang::Strings::CONNECTING) == 0) {
lv_obj_set_style_text_font(status_label_, &OTTO_ICON_FONT, 0);
lv_label_set_text(status_label_, "\xEF\x83\x81"); // U+F0C1 link icon
@ -131,4 +138,4 @@ void OttoEmojiDisplay::SetPreviewImage(std::unique_ptr<LvglImage> image) {
lv_obj_remove_flag(preview_image_, LV_OBJ_FLAG_HIDDEN);
esp_timer_stop(preview_timer_);
ESP_ERROR_CHECK(esp_timer_start_once(preview_timer_, PREVIEW_IMAGE_DURATION_MS * 1000));
}
}

View File

@ -230,7 +230,16 @@ private:
if (!ws_control_server_->Start(8080)) {
delete ws_control_server_;
ws_control_server_ = nullptr;
return;
}
// Also broadcast MCP responses back to WebSocket clients connected on port 8080
Application::GetInstance().RegisterMcpBroadcastCallback(
[this](const std::string& payload) {
if (ws_control_server_) {
ws_control_server_->BroadcastMessage(payload);
}
}
);
}
void StartNetwork() override {

View File

@ -190,3 +190,61 @@ void WebSocketControlServer::RemoveClient(httpd_req_t *req) {
size_t WebSocketControlServer::GetClientCount() const {
return clients_.size();
}
struct WsBroadcastJob {
httpd_handle_t server;
int fd;
char* payload;
size_t len;
};
static void ws_broadcast_send_job(void* arg) {
WsBroadcastJob* job = static_cast<WsBroadcastJob*>(arg);
httpd_ws_frame_t ws_pkt = {};
ws_pkt.type = HTTPD_WS_TYPE_TEXT;
ws_pkt.payload = reinterpret_cast<uint8_t*>(job->payload);
ws_pkt.len = job->len;
ws_pkt.final = true;
esp_err_t ret = httpd_ws_send_frame_async(job->server, job->fd, &ws_pkt);
if (ret != ESP_OK) {
ESP_LOGE("WSControl", "BroadcastMessage: send failed fd=%d err=%d", job->fd, ret);
}
free(job->payload);
free(job);
}
void WebSocketControlServer::BroadcastMessage(const std::string& message) {
if (!server_handle_ || clients_.empty()) {
return;
}
for (auto& [fd, req] : clients_) {
WsBroadcastJob* job = static_cast<WsBroadcastJob*>(malloc(sizeof(WsBroadcastJob)));
if (!job) {
ESP_LOGE(TAG, "BroadcastMessage: failed to allocate job");
continue;
}
job->server = server_handle_;
job->fd = fd;
job->len = message.length();
job->payload = static_cast<char*>(malloc(message.length() + 1));
if (!job->payload) {
ESP_LOGE(TAG, "BroadcastMessage: failed to allocate payload");
free(job);
continue;
}
memcpy(job->payload, message.c_str(), message.length());
job->payload[message.length()] = '\0';
esp_err_t ret = httpd_queue_work(server_handle_, ws_broadcast_send_job, job);
if (ret != ESP_OK) {
ESP_LOGE(TAG, "BroadcastMessage: httpd_queue_work failed fd=%d err=%d", fd, ret);
free(job->payload);
free(job);
}
}
}

View File

@ -17,6 +17,8 @@ public:
size_t GetClientCount() const;
void BroadcastMessage(const std::string& message);
private:
httpd_handle_t server_handle_;
std::map<int, httpd_req_t*> clients_;

View File

@ -0,0 +1,31 @@
# RYMCU BigSmart
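This directory adapts the `RYMCU BigSmart` development board, with pins mapped to the following hardware:
- MCU: ESP32-S3-WROOM-1-N16R8
- Display: ST7789 (320x240, SPI)
- Touch: GT911 (I2C)
- Audio: ES8311 + ES7210 (I2S + I2C)
- IO expander: PCA9557 (I2C address `0x19`)
- Camera: GC0308 (DVP)
Hardware reference documentation: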
- https://github.com/rymcu/BigSmart-Open/blob/main/docs/rymcu-bigsmart-hardware.md
## Build
```bash
idf.py set-target esp32s3
idf.py menuconfig
```
In the menu, select:
`Xiaozhi Assistant -> Board Type -> RYMCU BigSmart`
Then run:
```bash
idf.py build
```

View File

@ -0,0 +1,69 @@
#ifndef _BOARD_CONFIG_H_
#define _BOARD_CONFIG_H_
#include <driver/gpio.h>
#define AUDIO_INPUT_SAMPLE_RATE 24000
#define AUDIO_OUTPUT_SAMPLE_RATE 24000
#define AUDIO_INPUT_REFERENCE true
#define AUDIO_I2S_GPIO_MCLK GPIO_NUM_38
#define AUDIO_I2S_GPIO_WS GPIO_NUM_13
#define AUDIO_I2S_GPIO_BCLK GPIO_NUM_14
#define AUDIO_I2S_GPIO_DIN GPIO_NUM_12
#define AUDIO_I2S_GPIO_DOUT GPIO_NUM_45
#define AUDIO_CODEC_USE_PCA9557
#define AUDIO_CODEC_I2C_SDA_PIN GPIO_NUM_1
#define AUDIO_CODEC_I2C_SCL_PIN GPIO_NUM_2
#define AUDIO_CODEC_ES8311_ADDR ES8311_CODEC_DEFAULT_ADDR
#define AUDIO_CODEC_ES7210_ADDR 0x82
#define BUILTIN_LED_GPIO GPIO_NUM_NC
#define RGB_LED_GPIO GPIO_NUM_43
#define BOOT_BUTTON_GPIO GPIO_NUM_0
#define CUSTOM_BUTTON_GPIO GPIO_NUM_10
#define VOLUME_UP_BUTTON_GPIO GPIO_NUM_NC
#define VOLUME_DOWN_BUTTON_GPIO GPIO_NUM_NC
#define DISPLAY_WIDTH 320
#define DISPLAY_HEIGHT 240
#define DISPLAY_MIRROR_X true
#define DISPLAY_MIRROR_Y false
#define DISPLAY_SWAP_XY true
#define DISPLAY_OFFSET_X 0
#define DISPLAY_OFFSET_Y 0
#define DISPLAY_BACKLIGHT_PIN GPIO_NUM_42
#define DISPLAY_BACKLIGHT_OUTPUT_INVERT true
#define TOUCH_INT_GPIO GPIO_NUM_NC
#define TOUCH_RST_GPIO GPIO_NUM_NC
/* Camera pins */
#define CAMERA_PIN_PWDN GPIO_NUM_NC
#define CAMERA_PIN_RESET GPIO_NUM_NC
#define CAMERA_PIN_XCLK GPIO_NUM_5
#define CAMERA_PIN_SIOD GPIO_NUM_1
#define CAMERA_PIN_SIOC GPIO_NUM_2
#define CAMERA_PIN_D7 GPIO_NUM_9
#define CAMERA_PIN_D6 GPIO_NUM_4
#define CAMERA_PIN_D5 GPIO_NUM_6
#define CAMERA_PIN_D4 GPIO_NUM_15
#define CAMERA_PIN_D3 GPIO_NUM_17
#define CAMERA_PIN_D2 GPIO_NUM_8
#define CAMERA_PIN_D1 GPIO_NUM_18
#define CAMERA_PIN_D0 GPIO_NUM_16
#define CAMERA_PIN_VSYNC GPIO_NUM_44
#define CAMERA_PIN_HREF GPIO_NUM_46
#define CAMERA_PIN_PCLK GPIO_NUM_7
#define XCLK_FREQ_HZ 16000000
#define BATTERY_CHARGING_PIN GPIO_NUM_3
#define BATTERY_ADC_PIN GPIO_NUM_11
#endif // _BOARD_CONFIG_H_

View File

@ -0,0 +1,12 @@
{
"manufacturer": "rymcu",
"target": "esp32s3",
"builds": [
{
"name": "bigsmart",
"sdkconfig_append": [
"CONFIG_USE_DEVICE_AEC=y"
]
}
]
}

View File

@ -0,0 +1,284 @@
#include "wifi_board.h"
#include "codecs/box_audio_codec.h"
#include "display/lcd_display.h"
#include "display/emote_display.h"
#include "application.h"
#include "button.h"
#include "config.h"
#include "i2c_device.h"
#include "esp32_camera.h"
#include "mcp_server.h"
#include <esp_log.h>
#include <esp_lcd_panel_vendor.h>
#include <driver/i2c_master.h>
#include <driver/spi_common.h>
#include <esp_lcd_touch_gt911.h>
#include <esp_lvgl_port.h>
#include <lvgl.h>
#define TAG "RymcuBigsmartBoard"
class Pca9557 : public I2cDevice {
public:
Pca9557(i2c_master_bus_handle_t i2c_bus, uint8_t addr) : I2cDevice(i2c_bus, addr) {
WriteReg(0x01, 0x03);
WriteReg(0x03, 0xf8);
}
void SetOutputState(uint8_t bit, uint8_t level) {
uint8_t data = ReadReg(0x01);
data = (data & ~(1 << bit)) | (level << bit);
WriteReg(0x01, data);
}
};
class CustomAudioCodec : public BoxAudioCodec {
private:
Pca9557* pca9557_;
public:
CustomAudioCodec(i2c_master_bus_handle_t i2c_bus, Pca9557* pca9557)
: BoxAudioCodec(i2c_bus,
AUDIO_INPUT_SAMPLE_RATE,
AUDIO_OUTPUT_SAMPLE_RATE,
AUDIO_I2S_GPIO_MCLK,
AUDIO_I2S_GPIO_BCLK,
AUDIO_I2S_GPIO_WS,
AUDIO_I2S_GPIO_DOUT,
AUDIO_I2S_GPIO_DIN,
GPIO_NUM_NC,
AUDIO_CODEC_ES8311_ADDR,
AUDIO_CODEC_ES7210_ADDR,
AUDIO_INPUT_REFERENCE),
pca9557_(pca9557) {
}
virtual void EnableOutput(bool enable) override {
BoxAudioCodec::EnableOutput(enable);
if (enable) {
pca9557_->SetOutputState(1, 1);
} else {
pca9557_->SetOutputState(1, 0);
}
}
};
class RymcuBigsmartBoard : public WifiBoard {
private:
i2c_master_bus_handle_t i2c_bus_;
Button boot_button_;
Display* display_;
Pca9557* pca9557_;
Esp32Camera* camera_;
void InitializeI2c() {
i2c_master_bus_config_t i2c_bus_cfg = {
.i2c_port = (i2c_port_t)1,
.sda_io_num = AUDIO_CODEC_I2C_SDA_PIN,
.scl_io_num = AUDIO_CODEC_I2C_SCL_PIN,
.clk_source = I2C_CLK_SRC_DEFAULT,
.glitch_ignore_cnt = 7,
.intr_priority = 0,
.trans_queue_depth = 0,
.flags = {
.enable_internal_pullup = 1,
},
};
ESP_ERROR_CHECK(i2c_new_master_bus(&i2c_bus_cfg, &i2c_bus_));
pca9557_ = new Pca9557(i2c_bus_, 0x19);
}
void InitializeSpi() {
spi_bus_config_t buscfg = {};
buscfg.mosi_io_num = GPIO_NUM_40;
buscfg.miso_io_num = GPIO_NUM_NC;
buscfg.sclk_io_num = GPIO_NUM_41;
buscfg.quadwp_io_num = GPIO_NUM_NC;
buscfg.quadhd_io_num = GPIO_NUM_NC;
buscfg.max_transfer_sz = DISPLAY_WIDTH * DISPLAY_HEIGHT * sizeof(uint16_t);
ESP_ERROR_CHECK(spi_bus_initialize(SPI3_HOST, &buscfg, SPI_DMA_CH_AUTO));
}
void InitializeButtons() {
boot_button_.OnClick([this]() {
auto& app = Application::GetInstance();
if (app.GetDeviceState() == kDeviceStateStarting) {
EnterWifiConfigMode();
return;
}
app.ToggleChatState();
});
#if CONFIG_USE_DEVICE_AEC
boot_button_.OnDoubleClick([this]() {
auto& app = Application::GetInstance();
if (app.GetDeviceState() == kDeviceStateIdle) {
app.SetAecMode(app.GetAecMode() == kAecOff ? kAecOnDeviceSide : kAecOff);
}
});
#endif
}
void InitializeSt7789Display() {
esp_lcd_panel_io_handle_t panel_io = nullptr;
esp_lcd_panel_handle_t panel = nullptr;
ESP_LOGD(TAG, "Install panel IO");
esp_lcd_panel_io_spi_config_t io_config = {};
io_config.cs_gpio_num = GPIO_NUM_NC;
io_config.dc_gpio_num = GPIO_NUM_39;
io_config.spi_mode = 2;
io_config.pclk_hz = 80 * 1000 * 1000;
io_config.trans_queue_depth = 10;
io_config.lcd_cmd_bits = 8;
io_config.lcd_param_bits = 8;
ESP_ERROR_CHECK(esp_lcd_new_panel_io_spi(SPI3_HOST, &io_config, &panel_io));
ESP_LOGD(TAG, "Install LCD driver");
esp_lcd_panel_dev_config_t panel_config = {};
panel_config.reset_gpio_num = GPIO_NUM_NC;
panel_config.rgb_ele_order = LCD_RGB_ELEMENT_ORDER_RGB;
panel_config.bits_per_pixel = 16;
ESP_ERROR_CHECK(esp_lcd_new_panel_st7789(panel_io, &panel_config, &panel));
esp_lcd_panel_reset(panel);
pca9557_->SetOutputState(0, 0);
esp_lcd_panel_init(panel);
esp_lcd_panel_invert_color(panel, true);
esp_lcd_panel_swap_xy(panel, DISPLAY_SWAP_XY);
esp_lcd_panel_mirror(panel, DISPLAY_MIRROR_X, DISPLAY_MIRROR_Y);
esp_lcd_panel_disp_on_off(panel, true);
#if CONFIG_USE_EMOTE_MESSAGE_STYLE
display_ = new emote::EmoteDisplay(panel, panel_io, DISPLAY_WIDTH, DISPLAY_HEIGHT);
#else
display_ = new SpiLcdDisplay(panel_io, panel,
DISPLAY_WIDTH, DISPLAY_HEIGHT, DISPLAY_OFFSET_X, DISPLAY_OFFSET_Y, DISPLAY_MIRROR_X, DISPLAY_MIRROR_Y, DISPLAY_SWAP_XY);
#endif
}
void InitializeTouch() {
esp_lcd_touch_handle_t tp;
esp_lcd_touch_config_t tp_cfg = {
.x_max = DISPLAY_HEIGHT,
.y_max = DISPLAY_WIDTH,
.rst_gpio_num = TOUCH_RST_GPIO,
.int_gpio_num = TOUCH_INT_GPIO,
.levels = {
.reset = 0,
.interrupt = 0,
},
.flags = {
.swap_xy = 1,
.mirror_x = 1,
.mirror_y = 0,
},
};
esp_lcd_panel_io_handle_t tp_io_handle = NULL;
esp_lcd_panel_io_i2c_config_t tp_io_config = {
.dev_addr = ESP_LCD_TOUCH_IO_I2C_GT911_ADDRESS,
.control_phase_bytes = 1,
.dc_bit_offset = 0,
.lcd_cmd_bits = 16,
.flags = {
.disable_control_phase = 1,
}
};
tp_io_config.scl_speed_hz = 400000;
ESP_ERROR_CHECK(esp_lcd_new_panel_io_i2c(i2c_bus_, &tp_io_config, &tp_io_handle));
ESP_ERROR_CHECK(esp_lcd_touch_new_i2c_gt911(tp_io_handle, &tp_cfg, &tp));
assert(tp);
const lvgl_port_touch_cfg_t touch_cfg = {
.disp = lv_display_get_default(),
.handle = tp,
};
if (touch_cfg.disp) {
lvgl_port_add_touch(&touch_cfg);
} else {
ESP_LOGE(TAG, "Touch display is not initialized");
}
}
void InitializeCamera() {
pca9557_->SetOutputState(2, 0);
camera_config_t config = {};
config.ledc_channel = LEDC_CHANNEL_2;
config.ledc_timer = LEDC_TIMER_2;
config.pin_d0 = CAMERA_PIN_D0;
config.pin_d1 = CAMERA_PIN_D1;
config.pin_d2 = CAMERA_PIN_D2;
config.pin_d3 = CAMERA_PIN_D3;
config.pin_d4 = CAMERA_PIN_D4;
config.pin_d5 = CAMERA_PIN_D5;
config.pin_d6 = CAMERA_PIN_D6;
config.pin_d7 = CAMERA_PIN_D7;
config.pin_xclk = CAMERA_PIN_XCLK;
config.pin_pclk = CAMERA_PIN_PCLK;
config.pin_vsync = CAMERA_PIN_VSYNC;
config.pin_href = CAMERA_PIN_HREF;
config.pin_sccb_sda = -1;
config.pin_sccb_scl = CAMERA_PIN_SIOC;
config.sccb_i2c_port = 1;
config.pin_pwdn = CAMERA_PIN_PWDN;
config.pin_reset = CAMERA_PIN_RESET;
config.xclk_freq_hz = XCLK_FREQ_HZ;
config.pixel_format = PIXFORMAT_RGB565;
config.frame_size = FRAMESIZE_QVGA;
config.jpeg_quality = 12;
config.fb_count = 1;
config.fb_location = CAMERA_FB_IN_PSRAM;
config.grab_mode = CAMERA_GRAB_WHEN_EMPTY;
camera_ = new Esp32Camera(config);
}
void InitializeTools() {
auto& mcp_server = McpServer::GetInstance();
mcp_server.AddTool("self.system.reconfigure_wifi",
"End this conversation and enter WiFi configuration mode.\n"
"**CAUTION** You must ask the user to confirm this action.",
PropertyList(), [this](const PropertyList& properties) {
EnterWifiConfigMode();
return true;
});
}
public:
RymcuBigsmartBoard() : boot_button_(BOOT_BUTTON_GPIO) {
InitializeI2c();
InitializeSpi();
InitializeSt7789Display();
InitializeTouch();
InitializeButtons();
InitializeCamera();
InitializeTools();
GetBacklight()->RestoreBrightness();
}
virtual AudioCodec* GetAudioCodec() override {
static CustomAudioCodec audio_codec(i2c_bus_, pca9557_);
return &audio_codec;
}
virtual Display* GetDisplay() override {
return display_;
}
virtual Backlight* GetBacklight() override {
static PwmBacklight backlight(DISPLAY_BACKLIGHT_PIN, DISPLAY_BACKLIGHT_OUTPUT_INVERT);
return &backlight;
}
virtual Camera* GetCamera() override {
return camera_;
}
};
DECLARE_BOARD(RymcuBigsmartBoard);


@ -598,6 +598,10 @@ CONFIG_PARTITION_TABLE_MD5=y
# Xiaozhi Assistant
#
CONFIG_OTA_URL="https://api.tenclass.net/xiaozhi/ota/"
CONFIG_USE_DIRECT_WEBSOCKET=y
CONFIG_WEBSOCKET_URL="ws://172.19.0.240:8080"
CONFIG_WEBSOCKET_TOKEN=""
CONFIG_WEBSOCKET_PROTOCOL_VERSION=1
# CONFIG_FLASH_NONE_ASSETS is not set
CONFIG_FLASH_DEFAULT_ASSETS=y
# CONFIG_FLASH_CUSTOM_ASSETS is not set
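
The four `CONFIG_WEBSOCKET_*` options above are consumed at compile time by the `WebsocketProtocol::OpenAudioChannel()` hunk further down, where they overwrite whatever was read from settings. A minimal sketch of that precedence, assuming a hypothetical `Endpoint` struct and `ResolveEndpoint` helper (neither exists in the repository) and with the macros defined inline rather than coming from `sdkconfig.h`:

```cpp
#include <cassert>
#include <string>

// Values copied from the sdkconfig hunk. On a real build these macros are
// generated into sdkconfig.h; they are defined here only for illustration.
#define CONFIG_USE_DIRECT_WEBSOCKET 1
#define CONFIG_WEBSOCKET_URL "ws://172.19.0.240:8080"
#define CONFIG_WEBSOCKET_TOKEN ""
#define CONFIG_WEBSOCKET_PROTOCOL_VERSION 1

// Hypothetical container for the three values the protocol reads.
struct Endpoint {
    std::string url;
    std::string token;
    int version;
};

// `stored` stands in for the values loaded from the device settings.
// When the direct-WebSocket option is enabled, the build-time values win.
Endpoint ResolveEndpoint(Endpoint stored) {
#if CONFIG_USE_DIRECT_WEBSOCKET
    stored.url = CONFIG_WEBSOCKET_URL;
    stored.token = CONFIG_WEBSOCKET_TOKEN;
    stored.version = CONFIG_WEBSOCKET_PROTOCOL_VERSION;
#endif
    return stored;
}
```

One consequence worth noting: with `CONFIG_USE_DIRECT_WEBSOCKET=y`, any URL or token stored on the device is ignored, so the private address in this defaults file is what every build produced from it will dial.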

1929
main/bridge_server.py Normal file

File diff suppressed because it is too large


@ -8,6 +8,7 @@ enum DeviceState {
kDeviceStateIdle,
kDeviceStateConnecting,
kDeviceStateListening,
kDeviceStateThinking,
kDeviceStateSpeaking,
kDeviceStateUpgrading,
kDeviceStateActivating,
@ -15,4 +16,4 @@ enum DeviceState {
kDeviceStateFatalError
};
#endif // _DEVICE_STATE_H_
#endif // _DEVICE_STATE_H_


@ -13,6 +13,7 @@ static const char* const STATE_STRINGS[] = {
"idle",
"connecting",
"listening",
"thinking",
"speaking",
"upgrading",
"activating",
@ -69,9 +70,10 @@ bool DeviceStateMachine::IsValidTransition(DeviceState from, DeviceState to) con
to == kDeviceStateActivating;
case kDeviceStateIdle:
// Can go to connecting, listening (manual mode), speaking, activating, upgrading, or wifi configuring
// Can go to connecting, listening (manual mode), thinking, speaking, activating, upgrading, or wifi configuring
return to == kDeviceStateConnecting ||
to == kDeviceStateListening ||
to == kDeviceStateThinking ||
to == kDeviceStateSpeaking ||
to == kDeviceStateActivating ||
to == kDeviceStateUpgrading ||
@ -83,8 +85,15 @@ bool DeviceStateMachine::IsValidTransition(DeviceState from, DeviceState to) con
to == kDeviceStateListening;
case kDeviceStateListening:
// Can go to speaking or idle
// Can go to thinking, speaking, or idle
return to == kDeviceStateThinking ||
to == kDeviceStateSpeaking ||
to == kDeviceStateIdle;
case kDeviceStateThinking:
// Can go to speaking, listening, or idle
return to == kDeviceStateSpeaking ||
to == kDeviceStateListening ||
to == kDeviceStateIdle;
case kDeviceStateSpeaking:
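
The transitions added around the new thinking state can be checked in isolation. The sketch below copies only the `kDeviceStateListening` and `kDeviceStateThinking` cases from the hunk above; the trimmed enum and the name `IsValidTransitionSketch` are illustrative, and every other source state is rejected only because it is left out here.

```cpp
#include <cassert>

// Trimmed-down enum: only the states these two cases mention.
enum DeviceState {
    kDeviceStateIdle,
    kDeviceStateListening,
    kDeviceStateThinking,
    kDeviceStateSpeaking,
};

// Mirrors the listening and thinking cases of the hunk above.
bool IsValidTransitionSketch(DeviceState from, DeviceState to) {
    switch (from) {
        case kDeviceStateListening:
            // Can go to thinking, speaking, or idle
            return to == kDeviceStateThinking ||
                   to == kDeviceStateSpeaking ||
                   to == kDeviceStateIdle;
        case kDeviceStateThinking:
            // Can go to speaking, listening, or idle
            return to == kDeviceStateSpeaking ||
                   to == kDeviceStateListening ||
                   to == kDeviceStateIdle;
        default:
            // Not modelled in this sketch.
            return false;
    }
}
```

As written in the hunk, thinking can fall back to listening (for example when a turn is interrupted), but a thinking-to-thinking self transition is not in the allowed list.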


@ -167,6 +167,8 @@ void EmoteDisplay::SetStatus(const char* const status)
emote_set_event_msg(emote_handle_, EMOTE_MGR_EVT_LISTEN, NULL);
} else if (std::strcmp(status, Lang::Strings::STANDBY) == 0) {
emote_set_event_msg(emote_handle_, EMOTE_MGR_EVT_IDLE, NULL);
} else if (std::strcmp(status, Lang::Strings::THINKING) == 0) {
emote_set_event_msg(emote_handle_, EMOTE_MGR_EVT_LISTEN, NULL);
} else if (std::strcmp(status, Lang::Strings::SPEAKING) == 0) {
emote_set_event_msg(emote_handle_, EMOTE_MGR_EVT_SPEAK, NULL);
} else if (std::strcmp(status, Lang::Strings::ERROR) == 0) {
@ -247,4 +249,4 @@ void EmoteDisplay::RefreshAll()
}
}
} // namespace emote
} // namespace emote


@ -203,6 +203,7 @@ void LvglDisplay::UpdateStatusBar(bool update_all) {
kDeviceStateStarting,
kDeviceStateWifiConfiguring,
kDeviceStateListening,
kDeviceStateThinking,
kDeviceStateActivating,
};
if (std::find(allowed_states.begin(), allowed_states.end(), device_state) != allowed_states.end()) {


@ -20,12 +20,12 @@ dependencies:
waveshare/custom_io_expander_ch32v003: ^1.0.0
espressif/esp_lcd_panel_io_additions: ^1.0.1
78/esp_lcd_nv3023: ~1.0.0
78/esp-wifi-connect: ~3.1.2
78/esp-wifi-connect: ~3.1.3
espressif/esp_audio_effects: ~1.2.1
espressif/esp_audio_codec: ~2.4.1
78/esp-ml307: ~3.6.5
78/uart-eth-modem:
version: ~0.3.4
version: ~0.4.0
rules:
- if: target not in [esp32]
78/xiaozhi-fonts: ~1.6.0


@ -228,6 +228,11 @@ void CircularStrip::OnStateChanged() {
SetAllColor(color);
break;
}
case kDeviceStateThinking: {
StripColor color = { low_brightness_, low_brightness_, default_brightness_ };
Blink(color, 300);
break;
}
case kDeviceStateUpgrading: {
StripColor color = { low_brightness_, default_brightness_, low_brightness_ };
Blink(color, 100);


@ -235,6 +235,10 @@ void GpioLed::OnStateChanged() {
// TurnOn();
StartFadeTask();
break;
case kDeviceStateThinking:
SetBrightness(DEFAULT_BRIGHTNESS);
StartContinuousBlink(300);
break;
case kDeviceStateSpeaking:
SetBrightness(SPEAKING_BRIGHTNESS);
TurnOn();
@ -260,4 +264,4 @@ void GpioLed::EventTask(void* arg) {
ulTaskNotifyTake(pdTRUE, portMAX_DELAY);
led->OnFadeEnd();
}
}
}


@ -152,6 +152,10 @@ void SingleLed::OnStateChanged() {
SetColor(0, DEFAULT_BRIGHTNESS, 0);
TurnOn();
break;
case kDeviceStateThinking:
SetColor(0, 0, DEFAULT_BRIGHTNESS);
StartContinuousBlink(300);
break;
case kDeviceStateUpgrading:
SetColor(0, DEFAULT_BRIGHTNESS, 0);
StartContinuousBlink(100);
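
The `SingleLed` cases visible in this hunk boil down to a small state-to-pattern table: thinking blinks blue every 300 ms, upgrading blinks green every 100 ms. A rough sketch of that mapping, where `LedPattern`, `PatternFor` and the value of `kDefaultBrightness` are assumptions made for illustration (the real `DEFAULT_BRIGHTNESS` is defined outside this diff):

```cpp
#include <cassert>
#include <cstdint>

// Only the two states whose full case bodies appear in the hunk.
enum DeviceState { kDeviceStateThinking, kDeviceStateUpgrading };

// Hypothetical value; stands in for DEFAULT_BRIGHTNESS.
constexpr std::uint8_t kDefaultBrightness = 4;

// Hypothetical description of what the LED driver is asked to do.
struct LedPattern {
    std::uint8_t r, g, b;
    int blink_interval_ms;
};

LedPattern PatternFor(DeviceState state) {
    switch (state) {
        case kDeviceStateThinking:
            return {0, 0, kDefaultBrightness, 300};  // blue, slow blink
        case kDeviceStateUpgrading:
            return {0, kDefaultBrightness, 0, 100};  // green, fast blink
    }
    return {0, 0, 0, 0};  // unreachable for the two states above
}
```

The same 300 ms interval is used for the thinking state in the `CircularStrip` and `GpioLed` hunks, so the three LED back ends stay visually consistent.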


@ -284,10 +284,8 @@ void McpServer::AddUserOnlyTools() {
}
#endif // HAVE_LVGL
// Assets download url
auto& assets = Assets::GetInstance();
if (assets.partition_valid()) {
AddUserOnlyTool("self.assets.set_download_url", "Set the download url for the assets",
// Assets download url (always registered — Settings storage works regardless of partition layout)
AddUserOnlyTool("self.assets.set_download_url", "Set the download url for the assets",
PropertyList({
Property("url", kPropertyTypeString)
}),
@ -297,7 +295,6 @@ void McpServer::AddUserOnlyTools() {
settings.SetString("download_url", url);
return true;
});
}
}
void McpServer::AddTool(McpTool* tool) {


@ -1,9 +1,22 @@
#include "protocol.h"
#include <esp_log.h>
#include <mbedtls/base64.h>
#define TAG "Protocol"
static std::string Base64Encode(const std::string& data) {
size_t encoded_length = 0;
size_t output_length = 0;
mbedtls_base64_encode(nullptr, 0, &encoded_length,
reinterpret_cast<const unsigned char*>(data.data()), data.size());
std::string result(encoded_length, 0);
mbedtls_base64_encode(reinterpret_cast<unsigned char*>(result.data()), result.size(), &output_length,
reinterpret_cast<const unsigned char*>(data.data()), data.size());
result.resize(output_length);
return result;
}
void Protocol::OnIncomingJson(std::function<void(const cJSON* root)> callback) {
on_incoming_json_ = callback;
}
@ -78,6 +91,27 @@ void Protocol::SendMcpMessage(const std::string& payload) {
SendText(message);
}
void Protocol::SendVisionFrame(const std::string& jpeg_data) {
if (jpeg_data.empty()) {
return;
}
cJSON* root = cJSON_CreateObject();
cJSON_AddStringToObject(root, "session_id", session_id_.c_str());
cJSON_AddStringToObject(root, "type", "vision");
cJSON_AddStringToObject(root, "state", "frame");
cJSON_AddStringToObject(root, "mime_type", "image/jpeg");
auto encoded = Base64Encode(jpeg_data);
cJSON_AddStringToObject(root, "image", encoded.c_str());
char* json_str = cJSON_PrintUnformatted(root);
if (json_str != nullptr) {
SendText(json_str);
cJSON_free(json_str);
}
cJSON_Delete(root);
}
bool Protocol::IsTimeout() const {
const int kTimeoutSeconds = 120;
auto now = std::chrono::steady_clock::now();
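
For testing a bridge server off-device, the message produced by `SendVisionFrame` can be approximated without mbedtls or cJSON. The sketch below uses a plain standard-library Base64 encoder (RFC 4648 alphabet with `=` padding) and string concatenation; `Base64EncodeSketch` and `BuildVisionFrameJson` are illustrative names, the field order is taken from the hunk above, and the session id is assumed to need no JSON escaping (cJSON takes care of that in the real code).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Standard Base64 (RFC 4648), used here as a stand-in for mbedtls_base64_encode.
std::string Base64EncodeSketch(const std::string& data) {
    static const char kAlphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve(((data.size() + 2) / 3) * 4);  // 4 output chars per 3 input bytes
    std::size_t i = 0;
    while (i + 2 < data.size()) {  // full 3-byte groups
        unsigned int n = (static_cast<unsigned char>(data[i]) << 16) |
                         (static_cast<unsigned char>(data[i + 1]) << 8) |
                         static_cast<unsigned char>(data[i + 2]);
        out += kAlphabet[(n >> 18) & 63];
        out += kAlphabet[(n >> 12) & 63];
        out += kAlphabet[(n >> 6) & 63];
        out += kAlphabet[n & 63];
        i += 3;
    }
    const std::size_t rest = data.size() - i;  // 0, 1 or 2 trailing bytes
    if (rest == 1) {
        unsigned int n = static_cast<unsigned char>(data[i]) << 16;
        out += kAlphabet[(n >> 18) & 63];
        out += kAlphabet[(n >> 12) & 63];
        out += "==";
    } else if (rest == 2) {
        unsigned int n = (static_cast<unsigned char>(data[i]) << 16) |
                         (static_cast<unsigned char>(data[i + 1]) << 8);
        out += kAlphabet[(n >> 18) & 63];
        out += kAlphabet[(n >> 12) & 63];
        out += kAlphabet[(n >> 6) & 63];
        out += '=';
    }
    return out;
}

// Same five fields as SendVisionFrame; returns "" for an empty frame,
// matching the early return in the original.
std::string BuildVisionFrameJson(const std::string& session_id,
                                 const std::string& jpeg_data) {
    if (jpeg_data.empty()) {
        return "";
    }
    return "{\"session_id\":\"" + session_id +
           "\",\"type\":\"vision\",\"state\":\"frame\""
           ",\"mime_type\":\"image/jpeg\",\"image\":\"" +
           Base64EncodeSketch(jpeg_data) + "\"}";
}
```

Base64 grows the payload by roughly a third, so sending a JPEG as a JSON text frame costs more bandwidth and heap than a binary WebSocket frame would; the text form has the advantage of reusing the existing `SendText` path.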


@ -73,6 +73,7 @@ public:
virtual void SendStopListening();
virtual void SendAbortSpeaking(AbortReason reason);
virtual void SendMcpMessage(const std::string& message);
virtual void SendVisionFrame(const std::string& jpeg_data);
protected:
std::function<void(const cJSON* root)> on_incoming_json_;
@ -95,4 +96,3 @@ protected:
};
#endif // PROTOCOL_H


@ -85,10 +85,21 @@ bool WebsocketProtocol::OpenAudioChannel() {
std::string url = settings.GetString("url");
std::string token = settings.GetString("token");
int version = settings.GetInt("version");
#if CONFIG_USE_DIRECT_WEBSOCKET
url = CONFIG_WEBSOCKET_URL;
token = CONFIG_WEBSOCKET_TOKEN;
version = CONFIG_WEBSOCKET_PROTOCOL_VERSION;
#endif
if (version != 0) {
version_ = version;
}
if (url.empty()) {
ESP_LOGE(TAG, "Websocket URL is not set");
SetError(Lang::Strings::SERVER_NOT_CONNECTED);
return false;
}
error_occurred_ = false;
auto network = Board::GetInstance().GetNetwork();
@ -108,6 +119,8 @@ bool WebsocketProtocol::OpenAudioChannel() {
websocket_->SetHeader("Protocol-Version", std::to_string(version_).c_str());
websocket_->SetHeader("Device-Id", SystemInfo::GetMacAddress().c_str());
websocket_->SetHeader("Client-Id", Board::GetInstance().GetUuid().c_str());
websocket_->SetHeader("Agent-Mode", Application::GetInstance().GetChatAgentModeName());
websocket_->SetHeader("Chat-Mode", Application::GetInstance().GetChatModeName());
websocket_->OnData([this](const char* data, size_t len, bool binary) {
if (binary) {