Many things, including a log write timing report in the test.

Timing report:

I (52322) LOG_TEST: === WRITE TIMING REPORT ===
I (52322) LOG_TEST: Iterations: 200
I (52322) LOG_TEST: Payload size: 39 bytes
I (52322) LOG_TEST: Min: 49960 us
I (52332) LOG_TEST: Max: 54476 us
I (52332) LOG_TEST: Avg: 50005 us
I (52342) LOG_TEST: Sector crossings: 2 (max 49983 us)
I (52342) LOG_TEST: WDT margin: 4.9s (WDT=5s, worst=54476us)
I (52352) LOG_TEST: ===========================

So a write cycle takes up to 54 ms - not negligible! Note that each measured interval includes the 50 ms vTaskDelay() the benchmark uses to wait for the queue flush.
@@ -25,10 +25,6 @@ See `README.md` for full project documentation (hardware, architecture, protocol
**Current project-specific overrides (sdkconfig.defaults):**
| Setting | Value | Why |
|---------|-------|-----|
| `CONFIG_RTC_CLK_SRC_EXT_CRYS` | y | Use external 32kHz crystal for accurate RTC |
| `CONFIG_ESP32_RTC_EXT_CRYST_ADDIT_CURRENT_V2` | y | Drive high-ESR crystal during startup |
| `CONFIG_ESP_SYSTEM_RTC_EXT_XTAL_BOOTSTRAP_CYCLES` | 500 | Extra bootstrap for slow-starting crystal |
| `CONFIG_RTC_XTAL_CAL_RETRY` | 3 | More calibration attempts before RC fallback |
| `CONFIG_ESP_TASK_WDT_PANIC` | y | WDT timeout → panic → reboot (feeds OTA rollback counter) |

**Already correct at IDF defaults (verified, no override needed):**

30
TODO.md
@@ -7,14 +7,14 @@
|
||||
- [clauded] Confirm brownout detector level — ~2.43V is correct (ESP32 rail protection; battery low-V handled by FSM's `LOW_PROTECTION_V`)
|
||||
- [clauded] Research sdkconfig management best practices — documented in CLAUDE.md "sdkconfig Management" section
|
||||
2. - [clauded] Fix managed_components: removed unused `littlefs` and `tca95x5` deps, pinned `mdns` to `~1.9.1`, bumped IDF min to `>=5.0`; documented in CLAUDE.md
|
||||
3. - [ ] OTA rollback via consecutive-reset counter
|
||||
- [ ] Add `RTC_DATA_ATTR uint8_t reset_counter` — increment on boot, clear after successful health check
|
||||
- [ ] On counter ≥ 5, call `esp_ota_mark_app_invalid_rollback_and_reboot()`
|
||||
- [ ] After POST passes and FSM starts, call `esp_ota_mark_app_valid_cancel_rollback()`
|
||||
- [ ] Decide what "health check passes" means (POST passes? 30s uptime? first successful FSM cycle?)
|
||||
3. - [clauded] OTA rollback via consecutive-reset counter
|
||||
- [clauded] Add `RTC_DATA_ATTR uint8_t ota_reset_counter` — incremented on panic/WDT resets, cleared on power-on/ext reset
|
||||
- [clauded] On counter ≥ 5, call `esp_ota_mark_app_invalid_rollback_and_reboot()`
|
||||
- [clauded] After POST passes and FSM starts, call `esp_ota_mark_app_valid_cancel_rollback()` and clear counter
|
||||
- [clauded] Health check = POST passes + all critical inits + FSM task started + non-critical inits attempted
|
||||
4. - [clauded] Critical init failures (ADC, storage, log, I2C, FSM, UART) → `init_critical()` retries 3×, then `esp_restart()`
|
||||
5. - [clauded] Non-critical init failures (RF, BT, webserver) → log error, continue booting
|
||||
- [ ] WiFi/BT already have restart paths (`webserver_restart_wifi()`, `bt_hid_resume()`) — wire these into a retry-on-failure path at boot, not just soft idle exit
|
||||
- [clauded] WiFi/BT/RF retry once on init failure at boot (200ms delay for RF/BT, 500ms for WiFi), then log and continue
|
||||
6. - [clauded] Power-on self-test (POST) — `init_critical()` wrapper + dedicated POST checks after init
|
||||
- [clauded] ADC: `adc_post()` reads all 4 channels twice with 5ms delay, warns if frozen
|
||||
- [clauded] I2C: `i2c_post()` verifies TCA9555 responds (read port 0)
|
||||
@@ -25,12 +25,12 @@
|
||||
- [ ] Enforce validation inside `commit_params()` (covers both `storage_init()` load and `/set` POST)
|
||||
- [ ] Audit for anywhere params are set without an immediate `commit_params()` call
|
||||
- [ ] Audit abandoned parameters (e.g. jack current) — add comments marking them deprecated
|
||||
8. - [ ] Factory reset: erase entire storage partition (not just params), require 10s button hold, LED indication (flash all → hold solid once triggered)
|
||||
9. - [ ] Ensure RTC_DATA_ATTR variables survive panics/WDT resets
|
||||
- [ ] Verify `sync_unix_us`, `sync_rtc_us`, `rtc_set` (time) are not corrupted by any init path
|
||||
- [ ] Verify `remaining_distance`, `fsm_error` (FSM state) are not zeroed except by intentional reset
|
||||
- [ ] Verify `log_head_offset`, `log_tail_offset` stay consistent after crash (no partial writes)
|
||||
10. - [ ] Measure flash log write duration (bracket with `esp_timer_get_time()`, compare to WDT timeout of 5s)
|
||||
8. - [clauded] Factory reset: erases params + log + post_test partitions, requires 10s button hold on cold boot, LEDs flash during hold → solid when triggered
|
||||
9. - [clauded] Ensure RTC_DATA_ATTR variables survive panics/WDT resets
|
||||
- [clauded] Verified `sync_unix_us`, `sync_rtc_us`, `rtc_set` — no init path zeroes them; `rtc_restore_time()` recovers via RTC HW counter
|
||||
- [clauded] Verified `remaining_distance`, `fsm_error` — `fsm_init()` does not touch them; only cleared by explicit user action
|
||||
- [clauded] Verified `log_head_offset`, `log_tail_offset` — `log_init()` always recovers from flash scan; RTC_DATA_ATTR is historical/harmless
|
||||
10. - [clauded] Measure flash log write duration — `test_log_write_timing()` in log_test.c, runs 200 iterations of 39-byte writes, reports min/max/avg/sector-crossing times, compares to 5s WDT
|
||||
11. - [ ] WiFi STA mode with event-group signaling
|
||||
- [ ] Try connecting to saved STA network first, fall back to softAP on failure/timeout
|
||||
- [ ] Add `EventGroupHandle_t` with `WIFI_READY_BIT` (set when STA connected or softAP up) and `BT_READY_BIT` (set when BT scan task starts)
|
||||
@@ -41,9 +41,9 @@
|
||||
- [ ] Decide: move to main.c (simpler) or keep in `control_task()` (current) — either way, remove the dead commented-out call in main.c and add a clarifying comment
|
||||
- [ ] Audit all ISRs are IRAM-safe: no `ESP_LOGx`, `printf`, `malloc`, or flash access — only `xQueueSendFromISR()`
|
||||
- [ ] Handle `sensors_init()` failure as critical (→ reboot)
|
||||
13. - [ ] Confirm whether external RTC crystal can be dropped (device never enters deep sleep now) — if yes, remove `rtc_xtal_init()` and related sdkconfig entries; if no, document why it must stay
|
||||
14. - [ ] Remove `rtc_wakeup_cause()` call (informational only, no longer needed)
|
||||
15. - [ ] Confirm `rtc_check_shutdown_timer()` uses signed subtraction — then remove the esp_timer overflow TODO comment (int64_t overflows after 292K years)
|
||||
13. - [clauded] External 32kHz crystal not needed (deep sleep disabled, soft idle instead) — removed crystal config from sdkconfig.defaults; `rtc_xtal_init()` already a no-op; crystal remains on PCB but unused
|
||||
14. - [clauded] Removed `rtc_wakeup_cause()` — was unused (informational only, never called)
|
||||
15. - [clauded] Confirmed `rtc_check_shutdown_timer()` uses unsigned `TickType_t` subtraction — wraps correctly; removed esp_timer overflow TODO comment from main.c
|
||||
16. - [ ] Extract pure logic (e-fuse thermal model, param serialization, sensor debounce) into host-testable modules with Unity/CMock
|
||||
17. - [ ] UART integration test framework: Python runner + ESP-side test commands
|
||||
18. - [test] Logtool GUI output (matplotlib)
|
||||
|
||||
17
main/comms_events.h
Normal file
@@ -0,0 +1,17 @@
#ifndef COMMS_EVENTS_H
#define COMMS_EVENTS_H

#include "freertos/FreeRTOS.h"
#include "freertos/event_groups.h"

// Shared event group for WiFi/BT readiness signaling.
// Set by webserver.c and bt_hid.c; waited on by main.c during alarm wake.

#define WIFI_READY_BIT  BIT0  // Set when STA connected or softAP is up
#define BT_READY_BIT    BIT1  // Set when BT scan task starts
#define COMMS_ALL_BITS  (WIFI_READY_BIT | BT_READY_BIT)

// Must be created once (by main.c) before webserver_init() / bt_hid_init()
extern EventGroupHandle_t comms_event_group;

#endif // COMMS_EVENTS_H
@@ -32,14 +32,15 @@
|
||||
|
||||
static QueueHandle_t fsm_cmd_queue = NULL;
|
||||
|
||||
// AUDIT: fsm_init() does not zero these — they persist across panics/WDT resets.
|
||||
// Only cleared by explicit user action (fsm_clear_error, fsm_set_remaining_distance).
|
||||
RTC_DATA_ATTR esp_err_t fsm_error = ESP_OK;
|
||||
esp_err_t fsm_get_error() { return fsm_error; }
|
||||
void fsm_clear_error() { fsm_error = ESP_OK; }
|
||||
|
||||
|
||||
|
||||
int64_t override_time = -1;
|
||||
fsm_override_t override_cmd;
|
||||
//int64_t override_cooldown[8] = {-1};
|
||||
bool enabled = false;
|
||||
|
||||
float this_move_dist = 0.0f;
|
||||
@@ -182,7 +183,7 @@ void control_task(void *param) {
|
||||
const TickType_t xFrequency = pdMS_TO_TICKS(20);
|
||||
enabled = true;
|
||||
|
||||
sensors_init(); // TODO: Why is this *here* rather than in main?
|
||||
// sensors_init() is called from main.c as a critical init (before FSM starts)
|
||||
|
||||
while (enabled) {
|
||||
vTaskDelayUntil(&xLastWakeTime, xFrequency);
|
||||
|
||||
@@ -4,6 +4,7 @@
|
||||
#include "esp_log.h"
|
||||
#include "esp_err.h"
|
||||
#include "esp_task_wdt.h"
|
||||
#include "esp_timer.h"
|
||||
#include <string.h>
|
||||
#include <stdio.h>
|
||||
|
||||
@@ -1087,6 +1088,87 @@ int count_passed_tests(test_result_t* results, int num_tests) {
    return passed;
}

// ============================================================================
// Write timing benchmark - measures blocking write duration
// ============================================================================
void test_log_write_timing(void) {
    ESP_LOGI(TAG, "");
    ESP_LOGI(TAG, "=== Log Write Timing Benchmark ===");

    // Erase and reinit to get a clean state
    esp_err_t err = log_erase_all_sectors();
    if (err != ESP_OK) {
        ESP_LOGE(TAG, "Failed to erase log for timing test");
        return;
    }
    esp_task_wdt_reset();

    // Use a 39-byte payload (typical FSM log entry size)
    uint8_t payload[39];
    for (size_t i = 0; i < sizeof(payload); i++) payload[i] = (uint8_t)i;

    #define TIMING_ITERATIONS 200

    int64_t min_us = INT64_MAX;
    int64_t max_us = 0;
    int64_t total_us = 0;
    int ok_count = 0;  // successful writes only; failed writes are excluded from the stats
    int sector_cross_count = 0;
    int64_t sector_cross_max_us = 0;

    uint32_t prev_head = log_get_head();

    for (int i = 0; i < TIMING_ITERATIONS; i++) {
        // NOTE: the measured interval includes the 50 ms queue-flush delay below,
        // so every sample has a ~50 ms floor on top of the write itself.
        int64_t t0 = esp_timer_get_time();
        err = log_write_blocking_test(payload, sizeof(payload), LOG_TYPE_DATA);
        // Wait for queue flush
        vTaskDelay(pdMS_TO_TICKS(50));
        int64_t t1 = esp_timer_get_time();

        // Feed the WDT before the error check so a run of failures cannot starve it
        if (i % 50 == 0) esp_task_wdt_reset();

        if (err != ESP_OK) {
            ESP_LOGE(TAG, "Write %d failed: %s", i, esp_err_to_name(err));
            continue;
        }

        int64_t dt = t1 - t0;
        total_us += dt;
        ok_count++;
        if (dt < min_us) min_us = dt;
        if (dt > max_us) max_us = dt;

        // Detect sector crossing (head wrapped or jumped by > payload size)
        uint32_t cur_head = log_get_head();
        if (cur_head < prev_head || (cur_head - prev_head) > sizeof(payload) + 10) {
            sector_cross_count++;
            if (dt > sector_cross_max_us) sector_cross_max_us = dt;
        }
        prev_head = cur_head;
    }

    // Average over successful writes only; avoid reporting INT64_MAX if none succeeded
    int64_t avg_us = ok_count > 0 ? total_us / ok_count : 0;
    if (ok_count == 0) min_us = 0;

    ESP_LOGI(TAG, "");
    ESP_LOGI(TAG, "=== WRITE TIMING REPORT ===");
    ESP_LOGI(TAG, "  Iterations:   %d (%d succeeded)", TIMING_ITERATIONS, ok_count);
    ESP_LOGI(TAG, "  Payload size: %d bytes", (int)sizeof(payload));
    ESP_LOGI(TAG, "  Min:          %lld us", (long long)min_us);
    ESP_LOGI(TAG, "  Max:          %lld us", (long long)max_us);
    ESP_LOGI(TAG, "  Avg:          %lld us", (long long)avg_us);
    ESP_LOGI(TAG, "  Sector crossings: %d (max %lld us)", sector_cross_count, (long long)sector_cross_max_us);
    ESP_LOGI(TAG, "  WDT margin:   %.1fs (WDT=5s, worst=%lldus)",
             5.0 - (double)max_us / 1000000.0, (long long)max_us);
    if (max_us > 1000000) {
        ESP_LOGW(TAG, "  WARNING: max write > 1s - close to WDT timeout!");
    } else if (max_us > 100000) {
        ESP_LOGI(TAG, "  Note: max write > 100ms (expected during sector erase)");
    }
    ESP_LOGI(TAG, "===========================");
    ESP_LOGI(TAG, "");

    #undef TIMING_ITERATIONS
    esp_task_wdt_reset();
}

|
||||
// Main test runner
|
||||
esp_err_t run_all_log_tests(void) {
|
||||
ESP_LOGI(TAG, "\n\n");
|
||||
@@ -1169,12 +1251,12 @@ esp_err_t run_all_log_tests(void) {
|
||||
|
||||
if (passed == num_tests) {
|
||||
ESP_LOGI(TAG, "ALL TESTS PASSED!");
|
||||
|
||||
// Run write timing benchmark as a final report (not a pass/fail test)
|
||||
test_log_write_timing();
|
||||
} else {
|
||||
ESP_LOGE(TAG, "SOME TESTS FAILED!");
|
||||
|
||||
}
|
||||
|
||||
|
||||
while(1) { esp_task_wdt_reset(); vTaskDelay(pdMS_TO_TICKS(100)); }
|
||||
return ESP_OK;
|
||||
}
|
||||
@@ -40,6 +40,9 @@ bool test_log_full_partition(void);
|
||||
bool test_log_read_after_write(void);
|
||||
bool test_log_multiple_types(void);
|
||||
|
||||
// Write timing benchmark (not a pass/fail test — prints min/max/avg report)
|
||||
void test_log_write_timing(void);
|
||||
|
||||
// Helper functions for testing
|
||||
void print_test_results(test_result_t* results, int num_tests);
|
||||
int count_passed_tests(test_result_t* results, int num_tests);
|
||||
|
||||
157
main/main.c
@@ -1,5 +1,6 @@
|
||||
#include "esp_task_wdt.h"
|
||||
#include "esp_system.h"
|
||||
#include "esp_ota_ops.h"
|
||||
#include "i2c.h"
|
||||
#include "log_test.h"
|
||||
#include "partition_test.h"
|
||||
@@ -16,12 +17,20 @@
|
||||
#include "rf_433.h"
|
||||
#include "bt_hid.h"
|
||||
#include "webserver.h"
|
||||
#include "comms_events.h"
|
||||
#include "version.h"
|
||||
#include <string.h>
|
||||
|
||||
EventGroupHandle_t comms_event_group = NULL;
|
||||
|
||||
#define TAG "MAIN"
|
||||
|
||||
#define POST_MAX_RETRIES 3
|
||||
#define OTA_ROLLBACK_THRESHOLD 5
|
||||
#define FACTORY_RESET_HOLD_MS 10000
|
||||
|
||||
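// Intended to survive resets (panic, WDT, sw reset) but NOT power-on or external reset.
// NOTE: verify on target. In ESP-IDF the bootloader reloads RTC_DATA_ATTR data on boots
// that are not a deep-sleep wake; RTC_NOINIT_ATTR is the attribute meant to keep values across resets.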
|
||||
RTC_DATA_ATTR static uint8_t ota_reset_counter = 0;
|
||||
|
||||
// Try an init function up to POST_MAX_RETRIES times. On final failure, reboot.
|
||||
// Critical inits (ADC, I2C, storage, FSM, sensors) use this — a permanent failure
|
||||
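The critical-init policy described above (try up to `POST_MAX_RETRIES` times, reboot on final failure) can be exercised on a host by injecting the init function and the reboot action. The sketch below is an illustration under that assumption, not the project's `init_critical()`: `init_with_retries`, the `int` return convention (0 = success, standing in for `ESP_OK`) and the stub functions are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

typedef int  (*init_fn_t)(void);    // returns 0 on success (stand-in for ESP_OK)
typedef void (*reboot_fn_t)(void);  // stand-in for esp_restart()

// Call init() up to max_retries times; invoke reboot() if every attempt fails.
// Returns true on success, false if the reboot hook was called.
static bool init_with_retries(const char *name, init_fn_t init,
                              int max_retries, reboot_fn_t reboot) {
    assert(name && init && reboot && max_retries > 0);
    for (int attempt = 1; attempt <= max_retries; attempt++) {
        if (init() == 0) return true;
        fprintf(stderr, "%s init failed (attempt %d/%d)\n", name, attempt, max_retries);
    }
    reboot();  // on target this would not return
    return false;
}

// Example stubs for host-side experimentation.
static int g_calls = 0;
static int g_reboots = 0;
static int  flaky_init(void)  { return ++g_calls < 3 ? -1 : 0; }  // fails twice, then succeeds
static int  broken_init(void) { ++g_calls; return -1; }           // always fails
static void fake_reboot(void) { ++g_reboots; }
```

Passing the reboot action as a function pointer keeps the retry loop testable off-target; on the device the hook would simply restart the chip.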
@@ -137,49 +146,58 @@ void app_main(void) {esp_task_wdt_add(NULL);
|
||||
drive_leds(LED_STATE_BOOTING);
|
||||
|
||||
|
||||
// Check for factory reset condition: Cold boot (power-on/ext-reset) + button held
|
||||
// Factory reset: cold boot + button held for 10s
|
||||
// LEDs flash while waiting, go solid when triggered
|
||||
esp_reset_reason_t boot_reset_reason = esp_reset_reason();
|
||||
if ((boot_reset_reason == ESP_RST_POWERON || boot_reset_reason == ESP_RST_EXT)
|
||||
&& gpio_get_level(GPIO_NUM_13) == 0) {
|
||||
ESP_LOGW(TAG, "FACTORY RESET TRIGGERED - Button held on cold boot");
|
||||
|
||||
// Flash LED pattern to indicate factory reset
|
||||
for (int i = 0; i < 10; i++) {
|
||||
i2c_set_led1(0b111);
|
||||
ESP_LOGW(TAG, "Button held on cold boot — hold %ds for factory reset", FACTORY_RESET_HOLD_MS / 1000);
|
||||
|
||||
// Flash all LEDs while user holds button (100ms on/off cycle)
|
||||
int held_ms = 0;
|
||||
while (gpio_get_level(GPIO_NUM_13) == 0 && held_ms < FACTORY_RESET_HOLD_MS) {
|
||||
i2c_set_led1((held_ms / 100) % 2 ? 0b111 : 0b000);
|
||||
vTaskDelay(pdMS_TO_TICKS(100));
|
||||
held_ms += 100;
|
||||
esp_task_wdt_reset();
|
||||
}
|
||||
|
||||
if (held_ms < FACTORY_RESET_HOLD_MS) {
|
||||
ESP_LOGI(TAG, "Button released early (%dms) — skipping factory reset", held_ms);
|
||||
i2c_set_led1(0b000);
|
||||
vTaskDelay(pdMS_TO_TICKS(100));
|
||||
}
|
||||
|
||||
// Initialize minimal components needed for factory reset
|
||||
if (storage_init() != ESP_OK) ESP_LOGE(TAG, "STORAGE FAILED");
|
||||
|
||||
// Perform factory reset
|
||||
esp_err_t reset_err = factory_reset();
|
||||
if (reset_err == ESP_OK) {
|
||||
ESP_LOGI(TAG, "Factory reset completed successfully");
|
||||
// Flash success pattern
|
||||
for (int i = 0; i < 5; i++) {
|
||||
i2c_set_led1(0b010);
|
||||
vTaskDelay(pdMS_TO_TICKS(200));
|
||||
i2c_set_led1(0b000);
|
||||
vTaskDelay(pdMS_TO_TICKS(200));
|
||||
}
|
||||
} else {
|
||||
ESP_LOGE(TAG, "Factory reset failed!");
|
||||
// Flash error pattern
|
||||
for (int i = 0; i < 5; i++) {
|
||||
i2c_set_led1(0b100);
|
||||
vTaskDelay(pdMS_TO_TICKS(200));
|
||||
i2c_set_led1(0b000);
|
||||
vTaskDelay(pdMS_TO_TICKS(200));
|
||||
// Solid LEDs = reset triggered
|
||||
i2c_set_led1(0b111);
|
||||
ESP_LOGW(TAG, "FACTORY RESET TRIGGERED");
|
||||
|
||||
// Initialize storage so we can erase it
|
||||
if (storage_init() != ESP_OK) ESP_LOGE(TAG, "STORAGE FAILED");
|
||||
|
||||
esp_err_t reset_err = factory_reset();
|
||||
if (reset_err == ESP_OK) {
|
||||
ESP_LOGI(TAG, "Factory reset completed successfully");
|
||||
// Success: green blink
|
||||
for (int i = 0; i < 5; i++) {
|
||||
i2c_set_led1(0b010);
|
||||
vTaskDelay(pdMS_TO_TICKS(200));
|
||||
i2c_set_led1(0b000);
|
||||
vTaskDelay(pdMS_TO_TICKS(200));
|
||||
}
|
||||
} else {
|
||||
ESP_LOGE(TAG, "Factory reset failed!");
|
||||
// Error: red blink
|
||||
for (int i = 0; i < 5; i++) {
|
||||
i2c_set_led1(0b100);
|
||||
vTaskDelay(pdMS_TO_TICKS(200));
|
||||
i2c_set_led1(0b000);
|
||||
vTaskDelay(pdMS_TO_TICKS(200));
|
||||
}
|
||||
}
|
||||
|
||||
ESP_LOGI(TAG, "Rebooting system...");
|
||||
vTaskDelay(pdMS_TO_TICKS(1000));
|
||||
esp_restart();
|
||||
}
|
||||
|
||||
// Reboot the system
|
||||
ESP_LOGI(TAG, "Rebooting system...");
|
||||
vTaskDelay(pdMS_TO_TICKS(1000));
|
||||
esp_restart();
|
||||
}
|
||||
|
||||
// Critical inits — retry up to 3 times, then reboot
|
||||
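The hold-to-confirm logic in the factory reset path polls the button in fixed steps and only proceeds if the full hold time elapses. A small host-side sketch of that loop, assuming the button read and the per-step side effects (LED blink, delay, WDT feed) are injected as callbacks; `measure_hold_ms`, `fake_button_t` and `fake_pressed` are hypothetical names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef bool (*button_pressed_fn_t)(void *ctx);      // true while the button is held
typedef void (*tick_fn_t)(void *ctx, int held_ms);   // per-poll hook: blink LED, delay, feed WDT

// Poll the button every step_ms until it is released or hold_ms has elapsed.
// Returns the time held in ms; the reset should fire only if result >= hold_ms.
static int measure_hold_ms(button_pressed_fn_t pressed, tick_fn_t tick,
                           void *ctx, int hold_ms, int step_ms) {
    assert(pressed && hold_ms > 0 && step_ms > 0);
    int held_ms = 0;
    while (held_ms < hold_ms && pressed(ctx)) {
        if (tick) tick(ctx, held_ms);
        held_ms += step_ms;
    }
    return held_ms;
}

// Example stub: a button that is released after a fixed number of polls.
typedef struct { int polls_left; } fake_button_t;

static bool fake_pressed(void *ctx) {
    fake_button_t *b = ctx;
    return b->polls_left-- > 0;
}
```

A button released after 30 polls at 100 ms per step yields 3000 ms, below the 10000 ms threshold, so the reset would be skipped.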
@@ -191,7 +209,7 @@ void app_main(void) {esp_task_wdt_add(NULL);
|
||||
adc_post(); // ADC channels readable and not frozen
|
||||
storage_post(); // flash write-read-verify on test sector
|
||||
|
||||
//run_all_log_tests();
|
||||
run_all_log_tests();
|
||||
|
||||
esp_reset_reason_t reset_reason = esp_reset_reason();
|
||||
esp_sleep_wakeup_cause_t wake_cause = esp_sleep_get_wakeup_cause();
|
||||
@@ -205,18 +223,31 @@ void app_main(void) {esp_task_wdt_add(NULL);
|
||||
log_write(boot_entry, sizeof(boot_entry), LOG_TYPE_BOOT);
|
||||
}
|
||||
|
||||
// TODO: OTA rollback counter (see TODO.md #3)
|
||||
// Write a crash log entry if we rebooted unexpectedly
|
||||
if (reset_reason == ESP_RST_PANIC ||
|
||||
reset_reason == ESP_RST_INT_WDT ||
|
||||
reset_reason == ESP_RST_TASK_WDT ||
|
||||
reset_reason == ESP_RST_WDT) {
|
||||
ESP_LOGW(TAG, "Crash detected! Reset reason: %d", reset_reason);
|
||||
// OTA rollback: count consecutive abnormal resets (panic/WDT).
|
||||
// Power-on and external resets clear the counter; crashes increment it.
|
||||
// After OTA_ROLLBACK_THRESHOLD consecutive crashes, roll back to the
|
||||
// previous OTA partition (if available).
|
||||
if (reset_reason == ESP_RST_POWERON || reset_reason == ESP_RST_EXT) {
|
||||
ota_reset_counter = 0;
|
||||
} else if (reset_reason == ESP_RST_PANIC ||
|
||||
reset_reason == ESP_RST_INT_WDT ||
|
||||
reset_reason == ESP_RST_TASK_WDT ||
|
||||
reset_reason == ESP_RST_WDT) {
|
||||
ota_reset_counter++;
|
||||
ESP_LOGW(TAG, "Crash detected (reason=%d), reset counter=%d/%d",
|
||||
reset_reason, ota_reset_counter, OTA_ROLLBACK_THRESHOLD);
|
||||
|
||||
uint8_t crash_entry[9] = {};
|
||||
uint64_t ts = rtc_get_ms();
|
||||
memcpy(&crash_entry[0], &ts, 8);
|
||||
crash_entry[8] = (uint8_t)reset_reason;
|
||||
log_write(crash_entry, sizeof(crash_entry), LOG_TYPE_CRASH);
|
||||
|
||||
if (ota_reset_counter >= OTA_ROLLBACK_THRESHOLD) {
|
||||
ESP_LOGE(TAG, "Rollback threshold reached — marking app invalid");
|
||||
esp_ota_mark_app_invalid_rollback_and_reboot();
|
||||
// On success this does not return (reboots into previous OTA slot);
// it returns an error if no valid slot is available to roll back to.
|
||||
}
|
||||
}
|
||||
|
||||
if (solar_run_fsm() != ESP_OK) ESP_LOGE(TAG, "SOLAR FAILED");
|
||||
@@ -226,13 +257,33 @@ void app_main(void) {esp_task_wdt_add(NULL);
|
||||
/*** FULL BOOT ***/
|
||||
// Critical — must succeed or reboot
|
||||
init_critical("UART", uart_init);
|
||||
init_critical("SENSORS", sensors_init);
|
||||
init_critical("FSM", fsm_init);
|
||||
// sensors_init() is called inside control_task() — see control_fsm.c:185
|
||||
|
||||
// Non-critical — log error but continue booting
|
||||
if (rf_433_init() != ESP_OK) ESP_LOGE(TAG, "RF FAILED");
|
||||
if (bt_hid_init() != ESP_OK) ESP_LOGE(TAG, "BT HID FAILED");
|
||||
if (webserver_init() != ESP_OK) ESP_LOGE(TAG, "WEBSERVER FAILED");
|
||||
// Create event group before non-critical inits (they set bits on it)
|
||||
comms_event_group = xEventGroupCreate();
|
||||
|
||||
// Non-critical — retry once on failure, then log and continue
|
||||
if (rf_433_init() != ESP_OK) {
|
||||
ESP_LOGW(TAG, "RF init failed, retrying...");
|
||||
vTaskDelay(pdMS_TO_TICKS(200));
|
||||
if (rf_433_init() != ESP_OK) ESP_LOGE(TAG, "RF FAILED (continuing without RF)");
|
||||
}
|
||||
if (bt_hid_init() != ESP_OK) {
|
||||
ESP_LOGW(TAG, "BT init failed, retrying...");
|
||||
vTaskDelay(pdMS_TO_TICKS(200));
|
||||
if (bt_hid_init() != ESP_OK) ESP_LOGE(TAG, "BT HID FAILED (continuing without BT)");
|
||||
}
|
||||
if (webserver_init() != ESP_OK) {
|
||||
ESP_LOGW(TAG, "Webserver init failed, retrying...");
|
||||
vTaskDelay(pdMS_TO_TICKS(500));
|
||||
if (webserver_init() != ESP_OK) ESP_LOGE(TAG, "WEBSERVER FAILED (continuing without WiFi)");
|
||||
}
|
||||
|
||||
// POST + FSM started successfully — this firmware is good.
|
||||
// Clear the rollback counter and mark the OTA partition as valid.
|
||||
ota_reset_counter = 0;
|
||||
esp_ota_mark_app_valid_cancel_rollback();
|
||||
|
||||
/*** MAIN LOOP ***/
|
||||
TickType_t xLastWakeTime = xTaskGetTickCount();
|
||||
@@ -254,8 +305,12 @@ void app_main(void) {esp_task_wdt_add(NULL);
|
||||
if (rtc_alarm_tripped()) {
|
||||
soft_idle_exit();
|
||||
xLastWakeTime = xTaskGetTickCount();
|
||||
vTaskDelay(pdMS_TO_TICKS(500));
|
||||
// TODO: do a hard wait until wifi and bluetooth come up, not just blindly wait; might be better to be non-blocking
|
||||
// Wait for WiFi + BT to come back up (or timeout after 5s)
|
||||
if (comms_event_group) {
|
||||
xEventGroupWaitBits(comms_event_group, COMMS_ALL_BITS,
|
||||
pdFALSE, pdTRUE, pdMS_TO_TICKS(5000));
|
||||
}
|
||||
esp_task_wdt_reset();
|
||||
fsm_request(FSM_CMD_START);
|
||||
rtc_schedule_next_alarm();
|
||||
}
|
||||
@@ -364,7 +419,7 @@ void app_main(void) {esp_task_wdt_add(NULL);
|
||||
}
|
||||
|
||||
solar_run_fsm();
|
||||
rtc_check_shutdown_timer(); // TODO: Will esp timer overflow? Handle overflow if needed (this used to be handled by the fact that we were in deep sleep)
|
||||
rtc_check_shutdown_timer();
|
||||
esp_task_wdt_reset();
|
||||
}
|
||||
}
|
||||
18
main/rtc.c
@@ -44,7 +44,9 @@ static uint64_t rtc_hw_time_us(void)
|
||||
|
||||
uint64_t last_activity_tick = 0;
|
||||
|
||||
// RTC_DATA_ATTR keeps these in RTC memory; persists across software resets (panics, WDT)
|
||||
// RTC_DATA_ATTR keeps these in RTC memory; persists across software resets (panics, WDT).
|
||||
// AUDIT: no init path zeroes these — rtc_restore_time() recovers via RTC HW counter,
|
||||
// rtc_set_s() is only called explicitly by the user. Verified 2026-03-12.
|
||||
RTC_DATA_ATTR int64_t next_alarm_time_s = -1;
|
||||
RTC_DATA_ATTR bool rtc_set = false;
|
||||
RTC_DATA_ATTR int64_t sync_unix_us = 0; // Unix time in µs at last rtc_set_s() call
|
||||
@@ -160,22 +162,14 @@ int64_t rtc_get_s_in_day(void)
|
||||
return rtc_get_s() % 86400UL;
|
||||
}
|
||||
|
||||
esp_sleep_wakeup_cause_t rtc_wakeup_cause(void)
|
||||
{
|
||||
esp_sleep_wakeup_cause_t c = esp_sleep_get_wakeup_cause();
|
||||
switch (c) {
|
||||
case ESP_SLEEP_WAKEUP_EXT0: ESP_LOGI("RTC", "Wakeup: GPIO"); break;
|
||||
case ESP_SLEEP_WAKEUP_TIMER: ESP_LOGI("RTC", "Wakeup: timer"); break;
|
||||
default: ESP_LOGI("RTC", "Wakeup: normal boot"); break;
|
||||
}
|
||||
return c;
|
||||
}
|
||||
|
||||
/* -------------------------------------------------------------------------- */
|
||||
/* Unified periodic update */
|
||||
/* -------------------------------------------------------------------------- */
|
||||
void rtc_check_shutdown_timer(void)
|
||||
{
|
||||
// Unsigned subtraction handles TickType_t (uint32_t) wraparound correctly:
|
||||
// e.g. if tick wrapped from 0xFFFFFFFE to 5, elapsed = 5 - 0xFFFFFFFE = 7.
|
||||
// At 1ms/tick, uint32_t wraps after ~49.7 days — well beyond the 180s timeout.
|
||||
TickType_t elapsed = xTaskGetTickCount() - last_activity_tick;
|
||||
if (elapsed * portTICK_PERIOD_MS >= POWER_INACTIVITY_TIMEOUT_MS)
|
||||
soft_idle_enter();
|
||||
|
||||
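The wraparound argument in the `rtc_check_shutdown_timer()` comment can be checked in isolation: subtracting two readings of a free-running 32-bit counter with unsigned arithmetic yields the elapsed ticks even across one wrap. A minimal sketch, assuming a 32-bit tick type; `ticks_elapsed` and `timeout_expired` are hypothetical helper names, and the widening to 64 bits before the multiply is an addition of this sketch (it only matters when the tick period is larger than 1 ms):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Elapsed ticks between two readings of a free-running 32-bit counter.
// Unsigned subtraction is modulo 2^32, so a single wrap between `then`
// and `now` still produces the correct difference.
static uint32_t ticks_elapsed(uint32_t now, uint32_t then) {
    return now - then;
}

// True once at least timeout_ms has passed, given the tick period in ms.
static bool timeout_expired(uint32_t now, uint32_t then,
                            uint32_t tick_period_ms, uint32_t timeout_ms) {
    assert(tick_period_ms > 0);
    // Widen before multiplying so elapsed * period cannot overflow 32 bits.
    return (uint64_t)ticks_elapsed(now, then) * tick_period_ms >= timeout_ms;
}
```

The example from the comment, a tick that wrapped from 0xFFFFFFFE to 5, gives `ticks_elapsed(5, 0xFFFFFFFE) == 7`.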
@@ -36,7 +36,6 @@ void soft_idle_enter(void);
|
||||
void soft_idle_exit(void);
|
||||
bool soft_idle_is_active(void);
|
||||
bool soft_idle_button_raw(void); /* direct GPIO read, no I2C */
|
||||
esp_sleep_wakeup_cause_t rtc_wakeup_cause();
|
||||
|
||||
/*void adjust_rtc_hour(char *key, int8_t dir);
|
||||
void adjust_rtc_min(char *key, int8_t dir);*/
|
||||
|
||||
@@ -173,45 +173,6 @@ int8_t pack_sensors() {
|
||||
return ret;
|
||||
}
|
||||
|
||||
/*esp_err_t sensors_init() {
|
||||
gpio_config_t io_conf = {
|
||||
.pin_bit_mask = (1ULL << sensor_pins[0]) | (1ULL << sensor_pins[1]),
|
||||
.mode = GPIO_MODE_INPUT,
|
||||
.pull_up_en = GPIO_PULLUP_ENABLE,
|
||||
.pull_down_en = GPIO_PULLDOWN_DISABLE,
|
||||
.intr_type = GPIO_INTR_ANYEDGE,
|
||||
};
|
||||
ESP_ERROR_CHECK(gpio_config(&io_conf));
|
||||
|
||||
sensor_event_queue = xQueueCreate(16, sizeof(sensor_event_t));
|
||||
if (!sensor_event_queue) {
|
||||
ESP_LOGE(TAG, "Failed to create sensor queue");
|
||||
return ESP_FAIL;
|
||||
}
|
||||
|
||||
// Install ISR service
|
||||
ESP_ERROR_CHECK(gpio_install_isr_service(0));
|
||||
|
||||
for (uint8_t i = 0; i < N_SENSORS; i++) {
|
||||
ESP_ERROR_CHECK(gpio_isr_handler_add(sensor_pins[i], sensor_isr_handler, INT2VOIDP(sensor_pins[i])));
|
||||
sensor_stable_state[i] = !gpio_get_level(sensor_pins[i]);
|
||||
}
|
||||
|
||||
xTaskCreate(sensor_debounce_task, "SENSORS", 3072, NULL, 6, NULL);
|
||||
|
||||
return ESP_OK;
|
||||
}
|
||||
|
||||
esp_err_t sensors_stop() {
|
||||
for (uint8_t i = 0; i < N_SENSORS; i++) {
|
||||
gpio_isr_handler_remove(sensor_pins[i]);
|
||||
}
|
||||
gpio_uninstall_isr_service();
|
||||
vQueueDelete(sensor_event_queue);
|
||||
|
||||
return ESP_OK;
|
||||
}*/
|
||||
|
||||
// Public API
|
||||
bool get_sensor(sensor_t i) {
|
||||
return sensor_stable_state[i];
|
||||
|
||||
@@ -113,8 +113,10 @@ static const esp_partition_t *params_partition = NULL;
|
||||
static const esp_partition_t *log_partition = NULL;
|
||||
static const esp_partition_t *post_partition = NULL;
|
||||
|
||||
// Log head/tail tracking with mutex protection
|
||||
// These track byte offsets within the log partition (0-based)
|
||||
// Log head/tail tracking with mutex protection.
|
||||
// These track byte offsets within the log partition (0-based).
|
||||
// RTC_DATA_ATTR is historical — log_init() always recovers these from a flash scan,
|
||||
// so the RTC values are overwritten on every boot. No partial-write risk.
|
||||
RTC_DATA_ATTR static uint32_t log_head_offset = 0;
|
||||
RTC_DATA_ATTR static uint32_t log_tail_offset = 0;
|
||||
RTC_DATA_ATTR static bool log_initialized = false;
|
||||
@@ -407,21 +409,46 @@ esp_err_t commit_params(void) {
|
||||
// ============================================================================
|
||||
esp_err_t factory_reset(void) {
|
||||
ESP_LOGI(TAG, "Performing factory reset...");
|
||||
|
||||
|
||||
// Reset all parameters to defaults
|
||||
for (int i = 0; i < NUM_PARAMS; i++) {
|
||||
memcpy(¶meter_table[i], ¶meter_defaults[i], sizeof(param_value_t));
|
||||
}
|
||||
|
||||
// TODO: WIPE ENTIRE PARTITION
|
||||
|
||||
// Commit defaults to flash
|
||||
|
||||
// Commit defaults to params partition
|
||||
esp_err_t err = commit_params();
|
||||
if (err != ESP_OK) {
|
||||
ESP_LOGE(TAG, "Failed to commit defaults during factory reset");
|
||||
return err;
|
||||
}
|
||||
|
||||
|
||||
// Erase the log partition
|
||||
const esp_partition_t *log_part = esp_partition_find_first(
|
||||
ESP_PARTITION_TYPE_DATA, ESP_PARTITION_SUBTYPE_ANY, "log");
|
||||
if (log_part != NULL) {
|
||||
ESP_LOGI(TAG, "Erasing log partition (%lu bytes)...", (unsigned long)log_part->size);
|
||||
err = esp_partition_erase_range(log_part, 0, log_part->size);
|
||||
if (err != ESP_OK) {
|
||||
ESP_LOGE(TAG, "Failed to erase log partition: %s", esp_err_to_name(err));
|
||||
return err;
|
||||
}
|
||||
}
|
||||
|
||||
// Erase the POST test partition
|
||||
const esp_partition_t *post_part = esp_partition_find_first(
|
||||
ESP_PARTITION_TYPE_DATA, ESP_PARTITION_SUBTYPE_ANY, "post_test");
|
||||
if (post_part != NULL) {
|
||||
err = esp_partition_erase_range(post_part, 0, post_part->size);
|
||||
if (err != ESP_OK) {
|
||||
ESP_LOGW(TAG, "Failed to erase post_test partition: %s", esp_err_to_name(err));
|
||||
}
|
||||
}
|
||||
|
||||
// Reset log state so next boot starts fresh
|
||||
log_head_offset = 0;
|
||||
log_tail_offset = 0;
|
||||
log_initialized = false;
|
||||
|
||||
ESP_LOGI(TAG, "Factory reset complete");
|
||||
return ESP_OK;
|
||||
}
|
||||
|
||||
@@ -39,6 +39,7 @@
|
||||
|
||||
#include "webpage.h"
|
||||
#include "webserver.h"
|
||||
#include "comms_events.h"
|
||||
|
||||
#include "esp_partition.h"
|
||||
|
||||
@@ -1017,6 +1018,7 @@ static esp_err_t try_connect_sta(const char *ssid, const char *pass, bool reset_
|
||||
}
|
||||
|
||||
s_wifi_running = true;
|
||||
if (comms_event_group) xEventGroupSetBits(comms_event_group, WIFI_READY_BIT);
|
||||
return ESP_OK;
|
||||
}
|
||||
|
||||
@@ -1099,6 +1101,7 @@ static esp_err_t launch_soft_ap(void) {
|
||||
ESP_LOGI(TAG, "SoftAP ready. SSID: %s, Channel: %d, Password: %s",
|
||||
wifi_config.ap.ssid, wifi_config.ap.channel, placeholder);
|
||||
ESP_LOGI(TAG, "Access at: http://%s.local or http://192.168.4.1", HOSTNAME);
|
||||
if (comms_event_group) xEventGroupSetBits(comms_event_group, WIFI_READY_BIT);
|
||||
return ESP_OK;
|
||||
}
|
||||
|
||||
@@ -1124,6 +1127,7 @@ esp_err_t webserver_stop(void) {
|
||||
esp_wifi_stop();
|
||||
s_wifi_running = false;
|
||||
}
|
||||
if (comms_event_group) xEventGroupClearBits(comms_event_group, WIFI_READY_BIT);
|
||||
return ESP_OK;
|
||||
}
|
||||
|
||||
|
||||
@@ -2,24 +2,10 @@
|
||||
# These are applied during "idf.py reconfigure" and take precedence over IDF defaults.
|
||||
# Do NOT override settings that are managed by idf.py menuconfig interactively.
|
||||
|
||||
# Use external 32kHz crystal for the RTC slow clock (GPIO32/33 on the PCB).
|
||||
# This gives accurate timekeeping across deep sleep instead of the +/-5% internal RC.
|
||||
CONFIG_RTC_CLK_SRC_EXT_CRYS=y
|
||||
|
||||
# Enable additional drive current for the external 32kHz crystal.
|
||||
# Required for high-ESR tuning-fork crystals (e.g. CM315D32768DZFT ~70kΩ ESR, CL=12.5pF).
|
||||
# Without this the ESP32 oscillator can't drive the crystal reliably.
|
||||
# V2 injects extra current only during the oscillation startup window.
|
||||
CONFIG_ESP32_RTC_EXT_CRYST_ADDIT_CURRENT_V2=y
|
||||
|
||||
# Increase bootstrap cycles for high-ESR crystal.
|
||||
# Default of 5 is insufficient; 500 gives the oscillator enough time to build amplitude.
|
||||
CONFIG_ESP_SYSTEM_RTC_EXT_XTAL_BOOTSTRAP_CYCLES=500
|
||||
CONFIG_ESP32_RTC_XTAL_BOOTSTRAP_CYCLES=500
|
||||
|
||||
# Allow more calibration retries before falling back to RC oscillator.
|
||||
CONFIG_RTC_XTAL_CAL_RETRY=3
|
||||
CONFIG_ESP32_RTC_XTAL_CAL_RETRY=3
|
||||
# 32kHz external crystal (GPIO32/33) is on the PCB but NOT used.
|
||||
# Deep sleep is disabled (soft idle instead), so RTC slow clock accuracy is irrelevant.
|
||||
# Time tracking uses esp_timer (40MHz APB crystal, ~20ppm).
|
||||
# Let the RTC slow clock use the default internal RC oscillator.
|
||||
|
||||
# --- Safety & Panic ---
|
||||
|
||||
|
||||