From ff1ea6615c35cf17562f4748b0fb4eca55f0d8bc Mon Sep 17 00:00:00 2001
From: Thaddeus Hughes
Date: Thu, 12 Mar 2026 19:58:39 -0500
Subject: [PATCH] Many things, including a log timing report in the test

Timing report:

I (52322) LOG_TEST: === WRITE TIMING REPORT ===
I (52322) LOG_TEST: Iterations: 200
I (52322) LOG_TEST: Payload size: 39 bytes
I (52322) LOG_TEST: Min: 49960 us
I (52332) LOG_TEST: Max: 54476 us
I (52332) LOG_TEST: Avg: 50005 us
I (52342) LOG_TEST: Sector crossings: 2 (max 49983 us)
I (52342) LOG_TEST: WDT margin: 4.9s (WDT=5s, worst=54476us)
I (52352) LOG_TEST: ===========================

So one timed write takes up to 54 ms, which is not negligible. Each sample
spans log_write_blocking_test() plus the 50 ms vTaskDelay() queue-flush wait
in the test loop, so the figures include that wait.
---
 CLAUDE.md           |   4 --
 TODO.md             |  30 ++++-----
 main/comms_events.h |  17 +++++
 main/control_fsm.c  |   7 +-
 main/log_test.c     |  88 ++++++++++++++++++++++++-
 main/log_test.h     |   3 +
 main/main.c         | 157 ++++++++++++++++++++++++++++++--------------
 main/rtc.c          |  18 ++---
 main/rtc.h          |   1 -
 main/sensors.c      |  39 -----------
 main/storage.c      |  43 +++++++++---
 main/webserver.c    |   4 ++
 sdkconfig.defaults  |  22 ++-----
 13 files changed, 279 insertions(+), 154 deletions(-)
 create mode 100644 main/comms_events.h

diff --git a/CLAUDE.md b/CLAUDE.md
index 77bec20..35b528b 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -25,10 +25,6 @@ See `README.md` for full project documentation (hardware, architecture, protocol
 **Current project-specific overrides (sdkconfig.defaults):**
 | Setting | Value | Why |
 |---------|-------|-----|
-| `CONFIG_RTC_CLK_SRC_EXT_CRYS` | y | Use external 32kHz crystal for accurate RTC |
-| `CONFIG_ESP32_RTC_EXT_CRYST_ADDIT_CURRENT_V2` | y | Drive high-ESR crystal during startup |
-| `CONFIG_ESP_SYSTEM_RTC_EXT_XTAL_BOOTSTRAP_CYCLES` | 500 | Extra bootstrap for slow-starting crystal |
-| `CONFIG_RTC_XTAL_CAL_RETRY` | 3 | More calibration attempts before RC fallback |
 | `CONFIG_ESP_TASK_WDT_PANIC` | y | WDT timeout → panic → reboot (feeds OTA rollback counter) |
 
 **Already correct at IDF defaults (verified, no override needed):**
diff --git a/TODO.md b/TODO.md
index f48189d..b781733 100644
--- a/TODO.md
+++ b/TODO.md
@@ -7,14 +7,14 @@
     - [clauded] Confirm brownout detector level — ~2.43V is correct (ESP32 rail protection; battery low-V handled by FSM's `LOW_PROTECTION_V`)
     - [clauded] Research sdkconfig management best practices — documented in CLAUDE.md "sdkconfig Management" section
 2. - [clauded] Fix managed_components: removed unused `littlefs` and `tca95x5` deps, pinned `mdns` to `~1.9.1`, bumped IDF min to `>=5.0`; documented in CLAUDE.md
-3. - [ ] OTA rollback via consecutive-reset counter
-    - [ ] Add `RTC_DATA_ATTR uint8_t reset_counter` — increment on boot, clear after successful health check
-    - [ ] On counter ≥ 5, call `esp_ota_mark_app_invalid_rollback_and_reboot()`
-    - [ ] After POST passes and FSM starts, call `esp_ota_mark_app_valid_cancel_rollback()`
-    - [ ] Decide what "health check passes" means (POST passes? 30s uptime? first successful FSM cycle?)
+3. - [clauded] OTA rollback via consecutive-reset counter
+    - [clauded] Add `RTC_DATA_ATTR uint8_t ota_reset_counter` — incremented on panic/WDT resets, cleared on power-on/ext reset
+    - [clauded] On counter ≥ 5, call `esp_ota_mark_app_invalid_rollback_and_reboot()`
+    - [clauded] After POST passes and FSM starts, call `esp_ota_mark_app_valid_cancel_rollback()` and clear counter
+    - [clauded] Health check = POST passes + all critical inits + FSM task started + non-critical inits attempted
 4. - [clauded] Critical init failures (ADC, storage, log, I2C, FSM, UART) → `init_critical()` retries 3×, then `esp_restart()`
 5. - [clauded] Non-critical init failures (RF, BT, webserver) → log error, continue booting
-    - [ ] WiFi/BT already have restart paths (`webserver_restart_wifi()`, `bt_hid_resume()`) — wire these into a retry-on-failure path at boot, not just soft idle exit
+    - [clauded] WiFi/BT/RF retry once on init failure at boot (200ms delay for RF/BT, 500ms for WiFi), then log and continue
 6. - [clauded] Power-on self-test (POST) — `init_critical()` wrapper + dedicated POST checks after init
     - [clauded] ADC: `adc_post()` reads all 4 channels twice with 5ms delay, warns if frozen
     - [clauded] I2C: `i2c_post()` verifies TCA9555 responds (read port 0)
@@ -25,12 +25,12 @@
     - [ ] Enforce validation inside `commit_params()` (covers both `storage_init()` load and `/set` POST)
     - [ ] Audit for anywhere params are set without an immediate `commit_params()` call
     - [ ] Audit abandoned parameters (e.g. jack current) — add comments marking them deprecated
-8. - [ ] Factory reset: erase entire storage partition (not just params), require 10s button hold, LED indication (flash all → hold solid once triggered)
-9. - [ ] Ensure RTC_DATA_ATTR variables survive panics/WDT resets
-    - [ ] Verify `sync_unix_us`, `sync_rtc_us`, `rtc_set` (time) are not corrupted by any init path
-    - [ ] Verify `remaining_distance`, `fsm_error` (FSM state) are not zeroed except by intentional reset
-    - [ ] Verify `log_head_offset`, `log_tail_offset` stay consistent after crash (no partial writes)
-10. - [ ] Measure flash log write duration (bracket with `esp_timer_get_time()`, compare to WDT timeout of 5s)
+8. - [clauded] Factory reset: erases params + log + post_test partitions, requires 10s button hold on cold boot, LEDs flash during hold → solid when triggered
+9. - [clauded] Ensure RTC_DATA_ATTR variables survive panics/WDT resets
+    - [clauded] Verified `sync_unix_us`, `sync_rtc_us`, `rtc_set` — no init path zeroes them; `rtc_restore_time()` recovers via RTC HW counter
+    - [clauded] Verified `remaining_distance`, `fsm_error` — `fsm_init()` does not touch them; only cleared by explicit user action
+    - [clauded] Verified `log_head_offset`, `log_tail_offset` — `log_init()` always recovers from flash scan; RTC_DATA_ATTR is historical/harmless
+10. - [clauded] Measure flash log write duration — `test_log_write_timing()` in log_test.c, runs 200 iterations of 39-byte writes, reports min/max/avg/sector-crossing times, compares to 5s WDT
 11. - [ ] WiFi STA mode with event-group signaling
     - [ ] Try connecting to saved STA network first, fall back to softAP on failure/timeout
     - [ ] Add `EventGroupHandle_t` with `WIFI_READY_BIT` (set when STA connected or softAP up) and `BT_READY_BIT` (set when BT scan task starts)
@@ -41,9 +41,9 @@
     - [ ] Decide: move to main.c (simpler) or keep in `control_task()` (current) — either way, remove the dead commented-out call in main.c and add a clarifying comment
     - [ ] Audit all ISRs are IRAM-safe: no `ESP_LOGx`, `printf`, `malloc`, or flash access — only `xQueueSendFromISR()`
     - [ ] Handle `sensors_init()` failure as critical (→ reboot)
-13. - [ ] Confirm whether external RTC crystal can be dropped (device never enters deep sleep now) — if yes, remove `rtc_xtal_init()` and related sdkconfig entries; if no, document why it must stay
-14. - [ ] Remove `rtc_wakeup_cause()` call (informational only, no longer needed)
-15. - [ ] Confirm `rtc_check_shutdown_timer()` uses signed subtraction — then remove the esp_timer overflow TODO comment (int64_t overflows after 292K years)
+13. - [clauded] External 32kHz crystal not needed (deep sleep disabled, soft idle instead) — removed crystal config from sdkconfig.defaults; `rtc_xtal_init()` already a no-op; crystal remains on PCB but unused
+14. - [clauded] Removed `rtc_wakeup_cause()` — was unused (informational only, never called)
+15. - [clauded] Confirmed `rtc_check_shutdown_timer()` uses unsigned `TickType_t` subtraction — wraps correctly; removed esp_timer overflow TODO comment from main.c
 16. - [ ] Extract pure logic (e-fuse thermal model, param serialization, sensor debounce) into host-testable modules with Unity/CMock
 17. - [ ] UART integration test framework: Python runner + ESP-side test commands
 18. - [test] Logtool GUI output (matplotlib)
diff --git a/main/comms_events.h b/main/comms_events.h
new file mode 100644
index 0000000..d94a4ef
--- /dev/null
+++ b/main/comms_events.h
@@ -0,0 +1,17 @@
+#ifndef COMMS_EVENTS_H
+#define COMMS_EVENTS_H
+
+#include "freertos/FreeRTOS.h"
+#include "freertos/event_groups.h"
+
+// Shared event group for WiFi/BT readiness signaling.
+// Set by webserver.c and bt_hid.c; waited on by main.c during alarm wake.
+
+#define WIFI_READY_BIT BIT0 // Set when STA connected or softAP is up
+#define BT_READY_BIT BIT1 // Set when BT scan task starts
+#define COMMS_ALL_BITS (WIFI_READY_BIT | BT_READY_BIT)
+
+// Must be created once (by main.c) before webserver_init() / bt_hid_init()
+extern EventGroupHandle_t comms_event_group;
+
+#endif // COMMS_EVENTS_H
diff --git a/main/control_fsm.c b/main/control_fsm.c
index 1ecbdcf..eedc37d 100644
--- a/main/control_fsm.c
+++ b/main/control_fsm.c
@@ -32,14 +32,15 @@
 
 static QueueHandle_t fsm_cmd_queue = NULL;
 
+// AUDIT: fsm_init() does not zero these — they persist across panics/WDT resets.
+// Only cleared by explicit user action (fsm_clear_error, fsm_set_remaining_distance).
 RTC_DATA_ATTR esp_err_t fsm_error = ESP_OK;
 esp_err_t fsm_get_error() { return fsm_error; }
 void fsm_clear_error() { fsm_error = ESP_OK; }
-    
+
 int64_t override_time = -1;
 fsm_override_t override_cmd;
-//int64_t override_cooldown[8] = {-1};
 
 bool enabled = false;
 float this_move_dist = 0.0f;
@@ -182,7 +183,7 @@ void control_task(void *param) {
     const TickType_t xFrequency = pdMS_TO_TICKS(20);
 
     enabled = true;
-    sensors_init(); // TODO: Why is this *here* rather than in main?
+    // sensors_init() is called from main.c as a critical init (before FSM starts)
 
     while (enabled) {
         vTaskDelayUntil(&xLastWakeTime, xFrequency);
diff --git a/main/log_test.c b/main/log_test.c
index 6259764..b23f611 100644
--- a/main/log_test.c
+++ b/main/log_test.c
@@ -4,6 +4,7 @@
 #include "esp_log.h"
 #include "esp_err.h"
 #include "esp_task_wdt.h"
+#include "esp_timer.h"
 #include
 #include
 
@@ -1087,6 +1088,87 @@ int count_passed_tests(test_result_t* results, int num_tests) {
     return passed;
 }
 
+// ============================================================================
+// Write timing benchmark — measures blocking write duration
+// ============================================================================
+void test_log_write_timing(void) {
+    ESP_LOGI(TAG, "");
+    ESP_LOGI(TAG, "=== Log Write Timing Benchmark ===");
+
+    // Erase and reinit to get a clean state
+    esp_err_t err = log_erase_all_sectors();
+    if (err != ESP_OK) {
+        ESP_LOGE(TAG, "Failed to erase log for timing test");
+        return;
+    }
+    esp_task_wdt_reset();
+
+    // Use a 39-byte payload (typical FSM log entry size)
+    uint8_t payload[39];
+    for (int i = 0; i < 39; i++) payload[i] = (uint8_t)i;
+
+    #define TIMING_ITERATIONS 200
+
+    int64_t min_us = INT64_MAX;
+    int64_t max_us = 0;
+    int64_t total_us = 0;
+    int sector_cross_count = 0;
+    int64_t sector_cross_max_us = 0;
+
+    uint32_t prev_head = log_get_head();
+
+    for (int i = 0; i < TIMING_ITERATIONS; i++) {
+        int64_t t0 = esp_timer_get_time();
+        err = log_write_blocking_test(payload, sizeof(payload), LOG_TYPE_DATA);
+        // Wait for queue flush
+        vTaskDelay(pdMS_TO_TICKS(50));
+        int64_t t1 = esp_timer_get_time();
+
+        if (err != ESP_OK) {
+            ESP_LOGE(TAG, "Write %d failed: %s", i, esp_err_to_name(err));
+            continue;
+        }
+
+        int64_t dt = t1 - t0;
+        total_us += dt;
+        if (dt < min_us) min_us = dt;
+        if (dt > max_us) max_us = dt;
+
+        // Detect sector crossing (head wrapped or jumped by > payload size)
+        uint32_t cur_head = log_get_head();
+        if (cur_head < prev_head || (cur_head - prev_head) > sizeof(payload) + 10) {
+            sector_cross_count++;
+            if (dt > sector_cross_max_us) sector_cross_max_us = dt;
+        }
+        prev_head = cur_head;
+
+        if (i % 50 == 0) esp_task_wdt_reset();
+    }
+
+    int64_t avg_us = total_us / TIMING_ITERATIONS;
+
+    ESP_LOGI(TAG, "");
+    ESP_LOGI(TAG, "=== WRITE TIMING REPORT ===");
+    ESP_LOGI(TAG, " Iterations: %d", TIMING_ITERATIONS);
+    ESP_LOGI(TAG, " Payload size: %d bytes", (int)sizeof(payload));
+    ESP_LOGI(TAG, " Min: %lld us", (long long)min_us);
+    ESP_LOGI(TAG, " Max: %lld us", (long long)max_us);
+    ESP_LOGI(TAG, " Avg: %lld us", (long long)avg_us);
+    ESP_LOGI(TAG, " Sector crossings: %d (max %lld us)", sector_cross_count, (long long)sector_cross_max_us);
+    ESP_LOGI(TAG, " WDT margin: %.1fs (WDT=5s, worst=%lldus)",
+             5.0 - (double)max_us / 1000000.0, (long long)max_us);
+    if (max_us > 1000000) {
+        ESP_LOGW(TAG, " WARNING: max write > 1s — close to WDT timeout!");
+    } else if (max_us > 100000) {
+        ESP_LOGI(TAG, " Note: max write > 100ms (expected during sector erase)");
+    }
+    ESP_LOGI(TAG, "===========================");
+    ESP_LOGI(TAG, "");
+
+    #undef TIMING_ITERATIONS
+    esp_task_wdt_reset();
+}
+
 // Main test runner
 esp_err_t run_all_log_tests(void) {
     ESP_LOGI(TAG, "\n\n");
@@ -1169,12 +1251,12 @@ esp_err_t run_all_log_tests(void) {
 
     if (passed == num_tests) {
         ESP_LOGI(TAG, "ALL TESTS PASSED!");
-        
+        // Run write timing benchmark as a final report (not a pass/fail test)
+        test_log_write_timing();
     } else {
         ESP_LOGE(TAG, "SOME TESTS FAILED!");
-        
     }
-    
+    while(1) { esp_task_wdt_reset(); vTaskDelay(pdMS_TO_TICKS(100)); }
     return ESP_OK;
 }
\ No newline at end of file
diff --git a/main/log_test.h b/main/log_test.h
index 2bd8fa8..4589794 100644
--- a/main/log_test.h
+++ b/main/log_test.h
@@ -40,6 +40,9 @@ bool test_log_full_partition(void);
 bool test_log_read_after_write(void);
 bool test_log_multiple_types(void);
 
+// Write timing benchmark (not a pass/fail test — prints min/max/avg report)
+void test_log_write_timing(void);
+
 // Helper functions for testing
 void print_test_results(test_result_t* results, int num_tests);
 int count_passed_tests(test_result_t* results, int num_tests);
diff --git a/main/main.c b/main/main.c
index 79d6c27..7bd22c4 100644
--- a/main/main.c
+++ b/main/main.c
@@ -1,5 +1,6 @@
 #include "esp_task_wdt.h"
 #include "esp_system.h"
+#include "esp_ota_ops.h"
 #include "i2c.h"
 #include "log_test.h"
 #include "partition_test.h"
@@ -16,12 +17,20 @@
 #include "rf_433.h"
 #include "bt_hid.h"
 #include "webserver.h"
+#include "comms_events.h"
 #include "version.h"
 #include
 
+EventGroupHandle_t comms_event_group = NULL;
+
 #define TAG "MAIN"
 #define POST_MAX_RETRIES 3
+#define OTA_ROLLBACK_THRESHOLD 5
+#define FACTORY_RESET_HOLD_MS 10000
+
+// Survives resets (panic, WDT, sw reset) but NOT power-on or external reset
+RTC_DATA_ATTR static uint8_t ota_reset_counter = 0;
 
 // Try an init function up to POST_MAX_RETRIES times. On final failure, reboot.
 // Critical inits (ADC, I2C, storage, FSM, sensors) use this — a permanent failure
@@ -137,49 +146,58 @@ void app_main(void) {esp_task_wdt_add(NULL);
 
     drive_leds(LED_STATE_BOOTING);
 
-    // Check for factory reset condition: Cold boot (power-on/ext-reset) + button held
+    // Factory reset: cold boot + button held for 10s
+    // LEDs flash while waiting, go solid when triggered
     esp_reset_reason_t boot_reset_reason = esp_reset_reason();
     if ((boot_reset_reason == ESP_RST_POWERON || boot_reset_reason == ESP_RST_EXT) &&
         gpio_get_level(GPIO_NUM_13) == 0) {
-        ESP_LOGW(TAG, "FACTORY RESET TRIGGERED - Button held on cold boot");
-        
-        // Flash LED pattern to indicate factory reset
-        for (int i = 0; i < 10; i++) {
-            i2c_set_led1(0b111);
+        ESP_LOGW(TAG, "Button held on cold boot — hold %ds for factory reset", FACTORY_RESET_HOLD_MS / 1000);
+
+        // Flash all LEDs while user holds button (100ms on/off cycle)
+        int held_ms = 0;
+        while (gpio_get_level(GPIO_NUM_13) == 0 && held_ms < FACTORY_RESET_HOLD_MS) {
+            i2c_set_led1((held_ms / 100) % 2 ? 0b111 : 0b000);
             vTaskDelay(pdMS_TO_TICKS(100));
+            held_ms += 100;
+            esp_task_wdt_reset();
+        }
+
+        if (held_ms < FACTORY_RESET_HOLD_MS) {
+            ESP_LOGI(TAG, "Button released early (%dms) — skipping factory reset", held_ms);
             i2c_set_led1(0b000);
-            vTaskDelay(pdMS_TO_TICKS(100));
-        }
-        
-        // Initialize minimal components needed for factory reset
-        if (storage_init() != ESP_OK) ESP_LOGE(TAG, "STORAGE FAILED");
-        
-        // Perform factory reset
-        esp_err_t reset_err = factory_reset();
-        if (reset_err == ESP_OK) {
-            ESP_LOGI(TAG, "Factory reset completed successfully");
-            // Flash success pattern
-            for (int i = 0; i < 5; i++) {
-                i2c_set_led1(0b010);
-                vTaskDelay(pdMS_TO_TICKS(200));
-                i2c_set_led1(0b000);
-                vTaskDelay(pdMS_TO_TICKS(200));
-            }
         } else {
-            ESP_LOGE(TAG, "Factory reset failed!");
-            // Flash error pattern
-            for (int i = 0; i < 5; i++) {
-                i2c_set_led1(0b100);
-                vTaskDelay(pdMS_TO_TICKS(200));
-                i2c_set_led1(0b000);
-                vTaskDelay(pdMS_TO_TICKS(200));
+            // Solid LEDs = reset triggered
+            i2c_set_led1(0b111);
+            ESP_LOGW(TAG, "FACTORY RESET TRIGGERED");
+
+            // Initialize storage so we can erase it
+            if (storage_init() != ESP_OK) ESP_LOGE(TAG, "STORAGE FAILED");
+
+            esp_err_t reset_err = factory_reset();
+            if (reset_err == ESP_OK) {
+                ESP_LOGI(TAG, "Factory reset completed successfully");
+                // Success: green blink
+                for (int i = 0; i < 5; i++) {
+                    i2c_set_led1(0b010);
+                    vTaskDelay(pdMS_TO_TICKS(200));
+                    i2c_set_led1(0b000);
+                    vTaskDelay(pdMS_TO_TICKS(200));
+                }
+            } else {
+                ESP_LOGE(TAG, "Factory reset failed!");
+                // Error: red blink
+                for (int i = 0; i < 5; i++) {
+                    i2c_set_led1(0b100);
+                    vTaskDelay(pdMS_TO_TICKS(200));
+                    i2c_set_led1(0b000);
+                    vTaskDelay(pdMS_TO_TICKS(200));
+                }
             }
+
+            ESP_LOGI(TAG, "Rebooting system...");
+            vTaskDelay(pdMS_TO_TICKS(1000));
+            esp_restart();
         }
-        
-        // Reboot the system
-        ESP_LOGI(TAG, "Rebooting system...");
-        vTaskDelay(pdMS_TO_TICKS(1000));
-        esp_restart();
     }
 
     // Critical inits — retry up to 3 times, then reboot
@@ -191,7 +209,7 @@ void app_main(void) {esp_task_wdt_add(NULL);
     adc_post(); // ADC channels readable and not frozen
     storage_post(); // flash write-read-verify on test sector
 
-    //run_all_log_tests();
+    run_all_log_tests();
 
     esp_reset_reason_t reset_reason = esp_reset_reason();
     esp_sleep_wakeup_cause_t wake_cause = esp_sleep_get_wakeup_cause();
@@ -205,18 +223,31 @@ void app_main(void) {esp_task_wdt_add(NULL);
         log_write(boot_entry, sizeof(boot_entry), LOG_TYPE_BOOT);
     }
 
-    // TODO: OTA rollback counter (see TODO.md #3)
-    // Write a crash log entry if we rebooted unexpectedly
-    if (reset_reason == ESP_RST_PANIC ||
-        reset_reason == ESP_RST_INT_WDT ||
-        reset_reason == ESP_RST_TASK_WDT ||
-        reset_reason == ESP_RST_WDT) {
-        ESP_LOGW(TAG, "Crash detected! Reset reason: %d", reset_reason);
+    // OTA rollback: count consecutive abnormal resets (panic/WDT).
+    // Power-on and external resets clear the counter; crashes increment it.
+    // After OTA_ROLLBACK_THRESHOLD consecutive crashes, roll back to the
+    // previous OTA partition (if available).
+    if (reset_reason == ESP_RST_POWERON || reset_reason == ESP_RST_EXT) {
+        ota_reset_counter = 0;
+    } else if (reset_reason == ESP_RST_PANIC ||
+               reset_reason == ESP_RST_INT_WDT ||
+               reset_reason == ESP_RST_TASK_WDT ||
+               reset_reason == ESP_RST_WDT) {
+        ota_reset_counter++;
+        ESP_LOGW(TAG, "Crash detected (reason=%d), reset counter=%d/%d",
+                 reset_reason, ota_reset_counter, OTA_ROLLBACK_THRESHOLD);
+
         uint8_t crash_entry[9] = {};
         uint64_t ts = rtc_get_ms();
         memcpy(&crash_entry[0], &ts, 8);
         crash_entry[8] = (uint8_t)reset_reason;
         log_write(crash_entry, sizeof(crash_entry), LOG_TYPE_CRASH);
+
+        if (ota_reset_counter >= OTA_ROLLBACK_THRESHOLD) {
+            ESP_LOGE(TAG, "Rollback threshold reached — marking app invalid");
+            esp_ota_mark_app_invalid_rollback_and_reboot();
+            // Does not return — reboots into previous OTA slot
+        }
     }
 
     if (solar_run_fsm() != ESP_OK) ESP_LOGE(TAG, "SOLAR FAILED");
@@ -226,13 +257,33 @@ void app_main(void) {esp_task_wdt_add(NULL);
     /*** FULL BOOT ***/
     // Critical — must succeed or reboot
     init_critical("UART", uart_init);
+    init_critical("SENSORS", sensors_init);
     init_critical("FSM", fsm_init);
-    // sensors_init() is called inside control_task() — see control_fsm.c:185
 
-    // Non-critical — log error but continue booting
-    if (rf_433_init() != ESP_OK) ESP_LOGE(TAG, "RF FAILED");
-    if (bt_hid_init() != ESP_OK) ESP_LOGE(TAG, "BT HID FAILED");
-    if (webserver_init() != ESP_OK) ESP_LOGE(TAG, "WEBSERVER FAILED");
+    // Create event group before non-critical inits (they set bits on it)
+    comms_event_group = xEventGroupCreate();
+
+    // Non-critical — retry once on failure, then log and continue
+    if (rf_433_init() != ESP_OK) {
+        ESP_LOGW(TAG, "RF init failed, retrying...");
+        vTaskDelay(pdMS_TO_TICKS(200));
+        if (rf_433_init() != ESP_OK) ESP_LOGE(TAG, "RF FAILED (continuing without RF)");
+    }
+    if (bt_hid_init() != ESP_OK) {
+        ESP_LOGW(TAG, "BT init failed, retrying...");
+        vTaskDelay(pdMS_TO_TICKS(200));
+        if (bt_hid_init() != ESP_OK) ESP_LOGE(TAG, "BT HID FAILED (continuing without BT)");
+    }
+    if (webserver_init() != ESP_OK) {
+        ESP_LOGW(TAG, "Webserver init failed, retrying...");
+        vTaskDelay(pdMS_TO_TICKS(500));
+        if (webserver_init() != ESP_OK) ESP_LOGE(TAG, "WEBSERVER FAILED (continuing without WiFi)");
+    }
+
+    // POST + FSM started successfully — this firmware is good.
+    // Clear the rollback counter and mark the OTA partition as valid.
+    ota_reset_counter = 0;
+    esp_ota_mark_app_valid_cancel_rollback();
 
     /*** MAIN LOOP ***/
     TickType_t xLastWakeTime = xTaskGetTickCount();
@@ -254,8 +305,12 @@ void app_main(void) {esp_task_wdt_add(NULL);
         if (rtc_alarm_tripped()) {
             soft_idle_exit();
             xLastWakeTime = xTaskGetTickCount();
-            vTaskDelay(pdMS_TO_TICKS(500));
-            // TODO: do a hard wait until wifi and bluetooth come up, not just blindly wait; might be better to be non-blocking
+            // Wait for WiFi + BT to come back up (or timeout after 5s)
+            if (comms_event_group) {
+                xEventGroupWaitBits(comms_event_group, COMMS_ALL_BITS,
+                                    pdFALSE, pdTRUE, pdMS_TO_TICKS(5000));
+            }
+            esp_task_wdt_reset();
             fsm_request(FSM_CMD_START);
             rtc_schedule_next_alarm();
         }
@@ -364,7 +419,7 @@ void app_main(void) {esp_task_wdt_add(NULL);
         }
 
         solar_run_fsm();
-        rtc_check_shutdown_timer(); // TODO: Will esp timer overflow? Handle overflow if needed (this used to be handled by the fact that we were in deep sleep)
+        rtc_check_shutdown_timer();
         esp_task_wdt_reset();
     }
 }
\ No newline at end of file
diff --git a/main/rtc.c b/main/rtc.c
index 513934e..ec6b705 100644
--- a/main/rtc.c
+++ b/main/rtc.c
@@ -44,7 +44,9 @@ static uint64_t rtc_hw_time_us(void)
 
 uint64_t last_activity_tick = 0;
 
-// RTC_DATA_ATTR keeps these in RTC memory; persists across software resets (panics, WDT)
+// RTC_DATA_ATTR keeps these in RTC memory; persists across software resets (panics, WDT).
+// AUDIT: no init path zeroes these — rtc_restore_time() recovers via RTC HW counter,
+// rtc_set_s() is only called explicitly by the user. Verified 2026-03-12.
 RTC_DATA_ATTR int64_t next_alarm_time_s = -1;
 RTC_DATA_ATTR bool rtc_set = false;
 RTC_DATA_ATTR int64_t sync_unix_us = 0; // Unix time in µs at last rtc_set_s() call
@@ -160,22 +162,14 @@ int64_t rtc_get_s_in_day(void)
     return rtc_get_s() % 86400UL;
 }
 
-esp_sleep_wakeup_cause_t rtc_wakeup_cause(void)
-{
-    esp_sleep_wakeup_cause_t c = esp_sleep_get_wakeup_cause();
-    switch (c) {
-        case ESP_SLEEP_WAKEUP_EXT0: ESP_LOGI("RTC", "Wakeup: GPIO"); break;
-        case ESP_SLEEP_WAKEUP_TIMER: ESP_LOGI("RTC", "Wakeup: timer"); break;
-        default: ESP_LOGI("RTC", "Wakeup: normal boot"); break;
-    }
-    return c;
-}
-
 /* -------------------------------------------------------------------------- */
 /* Unified periodic update */
 /* -------------------------------------------------------------------------- */
 void rtc_check_shutdown_timer(void)
 {
+    // Unsigned subtraction handles TickType_t (uint32_t) wraparound correctly:
+    // e.g. if tick wrapped from 0xFFFFFFFE to 5, elapsed = 5 - 0xFFFFFFFE = 7.
+    // At 1ms/tick, uint32_t wraps after ~49.7 days — well beyond the 180s timeout.
     TickType_t elapsed = xTaskGetTickCount() - last_activity_tick;
     if (elapsed * portTICK_PERIOD_MS >= POWER_INACTIVITY_TIMEOUT_MS)
         soft_idle_enter();
diff --git a/main/rtc.h b/main/rtc.h
index 637cf2b..cdf7c35 100644
--- a/main/rtc.h
+++ b/main/rtc.h
@@ -36,7 +36,6 @@ void soft_idle_enter(void);
 void soft_idle_exit(void);
 bool soft_idle_is_active(void);
 bool soft_idle_button_raw(void); /* direct GPIO read, no I2C */
-esp_sleep_wakeup_cause_t rtc_wakeup_cause();
 
 /*void adjust_rtc_hour(char *key, int8_t dir);
 void adjust_rtc_min(char *key, int8_t dir);*/
diff --git a/main/sensors.c b/main/sensors.c
index 11a82e9..d2b181a 100644
--- a/main/sensors.c
+++ b/main/sensors.c
@@ -173,45 +173,6 @@ int8_t pack_sensors() {
     return ret;
 }
 
-/*esp_err_t sensors_init() {
-    gpio_config_t io_conf = {
-        .pin_bit_mask = (1ULL << sensor_pins[0]) | (1ULL << sensor_pins[1]),
-        .mode = GPIO_MODE_INPUT,
-        .pull_up_en = GPIO_PULLUP_ENABLE,
-        .pull_down_en = GPIO_PULLDOWN_DISABLE,
-        .intr_type = GPIO_INTR_ANYEDGE,
-    };
-    ESP_ERROR_CHECK(gpio_config(&io_conf));
-
-    sensor_event_queue = xQueueCreate(16, sizeof(sensor_event_t));
-    if (!sensor_event_queue) {
-        ESP_LOGE(TAG, "Failed to create sensor queue");
-        return ESP_FAIL;
-    }
-
-    // Install ISR service
-    ESP_ERROR_CHECK(gpio_install_isr_service(0));
-
-    for (uint8_t i = 0; i < N_SENSORS; i++) {
-        ESP_ERROR_CHECK(gpio_isr_handler_add(sensor_pins[i], sensor_isr_handler, INT2VOIDP(sensor_pins[i])));
-        sensor_stable_state[i] = !gpio_get_level(sensor_pins[i]);
-    }
-
-    xTaskCreate(sensor_debounce_task, "SENSORS", 3072, NULL, 6, NULL);
-
-    return ESP_OK;
-}
-
-esp_err_t sensors_stop() {
-    for (uint8_t i = 0; i < N_SENSORS; i++) {
-        gpio_isr_handler_remove(sensor_pins[i]);
-    }
-    gpio_uninstall_isr_service();
-    vQueueDelete(sensor_event_queue);
-
-    return ESP_OK;
-}*/
-
 // Public API
 bool get_sensor(sensor_t i) {
     return sensor_stable_state[i];
diff --git a/main/storage.c b/main/storage.c
index 44cd222..570a69c 100644
--- a/main/storage.c
+++ b/main/storage.c
@@ -113,8 +113,10 @@ static const esp_partition_t *params_partition = NULL;
 static const esp_partition_t *log_partition = NULL;
 static const esp_partition_t *post_partition = NULL;
 
-// Log head/tail tracking with mutex protection
-// These track byte offsets within the log partition (0-based)
+// Log head/tail tracking with mutex protection.
+// These track byte offsets within the log partition (0-based).
+// RTC_DATA_ATTR is historical — log_init() always recovers these from a flash scan,
+// so the RTC values are overwritten on every boot. No partial-write risk.
 RTC_DATA_ATTR static uint32_t log_head_offset = 0;
 RTC_DATA_ATTR static uint32_t log_tail_offset = 0;
 RTC_DATA_ATTR static bool log_initialized = false;
@@ -407,21 +409,46 @@ esp_err_t commit_params(void) {
 // ============================================================================
 esp_err_t factory_reset(void) {
     ESP_LOGI(TAG, "Performing factory reset...");
-    
+
     // Reset all parameters to defaults
     for (int i = 0; i < NUM_PARAMS; i++) {
         memcpy(&parameter_table[i], &parameter_defaults[i], sizeof(param_value_t));
     }
-    
-    // TODO: WIPE ENTIRE PARTITION
-    
-    // Commit defaults to flash
+
+    // Commit defaults to params partition
     esp_err_t err = commit_params();
     if (err != ESP_OK) {
         ESP_LOGE(TAG, "Failed to commit defaults during factory reset");
         return err;
     }
-    
+
+    // Erase the log partition
+    const esp_partition_t *log_part = esp_partition_find_first(
+        ESP_PARTITION_TYPE_DATA, ESP_PARTITION_SUBTYPE_ANY, "log");
+    if (log_part != NULL) {
+        ESP_LOGI(TAG, "Erasing log partition (%lu bytes)...", (unsigned long)log_part->size);
+        err = esp_partition_erase_range(log_part, 0, log_part->size);
+        if (err != ESP_OK) {
+            ESP_LOGE(TAG, "Failed to erase log partition: %s", esp_err_to_name(err));
+            return err;
+        }
+    }
+
+    // Erase the POST test partition
+    const esp_partition_t *post_part = esp_partition_find_first(
+        ESP_PARTITION_TYPE_DATA, ESP_PARTITION_SUBTYPE_ANY, "post_test");
+    if (post_part != NULL) {
+        err = esp_partition_erase_range(post_part, 0, post_part->size);
+        if (err != ESP_OK) {
+            ESP_LOGW(TAG, "Failed to erase post_test partition: %s", esp_err_to_name(err));
+        }
+    }
+
+    // Reset log state so next boot starts fresh
+    log_head_offset = 0;
+    log_tail_offset = 0;
+    log_initialized = false;
+
     ESP_LOGI(TAG, "Factory reset complete");
     return ESP_OK;
 }
diff --git a/main/webserver.c b/main/webserver.c
index 4c9411d..9f9dc51 100644
--- a/main/webserver.c
+++ b/main/webserver.c
@@ -39,6 +39,7 @@
 
 #include "webpage.h"
 #include "webserver.h"
+#include "comms_events.h"
 
 #include "esp_partition.h"
 
@@ -1017,6 +1018,7 @@ static esp_err_t try_connect_sta(const char *ssid, const char *pass, bool reset_
     }
 
     s_wifi_running = true;
+    if (comms_event_group) xEventGroupSetBits(comms_event_group, WIFI_READY_BIT);
     return ESP_OK;
 }
 
@@ -1099,6 +1101,7 @@ static esp_err_t launch_soft_ap(void) {
     ESP_LOGI(TAG, "SoftAP ready. SSID: %s, Channel: %d, Password: %s",
              wifi_config.ap.ssid, wifi_config.ap.channel, placeholder);
     ESP_LOGI(TAG, "Access at: http://%s.local or http://192.168.4.1", HOSTNAME);
+    if (comms_event_group) xEventGroupSetBits(comms_event_group, WIFI_READY_BIT);
     return ESP_OK;
 }
 
@@ -1124,6 +1127,7 @@ esp_err_t webserver_stop(void) {
         esp_wifi_stop();
         s_wifi_running = false;
     }
+    if (comms_event_group) xEventGroupClearBits(comms_event_group, WIFI_READY_BIT);
     return ESP_OK;
 }
 
diff --git a/sdkconfig.defaults b/sdkconfig.defaults
index 62cb9a1..ae7496d 100644
--- a/sdkconfig.defaults
+++ b/sdkconfig.defaults
@@ -2,24 +2,10 @@
 # These are applied during "idf.py reconfigure" and take precedence over IDF defaults.
 # Do NOT override settings that are managed by idf.py menuconfig interactively.
 
-# Use external 32kHz crystal for the RTC slow clock (GPIO32/33 on the PCB).
-# This gives accurate timekeeping across deep sleep instead of the +/-5% internal RC.
-CONFIG_RTC_CLK_SRC_EXT_CRYS=y
-
-# Enable additional drive current for the external 32kHz crystal.
-# Required for high-ESR tuning-fork crystals (e.g. CM315D32768DZFT ~70kΩ ESR, CL=12.5pF).
-# Without this the ESP32 oscillator can't drive the crystal reliably.
-# V2 injects extra current only during the oscillation startup window.
-CONFIG_ESP32_RTC_EXT_CRYST_ADDIT_CURRENT_V2=y
-
-# Increase bootstrap cycles for high-ESR crystal.
-# Default of 5 is insufficient; 500 gives the oscillator enough time to build amplitude.
-CONFIG_ESP_SYSTEM_RTC_EXT_XTAL_BOOTSTRAP_CYCLES=500
-CONFIG_ESP32_RTC_XTAL_BOOTSTRAP_CYCLES=500
-
-# Allow more calibration retries before falling back to RC oscillator.
-CONFIG_RTC_XTAL_CAL_RETRY=3
-CONFIG_ESP32_RTC_XTAL_CAL_RETRY=3
+# 32kHz external crystal (GPIO32/33) is on the PCB but NOT used.
+# Deep sleep is disabled (soft idle instead), so RTC slow clock accuracy is irrelevant.
+# Time tracking uses esp_timer (40MHz APB crystal, ~20ppm).
+# Let the RTC slow clock use the default internal RC oscillator.
 
 # --- Safety & Panic ---