fix(hw_support): Fix crash when reconfiguring flash from 40 to 80 MHz (IDFGH-16831) #17905
base: master
Reading from the flash while it is being reconfigured leads to data corruption and a crash when the reconfiguration code is located in flash. This is only an issue if a device has a bootloader that runs with 40 MHz flash and an application flashed via OTA that runs with 80 MHz flash. If bootloader and application run with the same flash speed, the reconfiguration is basically a no-op and no data corruption occurs. Fix reconfiguration by placing the code back into IRAM.
Hi @MattiasTF, thanks for your report. Your proposal looks reasonable. However, we can't reproduce the issue on our side. To make sure the fix works well, could you please provide an app (preferably one of the examples), the ESP-IDF commit SHA, and the sdkconfig of the bootloader and the app that reproduce the issue? Thanks in advance.
I have finally managed to consistently reproduce the issue. Apparently, the problem only manifests if the application is very large and the IRAM is so full that the linker has to use longcalls for the four functions to reconfigure the flash. Any example project I tried is too small to use longcalls and I couldn’t manage to force the linker to always use longcalls, so please use the crude patch in step 2 to enforce longcalls. I’m not providing an sdkconfig because the steps below use the defaults, besides the change in step 6.
Steps to reproduce
The ESP32 is now running a 40 MHz bootloader with a 40 MHz application, which works fine.
The ESP32 is now running a 40 MHz bootloader with an 80 MHz application, which will hang until it is restarted by the RTC watchdog.
The ESP32 is now running a 40 MHz bootloader with an 80 MHz application, which will crash.
The ESP32 is now running an 80 MHz bootloader with an 80 MHz application, which works fine.
Hardware used
The tests were performed with an ESP32-WROOM-32E module that uses an ESP32-D0WD-V3.
Additional information
The relevant disassembly from my original crashing application looks like this:
Note the use of longcalls (
When building hello_world without the patch, the relevant disassembly looks like this:
Note that the code is located at smaller addresses, which means that
My guess is that the shorter code is fully cached and can be executed correctly while the flash is being reconfigured. If the code uses the larger longcall variant, it doesn’t fit into the cache and more code has to be loaded while the flash is incorrectly configured.
Instead of moving the calls to a function in IRAM, the crash or hang can also be fixed by moving the call to
For reference, the disassembly with the patch to force longcalls looks like this:
The order of
Additional suggestions
In my PR, I call the newly created function
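As a rough illustration of why only large applications end up with longcalls, the sketch below checks whether a direct call can reach its target. It assumes the Xtensa CALLn encoding uses an 18-bit signed word offset relative to the word-aligned call address plus 4, which gives a reach of roughly 512 KB in either direction. The function name and all addresses are hypothetical and are not taken from the disassembly in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: a direct Xtensa call (call0/call8) encodes an 18-bit signed
 * offset counted in 32-bit words, relative to (call address & ~3) + 4. */
bool direct_call_reaches(uint32_t call_addr, uint32_t target_addr)
{
    int64_t base  = (int64_t)(call_addr & ~UINT32_C(3)) + 4;
    int64_t delta = (int64_t)target_addr - base;

    if (delta % 4 != 0) {
        return false;               /* call targets must be word-aligned */
    }
    int64_t words = delta / 4;
    return words >= -(1 << 17) && words < (1 << 17);
}

/* Hypothetical addresses: a callee in IRAM, a caller placed low or high
 * in the flash-mapped instruction region. */
void call_range_examples(void)
{
    const uint32_t iram_callee = 0x40081000u;

    /* Small application: the caller sits close enough for a direct call. */
    assert(direct_call_reaches(0x400D1000u, iram_callee));

    /* Large application: the caller is more than 512 KB away, so the
     * call has to go through a register (the longcall form). */
    assert(!direct_call_reaches(0x40160000u, iram_callee));
}
```

If this assumption holds, it would be consistent with the example projects being too small to trigger the problem: their startup code sits close enough to IRAM for a direct call.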
7549d08 moved call_start_cpu0 from IRAM to flash. Unfortunately, this included code that reconfigures the flash if the application’s flash speed differs from the bootloader’s flash speed.
When an application with a flash speed of 80 MHz is started from a bootloader with a flash speed of 40 MHz, it will crash shortly after bootloader_flash_cs_timing_config() returns. This is caused by reading instructions from flash while the flash is being reconfigured to the higher speed, which results in corrupt instruction data being read.
Possible crashes to encounter:
movi.n a8, 1.
As this code runs early during start-up, the device will be stuck in an infinite reboot loop. This means that loading an application built from the 5.5.1 release via OTA onto a device that contains an older bootloader with a slower flash speed will brick the device.
This PR fixes the problem by moving the flash configuration calls into a helper function that is placed in IRAM. The call to increase the flash clock has also been moved so that the GPIO settings and SPI dummy cycles are set to safe values before the clock is increased.
The commit that introduces the problem has already been backported to 5.5.1 as e1faf67, so this fix will also have to be backported.
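To illustrate the ordering described above, here is a minimal host-side sketch. It is not the actual patch. The four stub functions stand in for the bootloader_flash_* routines, image_header_t is a hypothetical substitute for the real image header type, and IRAM_ATTR is defined empty so the sketch compiles outside ESP-IDF. It only shows the idea: a single helper, marked for IRAM, that sets the GPIO and dummy-cycle configuration before raising the clock.

```c
#include <assert.h>

/* In ESP-IDF, IRAM_ATTR places a function in internal RAM. It is defined
 * empty here only so that this sketch builds on a host machine. */
#ifndef IRAM_ATTR
#define IRAM_ATTR
#endif

/* Hypothetical stand-in for the image header; only one field is modeled. */
typedef struct {
    int spi_speed_mhz;
} image_header_t;

typedef enum { STEP_GPIO, STEP_DUMMY, STEP_CLOCK, STEP_CS_TIMING } step_t;

/* Call log so the ordering can be inspected in a host build. */
#define MAX_STEPS 8
static step_t call_log[MAX_STEPS];
static int call_count = 0;

static void record(step_t step)
{
    assert(call_count < MAX_STEPS);
    call_log[call_count++] = step;
}

/* Stubs standing in for the real flash configuration routines. */
static void flash_gpio_config(const image_header_t *h)  { (void)h; record(STEP_GPIO); }
static void flash_dummy_config(const image_header_t *h) { (void)h; record(STEP_DUMMY); }
static void flash_clock_config(const image_header_t *h) { (void)h; record(STEP_CLOCK); }
static void flash_cs_timing_config(void)                { record(STEP_CS_TIMING); }

/* The helper lives in IRAM so that no instruction has to be fetched from
 * flash while the flash interface is being reconfigured. */
void IRAM_ATTR configure_flash(const image_header_t *h)
{
    flash_gpio_config(h);       /* safe pin settings first */
    flash_dummy_config(h);      /* then dummy cycles for the new speed */
    flash_clock_config(h);      /* only now raise the clock */
    flash_cs_timing_config();   /* finally adjust CS timing */
}
```

On a real target the stubs would be replaced by the actual routines, and any function the helper calls would need to be reachable without fetching code from flash.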
Note
Moves ESP32 flash config sequence into IRAM helper configure_flash() and invokes it during early init (when PSRAM HW init is disabled) to avoid executing from flash while reconfiguring.
Adds configure_flash() to adjust flash GPIO, dummy cycles, clock, and CS timing safely.
Replaces bootloader_flash_* calls with configure_flash(&fhdr) in early init when !CONFIG_SPIRAM_BOOT_HW_INIT.
Applies only under CONFIG_IDF_TARGET_ESP32; no behavior change for other targets.
Written by Cursor Bugbot for commit 7565d53.