MattiasTF commented Nov 19, 2025

Commit 7549d08 moved call_start_cpu0 from IRAM to flash. Unfortunately, the moved code includes the calls that reconfigure the flash when the application’s flash speed differs from the bootloader’s flash speed.

When an application with a flash speed of 80 MHz is started from a bootloader with a flash speed of 40 MHz, it will crash shortly after bootloader_flash_cs_timing_config() returns. This is caused by reading instructions from flash while the flash is being reconfigured to the higher speed, which results in corrupt instruction data being read.

Possible crashes include:

  • IllegalInstruction at an address between two assembly instructions, possibly from a corrupt previous instruction that was shorter or longer than what should have been there.
  • IllegalInstruction at an address that should contain a valid instruction.
  • LoadProhibited at an address that should contain an instruction that cannot cause that exception, such as movi.n a8, 1.

As this code runs early during start-up, the device will be stuck in an infinite reboot loop. This means that loading an application built from the 5.5.1 release via OTA onto a device that contains an older bootloader with a slower flash speed will brick the device.

This PR fixes the problem by moving the flash configuration calls into a helper function that is placed in IRAM. The call to increase the flash clock has also been moved so that the GPIO settings and SPI dummy cycles are set to safe values before the clock is increased.
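
A minimal sketch of the helper described above, written so that it also builds on a host. It assumes the call order listed in the PR summary (GPIO, dummy cycles, clock, CS timing); the actual patch may differ in detail. The type `image_header_stub_t`, the call log, and the stub bodies are hypothetical stand-ins for ESP-IDF's `esp_image_header_t` and the real `bootloader_flash_*` functions. The `noinline` attribute is an assumption of this sketch: without it, the compiler could inline the helper back into its flash-resident caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* On target, IRAM_ATTR comes from esp_attr.h and places a function in IRAM.
 * It is defined as empty here so the sketch builds on a host. */
#ifndef IRAM_ATTR
#define IRAM_ATTR
#endif

/* Hypothetical stand-in for esp_image_header_t (only one field kept). */
typedef struct {
    uint8_t spi_speed;
} image_header_stub_t;

/* Call log, used only to make the ordering observable in this sketch. */
enum { STEP_GPIO = 1, STEP_DUMMY, STEP_CLOCK, STEP_CS_TIMING };
int g_steps[4];
int g_step_count;

void record_step(int step)
{
    if (g_step_count < 4) {
        g_steps[g_step_count++] = step;
    }
}

/* Host stubs named after the functions in the thread; on target the real
 * implementations already live in IRAM (see the 0x4008.... addresses). */
void bootloader_flash_gpio_config(const image_header_stub_t *h)  { (void)h; record_step(STEP_GPIO); }
void bootloader_flash_dummy_config(const image_header_stub_t *h) { (void)h; record_step(STEP_DUMMY); }
void bootloader_flash_clock_config(const image_header_stub_t *h) { (void)h; record_step(STEP_CLOCK); }
void bootloader_flash_cs_timing_config(void)                     { record_step(STEP_CS_TIMING); }

/* The whole sequence runs from IRAM, so no instruction has to be fetched
 * from flash while the flash interface is in an intermediate state. */
__attribute__((noinline))
void IRAM_ATTR configure_flash(const image_header_stub_t *fhdr)
{
    assert(fhdr != NULL);
    bootloader_flash_gpio_config(fhdr);   /* pins first */
    bootloader_flash_dummy_config(fhdr);  /* then dummy cycles */
    bootloader_flash_clock_config(fhdr);  /* only now raise the clock */
    bootloader_flash_cs_timing_config();  /* finally CS timing */
}
```

On target, the caller would pass the application's image header, as in the `configure_flash(&fhdr)` call mentioned in the summary.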

The commit that introduced the problem has already been backported to 5.5.1 as e1faf67, so this fix will also have to be backported.


Note

Moves ESP32 flash config sequence into IRAM helper configure_flash() and invokes it during early init (when PSRAM HW init is disabled) to avoid executing from flash while reconfiguring.

  • ESP32 boot/flash init:
    • Add IRAM-only helper configure_flash() to adjust flash GPIO, dummy cycles, clock, and CS timing safely.
    • Replace inline bootloader_flash_* calls with configure_flash(&fhdr) in early init when !CONFIG_SPIRAM_BOOT_HW_INIT.
    • Guard with CONFIG_IDF_TARGET_ESP32; no behavior change for other targets.

Written by Cursor Bugbot for commit 7565d53.

Reading from the flash while it is being reconfigured leads to data
corruption and a crash when the reconfiguration code is located in flash.
This is only an issue if a device has a bootloader that runs with 40 MHz
flash and an application flashed via OTA that runs with 80 MHz flash.
If bootloader and application run with the same flash speed, the
reconfiguration is basically a no-op and no data corruption occurs.
Fix reconfiguration by placing the code back into IRAM.

github-actions bot changed the title from "fix(hw_support): Fix crash when reconfiguring flash from 40 to 80 MHz" to "fix(hw_support): Fix crash when reconfiguring flash from 40 to 80 MHz (IDFGH-16831)" on Nov 19, 2025
espressif-bot added the label "Status: Opened" (Issue is new) on Nov 19, 2025
ginkgm (Collaborator) commented Nov 30, 2025

Hi @MattiasTF,

Thanks for your report. Your proposal looks reasonable. However, we haven't been able to reproduce the issue on our side. To make sure the fix works well, could you please provide an app (preferably one of the examples), the ESP-IDF commit SHA, and the sdkconfig of the bootloader and the app that reproduce the issue?

Thanks in advance.

MattiasTF (Author) commented Dec 2, 2025

I have finally managed to reproduce the issue consistently. Apparently, the problem only manifests if the application is very large and the IRAM is so full that the linker has to use longcalls for the four functions that reconfigure the flash. Every example project I tried was too small to need longcalls, and I couldn’t find a way to force the linker to always use them, so please use the crude patch in step 2 to enforce longcalls.

I’m not providing an sdkconfig because the steps below use the defaults, besides the change in step 6.

Steps to reproduce

  1. Check out master (683ddf8).
  2. Apply patch to enforce longcalls.
  3. cd examples/get-started/hello_world
  4. idf.py set-target esp32
  5. idf.py flash

The ESP32 is now running a 40 MHz bootloader with a 40 MHz application, which works fine.
The log will show spi_speed 0x0 to indicate running at 40 MHz.

  6. idf.py menuconfig -> Select ESPTOOLPY_FLASHFREQ_80M and save.
  7. idf.py build
  8. esptool --chip esp32 -b 460800 --before default-reset --after hard-reset write-flash --flash-mode dio --flash-size 2MB --flash-freq 80m 0x10000 build/hello_world.bin

The ESP32 is now running a 40 MHz bootloader with an 80 MHz application, which will hang until it is restarted by the RTC watchdog.

  9. Edit CMakeLists.txt and remove "idf_build_set_property(MINIMAL_BUILD ON)"
  10. idf.py build
  11. esptool --chip esp32 -b 460800 --before default-reset --after hard-reset write-flash --flash-mode dio --flash-size 2MB --flash-freq 80m 0x10000 build/hello_world.bin

The ESP32 is now running a 40 MHz bootloader with an 80 MHz application, which will crash.

  12. idf.py flash

The ESP32 is now running an 80 MHz bootloader with an 80 MHz application, which works fine.
The log will show spi_speed 0xF to indicate running at 80 MHz.
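
For readers checking their own images, the spi_speed values in the log can be decoded with a small helper. This is a sketch under stated assumptions: the thread only confirms 0x0 (40 MHz) and 0xF (80 MHz); the values 0x1 and 0x2 and the header layout (speed in the low nibble of byte 3) are taken from my understanding of the ESP32 image format and should be checked against the ESP-IDF documentation. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Map the spi_speed field of an ESP32 image header to MHz.
 * 0x0 and 0xF are confirmed by the log output quoted in this thread;
 * 0x1 and 0x2 are assumptions based on the ESP32 image format.
 * Returns -1 for values this sketch does not know. */
int flash_freq_mhz(uint8_t spi_speed)
{
    switch (spi_speed & 0x0F) {
    case 0x0: return 40;
    case 0x1: return 26;
    case 0x2: return 20;
    case 0xF: return 80;
    default:  return -1;
    }
}

/* Extract spi_speed from the first bytes of an image.
 * Assumed layout: byte 0 magic, byte 1 segment count, byte 2 SPI mode,
 * byte 3 = flash size (high nibble) | flash speed (low nibble). */
uint8_t header_spi_speed(const uint8_t *hdr)
{
    assert(hdr != NULL);
    return (uint8_t)(hdr[3] & 0x0F);
}
```

A mismatch between the value decoded from the bootloader image and the one decoded from the application image is the condition under which the reconfiguration is not a no-op.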

Hardware used

The tests were performed with an ESP32-WROOM-32E module that uses an ESP32-D0WD-V3.
It has 16 MB of flash instead of the 2 MB used by the example, but that makes no difference.

Additional information

The relevant disassembly from my original crashing application looks like this:

40172bc7:       01ad00          slli    a10, a13, 32
40172bca:       6b7081          l32r    a8, 4014d98c <prepare_main+0x2ccc> (400835c4 <bootloader_flash_clock_config>)
40172bcd:       0008e0          callx8  a8
40172bd0:       01ad            mov.n   a10, a1
40172bd2:       6b6f81          l32r    a8, 4014d990 <prepare_main+0x2cd0> (400835fc <bootloader_flash_gpio_config>)
40172bd5:       0008e0          callx8  a8
40172bd8:       01ad            mov.n   a10, a1
40172bda:       6b6e81          l32r    a8, 4014d994 <prepare_main+0x2cd4> (400837e8 <bootloader_flash_dummy_config>)
40172bdd:       0008e0          callx8  a8
40172be0:       6b6e81          l32r    a8, 4014d998 <prepare_main+0x2cd8> (40083554 <bootloader_flash_cs_timing_config>)
40172be3:       0008e0          callx8  a8
40172be6:       6b5e71          l32r    a7, 4014d960 <prepare_main+0x2ca0> (3ffb8f31 <s_cpu_inited>)
40172be9:       180c            movi.n  a8, 1
40172beb:       0020c0          memw
40172bee:       004782          s8i     a8, a7, 0

Note the use of longcalls (l32r + callx8).

When building hello_world without the patch, the relevant disassembly looks like this:

400d141c:       01ad            mov.n   a10, a1
400d141e:       b18c25          call8   40082ce0 <bootloader_flash_clock_config>
400d1421:       01ad            mov.n   a10, a1
400d1423:       b190a5          call8   40082d2c <bootloader_flash_gpio_config>
400d1426:       01ad            mov.n   a10, a1
400d1428:       b1b065          call8   40082f30 <bootloader_flash_dummy_config>
400d142b:       b18465          call8   40082c70 <bootloader_flash_cs_timing_config>
400d142e:       fb3f81          l32r    a8, 400d012c <_stext+0x10c> (3ffb2444 <s_cpu_inited>)
400d1431:       190c            movi.n  a9, 1
400d1433:       0020c0          memw
400d1436:       004892          s8i     a9, a8, 0

Note that the code is located at lower addresses, close enough to the IRAM functions that call8 can be used, which produces shorter machine code.
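
The reach of call8 can be checked against the addresses in the two listings. The sketch below assumes the Xtensa CALL8 encoding as I understand it: an 18-bit signed word offset relative to the call site rounded down to a 4-byte boundary, plus 4. The function names are hypothetical. With these assumptions, the call at 0x400d141e reaches 0x40082ce0 (about 314 KiB away), while a call8 placed at 0x40172bca could not reach 0x400835c4 (about 957 KiB away), which is consistent with the longcalls seen in the large application.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Byte distance from the CALL8 base address to the target.
 * Assumed base: (pc rounded down to 4 bytes) + 4. */
int64_t call8_delta(uint32_t pc, uint32_t target)
{
    int64_t base = (int64_t)(pc & ~UINT32_C(3)) + 4;
    return (int64_t)target - base;
}

/* True if the target is 4-byte aligned relative to the base and the
 * distance fits into an 18-bit signed word offset. */
bool call8_in_range(uint32_t pc, uint32_t target)
{
    int64_t d = call8_delta(pc, target);
    return (d % 4 == 0) && d >= -524288 && d <= 524284;
}

/* The signed word offset that would be encoded in the instruction. */
int32_t call8_imm18(uint32_t pc, uint32_t target)
{
    assert(call8_in_range(pc, target));
    return (int32_t)(call8_delta(pc, target) / 4);
}
```

As a cross-check, the offset computed for the first call8 in the listing above is -80336, which matches bits 23..6 of the instruction word b18c25 when read as a signed 18-bit value.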

My guess is that the shorter code is fully cached and can be executed correctly while the flash is being reconfigured. If the code uses the larger longcall variant, it doesn’t fit into the cache and more code has to be loaded while the flash is incorrectly configured.

Instead of moving the calls to a function in IRAM, the crash or hang can also be fixed by moving the call to bootloader_flash_clock_config after bootloader_flash_cs_timing_config, but I wouldn’t want to rely on that.

For reference, the disassembly with the patch to force longcalls looks like this:

400d1588:       fae981          l32r    a8, 400d012c <_stext+0x10c> (40082cec <bootloader_flash_clock_config>)
400d158b:       01ad            mov.n   a10, a1
400d158d:       0008e0          callx8  a8
400d1590:       fae881          l32r    a8, 400d0130 <_stext+0x110> (40082d38 <bootloader_flash_gpio_config>)
400d1593:       01ad            mov.n   a10, a1
400d1595:       0008e0          callx8  a8
400d1598:       fae781          l32r    a8, 400d0134 <_stext+0x114> (40082f3c <bootloader_flash_dummy_config>)
400d159b:       01ad            mov.n   a10, a1
400d159d:       0008e0          callx8  a8
400d15a0:       fae681          l32r    a8, 400d0138 <_stext+0x118> (40082c7c <bootloader_flash_cs_timing_config>)
400d15a3:       0008e0          callx8  a8
400d15a6:       fae581          l32r    a8, 400d013c <_stext+0x11c> (3ffb2644 <s_cpu_inited>)
400d15a9:       190c            movi.n  a9, 1
400d15ab:       0020c0          memw
400d15ae:       004892          s8i     a9, a8, 0

The order of l32r and mov.n is different, but the size of the code is almost the same.
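
One way to probe the caching guess against the three listings is to ask whether the instructions that follow the call to bootloader_flash_clock_config stay inside a cache line that was already touched before the call. The sketch below assumes 32-byte flash cache lines on the ESP32 and ignores prefetching, the l32r literal loads, and whatever else might already be cached, so it can only show consistency with the guess, not prove it. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the ESP32 flash cache works on 32-byte lines. */
#define CACHE_LINE_BYTES 32u

uint32_t cache_line_base(uint32_t addr)
{
    return addr & ~(CACHE_LINE_BYTES - 1u);
}

/* call_last_byte: address of the last byte of the call instruction that
 *                 jumps to bootloader_flash_clock_config.
 * seq_last_byte:  address of the last byte of the sequence (the s8i).
 * Returns the base of the first cache line the sequence enters only after
 * the call, or 0 if the rest of the sequence stays in the line that the
 * call instruction already touched. */
uint32_t first_new_line_after_call(uint32_t call_last_byte, uint32_t seq_last_byte)
{
    assert(seq_last_byte >= call_last_byte);
    uint32_t next = cache_line_base(call_last_byte) + CACHE_LINE_BYTES;
    return (next <= seq_last_byte) ? next : 0;
}
```

Under these assumptions, the call8 build (call at 0x400d141e, sequence ending at 0x400d1438) never enters a new line after the clock change. Both longcall builds do: the original application at 0x40172be0 and the patched hello_world at 0x400d15a0, in each case at the l32r that loads the address of bootloader_flash_cs_timing_config. That matches which builds fail, but other explanations remain possible.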

Additional suggestions

In my PR, I call the newly created function configure_flash from system_early_init because that’s where the calls were before. It might be more sensible to move the call to call_start_cpu0 and place it before the separator comment in line 964, because that’s where all the other external memory configuration happens. Feel free to edit the branch however you see fit.
