Description
There is a race condition between workflow state saving and retrieval during workflow startup that causes intermittent "not found" errors when activities try to access workflow configuration from the StateStore.
When a workflow is initiated using TemporalClient.start_workflow(), the following steps take place:
- The workflow configuration is saved to the StateStore through StateStore.save_state_object().
- The workflow starts immediately.
- During the first activity of the workflow, get_workflow_args() is called to retrieve the configuration via StateStore.get_state().
- Race condition: The configuration may not be available yet, causing "not found" errors.
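The sequence above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the SDK's code: `ToyStateStore` is a hypothetical in-memory stand-in, `FileNotFoundError` stands in for whatever "not found" error the real StateStore raises, and the save is assumed to return before its upload has finished.

```python
import asyncio


class ToyStateStore:
    """Hypothetical in-memory stand-in; NOT the SDK's StateStore."""

    def __init__(self, upload_delay):
        self._objects = {}
        self._upload_delay = upload_delay
        self.pending = []

    async def _upload(self, key, value):
        # Simulated object store upload that takes some time.
        await asyncio.sleep(self._upload_delay)
        self._objects[key] = value

    async def save_state_object(self, key, value):
        # Assumption: the save schedules the upload and returns
        # without waiting for it to complete.
        self.pending.append(asyncio.create_task(self._upload(key, value)))

    async def get_state(self, key):
        if key not in self._objects:
            raise FileNotFoundError(f"No state found for {key}")
        return self._objects[key]


async def start_workflow(store, workflow_id, config):
    await store.save_state_object(workflow_id, config)
    try:
        # The "first activity" reads the configuration right away.
        await store.get_state(workflow_id)
        return "ok"
    except FileNotFoundError:
        return "not found"
    finally:
        # Let the simulated upload finish so no task is left dangling.
        await asyncio.gather(*store.pending)


def demo():
    store = ToyStateStore(upload_delay=0.05)
    return asyncio.run(start_workflow(store, "wf-1", {"repo": "example"}))
```

In this model the read always loses because the upload has not run yet. In the real system the outcome depends on timing, which matches the intermittent failures described here.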
Root Cause Analysis
File: application_sdk\clients\temporal.py (lines 262-280)
The workflow startup sequence has a timing gap:
File: application_sdk\services\statestore.py (lines 243-267)
The save_state_object() method has asynchronous operations:
The get_state() method fails when object store upload hasn't completed:
Impact:
- Intermittent workflow failures during startup.
- Timing-dependent behavior that is difficult to reproduce consistently.
- Users are compelled to implement workarounds with retry logic in their activities.
My current workaround is to implement retry logic in my get_workflow_args() overrides.
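A retry workaround of this kind might look roughly like the sketch below. It is an assumption-laden illustration rather than the actual override: `get_state_with_retry` and `fetch` are hypothetical names, `FileNotFoundError` stands in for the real "not found" error, and the attempt count and delays are arbitrary.

```python
import asyncio


async def get_state_with_retry(fetch, attempts=5, base_delay=0.01):
    """Call `fetch` until it succeeds or the attempts run out.

    `fetch` is a hypothetical zero-argument async callable that wraps
    the state lookup and raises FileNotFoundError (an assumed stand-in
    for the real error) while the state is not yet available.
    """
    for attempt in range(attempts):
        try:
            return await fetch()
        except FileNotFoundError:
            if attempt == attempts - 1:
                raise  # give up after the last attempt
            # Exponential backoff between attempts.
            await asyncio.sleep(base_delay * 2 ** attempt)


def demo():
    calls = {"n": 0}

    async def flaky_fetch():
        # Fails twice, then succeeds, mimicking a delayed upload.
        calls["n"] += 1
        if calls["n"] < 3:
            raise FileNotFoundError("No state found")
        return {"repo": "example"}

    result = asyncio.run(get_state_with_retry(flaky_fetch))
    return result, calls["n"]
```

Inside a get_workflow_args() override, `fetch` would wrap the existing StateStore.get_state() call. This hides the race from the activity but does not remove it, which is why it is only a workaround.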
Reproduction Repo/Script (if any)
https://github.com/drockparashar/githubConnector
Reproduction Steps
1. Import SDK
2. Call method `...`
3. Observe error `...`
Logs / Tracebacks
Expected vs Actual
Expected Behavior: When a workflow is initiated using TemporalClient.start_workflow(), the workflow configuration should be instantly accessible to activities that invoke get_workflow_args(). The expected sequence of events is as follows:
- StateStore.save_state_object() saves the workflow configuration to the object store.
- The method returns successfully, indicating that the state has been persisted and is available.
- The Temporal workflow starts and calls get_workflow_args().
- StateStore.get_state() successfully retrieves the configuration.
- The workflow proceeds without any errors.
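The expected sequence can be sketched as a save that returns only once the state is readable. This is a hypothetical model, not a proposed patch: `AwaitingStateStore` is an invented in-memory stand-in, and the sleep merely simulates an object store upload.

```python
import asyncio


class AwaitingStateStore:
    """Hypothetical stand-in whose save waits for the upload to finish."""

    def __init__(self, upload_delay):
        self._objects = {}
        self._upload_delay = upload_delay

    async def save_state_object(self, key, value):
        # The simulated upload is awaited before returning, so a
        # successful return implies the state is persisted.
        await asyncio.sleep(self._upload_delay)
        self._objects[key] = value
        return value

    async def get_state(self, key):
        if key not in self._objects:
            raise FileNotFoundError(f"No state found for {key}")
        return self._objects[key]


async def start_workflow(store, workflow_id, config):
    await store.save_state_object(workflow_id, config)
    # The first activity can now read the configuration safely.
    return await store.get_state(workflow_id)


def demo_expected():
    store = AwaitingStateStore(upload_delay=0.05)
    return asyncio.run(start_workflow(store, "wf-1", {"repo": "example"}))
```

With this ordering the read succeeds however long the upload takes, at the cost of a slower workflow start.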
Actual Behavior: The workflow configuration is intermittently unavailable immediately after being saved, resulting in "not found" errors:
- StateStore.save_state_object() appears to complete successfully.
- The Temporal workflow starts right away and invokes get_workflow_args().
- StateStore.get_state() fails with an "object not found" error at lines 113-117 in statestore.py.
- As a result, the workflow fails to start and generates a "No state found" message.
Environment
None
SDK Version
0.1.1rc46