chore: Mode resolution and switching for FDv2#326
chore: Mode resolution and switching for FDv2#326aaron-zeisler wants to merge 14 commits intomainfrom
Conversation
launchdarkly-android-client-sdk/src/main/java/com/launchdarkly/sdk/android/FDv2DataSource.java
Show resolved
Hide resolved
launchdarkly-android-client-sdk/src/main/java/com/launchdarkly/sdk/android/ModeAware.java
Outdated
Show resolved
Hide resolved
| } else if (dataSource == null || dataSource.needsRefresh(!foreground, | ||
| currentContext.get())) { | ||
| updateDataSource(true, LDUtil.noOpCallback()); | ||
| } |
There was a problem hiding this comment.
Update each platform state listener to call handleModeState(...). Inside handleModeState, decide if you need to change the eventProcessor state and decide if you need to change the data source state.
connectivityChangeListener = networkAvailable -> {
handleModeStateChange()
}
foregroundListener = foreground -> {
handleModeStateChange()
}
handleModeStateChange() {
ModeState state = new ModeState(
platformState.isForeground(),
platformState.isNetworkAvailable()
)
updateEventProcessor(state)
updateDataSource(state)
}
updateEventProcessor(newModeState) {
eventProcessor.setOffline(forcedOffline.get() || !networkAvailable);
eventProcessor.setInBackground(!foreground);
}
updateDataSource(newModeState) {
if (dataSource instanceof ModeAware) {
resolveAndSwitchMode((ModeAware) dataSource, newModeState);
} else {
updateDataSource(false, LDUtil.noOpCallback());
}
}
There was a problem hiding this comment.
I've addressed this in the commit titled "refactor: separate event processor and data source logic in ConnectivityManager"
There was a problem hiding this comment.
This event processor logic should be moved out of update datasource.
There was a problem hiding this comment.
I addressed this in yesterday's commits 👍
| this.evaluationContext = evaluationContext; | ||
| this.dataSourceUpdateSink = dataSourceUpdateSink; | ||
| this.logger = logger; | ||
| this.modeTable = Collections.unmodifiableMap(new EnumMap<>(modeTable)); |
There was a problem hiding this comment.
Probably shouldn't make a copy. I would expect the modeTable to already be an immutable.
| this.evaluationContext = evaluationContext; | ||
| this.dataSourceUpdateSink = dataSourceUpdateSink; | ||
| this.logger = logger; | ||
| this.modeTable = null; |
There was a problem hiding this comment.
Consider Collections.emptyMap() iirc.
|
|
||
| ResolvedModeDefinition startDef = modeTable.get(startingMode); | ||
| if (startDef == null) { | ||
| throw new IllegalArgumentException("No mode definition for starting mode: " + startingMode); |
There was a problem hiding this comment.
This is a runtime exception, it would be better to log an error. A future call to switchMode could possible be valid.
Eventually the starting mode will be coming from the customer, so there may be cases where they can provide invalid inputs if we don't constrain them with a Builder pattern or such. It seems like in any case where invalid configuration input was provided, the SDK could just be offline and not making connections.
There was a problem hiding this comment.
Note to self: Look at line 240 to understand what happens if the SourceManager has no initializers or synchronizers.
| logger.warn("switchMode({}) called but no mode table configured", newMode); | ||
| return; | ||
| } | ||
| if (stopped.get()) { |
There was a problem hiding this comment.
stopped check probably would be before modeTable check.
| */ | ||
| @Override | ||
| public void switchMode(@NonNull ConnectionMode newMode) { | ||
| if (modeTable == null) { |
There was a problem hiding this comment.
This may turn into a empty check instead of a null check due to my emptyMap comment above.
| sourceManager = newManager; | ||
| if (oldManager != null) { | ||
| oldManager.close(); | ||
| } |
There was a problem hiding this comment.
Look into race condition. There may already be an execution thread interacting with the old manager. Is it ok to start the next execution task before the previous one finishes? Probably not to avoid an out of order push of data into the update sink.
There was a problem hiding this comment.
We can perform the swap inside SourceManager. SourceManager already has utilities to manage synchronizers in a thread-safe manner. Idea: We add a switchMode() method to SourceManager. The benefit is then the DataSource must work w/ the SourceManager to retrieve synchronizers via getNextInitializerAndSetActive(), so SourceManager can swap them under the hood.
There was a problem hiding this comment.
I've addressed this in the commit titled "refactor: move synchronizer switching into SourceManager to prevent race condition".
I added a function called switchSynchronizers() to be really precise, but I can make it switchMode() (and pass the entire Mode object) if you'd prefer.
I also used an atomic boolean "executionLoopRunning" to prevent two concurrent runSynchronizer() loops. I'm not sure if this is the best thing to do ... let me know what you think.
| ) { | ||
| try { | ||
| Synchronizer synchronizer = sourceManager.getNextAvailableSynchronizerAndSetActive(); | ||
| Synchronizer synchronizer = sm.getNextAvailableSynchronizerAndSetActive(); |
There was a problem hiding this comment.
Note to self: This could block until a valid synchronizer is available. This is in reference to the comment around startDef being invalid.
Needs more discussion, to be continued tomorrow.
4e24bb9 to
070f859
Compare
ee7bfc4 to
c71379b
Compare
|
I merged the code from the other approach into this branch in the commit titled "refactor: switch to Approach 2 for FDv2 mode resolution and switching", and then I closed that other PR. |
| LDContext newEvaluationContext, | ||
| boolean newInBackground, | ||
| Boolean previouslyInBackground, | ||
| @Nullable TransactionalDataStore transactionalDataStore |
There was a problem hiding this comment.
The overloads of forDataSource need comments or perhaps a different name for the latter version to help someone understand the implication of each.
There was a problem hiding this comment.
I've addressed this in my last push. Is it a problem if the comments directly reference FDv1 and FDv2?
| static final ConnectionMode POLLING = new ConnectionMode("POLLING"); | ||
| static final ConnectionMode OFFLINE = new ConnectionMode("OFFLINE"); | ||
| static final ConnectionMode ONE_SHOT = new ConnectionMode("ONE_SHOT"); | ||
| static final ConnectionMode BACKGROUND = new ConnectionMode("BACKGROUND"); |
There was a problem hiding this comment.
I think JS is using lower case for the String name value as that will eventually be compared to the passed in values, when we support custom modes, as part of the custom mode table merge operation.
The constant itself can be upper case as that is Java style.
There was a problem hiding this comment.
Yes, the js-core code uses lowercase. I've addressed this in my latest push 👍
| }); | ||
|
|
||
| // If the app starts in the background, the builder creates the data source with | ||
| // STREAMING as the starting mode. Perform an initial mode resolution to correct this. |
There was a problem hiding this comment.
I think this is problematic as the call to start just before this could cause the source to make requests and the call to resolveAndSwitchMode would then also make requests wasting previous effort. Wouldn't the factory have the necessary information to know how to create the data source in the correct mode?
Perhaps resolveAndSwitchMode is a no-op in this case?
There is a log that states logger.debug("Creating data source (background={})", inBackground);. This indicates that all code past that is for the new source. I think this indicates resolveAndSwitchMode is in the wrong spot. Re-examine the code paths.
There was a problem hiding this comment.
Notes:
- Is the code below necessary? As Todd mentions, this is happening below the "start" method. Also, there are places above in this function where we return early for specific reasons.
- Are the variables created in lines 238-243 used here? Review that code and its downstream effects.
- Are FDv1 object and FDv2 objects being managed correctly and independently? ModeAware is the mechanism to make this decision. Where is the decision being made to use FDv1 or FDv2.
- Test-driven development would be useful here. Create test cases that use FDv1, then create separate case that use FDv2, and compare / debug. Are the objects' lifecycles in line with the concepts of DSv1 vs FDv2?
| updateDataSource(true, LDUtil.noOpCallback()); | ||
| } else { | ||
| updateDataSource(false, LDUtil.noOpCallback()); | ||
| } |
There was a problem hiding this comment.
Let's pair on this block. I think the resolveAndSwitchMode call outside of updateDataSource are going to make rationalizing about the code difficult. Perhaps there is a FDv1 vs FDv2 data source interaction / complexity I'm missing.
There was a problem hiding this comment.
Note: This code block is handling some of the logic that should be inside updateDataSource. Can updateDataSource call resolveAndSwitchMode? And can it also perform the needsRefresh() check that's happening in this if/switch statement?
| ModeState state = new ModeState( | ||
| foreground && !forceOffline, | ||
| networkAvailable && !forceOffline | ||
| ); |
There was a problem hiding this comment.
foreground && !forceOffline
I can understand why forceOffline would cause us to act like network is unavailable, but I'm not sure why forceOffline is causing ModeState to act like it is in the background. That feels like too big of a jump in logic to be reasonable and I think it corrupts the ModeState. It seems like you know it is ok because you know how it will resolve, but to maintain separation of concerns, you shouldn't know it is ok.
There was a problem hiding this comment.
In my latest push, I'm handling the OFFLINE case first, and then creating the ModeState by simply feeding the platformState variables. Is this what you had in mind?
| )); | ||
| table.put(ConnectionMode.POLLING, new ModeDefinition( | ||
| Collections.singletonList(pollingInitializer), | ||
| Collections.singletonList(pollingSynchronizer) |
There was a problem hiding this comment.
We need a story in the epic for ensuring that there is a delay between the polling initializer and the polling synchronizer, otherwise we end up with two fast back to back requests.
We can't remove the polling initializer and just have the polling synchronizer because the switch from initializers to synchronizers is critical for communicating that the SDK is initialized and running.
| Collections.<ComponentConfigurer<Synchronizer>>emptyList() | ||
| )); | ||
| table.put(ConnectionMode.ONE_SHOT, new ModeDefinition( | ||
| Arrays.asList(pollingInitializer, pollingInitializer, pollingInitializer), |
There was a problem hiding this comment.
pollingInitializer, pollingInitializer, pollingInitializer
Seems like placeholders. We can pair on making a synchronizer -> initializer adapter so we can at least put in the streaming synchronizer as an initializer where necessary.
| )); | ||
| table.put(ConnectionMode.BACKGROUND, new ModeDefinition( | ||
| Collections.singletonList(pollingInitializer), | ||
| Collections.singletonList(backgroundPollingSynchronizer) |
There was a problem hiding this comment.
Same back to back request concern here.
| Map<ConnectionMode, ResolvedModeDefinition> getResolvedModeTable() { | ||
| if (resolvedModeTable == null) { | ||
| throw new IllegalStateException("build() must be called before getResolvedModeTable()"); | ||
| } |
There was a problem hiding this comment.
Let's discuss the runtime exceptions during pairing.
| throw new IllegalStateException( | ||
| "ModeResolutionTable has no matching entry for state: " + | ||
| "foreground=" + state.isForeground() + ", networkAvailable=" + state.isNetworkAvailable() | ||
| ); |
There was a problem hiding this comment.
Since this a developer bug, perhaps move last ModeResolutionEntry out of the map and put the default handling here. That is the all unresolved states go to ConnectionMode.STREAMING.
In languages with stronger typing semantics, the compiler could determine exhaustive handling, but Java is not strong enough.
...arkly-android-client-sdk/src/main/java/com/launchdarkly/sdk/android/ConnectivityManager.java
Show resolved
Hide resolved
c71379b to
0b070e6
Compare
...arkly-android-client-sdk/src/main/java/com/launchdarkly/sdk/android/ConnectivityManager.java
Show resolved
Hide resolved
...arkly-android-client-sdk/src/main/java/com/launchdarkly/sdk/android/ConnectivityManager.java
Show resolved
Hide resolved
| if (useFDv2ModeResolution) { | ||
| currentFDv2Mode = resolveMode(); | ||
| FDv2DataSourceBuilder fdv2Builder = (FDv2DataSourceBuilder) dataSourceFactory; | ||
| fdv2Builder.setActiveMode(currentFDv2Mode, true); |
There was a problem hiding this comment.
fdv2Builder isn't used?
There was a problem hiding this comment.
I was confused by this syntax also. I've changed the code to make it more readable in my latest push 👍
| FDv2DataSourceBuilder fdv2Builder = (FDv2DataSourceBuilder) dataSourceFactory; | ||
| ModeDefinition oldDef = fdv2Builder.getModeDefinition(currentFDv2Mode); | ||
| ModeDefinition newDef = fdv2Builder.getModeDefinition(newMode); | ||
| if (oldDef != null && oldDef == newDef) { |
There was a problem hiding this comment.
| if (oldDef != null && oldDef == newDef) { | |
| if (oldDef != null && Objects.equals(oldDef, newDef)) { |
There was a problem hiding this comment.
Object.equals provides null safe equality check. Can both be null? If so, you still need the oldDef != null check.
| } | ||
|
|
||
| // FDv1 path: check whether the data source needs a full rebuild. | ||
| if (!mustReinitializeDataSource && existingDataSource != null) { |
There was a problem hiding this comment.
The comment on this block says FDv1 path, but I think this is also hit in the FDv2 path.
...arkly-android-client-sdk/src/main/java/com/launchdarkly/sdk/android/ConnectivityManager.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Double check that there is sufficient test coverage to protect the FDv1 data source handling in connectivity manager. That is the logic we need to make sure doesn't break.
There was a problem hiding this comment.
Sounds good. I'll work on this specific task in the afternoon today.
e6ed1e7 to
b0b1e2f
Compare
Introduces the core types for FDv2 mode resolution (CONNMODE spec): - ConnectionMode: enum for streaming, polling, offline, one-shot, background - ModeDefinition: initializer/synchronizer lists per mode with stubbed configurers - ModeState: platform state snapshot (foreground, networkAvailable) - ModeResolutionEntry: condition + mode pair for resolution table entries - ModeResolutionTable: ordered first-match-wins resolver with MOBILE default table - ModeAware: interface for DataSources that support runtime switchMode() All types are package-private. No changes to existing code.
Add ResolvedModeDefinition and mode-table constructors so FDv2DataSource can look up initializer/synchronizer factories per ConnectionMode. switchMode() tears down the current SourceManager, builds a new one with the target mode's synchronizers (skipping initializers per CONNMODE 2.0.1), and schedules execution on the background executor. runSynchronizers() now takes an explicit SourceManager parameter to prevent a race where the finally block could close a SourceManager swapped in by a concurrent switchMode().
Introduces FDv2DataSourceBuilder, a ComponentConfigurer<DataSource> that resolves the ModeDefinition table's ComponentConfigurers into DataSourceFactories at build time by capturing the ClientContext. The configurers are currently stubbed (return null); real wiring of concrete initializer/synchronizer types will follow in a subsequent commit.
…ourceBuilder Replace stub configurers with concrete factories that create FDv2PollingInitializer, FDv2PollingSynchronizer, and FDv2StreamingSynchronizer. Shared dependencies (SelectorSource, ScheduledExecutorService) are created once per build() call; each factory creates a fresh DefaultFDv2Requestor for lifecycle isolation. Add FDv2 endpoint path constants to StandardEndpoints. Thread TransactionalDataStore through ClientContextImpl and ConnectivityManager so the builder can construct SelectorSourceFacade from ClientContext.
ConnectivityManager now detects ModeAware data sources and routes foreground, connectivity, and force-offline state changes through resolveAndSwitchMode() instead of the legacy teardown/rebuild cycle.
…d switching Replace Approach 1 implementation with Approach 2, which the team preferred for its cleaner architecture: - ConnectivityManager owns the resolved mode table and performs ModeState -> ConnectionMode -> ResolvedModeDefinition lookup - FDv2DataSource receives ResolvedModeDefinition via switchMode() and has no internal mode table - FDv2DataSourceBuilder uses a unified ComponentConfigurer-based code path for both production and test mode tables - ResolvedModeDefinition is a top-level class rather than an inner class of FDv2DataSource - ConnectionMode is a final class with static instances instead of a Java enum Made-with: Cursor
FDv2DataSource now explicitly implements both DataSource and ModeAware, keeping the two interfaces independent. Made-with: Cursor
…n ConnectivityManager Extract updateEventProcessor() and handleModeStateChange() so that event processor state (setOffline, setInBackground) is managed independently from data source lifecycle. Both platform listeners and setForceOffline() now route through handleModeStateChange(), which snapshots state once and updates each subsystem separately. Made-with: Cursor
…o prevent race condition SourceManager now owns a switchSynchronizers() method that atomically swaps the synchronizer list under the existing lock, eliminating the window where two runSynchronizers() loops could push data into the update sink concurrently. FDv2DataSource keeps a single final SourceManager and uses an AtomicBoolean guard to ensure only one execution loop runs at a time. Made-with: Cursor
…pdateDataSource handleModeStateChange() now simply updates the event processor and delegates to updateDataSource(). The FDv2 ModeAware early-return and FDv1 needsRefresh() check both live inside updateDataSource, keeping the branching logic in one place. Made-with: Cursor
…nectivityManager level Per updated CSFDV2 spec and JS implementation, mode switching now tears down the old data source and builds a new one rather than swapping internal synchronizers. Delete ModeAware interface, remove switchMode() from FDv2DataSource and switchSynchronizers() from SourceManager. FDv2DataSourceBuilder becomes the sole owner of mode resolution via setActiveMode()/build(), with ConnectivityManager using a useFDv2ModeResolution flag to route FDv2 through the new path while preserving FDv1 behavior. Implements CSFDV2 5.3.8 (retain data source when old and new modes share the same ModeDefinition). Made-with: Cursor
- Short-circuit forceOffline in resolveMode() so ModeState reflects actual platform state - Match ConnectionMode string values to cross-SDK spec (lowercase, hyphenated) - Add Javadoc to ConnectionMode, ClientContextImpl overloads, and FDv2DataSource internals - Inline FDv2DataSourceBuilder casts in ConnectivityManager - Restore try/finally and explanatory comments in runSynchronizers Made-with: Cursor
b0b1e2f to
ba6ac91
Compare
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 3 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.
| Collections.singletonList(pollingInitializer), | ||
| Collections.singletonList(backgroundPollingSynchronizer) | ||
| )); | ||
| return table; |
There was a problem hiding this comment.
Mode table uses duplicate pollingInitializer for all initializer slots
High Severity
The makeDefaultModeTable() uses pollingInitializer for every initializer slot, instead of distinct initializer types. Per the development plan, STREAMING mode needs [cache, polling] but gets [polling, polling], ONE_SHOT needs [cache, polling, streaming] but gets [polling, polling, polling], and OFFLINE needs [cache] but gets [polling]. This causes duplicate redundant HTTP polling requests during initialization (two for STREAMING, three for ONE_SHOT) and will attempt a network poll in OFFLINE mode when no network is available.
| [9]: /Users/azeisler/code/launchdarkly/js-core/packages/shared/sdk-client/src/datasource/fdv2/FDv2DataSource.ts | ||
| [10]: /Users/azeisler/code/launchdarkly/js-core/packages/shared/sdk-client/src/datasource/fdv2/SourceManager.ts | ||
| [11]: /Users/azeisler/code/launchdarkly/js-core/packages/shared/sdk-client/src/datasource/fdv2/Conditions.ts | ||
| [12]: /Users/azeisler/code/launchdarkly/js-core/packages/shared/sdk-client/src/datasource/fdv2/CacheInitializer.ts |
There was a problem hiding this comment.
Local filesystem paths accidentally committed in documentation
Low Severity
The reference link definitions contain absolute local filesystem paths like /Users/azeisler/code/launchdarkly/js-core/... and /Users/azeisler/code/launchdarkly/android-client-sdk/.... These are personal machine paths that expose a developer's local environment details and won't resolve for anyone else.
Additional Locations (1)
| // FDv2 mode resolution already accounts for offline/background states via | ||
| // the ModeResolutionTable, so we always rebuild when the mode changed. | ||
| shouldStopExistingDataSource = mustReinitializeDataSource; | ||
| shouldStartDataSourceIfStopped = true; |
There was a problem hiding this comment.
FDv2 updateDataSource skips initialized=true for force-offline startup
Medium Severity
When useFDv2ModeResolution is true, the if (useFDv2ModeResolution) branch at line 233 always sets shouldStartDataSourceIfStopped = true, bypassing the FDv1 forceOffline/!networkEnabled/backgroundUpdatingDisabled branches that set initialized = true and update ConnectionInformation status. For FDv2 starting in force-offline or no-network state, isInitialized() won't return true until the OFFLINE-mode data source's async start callback fires, creating a behavioral gap compared to FDv1 where initialization is immediate for these states.


Details coming soon
Requirements
Related issues
Provide links to any issues in this repository or elsewhere relating to this pull request.
Describe the solution you've provided
Provide a clear and concise description of what you expect to happen.
Describe alternatives you've considered
Provide a clear and concise description of any alternative solutions or features you've considered.
Additional context
Add any other context about the pull request here.
Note
Medium Risk
Touches
ConnectivityManager’s data-source lifecycle decisions and adds new FDv2 mode-selection/build logic, so regressions could affect online/offline/background behavior. Risk is mitigated by added unit tests covering mode transitions and no-op cases.Overview
Introduces internal FDv2 mode configuration primitives (
ConnectionMode,ModeDefinition,ModeState,ModeResolutionTable/Entry,ResolvedModeDefinition) and tests for deterministic mode resolution.Adds
FDv2DataSourceBuilderto constructFDv2DataSourceinstances from a mode table, including wiring for FDv2 polling/streaming endpoints and use of aTransactionalDataStoreto supply selector state.Updates
ConnectivityManagerto optionally use FDv2 mode resolution when the configured data source factory is anFDv2DataSourceBuilder: it resolves foreground/network/forced-offline to a target mode, triggers rebuilds only when the mode changes (skipping rebuilds when two modes share the sameModeDefinition), and ensuresEventProcessoroffline/background state stays in sync.ClientContextImplis extended to carry an optionalTransactionalDataStore, and test utilities/tests are expanded to cover FDv2 mode transitions andFDv2DataSource.needsRefresh()context-only refresh behavior.Written by Cursor Bugbot for commit ba6ac91. This will update automatically on new commits. Configure here.