Handle null Terraform state ids#628
Conversation
📝 WalkthroughWalkthroughThis PR refactors Terraform state emptiness evaluation by extracting shared logic into two new helper functions ( Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/terraform/store.go (1)
243-253:⚠️ Potential issue | 🔴 CriticalSynchronize
hasTFIDupdates to avoid race conditions and stale behaviorThanks for threading this through. One concern:
w.hasTFIDis mutated here without workspace-level synchronization, whileRefresh/Importread it concurrently. Also, the early return at Line 243 can skip this assignment, leaving stale/default semantics during in-flight operations.Could you confirm whether
hasTFIDis intended to be immutable per workspace? If yes, setting it once at workspace creation (option-based) would avoid this mutable shared state entirely.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/terraform/store.go` around lines 243 - 253, w.hasTFID is being mutated in the code path that can be skipped by an early return and is read concurrently by Refresh/Import, causing race conditions; fix by making hasTFID immutable at workspace creation or protecting access with synchronization: either (preferred) compute and set hasTFID once when the workspace is constructed (e.g., via a workspace option/constructor so NewFileProducer no longer mutates w.hasTFID), or if immutability is not possible, add a workspace-level lock/atomic around reads/writes to w.hasTFID (update the writers in the block that currently calls NewFileProducer and readers in Refresh/Import to use the same mutex or atomic variable) and ensure the early return cannot leave hasTFID unset by moving the schema-check/assignment before the early return.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/terraform/workspace.go`:
- Around line 414-421: Add a new TestWorkspaceImport that mirrors
TestWorkspaceRefresh and includes the "NullID" and "IDlessSchema" cases so the
Import path exercises the new stateAttributes and stateExists logic; for each
case construct the same resource/state fixtures used in TestWorkspaceRefresh,
call the workspace Import path (the same method that hits the code invoking
stateAttributes and stateExists), and assert the import result/behavior matches
expectations (no panics and correct existence handling). Ensure the test names
are "NullID" and "IDlessSchema", reuse the same setup/fixtures/assertions from
TestWorkspaceRefresh, and place the test alongside other workspace tests so
stateAttributes/stateExists are covered by unit tests.
---
Outside diff comments:
In `@pkg/terraform/store.go`:
- Around line 243-253: w.hasTFID is being mutated in the code path that can be
skipped by an early return and is read concurrently by Refresh/Import, causing
race conditions; fix by making hasTFID immutable at workspace creation or
protecting access with synchronization: either (preferred) compute and set
hasTFID once when the workspace is constructed (e.g., via a workspace
option/constructor so NewFileProducer no longer mutates w.hasTFID), or if
immutability is not possible, add a workspace-level lock/atomic around
reads/writes to w.hasTFID (update the writers in the block that currently calls
NewFileProducer and readers in Refresh/Import to use the same mutex or atomic
variable) and ensure the early return cannot leave hasTFID unset by moving the
schema-check/assignment before the early return.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: da0c01e8-4ce6-444b-a93f-20827176ebab
📒 Files selected for processing (6)
pkg/terraform/files.gopkg/terraform/files_test.gopkg/terraform/state.gopkg/terraform/store.gopkg/terraform/workspace.gopkg/terraform/workspace_test.go
| attr, err := stateAttributes(s.GetAttributes()) | ||
| if err != nil { | ||
| return ImportResult{}, errors.Wrap(err, "cannot unmarshal state attributes") | ||
| } | ||
| exists, err := stateExists(attr, w.hasTFID) | ||
| if err != nil { | ||
| return ImportResult{}, errors.Wrap(err, "cannot determine whether imported state exists") | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify whether Import behavior is covered by tests in pkg/terraform.
rg -n --type=go -C3 'func TestWorkspaceImport|\\.Import\\(' pkg/terraform
rg -n --type=go -C3 'id":null|IDlessSchema|WithWorkspaceHasIDAttribute\\(false\\)' pkg/terraformRepository: crossplane/upjet
Length of output: 2651
🏁 Script executed:
#!/bin/bash
# Find test function containing the id:null and IDlessSchema test cases
cd pkg/terraform
grep -n "func Test" workspace_test.go | head -20Repository: crossplane/upjet
Length of output: 337
🏁 Script executed:
#!/bin/bash
# Check which test function contains the "NullID" and "IDlessSchema" cases
sed -n '260,320p' pkg/terraform/workspace_test.goRepository: crossplane/upjet
Length of output: 1712
🏁 Script executed:
#!/bin/bash
# Verify if lines 414-421 are in Import or Refresh function
sed -n '400,430p' pkg/terraform/workspace.go | cat -nRepository: crossplane/upjet
Length of output: 1297
🏁 Script executed:
#!/bin/bash
# Search comprehensively for any Import-related tests
rg -n "func.*Import|\.Import\(" pkg/terraform/workspace_test.goRepository: crossplane/upjet
Length of output: 42
Add test cases for the Import path with null IDs and ID-less schemas
The new stateAttributes and stateExists logic is now used in both Refresh and Import (lines 414–421), but Import lacks test coverage. You added "NullID" and "IDlessSchema" test cases in TestWorkspaceRefresh—can you add equivalent cases to a TestWorkspaceImport function to ensure the Import path handles these scenarios correctly?
Per coding guidelines, "**/*.go: All Upjet code must be covered by tests."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/terraform/workspace.go` around lines 414 - 421, Add a new
TestWorkspaceImport that mirrors TestWorkspaceRefresh and includes the "NullID"
and "IDlessSchema" cases so the Import path exercises the new stateAttributes
and stateExists logic; for each case construct the same resource/state fixtures
used in TestWorkspaceRefresh, call the workspace Import path (the same method
that hits the code invoking stateAttributes and stateExists), and assert the
import result/behavior matches expectations (no panics and correct existence
handling). Ensure the test names are "NullID" and "IDlessSchema", reuse the same
setup/fixtures/assertions from TestWorkspaceRefresh, and place the test
alongside other workspace tests so stateAttributes/stateExists are covered by
unit tests.
5eb0d26 to
67fc744
Compare
Summary
id: nullin Terraform state as an empty state instead of a hard errorhasTFIDinto the workspace so refresh/import use the same existence semantics as file productionWhy
We originally observed this in a real provider flow where a later reconcile reopened a
terraform.tfstatewhose resource attributes had been rewritten to include"id": null.At that point Upjet treated the state as broken instead of empty, and the reconcile failed with:
cannot work with a non-string id: <nil>We later traced that specific downstream reproduction to provider behavior that made Terraform more likely to serialize a removed instance as
id: null, and fixed that downstream bug separately.That does not change the Upjet-side behavior once this state shape exists: the file-backed workspace path still fails before it can report
ResourceExists=false.So this PR is intentionally framed as generic hardening:
id: null, Upjet should treat that as an empty binding rather than a hard parse errorIn other words, this patch is not meant to explain every way
id: nullcan be produced. It makes Upjet robust once that state shape is encountered.Testing
go test ./pkg/terraform/...go test ./pkg/...