Skip to content

Handle null Terraform state ids#628

Draft
bruno-espino wants to merge 1 commit intocrossplane:mainfrom
bruno-espino:fix-null-id-state
Draft

Handle null Terraform state ids#628
bruno-espino wants to merge 1 commit intocrossplane:mainfrom
bruno-espino:fix-null-id-state

Conversation

@bruno-espino
Copy link
Copy Markdown

@bruno-espino bruno-espino commented Apr 7, 2026

Summary

  • treat id: null in Terraform state as an empty state instead of a hard error
  • thread hasTFID into the workspace so refresh/import use the same existence semantics as file production
  • add regression tests for null-id and id-less schema state handling

Why

We originally observed this in a real provider flow where a later reconcile reopened a terraform.tfstate whose resource attributes had been rewritten to include "id": null.

At that point Upjet treated the state as broken instead of empty, and the reconcile failed with:

  • cannot work with a non-string id: <nil>

We later traced that specific downstream reproduction to provider behavior that made Terraform more likely to serialize a removed instance as id: null, and fixed that downstream bug separately.

That does not change the Upjet-side behavior once this state shape exists: the file-backed workspace path still fails before it can report ResourceExists=false.

So this PR is intentionally framed as generic hardening:

  • if a Terraform state file contains id: null, Upjet should treat that as an empty binding rather than a hard parse error
  • genuinely invalid non-string IDs such as numbers should still fail
  • id-less-schema resources should continue to use their existing non-ID existence semantics

In other words, this patch is not meant to explain every way id: null can be produced. It makes Upjet robust once that state shape is encountered.

Testing

  • go test ./pkg/terraform/...
  • go test ./pkg/...

@bruno-espino bruno-espino marked this pull request as draft April 7, 2026 22:57
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 7, 2026

📝 Walkthrough

Walkthrough

This PR refactors Terraform state emptiness evaluation by extracting shared logic into two new helper functions (stateAttributes and stateExists) in a new module. The helpers are then integrated into FileProducer.isStateEmpty and Workspace state handling, with updated tests to cover edge cases like null ID values and ID-less schemas.

Changes

Cohort / File(s) Summary
State Evaluation Helpers
pkg/terraform/state.go
New file introducing stateAttributes() to unmarshal JSON state attributes and stateExists() to evaluate state existence with logic for handling nil attributes, ID presence/absence, and string type validation.
FileProducer State Emptiness Refactoring
pkg/terraform/files.go, pkg/terraform/files_test.go
Refactored isStateEmpty to delegate attribute extraction and emptiness evaluation to the new helpers; added test case for null ID values and updated expectations for non-string ID handling.
Workspace State Handling Integration
pkg/terraform/workspace.go, pkg/terraform/workspace_test.go, pkg/terraform/store.go
Integrated new helpers into Workspace.Refresh and Workspace.Import for state existence checks; added WithWorkspaceHasIDAttribute option to track ID schema presence; extended tests with NullID and IDlessSchema scenarios.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: handling null Terraform state IDs, which is the central focus of all modifications across the codebase.
Configuration Api Breaking Changes ✅ Passed PR makes zero modifications to pkg/config/** directory; all changes confined to pkg/terraform/** files only.
Generated Code Manual Edits ✅ Passed No files matching 'zz_*.go' pattern were modified. All changes are to hand-written code files in pkg/terraform/ directory, which is appropriate for this refactoring.
Template Breaking Changes ✅ Passed This PR modifies only pkg/terraform/ files; no pkg/controller/external*.go template files were changed.
Description check ✅ Passed The PR description clearly explains the changes: handling null Terraform state IDs as empty states, threading hasTFID into the workspace, and adding regression tests. It provides context about the real-world issue, rationale for the approach, and testing details.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/terraform/store.go (1)

243-253: ⚠️ Potential issue | 🔴 Critical

Synchronize hasTFID updates to avoid race conditions and stale behavior

Thanks for threading this through. One concern: w.hasTFID is mutated here without workspace-level synchronization, while Refresh/Import read it concurrently. Also, the early return at Line 243 can skip this assignment, leaving stale/default semantics during in-flight operations.

Could you confirm whether hasTFID is intended to be immutable per workspace? If yes, setting it once at workspace creation (option-based) would avoid this mutable shared state entirely.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/terraform/store.go` around lines 243 - 253, w.hasTFID is being mutated in
the code path that can be skipped by an early return and is read concurrently by
Refresh/Import, causing race conditions; fix by making hasTFID immutable at
workspace creation or protecting access with synchronization: either (preferred)
compute and set hasTFID once when the workspace is constructed (e.g., via a
workspace option/constructor so NewFileProducer no longer mutates w.hasTFID), or
if immutability is not possible, add a workspace-level lock/atomic around
reads/writes to w.hasTFID (update the writers in the block that currently calls
NewFileProducer and readers in Refresh/Import to use the same mutex or atomic
variable) and ensure the early return cannot leave hasTFID unset by moving the
schema-check/assignment before the early return.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/terraform/workspace.go`:
- Around line 414-421: Add a new TestWorkspaceImport that mirrors
TestWorkspaceRefresh and includes the "NullID" and "IDlessSchema" cases so the
Import path exercises the new stateAttributes and stateExists logic; for each
case construct the same resource/state fixtures used in TestWorkspaceRefresh,
call the workspace Import path (the same method that hits the code invoking
stateAttributes and stateExists), and assert the import result/behavior matches
expectations (no panics and correct existence handling). Ensure the test names
are "NullID" and "IDlessSchema", reuse the same setup/fixtures/assertions from
TestWorkspaceRefresh, and place the test alongside other workspace tests so
stateAttributes/stateExists are covered by unit tests.

---

Outside diff comments:
In `@pkg/terraform/store.go`:
- Around line 243-253: w.hasTFID is being mutated in the code path that can be
skipped by an early return and is read concurrently by Refresh/Import, causing
race conditions; fix by making hasTFID immutable at workspace creation or
protecting access with synchronization: either (preferred) compute and set
hasTFID once when the workspace is constructed (e.g., via a workspace
option/constructor so NewFileProducer no longer mutates w.hasTFID), or if
immutability is not possible, add a workspace-level lock/atomic around
reads/writes to w.hasTFID (update the writers in the block that currently calls
NewFileProducer and readers in Refresh/Import to use the same mutex or atomic
variable) and ensure the early return cannot leave hasTFID unset by moving the
schema-check/assignment before the early return.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: da0c01e8-4ce6-444b-a93f-20827176ebab

📥 Commits

Reviewing files that changed from the base of the PR and between 52796ea and 5eb0d26.

📒 Files selected for processing (6)
  • pkg/terraform/files.go
  • pkg/terraform/files_test.go
  • pkg/terraform/state.go
  • pkg/terraform/store.go
  • pkg/terraform/workspace.go
  • pkg/terraform/workspace_test.go

Comment on lines +414 to +421
attr, err := stateAttributes(s.GetAttributes())
if err != nil {
return ImportResult{}, errors.Wrap(err, "cannot unmarshal state attributes")
}
exists, err := stateExists(attr, w.hasTFID)
if err != nil {
return ImportResult{}, errors.Wrap(err, "cannot determine whether imported state exists")
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify whether Import behavior is covered by tests in pkg/terraform.
rg -n --type=go -C3 'func TestWorkspaceImport|\\.Import\\(' pkg/terraform
rg -n --type=go -C3 'id":null|IDlessSchema|WithWorkspaceHasIDAttribute\\(false\\)' pkg/terraform

Repository: crossplane/upjet

Length of output: 2651


🏁 Script executed:

#!/bin/bash
# Find test function containing the id:null and IDlessSchema test cases
cd pkg/terraform
grep -n "func Test" workspace_test.go | head -20

Repository: crossplane/upjet

Length of output: 337


🏁 Script executed:

#!/bin/bash
# Check which test function contains the "NullID" and "IDlessSchema" cases
sed -n '260,320p' pkg/terraform/workspace_test.go

Repository: crossplane/upjet

Length of output: 1712


🏁 Script executed:

#!/bin/bash
# Verify if lines 414-421 are in Import or Refresh function
sed -n '400,430p' pkg/terraform/workspace.go | cat -n

Repository: crossplane/upjet

Length of output: 1297


🏁 Script executed:

#!/bin/bash
# Search comprehensively for any Import-related tests
rg -n "func.*Import|\.Import\(" pkg/terraform/workspace_test.go

Repository: crossplane/upjet

Length of output: 42


Add test cases for the Import path with null IDs and ID-less schemas

The new stateAttributes and stateExists logic is now used in both Refresh and Import (lines 414–421), but Import lacks test coverage. You added "NullID" and "IDlessSchema" test cases in TestWorkspaceRefresh—can you add equivalent cases to a TestWorkspaceImport function to ensure the Import path handles these scenarios correctly?

Per coding guidelines, "**/*.go: All Upjet code must be covered by tests."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/terraform/workspace.go` around lines 414 - 421, Add a new
TestWorkspaceImport that mirrors TestWorkspaceRefresh and includes the "NullID"
and "IDlessSchema" cases so the Import path exercises the new stateAttributes
and stateExists logic; for each case construct the same resource/state fixtures
used in TestWorkspaceRefresh, call the workspace Import path (the same method
that hits the code invoking stateAttributes and stateExists), and assert the
import result/behavior matches expectations (no panics and correct existence
handling). Ensure the test names are "NullID" and "IDlessSchema", reuse the same
setup/fixtures/assertions from TestWorkspaceRefresh, and place the test
alongside other workspace tests so stateAttributes/stateExists are covered by
unit tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant