BLDX-434 | Migrate to msgspec.Struct models #811

Draft
Aryamanz29 wants to merge 14 commits into main from BLDX-434

@Aryamanz29
Member

✨ Description

https://linear.app/atlan-epd/issue/BLDX-434/plan-productionization-of-msgspecstruct-models

🧩 Type of change

Select all that apply:

  • 🚀 New feature (non-breaking change that adds functionality)
  • 🐛 Bug fix (non-breaking change that fixes an issue): please include tests! Refer to testing-toolkit 🧪
  • 🔄 Refactor (code change that neither fixes a bug nor adds a feature)
  • 🧹 Maintenance (chores, cleanup, minor improvements)
  • 💥 Breaking change (fix or feature that may break existing functionality)
  • 📦 Dependency upgrade/downgrade
  • 📚 Documentation updates

📋 Checklist

  • My code follows the project’s style guidelines
  • I’ve performed a self-review of my code
  • I’ve added comments in tricky or complex areas
  • I’ve updated the documentation as needed
  • There are no new warnings from my changes
  • I’ve added tests to cover my changes
  • All new and existing tests pass locally

@Aryamanz29 self-assigned this Feb 15, 2026
@Aryamanz29 added the feature, dependencies, change, and breaking-change labels Feb 15, 2026
@fyzanshaik-atlan
Contributor

@greptile review

@greptile-apps

greptile-apps bot commented Feb 16, 2026

Too many files changed for review. (650 files found, 500 file limit)

@fyzanshaik-atlan
Contributor

@greptile re-review

@fyzanshaik-atlan
Contributor

@claude /review

- Remove dead code: admin/, checkpoint.py, exceptions.py, py.typed
- Delete old single-file client.py (replaced by client/ package later)
- Rename models/ → model/assets/ for consistency with legacy pyatlan/model/assets/
- Move infrastructure files (conversion_utils.py, serde.py, transform.py) to model/
- Add model/__init__.py for package-level re-exports
- Update all import paths (pyatlan_v9.models → pyatlan_v9.model.assets)
- Update all model test imports to match new paths

Migrate all legacy pyatlan model files (AtlanObject/Pydantic BaseModel)
to pyatlan_v9 msgspec.Struct equivalents:

Infrastructure:
- core.py: AtlanObject base, AtlanTag, AtlanField helpers
- structs.py: SourceTagAttachment, BadgeCondition, etc.
- translators.py/retranslators.py: Tag name translation pipeline

Models (28 new files):
- search.py: DSL, IndexSearchRequest, Query types
- typedef.py: EnumDef, StructDef, AtlanTagDef, CustomMetadataDef, etc.
- lineage.py: LineageListRequest, FluentLineage, LineageResponse, etc.
- audit.py, search_log.py: AuditSearchRequest, SearchLogRequest
- response.py: AssetMutationResponse, AssetResponse
- group.py, user.py, role.py: GroupRequest, AtlanUser, AtlanRole
- credential.py, oauth_client.py, sso.py, api_tokens.py
- events.py, keycloak_events.py: AtlanEvent, KeycloakEvent
- query.py, task.py, workflow.py, suggestions.py
- aggregation.py, atlan_image.py, contract.py, custom_metadata.py
- data_mesh.py, dq_rule_conditions.py, file.py, internal.py, lineage_ref.py

Assets:
- purpose.py: Purpose model with tag translation support
- snowflake_dynamic_table.py: SnowflakeDynamicTable model

All models use msgspec conventions: kw_only=True, UNSET/UnsetType,
rename='camel' where needed, and proper serialization methods.

Convert AtlanClient from a plain Python class to msgspec.Struct:
- AtlanClient(msgspec.Struct, kw_only=True): base_url, api_key, proxy,
  verify, retry config, and httpx session management
- PyatlanSyncTransport/PyatlanAsyncTransport: custom httpx transports
  with configurable retry logic
- Delegates to legacy pyatlan sub-clients (AssetClient, GroupClient, etc.)
  while maintaining the same interface
- Uses __post_init__ for session initialization and header configuration
- Supports proxy and SSL verification configuration via constructor args
  or environment variables

Port non-model tests from legacy tests/unit/ to tests_v9/unit/:

Test files ported:
- test_client.py: 200 tests (full parity with legacy 89 — deprecated excluded)
  Covers: terms operations, find operations, error handling, batch,
  bulk request, proxy/SSL config, pagination, validation, DQ rules
- test_typedef_model.py: 47 tests (EnumDef, StructDef, AtlanTagDef, etc.)
- test_search_model.py: 231 tests (DSL, queries, sort, pagination)
- test_atlan_tag_name.py: 6 tests (tag name resolution)
- test_core.py: 12 tests (AtlanObject, AtlanTag, Announcement)
- test_structs.py: 1 test (SourceTagAttachment)
- test_utils.py: 17 tests (utility functions)

Infrastructure:
- conftest.py: Pydantic v9-compat layer that allows legacy client methods
  (using @validate_arguments) to accept msgspec.Struct instances by
  converting them to legacy Pydantic models on the fly. Also patches
  Pydantic JSON encoder for msgspec.Struct serialization.
- constants.py: Shared test constants

Key patterns:
- Tests use v9 models (pyatlan_v9.model.assets) where possible
- Legacy Pydantic models used for BulkRequest/Batch tests (Pydantic internals)
- Client-returned objects checked by type name (legacy deserialisation)

Total v9 test suite: 1540 passed, 2 skipped

Implement a framework-agnostic validate_arguments decorator in
pyatlan/client/common/validate.py that replaces pydantic.v1's
@validate_arguments. This decorator:

- Validates function arguments against type annotations
- Supports both Pydantic BaseModel and msgspec.Struct models
- Handles basic types (str, int, bool, float, Enum)
- Handles container types (List, Set, Dict, Tuple)
- Handles Optional[X], Union[X, Y], Type[X], Callable
- Handles Pydantic constrained types (constr, StrictStr, StrictBool, StrictInt)
- Handles TypeVar resolution for bound types
- Supports enum coercion from string values
- Matches Pydantic v1 error message format and ValueError exceptions
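
The behaviour listed above could be sketched roughly as follows. This is a simplified, stdlib-only illustration, not the code in pyatlan/client/common/validate.py: it covers only basic types, `Optional`/`Union`, `List`, and enum coercion, and the helper name `_check` and the exact message wording are assumptions.

```python
import functools
import inspect
from enum import Enum
from typing import Any, List, Optional, Union, get_args, get_origin, get_type_hints


def _check(value: Any, expected: Any, name: str) -> Any:
    """Return the (possibly coerced) value or raise ValueError."""
    if expected is Any:
        return value
    origin = get_origin(expected)
    if origin is Union:  # also covers Optional[X]
        errors = []
        for option in get_args(expected):
            try:
                return _check(value, option, name)
            except ValueError as exc:
                errors.append(str(exc))
        raise ValueError("; ".join(errors))
    if expected is type(None):
        if value is None:
            return None
        raise ValueError(f"{name}: value is not None")
    if value is None:
        raise ValueError(f"{name}: none is not an allowed value")
    if origin is list:
        if not isinstance(value, list):
            raise ValueError(f"{name}: value is not a valid list")
        item_type = (get_args(expected) or (Any,))[0]
        return [_check(v, item_type, f"{name}[{i}]") for i, v in enumerate(value)]
    if isinstance(expected, type) and issubclass(expected, Enum):
        if isinstance(value, expected):
            return value
        try:
            return expected(value)  # coerce e.g. "DELETED" -> Status.DELETED
        except (ValueError, TypeError):
            raise ValueError(f"{name}: value is not a valid enumeration member") from None
    if isinstance(expected, type):
        if not isinstance(value, expected):
            if expected.__module__ == "builtins":
                raise ValueError(f"{name}: {expected.__name__} type expected")
            raise ValueError(f"{name}: instance of {expected.__name__} expected")
        return value
    return value  # unsupported annotation: pass through unchecked


def validate_arguments(func):
    signature = inspect.signature(func)
    hints = get_type_hints(func)
    variadic = {
        p.name
        for p in signature.parameters.values()
        if p.kind in (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)
    }

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        # Only explicitly supplied arguments are validated; defaults are trusted.
        for name, value in list(bound.arguments.items()):
            if name in hints and name not in variadic:
                bound.arguments[name] = _check(value, hints[name], name)
        return func(*bound.args, **bound.kwargs)

    return wrapper


class Status(Enum):
    ACTIVE = "ACTIVE"
    DELETED = "DELETED"


@validate_arguments
def find(name: str, status: Status = Status.ACTIVE, tags: Optional[List[str]] = None):
    return name, status, tags
```

Because the checks use plain `isinstance`, the same decorator works for Pydantic and msgspec model classes alike, which is the point of making it framework-agnostic. Calling `find(None)` raises a plain `ValueError`, so callers and tests no longer depend on Pydantic's `ValidationError`.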

Replace all pydantic.v1 validate_arguments imports across:
- 19 sync client files (pyatlan/client/*.py)
- 19 async client files (pyatlan/client/aio/*.py)
- 3 model files (lineage.py, search.py, ui.py)

Update all test files to work with the custom validate_arguments
decorator's error format:

- Replace pytest.raises(ValidationError) with pytest.raises(ValueError)
  since the custom decorator raises ValueError directly
- Remove unused pydantic.v1 ValidationError imports
- Update expected error messages in tests/unit/constants.py to match
  the custom decorator's output format:
  - Remove Pydantic-specific (type=...) suffixes
  - Update error counts for Union/list validation (1 vs N)
  - Match 'instance of X expected' for non-builtin types
  - Match 'str type expected', 'none is not an allowed value' etc.
- Update credential, SSO, query, task, workflow, file, UI test files
  with corrected error message formats
- Remove trailing spaces from error message constants

… v9 test failures

- Moved pyatlan/client/common/validate.py → pyatlan/validate.py to break
  circular import chain: search.py → client.common.__init__ → asset → model.assets → atlan_fields
- Updated all 41 import paths from pyatlan.client.common.validate to pyatlan.validate
- Added Mock spec-class support to _is_model_instance for v9 Batch tests
- Changed v9 test_client.py to catch ValueError instead of ValidationError
- All tests pass: 5798 legacy + 1540 v9

…ec models

- Replace isinstance checks in client/common/asset.py with _is_model_instance
  for dual-model compatibility (9 call sites)
- Replace isinstance in client/asset.py and aio/batch.py for AtlasGlossaryTerm
- Make BulkRequest.process_attributes skip msgspec models (they handle
  relationship categorization in their own serialization pipeline)
- Use _is_model_instance in BulkRequest.process_relationship_attributes
- Register msgspec JSON encoder in pyatlan/model/core.py using model's own
  to_json(nested=True) for proper nested API format serialization
- Make Asset._convert_to_real_type_ accept v9 msgspec models via _is_model_instance
- Remove all monkey-patches from tests_v9/unit/conftest.py (Patch 1-4 no longer
  needed — dual-model support is now in production code)

All tests pass: 5798 legacy + 1540 v9
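
One way a cross-boundary check like `_is_model_instance` might work is sketched below. The matching rule (fall back to comparing class names along the MRO) is an assumption based on the commit notes about type-name checks and Mock spec-class support; the real helper in pyatlan/validate.py may differ. All class names here are stand-ins.

```python
from typing import Any
from unittest.mock import Mock


def is_model_instance_sketch(obj: Any, model_cls: type) -> bool:
    """True if obj is a model_cls instance, or an instance of a
    same-named class from the other model family."""
    if isinstance(obj, model_cls):
        return True
    # Use obj.__class__ rather than type(obj): Mock(spec=X) reports X
    # through __class__, while type(obj) would be Mock.
    return any(base.__name__ == model_cls.__name__ for base in obj.__class__.__mro__)


# Two unrelated classes sharing a name, standing in for the legacy
# Pydantic model and its v9 msgspec counterpart.
LegacyTerm = type("AtlasGlossaryTerm", (), {})
V9Term = type("AtlasGlossaryTerm", (), {})
V9Table = type("Table", (), {})

matches_v9 = is_model_instance_sketch(V9Term(), LegacyTerm)
matches_mock = is_model_instance_sketch(Mock(spec=V9Term), LegacyTerm)
matches_other = is_model_instance_sketch(V9Table(), LegacyTerm)
```

Name-based matching is deliberately loose: it trades strict type identity for the ability to pass v9 models through legacy client code paths during the migration.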

- search.py: delete 25 duplicated dataclass/ABC/Enum classes and ABC
  registration block; re-export from pyatlan.model.search. Keep only
  msgspec DSL/IndexSearchRequest/IndexSearchRequestMetadata + v9 helpers.
- lineage.py: delete duplicated DirectedPair/LineageGraph; re-export
  from pyatlan.model.lineage.
- audit.py: delete duplicated AuditActionType; re-export from
  pyatlan.model.audit.
- pyatlan/model/search.py: TermAttributes/TextAttributes use plain
  str/bool/float instead of Pydantic StrictStr/StrictBool/StrictFloat.
- pyatlan/validate.py: add _is_model_instance helper for cross-boundary
  Pydantic/msgspec isinstance checks.
- pyatlan/client/common/asset.py: register msgspec.Struct in Pydantic
  ENCODERS_BY_TYPE for JSON serialization.
- core.py: add to_dict() on BulkRequest for nested serialization.
- entity.py: add semantic field to Entity base class.
- asset.py: ref_by_guid/ref_by_qualified_name accept semantic param.
- tests: update VALUES_BY_TYPE to plain types, use _is_model_instance
  for cross-boundary assertions, clean up test_client.py imports.
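
The encoder registration mentioned above follows a type-to-function registry pattern. The sketch below reproduces that pattern with the standard `json` module only; `ENCODERS`, `StructLike`, and `encode_default` are hypothetical names, and the nested payload shape is an assumption rather than the actual Atlan API format.

```python
import json
from typing import Any, Callable, Dict

# Registry mapping a base type to a function that makes it JSON-friendly.
ENCODERS: Dict[type, Callable[[Any], Any]] = {}


class StructLike:
    """Hypothetical stand-in for a v9 model exposing to_json(nested=...)."""

    def __init__(self, type_name: str, **attributes: Any) -> None:
        self.type_name = type_name
        self.attributes = attributes

    def to_json(self, nested: bool = True) -> str:
        if nested:
            return json.dumps({"typeName": self.type_name, "attributes": self.attributes})
        return json.dumps({"typeName": self.type_name, **self.attributes})


def encode_default(obj: Any) -> Any:
    # Walk the MRO so that subclasses pick up an encoder registered on a base.
    for base in type(obj).__mro__:
        encoder = ENCODERS.get(base)
        if encoder is not None:
            return encoder(obj)
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


# Delegate to the model's own serializer so the nested format is preserved.
ENCODERS[StructLike] = lambda model: json.loads(model.to_json(nested=True))

body = json.dumps({"entities": [StructLike("Table", name="orders")]}, default=encode_default)
```

Registering the encoder once on the base type means a legacy request wrapper can serialize any v9 model it contains without knowing the concrete class.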

…, test parity

- Add v9 TaskSearchRequest (msgspec.Struct) with json() method
- Add v9 FluentTasks that produces v9 DSL and TaskSearchRequest
- Add from_yaml()/to_yaml() to v9 DataContractSpec
- Update test_task_client.py to use v9 models
- Update open_lineage_test.py to use v9 FluentTasks
- Update data_contract_test.py to use v9 DataContractSpec
- Fix imports in atlan_fields_test, data_quality_rule_test, workflow_client
- Document test_packages.py legacy Asset import (ClassVar fields)
- Port v9 test files: credential, custom relationships, workflow, etc.
- Client layer: validate_arguments migration, msgspec Struct support
- Formatting changes

All 7650 tests pass (1852 v9 + 5798 legacy)

…Q rules, lineage

- pyatlan_v9/model/events.py: Full v9 migration with AtlanEvent.from_dict()
  for polymorphic Asset dispatch via type registry and payload discrimination
- pyatlan_v9/model/packages/: Migrate AbstractPackage, crawlers, miners to msgspec
- pyatlan_v9/model/open_lineage/: All OpenLineage models as msgspec.Struct
- pyatlan_v9/model/assets/data_quality_rule.py: Creator/updater methods, static helpers
- pyatlan_v9/model/assets/asset.py: remove_description/user_description/owners, ClassVar descriptors
- pyatlan_v9/model/workflow.py: rename=camel, to_json()
- pyatlan_v9/model/credential.py: rename=camel
- pyatlan_v9/model/lineage.py: rename=camel, validate_arguments
- pyatlan_v9/client/atlan.py: msgspec.Struct handling in _create_params, parse_query, upload_image
- tests_v9/unit/: Update all v9 tests to use v9 models and client exclusively
- tests_v9/unit/test_events.py: Uses v9 AtlanEvent.from_dict(), no legacy imports
- tests_v9/unit/test_lineage.py, test_model.py: New, ported from legacy
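
Polymorphic dispatch through a type registry, as described for `AtlanEvent.from_dict()` above, can be sketched as follows. The classes, the registry, and the payload keys (`payload`, `entity`, `typeName`, `operationType`) are illustrative assumptions, not the v9 event schema.

```python
from dataclasses import dataclass
from typing import Any, Dict, Type


@dataclass
class Asset:
    guid: str
    type_name: str = "Asset"


@dataclass
class Table(Asset):
    type_name: str = "Table"


@dataclass
class Column(Asset):
    type_name: str = "Column"


# Registry keyed by the discriminator value carried in the payload.
ASSET_REGISTRY: Dict[str, Type[Asset]] = {"Table": Table, "Column": Column}


@dataclass
class Event:
    operation: str
    asset: Asset

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "Event":
        payload = data["payload"]
        entity = payload["entity"]
        # Look up the concrete class; unknown types fall back to the base Asset.
        asset_cls = ASSET_REGISTRY.get(entity["typeName"], Asset)
        asset = asset_cls(guid=entity["guid"], type_name=entity["typeName"])
        return cls(operation=payload["operationType"], asset=asset)


event = Event.from_dict(
    {"payload": {"operationType": "ENTITY_UPDATE", "entity": {"typeName": "Table", "guid": "abc"}}}
)
```

Falling back to the base class for unregistered type names keeps event handling from failing when the server introduces an asset type the SDK does not yet model.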

@fyzanshaik-atlan
Contributor

@claude /review

@claude

claude bot commented Feb 20, 2026

Claude encountered an error.

I'll analyze this and get back to you.

Labels

  • breaking-change
  • change: Pyatlan change pull request
  • dependencies: Pull requests that update a dependency file
  • feature: New feature or request

2 participants