Fix: start timeout timer after payload transmission for large payloads#39
Open
cortex-assistant[bot] wants to merge 34 commits intodevelopfrom
Open
Fix: start timeout timer after payload transmission for large payloads#39cortex-assistant[bot] wants to merge 34 commits intodevelopfrom
cortex-assistant[bot] wants to merge 34 commits intodevelopfrom
Conversation
Add a manually-triggered GitHub Actions workflow that runs test_download_two_layers 5 times with debug instrumentation to diagnose why layers sometimes return 'Layer1' instead of 'Layer2' on Docker (but not Finch). Instrumentation logs: - Layer ordering in download_all() input/output - Generated Dockerfile ADD command sequence - Tarball construction order - Image build vs reuse decisions
Instead of running isolated iterations, run the exact same pytest command as the CI local-invoke suite with added debug logging. This reproduces the real test ordering and cross-test interactions. Enhanced instrumentation now also captures: - Per-layer download decisions (cached vs fresh) - Extracted layer file contents after download - Docker build stream output (cache hits show as CACHED)
Run just the TestLayerVersion class instead of the full local-invoke suite. This covers the failing test_download_two_layers plus the other tests in the class that may cause cross-test contamination.
debug: add workflow to investigate flaky layer ordering tests
debug: remove repo guard for fork
debug: use direct OIDC auth
debug: add BY_CANARY=true
…rm container tests * Update notify-slack.yml to add sleep (aws#8646) * fix: use tag prefix matching to clean up samcli/lambda-* images (aws#8647) Docker's images.list(name='samcli/lambda') does exact repository matching and won't match repositories like 'samcli/lambda-python'. This caused stale images to persist across parameterized test classes, leading to flaky test_download_two_layers failures where Layer2 should overwrite Layer1 but the image was never rebuilt. Fix by: 1. Adding _cleanup_samcli_images() that lists all images and filters by 'samcli/lambda-' tag prefix 2. Using this method in both tearDown and tearDownClass 3. Fixing the same pattern in TestLayerVersionThatDoNotCreateCache * fix: remove fallback in count_running_containers that causes flaky warm container tests The count_running_containers method had a fallback that returned the count of ALL SAM CLI containers when MODE env var filtering found no matches. This caused AssertionError: 3 != 2 when stale containers from other tests were present. Now it strictly counts only containers matching this test's unique MODE UUID and uses exact string matching. * fix: add AWS_DEFAULT_REGION and use -k pattern for parameterized tests * fix: add BY_CANARY=true to enable Docker tests on CI * fix: exclude RemoteLayers tests that need AWS credentials --------- Co-authored-by: seshubaws <116689586+seshubaws@users.noreply.github.com>
When build_in_source is used with Node.js, the build directory contains a node_modules symlink. During local invoke, SAM CLI resolves this symlink and creates an additional bind mount. Docker tolerates creating a mountpoint over a symlink, but Finch (containerd/runc) fails with 'not a directory'. This fix temporarily replaces symlinks with empty directories before container creation, then restores them afterward. This ensures: - Finch/runc gets a valid directory mountpoint - Docker continues to work as before - Repeated invocations work because symlinks are restored - Host filesystem is left unchanged even if container creation fails
fix: replace symlinks with dirs for Finch container mount compatibility
fix: restore BY_CANARY to run docker tests on CI
Containerd/Finch resolves bind mounts at start time, not create time. Moving the symlink restore to after start() ensures the empty directory mountpoints are still present when the container runtime sets up mounts.
fix: move symlink restore to after container start
Bind mounts must remain valid for the container's entire lifetime. Both Docker and Finch need the empty directory mountpoint to persist until the container is stopped and deleted. Restoring symlinks in the delete() method's finally block ensures proper cleanup alongside the existing host_tmp_dir cleanup.
fix: restore symlinks at container delete time
Implement a local CloudFormation Language Extensions processor supporting: - Fn::ForEach loop expansion in Resources, Conditions, and Outputs - Fn::Length, Fn::ToJsonString intrinsic functions - Fn::FindInMap with DefaultValue support - Conditional DeletionPolicy/UpdateReplacePolicy - Nested ForEach depth validation (max 5 levels) - Partial resolution mode preserving unresolvable references Pipeline architecture: TemplateParsingProcessor -> ForEachProcessor -> IntrinsicResolverProcessor -> DeletionPolicyProcessor -> UpdateReplacePolicyProcessor Includes comprehensive unit tests and CloudFormation compatibility suite.
Wire the language extensions library into SAM CLI with two-phase architecture: - Phase 1: expand_language_extensions() -> LanguageExtensionResult - Phase 2: SamTranslatorWrapper.run_plugins() (SAM transform only) Key components: - expand_language_extensions() canonical entry point with template-level cache keyed on (path, mtime, params_hash) - SamTranslatorWrapper receives pre-expanded template (Phase 2 only) - SamLocalStackProvider.get_stacks() calls expand_language_extensions() - SamTemplateValidator calls expand_language_extensions() - DynamicArtifactProperty dataclass for Mappings transformation - Fn::ForEach guards in artifact_exporter, normalizer, cdk/utils - clear_expansion_cache() for warm container file change events
- _get_template_for_output() preserves Fn::ForEach in build output - _update_foreach_artifact_paths() generates Mappings for dynamic artifact properties with per-function build paths - Recursive nested Fn::ForEach support - ForEach-aware path resolution skips Docker image URIs Test templates: static CodeUri, dynamic CodeUri, parameter collections, nested stacks, nested ForEach, dynamic ImageUri, depth validation.
Package:
- _export() calls expand_language_extensions() for Phase 1
- Preserves Fn::ForEach in packaged template with S3 URIs
- Generates Mappings for dynamic artifact properties
- _find_artifact_uri_for_resource() handles all export formats:
string, {S3Bucket,S3Key}, {Bucket,Key}, {ImageUri}
- Recursive nested Fn::ForEach support
- Warning for parameter-based collections
Deploy:
- Uploads original unexpanded template to CloudFormation
- Clear error for missing Mapping keys
Integration tests for CodeUri, ContentUri, DefinitionUri, ImageUri,
BodyS3Location across all packageable resource types.
- sam validate: valid ForEach, invalid syntax, cloud-dependent collections, dynamic CodeUri, nested depth validation (5 valid, 6 invalid) - sam local invoke: expanded function names from ForEach - sam local start-api: ForEach-generated API endpoints
Add make test-lang-ext and make test-all targets so the 1695 language extensions unit tests only run when needed, keeping the default make test fast for unrelated PRs.
The expand_language_extensions() cache stored references to template dicts that were later mutated in-place by ApplicationBuilder.update_template() (which changes nested stack Location properties to build-output paths). On cache hit, the mutated dict was returned, causing TemplateNotFoundException during the second infra sync in sam sync --watch. Remove the cache entirely since deep-copying on hit negates the performance benefit and adds complexity. Keep clear_expansion_cache() as a no-op for backward compatibility. Fixes TestSyncInfraNestedStacks_0 and TestSyncInfraNestedStacks_1 integration test failures.
fix: remove expansion cache to fix sync watch nested stack failures
fix: use OIDC credentials directly for sync test workflow
fix: handle non-ASCII file paths in build graph (closes #24)
…ure timeout with large payloads
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #38.
The function timeout timer was starting before the HTTP POST that transmits the event payload to the container. For large payloads (e.g. 10MB), significant time was consumed uploading the event data before the Lambda handler began executing, causing premature timeouts.
This fix moves the timer start to after the payload has been fully transmitted, aligning behavior with real AWS Lambda where timeout only counts execution time.