Conversation
Adds a `python` builtin command that executes Python 3.4 source code using the gpython pure-Go interpreter — no CPython installation required. Usage: python [-c CODE] [-h] [SCRIPT | -] [ARG ...] Security sandbox (enforced in builtins/internal/pyruntime/): - os.system, os.popen, all exec/spawn/fork/write/delete functions removed - open() replaced with read-only AllowedPaths-aware version; write/append modes raise PermissionError - tempfile and glob modules neutered (functions removed) - sys.exit() exit code propagated via closure variable before VM wraps error - Source and file reads bounded at 1 MiB - Context cancellation respected (goroutine + select on ctx.Done()) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…builtins, and keywords Adds 61 new scenario tests across 10 categories to improve coverage of the python builtin's gpython (3.4) interpreter: - keywords: pass, del, assert, global, nonlocal, in/not-in, is/is-not, break, continue - comprehensions: list, filtered list, dict, set, generator expression, nested - generators: basic yield, generator.send(), yield from, StopIteration - lambdas: basic, sorted key, map - builtins: len, range, enumerate, zip, map, filter, sorted, all/any, min/max, sum, chr/ord, bin/hex/oct, isinstance, type constructors, repr, print kwargs, getattr/setattr/hasattr, abs/divmod/pow - exceptions: try/finally, try/except/finally, bare raise, raise from, multiple except handlers - operators: bitwise, augmented assignment, chained comparisons, ternary, boolean short-circuit - data_structures: tuple unpacking, extended unpacking, set operations, string format (%), string methods - functions: default args, *args, **kwargs - os_module: os.getcwd(), os.environ Tests account for gpython v0.2.0 limitations: no str.format(), no str.lower/upper, no len(bytes), no frozenset(), no classmethod/staticmethod, no closures (free variable capture without nonlocal), no integer dict keys, no enumerate(start=). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Plan: Replace gpython with a Custom Pure-Go Python InterpreterContextThe python builtin currently uses The existing test suite has 129 scenario tests and a Go unit test suite that together validate:
Recommended Approach: Custom Pure-Go Python InterpreterWhy not Starlark (
Why a custom interpreter is the right call:
Implementation ScopeThe interpreter implements the Python 3 subset actually used by the tests. Out-of-scope: decorators, multiple-inheritance MRO edge cases, async/await, metaclasses, the full CPython stdlib. Files to create (in
|
| File | Description | Est. lines |
|---|---|---|
lexer.go |
Tokenizer: keywords, operators, indentation (INDENT/DEDENT), string/number literals | ~600 |
ast.go |
AST node types for all statements and expressions | ~400 |
parser.go |
Recursive-descent parser: all statement forms, operator precedence | ~1800 |
types.go |
Python object model: Int, Float, Str, Bytes, Bool, None, List, Dict, Set, Tuple, Function, Class, Instance, Generator, Exception | ~1000 |
eval.go |
Tree-walking evaluator: scopes, exceptions, generators, context managers | ~2500 |
builtins_py.go |
Built-in functions: print, len, range, enumerate, zip, sorted, map, filter, sum, min, max, abs, chr, ord, bin, hex, oct, type, isinstance, repr, str, int, float, bool, list, dict, set, tuple, open (sandboxed) | ~700 |
modules.go |
Standard modules: sys (argv, exit, stdin, stdout, stderr), math, os (read-only subset), string, binascii, time | ~600 |
sandbox.go |
Security layer: blocked os functions, write-mode open rejection, blocked imports (tempfile, glob) | ~300 |
pyruntime.go |
Entry point: Run(ctx, RunOpts) int, same signature as current |
~200 |
Total estimate: ~8,100 lines (the current pyruntime.go is ~700 lines using gpython; the difference is the interpreter itself)
Files to modify
| File | Change |
|---|---|
builtins/internal/pyruntime/pyruntime.go |
Replace entirely (same Run() API, new pure-Go implementation) |
builtins/python/python.go |
No change — already delegates to pyruntime.Run() |
go.mod |
Remove github.com/go-python/gpython entry |
go.sum |
Remove gpython hashes (go mod tidy) |
SHELL_FEATURES.md |
Update Python description (remove "gpython", "Python 3.4") |
builtins/tests/python/python_fuzz_test.go |
Update gpython-specific comment in FuzzPythonSource |
analysis/symbols_builtins_test.go |
Update any gpython-specific allowlist entries |
Python features to implement
Core language:
- All literals: int (decimal, hex, octal, binary), float, complex, string (all quote forms + raw), bytes, bool, None, ellipsis
- Operators: arithmetic, bitwise, comparison, boolean,
in/not in,is/is not - Assignments: simple, augmented (
+=, etc.), tuple unpacking, starred assignment (a, *b, c = ...) - Statements: if/elif/else, for/in, while, break, continue, pass, del, return, yield, yield from, raise, try/except/else/finally, with, assert, global, nonlocal, import, from-import, class, def
- Comprehensions: list, dict, set, generator expression
- Lambda expressions
- Slicing (
a[1:3:2]) - Attribute access, subscript
- Starred calls (
*args,**kwargs) - Class definitions: single inheritance,
__init__, methods,__str__,__repr__,__enter__/__exit__,__iter__/__next__ - Generators: yield, yield from, send(), StopIteration
Exception handling:
- All standard exception types: BaseException, Exception, ValueError, TypeError, KeyError, IndexError, AttributeError, NameError, ZeroDivisionError, IOError, OSError, FileNotFoundError, PermissionError, StopIteration, RuntimeError, ImportError, MemoryError, SystemExit, SyntaxError, AssertionError, NotImplementedError
- Exception chaining
- Custom exception classes (subclassing Exception)
Built-in functions (28+): print, len, range, enumerate, zip, sorted, reversed, map, filter, sum, min, max, abs, chr, ord, bin, hex, oct, type, isinstance, issubclass, repr, str, int, float, bool, list, dict, set, tuple, open (sandboxed), hash, id, iter, next, callable, getattr, setattr, hasattr, delattr, dir, vars, all, any, round, divmod, pow
Modules:
sys: argv, exit, stdin, stdout, stderr, version, platform, path (empty)math: floor, ceil, sqrt, log, log2, log10, exp, sin, cos, tan, asin, acos, atan, atan2, pi, e, inf, nan, fabs, factorial, gcd, isnan, isinf, isfinite, degrees, radians, hypotos: listdir (AllowedPaths), getcwd, path.join, path.dirname, path.basename, path.exists, path.isfile, path.isdir, path.splitext, getenv, environ (read-only), sep, linesep, devnullstring: whitespace, ascii_letters, ascii_lowercase, ascii_uppercase, digits, hexdigits, octdigits, printable, punctuationbinascii: hexlify, unhexlify, b2a_hex, a2b_hextime: time, sleep (limited), monotonic
Security sandbox (same as current):
- Blocked os functions: system, popen, remove, unlink, mkdir, makedirs, rmdir, removedirs, rename, renames, replace, link, symlink, chmod, chown, chroot, execl/le/lp/lpe, execv/ve/vp/vpe, _exit, fork, forkpty, kill, killpg, popen2-4, spawnl/le/lp/lpe/v/ve/vp/vpe, startfile, truncate, write, putenv, unsetenv, walk (removed)
open()write modes rejected (PermissionError)tempfileandglobimports blocked (ImportError)- File reads capped at 1 MiB
- Source code capped at 1 MiB
Test strategy
- Existing 129 scenario tests: should all pass unchanged
- Existing Go unit tests: update only the gpython-specific comment in
FuzzPythonSource - The interpreter runs in a goroutine;
selectonctx.Done()for context cancellation - Memory limits:
maxSourceBytes = 1 MiB,maxReadBytes = 1 MiBperfile.read()call - Traceback format:
Traceback (most recent call last):\n File "name", line N\nExceptionType: msg
Verification
make fmt
go build ./...
go test ./builtins/... ./tests/... -timeout 120s
RSHELL_BASH_TEST=1 go test ./tests/ -run TestShellScenariosAgainstBash -timeout 120s # skip; scenarios have skip_assert_against_bash: true
go run ./cmd/rshell --allow-all-commands -c 'python -c "print(\"hello\")"'
go run ./cmd/rshell --allow-all-commands -c 'help' | grep python…eter Replace github.com/go-python/gpython with a from-scratch Python 3 tree-walking interpreter (~12,000 lines) implemented across modular files under builtins/internal/pyruntime/: - ast.go: full AST node type definitions for Python 3 statements/expressions - lexer.go: tokenizer with indent/dedent, string literals, number literals - parser.go: recursive-descent parser covering the complete Python 3 grammar subset needed by the test suite - types.go: Python object system (int, float, str, bytes, list, tuple, dict, set, class/instance, generator, exception hierarchy, module, file, scope) - eval.go: tree-walking evaluator with generators via goroutine+channel, exception handling via Go panic/recover, context cancellation support, class definition with C3 MRO, closures, comprehensions, yield/yield from - builtins_funcs.go: ~45 built-in functions (print, len, range, zip, map, filter, sorted, isinstance, type constructors, open, super, etc.) - modules.go: module registry with sys, math, os (read-only), binascii, string; blocked modules (tempfile, glob, subprocess, socket, ctypes) Remove github.com/go-python/gpython from go.mod (go mod tidy). Update analysis symbol allowlists for new implementation. All 40+ test packages pass including 129 Python scenario tests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Format analysis/symbols_internal.go and remove unused symbols from
internalAllowedSymbols (bufio.Scanner, bytes.SplitAfter,
hash/crc32.ChecksumIEEE, math.MaxFloat64, math.Round, math/big.NewFloat,
unicode.Is{Space,Title,Upper}, unicode.To{Lower,Title,Upper},
unicode/utf8.RuneError) that are not used by any builtins/internal file.
- Add missing copyright headers to pyruntime/parse_test.go and
pyruntime/smoke_test.go.
- Fix data race in pyruntime.Run(): after ctx.Done() fires, wait for the
goroutine running runInternal to finish before returning. Without this,
the goroutine's defer (printTraceback → fmt.Fprintf to opts.Stderr) races
with the caller reading opts.Stderr in the test. The evaluator checks
ctx.Done() at each loop iteration so the goroutine terminates promptly.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Block host env access: - Remove os.Environ() and os.LookupEnv() from the Python os module. os.environ is now an empty dict and os.getenv() always returns its default argument. Python scripts must not be able to read process environment variables (API keys, tokens, etc.). - Drop os.Environ and os.LookupEnv from the pyruntime symbol allowlists. - Update scenario tests to verify PATH and other real env vars are invisible, and that os.environ is empty (len == 0). Fix callObject data race: - Replace the package-level callObject function variable with a goroutine-keyed sync.Map (goroutineCallFns). Each Python execution registers its evaluator's callObject at goroutine start and deregisters on return, so concurrent executions never share a function pointer. Previously, two parallel Python scenarios would race on the write at newEvaluator():50, causing test failures under -race. goroutineID() reads the goroutine number from runtime.Stack. - Add runtime.Stack and sync.Map to the pyruntime symbol allowlists. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…andbox Python's os.listdir, os.path.exists, os.path.isfile, and os.path.isdir were calling os.ReadDir/os.Stat directly, bypassing the AllowedPaths sandbox. Route them through new Stat/ReadDir callbacks on RunOpts, wired to callCtx.StatFile/callCtx.ReadDir in the python builtin. Also remove os.Environ/os.LookupEnv from the symbol allowlist (removed in prior commit) and add io/fs.FileInfo + io/fs.DirEntry in their place. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…CodeQL - lexer.go: guard rune() cast with unicode.MaxRune check for \U escapes - parser.go: guard int64() cast with math.MaxInt64 check for uint64 literals; values exceeding int64 range now fall through to the big.Int path Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ath leakage Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…axInt64/unicode.MaxRune Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…o builtins/python Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
AlexandreYang
left a comment
There was a problem hiding this comment.
Review Summary
This PR introduces a full pure-Go Python 3 interpreter as a new python builtin (~11k lines: lexer, parser, AST, evaluator, type system, stdlib modules). The sandbox design is sound — open() is routed through AllowedPaths, dangerous os functions are absent, and the blocked module list is comprehensive.
Overall assessment: needs fixes — three P1 issues and three P2 issues require attention before merging.
Positive Observations
- Sandbox design is correct:
open()always callscallCtx.OpenFile, onlyO_RDONLYis passed, write/append/exclusive modes are rejected before the open call. - Blocked module list is comprehensive (
tempfile,glob,subprocess,socket,ctypes,threading,multiprocessing,asyncio) and uses a registry. 1 MiBsource and per-file read limits are in place and useio.LimitReader.maxCallDepth = 500prevents stack-overflow via deep recursion.- Context is checked in every loop/for iteration via
e.checkCtx(), so the evaluator responds promptly to timeout. math.factorial()is capped atn > 10000preventing runawaybig.Intcomputation.goroutineCallFnsis async.Map— concurrent Python executions are safely isolated.- Fuzz tests cover both
-cand file-based execution paths. - Comprehensive scenario tests for sandbox restrictions with
skip_assert_against_bash: true.
|
@codex review this PR |
|
Iteration 1 self-review result:
Total: 8 findings across all severities. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9198cb8a08
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…w findings 1. OOM via mulOp (P1): add checkRepeatBytesLimit/checkRepeatItemsLimit guards before every make() in mulOp — raises MemoryError when str/bytes/list/tuple repetition would exceed 1 MiB (maxRepeatBytes). 2. Generator goroutines unregistered (P1): register each generator goroutine's callObject in goroutineCallFns so map/filter/sorted with user-defined key functions work inside generator bodies and generator expressions. 3. Generator absorbs real Go panics (P3): add re-panic for non-Python signals in both makeGenerator and evalGeneratorExp goroutines so nil-pointer dereferences and other Go bugs are not silently swallowed. 4. context.Background() ignores timeout (P1): add Ctx field to RunOpts, populated by runInternal. Replace all context.Background() calls in modules.go (listdir, exists, isfile, isdir) and builtins_funcs.go (open) with opts.Ctx so sandbox I/O respects the shell's cancellation deadline. 5. os.path.abspath/realpath leaks CWD (P2): remove both functions from the os.path module — they both call filepath.Abs which invokes os.Getwd, leaking the host CWD. This matches the policy that blocked os.getcwd() (commit f5235f8). Also remove osPathAbspath helper function. 6. sys.platform hardcoded to "linux" (P2): use runtime.GOOS via new goosToSysPlatform() helper — returns "darwin", "win32", or "linux". 7. analysis/symbols_builtins.go: remove stale context.Background and filepath.Abs entries (no longer used by python package); add runtime.GOOS. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
[Claude Sonnet 4.6] Addressed all action items from the review summary (commit fe0adcd):
Items not addressed (left for reviewer decision):
|
AlexandreYang
left a comment
There was a problem hiding this comment.
Code Review — PR #179: [experiment] python re-implement
Overview
This PR replaces the gpython dependency with a custom pure-Go Python 3 interpreter. The implementation is substantial (~10k LOC) and covers lexer, parser, AST, evaluator, type system, and module support. The security sandbox integrates correctly with the existing AllowedPaths infrastructure.
Overall assessment: needs fixes — no sandbox escapes found, but there are three correctness/safety issues that should be addressed before merge.
Summary of Findings
| # | Priority | File | Finding |
|---|---|---|---|
| 1 | builtins/python/types.go:2324 |
Unbounded make([]byte, n) allocation — OOM via large n |
|
| 2 | builtins/python/eval.go:975,1018 |
floorDivOp/modOp silently truncate big-int operands to 0 via ignored int64() ok flag |
|
| 3 | builtins/python/builtins_funcs.go:1272 |
input() uses unbounded bufio.ReadString — no line-length limit |
|
| 4 | builtins/python/modules.go:319,353 |
os.linesep hardcoded "\n" and os.name hardcoded "posix" — wrong on Windows |
|
| 5 | builtins/python/types.go:2187 |
file.readline() on stdin reader has no length limit — same OOM risk as finding #3 |
|
| 6 | builtins/python/modules.go:353 |
os.name is always "posix" even on Windows |
Security Positive Observations
callCtx.OpenFileis used throughout (neveros.Open) — theAllowedPathssandbox invariant is respected.- The
open()builtin correctly rejects write/append/exclusive modes before calling into the sandbox. - The allowedpaths sandbox itself (
sandbox.go:298) enforcesflag == os.O_RDONLY, so writing viaf.rc.Write()on sandbox-opened files will fail at the OS level. - Dangerous
osfunctions (system, popen, exec*, remove, mkdir, rename, symlink) are absent from the module dict —AttributeErrorraised on access. - Blocked modules (tempfile, glob, subprocess, socket, ctypes, threading, multiprocessing, asyncio) are registered in
moduleRegistryand panic withImportError. - Recursion depth is capped at 500 via
maxCallDepth. string * n,bytes * n,list * nare all bounded bymaxRepeatBytes(1 MiB), preventing repeat-based OOM.- Source code and file reads are bounded at 1 MiB via
io.LimitReader. - Generator goroutines respect
ctx.Done()in their send/yield select blocks. math.factorial()rejects n > 10000, preventing arbitrarily long big-int computation.
|
@codex review this PR |
|
Iteration 2 self-review result:
Total: 5 findings. |
…int ops, platform constants Five unresolved self-review threads addressed: 1. types.go (sys.stdin.read OOM): Cap `n` at `maxFileReadBytes` before allocating the read buffer to prevent OOM via `sys.stdin.read(2**40)`. 2. types.go (file.readline OOM): Wrap `f.r` with `bufio.NewReader(io.LimitReader(...))` so that `readline()` on stdin-backed files reads at most `maxFileReadBytes` bytes, preventing unbounded buffering from infinite sources. 3. builtins_funcs.go (input() OOM): Wrap `opts.Stdin` with `io.LimitReader` before constructing the `bufio.Reader` in `input()`, matching the same 1 MiB bound. 4. eval.go (floorDivOp/modOp big-int): Replace the `int64()`-based arithmetic (which silently truncated big-int operands to 0) with `big.Int` arithmetic via `toBigInt()`. `floorDivOp` uses `DivMod` + floor correction; `modOp` uses `big.Int.Mod` (already sign-matches the divisor). Also handles big-int × float by converting via `big.Float`. 5. modules.go (os.linesep / os.name): Derive both values from `runtime.GOOS`: `os.linesep` is `"\r\n"` on Windows and `"\n"` elsewhere; `os.name` is `"nt"` on Windows and `"posix"` elsewhere — matching CPython behaviour on each platform. Added scenario tests: - tests/scenarios/cmd/python/operators/bigint_floordiv_mod.yaml - tests/scenarios/cmd/python/stdin/read_large_n_is_capped.yaml - tests/scenarios/cmd/python/os_module/os_name_linesep.yaml Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Iteration 5 self-review result:
Total: 7 findings. |
…ut() stdin limit, re ImportError, goroutineID sanity check - builtins_funcs.go: bytes(n) and bytearray(n) now raise MemoryError when n exceeds maxRepeatBytes (1 MiB) instead of calling make([]byte, n) with an unbounded integer — prevents OOM DoS via `bytes(2**40)`. - pyruntime.go: wrap opts.Stdin in a single io.LimitReader (maxFileReadBytes) at runInternal time so all input() calls and sys.stdin.read*() calls share one cumulative 1 MiB budget instead of each getting a fresh window. - builtins_funcs.go: remove the per-call LimitReader from input() since the global one in runInternal now covers it. - types.go: fix generator goroutine leak — after setting g.done = true on context cancellation in drainGenerator, do a non-blocking receive on g.yieldCh so a goroutine blocked on yieldCh <- val can exit. - types.go: add sanity check in goroutineID() — panic immediately if the parsed id is 0 (format changed) rather than silently mis-dispatching all goroutines through a shared slot in goroutineCallFns. - modules.go: move `re` from a stub module (import succeeds but all calls raise TypeError) to a blocked module (import raises ImportError), so that `try: import re except ImportError: ...` works correctly in Python scripts. - python.go, SHELL_FEATURES.md: update doc to reflect re is now blocked. - Add scenario tests: bytes_bytearray_oom_guard.yaml, re_blocked.yaml, os_listdir_outside_allowed_paths.yaml. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c3266c361b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Review-Fix Loop Summary
Iteration log
Final state
Remaining issuesNone — all findings from 5 review iterations have been addressed and resolved. |
AlexandreYang
left a comment
There was a problem hiding this comment.
Review Summary
This PR introduces a complete pure-Go Python 3 interpreter as a new python builtin for the restricted shell. The implementation is large (~10 KLOC new) spanning a lexer, parser, AST, evaluator, type system, stdlib modules, and a security sandbox. The overall architecture is well-thought-out: file access goes through callCtx.OpenFile / AllowedPaths, dangerous os functions are absent from the module, blocked modules raise ImportError, write-mode open() is rejected, and stdin has a global 1 MiB cumulative LimitReader.
Overall assessment: needs fixes — one P1 DoS (unbounded big-int exponentiation) and a cluster of P2 issues.
Findings Summary
Detailed Findings
P1 — Unbounded big.Int Exponentiation (OOM / CPU exhaustion)
Location: builtins/python/eval.go:1078, builtins/python/builtins_funcs.go:551
Description: powOp and makeBuiltinPow delegate integer exponentiation to new(big.Int).Exp(base, exp, nil) without any bound on the exponent. A script can request exponentiation with astronomically large exponents that fit in int64, causing the runtime to allocate multi-gigabyte BigInts and consume all available memory and CPU.
Proof of concept:
# python -c "print(2**1000000000)"
# Exponent = 10^9 — produces ~150 MB number; with 2**10^10 it would OOM the host.Note: The << shift operator at eval.go:1152-1158 correctly caps the shift at maxShift = 1 << 23 = 8 MB bits. The same guard is absent for **.
CPython 3.11+ limits int ** int via a digit-count pre-check. This interpreter has no such limit.
Remediation: Add an exponent magnitude guard before calling big.Int.Exp. If the result would exceed maxRepeatBytes bits (1 MiB = 8 Mbit), raise OverflowError. Concretely: if base.BitLen() * exp > 8*1024*1024, raise OverflowError("integer exponentiation result too large").
P2 — math.comb / math.perm loop over int64-sized k (CPU exhaustion)
Location: builtins/python/modules.go:237-270
Description: math.comb(n, k) and math.perm(n, k) use toIntVal (returns int64). With n=1000000 and k=500000, the loop runs 500k iterations with growing BigInt multiplications, consuming minutes of CPU and gigabytes of memory.
Proof of concept:
# python -c "import math; print(math.comb(1000000, 500000))"Remediation: Add if k > 10000 { raiseValueError("math.comb argument is too large") } (analogous to factorial's limit).
P2 — print(file=f) bypasses write-block for read-only files
Location: builtins/python/builtins_funcs.go:179-185
Description: The print() builtin writes to f.rc directly when f.rc != nil, bypassing the PermissionError guard that file.write() correctly raises (line 2273). Since callCtx.OpenFile returns an io.ReadWriteCloser that is read-only at the OS level, writes will fail at the OS layer — but the application-level sandbox check is absent, making the behavior OS-dependent rather than consistently enforced by the sandbox.
Proof of concept:
f = open('somefile.txt') # rc-based, opened read-only
print("hello", file=f) # should raise PermissionError; instead tries OS writeRemediation: In makeBuiltinPrint, when out = f.rc, raise PermissionError("cannot write to a file opened in read mode").
P2 — goroutineID() relies on undocumented runtime.Stack format
Location: builtins/python/types.go:1872
Description: Parsing runtime.Stack output to extract goroutine IDs is fragile. The format is undocumented. If the format changes, the function panics with a message that brings down the entire shell process. The comment acknowledges this.
Remediation (design): Refactor so that PyBuiltin.Fn closures that need to call user functions receive the evaluator explicitly via closure capture, eliminating the goroutineCallFns global map and goroutineID() entirely.
P2 — Missing pentest scenario tests for DoS vectors
Location: tests/scenarios/cmd/python/sandbox/
Description: The sandbox folder lacks coverage for integer DoS vectors. Suggested additions:
# pow_dos.yaml
description: Integer exponentiation that would overflow raises OverflowError.
skip_assert_against_bash: true
input:
script: |+
python -c "print(2**1000000000)"
expect:
stdout: |+
stderr_contains: ["OverflowError"]
exit_code: 1# comb_dos.yaml
description: math.comb with large k raises ValueError.
skip_assert_against_bash: true
input:
script: |+
python -c "import math; print(math.comb(1000000, 500000))"
expect:
stdout: |+
stderr_contains: ["ValueError"]
exit_code: 1P3 — json.loads raises TypeError instead of NotImplementedError
Location: builtins/python/modules.go:655
Description: Code that does try: json.loads(s) except NotImplementedError: cannot catch the error because TypeError is raised instead. NotImplementedError is the conventional choice for "not yet implemented" builtins.
Remediation: Change to panic(exceptionSignal{exc: newExceptionf(ExcNotImplementedError, "json.loads() is not implemented in this shell")}).
P3 — execClassDef child evaluator has no goroutineCallFns entry
Location: builtins/python/eval.go:273-283
Description: The child evaluator created in execClassDef is not registered with goroutineCallFns. If during class body execution a call triggers callObject (the global function in types.go:1890), it will panic with "callObject invoked outside Python evaluation context". This would happen if e.g. __repr__ is called on an instance created inside the class body during pyStr() evaluation.
Remediation: Register the child evaluator with goroutineCallFns during execClassDef execution (same pattern as newEvaluator).
Positive Observations
- Sandbox architecture is sound: all file access goes through
callCtx.OpenFile/AllowedPaths. No directos.Opencalls in Python-facing code. - Global stdin budget:
runInternalwrapsopts.Stdinwith a single cumulative 1 MiBio.LimitReadershared across allinput()andsys.stdinreads — correctly prevents per-call budget bypass. - Memory guards on sequences:
checkRepeatBytesLimit/checkRepeatItemsLimitprotectstr * n,bytes * n,list * n,tuple * n. - Shift cap:
<<is capped atmaxShift = 8 MB bits— good pattern to replicate for**. - Recursion depth limit:
maxCallDepth = 500prevents stack overflow via deep recursion. - Generator item cap:
maxGeneratorItems = 128kprevents OOM from infinite generators. - Comprehensive module blocklist:
subprocess,socket,ctypes,tempfile,glob,threading,multiprocessing,asyncio,re. - Source size limit: 1 MiB cap on script files and stdin before parsing.
- Fuzz tests: both
FuzzPythonSourceandFuzzPythonFileContentincluded — excellent for catching parser/evaluator panics. - Context propagation: generator goroutines correctly propagate
ctx.Done()in all blocking channel operations.
|
@codex review this PR |
|
Iteration 1 self-review result (run 2):
Total: 7 findings. |
…on limits, math guards, permission enforcement - eval.go powOp: add bit-length guard before big.Int.Exp to prevent OOM/CPU exhaustion via 2**1000000000; raises OverflowError when result would exceed 8 MiB (analogous to existing maxShift cap on <<) - builtins_funcs.go pow(): same exponent-magnitude guard for the pow() builtin - modules.go math.comb: add k > 10000 guard (analogous to mathFactorial n > 10000) to prevent CPU exhaustion via math.comb(1000000, 500000) - modules.go math.perm: same k > 10000 guard - builtins_funcs.go print(): block writes to rc-backed (read-only) files at the application layer; raises PermissionError instead of relying on OS rejection - modules.go json.loads: raise NotImplementedError instead of TypeError so callers can use `except NotImplementedError` as documented Python convention Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 65656cf007
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
AlexandreYang
left a comment
There was a problem hiding this comment.
Review Summary
This PR introduces a complete pure-Go Python 3 interpreter as a builtin command (python). It is a large, ambitious change: a lexer, parser, AST, tree-walking evaluator, type system, module system, and ~150 scenario tests. The sandbox design is fundamentally sound — all file I/O is routed through callCtx.OpenFile/StatFile/ReadDir, dangerous OS functions are absent from the os module, blocked modules raise ImportError, and several DoS limits are in place.
Overall assessment: needs fixes — no sandbox escapes were found, but there are correctness issues (integer overflow in divmod/bin/hex/oct for big ints, an unbounded drainIter call for __iter__-backed custom objects, one generator goroutine that swallows Python exceptions silently, and missing test coverage for a few security-sensitive code paths).
| # | Priority | File | Finding |
|---|---|---|---|
| 1 | builtins/python/eval.go:2475 |
drainIter on __iter__-backed custom objects has no item-count cap — infinite generator objects that implement __iter__ + __next__ directly bypass maxGeneratorItems |
|
| 2 | builtins/python/eval.go:2341 |
Generator goroutine silently swallows non-StopIteration Python exceptions — unhandled exceptions disappear without printing traceback | |
| 3 | builtins/python/builtins_funcs.go:666 |
makeBuiltinBin/Hex/Oct call toIntVal which truncates big ints silently — wrong output for integers outside int64 range |
|
| 4 | builtins/python/builtins_funcs.go:484 |
makeBuiltinDivmod uses plain int64 arithmetic without big-int paths — overflows/wrong results for big ints |
|
| 5 | builtins/python/builtins_funcs.go:1272 |
makeBuiltinInput creates a new bufio.NewReader on each call, which can read ahead beyond the shared LimitReader budget, potentially consuming more bytes than intended |
|
| 6 | builtins/python/types.go:2172 |
file.readline wraps f.r in a new io.LimitReader per call instead of sharing the global stdin limit — each readline call gets a fresh 1 MiB window |
|
| 7 | builtins/python/eval.go:2004 |
Generator expression goroutine: in the evalGeneratorExp goroutine, non-StopIteration Python exceptions are silently swallowed (same pattern as finding #2) |
|
| 8 | builtins/python/types.go:1861 |
goroutineID() relies on undocumented runtime.Stack format — if the format changes the function panics, crashing the entire shell process |
|
| 9 | builtins/python/modules.go:558 |
binascii.Error is aliased to ExcOSError rather than ExcValueError — CPython uses binascii.Error subclassing ValueError |
|
| 10 | tests/scenarios/cmd/python/sandbox/open_outside_allowed_paths.yaml:14 |
Test uses stderr_contains instead of expect.stderr — guideline says prefer expect.stderr |
Findings
P1 — drainIter has no item-count cap for __iter__/__next__ objects
drainGenerator (called from collectIterable) correctly caps items at maxGeneratorItems (128k). However, drainIter (called from iterateObj when the object implements __iter__) has no limit at all:
func (e *Evaluator) drainIter(iterObj Object) []Object {
var result []Object
for {
val, ok := e.nextFromIter(iterObj)
if !ok {
break
}
result = append(result, val) // unbounded
}
return result
}A script like:
class Inf:
def __iter__(self): return self
def __next__(self):
return 1
for x in Inf(): pass # calls iterateObj → drainIter → OOMwill exhaust memory. The fix is to add the same maxGeneratorItems cap inside drainIter.
P1 — Generator goroutine swallows non-StopIteration Python exceptions
In makeGenerator (eval.go ~line 2363), the deferred recover in the generator goroutine catches all exceptions and returns silently:
if sig, ok := r.(exceptionSignal); ok {
if exceptionMatchesClass(sig.exc, ExcStopIteration) { return }
if exceptionMatchesClass(sig.exc, ExcGeneratorExit) { return }
// Other Python exception: generator exits cleanly. ← swallows it!
return
}If user code in a generator raises e.g. ValueError, the generator silently exhausts without printing a traceback. CPython propagates the exception to the caller of next(g). This is a correctness bug.
P1 — bin()/hex()/oct() truncate big ints silently
makeBuiltinBin/Hex/Oct call toIntVal(args[0]) which calls int64() and drops the big field:
func toIntVal(obj Object) int64 {
switch v := obj.(type) {
case *PyInt:
n, _ := v.int64() // returns 0 for big ints that don't fit in int64
return n
...
}
}For bin(2**63) CPython outputs '0b1000...0000' (65 chars) but the interpreter would output '0b0' (since int64() returns 0 for values that overflow). The fix is to use the big.Int path with big.Int.Text(2/16/8).
P1 — divmod() uses plain int64 arithmetic
makeBuiltinDivmod does an, _ := av.int64() without checking the ok return. For big ints where int64() returns (0, false), the function silently computes divmod(0, bn) instead of raising an error or using big-int arithmetic.
P2 — input() creates new bufio.Reader per call, buffering past the LimitReader
makeBuiltinInput creates reader := bufio.NewReader(opts.Stdin) on each call. bufio.Reader reads ahead in 4KB chunks internally, which can consume bytes from the underlying LimitReader beyond what's needed for a single line. This means multiple input() calls can collectively read significantly more than maxFileReadBytes from the underlying source before the limit fires. The global LimitReader wrapping in runInternal was intended to prevent this, but the per-call bufio.NewReader breaks the budget by buffering past the limit boundary.
P2 — file.readline wraps stdin in a new LimitReader per call
fileGetAttr for readline does:
limited := bufio.NewReader(io.LimitReader(f.r, int64(maxFileReadBytes)))This gives each readline() call on a file a fresh 1 MiB window. For stdin files that are already wrapped in the global LimitReader this is double-limited and harmless, but for sys.stdin (which wraps the original opts.Stdin via bufio.NewReader(opts.Stdin) in the sys module creation), each sys.stdin.readline() call gets a fresh 1 MiB limit, allowing unbounded reads from /dev/stdin-like sources.
P2 — goroutineID() is a panic footgun for the entire shell
If the Go runtime ever changes its runtime.Stack output format (it is explicitly documented as non-stable), goroutineID() will return 0 and panic:
if id == 0 {
panic("goroutineID: could not parse goroutine ID from runtime.Stack output — Go runtime format may have changed")
}This Go-level panic propagates up through newEvaluator and will crash the shell process entirely rather than returning an error. Consider using a recover here and returning an error, or using sync.Map with a context-value key instead of goroutine IDs.
P3 — binascii.Error should alias ExcValueError, not ExcOSError
In CPython, binascii.Error is a subclass of ValueError. The current code aliases it to ExcOSError, meaning except ValueError would not catch binascii.Error. This is a minor compatibility issue.
P3 — Scenario tests should prefer expect.stderr over stderr_contains
Per project guidelines, expect.stderr (exact match) is preferred over stderr_contains. Several sandbox scenario tests use stderr_contains.
Coverage Summary
| Code path | Scenario test | Go test | Status |
|---|---|---|---|
open() sandbox enforcement |
sandbox/open_outside_allowed_paths.yaml |
— | Covered |
os.system blocked |
sandbox/os_system_blocked.yaml |
— | Covered |
re blocked |
sandbox/re_blocked.yaml |
— | Covered |
drainIter on __iter__/__next__ objects |
— | — | Missing |
| Generator exception propagation to caller | — | — | Missing |
bin/hex/oct on big ints |
— | — | Missing |
divmod on big ints |
— | — | Missing |
input() reading beyond limit via buffering |
— | — | Missing |
goroutineID() format parsing |
— | smoke_test.go (indirectly) |
Partial |
Positive Observations
- All file I/O is correctly routed through
callCtx.OpenFile— the primary sandbox invariant is upheld throughout the implementation. - The
maxRepeatByteslimit onstr * n,bytes * n,list * n,tuple * nis well-placed and well-tested. - The
maxCallDepth = 500recursion guard correctly raisesRecursionError. - The
maxSourceBytes = 1 MiBlimit on script reading is correct and bounded. - Blocked modules (
subprocess,socket,ctypes,tempfile,glob,threading,multiprocessing,asyncio,re) all correctly raiseImportError. os.environis correctly stubbed to an empty dict andos.getenvalways returns the default.- The generator
drainGeneratorfunction correctly caps atmaxGeneratorItemsfor directPyGeneratordrain paths. - Context cancellation is correctly propagated through
checkCtx()in loop bodies and in generatorselectblocks. - The
powexponent limit guards correctly prevent OOM from2**10000000. - The
exec/eval/compile/__import__builtins are correctly absent from the globals dict.
|
@codex review this PR |
|
Iteration 2 self-review result (run 2):
Total: 9 findings. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 96c650e5fe
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…tor exception propagation, bin/hex/oct big int, divmod big int, stdin reader reuse, binascii.Error alias, goroutineID safety - drainIter: add maxGeneratorItems (128k) item-count cap; raises MemoryError when custom __iter__/__next__ objects produce too many items (prevents OOM DoS) - Generator exception propagation: wire up excCh so non-StopIteration exceptions raised inside generator bodies are propagated to the caller of next() instead of being silently swallowed; fix applies to both makeGenerator and evalGeneratorExp, and to nextFromGenerator/nextFromIterable - bin()/hex()/oct(): use toBigInt().Text(base) via new toIntValBig helper so big integers (> int64 max) produce correct output instead of '0b0'/'0x0'/'0o0' - divmod(): use big.Int.DivMod for integer operands so values outside int64 range produce correct results instead of silently computing divmod(0, bn) - stdin reader reuse: store a single persistent bufio.Reader (opts.stdinReader) in RunOpts, initialised by runInternal after wrapping Stdin in its LimitReader; reuse it in input() and sys.stdin to avoid losing read-ahead bytes between calls - sys.stdin.readline(): use f.r.ReadString directly instead of wrapping f.r in a fresh LimitReader per call (which gave each readline() its own independent 1 MiB budget and discarded buffered bytes) - binascii.Error: change alias from ExcOSError to ExcValueError to match CPython semantics (binascii.Error is a subclass of ValueError, not OSError) - goroutineID(): change return to (int64, bool) instead of panicking on parse failure; callers degrade gracefully instead of crashing the shell process - Tests: add scenario tests for big-int bin/hex/oct and divmod, generator exception propagation, and drainIter memory limit Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AlexandreYang
left a comment
There was a problem hiding this comment.
Review Summary
This PR implements a custom pure-Go Python 3 interpreter as a new python builtin command (~9,000 lines of new Go code). The overall security architecture is sound: all filesystem access routes through callCtx.OpenFile/Stat/ReadDir, the sandbox correctly denies write modes, resource limits are applied for memory exhaustion vectors, and dangerous modules are blocked. However, there are several findings that should be addressed before merge.
Overall assessment: needs fixes (no P0, multiple P1/P2 findings)
Finding Summary
Findings
P1: Goroutine ID parsing is fragile
Location: builtins/python/types.go:1876–1890
The goroutine ID is extracted by parsing runtime.Stack output. The Go runtime explicitly states this format is undocumented and may change. If goroutineID() returns (0, false), the evaluator falls back to registering nothing in goroutineCallFns. Any subsequent user-defined function call via callObject() (e.g. sorted(key=fn), map(fn, …), min/max(key=fn)) hits:
panic("callObject invoked outside Python evaluation context")This is a real Go panic (not exceptionSignal), which is not caught by runInternal's defer recover() (that only catches typed signals) and will crash the shell process.
Remediation: Convert the unrecoverable string panic to an exceptionSignal so it gets caught:
// In callObject:
panic(exceptionSignal{exc: newExceptionf(ExcRuntimeError, "callObject invoked outside Python evaluation context")})P1: sys.exit(N) with N > 255 silently truncates
Location: builtins/python/python.go:172
uint8(300) == 44. sys.exit(300) exits with code 44, not 300. CPython lets the OS truncate exit codes, but the interpreter itself does not truncate. Script authors relying on distinct non-zero codes (e.g., sys.exit(2) vs sys.exit(10)) will see wrong values when N > 255.
import sys; sys.exit(300) # shell exits 44, not 300Remediation: Clamp to 1 for out-of-range values, or at minimum document this wrapping behaviour:
if exitCode < 0 || exitCode > 255 {
exitCode = 1
}
return builtins.Result{Code: uint8(exitCode)}P1: collectIterable has no item count limit — OOM via list(range(10**9))
Location: builtins/python/types.go:2966 (case *PyRange: in collectIterable)
collectIterable applies drainGenerator (which has a maxGeneratorItems = 128k cap) for PyGenerator sources. But for PyRange, PyMapIter, PyFilterIter, PyZipIter, PyEnumerateIter, and PyReversedIter, there is no cap. For PyRange:
n := v.length() // can be up to int64 max
result := make([]Object, n) // immediately allocates n * 8 byteslist(range(10**9)) allocates ~8 GiB immediately. list(map(str, range(10**9))) does the same via the map iter pre-collection in makeBuiltinMap.
Remediation: Add the same cap used in drainIter:
case *PyRange:
n := v.length()
if n > maxGeneratorItems {
panic(exceptionSignal{exc: newExceptionf(ExcMemoryError, "range too large to collect (limit %d)", maxGeneratorItems)})
}
result := make([]Object, n)
// ...Apply the same guard in the PyMapIter, PyFilterIter, PyZipIter, PyEnumerateIter, and PyReversedIter loops (cap the result slice before the loop, or check length mid-loop like drainIter does).
P2: Class-body child evaluator — works by coincidence, needs documentation
Location: builtins/python/eval.go:278–287 (execClassDef)
The child evaluator created for the class body does NOT call goroutineCallFns.Store. This works because the class body executes on the same goroutine as the parent evaluator, which already has its goroutineCallFns entry. But there is a comment // Propagate callObject binding that is misleading — it does not actually propagate anything. Please replace with:
// The class body executes on the same goroutine as the parent, so the
// parent's goroutineCallFns entry is already in scope. No re-registration needed.P2: json.loads exception type mismatch with docs
Location: builtins/python/modules.go:664
The package doc comment (python.go:56) and SHELL_FEATURES.md both say json.loads raises TypeError. The implementation raises NotImplementedError. The test json_loads_not_implemented.yaml uses stderr_contains: ["NotImplementedError"], which matches the code but not the docs.
Remediation: Either change the implementation to use ExcTypeError, or update the documentation everywhere that says TypeError.
P3: id() returns address of stack local, not object identity
Location: builtins/python/builtins_funcs.go:2148–2158
fmt.Sprintf("%p", &args[0]) formats the address of the stack variable holding the interface value, not the object itself. Two calls with the same object return different addresses:
a = [1, 2, 3]
print(id(a) == id(a)) # likely prints FalseThis is a correctness bug that could confuse users who write id(x) == id(y) for identity checks. Since unsafe is not in the allowlist, document that id() returns an approximation.
P3: Real Go panics inside generator goroutines crash the process
Location: builtins/python/eval.go:2060, 2411–2413
Generator goroutines recover panics and re-panic if the panic is a real Go panic (nil pointer dereference, index OOB, etc.). A re-panicked goroutine crashes the entire process since there is no recover() on that goroutine's stack after the re-panic. This is the correct Go behavior for unexpected crashes, but users may experience an unrecoverable shell crash rather than a Python exception. Consider adding a catch-all that converts unrecognized panics to a Python RuntimeError propagated over excCh.
Test Coverage Summary
| Code path | Scenario test | Go test | Status |
|---|---|---|---|
sys.exit(N) N > 255 truncation |
— | — | Missing |
collectIterable with large range() OOM |
— | — | Missing |
list(map(str, range(10**9))) OOM |
— | — | Missing |
| Generator memory limit (via generator) | basic/drain_iter_memory_limit.yaml |
— | Covered |
json.loads raises error |
basic/json_loads_not_implemented.yaml |
— | Covered (wrong exc type) |
| Blocked modules | sandbox/re_blocked.yaml etc. |
— | Covered |
Write-mode open() rejected |
sandbox/open_write_blocked.yaml |
— | Covered |
os.listdir() outside allowed paths |
sandbox/os_listdir_outside_allowed_paths.yaml |
— | Covered |
| Recursion depth limit | — | python_test.go |
Covered |
Coverage: Gaps found — large-range OOM guard and exit-code truncation are untested.
Positive Observations
- All filesystem access routes through the sandbox callbacks (
callCtx.OpenFile,callCtx.StatFile,callCtx.ReadDir) — no directos.Opencalls anywhere in the package. open()write-mode check is enforced at two levels: mode string parse gate and runtimef.rc != nilcheck.- Stdin is wrapped in a single global
io.LimitReadersoinput()in a loop cannot exceed the 1 MiB cumulative budget. maxCallDepth = 500protects against infinite recursion.maxRepeatBytes = 1 MiBis applied uniformly to string/bytes/list/tuple repetition operators.math.comb,math.perm,math.factorialhave explicit argument limits (> 10000) to prevent factorial blowup.- Generator goroutines register their own
goroutineCallFnsentry — user-defined key functions work correctly inside generator bodies. - Dangerous Python builtins (
compile,exec,eval,__import__) are absent from the builtin dict, raisingNameErrorrather than needing explicit blocking logic. - The
osmodule intentionally omitsos.getcwd(),os.system(), and all write/exec/spawn functions.
|
@codex review this PR |
|
Iteration 3 self-review result (run 2):
Total: 6 findings. |
- toIntVal: raise IndexError for big integers that don't fit in int64
instead of silently returning 0 (CPython: "cannot fit 'int' into an
index-sized integer")
- powOp: use big.Float for full-precision conversion of big-int bases
when computing negative-exponent results, preventing silent truncation
to 0 (which produced +Inf instead of the correct small float)
- id(): use the interface's stored pointer (args[0]) instead of the
slice-slot address (&args[0]) so that id() returns a stable per-object
identity across multiple calls
- collectIterable(PyRange): add maxGeneratorItems size guard before
make([]Object, n) to prevent Go runtime panic on huge ranges
- execWith: three fixes to match CPython with-statement semantics:
(1) pass exc_type (the class) as the first arg to __exit__ instead
of the exception instance;
(2) unwind already-entered managers in reverse order when a later
__enter__ raises, before re-propagating the exception;
(3) clear bodyPanic when an inner __exit__ suppresses the exception
so outer managers in the same with-chain receive (None, None, None)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 992144042d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
AlexandreYang
left a comment
There was a problem hiding this comment.
Review Summary
This PR introduces a pure-Go Python 3 interpreter as a new python builtin for the restricted shell. The interpreter is remarkably complete — lexer, parser, AST, evaluator, full type system, modules, generators with goroutines, and a comprehensive security sandbox layer. This is a significant engineering effort.
Overall assessment: needs fixes — the sandbox design is fundamentally sound, but there are several correctness and resource-safety issues to address before merging.
Summary of Findings
Positive Observations
- Sandbox file access is correct:
open()routes throughcallCtx.OpenFile(the AllowedPaths sandbox) and write/append modes are rejected before the OS call. No directos.Opencalls anywhere in the python package. - Write-mode double defense: write is blocked at both
open()mode-char validation and again inPyFile.write()for rc-based files. - Memory bounds are broadly good: 1 MiB cap on source, stdin, and per-file reads;
maxRepeatBytesguards string/list repetition;maxGeneratorItemsguardsdrainGenerator;maxCallDepth=500prevents stack overflow. - Context cancellation is pervasive:
checkCtx()in while/for loops, select onctx.Done()in all channel operations in generators. - No dangerous builtins exposed:
eval,exec,compile,__import__are all absent. Blocked modules (subprocess, socket, ctypes, etc.) raiseImportErrorcleanly. - Big integer arithmetic: arbitrary-precision integers with exponent and shift limits prevent OOM from things like
2**10000000. - Host env isolation:
os.environreturns an empty dict;os.getenv()always returns the default;os.getcwd/os.path.expanduserare absent. - Goroutine-ID-based callObject dispatch is a clever solution for threading user-defined functions through built-in higher-order functions (
map,filter,sorted) without a shared global evaluator.
|
@codex review this PR |
|
Iteration 4 self-review result (run 2):
Total: 9 findings. |
…d json.loads exception type - Generator send()/__next__(): add ctx.Done() arms to all channel operations so they cannot deadlock if the generator goroutine exits unexpectedly or the execution context is cancelled (P1 #3071362308) - Generator close(): close sendCh to unblock a generator goroutine waiting on sendCh, and drain yieldCh so the goroutine can observe the close and exit cleanly. Guard with v.done to prevent double-close (P1 #3071362326) - collectIterable: add maxGeneratorItems bounds to PyMapIter, PyFilterIter, PyZipIter, PyEnumerateIter, PyReversedIter, rangeIter, PyStr, and PyBytes branches so scripts cannot OOM via list(map(fn, huge_list)) or list(big_str) (P2 #3071362339, #3071362345) - json.loads(): raise ValueError instead of NotImplementedError so callers using `except ValueError` can detect the unimplemented function correctly, matching CPython's json.JSONDecodeError (subclass of ValueError) (P2 #3071362351) - python.go: add comment explaining sys.exit(N > 255) truncation to uint8 matches POSIX exit-code mod-256 behaviour (P1 #3071362333) - os.environ: document that writes are in-memory only and do not affect the host process (P3 #3071362355) - Add sys_exit_256.yaml test scenario verifying sys.exit(256) → exit code 0 - Update json_loads_not_implemented.yaml to expect ValueError instead of NotImplementedError Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 33786a9ccd
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // on opts.Stderr: runInternal may still write traceback output after the | ||
| // context fires. Waiting here is safe because the evaluator checks | ||
| // ctx.Done() at each loop iteration and returns promptly. | ||
| <-ch |
There was a problem hiding this comment.
Return immediately when execution context is cancelled
The cancellation branch waits on <-ch> before returning, which can deadlock the command if runInternal is blocked in I/O (for example python -c "input()" with stdin connected to a pipe/FIFO whose writer stays open). In that case the runner timeout/cancellation is never honored and the shell can hang indefinitely instead of exiting with a timeout error.
Useful? React with 👍 / 👎.
| "memoryview": makeBuiltinMemoryview(), | ||
| "open": makeBuiltinOpen(opts), | ||
| "super": makeBuiltinSuper(), | ||
| "object": makeBuiltinObject(), |
There was a problem hiding this comment.
Expose
object as a class, not a builtin function
Registering object via makeBuiltinObject() makes it a builtin function that returns an instance, so explicit base declarations like class A(object): pass fail with TypeError: bases must be classes, not builtin_function_or_method. This breaks valid Python class definitions and differs from expected Python 3 behavior where object is the root class.
Useful? React with 👍 / 👎.
AlexandreYang
left a comment
There was a problem hiding this comment.
Review Summary
This PR introduces a complete, self-contained Python 3 interpreter (builtins/python) as a builtin command for the restricted shell. The implementation is a pure-Go tree-walking evaluator with a rich Python-compatible type system, a lexer, parser, and runtime with a security sandbox.
Overall assessment: needs fixes (P2/P3 issues; no P0/P1)
The sandbox is solid. No direct os.Open/os.Stat/os.ReadFile calls exist in the new code — all filesystem access is correctly routed through callCtx.OpenFile, callCtx.StatFile, and callCtx.ReadDir. No dangerous Go packages (os/exec, net/http, unsafe, syscall) are imported. The eval/exec/compile/__import__ Python builtins are absent, which is the right call. Context cancellation and memory limits are pervasive and well-implemented. Generator goroutines clean up correctly.
Findings summary
Positive Observations
- No sandbox escape paths found. All file I/O goes through
callCtx.OpenFile/StatFile/ReadDir, which route throughAllowedPaths. No directos.*filesystem calls anywhere in the package. eval/exec/compile/__import__/breakpointPython builtins are absent — there is no way for user code to dynamically compile/execute arbitrary Python strings.- Memory safety is thorough:
maxSourceBytes(1 MiB) caps source reads;maxFileReadBytescaps file/stdin reads per call;maxRepeatBytescaps sequence repetition;maxGeneratorItemscaps iterator materialisation;maxCallDepthcaps recursion;math.factorial/comb/permhave explicit argument limits. - Context cancellation is propagated everywhere:
checkCtx()is called in bothwhileandforloops, and generator goroutines select onctx.Done()at every blocking yield/send. - Blocked modules are correctly registered in
moduleRegistrysotry: import re except ImportErrorworks as expected. - Fuzz tests and a rich scenario suite (100+ YAML scenarios) cover the most important paths, including security (sandbox violations, mode blocking) and memory limits.
- The
goroutineCallFnssync.Map correctly scopes function dispatch per-goroutine so that concurrent Python executions do not share state.
|
@codex review this PR |
|
Iteration 5 self-review result (run 2):
Total: 5 findings (no P1s). |
- Fix SHELL_FEATURES.md: json.loads raises ValueError (not TypeError) - Fix python.go doc comment: same TypeError→ValueError correction - Document known limitation: `import os.path` raises ImportError; use `from os import path` instead (noted in both SHELL_FEATURES.md and python.go package doc) - Add scenario tests for all six blocked dangerous modules: subprocess, socket, ctypes, threading, multiprocessing, asyncio (all raise ImportError on import, matching re_blocked.yaml pattern) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ef38545409
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| mod, found := loadModule(name.Name, e.opts) | ||
| if !found { | ||
| // Check cache | ||
| if cached, ok := e.modules[name.Name]; ok { | ||
| mod = cached |
There was a problem hiding this comment.
Reuse cached modules before calling the loader
execImport loads the module before consulting e.modules, so registry-backed modules are rebuilt on every import in the same run. That breaks expected Python module state persistence (for example, mutating os.environ and importing os again returns a fresh module and drops prior state). Check the per-run cache first and only invoke loadModule on a cache miss.
Useful? React with 👍 / 👎.
| mod, found := loadModule(n.Module, e.opts) | ||
| if !found { | ||
| if cached, ok := e.modules[n.Module]; ok { | ||
| mod = cached | ||
| } else { | ||
| panic(exceptionSignal{exc: newExceptionf(ExcImportError, "No module named '%s'", n.Module)}) |
There was a problem hiding this comment.
Resolve dotted module names in from-imports
execImportFrom passes the full module string directly to loadModule, so dotted imports like from os.path import join fail with ImportError when only the top-level module is registered. This causes valid Python imports to fail even though os.path is exposed through the os module object. The import path should resolve dotted names via parent/top-level module loading and cache lookup before attribute extraction.
Useful? React with 👍 / 👎.
Review-Fix Loop Summary (run 2)
Iteration log
Final state
Remaining issuesNone — all findings from 5 review iterations have been addressed and resolved. |
What does this PR do?
Motivation
Testing
Checklist