refactor: deduplicate code #38
Conversation
Warning: Rate limit exceeded

⌛ How to resolve this issue? After the wait time has elapsed, a review can be re-triggered. We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work? CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information.

📝 Walkthrough

Replaces family-wide environment lookups with the registry family API, centralizes API response construction, adds env_loader utilities for loading, warming up, and evaluating environments, refactors executor workers to use APIClient and env_loader, removes the EpisodeResult dataclass, updates configs, and consolidates tests with fixtures and parametrization.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Worker
    participant EnvLoader as env_loader
    participant Affinetes as af_env
    participant Container
    participant APIClient

    Worker->>EnvLoader: build_load_kwargs(image, mode, ...)
    EnvLoader-->>Worker: load_kwargs
    Worker->>EnvLoader: load_and_warmup_env(family, image, load_kwargs)
    EnvLoader->>Affinetes: load_env(load_kwargs)
    Affinetes->>Container: create/start container
    Container-->>Affinetes: ready
    Affinetes-->>EnvLoader: env instance
    EnvLoader->>Affinetes: env.list_environments() (warmup)
    EnvLoader-->>Worker: env instance

    loop per task
        Worker->>EnvLoader: run_evaluation(env, task, ...)
        EnvLoader->>Affinetes: env.evaluate(task, ...)
        Affinetes-->>EnvLoader: TaskResult
        EnvLoader-->>Worker: TaskResult
        Worker->>APIClient: submit_results(TaskResult)
        APIClient-->>Worker: ack
    end

    Worker->>EnvLoader: force_remove_container(container_name)
    EnvLoader->>Container: docker rm -f
    Container-->>EnvLoader: removed
```
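For orientation, the flow above can be read as roughly the following worker loop. This is a sketch only: the function names come from the diagram, while the import paths, async signatures, and the `APIClient` interface are assumptions rather than the PR's actual code.

```python
# Sketch of the diagrammed flow; signatures and module paths are assumed, not verified.
from kinitro.executor import env_loader                  # env_loader.py is added in this PR
from kinitro.executor.api_client import APIClient        # assumed module path


async def run_family(family: str, image: str, tasks: list, api: APIClient) -> None:
    # Build container load options, then start and warm up the environment once per family.
    load_kwargs = env_loader.build_load_kwargs(image, mode="docker")
    env = await env_loader.load_and_warmup_env(family, image, load_kwargs)
    container_name = getattr(env, "container_name", None)

    try:
        for task in tasks:
            # Evaluate each task in the warmed-up environment and push the result upstream.
            result = await env_loader.run_evaluation(env, task)
            await api.submit_results([result])
    finally:
        # Always tear down the container, even if an evaluation fails mid-loop.
        if container_name:
            env_loader.force_remove_container(container_name)
```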
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

🚥 Pre-merge checks: ✅ Passed checks (3 passed)
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/unit/test_cycle_isolation.py (1)
91-105: ⚠️ Potential issue | 🟡 Minor: Test doesn't verify its stated intent; no completed/failed task is present.

The name and docstring claim this test checks that completed/failed tasks are left unchanged, but it only creates a `PENDING` task and asserts it gets cancelled. There's no `COMPLETED` or `FAILED` task to verify is left untouched, making this effectively a duplicate of `test_cancels_running_cycle_and_tasks`. Add a completed task mock and assert its status remains unchanged:
Proposed fix
```diff
 async def test_leaves_completed_tasks_unchanged(self, mock_session):
     """Completed/failed tasks from incomplete cycles are not modified."""
     mock_cycle = _make_mock_cycle()
-    mock_task = _make_mock_task(TaskStatus.PENDING.value)
+    mock_pending_task = _make_mock_task(TaskStatus.PENDING.value)
+    mock_completed_task = _make_mock_task(TaskStatus.COMPLETED.value)

-    mock_session.execute = _mock_execute_results([mock_cycle], [mock_task])
+    mock_session.execute = _mock_execute_results(
+        [mock_cycle],
+        [mock_pending_task, mock_completed_task],
+    )

     storage = Storage("postgresql+asyncpg://test:test@localhost/test")

     cycles_cancelled, tasks_cancelled = await storage.cancel_incomplete_cycles(mock_session)

-    # Only the pending task is cancelled
+    # Only the pending task is cancelled; completed task is untouched
     assert tasks_cancelled == 1
-    assert mock_task.status == TaskStatus.FAILED.value
+    assert mock_pending_task.status == TaskStatus.FAILED.value
+    assert mock_completed_task.status == TaskStatus.COMPLETED.value
```
🤖 Fix all issues with AI agents
In `@kinitro/executor/env_loader.py`:
- Around line 130-143: The code is discarding evaluation error details by
hardcoding error=None when building the TaskResult; change this to propagate any
error returned by the environment by extracting error = result.get("error") (or
None by default) and pass that into the TaskResult constructor (TaskResult(...,
error=error)); keep existing fields (task.task_uuid, success, score,
total_reward, timesteps) and only replace the hardcoded None with the extracted
error so evaluation failures include the environment's error message.
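As a minimal sketch of that change, assuming the evaluation result is a plain dict and that `TaskResult` and `task` are already in scope inside `env_loader.py`; how the other fields are extracted here is illustrative, not the actual code:

```python
# Sketch only: mirrors the prompt above; the real field extraction in env_loader.py may differ.
def build_task_result(task, result: dict) -> "TaskResult":
    return TaskResult(
        task_uuid=task.task_uuid,
        success=result.get("success", False),
        score=result.get("score", 0.0),
        total_reward=result.get("total_reward", 0.0),
        timesteps=result.get("timesteps", 0),
        error=result.get("error"),  # previously hardcoded to None, discarding failure details
    )
```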
In `@kinitro/executor/family_worker.py`:
- Around line 111-121: The FamilyWorker instantiation chain never receives the
GPU flag so gpu_enabled is always false; pass ExecutorConfig.eval_gpu through
WorkerProcess.start into run_family_worker, add a gpu_enabled parameter to
run_family_worker and propagate it into FamilyWorker.__init__ (store as
self.gpu_enabled), and include gpu_enabled=self.gpu_enabled when calling
build_load_kwargs in FamilyWorker before load_and_warmup_env; update any
signatures that call run_family_worker (and tests) accordingly so family workers
honor eval_gpu instead of defaulting to False.
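A rough sketch of that propagation chain follows; the signatures below are guesses for illustration, and the only grounded details are the names `ExecutorConfig.eval_gpu`, `run_family_worker`, `FamilyWorker`, and `build_load_kwargs` from the prompt above.

```python
# Sketch only: the real signatures in worker_process.py / family_worker.py will differ.
from kinitro.executor.env_loader import build_load_kwargs  # assumed import path


class FamilyWorker:
    def __init__(self, family: str, image: str, gpu_enabled: bool = False) -> None:
        self.family = family
        self.image = image
        self.gpu_enabled = gpu_enabled  # carried through from ExecutorConfig.eval_gpu

    def _build_load_kwargs(self) -> dict:
        # Pass the flag when building container load options, before load_and_warmup_env.
        return build_load_kwargs(self.image, gpu_enabled=self.gpu_enabled)


def run_family_worker(family: str, image: str, gpu_enabled: bool = False) -> None:
    # WorkerProcess.start would forward ExecutorConfig.eval_gpu into this entry point.
    worker = FamilyWorker(family, image, gpu_enabled=gpu_enabled)
    ...  # rest of the worker startup elided
```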
🧹 Nitpick comments (7)
kinitro/api/routes/scores.py (1)
17-17: Add type annotations to the helper's parameters.
`cycle` and `scores_orm` are untyped, which reduces IDE support and type-checker coverage. Since the types are already imported or available (`EvaluationCycleORM` from storage, `list[MinerScoreORM]`), annotating them is straightforward. The same applies to `_build_weights_response` in `weights.py`.

Proposed fix

You'll need to import the ORM types:

```python
from kinitro.backend.models import EvaluationCycleORM, MinerScoreORM
```

Then annotate:

```diff
-def _build_scores_response(cycle, scores_orm) -> ScoresResponse:
+def _build_scores_response(cycle: EvaluationCycleORM, scores_orm: list[MinerScoreORM]) -> ScoresResponse:
```

As per coding guidelines, "Add return types to public functions and methods. Prefer `BaseSettings` and `BaseModel` type annotations for config/DTOs."

kinitro/api/routes/weights.py (1)
13-13: Add type annotations to the helper's parameters (same as `scores.py`).

Note that `cycle` is nullable here since `get_cycle` returns `EvaluationCycleORM | None`, and the body already handles `None` with conditionals on lines 25–27.

Proposed fix

```diff
-def _build_weights_response(weights_orm, cycle) -> WeightsResponse:
+def _build_weights_response(weights_orm: ComputedWeightsORM, cycle: EvaluationCycleORM | None) -> WeightsResponse:
```

You'll need to add the ORM import:

```python
from kinitro.backend.models import ComputedWeightsORM, EvaluationCycleORM, WeightsResponse, WeightsU16
```

As per coding guidelines, "Add return types to public functions and methods. Prefer `BaseSettings` and `BaseModel` type annotations for config/DTOs."

tests/unit/test_crypto.py (2)
58-60: Good use of fixture to deduplicate keypair generation.

Since the keypair is never mutated, you could optionally scope this to `"class"` to avoid regenerating keys for every test method, but function scope is fine for isolation.
215-217: Remove the unnecessary alias `backend_keypair = keypair`.

Either rename the fixture parameter to `backend_keypair` directly, or use `keypair` throughout the method body. The intermediate assignment adds no value.

Option A: rename the fixture parameter

```diff
-    def test_full_commitment_flow(self, keypair):
+    def test_full_commitment_flow(self, backend_keypair):
         """Test full flow: generate keys, encrypt, parse, decrypt."""
-        backend_keypair = keypair
```

Option B: use `keypair` throughout

```diff
     def test_full_commitment_flow(self, keypair):
         """Test full flow: generate keys, encrypt, parse, decrypt."""
-        backend_keypair = keypair
-
         # Miner encrypts their deployment ID
         deployment_id = "95edf2b6-e18b-400a-8398-5573df10e5e4"
-        encrypted_blob = encrypt_deployment_id(deployment_id, backend_keypair.public_key_hex())
+        encrypted_blob = encrypt_deployment_id(deployment_id, keypair.public_key_hex())
         ...
-            parsed["encrypted_deployment"], backend_keypair.private_key
+            parsed["encrypted_deployment"], keypair.private_key
```

tests/unit/test_genesis.py (2)
78-80: `object.__new__` bypass is fragile; consider a comment listing the assumed invariant.

This works because `_compute_reward` and `_check_success` currently only use their explicit arguments and never touch `self.*` attributes. If a future refactor introduces instance-state access in those methods, every test here will fail with an opaque `AttributeError` instead of a clear message.

A minimal safeguard: add a brief inline comment noting the invariant, or assert early that the methods don't rely on instance state (e.g., a smoke attribute check). Not blocking, since the pattern is documented in the docstring.
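For illustration, the safeguard could be a docstring on the test helper itself; `_make_g1_env` and `G1Environment` come from this review, and the helper body below is an assumption about the tests' actual code:

```python
def _make_g1_env() -> "G1Environment":
    """Build a G1Environment for unit tests without running __init__ (no simulator startup).

    Assumed invariant: _compute_reward and _check_success use only their explicit
    arguments and never read self.* attributes. If that ever changes, construct the
    environment properly instead of bypassing __init__.
    """
    return object.__new__(G1Environment)  # G1Environment is imported by the test module
```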
620-762: Consider a `@pytest.fixture` for the repeated `_make_g1_env()` calls.

Every test method in `TestG1Reward` (and `TestG1Success`) calls `_make_g1_env()` identically. A class-scoped or function-scoped fixture would reduce the repetition and make it trivial to swap the construction strategy later.

Example fixture

```python
@pytest.fixture()
def g1_env() -> G1Environment:
    return _make_g1_env()
```

Then each test takes `g1_env` as a parameter instead of calling the helper.

environments/metaworld/env.py (1)
46-343: Significant code duplication between `metaworld/env.py` and `genesis/env.py` Actor classes.

Given this PR's goal of deduplicating code, it's worth noting that the `Actor` classes in `environments/metaworld/env.py` and `environments/genesis/env.py` share nearly identical structure: `_call_miner`, `_run_evaluation`, `_build_error_result`, `_get_env`, and `cleanup` are essentially the same, differing only in the family prefix string and a few minor details (e.g., genesis uses `asyncio.Lock` for serialization). A shared base `Actor` class or mixin could eliminate ~200 lines of duplication. This could be a follow-up; a rough sketch of the shape follows.
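To make that follow-up concrete, a shared base could look roughly like the sketch below. Only the method names are taken from the comment above; the attributes, signatures, and lock handling are assumptions about code not shown here.

```python
import asyncio


class BaseActor:
    """Shared skeleton for the metaworld and genesis Actor classes (sketch only)."""

    family_prefix: str = ""       # e.g. "metaworld" or "genesis"
    serialize_runs: bool = False  # genesis would set this True to keep its lock-based serialization

    def __init__(self) -> None:
        self._envs: dict[str, object] = {}
        self._lock = asyncio.Lock() if self.serialize_runs else None

    async def _run_evaluation(self, task) -> dict:
        # Common orchestration; family-specific work stays in _evaluate_impl.
        if self._lock is not None:
            async with self._lock:
                return await self._evaluate_impl(task)
        return await self._evaluate_impl(task)

    async def _evaluate_impl(self, task) -> dict:
        raise NotImplementedError

    def _build_error_result(self, message: str) -> dict:
        return {"success": False, "error": f"{self.family_prefix}: {message}"}

    async def cleanup(self) -> None:
        self._envs.clear()
```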
3300722 to 2cdedef
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/unit/test_cycle_isolation.py (1)
91-105: ⚠️ Potential issue | 🟡 Minor: Test name doesn't match what's actually verified.

`test_leaves_completed_tasks_unchanged` only supplies a pending task via `_make_mock_task(TaskStatus.PENDING.value)`. It never creates a completed task to assert it remains unmodified. Consider either renaming the test (e.g., `test_cancels_only_pending_tasks`) or adding a completed mock task and asserting its status is unchanged.

💡 Option: add a completed task to actually verify the claim

```diff
 async def test_leaves_completed_tasks_unchanged(self, mock_session):
     """Completed/failed tasks from incomplete cycles are not modified."""
     mock_cycle = _make_mock_cycle()
     mock_task = _make_mock_task(TaskStatus.PENDING.value)
+    mock_completed_task = _make_mock_task(TaskStatus.COMPLETED.value)

-    mock_session.execute = _mock_execute_results([mock_cycle], [mock_task])
+    mock_session.execute = _mock_execute_results([mock_cycle], [mock_task, mock_completed_task])

     storage = Storage("postgresql+asyncpg://test:test@localhost/test")

     cycles_cancelled, tasks_cancelled = await storage.cancel_incomplete_cycles(mock_session)

     # Only the pending task is cancelled
     assert tasks_cancelled == 1
     assert mock_task.status == TaskStatus.FAILED.value
+    assert mock_completed_task.status == TaskStatus.COMPLETED.value
```
🧹 Nitpick comments (7)
tests/unit/test_genesis.py (1)
625-627: Consider extracting `g1_env` to a module-level fixture to avoid duplication.

The same `g1_env` fixture is defined identically in both `TestG1Reward` and `TestG1Success`. You could define it once at module scope (or in `conftest.py`) to further reduce duplication, which aligns with this PR's objective.

♻️ Example: module-level fixture

```python
# At module level or in conftest.py
@pytest.fixture()
def g1_env() -> G1Environment:
    return _make_g1_env()
```

Then remove the per-class fixture definitions and add `g1_env` as a parameter to each test method.

Also applies to: 772-774
tests/unit/test_crypto.py (1)
58-60: Same `keypair` fixture pattern duplicated across three test classes.

Identical `BackendKeypair.generate()` fixtures appear in `TestBackendKeypair`, `TestEncryptDecrypt`, and `TestIntegration` (as `backend_keypair`). Consider a single module-level or `conftest.py` fixture to consolidate, consistent with this PR's deduplication goal.

Also applies to: 141-143, 211-213
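As a concrete (if hypothetical) consolidation, the fixture could live once in `tests/unit/conftest.py`; the `BackendKeypair` import path below is an assumption:

```python
# tests/unit/conftest.py (sketch; adjust the import to wherever BackendKeypair is defined)
import pytest

from kinitro.crypto import BackendKeypair  # assumed location


@pytest.fixture()
def keypair() -> BackendKeypair:
    """Single shared keypair fixture replacing the per-class copies in the crypto tests."""
    return BackendKeypair.generate()


@pytest.fixture()
def backend_keypair(keypair: BackendKeypair) -> BackendKeypair:
    """Alias for tests that still request the old fixture name."""
    return keypair
```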
kinitro/api/routes/weights.py (1)
39-54: Consider adding return type annotations to endpoint functions.

The coding guidelines require return types on public functions. While FastAPI infers the response from `response_model`, adding `-> WeightsResponse` would satisfy the guideline and improve readability.

```diff
 @router.get("/latest", response_model=WeightsResponse)
 async def get_latest_weights(
     session: AsyncSession = Depends(get_session),
     storage: Storage = Depends(get_storage),
-):
+) -> WeightsResponse:
```

Same for `get_weights_for_block` at Line 58.
147-161: Silent `except Exception: pass` hides container-removal failures.

During cleanup debugging, a completely silent swallow gives zero visibility. A `logger.debug` would cost nothing and help diagnose stuck-container issues.

Proposed fix

```diff
     try:
         subprocess.run(
             ["docker", "rm", "-f", container_name],
             capture_output=True,
             text=True,
             timeout=5,
             check=False,
         )
-    except Exception:
-        pass
+    except Exception as e:
+        logger.debug("force_remove_container_failed", container=container_name, error=str(e))
```

As per coding guidelines: "Avoid broad `except Exception` unless you re-raise or return a clear error" and "Provide actionable error messages and log details with `structlog`."

kinitro/executor/family_worker.py (1)
244-246: Nit: `_submit_result` wraps a single result in a list.

`submit_results` expects a list, but this always submits one result. This is fine if the API client handles single-element lists efficiently, but worth noting in case batching is desired later; a rough sketch of one batching approach follows.
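Purely to illustrate the batching idea, a worker could buffer results and flush them in one call; `APIClient.submit_results` is the only name taken from this PR, everything else here is hypothetical:

```python
# Hypothetical batching helper; buffer size and flush policy are illustrative only.
class ResultBatcher:
    def __init__(self, api_client, batch_size: int = 10) -> None:
        self._api = api_client
        self._batch_size = batch_size
        self._pending: list = []

    async def add(self, result) -> None:
        # Accumulate results and submit them in groups instead of one call per task.
        self._pending.append(result)
        if len(self._pending) >= self._batch_size:
            await self.flush()

    async def flush(self) -> None:
        if self._pending:
            await self._api.submit_results(self._pending)
            self._pending = []
```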
kinitro/executor/worker_process.py (1)

39-62: Positional arg list is getting unwieldy (17 args); consider passing a config/dataclass.

This isn't a regression from this PR, but adding `eval_gpu` as the 17th positional arg further increases the risk of ordering mistakes. A config dataclass or kwargs dict would be safer for cross-process serialization. Fine to defer; a sketch follows.
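A sketch of the suggested bundling; the field names below are placeholders for illustration and would need to mirror the real argument list of `WorkerProcess.start`:

```python
from dataclasses import dataclass


@dataclass
class WorkerStartConfig:
    """Bundles the worker-start arguments so they cross the process boundary by name.

    Field names here are placeholders; the real set must mirror WorkerProcess.start's
    current positional arguments (including the new eval_gpu flag).
    """
    family: str
    image: str
    api_url: str
    eval_gpu: bool = False
    # ... remaining fields from the existing positional argument list


def start_worker(config: WorkerStartConfig) -> None:
    # Passing one serializable object avoids ordering mistakes across 17 positionals.
    ...
```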
kinitro/executor/worker.py (1)

22-34: Remove the unused `EvalEnvConfig` dataclass.

This dataclass is not referenced anywhere in the codebase and appears to be leftover from earlier code. Removing it (lines 22-33) will eliminate dead code.
2cdedef to 271097d
Shr1ftyy left a comment
lgtm.
Summary by CodeRabbit
Refactor
Config
Tests
Chores