@ZeroSumQuant
Owner

Summary

This PR implements Phase 3 of the master cleanup script. It adds three transformations:

  1. Import deduplication and organization (fix_imports)
  2. Docstring normalization (fix_docstrings)
  3. AST-based empty body detection (ast_empty_body_sweep)

Changes

🔧 Import Deduplication (fix_imports)

  • Splits comma-separated imports into individual lines for better readability
  • Removes duplicate import statements while preserving the first occurrence
  • Groups imports by type (stdlib → third-party → local) while maintaining original order within groups
  • Preserves inline comments (e.g., # noqa: F401)
  • Handles aliased imports correctly (keeps both import numpy and import numpy as np)

Example transformation:

# Before
import os, sys, json
import os
from os import path

# After  
import os
import sys
import json
from os import path
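
For readers who want to see the split-and-dedupe step in isolation, here is a minimal sketch that reproduces the transformation above. It is not the actual fix_imports code: the function name is hypothetical, it only looks at unindented import lines, it does no grouping, and copying an inline comment onto every statement split from the same line is an assumption.

```python
def split_and_dedupe_imports(source: str) -> str:
    """Split `import a, b` into one line per module and drop repeated imports."""
    seen: set[str] = set()
    result: list[str] = []
    for line in source.splitlines():
        if not line.startswith(("import ", "from ")):
            result.append(line)  # not a top-level import: leave untouched
            continue
        code, hash_sign, comment = line.partition("#")
        suffix = f"  #{comment}" if hash_sign else ""  # keep e.g. "# noqa: F401"
        if code.startswith("import "):
            names = code[len("import "):].split(",")
            statements = ["import " + " ".join(name.split()) for name in names]
        else:
            statements = [" ".join(code.split())]  # "from x import y" stays whole
        for statement in statements:
            if statement not in seen:  # first occurrence wins
                seen.add(statement)
                result.append(statement + suffix)
    return "\n".join(result)
```

Because deduplication compares whole statements, `import numpy` and `import numpy as np` are treated as different imports and both survive.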

📝 Docstring Normalization (fix_docstrings)

  • Converts single quotes to double quotes for consistency
  • Splits long one-line docstrings (>72 chars) into multi-line format
  • Ensures closing quotes are on their own line for multi-line docstrings
  • Skips raw strings to avoid breaking regex patterns
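
A rough sketch of these rules for a single line, under several assumptions that the description above does not settle: "single quotes" is read as triple-single-quoted docstrings, the 72-character limit is applied to the whole line, and any standalone one-line triple-quoted string is treated as a docstring. The function name is hypothetical and this is not the fix_docstrings implementation.

```python
import re

MAX_ONE_LINE = 72  # threshold from the PR description; measured on the whole line (assumption)

# Indentation, an opening ''' or """, the text, and the same closing quotes.
# A raw-string prefix (r''' ... ''') does not match, so raw strings are skipped.
_ONE_LINE_DOCSTRING = re.compile(r"^(\s*)('{3}|\"{3})(.*?)\2\s*$")


def normalize_docstring_line(line: str) -> list[str]:
    """Return the replacement line(s) for one source line."""
    match = _ONE_LINE_DOCSTRING.match(line)
    if match is None:
        return [line]
    indent, _quote, body = match.groups()
    if body.endswith('"') or '"""' in body:
        return [line]  # switching quotes here would change or break the string
    one_line = f'{indent}"""{body}"""'
    if len(one_line) <= MAX_ONE_LINE:
        return [one_line]
    # Too long: move the closing quotes onto their own line.
    return [f'{indent}"""{body}', f'{indent}"""']
```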

🔍 AST Empty Body Detection (ast_empty_body_sweep)

  • Uses Python's AST to accurately detect empty function/class bodies
  • Distinguishes between truly empty bodies and those containing only definitions
  • Adds pass statements where needed for syntactic correctness
  • Includes fallback detection for unparseable code (e.g., incomplete functions)
  • Preserves existing pass and ellipsis (...) statements

Example transformation:

# Before
async def outer():
    async def inner():
        pass

# After
async def outer():
    async def inner():
        pass
    pass
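
The example above can be reproduced with a short AST walk. The sketch below is an illustration rather than the ast_empty_body_sweep code: the function name is hypothetical, the rule "a body that contains only nested definitions gets a trailing pass" is inferred from the example, and it assumes space indentation and Python 3.8+ (for `end_lineno`).

```python
import ast

_DEFINITIONS = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)


def add_pass_after_definition_only_bodies(source: str) -> str:
    """Append `pass` to functions/classes whose body holds only nested definitions."""
    tree = ast.parse(source)
    lines = source.splitlines()
    targets = [
        # (last line of the node, indentation of its body)
        (node.end_lineno, node.body[0].col_offset)
        for node in ast.walk(tree)
        if isinstance(node, _DEFINITIONS)
        and all(isinstance(child, _DEFINITIONS) for child in node.body)
    ]
    # Insert from the bottom of the file up so earlier line numbers stay valid.
    # When two nodes end on the same line, insert the shallower `pass` first so
    # the deeper one ends up above it.
    for end_lineno, col in sorted(targets, key=lambda t: (-t[0], t[1])):
        lines.insert(end_lineno, " " * col + "pass")
    return "\n".join(lines)
```

Bodies that already contain `pass`, `...`, or a docstring are not definition-only, so they are left as they are.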

Testing

  • ✅ All 17 Phase 3 tests passing
  • ✅ Security checks pass (bandit, safety)
  • ✅ Code formatted with black
  • ✅ Comprehensive test coverage including edge cases

Test Coverage

The test suite covers:

  • Duplicate import removal
  • Multi-import splitting
  • Comment preservation
  • Mixed import styles
  • Aliased imports
  • Docstring normalization
  • Empty body detection (async functions, classes with docstrings, etc.)

Implementation Notes

  • The import deduplication preserves order within groups rather than alphabetizing
  • Empty body detection uses AST's end_lineno for precise pass insertion
  • Fallback mechanisms handle edge cases like unparseable syntax
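
On the fallback note: a function or class with a truly empty body cannot be parsed at all, so `ast.parse` raises `SyntaxError` before any AST check can run. One possible line-based fallback is sketched here. The names are hypothetical, and it assumes single-line headers with no trailing comment and 4-space indentation; it is not the fallback used in scripts/master_cleanup.py.

```python
import ast
import re

# A def/async def/class header that ends the line with a colon.
_HEADER = re.compile(r"^(\s*)(?:async\s+def|def|class)\b.*:\s*$")


def parses(source: str) -> bool:
    """True if the source is syntactically valid Python."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


def add_pass_to_empty_bodies(source: str) -> str:
    """Insert `pass` after headers that are not followed by an indented body."""
    lines = source.splitlines()
    result: list[str] = []
    for index, line in enumerate(lines):
        result.append(line)
        header = _HEADER.match(line)
        if header is None:
            continue
        indent = header.group(1)
        # Next line that is neither blank nor a comment, if any.
        following = next(
            (l for l in lines[index + 1:] if l.strip() and not l.lstrip().startswith("#")),
            None,
        )
        if following is None or len(following) - len(following.lstrip()) <= len(indent):
            result.append(indent + "    pass")
    return "\n".join(result)
```

A caller could try `parses(source)` first and only run the line-based pass when it returns False.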

Related Issues

Part of the master cleanup script development (#2)

@ZeroSumQuant
Owner Author

CI Analysis

The CI failures are due to pre-existing issues in the cake/ directory that are unrelated to this PR:

Security Issues (Bandit)

  • 4 HIGH severity issues (pickle usage, MD5 hashes, eval)
  • 8 MEDIUM severity issues
  • 40 LOW severity issues

All in files under cake/:

  • cake/adapters/claude_orchestration.py: pickle.load() calls
  • cake/components/adaptive_confidence_engine.py: pickle.load()
  • cake/components/semantic_error_classifier.py: pickle.load()
  • cake/utils/cross_task_knowledge_ledger.py: MD5 usage, hardcoded /tmp/
  • cake/utils/info_fetcher.py: MD5 usage
  • cake/utils/rule_creator.py: MD5 usage, eval() call

Linting Issues

The lint-and-test job is failing on the CAKE linting suite check for the cake/ directory.

This PR's Changes

This PR only modifies:

  • scripts/master_cleanup.py - Added Phase-3 implementations
  • tests/test_phase3.py - Added new test file

Neither of these files is in the cake/ directory, where the CI issues occur.

Recommendation

Since these are pre-existing issues unrelated to the Phase-3 implementation, we should:

  1. Merge this PR, since the Phase-3 implementation is complete and all tests pass locally
  2. Create a separate issue/PR to address the pre-existing security concerns in the cake/ directory

… green)

- Implement fix_imports() with comma-split and deduplication
- Preserve original order within stdlib/third-party groups
- Plain imports come before aliased imports
- Comments (# noqa) preserved on correct lines
- Fix ast_empty_body_sweep() to handle nested functions
- Use AST end_lineno for precise pass insertion
- Add fallback for unparseable empty functions
- All 17 Phase 3 tests passing
@ZeroSumQuant force-pushed the feature/phase3-import-docstring branch from 482c190 to 85e8b20 on June 5, 2025 00:51
@ZeroSumQuant
Owner Author

✅ All CI Checks Now Passing!

After rebasing onto the updated main (with hardening changes from PR #45), all CI checks are green:

  • ✅ lint-and-test (3.10) - passing
  • ✅ lint-and-test (3.11) - passing
  • ✅ security-check - passing
  • ✅ code-quality - passing
  • ✅ validate-docs - passing

Phase-3 Implementation Complete

All 17 tests pass locally and in CI. The PR is ready for final review and merge.

Next Steps:

  1. Squash & merge this PR
  2. Tag as v2025-06-05-phase-3
  3. Optional: Run black/isort on the cake/ directory in a future hardening pass

@ZeroSumQuant merged commit 04d19fb into main on Jun 5, 2025
5 checks passed