Commit 6af5854

emeryberger and claude authored
Refactor scalene_profiler.py into modular components (#967)
* Refactor scalene_profiler.py into modular components

  Extract separation of concerns from the main profiler:
  - ScaleneCPUProfiler: CPU sample processing logic
  - ScaleneTracing: File/function filtering with lru_cache
  - ScaleneLifecycle: Profiler lifecycle management

  Updates:
  - Reduce scalene_profiler.py from ~1885 to ~1584 lines
  - Use Filename type consistently across tracing APIs
  - Update scalene_tracer.py type signatures
  - Document new modules in scalene/README.md

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude Opus 4.5 <[email protected]>

* Fix mypy error for Python 3.9 compatibility

  Add unused-ignore to type comment for IPython.get_ipython() call to handle both Python versions that do and don't flag this.

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude Opus 4.5 <[email protected]>

---------

Co-authored-by: Claude Opus 4.5 <[email protected]>
1 parent 742d53f commit 6af5854

File tree

7 files changed

+795
-366
lines changed

refactoring_todo.md

Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
+# Scalene Profiler Refactoring Plan
+
+## Goal
+Refactor `scalene/scalene_profiler.py` into multiple files with clear separation of concerns.
+
+## Status: ✅ COMPLETE
+
+All verification checks pass:
+- `pytest tests/` - 147 tests passed
+- `mypy scalene` - No issues found
+- `ruff check scalene` - All checks passed
+
+## File Sizes
+
+| File | Lines |
+|------|-------|
+| `scalene_profiler.py` | 1,584 (was 1,885) |
+| `scalene_cpu_profiler.py` | 228 (new) |
+| `scalene_tracing.py` | 225 (new) |
+| `scalene_lifecycle.py` | 198 (new) |
+
+**Net reduction**: ~300 lines from main profiler, with reusable logic extracted
+
+## New Modules Created
+
+### 1. `scalene_cpu_profiler.py`
+- **Class**: `ScaleneCPUProfiler`
+- **Purpose**: CPU profiling sample processing
+- **Key methods**:
+  - `process_cpu_sample` - Main CPU sample handler
+  - `_update_main_thread_stats` - Main thread statistics
+  - `_update_thread_stats` - Other thread statistics
+
+### 2. `scalene_tracing.py`
+- **Class**: `ScaleneTracing`
+- **Purpose**: Tracing decisions and file filtering with `lru_cache`
+- **Key methods**:
+  - `should_trace` - Main entry point (cached)
+  - `_passes_exclusion_rules` - Library exclusions
+  - `_should_trace_by_location` - Path-based filtering
+  - `_is_system_library` - System library detection
+
+### 3. `scalene_lifecycle.py`
+- **Class**: `ScaleneLifecycle`
+- **Purpose**: Profiler lifecycle management (prepared for future use)
+
+## Architecture
+
+```
+scalene_profiler.py (Scalene class)
+├── ScaleneCPUProfiler (CPU sample processing)
+├── ScaleneTracing (file/function filtering)
+├── ScaleneMemoryProfiler (already existed)
+└── ScaleneSignalManager (already existed)
+```
+
+## Completed Tasks
+
+- [x] Extracted CPU profiling logic (~150 lines)
+- [x] Extracted tracing/filtering logic (~150 lines)
+- [x] Created lifecycle module for future use
+- [x] Updated type signatures to use `Filename` consistently
+- [x] Applied proper `lru_cache` usage
+- [x] All tests passing (147/147)
+- [x] Type checking passing (mypy)
+- [x] Linting passing (ruff)
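
The plan above lists `should_trace` as the cached entry point for tracing decisions. A minimal sketch of that idea follows, assuming a module-level function and an illustrative exclusion list. Both are hypothetical; the real rules live in `scalene_tracing.py`, which this part of the diff does not show.

```python
from functools import lru_cache

# Hypothetical exclusion fragments, for illustration only.
EXCLUDED_FRAGMENTS = ("site-packages", "<frozen", "/lib/python")


@lru_cache(maxsize=None)
def should_trace(filename: str, func: str) -> bool:
    """Return True if samples from this file/function should be attributed."""
    if not filename:
        return False
    # Reject anything that looks like library or interpreter-internal code.
    if any(fragment in filename for fragment in EXCLUDED_FRAGMENTS):
        return False
    return True


first = should_trace("/home/user/app.py", "main")
second = should_trace("/home/user/app.py", "main")  # answered from the cache
library = should_trace("/usr/lib/python3.12/site-packages/numpy/core.py", "dot")
info = should_trace.cache_info()
```

Because `lru_cache` keys on the call arguments, every argument must be hashable, and a cached answer stays in place until `should_trace.cache_clear()` is called. That matters if the filtering rules can change while the profiler is running.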

scalene/README.md

Lines changed: 14 additions & 2 deletions
@@ -24,24 +24,36 @@
 * `scalene_arguments.py`:
 `ScaleneArguments` holds command-line arguments and their default values.
 
+* `scalene_cpu_profiler.py`:
+`ScaleneCPUProfiler` handles CPU profiling sample processing, including main thread and background thread statistics collection. Extracted from `scalene_profiler.py` for separation of concerns.
+
 * `scalene_gpu.py`:
 `ScaleneGPU` wraps the NVIDIA library to conveniently provide access to GPU statistics required by Scalene.
 
+* `scalene_lifecycle.py`:
+`ScaleneLifecycle` manages profiler lifecycle operations including start, stop, and signal management.
+
 * `scalene_magics.py`:
 Sets up the "magics" for using Scalene within Jupyter notebooks (`%scrun` and `%%scalene`).
 
+* `scalene_memory_profiler.py`:
+`ScaleneMemoryProfiler` handles memory profiling sample processing for malloc, free, and memcpy operations.
+
 * `scalene_output.py`:
 `ScaleneOutput` encapsulates functions used for generating Scalene's profiles either as text or HTML.
 
 * `scalene_profiler.py`:
-The core of the Scalene profiler.
-
+The core of the Scalene profiler. Coordinates CPU, memory, and GPU profiling by delegating to specialized modules (`ScaleneCPUProfiler`, `ScaleneTracing`, `ScaleneMemoryProfiler`).
+
 * `scalene_signals.py`:
 Defines the Unix signals that Scalene uses (some of which must be kept in sync with `include/sampleheap.hpp`).
 
 * `scalene_statistics.py`:
 Operations for managing the statistics generated by Scalene.
 
+* `scalene_tracing.py`:
+`ScaleneTracing` handles tracing decisions and file filtering. Determines which files and functions should be profiled based on exclusion rules, profile-only patterns, and system library detection. Uses `lru_cache` for performance.
+
 * `scalene_version.py`:
 The version number of Scalene which ultimately is reflected on `pypi` (for `pip` installs, used by `setup.py` in the top level directory).
 
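
The updated README entry describes `scalene_profiler.py` as a coordinator that delegates to specialized modules. The sketch below is a simplified illustration of that composition pattern, not Scalene's actual classes: `Coordinator`, `CpuSampler`, `TraceFilter`, and `Stats` are hypothetical names invented for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Stats:
    """Shared statistics object that every component writes into."""
    cpu_seconds: dict[str, float] = field(default_factory=dict)


class TraceFilter:
    """Decides which files are of interest (hypothetical prefix rule)."""

    def __init__(self, program_path: str) -> None:
        self._program_path = program_path

    def should_trace(self, filename: str) -> bool:
        return filename.startswith(self._program_path)


class CpuSampler:
    """Accumulates CPU time per file into the shared Stats."""

    def __init__(self, stats: Stats) -> None:
        self._stats = stats

    def process_sample(self, filename: str, seconds: float) -> None:
        current = self._stats.cpu_seconds.get(filename, 0.0)
        self._stats.cpu_seconds[filename] = current + seconds


class Coordinator:
    """Owns the components and forwards each sample to them."""

    def __init__(self, program_path: str) -> None:
        self.stats = Stats()
        self._filter = TraceFilter(program_path)
        self._cpu = CpuSampler(self.stats)

    def on_sample(self, filename: str, seconds: float) -> None:
        if self._filter.should_trace(filename):
            self._cpu.process_sample(filename, seconds)


coordinator = Coordinator("/home/user/project")
coordinator.on_sample("/home/user/project/app.py", 0.5)
coordinator.on_sample("/home/user/project/app.py", 0.25)
coordinator.on_sample("/usr/lib/python3/json/decoder.py", 1.0)  # filtered out
```

Passing the shared `Stats` object into each component's constructor matches the shape of `ScaleneCPUProfiler.__init__(stats, available_cpus)` in this commit, where the extracted class updates statistics it does not own.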

scalene/scalene_cpu_profiler.py

Lines changed: 228 additions & 0 deletions
@@ -0,0 +1,228 @@
+"""CPU profiling logic for Scalene."""
+
+from __future__ import annotations
+
+import math
+import threading
+from typing import TYPE_CHECKING, Callable, cast
+
+from scalene.runningstats import RunningStats
+from scalene.scalene_funcutils import ScaleneFuncUtils
+from scalene.scalene_statistics import (
+    ByteCodeIndex,
+    Filename,
+    LineNumber,
+    ScaleneStatistics,
+)
+from scalene.scalene_utility import add_stack, enter_function_meta
+from scalene.time_info import TimeInfo
+
+if TYPE_CHECKING:
+    from types import FrameType
+
+
+class ScaleneCPUProfiler:
+    """Handles CPU profiling sample processing."""
+
+    def __init__(self, stats: ScaleneStatistics, available_cpus: int) -> None:
+        """Initialize the CPU profiler.
+
+        Args:
+            stats: The statistics object to update with CPU samples.
+            available_cpus: Number of available CPUs for utilization calculations.
+        """
+        self._stats = stats
+        self._available_cpus = available_cpus
+
+    def process_cpu_sample(
+        self,
+        new_frames: list[tuple[FrameType, int, FrameType]],
+        now: TimeInfo,
+        gpu_load: float,
+        gpu_mem_used: float,
+        prev: TimeInfo,
+        is_thread_sleeping: dict[int, bool],
+        should_trace: Callable[[Filename, str], bool],
+        last_cpu_interval: float,
+        stacks_enabled: bool,
+    ) -> None:
+        """Handle interrupts for CPU profiling.
+
+        Args:
+            new_frames: List of (frame, thread_id, original_frame) tuples.
+            now: Current time information.
+            gpu_load: Current GPU load (0.0-1.0).
+            gpu_mem_used: Current GPU memory usage.
+            prev: Previous time information.
+            is_thread_sleeping: Dict mapping thread IDs to sleep status.
+            should_trace: Function to check if a file/function should be traced.
+            last_cpu_interval: The last CPU sampling interval.
+            stacks_enabled: Whether stack collection is enabled.
+        """
+        if not new_frames:
+            return
+
+        elapsed = now - prev
+
+        # Skip samples with negative values (can occur in multi-process settings)
+        if any([elapsed.virtual < 0, elapsed.wallclock < 0, elapsed.user < 0]):
+            return
+
+        # Calculate CPU utilization
+        cpu_utilization = 0.0
+        if elapsed.wallclock != 0:
+            cpu_utilization = elapsed.user / elapsed.wallclock
+
+        core_utilization = cpu_utilization / self._available_cpus
+        if cpu_utilization > 1.0:
+            cpu_utilization = 1.0
+            elapsed.wallclock = elapsed.user
+
+        # Handle NaN GPU load
+        if math.isnan(gpu_load):
+            gpu_load = 0.0
+        assert 0.0 <= gpu_load <= 1.0
+
+        gpu_time = gpu_load * elapsed.wallclock
+        self._stats.gpu_stats.total_gpu_samples += gpu_time
+
+        python_time = last_cpu_interval
+        c_time = max(elapsed.virtual - python_time, 0)
+        total_time = python_time + c_time
+
+        # Count non-sleeping frames
+        total_frames = sum(
+            not is_thread_sleeping[tident] for frame, tident, orig_frame in new_frames
+        )
+        if total_frames == 0:
+            total_frames = 1
+
+        normalized_time = total_time / total_frames
+        average_python_time = python_time / total_frames
+        average_c_time = c_time / total_frames
+        average_cpu_time = (python_time + c_time) / total_frames
+
+        # Process main thread
+        main_thread_frame = new_frames[0][0]
+
+        if stacks_enabled:
+            add_stack(
+                main_thread_frame,
+                should_trace,
+                self._stats.stacks,
+                average_python_time,
+                average_c_time,
+                average_cpu_time,
+            )
+
+        enter_function_meta(main_thread_frame, should_trace, self._stats)
+        fname = Filename(main_thread_frame.f_code.co_filename)
+        lineno = LineNumber(main_thread_frame.f_lineno)
+
+        main_tid = cast(int, threading.main_thread().ident)
+        if not is_thread_sleeping[main_tid]:
+            self._update_main_thread_stats(
+                fname,
+                lineno,
+                now,
+                average_python_time,
+                average_c_time,
+                average_cpu_time,
+                cpu_utilization,
+                core_utilization,
+                gpu_load,
+                gpu_mem_used,
+                elapsed,
+            )
+
+        # Process other threads
+        for frame, tident, orig_frame in new_frames:
+            if frame == main_thread_frame:
+                continue
+
+            add_stack(
+                frame,
+                should_trace,
+                self._stats.stacks,
+                average_python_time,
+                average_c_time,
+                average_cpu_time,
+            )
+
+            fname = Filename(frame.f_code.co_filename)
+            lineno = LineNumber(frame.f_lineno)
+            enter_function_meta(frame, should_trace, self._stats)
+
+            if is_thread_sleeping[tident]:
+                continue
+
+            self._update_thread_stats(
+                fname,
+                lineno,
+                orig_frame,
+                normalized_time,
+                cpu_utilization,
+                core_utilization,
+            )
+
+        # Cleanup
+        del new_frames[:]
+        del new_frames
+        del is_thread_sleeping
+        self._stats.cpu_stats.total_cpu_samples += total_time
+
+    def _update_main_thread_stats(
+        self,
+        fname: Filename,
+        lineno: LineNumber,
+        now: TimeInfo,
+        average_python_time: float,
+        average_c_time: float,
+        average_cpu_time: float,
+        cpu_utilization: float,
+        core_utilization: float,
+        gpu_load: float,
+        gpu_mem_used: float,
+        elapsed: TimeInfo,
+    ) -> None:
+        """Update statistics for the main thread."""
+        cpu_stats = self._stats.cpu_stats
+        gpu_stats = self._stats.gpu_stats
+
+        cpu_stats.cpu_samples_list[fname][lineno].append(now.wallclock)
+        cpu_stats.cpu_samples_python[fname][lineno] += average_python_time
+        cpu_stats.cpu_samples_c[fname][lineno] += average_c_time
+        cpu_stats.cpu_samples[fname] += average_cpu_time
+        cpu_stats.cpu_utilization[fname][lineno].push(cpu_utilization)
+        cpu_stats.core_utilization[fname][lineno].push(core_utilization)
+
+        gpu_stats.gpu_samples[fname][lineno] += gpu_load * elapsed.wallclock
+        gpu_stats.n_gpu_samples[fname][lineno] += elapsed.wallclock
+        gpu_stats.gpu_mem_samples[fname][lineno].push(gpu_mem_used)
+
+    def _update_thread_stats(
+        self,
+        fname: Filename,
+        lineno: LineNumber,
+        orig_frame: FrameType,
+        normalized_time: float,
+        cpu_utilization: float,
+        core_utilization: float,
+    ) -> None:
+        """Update statistics for non-main threads."""
+        cpu_stats = self._stats.cpu_stats
+
+        # Check if the original caller is stuck inside a call
+        if ScaleneFuncUtils.is_call_function(
+            orig_frame.f_code,
+            ByteCodeIndex(orig_frame.f_lasti),
+        ):
+            # Attribute time to native
+            cpu_stats.cpu_samples_c[fname][lineno] += normalized_time
+        else:
+            # Attribute time to Python
+            cpu_stats.cpu_samples_python[fname][lineno] += normalized_time
+
+        cpu_stats.cpu_samples[fname] += normalized_time
+        cpu_stats.cpu_utilization[fname][lineno].push(cpu_utilization)
+        cpu_stats.core_utilization[fname][lineno].push(core_utilization)
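
The arithmetic in `process_cpu_sample` is easier to follow with plain numbers. The sketch below mirrors the calculation using floats in place of `TimeInfo`. The function name `split_sample` and the `SampleSplit` container are hypothetical, and the wallclock adjustment applied when utilization exceeds 1.0 is left out for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class SampleSplit:
    average_python_time: float
    average_c_time: float
    cpu_utilization: float
    core_utilization: float


def split_sample(
    elapsed_virtual: float,
    elapsed_user: float,
    elapsed_wallclock: float,
    last_cpu_interval: float,
    sleeping: list[bool],
    available_cpus: int,
) -> SampleSplit:
    """Divide one sampling interval into Python and native time per running thread."""
    cpu_utilization = 0.0
    if elapsed_wallclock != 0:
        cpu_utilization = elapsed_user / elapsed_wallclock
    # Per-core utilization is computed before the clamp, as in the diff.
    core_utilization = cpu_utilization / available_cpus
    cpu_utilization = min(cpu_utilization, 1.0)

    # The requested interval counts as Python time; any excess counts as native time.
    python_time = last_cpu_interval
    c_time = max(elapsed_virtual - python_time, 0.0)

    # Only non-sleeping threads share the time; avoid dividing by zero.
    running = sum(not asleep for asleep in sleeping) or 1
    return SampleSplit(
        average_python_time=python_time / running,
        average_c_time=c_time / running,
        cpu_utilization=cpu_utilization,
        core_utilization=core_utilization,
    )


result = split_sample(
    elapsed_virtual=0.75,
    elapsed_user=0.5,
    elapsed_wallclock=1.0,
    last_cpu_interval=0.25,
    sleeping=[False, True, False],
    available_cpus=4,
)
```

With these inputs, 0.25 s is treated as Python time and the remaining 0.5 s of virtual time as native time. Each is then halved because two of the three threads are awake.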
