DownScaleXR is a controlled architectural study that isolates the effect of early spatial downsampling operators on:
- generalization under noisy supervision
- decision bias (false positives vs false negatives)
- CPU inference latency and stability
The study uses intentionally simple CNNs to prevent representation capacity from masking architectural behavior.
This is not a model performance exercise. This is a mechanistic investigation of inductive bias under real deployment constraints.
This project was driven by three practical realities:
- CPU-only deployment: Many clinical and edge environments cannot rely on GPUs.
- Noisy, limited data: Medical datasets amplify architectural bias.
- Architecture literacy gap: Pooling and strided convolutions are often treated as interchangeable — they are not.
Core research question: How does spatial compression itself shape decision boundaries under limited supervision and CPU constraints?
To keep the study interpretable and controlled:
- A LeNet-style CNN was used to minimize confounding factors.
- Modern architectures (ResNet, MobileNet, EfficientNet) were intentionally avoided.
- Skip connections, depthwise convolutions, and compound scaling dilute the observable effect of early downsampling.
This project isolates downsampling behavior — not representational capacity.
All variants share identical depth, width, and parameter count (~11M).
AvgPool:
- Smooths spatial activations
- Acts as an implicit regularizer
- Produces conservative decision boundaries

MaxPool:
- Amplifies high-activation regions
- Improves recall but increases false positives
- Prone to pathology over-prediction

Strided convolution:
- Learnable downsampling
- Under limited data, collapses to MaxPool-like behavior
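The module definitions are not shown in this README; below is a minimal PyTorch sketch (PyTorch is assumed from the .pt checkpoints; the conv_block helper and 5x5 kernel are illustrative) of how the three variants can differ in a single downsampling block while keeping parameter parity:

```python
import torch.nn as nn

def conv_block(in_ch: int, out_ch: int, kind: str) -> nn.Sequential:
    """One conv stage followed by 2x spatial downsampling.

    For "strided", the convolution itself downsamples (stride=2), so all
    three variants keep an identical parameter count.
    """
    stride = 2 if kind == "strided" else 1
    layers = [nn.Conv2d(in_ch, out_ch, kernel_size=5, stride=stride, padding=2),
              nn.ReLU(inplace=True)]
    if kind == "avgpool":
        layers.append(nn.AvgPool2d(2))   # smooths activations (implicit regularizer)
    elif kind == "maxpool":
        layers.append(nn.MaxPool2d(2))   # keeps only peak activations
    elif kind != "strided":
        raise ValueError(f"unknown downsampling kind: {kind}")
    return nn.Sequential(*layers)

# Parameter parity check: all three variants report the same count.
for kind in ("avgpool", "maxpool", "strided"):
    block = conv_block(1, 6, kind)
    print(kind, sum(p.numel() for p in block.parameters()))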
Study objectives:
- Quantify performance vs bias trade-offs
- Measure real CPU latency and throughput
- Examine generalization gaps under noise
- Track everything via MLflow + DagsHub
```
DownScaleXR/
├─ configs/          # YAML-driven experiment configuration
├─ data/             # Raw and preprocessed CXR data
├─ model/            # Best checkpoints per variant
├─ artifacts/        # Metrics, plots, and inference visualizations
├─ notebooks/        # MLflow analysis & comparison
├─ scripts/          # Entry points and preprocessing
├─ src/              # Core training, models, experiments
├─ requirements.txt
└─ README.md
```
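The YAML schema in `configs/` is not shown in this README; as a hedged sketch, loading a hypothetical `configs/lenet_maxpool.yaml` might look like this (the file name and keys are assumptions):

```python
import yaml  # PyYAML

# Hypothetical file name and keys; the real schema lives in configs/.
with open("configs/lenet_maxpool.yaml") as f:
    cfg = yaml.safe_load(f)

downsampling = cfg["model"]["downsampling"]  # e.g. "maxpool", "avgpool", "strided"
learning_rate = cfg["train"]["lr"]
```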
Three LeNet variants were trained on the same chest X-ray dataset with identical hyperparameters to evaluate how different downsampling strategies affect performance and efficiency.
| Model Name | Downsampling | Val AUC | Val F1 | Val Precision | Val Recall | Val Accuracy | Train Accuracy | Inference Time (ms) | Throughput (FPS) | Parameters | Model Size (MB) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| lenet_strided | strided | 0.895 | 0.820 | 0.697 | 0.997 | 0.727 | 0.989 | 50.77 | 608.65 | 11.4 M | 43.47 |
| lenet_avgpool | avgpool | 0.890 | 0.814 | 0.688 | 0.997 | 0.715 | 0.980 | 78.20 | 395.13 | 11.4 M | 43.47 |
| lenet_maxpool | maxpool | 0.854 | 0.837 | 0.723 | 0.992 | 0.757 | 0.996 | 168.70 | 183.17 | 11.4 M | 43.47 |
Observation:
All models have roughly the same parameter count and model size (~11M params, 43 MB). Differences arise primarily from downsampling strategy, impacting inference speed, throughput, and class-specific decision biases.
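The validation metrics above are standard scikit-learn quantities; here is a minimal sketch of how they can be computed, assuming binary labels and sigmoid scores (this README does not show the evaluation code):

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, roc_auc_score)

def classification_metrics(y_true: np.ndarray, y_score: np.ndarray,
                           threshold: float = 0.5) -> dict:
    """Compute the metrics reported in the comparison table."""
    y_pred = (y_score >= threshold).astype(int)
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred),
        "f1_score": f1_score(y_true, y_pred),
        "auc": roc_auc_score(y_true, y_score),  # uses raw scores, not thresholded
    }
```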
The test set predictions highlight differences in model behavior based on downsampling strategy:
| Model | Confusion Matrix |
|---|---|
| lenet_avgpool | ![]() |
| lenet_maxpool | ![]() |
| lenet_strided | ![]() |
Insights:
- AvgPool: Balanced errors; moderate false positives and false negatives. Conservative decision boundaries.
- MaxPool: High recall for pneumonia but over-predicts pathology. Bias toward positive class.
- Strided Conv: Behavior similar to MaxPool; collapses to the same decision bias on limited data.
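The confusion matrices themselves are not embedded in this copy; they can be regenerated along these lines with scikit-learn and matplotlib (the class labels and output path under `artifacts/inference/` are assumptions):

```python
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay

def save_confusion_matrix(y_true, y_pred, variant: str) -> None:
    """Plot and save a test-set confusion matrix for one model variant."""
    disp = ConfusionMatrixDisplay.from_predictions(
        y_true, y_pred, display_labels=["normal", "pneumonia"])
    disp.ax_.set_title(f"lenet_{variant} test confusion matrix")
    plt.savefig(f"artifacts/inference/confusion_{variant}.png", bbox_inches="tight")
    plt.close()
```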
CPU latency findings:
- FLOPs do not correlate with wall-clock latency on CPU.
- Memory access patterns and operator behavior dominate runtime.
- All models have the same parameter count and model size.
- Observed differences arise solely from downsampling behavior.

Per-variant summary:
- AvgPool → Best stability–accuracy balance
- MaxPool → Highest F1, worst latency
- Strided Conv → Fastest throughput, unstable bias
CPU realism exposes architectural costs often hidden by theoretical efficiency metrics.
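The benchmarking code is not included in this README; here is a minimal sketch of how CPU latency and throughput might be measured for a checkpoint (batch size, warm-up count, and input shape are assumptions):

```python
import time
import torch

def benchmark_cpu(model: torch.nn.Module, batch: torch.Tensor,
                  warmup: int = 10, iters: int = 50) -> tuple[float, float]:
    """Return (mean latency per batch in ms, throughput in images/s)."""
    model.eval()
    with torch.no_grad():
        for _ in range(warmup):                 # warm-up stabilizes threads/caches
            model(batch)
        start = time.perf_counter()
        for _ in range(iters):
            model(batch)
        elapsed = time.perf_counter() - start
    latency_ms = elapsed / iters * 1000.0
    throughput_fps = batch.shape[0] * iters / elapsed
    return latency_ms, throughput_fps

# Hypothetical usage: a saved checkpoint and a 224x224 grayscale CXR batch.
# model = torch.load("model/lenet_strided/best_model.pt", map_location="cpu")
# print(benchmark_cpu(model, torch.randn(32, 1, 224, 224)))
```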
Key comparative plots are saved under `artifacts/comparision/`.

Key takeaways:
- Downsampling is a bias control mechanism, not a neutral operation
- Small datasets amplify pooling-induced decision bias
- Strided convolution is not inherently superior under limited data
- CPU deployment reshuffles architectural trade-offs
All experiments were logged to DagsHub MLflow to ensure reproducibility, allow easy comparison, and facilitate structured analysis.
- Tracking URI: https://dagshub.com/Y-R-A-V-R-5/DownScaleXR.mlflow
- Experiment Name: DownScaleXR
- Purpose:
  - Store all metrics (train/validation), parameters, and tags
  - Log artifacts such as plots and model checkpoints
  - Enable side-by-side comparisons of different downsampling strategies
- Minimal Usage:

```python
import mlflow

# Set tracking URI
mlflow.set_tracking_uri("https://dagshub.com/Y-R-A-V-R-5/DownScaleXR.mlflow")

# Select experiment
mlflow.set_experiment("DownScaleXR")
```

Logged artifacts:
- Model checkpoints: `model/<variant>/best_model.pt`
- Plots: `artifacts/comparision/*.png` and `artifacts/inference/*.png`
- Configuration files: `configs/*.yaml`
Logged metrics:
- Performance: `accuracy`, `precision`, `recall`, `f1_score`, `auc`
- Efficiency: `inference_time_ms`, `throughput_fps`, `model_parameters`, `model_size_mb`
- Tracking: metrics logged per epoch for both training and validation
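The training loop is not reproduced here; below is a minimal sketch of per-epoch logging against the same tracking server (run name, parameters, and placeholder metric values are illustrative; DagsHub typically requires authentication):

```python
import mlflow

mlflow.set_tracking_uri("https://dagshub.com/Y-R-A-V-R-5/DownScaleXR.mlflow")
mlflow.set_experiment("DownScaleXR")

with mlflow.start_run(run_name="lenet_maxpool"):
    mlflow.log_params({"downsampling": "maxpool", "lr": 1e-3})  # illustrative params
    for epoch in range(3):
        # In the real loop these values come from training/validation passes;
        # placeholder numbers keep the sketch runnable.
        mlflow.log_metric("train_accuracy", 0.90 + 0.02 * epoch, step=epoch)
        mlflow.log_metric("val_accuracy", 0.70 + 0.01 * epoch, step=epoch)
    mlflow.log_metric("inference_time_ms", 168.70)  # efficiency metrics logged once
```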
Using MLflow + DagsHub ensures reproducibility, enables easy experiment comparisons, and provides structured logging of both performance and efficiency metrics.
This work demonstrates:
- Constraint-first thinking
- Architectural literacy beyond plug-and-play models
- Ability to isolate variables and reason about bias
- CPU-realistic performance evaluation
- Reproducible, inspectable R&D workflow