Skip to content

This project demonstrates how to build and train a neural network to classify clothing items from the Fashion MNIST dataset using TensorFlow.

Notifications You must be signed in to change notification settings

mslawsky/fashion-classification-with-tensorflow

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

21 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

Fashion MNIST Classification with TensorFlow πŸ‘•

TensorFlow Python Keras NumPy Matplotlib

A neural network implementation using TensorFlow to classify fashion items from the Fashion MNIST dataset. This project demonstrates image classification fundamentals including data preprocessing, model building, training, and evaluation.

Fashion MNIST Examples


Table of Contents πŸ“‹


Project Overview πŸ”Ž

This project builds a neural network model to recognize and classify clothing items from grayscale images. Unlike traditional "Hello World" examples that learn simple linear relationships, this project tackles a more challenging computer vision problem that showcases the power of neural networks in image recognition tasks.

Key Objectives:

  • Load and preprocess the Fashion MNIST dataset
  • Build and train a neural network classification model
  • Visualize and understand the training process
  • Evaluate model performance on unseen data
  • Experiment with different model architectures and parameters

Dataset Details πŸ“Š

The Fashion MNIST dataset includes 70,000 grayscale images of clothing items (28x28 pixels):

  • 60,000 training images
  • 10,000 test images

Each image is labeled with one of 10 clothing categories:

Label Description Example
0 T-shirt/top T-shirt
1 Trouser Trouser
2 Pullover Pullover
3 Dress Dress
4 Coat Coat
5 Sandal Sandal
6 Shirt Shirt
7 Sneaker Sneaker
8 Bag Bag
9 Ankle boot Ankle boot

Data Preprocessing:

  • Images are normalized from 0-255 pixel values to 0-1 range
  • Labels are represented as integers from 0-9

Model Architecture 🧠

The project explores two neural network architectures: a simple dense network and a more advanced convolutional neural network (CNN).

Basic Dense Network

Our baseline model uses a straightforward architecture:

model = tf.keras.models.Sequential([
    tf.keras.Input(shape=(28,28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation=tf.nn.relu),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])

model.compile(optimizer=tf.optimizers.Adam(),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

Architecture Breakdown:

  • Input Layer: Accepts 28x28 grayscale images
  • Flatten Layer: Converts 2D image arrays (28x28) to 1D arrays (784)
  • Hidden Layer: 128 neurons with ReLU activation
  • Output Layer: 10 neurons (one per clothing category) with Softmax activation
  • Optimizer: Adam (adaptive learning rate)
  • Loss Function: Sparse Categorical Crossentropy

Convolutional Neural Network

For improved accuracy, we implemented a CNN architecture:

model_cnn = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(64, (3,3), activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

CNN Architecture Breakdown:

  • First Conv2D Layer: 64 filters with 3x3 kernels, ReLU activation
  • First MaxPooling Layer: 2x2 pooling, reducing spatial dimensions by half
  • Second Conv2D Layer: 64 filters with 3x3 kernels, ReLU activation
  • Second MaxPooling Layer: Further dimension reduction
  • Flatten Layer: Converts feature maps to 1D array
  • Dense Hidden Layer: 128 neurons with ReLU activation
  • Output Layer: 10 neurons with Softmax activation

The CNN architecture excels at image classification by learning hierarchical features directly from the pixel data.


Training Process πŸ”„

The model is trained for 5 epochs using the prepared dataset:

# Train the model
history = model.fit(training_images, training_labels, epochs=5)

# Evaluate on test data
test_loss, test_accuracy = model.evaluate(test_images, test_labels)
print(f"Test accuracy: {test_accuracy:.4f}")

Training Visualization:

Training Accuracy Curve

The graph shows steady improvement in accuracy across the training epochs, with the model quickly learning to distinguish between different clothing items.


Callbacks Implementation πŸ”„

Callbacks provide a powerful way to customize the training process by executing code at specific points during training. They can monitor metrics, stop training early, adjust learning rates, and more.

Custom Accuracy Threshold Callback

class AccuracyThresholdCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs={}):
        if logs.get('accuracy') >= 0.98:
            self.model.stop_training = True
            print("\nReached 98% accuracy - stopping training!")

Common Callback Use Cases:

  1. Early Stopping: Stop training when a specified accuracy threshold is reached
  2. Model Checkpointing: Save the model at regular intervals or when improvements occur
    checkpoint_cb = tf.keras.callbacks.ModelCheckpoint(
        'fashion_mnist_model.h5', 
        save_best_only=True
    )
  3. Learning Rate Scheduling: Adjust learning rate during training for better convergence
    lr_scheduler = tf.keras.callbacks.ReduceLROnPlateau(
        factor=0.5,
        patience=3
    )
  4. TensorBoard Integration: Visualize training metrics in real-time
    tensorboard_cb = tf.keras.callbacks.TensorBoard(
        log_dir='./logs'
    )
  5. Custom Metrics Logging: Track and record specific metrics during training

Implementation Example:

# Create callback instances
accuracy_cb = AccuracyThresholdCallback()
checkpoint_cb = tf.keras.callbacks.ModelCheckpoint('fashion_mnist_model.h5')

# Use in model training
history = model.fit(
    training_images, 
    training_labels,
    epochs=10,
    callbacks=[accuracy_cb, checkpoint_cb]
)

This approach improves efficiency by preventing unnecessary training iterations once desired performance is reached, saving computational resources and time. Callbacks also enable automated model saving, which helps preserve the best-performing model versions throughout the training process.


Results πŸ“ˆ

After training for just 5 epochs, the model achieves impressive results:

Metric Training Set Test Set
Accuracy ~83% ~82%
Loss ~0.48 ~0.50

Classification Visualization:

For an ankle boot image (label 9), the model outputs probability scores:

[1.0767830e-06 1.8923657e-07 9.3867056e-06 1.4331826e-05 3.1927171e-05
 1.6217418e-01 1.6793387e-05 2.9690662e-01 4.1863704e-03 5.3665912e-01]

The highest probability (0.536) correctly corresponds to class 9 (ankle boot).


Improving MNIST with Convolutions πŸš€

Building on our work with Fashion MNIST, we've applied convolutional neural networks to the classic MNIST handwritten digits dataset to achieve significantly higher accuracy with minimal architecture changes.

Challenge Objectives

  • Reach 99.5% accuracy on MNIST using a minimal CNN architecture
  • Achieve this performance in less than 10 epochs
  • Implement an early stopping mechanism to halt training once target accuracy is reached

Preprocessing Approach

Similar to our Fashion MNIST implementation, we prepare the MNIST data through two key steps:

  1. Reshaping: Add an extra dimension to the image data (28Γ—28β†’28Γ—28Γ—1) to accommodate the channel dimension used by convolutional layers
  2. Normalization: Scale pixel values from 0-255 to 0-1 range for more effective training
def reshape_and_normalize(images):
    # Reshape to add the channel dimension
    images = images.reshape(images.shape[0], images.shape[1], images.shape[2], 1)
    
    # Normalize pixel values
    images = images / 255.0
    
    return images

Custom Callback Implementation

To efficiently monitor training progress and stop when we reach our accuracy target, we implemented a custom callback:

class EarlyStoppingCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs={}):
        if logs.get('accuracy') >= 0.995:
            self.model.stop_training = True
            print("\nReached 99.5% accuracy so cancelling training!")

This callback checks the model's accuracy after each epoch and automatically halts training when we reach our target, saving computational resources.

Optimal CNN Architecture

Our experiments showed that a surprisingly minimal CNN architecture could achieve the 99.5% accuracy target:

model = tf.keras.models.Sequential([
    # Convolutional layer with 32 filters
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    # Pooling layer to reduce spatial dimensions
    tf.keras.layers.MaxPooling2D(2, 2),
    # Flatten layer to connect to dense layers
    tf.keras.layers.Flatten(),
    # Dense hidden layer
    tf.keras.layers.Dense(128, activation='relu'),
    # Output layer (10 digits)
    tf.keras.layers.Dense(10, activation='softmax')
])

Performance Comparison

Model Architecture Test Accuracy Epochs to 99.5% Parameters
Dense-only (baseline) ~98.2% N/A (max: 98.2%) 101,770
Single Conv + MaxPool >99.5% 5-7 93,322

Key Findings

  • Adding just one convolutional layer dramatically improved accuracy compared to dense-only networks
  • The architecture achieved >99.5% accuracy in approximately 5-7 epochs, well within our target
  • MaxPooling proved essential for efficient feature extraction while keeping parameter count manageable
  • The model is relatively lightweight while achieving state-of-the-art performance on this dataset

Feature Visualization

MNIST Convolution Visualization

Example visualization of convolutional layer activations for MNIST digits (visualization code not included in the assignment)

This experiment demonstrates the power of even simple convolutional architectures for image classification tasks, achieving near-perfect accuracy with minimal computational resources.


Installation & Usage πŸš€

Prerequisites

  • Python 3.6+
  • TensorFlow 2.x
  • NumPy
  • Matplotlib

Setup

# Clone this repository
git clone https://github.com/yourusername/fashion-mnist-classification.git

# Navigate to the project directory
cd fashion-mnist-classification

# Install dependencies
pip install tensorflow numpy matplotlib

Running the Notebook

jupyter notebook C1_W2_Lab_1_beyond_hello_world.ipynb

Example Code

# Load the Fashion MNIST dataset
fmnist = tf.keras.datasets.fashion_mnist
(training_images, training_labels), (test_images, test_labels) = fmnist.load_data()

# Normalize the images
training_images = training_images / 255.0
test_images = test_images / 255.0

# Build the model
model = tf.keras.models.Sequential([
    tf.keras.Input(shape=(28,28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation=tf.nn.relu),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])

# Compile the model
model.compile(optimizer=tf.optimizers.Adam(),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Create callback
accuracy_callback = AccuracyThresholdCallback()

# Train the model with callback
model.fit(training_images, training_labels, epochs=5, callbacks=[accuracy_callback])

# Make predictions
predictions = model.predict(test_images)

Exploration Exercises 🌎

The notebook includes several exercises to deepen your understanding:

  1. Neuron Count Experiments: Test different numbers of neurons in the hidden layer

    • Results show that increasing from 128 to 512 neurons improves accuracy but increases training time
  2. Layer Structure: Explore the impact of adding or removing layers

    • Adding a second hidden layer can capture more complex patterns but may require more training time
  3. Training Duration: Analyze the effect of training for more or fewer epochs

    • Training beyond 5-10 epochs shows diminishing returns and potential overfitting
  4. Early Stopping: Implement callbacks to stop training when desired accuracy is reached

    class myCallback(tf.keras.callbacks.Callback):
        def on_epoch_end(self, epoch, logs={}):
            if(logs.get('accuracy') >= 0.85):
                print("\nReached 85% accuracy - stopping training!")
                self.model.stop_training = True

Key Learnings πŸ”Ž

This project demonstrates several essential concepts in neural network development:

  1. Image Preprocessing: Normalizing pixel values for optimal training
  2. Activation Functions: Using ReLU for hidden layers and Softmax for multi-class output
  3. Model Evaluation: Distinguishing between training and test performance
  4. Overfitting: Recognizing when a model performs better on training than test data
  5. TensorFlow/Keras API: Working with Sequential models and configuring training
  6. Callback System: Customizing training behavior with callback functions
  7. Convolutional Neural Networks: Understanding how convolutions and pooling extract spatial features from images
  8. Feature Visualization: Interpreting model behavior by visualizing activations of internal layers
  9. Architecture Experimentation: Observing how changes in model structure affect performance and efficiency

Convolutions and Pooling βš™οΈ

Convolutional Neural Networks (CNNs) greatly improve image classification performance by learning spatial hierarchies of features through convolutional and pooling operations.

How Convolutions Work

Convolutions scan an input image with small filters (typically 3x3) to extract features:

Input Image β†’ Conv2D β†’ Feature Maps β†’ MaxPooling β†’ Reduced Feature Maps β†’ ...

Convolution Process Image adapted from Sumit Saha's "A Comprehensive Guide to Convolutional Neural Networks - the ELI5 way".

Each convolutional layer learns to detect different features:

  • First layers: Edges, corners, simple textures
  • Later layers: More complex patterns like fabric textures, clothing shapes

Visualization of Activations

We can visualize how the network "sees" different clothing items by examining the activations of convolutional layers:

CNN Activations

The above visualization shows how three different shoe images activate various filters in our convolutional layers. Notice how similar patterns emerge despite differences in the original images.

Performance Comparison

Experimenting with different CNN architectures showed significant improvements over the baseline model:

Model Architecture Test Accuracy Test Loss Parameters Training Time
Baseline (Dense) 87.3% 0.348 101,770 10s/epoch
CNN (64 filters) 90.1% 0.264 243,786 21s/epoch
CNN (32 filters) 89.2% 0.296 62,826 15s/epoch
Single Conv Layer 88.5% 0.323 110,218 13s/epoch
Triple Conv Layers 91.3% 0.244 294,922 24s/epoch

Key findings:

  • Adding convolutions improved accuracy by ~3-4%
  • Increasing filter count provided diminishing returns
  • Deeper networks (3+ conv layers) showed minor improvements but increased training time
  • The sweet spot was 2 convolutional layers with 64 filters each

These experiments demonstrate how convolutional architectures can effectively extract spatial features from image data, leading to better classification performance.


Contact πŸ“«

For inquiries about this analysis:


Β© 2025 Melissa Slawsky. All Rights Reserved.

About

This project demonstrates how to build and train a neural network to classify clothing items from the Fashion MNIST dataset using TensorFlow.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published