Learning objectives
- Build a complete neural network pipeline from data loading to evaluation using only NumPy
- Implement forward pass, cross-entropy loss, backpropagation, and SGD in sequence
- Track and interpret a training loss curve
- Connect this pipeline to the DQN training pattern
Concept and real-world motivation
This mini-project combines everything from the DL Foundations section. You will build a 2-layer MLP to classify handwritten digits — the same pipeline used in DQN: input → hidden layers → output. The input is a flattened image (pixel values), the hidden layers extract features, and the output layer predicts a class (or in DQN, a Q-value per action).
We use sklearn’s digits dataset — 1797 samples of 8×8 = 64-pixel images of digits 0–9. We take the first 100 samples to keep computation fast in the browser.
Step 1 — Prepare data
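A minimal sketch of the data setup, assuming sklearn is available. The 80/20 split and the `/16.0` pixel scaling are choices made here (the split sizes match the hints below); variable names are illustrative:

```python
from sklearn.datasets import load_digits

# Load the 8x8 digit images: 1797 samples, 64 pixel features each
digits = load_digits()
X, y = digits.data[:100], digits.target[:100]   # first 100 samples for speed

X = X / 16.0                                    # scale pixels from [0, 16] to [0, 1]

# 80 training / 20 test samples
X_train, y_train = X[:80], y[:80]
X_test, y_test = X[80:], y[80:]
```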
Step 2 — Initialize the MLP
Architecture: 64 → 32 → 10 (input features → hidden → output classes)
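One way to initialize this architecture with small Gaussian weights (the 0.1 scale and the seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)                 # fixed seed for reproducibility

# Small random weights break symmetry between units; biases start at zero
W1 = rng.normal(0.0, 0.1, size=(64, 32))       # input -> hidden
b1 = np.zeros(32)
W2 = rng.normal(0.0, 0.1, size=(32, 10))       # hidden -> output
b2 = np.zeros(10)
```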
Step 3 — Training loop
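A sketch of the training loop: forward pass, cross-entropy loss, backpropagation, and SGD, in that order. The setup lines repeat Steps 1–2 so the cell runs on its own; this version uses full-batch gradient descent rather than minibatches:

```python
import numpy as np
from sklearn.datasets import load_digits

# Setup from Steps 1-2, repeated so this cell stands alone
digits = load_digits()
X_train = digits.data[:80] / 16.0
y_train = digits.target[:80]

rng = np.random.default_rng(0)
W1, b1 = rng.normal(0.0, 0.1, (64, 32)), np.zeros(32)
W2, b2 = rng.normal(0.0, 0.1, (32, 10)), np.zeros(10)

def softmax(z):
    # subtract the row max for numerical stability; normalize along axis=1
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

losses = []
lr = 0.1
n = len(X_train)
for epoch in range(200):
    # Forward pass: ReLU hidden layer, softmax output
    h = np.maximum(0.0, X_train @ W1 + b1)
    probs = softmax(h @ W2 + b2)

    # Cross-entropy loss on the true classes
    loss = -np.mean(np.log(probs[np.arange(n), y_train] + 1e-12))
    losses.append(loss)

    # Backprop: d(loss)/d(logits) = (probs - one_hot(y)) / n
    d_logits = probs.copy()
    d_logits[np.arange(n), y_train] -= 1.0
    d_logits /= n

    dW2 = h.T @ d_logits
    db2 = d_logits.sum(axis=0)
    d_h = d_logits @ W2.T
    d_h[h <= 0] = 0.0                 # ReLU gradient mask
    dW1 = X_train.T @ d_h
    db1 = d_h.sum(axis=0)

    # SGD update
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```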
Step 4 — Plot loss curve
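Plotting with matplotlib (assumed available). The `losses` list is the one collected in Step 3; the synthetic placeholder curve below is included only so this cell runs in isolation:

```python
import matplotlib
matplotlib.use("Agg")                     # headless backend; omit in a notebook
import matplotlib.pyplot as plt

# Placeholder stand-in for the `losses` list from Step 3
losses = [2.3 * (0.98 ** i) for i in range(200)]

fig, ax = plt.subplots()
ax.plot(losses)
ax.set_xlabel("epoch")
ax.set_ylabel("cross-entropy loss")
ax.set_title("Training loss")
fig.savefig("loss_curve.png")
```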
Step 5 — Evaluate on test set
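An evaluation sketch: the predicted class is the argmax over the 10 output scores, and accuracy is the fraction of predictions matching `y_test`. The compact retraining block repeats Steps 1–3 so this cell stands alone:

```python
import numpy as np
from sklearn.datasets import load_digits

# Re-create the trained state (Steps 1-3, condensed)
digits = load_digits()
X, y = digits.data[:100] / 16.0, digits.target[:100]
X_train, y_train, X_test, y_test = X[:80], y[:80], X[80:], y[80:]

rng = np.random.default_rng(0)
W1, b1 = rng.normal(0.0, 0.1, (64, 32)), np.zeros(32)
W2, b2 = rng.normal(0.0, 0.1, (32, 10)), np.zeros(10)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

for epoch in range(200):                     # same loop as Step 3
    h = np.maximum(0.0, X_train @ W1 + b1)
    d = softmax(h @ W2 + b2)
    d[np.arange(80), y_train] -= 1.0
    d /= 80
    d_h = d @ W2.T
    d_h[h <= 0] = 0.0
    W2 -= 0.1 * (h.T @ d); b2 -= 0.1 * d.sum(axis=0)
    W1 -= 0.1 * (X_train.T @ d_h); b1 -= 0.1 * d_h.sum(axis=0)

# Evaluate: predicted class = argmax over the 10 output scores
h_test = np.maximum(0.0, X_test @ W1 + b1)
preds = np.argmax(h_test @ W2 + b2, axis=1)
acc = np.mean(preds == y_test)
print(f"test accuracy: {acc:.2f}")
```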
Debug exercise: Fix the softmax that doesn’t sum to 1 (missing normalization):
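For reference, the broken version might look like this (a hypothetical snippet for the exercise): the exponentials are shifted for stability but never divided by their sum.

```python
import numpy as np

def softmax_buggy(z):
    # BUG: exponentiates (with a stability shift) but never normalizes,
    # so the rows do not sum to 1
    return np.exp(z - z.max(axis=1, keepdims=True))

row_sums = softmax_buggy(np.array([[1.0, 2.0, 3.0]])).sum(axis=1)
print(row_sums)   # not equal to 1
```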
Professor’s hints
- On only 80 training samples, the network can memorize the data. Watch the loss curve — if it goes to near-zero, the model is overfitting on this tiny dataset.
- With `lr=0.1` and 200 epochs you should see clear learning. If loss barely moves, try `lr=0.5`.
- The test accuracy with 100 samples and a simple MLP will be modest (~50–70%) — this is expected. With all 1797 samples, it reaches ~95%.
Common pitfalls
- Running the evaluation cell without first running the training cell (weights won’t be trained).
- Using the wrong axis in softmax: use `axis=1` for batches (rows are samples), not `axis=0`.
Worked solution comparison with PyTorch
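For comparison, a sketch of the same pipeline in PyTorch (assuming torch is installed; random tensors stand in for the digits data here). `nn.CrossEntropyLoss` fuses softmax and cross-entropy, so the model outputs raw logits, and autograd replaces the hand-written backprop:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Same 64 -> 32 -> 10 architecture as the NumPy version
model = nn.Sequential(
    nn.Linear(64, 32),
    nn.ReLU(),
    nn.Linear(32, 10),
)
loss_fn = nn.CrossEntropyLoss()                  # softmax + cross-entropy in one
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

X = torch.rand(80, 64)                           # stand-in for the scaled pixels
y = torch.randint(0, 10, (80,))                  # stand-in labels

for epoch in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)                  # forward pass + loss
    loss.backward()                              # autograd computes all gradients
    optimizer.step()                             # SGD update
```

Line for line, `loss.backward()` replaces the manual gradient derivation and `optimizer.step()` replaces the weight-update lines of Step 3.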
Extra practice
Warm-up: Run only Step 1. Print the pixel values of the first training sample. Reshape it to 8×8 and print.
Coding: Add L2 regularization (lambda=0.01) to the training loop in Step 3. Does the test accuracy improve?
Challenge: Scale to all 1797 samples. Add a third hidden layer (64→128→64→10). What test accuracy do you achieve?
Variant: Replace SGD with a hand-coded Adam optimizer in the training loop. Compare convergence speed.
Debug: Modify Step 3 to introduce a bug: divide by `n_classes` instead of `len(Xb)` in the gradient. Observe how training is affected.
Conceptual: How does this digits classifier pipeline compare to DQN? Map: input → state, hidden layers → feature extraction, output → Q-values/actions.
Recall: In 3 steps, describe the full training pipeline you implemented from raw pixels to accuracy score.