Challenges: Neural Networks

Hands-on challenges to deepen your understanding of neural networks

🚀 Challenge 1: Gradient Vanishing Detective

Difficulty: ⭐⭐ Beginner-Intermediate
Time: 30-45 minutes
Concepts: Activation functions, gradient flow, deep networks

The Problem

Build a very deep neural network (10+ layers) and observe the vanishing gradient problem in action.

Your Task

Create a 15-layer network with sigmoid activations
Train on MNIST dataset
Track gradient magnitudes at each layer during training
Visualize the gradients - notice how they get smaller in early layers?
Fix it by switching to ReLU - observe the difference!

Starter Code


class DeepNetwork:
    def __init__(self, n_layers=15, activation='sigmoid'):
        self.n_layers = n_layers
        self.activation = activation
        # TODO: Initialize layers
    
    def track_gradients(self):
        """Return gradient magnitudes for each layer."""
        # TODO: Track gradient norms
        pass

Success Criteria

Demonstrate gradient vanishing with sigmoid
Show improvement with ReLU
Create visualization comparing both
Explain why this happens mathematically

💡 Hint

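Sigmoid derivative: max value is 0.25. Chain 15 layers and the activation derivatives alone contribute a factor of at most 0.25^15 ≈ 9.3 × 10^-10, which is effectively zero (vanishing!).
ReLU derivative: either 0 or 1, so gradients pass through active units unscaled.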
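The arithmetic behind this hint can be checked with a few lines of plain Python. This is a minimal sketch under a simplifying assumption: it multiplies only the activation derivatives along one path and ignores the weight matrices, which also scale real gradients. The function names are illustrative, not part of the starter code.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z):
    # d/dz sigmoid(z) = s * (1 - s); its maximum, 0.25, is reached at z = 0.
    s = sigmoid(z)
    return s * (1.0 - s)


def relu_grad(z):
    # ReLU passes the gradient unchanged for positive inputs and blocks it otherwise.
    return 1.0 if z > 0 else 0.0


def chained_factor(grad_fn, pre_activations):
    """Product of activation derivatives along one path through the network."""
    factor = 1.0
    for z in pre_activations:
        factor *= grad_fn(z)
    return factor


# Best case for sigmoid: every one of the 15 units sits at z = 0.
best_case_sigmoid = chained_factor(sigmoid_grad, [0.0] * 15)
# ReLU with every unit active (z > 0) leaves the factor at 1.
active_relu = chained_factor(relu_grad, [1.0] * 15)

print(f"sigmoid, 15 layers: {best_case_sigmoid:.3e}")
print(f"ReLU, 15 layers:    {active_relu:.3e}")
```

Even in the most favourable case the sigmoid chain shrinks the signal by roughly nine orders of magnitude, which is why the earliest layers in your experiment should show the smallest gradient norms.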

🚀 Challenge 2: Architecture Search

Difficulty: ⭐⭐⭐ Intermediate
Time: 1-2 hours
Concepts: Hyperparameter tuning, model selection, experimental design

The Problem

Find the best neural network architecture for CIFAR-10 image classification.

Your Task

Experiment with:

Number of layers (2, 4, 6, 8)
Layer sizes (32, 64, 128, 256)
Activation functions (ReLU, LeakyReLU, ELU)
Dropout rates (0, 0.2, 0.5)
Batch sizes (32, 64, 128)

Requirements

Test at least 20 different architectures
Track accuracy, training time, and parameter count
Create visualization showing tradeoffs
Identify Pareto-optimal architectures
Present your findings
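One way to identify Pareto-optimal architectures is to keep every run that no other run beats on all three tracked metrics at once. The sketch below assumes each experiment is a dict with hypothetical keys `accuracy`, `params`, and `train_time_s`; the four runs are dummy placeholder values for illustration, not measured results.

```python
def dominates(a, b):
    """True if run `a` is at least as good as `b` everywhere and strictly better somewhere."""
    no_worse = (
        a["accuracy"] >= b["accuracy"]
        and a["params"] <= b["params"]
        and a["train_time_s"] <= b["train_time_s"]
    )
    strictly_better = (
        a["accuracy"] > b["accuracy"]
        or a["params"] < b["params"]
        or a["train_time_s"] < b["train_time_s"]
    )
    return no_worse and strictly_better


def pareto_front(runs):
    """Keep the runs that are not dominated by any other run."""
    return [r for r in runs if not any(dominates(other, r) for other in runs)]


# Placeholder numbers only, chosen to exercise the logic.
runs = [
    {"name": "A", "accuracy": 0.60, "params": 100_000, "train_time_s": 60},
    {"name": "B", "accuracy": 0.65, "params": 400_000, "train_time_s": 120},
    {"name": "C", "accuracy": 0.58, "params": 500_000, "train_time_s": 150},
    {"name": "D", "accuracy": 0.65, "params": 450_000, "train_time_s": 130},
]

front = pareto_front(runs)
print([r["name"] for r in front])
```

Here C loses to A on every metric and D ties B on accuracy while costing more, so only A and B remain. The pairwise comparison is quadratic in the number of runs, which is fine for a few dozen experiments.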

Deliverable


results/
├── experiment_log.csv      # All experiments
├── best_models.json        # Top 5 configurations
├── tradeoff_plot.png       # Accuracy vs params vs time
└── recommendations.md      # Your analysis

💡 Hint

Use a grid search or random search. Track experiments in a DataFrame. Consider accuracy/params ratio as a metric.
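The full grid over the options listed in this challenge has 4 × 4 × 3 × 3 × 3 = 432 combinations, so random search is a practical way to pick 20 of them. The following is a rough sketch: `SEARCH_SPACE` and `sample_configs` are illustrative names, and the training and logging step is left to you.

```python
import itertools
import random

# Options copied from the task description above.
SEARCH_SPACE = {
    "n_layers": [2, 4, 6, 8],
    "layer_size": [32, 64, 128, 256],
    "activation": ["ReLU", "LeakyReLU", "ELU"],
    "dropout": [0, 0.2, 0.5],
    "batch_size": [32, 64, 128],
}


def sample_configs(space, n, seed=0):
    """Draw n distinct configurations uniformly from the full grid."""
    keys = list(space)
    grid = [dict(zip(keys, values)) for values in itertools.product(*space.values())]
    rng = random.Random(seed)  # fixed seed so the experiment list is reproducible
    return rng.sample(grid, n)


configs = sample_configs(SEARCH_SPACE, n=20)
for config in configs[:3]:
    print(config)
```

Each dict can become one row of your experiment log once you append the measured accuracy, training time, and parameter count.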

🚀 Challenge 3: Attention Visualization

Difficulty: ⭐⭐⭐ Intermediate-Advanced
Time: 2-3 hours
Concepts: Attention mechanism, visualization, interpretability

The Problem

Implement and visualize a simple attention mechanism for sequence classification.

Your Task

Build attention layer from scratch
Train on text sentiment classification
Visualize attention weights for sample inputs
Show which words the model focuses on
Compare with/without attention

Example Output


Input: "This movie was absolutely terrible and boring"
Attention weights:
- "terrible": 0.45 ██████████
- "boring": 0.38   ████████
- "absolutely": 0.10 ██
- Other words: <0.05
Prediction: Negative (0.92 confidence)
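A text rendering like the one above can be produced by scaling each bar relative to the largest weight. This sketch assumes the weights arrive as a word-to-weight dict; `render_attention` is a hypothetical helper, and the weights are the ones from the example.

```python
def render_attention(weights, max_bar=10):
    """Return one line per word, with bar length proportional to the attention weight."""
    top = max(weights.values())
    lines = []
    for word, weight in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
        bar = "█" * round(weight / top * max_bar)
        lines.append(f'- "{word}": {weight:.2f} {bar}')
    return lines


example = {"terrible": 0.45, "boring": 0.38, "absolutely": 0.10}
for line in render_attention(example):
    print(line)
```

A heatmap is still the better choice for the final deliverable; this kind of text output is mainly handy while debugging in a terminal.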

Success Criteria

Working attention implementation
Heatmap visualization of attention
Interpretable results
Performance comparison

💡 Hint

Attention formula:

α_i = softmax_i(score(h_i, query))
Output = Σ_i α_i * h_i

Start with simple dot-product attention.
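The two formulas can be sketched in plain Python before you move to a tensor library. This version assumes the hidden states and the query are lists of floats with the same dimension and uses the dot product as the score function; `attend` and the toy inputs are illustrative.

```python
import math


def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend(hidden_states, query):
    """Dot-product attention: returns (weights, context vector)."""
    scores = [dot(h, query) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return weights, context


# Toy inputs: three 2-dimensional hidden states and one query vector.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]
weights, context = attend(hidden_states, query)
print(weights, context)
```

The third state has the largest dot product with the query, so it receives the largest weight. In a trained model the query would be a learned parameter or another hidden state rather than a fixed vector.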

🚀 Challenge 4: Transfer Learning Master

Difficulty: ⭐⭐⭐⭐ Advanced
Time: 3-4 hours
Concepts: Transfer learning, fine-tuning, domain adaptation

The Problem

Use a pre-trained ImageNet model for a custom classification task with limited data.

Your Task

Choose a small custom dataset (100-500 images, 5-10 classes)
Load pre-trained ResNet/VGG/MobileNet
Try 3 approaches:
- Feature extraction (freeze all pre-trained layers, train only a new classifier head)
- Fine-tuning (unfreeze the last few layers as well)
- Full training (train the entire network)
Compare results and training time
Analyze which layers learn what
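The three approaches differ only in which layers receive weight updates. The framework-agnostic sketch below makes that explicit. It assumes the model is described by an ordered list of layer names whose last entry is the new classifier head; the layer names and `trainable_layers` are hypothetical.

```python
def trainable_layers(layer_names, strategy, n_unfrozen=2):
    """Return the names of the layers that are updated under each strategy."""
    *backbone, head = layer_names  # assumption: the last entry is the new head
    if strategy == "feature_extraction":
        return [head]
    if strategy == "fine_tuning":
        start = max(0, len(backbone) - n_unfrozen)
        return backbone[start:] + [head]
    if strategy == "full_training":
        return list(layer_names)
    raise ValueError(f"unknown strategy: {strategy!r}")


# Hypothetical layer names for illustration.
layers = ["conv1", "block1", "block2", "block3", "block4", "classifier"]
for strategy in ("feature_extraction", "fine_tuning", "full_training"):
    print(strategy, "->", trainable_layers(layers, strategy))
```

In PyTorch, freezing a layer typically means setting `requires_grad = False` on its parameters, so a list like this maps directly onto which parameters you leave trainable.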

Dataset Suggestions

Food classification
Dog breed recognition
Flower species
Medical images
Your own photos

Analysis Required

Learning curves for each approach
Confusion matrices
Layer-wise feature visualization
Recommendations for when to use each approach

💡 Hint

Feature extraction works well with <1000 images. Fine-tuning helps when the task is somewhat different from ImageNet. Full training needs lots of data.

🚀 Challenge 5: Neural Network Debugger

Difficulty: ⭐⭐⭐ Intermediate
Time: 1-2 hours
Concepts: Debugging, troubleshooting, systematic diagnosis

The Problem

You’re given 5 broken neural network implementations. Find and fix the bugs!

Scenarios

Bug 1: The Never-Learning Network


# Network trains but loss doesn't decrease
# What's wrong?
for epoch in range(100):
    loss = compute_loss(model(X), y)
    gradients = backprop(loss)
    # weights stay the same! Why?

Bug 2: The Exploding Loss


# Loss goes to infinity after a few batches
# What's the issue?
def train_step(X, y):
    pred = model(X)
    loss = mse_loss(pred, y)
    # loss = 1e+10 after iteration 3

Bug 3: The Plateauing Network


# Accuracy stuck at 10% on MNIST (10 classes)
# Red flag! What's happening?

Bug 4: The Slow Learner


# Training takes 10x longer than expected
# Same architecture, same data
# Where's the bottleneck?

Bug 5: The Overfitting Champion


# Training accuracy: 99%
# Validation accuracy: 45%
# How do you fix this?

Your Task

Identify each bug
Explain why it happens
Provide the fix
Share prevention strategies

💡 Hint 1

Bug 1: Are you actually updating the weights?
Bug 2: Check learning rate and weight initialization.
Bug 3: Is your model predicting all one class?
Bug 4: Profile your code - is it the data loading?
Bug 5: Classic overfitting - what’s missing?
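For Bug 3, a quick diagnostic is to look at how the predicted labels are distributed. On a balanced 10-class problem, 10% accuracy equals chance, and one common cause is a model that outputs the same class for every input. The sketch below is one possible check; the helper names and the 0.9 threshold are arbitrary choices for illustration.

```python
from collections import Counter


def prediction_share(predicted_labels):
    """Fraction of predictions that fall on each label."""
    counts = Counter(predicted_labels)
    total = len(predicted_labels)
    return {label: n / total for label, n in counts.items()}


def looks_collapsed(predicted_labels, threshold=0.9):
    """True if a single label accounts for at least `threshold` of all predictions."""
    return max(prediction_share(predicted_labels).values()) >= threshold


# Illustrative inputs: one collapsed model, one that spreads its predictions.
collapsed = [3] * 95 + [1] * 5
spread = list(range(10)) * 10
print(looks_collapsed(collapsed), looks_collapsed(spread))
```

This tells you that the model has collapsed, not why. Finding the cause is still your job.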

🌟 Meta Challenge: Build Your Own Framework

Difficulty: ⭐⭐⭐⭐⭐ Expert
Time: 8-12 hours
Concepts: Software engineering, API design, comprehensive understanding

The Ultimate Challenge

Build a mini deep learning framework (like PyTorch, but simpler).

Requirements

Your framework should support:

Automatic differentiation (autograd)
Common layers (Linear, Conv2D, ReLU, etc.)
Loss functions (MSE, CrossEntropy)
Optimizers (SGD, Adam)
Model save/load
GPU support (optional but impressive!)
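Automatic differentiation is the piece most people find hardest to start, so here is a deliberately tiny sketch of reverse-mode autodiff on scalars. It is one possible design, not a reference implementation: the `Value` class and its attributes are illustrative, and a real framework would operate on tensors and support many more operations.

```python
class Value:
    """A scalar that remembers how it was computed so gradients can flow back."""

    def __init__(self, data, parents=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # replaced by the op that creates this node

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def _backward():
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def _backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def relu(self):
        out = Value(max(0.0, self.data), (self,))

        def _backward():
            self.grad += (1.0 if self.data > 0 else 0.0) * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule from output to inputs.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


x, y = Value(2.0), Value(3.0)
z = (x * y + x).relu()  # z = relu(x*y + x)
z.backward()
print(z.data, x.grad, y.grad)
```

Gradients accumulate with `+=` because `x` feeds into the result through two paths, and both contributions must be summed. A `Linear` layer, a loss function, and an optimizer can all be built on top of this one mechanism.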

Example Usage


from your_framework import Module, Linear, ReLU, CrossEntropyLoss, Adam
 
class MyModel(Module):
    def __init__(self):
        self.fc1 = Linear(784, 128)
        self.relu = ReLU()
        self.fc2 = Linear(128, 10)
    
    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x
 
model = MyModel()
optimizer = Adam(model.parameters(), lr=0.001)
criterion = CrossEntropyLoss()
 
# Training loop works just like PyTorch!

Optional Stretch

Clean, documented API
Comprehensive tests
Tutorial notebooks
Performance benchmarks
PyTorch compatibility layer

📊 Challenge Completion Tracker

Mark off challenges as you complete them:

Challenge 1: Gradient Vanishing Detective
Challenge 2: Architecture Search
Challenge 3: Attention Visualization
Challenge 4: Transfer Learning Master
Challenge 5: Neural Network Debugger
Meta Challenge: Build Your Own Framework

🏆 Share Your Solutions!

Completed a challenge? Share your solution:

GitHub: Create a repo with your solution
Discussions: Post in the Challenges section
Tag: Use #ZeroToAI and #NNChallenge
Help Others: Comment on others’ solutions

💡 Learning Tips

Start small: Don’t try all challenges at once
Debug systematically: Print intermediate values, visualize data
Compare with libraries: Check your implementation against PyTorch
Read papers: Many challenges have research papers behind them
Ask for help: Post in Discussions if stuck

Happy challenging! 🚀

Remember: The goal is to learn, not to complete everything perfectly. Each challenge deepens your understanding!
