Neural Networks

This module is where the repo shifts from classical ML intuition into modern deep learning. The goal is not just to run PyTorch code, but to understand why gradient-based learning, attention, and transformers work, so that the later LLM modules feel connected rather than magical.

Recommended Order

Companion reading:

What You Should Be Able To Explain

Why nonlinear activations are needed
How backpropagation moves signal through a network
Why PyTorch autograd matters in practice
What attention is computing and why scaling matters
How transformer blocks combine attention, MLPs, residual paths, and normalization
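
To make the attention item concrete, here is a minimal sketch of single-head scaled dot-product attention in NumPy. It is not taken from the notebooks; the function names (`softmax`, `attention`) and the sizes (`T`, `d_k`, `d_v`) are assumptions picked for illustration. It shows the tensor shapes at each step and compares attention weights with and without the `1 / sqrt(d_k)` scaling.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability; the result is unchanged.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V, scale=True):
    """Single-head attention. Q: (T, d_k), K: (T, d_k), V: (T, d_v)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T                      # (T, T): similarity of each query to each key
    if scale:
        scores = scores / np.sqrt(d_k)    # keeps score variance near 1 for unit-variance inputs
    weights = softmax(scores, axis=-1)    # (T, T): each row sums to 1
    return weights @ V, weights           # (T, d_v), (T, T)

# Hypothetical sizes, chosen only for the demo.
rng = np.random.default_rng(0)
T, d_k, d_v = 4, 64, 8
Q = rng.standard_normal((T, d_k))
K = rng.standard_normal((T, d_k))
V = rng.standard_normal((T, d_v))

out, w_scaled = attention(Q, K, V, scale=True)
_, w_raw = attention(Q, K, V, scale=False)

print(out.shape, w_scaled.shape)          # (4, 8) (4, 4)
print("max weight per row, scaled:  ", w_scaled.max(axis=-1).round(3))
print("max weight per row, unscaled:", w_raw.max(axis=-1).round(3))
```

Without scaling, the raw scores grow with `d_k`, so each softmax row moves toward a one-hot vector and gradients through the softmax shrink. Dividing by `sqrt(d_k)` keeps the weights softer, which is the usual argument for why scaling matters.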

How To Study This Module

Spend more time on 04_backpropagation_explained.ipynb than on framework syntax.
Treat 06_attention_mechanism.ipynb as the bridge into LLM architecture.
Revisit 03-maths/foundational/07_neural_network_math.ipynb if gradients feel mechanical instead of intuitive.

Suggested Practice

Implement a tiny MLP from scratch with NumPy
Rebuild the same idea in PyTorch
Write down tensor shapes at each step of attention
Explain a transformer block without using the phrase “it just learns it”
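
As a starting point for the first practice item, the following is one possible sketch of a tiny MLP in NumPy with hand-written backpropagation. Everything in it is an assumption made for illustration: the toy data, the layer sizes, the `tanh` activation, the mean-squared-error loss, and the helper names (`forward`, `loss_fn`, `backward`, `numerical_grad`). It checks the analytic gradients against finite differences, then runs a few steps of plain gradient descent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (made up): 5 samples, 3 features, scalar regression target.
X = rng.standard_normal((5, 3))
y = rng.standard_normal((5, 1))

# One hidden layer of width 4; sizes are arbitrary.
params = {
    "W1": rng.standard_normal((3, 4)) * 0.5,
    "b1": np.zeros((1, 4)),
    "W2": rng.standard_normal((4, 1)) * 0.5,
    "b2": np.zeros((1, 1)),
}

def forward(p, X):
    z1 = X @ p["W1"] + p["b1"]     # (5, 4)
    h = np.tanh(z1)                # (5, 4), the nonlinearity
    y_hat = h @ p["W2"] + p["b2"]  # (5, 1)
    return y_hat, h

def loss_fn(p, X, y):
    y_hat, _ = forward(p, X)
    return 0.5 * np.mean((y_hat - y) ** 2)

def backward(p, X, y):
    """Apply the chain rule layer by layer, from the loss back to W1."""
    y_hat, h = forward(p, X)
    n = X.shape[0]
    d_yhat = (y_hat - y) / n                  # dL/dy_hat, (5, 1)
    d_h = d_yhat @ p["W2"].T                  # (5, 4)
    d_z1 = d_h * (1.0 - h ** 2)               # tanh'(z) = 1 - tanh(z)^2
    return {
        "W1": X.T @ d_z1,                     # (3, 4)
        "b1": d_z1.sum(axis=0, keepdims=True),
        "W2": h.T @ d_yhat,                   # (4, 1)
        "b2": d_yhat.sum(axis=0, keepdims=True),
    }

def numerical_grad(p, X, y, name, eps=1e-6):
    """Central finite differences, one parameter entry at a time."""
    g = np.zeros_like(p[name])
    for idx in np.ndindex(p[name].shape):
        old = p[name][idx]
        p[name][idx] = old + eps
        plus = loss_fn(p, X, y)
        p[name][idx] = old - eps
        minus = loss_fn(p, X, y)
        p[name][idx] = old                    # restore the original value
        g[idx] = (plus - minus) / (2 * eps)
    return g

# Gradient check: backprop should agree with finite differences.
grads = backward(params, X, y)
max_err = max(
    np.abs(grads[k] - numerical_grad(params, X, y, k)).max() for k in params
)
print("max abs gradient difference:", max_err)

# A few steps of full-batch gradient descent (learning rate is a guess).
loss_before = loss_fn(params, X, y)
for _ in range(100):
    grads = backward(params, X, y)
    for k in params:
        params[k] -= 0.1 * grads[k]
loss_after = loss_fn(params, X, y)
print("loss before:", loss_before, "loss after:", loss_after)
```

When rebuilding this in PyTorch, `backward` and `numerical_grad` go away because autograd derives the same gradients from the forward pass. Comparing the two versions is a practical way to see what autograd is doing for you.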

Why This Module Matters

If your understanding of this phase is weak, the later phases on fine-tuning, local LLMs, evaluation, and agents turn into tool memorization. If it is strong, the rest of the repo reads as one connected system.

What Comes Next

Continue to ../12-llm-finetuning/README.md if you want to adapt models.
Continue to ../14-local-llms/README.md if you want to run and serve open models yourself.
Continue to ../15-ai-agents/README.md after you are comfortable with model behavior, tool use, and prompting.