Machine Learning: A Probabilistic Perspective
This folder contains practical examples and implementations from Kevin Murphy’s “Machine Learning: A Probabilistic Perspective” (MLPP). The book provides a comprehensive introduction to machine learning from a unified probabilistic perspective.
Use this folder when you want a Bayesian and probabilistic view of ML rather than a purely optimization-first or API-first perspective. It is especially useful if uncertainty, latent variables, graphical models, or sampling methods are central to what you want to learn.
📚 Overview
The examples are organized by major topics from the book, focusing on hands-on implementation of key concepts and algorithms.
📁 Notebooks (13 Total)
Part I: Foundations (Chapters 2-5)
- 01_probability_fundamentals.ipynb - Probability theory basics, Bayes’ rule, probability distributions, information theory
- 02_generative_models.ipynb - Generative classifiers, Naive Bayes, discriminant analysis (Chapters 3-4)
- 03_gaussian_models.ipynb - Multivariate Gaussians, MVN inference, missing data, Gaussian mixtures
- 04_bayesian_statistics.ipynb - Bayesian inference, conjugate priors, posterior computation, empirical Bayes
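As a taste of the conjugate-prior material in 04_bayesian_statistics.ipynb, here is a minimal sketch of a Beta-Bernoulli posterior update. It is not taken from the notebook: the function names and the coin-flip data are hypothetical, chosen only to illustrate the idea that a Beta prior plus 0/1 observations yields a Beta posterior by adding counts.

```python
def beta_bernoulli_posterior(alpha, beta, observations):
    """Update a Beta(alpha, beta) prior with a sequence of 0/1 outcomes.

    Conjugacy means the posterior is again a Beta distribution whose
    parameters are the prior pseudo-counts plus the observed counts.
    """
    successes = sum(observations)
    failures = len(observations) - successes
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical data: 7 successes in 10 trials, with a uniform Beta(1, 1) prior.
flips = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]
post_alpha, post_beta = beta_bernoulli_posterior(1.0, 1.0, flips)

print("posterior:", (post_alpha, post_beta))        # (8.0, 4.0)
print("posterior mean:", beta_mean(post_alpha, post_beta))
print("MLE:", sum(flips) / len(flips))
```

The posterior mean (8/12) sits between the prior mean (0.5) and the maximum-likelihood estimate (0.7), which is the shrinkage effect the Bayesian chapters return to repeatedly.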
Part II: Regression and Classification (Chapters 7-8)
- 05_linear_logistic_regression.ipynb - Linear regression (MLE, Ridge, Bayesian), Logistic regression, Softmax classification
Part III: Sparse Models and Regularization (Chapter 13)
- 06_sparse_linear_models.ipynb - Lasso, Elastic Net, coordinate descent, feature selection
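To make the coordinate descent entry concrete, the following is a small standard-library sketch of Lasso via cyclic coordinate descent with soft-thresholding. It is an illustration under stated assumptions, not the notebook's code: the objective is taken to be (1/(2n))·||y − Xw||² + λ·||w||₁, there is no intercept, no column of X is all zeros, and the function names and toy data are hypothetical.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Minimize (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 one coordinate at a time.

    X is a list of rows, y a list of targets. Assumes no all-zero column.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_sweeps):
        for j in range(d):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, lam) / (z / n)
    return w


# Hypothetical toy data: y depends only on the first feature (y = 2 * x1).
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [2.0, 2.0, -2.0, -2.0]
w = lasso_coordinate_descent(X, y, lam=0.5)
print(w)  # first weight shrunk from 2 toward 0, second weight exactly 0
```

The irrelevant feature gets a weight of exactly zero rather than merely a small one, which is why the L1 penalty doubles as a feature selection method.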
Part IV: Kernel Methods and Gaussian Processes (Chapters 14-15)
- 07_kernels_gaussian_processes.ipynb - Kernel trick, kernel ridge regression, SVMs, GP regression
Part V: Graphical Models and Sequential Data (Chapters 10, 17)
- 08_graphical_models.ipynb - Bayesian networks, d-separation, Naive Bayes, Markov chains
- 09_hidden_markov_models.ipynb - HMMs, forward-backward algorithm, Viterbi decoding, Baum-Welch learning
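The forward pass listed for 09_hidden_markov_models.ipynb can be sketched in a few lines. This is a minimal illustration, not the notebook's implementation: it assumes a discrete-observation HMM, normalizes the forward messages at each step to avoid underflow, and uses hypothetical parameter values (`pi`, `A`, `B`) for a two-state model.

```python
import math


def forward_log_likelihood(pi, A, B, obs):
    """Scaled forward algorithm for a discrete HMM.

    pi[k]   : initial probability of state k
    A[j][k] : transition probability from state j to state k
    B[k][o] : probability of emitting symbol o in state k
    obs     : list of observed symbol indices

    Returns (log p(obs), filtered distribution over the final state).
    """
    n_states = len(pi)
    alpha = [pi[k] * B[k][obs[0]] for k in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Predict one step ahead, then weight by the emission probability.
            alpha = [
                B[k][obs[t]] * sum(alpha[j] * A[j][k] for j in range(n_states))
                for k in range(n_states)
            ]
        norm = sum(alpha)          # equals p(obs[t] | obs[:t])
        log_lik += math.log(norm)
        alpha = [a / norm for a in alpha]
    return log_lik, alpha


# Hypothetical two-state, two-symbol model.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]

log_lik, filtered = forward_log_likelihood(pi, A, B, [0, 1])
print("p(obs) =", math.exp(log_lik))
print("p(final state | obs) =", filtered)
```

Summing the per-step log normalizers gives the sequence log-likelihood, the quantity that Baum-Welch increases at each iteration.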
Part VI: Sampling and Inference (Chapters 23-24)
- 10_mcmc_sampling.ipynb - Monte Carlo methods, Metropolis-Hastings, Gibbs sampling, convergence diagnostics
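For orientation before opening 10_mcmc_sampling.ipynb, here is a minimal random-walk Metropolis-Hastings sketch using only the standard library. It is an illustration under assumptions, not the notebook's code: the target is a one-dimensional standard normal given by its unnormalized log density, the proposal is a symmetric Gaussian step, and the function names, step size, and seed are hypothetical choices.

```python
import math
import random


def log_std_normal(x):
    """Unnormalized log density of a standard normal target."""
    return -0.5 * x * x


def metropolis_hastings(log_target, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings with a Gaussian proposal.

    Because the proposal is symmetric, the acceptance probability reduces
    to min(1, target(proposal) / target(current)).
    Returns (list of samples, acceptance rate).
    """
    rng = random.Random(seed)
    x = x0
    log_p = log_target(x)
    samples = []
    accepted = 0
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_target(proposal)
        accept_prob = math.exp(min(0.0, log_p_new - log_p))
        if rng.random() < accept_prob:
            x, log_p = proposal, log_p_new
            accepted += 1
        samples.append(x)  # a rejected proposal repeats the current state
    return samples, accepted / n_samples


samples, accept_rate = metropolis_hastings(log_std_normal, 0.0, 20000)
kept = samples[2000:]  # discard an initial burn-in segment
mean = sum(kept) / len(kept)
var = sum((s - mean) ** 2 for s in kept) / len(kept)
print("acceptance rate:", accept_rate)
print("sample mean:", mean, "sample variance:", var)
```

The sample mean and variance should land near 0 and 1. Successive samples are correlated, so the effective sample size is smaller than the raw count, which is what the convergence diagnostics in the notebook are meant to quantify.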
Part VII: Unsupervised Learning (Chapters 11, 12, 25)
- 11_mixture_models_em.ipynb - Gaussian mixture models, EM algorithm, K-means, mixture of Bernoullis, BIC/AIC
- 12_dimensionality_reduction.ipynb - PCA, Probabilistic PCA, Factor Analysis, ICA, source separation
- 13_clustering.ipynb - K-means, hierarchical, spectral, DBSCAN, affinity propagation, evaluation metrics
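K-means appears in both 11_mixture_models_em.ipynb and 13_clustering.ipynb, so a short from-scratch sketch of Lloyd's algorithm may help as a reference point. This is a standard-library illustration, not the notebooks' implementation: initial centers are passed in explicitly rather than chosen at random, an empty cluster simply keeps its old center, and the function names and toy points are hypothetical.

```python
def assign(points, centers):
    """Return, for each point, the index of its nearest center (squared Euclidean)."""
    labels = []
    for p in points:
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
        labels.append(dists.index(min(dists)))
    return labels


def kmeans(points, init_centers, max_iters=100):
    """Lloyd's algorithm: alternate mean updates and nearest-center assignments."""
    centers = [list(c) for c in init_centers]
    labels = assign(points, centers)
    for _ in range(max_iters):
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                # New center is the coordinate-wise mean of the cluster members.
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(old)  # keep the old center for an empty cluster
        new_labels = assign(points, new_centers)
        centers = new_centers
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
    return centers, labels


# Hypothetical toy data: two well-separated groups in 2D.
points = [[1.0, 1.0], [1.2, 0.8], [0.8, 1.2],
          [5.0, 5.0], [5.2, 4.8], [4.8, 5.2]]
centers, labels = kmeans(points, init_centers=[[0.0, 0.0], [6.0, 6.0]])
print("centers:", centers)
print("labels:", labels)
```

The hard assignment step here is the piece that EM for a Gaussian mixture replaces with soft responsibilities.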
🎯 Learning Objectives
After working through these notebooks, you will:
- Understand probability theory and Bayesian inference fundamentals
- Implement generative and discriminative classification models
- Work with graphical models, Bayesian networks, and HMMs
- Apply sparse models, regularization, and feature selection
- Use kernel methods, SVMs, and Gaussian processes
- Perform Bayesian inference with MCMC sampling methods
- Master unsupervised learning: clustering, dimensionality reduction, mixture models
- Understand the EM algorithm and its applications
- Evaluate and compare different clustering algorithms
🚀 Quick Start
Step 1: Ensure you have the required dependencies:
pip install numpy scipy matplotlib scikit-learn pandas seaborn
Step 2: Start with the foundational notebooks (01-04) before moving to advanced topics.
Step 3: Each notebook contains:
- Theory overview
- Code implementations
- Visualizations
- Practical examples
- Exercises
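Before opening the notebooks, it can be useful to confirm that the Step 1 dependencies are importable in the active environment. The snippet below is a small sketch using the standard library's `importlib.util.find_spec`. The `REQUIRED` list and the `find_missing` helper are hypothetical names, and the list uses import names rather than pip names (scikit-learn is imported as `sklearn`).

```python
import importlib.util

# Import names for the packages in the pip command above.
REQUIRED = ["numpy", "scipy", "matplotlib", "sklearn", "pandas", "seaborn"]


def find_missing(modules):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

This only checks that each package can be located; it does not verify versions.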
How To Use This Folder Well
- Start with the foundations and regression/classification sections before jumping to MCMC or graphical models.
- Focus on probabilistic reasoning, not just on reproducing code.
- Use this folder when you want deeper intuition for uncertainty, inference, and latent-variable modeling.
📖 Book Reference
Murphy, Kevin P. “Machine Learning: A Probabilistic Perspective.” MIT Press, 2012.
The PDF is available in: ML-Machine-Learning-A-Probabilistic-Perspective.pdf
🔗 Related Sections
- Foundational Math - Core mathematical concepts
- MML Book - Mathematics for Machine Learning
- ISLP Book - Statistical Learning with Python
- CS229 Course - Stanford ML Course
💡 Tips
- The book uses MATLAB examples; these notebooks use Python equivalents
- Focus on understanding probabilistic reasoning and Bayesian inference
- Practice implementing algorithms from scratch before using libraries
- Compare implementations with scikit-learn and other ML libraries
- Follow the suggested order for foundational concepts, but feel free to explore topics independently
- Each notebook is self-contained with theory, code, and visualizations
What Comes Next
- Continue to ../advanced/README.md if you want research-level probabilistic theory next.
- Continue to ../islp-book/README.md if you want a more classical statistical complement.
- Return to ../../16-model-evaluation/README.md or ../../28-practical-data-science/README.md when you want to connect these ideas to applied work.
📊 Coverage Summary
This collection covers approximately 50% of Murphy’s MLPP book, focusing on:
- ✅ Foundations (probability, Bayesian statistics)
- ✅ Supervised learning (regression, classification)
- ✅ Unsupervised learning (clustering, dimensionality reduction, mixture models)
- ✅ Graphical models (Bayesian networks, HMMs)
- ✅ Kernel methods and Gaussian processes
- ✅ MCMC and sampling methods
- ✅ EM algorithm and latent variable models
Topics not yet covered:
- Deep learning (Chapter 28)
- Advanced variational inference
- Advanced reinforcement learning
- Some specialized models and techniques
🤝 Contributing
Feel free to add more examples or improve existing ones following the established format.