AdvancedAnalyticsLabs
Analytics lab notebooks, supporting analytics teaching for BSc and MSc courses. I’ve taught these at a business school and a statistics department, so I think they fit both reasonably well. Currently, 19 labs are uploaded, divided into five topics:
Intro to Python
- Introduction to Python: First few steps. A simple intro for people who may already be familiar with other languages; not meant for people with no programming experience!
- Functions and Revenue Management: Implementation of simple algorithms (Littlewood, EMSR-a and EMSR-b). Covers function creation and an introduction to PyPlot. Taught until 2019 at Southampton University as part of the Advanced Analytics course.
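As a rough illustration of the kind of algorithm the revenue management lab covers, here is a minimal sketch of Littlewood's rule for two fare classes. It is not taken from the lab notebook: the function name, the example fares, and the assumption that high-fare demand is normally distributed are all illustrative.

```python
from statistics import NormalDist


def littlewood_protection_level(fare_high, fare_low, mean_high, sd_high):
    """Seats to protect for the high fare class under Littlewood's rule.

    Assumes (for illustration only) that high-fare demand is normal with
    mean `mean_high` and standard deviation `sd_high`.
    """
    if not 0 < fare_low < fare_high:
        raise ValueError("expected 0 < fare_low < fare_high")
    # Littlewood's rule: protect y seats such that
    # P(high-fare demand > y) = fare_low / fare_high.
    critical_ratio = 1 - fare_low / fare_high
    return NormalDist(mean_high, sd_high).inv_cdf(critical_ratio)


# Hypothetical example values, not data from the lab.
protection = littlewood_protection_level(
    fare_high=200.0, fare_low=100.0, mean_high=50.0, sd_high=10.0
)
print(f"Protect about {protection:.1f} seats for the high fare class")
```

With a fare ratio of one half, the protection level equals the mean high-fare demand; a cheaper low fare pushes the protection level above the mean.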
Banking Regulation
- Basel Capital Requirements: Covers lambda functions and an introduction to Pandas in the context of the Basel capital requirements formulas.
- Bond Pricing: Teaches bond pricing, yields and clean/dirty prices. Taught from 2019 at Western University, as part of the Banking Analytics course I created. Replaces the Revenue Management lab above, and also covers function creation and an introduction to PyPlot.
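To give a flavour of the clean/dirty price distinction in the bond pricing lab, the following is a minimal sketch rather than the lab's own code. The function names are hypothetical, and the accrual is simplified to a plain fraction of the coupon period instead of a real day-count convention.

```python
def dirty_price(face, coupon_rate, ytm, periods_remaining,
                elapsed_fraction=0.0, freq=2):
    """Present value of all remaining cash flows (the dirty price).

    `elapsed_fraction` is the share of the current coupon period that has
    already passed (0.0 on a coupon date); this is a simplifying assumption.
    """
    coupon = face * coupon_rate / freq
    y = ytm / freq  # yield per coupon period
    price = 0.0
    for k in range(1, periods_remaining + 1):
        cash_flow = coupon + (face if k == periods_remaining else 0.0)
        # Each cash flow is discounted over the time left until it is paid.
        price += cash_flow / (1 + y) ** (k - elapsed_fraction)
    return price


def accrued_interest(face, coupon_rate, elapsed_fraction, freq=2):
    """Coupon interest earned since the last coupon date."""
    return face * coupon_rate / freq * elapsed_fraction


def clean_price(face, coupon_rate, ytm, periods_remaining,
                elapsed_fraction=0.0, freq=2):
    """Quoted price: dirty price minus accrued interest."""
    dirty = dirty_price(face, coupon_rate, ytm, periods_remaining,
                        elapsed_fraction, freq)
    return dirty - accrued_interest(face, coupon_rate, elapsed_fraction, freq)


# Hypothetical 10-year, 5% semi-annual bond yielding 5%, halfway through a period.
dirty = dirty_price(100.0, 0.05, 0.05, 20, elapsed_fraction=0.5)
clean = clean_price(100.0, 0.05, 0.05, 20, elapsed_fraction=0.5)
print(f"dirty = {dirty:.4f}, clean = {clean:.4f}")
```

On a coupon date the two prices coincide; between coupon dates the dirty price is higher by the accrued interest.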
Credit Risk Modelling
- Data Preprocessing: Simple data preprocessing using polars and scikit-learn.
- Weight of evidence transformation: How to calculate Weight of Evidence transformations in Python. Uses my own fork of the scorecardpy package by @ShichenXie, with some bugs fixed and other personalizations.
- Logistic Regression and Scorecards: Intro to scikit-learn, how to run a Lasso and Ridge regression, and how to calculate a scorecard. It includes larger-than-memory training using SGD.
- Random Forest and XGBoosting: How to run a Random Forest, an XGBoost model, tune parameters over a grid, use Shapley values to explain predictions, and compare ROC curves. It also includes larger-than-memory training.
- LGD Modelling: How to model LGD using either a GLM or an XGB model.
- PD / LGD Calibration: How to define ratings by segmenting the ROC curve, and how to calibrate a long-run PD / downturn LGD adjusted by macroeconomic factors using the Vasicek model.
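The Vasicek model mentioned in the calibration lab links a long-run PD to a PD conditional on the state of a systematic factor. The sketch below shows the standard one-factor formula only; it is not the lab's implementation, and the function name, asset correlation and confidence level are illustrative assumptions.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def vasicek_conditional_pd(long_run_pd, asset_correlation, confidence):
    """PD conditional on a systematic shock at the given confidence level.

    One-factor Vasicek formula:
    Phi((Phi^-1(PD) + sqrt(rho) * Phi^-1(q)) / sqrt(1 - rho)).
    """
    for name, value in (("long_run_pd", long_run_pd),
                        ("asset_correlation", asset_correlation),
                        ("confidence", confidence)):
        if not 0 < value < 1:
            raise ValueError(f"{name} must lie strictly between 0 and 1")
    default_threshold = _STD_NORMAL.inv_cdf(long_run_pd)
    systematic_shock = _STD_NORMAL.inv_cdf(confidence)
    z = (default_threshold + asset_correlation ** 0.5 * systematic_shock) \
        / (1 - asset_correlation) ** 0.5
    return _STD_NORMAL.cdf(z)


# Hypothetical inputs: 2% long-run PD, 15% asset correlation.
stressed_pd = vasicek_conditional_pd(0.02, 0.15, confidence=0.999)
median_pd = vasicek_conditional_pd(0.02, 0.15, confidence=0.5)
print(f"stressed PD = {stressed_pd:.4f}, median-scenario PD = {median_pd:.4f}")
```

A high confidence level corresponds to a severe downturn scenario, so the conditional PD rises above the long-run value; at the median scenario it falls below it.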
Deep Learning
See our new book for the most up-to-date versions of these labs. They are also available in the book’s GitHub repository.
Other labs
- SQL Refresher: Refresher on SQL, how to access it from Python, and a very light introduction to SQLAlchemy.
- Primer on Visualization: A few plots using pyplot, seaborn and plotly. Very introductory primer.
- Explainability and Confounding: How to use the Shap package to explain XGB models, plus a couple of examples of confounding factors. Taught as part of the DS3000 - Intro to Machine Learning course at Western.
These labs are available under the GPL v3; feel free to use them as you wish. I’ll be grateful if you can point to the GitHub repository, as I’ll keep these updated in subsequent iterations of the modules I teach. As always, these notebooks are provided with no guarantees.
Comments are welcome!