Decision Trees and Ensembles: From Splits to Random Forests and XGBoost

A complete guide to decision trees, covering entropy, information gain, one-hot encoding, regression trees, and ensemble methods like Random Forest and XGBoost.

Jongmin Lee

4. Decision trees

Decision trees

Learning Process

Decision tree learning

Measuring purity
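
Purity of a node is commonly measured with entropy. As a minimal sketch, assuming binary labels and writing p1 for the fraction of positive examples in a node (the name `p1` and the function below are illustrative, not taken from the original post), entropy is H(p1) = -p1 log2(p1) - (1 - p1) log2(1 - p1), with 0 log2(0) treated as 0:

```python
import math


def entropy(p1: float) -> float:
    """Binary entropy in bits; p1 is the fraction of positive examples."""
    if p1 in (0.0, 1.0):
        return 0.0  # convention: 0 * log2(0) = 0, so a pure node has entropy 0
    return -p1 * math.log2(p1) - (1 - p1) * math.log2(1 - p1)


# Entropy is 0 for a pure node and peaks at 1 bit for a 50/50 mix.
for p1 in (0.0, 0.2, 0.5, 0.8, 1.0):
    print(p1, round(entropy(p1), 3))
```

The curve is symmetric: a node that is 20% positive is exactly as impure as one that is 80% positive.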

Choosing a split: Information Gain
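
Information gain is the entropy at the parent node minus the weighted average entropy of the two branches, where each branch is weighted by the fraction of examples it receives. The following is a small sketch under that definition; the helper names and the label lists are hypothetical and chosen only for illustration.

```python
import math


def entropy(p1):
    """Binary entropy in bits; p1 is the fraction of positive examples."""
    if p1 in (0.0, 1.0):
        return 0.0
    return -p1 * math.log2(p1) - (1 - p1) * math.log2(1 - p1)


def fraction_positive(labels):
    """Fraction of labels equal to 1 (0.0 for an empty list)."""
    return sum(labels) / len(labels) if labels else 0.0


def information_gain(parent, left, right):
    """Parent entropy minus the weighted average entropy of the branches."""
    w_left = len(left) / len(parent)
    w_right = len(right) / len(parent)
    weighted_children = (
        w_left * entropy(fraction_positive(left))
        + w_right * entropy(fraction_positive(right))
    )
    return entropy(fraction_positive(parent)) - weighted_children


# Hypothetical labels: 1 = positive class, 0 = negative class.
parent = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
left = [1, 1, 1, 1, 0]   # examples sent to the left branch
right = [1, 0, 0, 0, 0]  # examples sent to the right branch
print(information_gain(parent, left, right))
```

A candidate split that leaves both branches as mixed as the parent has a gain near 0, while a split that yields two pure branches from a 50/50 parent has a gain of 1 bit.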

Putting it together
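
One way to combine these pieces is a recursive procedure: pick the feature with the highest information gain, split the data, and repeat on each branch until a stopping criterion is met. The sketch below assumes binary (0/1) features and labels, and stops when a node is pure, when a maximum depth is reached, or when no split improves purity. The function names, the `max_depth` default, and the toy data are assumptions made for illustration.

```python
import math


def entropy(labels):
    """Binary entropy (bits) of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p1 = sum(labels) / len(labels)
    if p1 in (0.0, 1.0):
        return 0.0
    return -p1 * math.log2(p1) - (1 - p1) * math.log2(1 - p1)


def build_tree(rows, labels, depth=0, max_depth=2):
    """Return a leaf (0 or 1) or a dict describing a split on one feature."""
    majority = int(2 * sum(labels) >= len(labels))
    if depth == max_depth or entropy(labels) == 0.0:
        return majority
    best_gain, best = 0.0, None
    for f in range(len(rows[0])):
        left = [i for i, r in enumerate(rows) if r[f] == 1]
        right = [i for i, r in enumerate(rows) if r[f] == 0]
        if not left or not right:
            continue  # this feature does not separate the examples
        h_left = entropy([labels[i] for i in left])
        h_right = entropy([labels[i] for i in right])
        weighted = (len(left) * h_left + len(right) * h_right) / len(labels)
        gain = entropy(labels) - weighted
        if gain > best_gain:
            best_gain, best = gain, (f, left, right)
    if best is None:
        return majority  # no split improves purity
    f, left, right = best
    return {
        "feature": f,
        "left": build_tree([rows[i] for i in left], [labels[i] for i in left],
                           depth + 1, max_depth),
        "right": build_tree([rows[i] for i in right], [labels[i] for i in right],
                            depth + 1, max_depth),
    }


def predict(tree, row):
    """Follow splits until a leaf is reached (left branch = feature is 1)."""
    while isinstance(tree, dict):
        tree = tree["left"] if row[tree["feature"]] == 1 else tree["right"]
    return tree


# Toy data: three binary features; the label is 1 only when the first two are 1.
rows = [[a, b, c] for a in (0, 1) for b in (0, 1) for c in (0, 1)]
labels = [a & b for a, b, c in rows]
tree = build_tree(rows, labels)
print(tree)
```

On this toy data the third feature carries no information, so the tree never splits on it.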

Using one-hot encoding of categorical features
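
One-hot encoding replaces a categorical feature that takes k possible values with k binary features, exactly one of which is 1 for each example. A minimal sketch in plain Python follows; the feature values are hypothetical and the `one_hot` helper is an illustration rather than a library function.

```python
def one_hot(values):
    """Map each categorical value to a list of 0/1 flags, one per category."""
    categories = sorted(set(values))
    encoded = [[int(v == c) for c in categories] for v in values]
    return categories, encoded


# Hypothetical categorical feature with three possible values.
ear_shapes = ["pointy", "floppy", "oval", "pointy"]
categories, encoded = one_hot(ear_shapes)
print(categories)
for value, row in zip(ear_shapes, encoded):
    print(value, row)
```

After encoding, every feature is binary, so the same split-selection procedure used for binary features applies without change.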

Continuous valued features
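
For a continuous feature, a common approach is to try candidate thresholds and keep the one with the highest information gain. The sketch below assumes the candidates are the midpoints between consecutive distinct values; the data and function names are hypothetical.

```python
import math


def entropy(labels):
    """Binary entropy (bits) of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p1 = sum(labels) / len(labels)
    if p1 in (0.0, 1.0):
        return 0.0
    return -p1 * math.log2(p1) - (1 - p1) * math.log2(1 - p1)


def best_threshold(values, labels):
    """Return (threshold, gain) for the best split of the form value <= threshold."""
    n = len(values)
    distinct = sorted(set(values))
    # Candidate thresholds: midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    parent_h = entropy(labels)
    best_t, best_gain = None, 0.0
    for t in candidates:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        weighted = len(left) / n * entropy(left) + len(right) / n * entropy(right)
        gain = parent_h - weighted
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain


# Hypothetical continuous feature and binary labels.
weights = [8, 9, 10, 12, 14, 16]
is_positive = [1, 1, 1, 0, 0, 0]
print(best_threshold(weights, is_positive))
```

The winning threshold then competes with the other features on information gain, like any binary feature.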

Regression Trees
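
A regression tree predicts a number, typically the mean target value of the training examples in a leaf, and chooses splits by how much they reduce the variance of the targets instead of the entropy. The sketch below assumes population variance (dividing by n); the numbers and function names are hypothetical.

```python
def variance(ys):
    """Population variance of a list of numbers (0.0 for an empty list)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys) / len(ys)


def variance_reduction(parent, left, right):
    """Parent variance minus the weighted average variance of the branches."""
    w_left = len(left) / len(parent)
    w_right = len(right) / len(parent)
    return variance(parent) - (w_left * variance(left) + w_right * variance(right))


def leaf_prediction(ys):
    """A leaf predicts the mean target of the examples that reach it."""
    return sum(ys) / len(ys)


# Hypothetical targets before and after a candidate split.
parent = [1, 2, 3, 11, 12, 13]
left, right = [1, 2, 3], [11, 12, 13]
print(variance_reduction(parent, left, right))
print(leaf_prediction(left), leaf_prediction(right))
```

Variance reduction plays the same role here that information gain plays for classification: the split with the largest reduction is chosen.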

Tree ensembles

Using multiple decision trees

Sampling with replacement
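
Sampling with replacement builds a new training set of the same size as the original by drawing examples at random and putting each one back before the next draw, so some examples appear several times and others not at all. A minimal sketch, with a hypothetical helper name and toy data:

```python
import random


def bootstrap_sample(rows, labels, rng):
    """Draw len(rows) examples uniformly at random, with replacement."""
    n = len(rows)
    indices = [rng.randrange(n) for _ in range(n)]
    return [rows[i] for i in indices], [labels[i] for i in indices]


rng = random.Random(0)  # seeded so the sketch is reproducible
rows = [[1, 0], [0, 1], [1, 1], [0, 0], [1, 0]]
labels = [1, 0, 1, 0, 1]
sample_rows, sample_labels = bootstrap_sample(rows, labels, rng)
print(sample_rows, sample_labels)
```

Each resampled set differs slightly from the original, which is what lets an ensemble train many different trees from one dataset.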

Random forest algorithm
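
A random forest trains each tree on a bootstrap sample and, at every node, considers only a random subset of the features, then predicts by majority vote. The sketch below is a simplified illustration under several assumptions: binary features and labels, a subset size of about the square root of the number of features, and hypothetical function names and defaults. It is not a substitute for a library implementation.

```python
import math
import random


def entropy(labels):
    """Binary entropy (bits) of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p1 = sum(labels) / len(labels)
    if p1 in (0.0, 1.0):
        return 0.0
    return -p1 * math.log2(p1) - (1 - p1) * math.log2(1 - p1)


def build_tree(rows, labels, k, rng, depth=0, max_depth=3):
    """Grow a tree that considers only k randomly chosen features per node."""
    majority = int(2 * sum(labels) >= len(labels))
    if depth == max_depth or entropy(labels) == 0.0:
        return majority
    best_gain, best_f = 0.0, None
    for f in rng.sample(range(len(rows[0])), k):  # random feature subset
        left = [y for r, y in zip(rows, labels) if r[f] == 1]
        right = [y for r, y in zip(rows, labels) if r[f] == 0]
        if not left or not right:
            continue
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        gain = entropy(labels) - weighted
        if gain > best_gain:
            best_gain, best_f = gain, f
    if best_f is None:
        return majority
    left = [(r, y) for r, y in zip(rows, labels) if r[best_f] == 1]
    right = [(r, y) for r, y in zip(rows, labels) if r[best_f] == 0]
    return {
        "feature": best_f,
        "left": build_tree([r for r, _ in left], [y for _, y in left],
                           k, rng, depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [y for _, y in right],
                            k, rng, depth + 1, max_depth),
    }


def predict_tree(tree, row):
    while isinstance(tree, dict):
        tree = tree["left"] if row[tree["feature"]] == 1 else tree["right"]
    return tree


def train_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    n = len(rows)
    k = max(1, round(math.sqrt(len(rows[0]))))  # assumed subset size
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], k, rng))
    return forest


def predict_forest(forest, row):
    """Majority vote over all trees (ties go to class 1)."""
    votes = sum(predict_tree(t, row) for t in forest)
    return int(2 * votes >= len(forest))


# Toy data: four binary features; the label copies the first feature.
rows = [[a, b, c, d] for a in (0, 1) for b in (0, 1) for c in (0, 1) for d in (0, 1)]
labels = [r[0] for r in rows]
forest = train_forest(rows, labels)
print([predict_forest(forest, r) for r in rows])
```

Restricting each node to a random feature subset makes the trees less alike than bootstrap sampling alone would, which is the point of averaging over many of them.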

XGBoost
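
XGBoost is a library for gradient-boosted trees, where trees are added one at a time and each new tree focuses on what the current ensemble still gets wrong. The sketch below illustrates only that general boosting idea, using one-split regression trees (stumps) fitted to the residuals under squared error. It does not use the XGBoost library and omits its regularization and optimizations; all names, defaults, and data are hypothetical.

```python
def fit_stump(xs, residuals):
    """Best single-threshold split of a 1-D feature under squared error."""
    best = None
    distinct = sorted(set(xs))
    for a, b in zip(distinct, distinct[1:]):
        t = (a + b) / 2
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, left_mean, right_mean)
    _, t, left_mean, right_mean = best
    return t, left_mean, right_mean


def boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]  # what is still wrong
        t, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((t, left_mean, right_mean))
        preds = [p + learning_rate * (left_mean if x <= t else right_mean)
                 for x, p in zip(xs, preds)]
    return {"base": base, "stumps": stumps, "learning_rate": learning_rate}


def predict(model, x):
    """Sum the base value and the scaled contribution of every stump."""
    out = model["base"]
    for t, left_mean, right_mean in model["stumps"]:
        out += model["learning_rate"] * (left_mean if x <= t else right_mean)
    return out


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


# Hypothetical 1-D regression data.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 3.0, 3.2, 5.1, 5.0, 4.9]
model = boost(xs, ys)
mse_before = mse(ys, [model["base"]] * len(ys))
mse_after = mse(ys, [predict(model, x) for x in xs])
print(mse_before, mse_after)
```

In practice the library would be used directly; this sketch is only meant to show why adding trees sequentially reduces the training error.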

When to use decision trees
