This notebook is an exercise in the Introduction to Machine Learning course. You can reference the tutorial at this link.


Recap

You've built your first model, and now it's time to optimize the size of the tree to make better predictions. Run this cell to set up your coding environment where the previous step left off.

Exercises

You could write the function get_mae yourself. For now, we'll supply it. This is the same function you read about in the previous lesson. Just run the cell below.

Step 1: Compare Different Tree Sizes

Write a loop that tries the following values for max_leaf_nodes from a set of possible values.

Call the get_mae function on each value of max_leaf_nodes. Store the output in some way that allows you to select the value of max_leaf_nodes that gives the most accurate model on your data.

Step 2: Fit Model Using All Data

You know the best tree size. If you were going to deploy this model in practice, you would make it even more accurate by using all of the data and keeping that tree size. That is, you don't need to hold out the validation data now that you've made all your modeling decisions.

You've tuned this model and improved your results. But we are still using Decision Tree models, which are not very sophisticated by modern machine learning standards. In the next step you will learn to use Random Forests to improve your models even more.

Keep Going

You are ready for Random Forests.


Have questions or comments? Visit the Learn Discussion forum to chat with other Learners.