This notebook is an exercise in the Introduction to Machine Learning course. You can reference the accompanying tutorial for a refresher.


Recap

You've built a model. In this exercise you will test how good your model is.

Run the cell below to set up your coding environment, continuing where the previous exercise left off.
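The setup cell itself isn't reproduced here. As a rough guide, here is a minimal sketch of what it typically does; the file path, the SalePrice target, the feature list, and the iowa_model name are assumptions based on the course's Iowa housing data, not the exact cell contents:

```python
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

# Hypothetical path: substitute the location of your own training data
home_data = pd.read_csv("train.csv")

# Assumed target and features from the Iowa housing dataset
y = home_data.SalePrice
feature_names = ["LotArea", "YearBuilt", "1stFlrSF", "2ndFlrSF",
                 "FullBath", "BedroomAbvGr", "TotRmsAbvGrd"]
X = home_data[feature_names]

# The previous exercise fit on all the data, so these are in-sample predictions
iowa_model = DecisionTreeRegressor()
iowa_model.fit(X, y)
print("First in-sample predictions:", iowa_model.predict(X.head()))
```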

Exercises

Step 1: Split Your Data

Use the train_test_split function to split up your data.

Give it the argument random_state=1 so the check functions know what to expect when verifying your code.

Recall that your features are loaded in the DataFrame X and your target is loaded in y.
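If you get stuck, here is a minimal sketch of the split, assuming X and y are already loaded as described above (the train_X/val_X/train_y/val_y names are a common convention, not required):

```python
from sklearn.model_selection import train_test_split

# Split features and target into training and validation sets.
# random_state=1 makes the split reproducible so the checker can verify it.
train_X, val_X, train_y, val_y = train_test_split(X, y, random_state=1)
```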

Step 2: Specify and Fit the Model

Create a DecisionTreeRegressor model and fit it to the relevant data. Set random_state to 1 again when creating the model.
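A minimal sketch, assuming the split variables from Step 1 (the iowa_model name is illustrative):

```python
from sklearn.tree import DecisionTreeRegressor

# Specify the model; random_state=1 makes the fit reproducible
iowa_model = DecisionTreeRegressor(random_state=1)

# Fit on the training split only, holding the validation split out
iowa_model.fit(train_X, train_y)
```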

Step 3: Make Predictions with Validation Data

Inspect your predictions and actual values from validation data.

What do you notice that is different from what you saw with the in-sample predictions (printed after the top code cell on this page)?

Do you remember why validation predictions differ from in-sample (or training) predictions? This is an important idea from the last lesson.
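A sketch of making and inspecting validation predictions, assuming the fitted model and split variables from the earlier steps:

```python
# Predict on data the model never saw during fitting
val_predictions = iowa_model.predict(val_X)

# Compare the first few predictions against the actual target values
print("Predicted:", val_predictions[:5])
print("Actual:   ", val_y.head().tolist())
```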

Step 4: Calculate the Mean Absolute Error in Validation Data

Is that MAE good? There isn't a general rule for what MAE values count as good; it depends on the application. But you'll see how to use (and improve) this number in the next step.
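A minimal sketch of the calculation, assuming the validation predictions from Step 3:

```python
from sklearn.metrics import mean_absolute_error

# MAE: the average absolute difference between predicted and actual values
val_mae = mean_absolute_error(val_y, val_predictions)
print(val_mae)
```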

Keep Going

You are ready for Underfitting and Overfitting.


Have questions or comments? Visit the Learn Discussion forum to chat with other learners.