This notebook is an exercise in the Data Visualization course. You can reference the tutorial at this link.
In this exercise, you will use your new knowledge to propose a solution to a real-world scenario. To succeed, you will need to import data into Python, answer questions using the data, and generate line charts to understand patterns in the data.
You have recently been hired to manage the museums in the City of Los Angeles. Your first project focuses on the four museums pictured in the images below.
You will leverage data from the Los Angeles Data Portal that tracks monthly visitors to each museum.
Run the next cell to import and configure the Python libraries that you need to complete the exercise.
import pandas as pd
pd.plotting.register_matplotlib_converters()
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
print("Setup Complete")
Setup Complete
The questions below will give you feedback on your work. Run the following cell to set up the feedback system.
# Set up code checking
import os
if not os.path.exists("../input/museum_visitors.csv"):
    os.symlink("../input/data-for-datavis/museum_visitors.csv", "../input/museum_visitors.csv")
from learntools.core import binder
binder.bind(globals())
from learntools.data_viz_to_coder.ex2 import *
print("Setup Complete")
Setup Complete
Your first assignment is to read the LA Museum Visitors data file into museum_data. Note that:
- The filepath to the dataset is stored as museum_filepath. Please do not change the provided value of the filepath.
- The name of the column to use as row labels is "Date". (This can be seen in cell A1 when the file is opened in Excel.)
To help with this, you may find it useful to revisit some relevant code from the tutorial, which we have pasted below:
# Path of the file to read
spotify_filepath = "../input/spotify.csv"
# Read the file into a variable spotify_data
spotify_data = pd.read_csv(spotify_filepath, index_col="Date", parse_dates=True)
The code you need to write now looks very similar!
# Path of the file to read
museum_filepath = "../input/museum_visitors.csv"
# Fill in the line below to read the file into a variable museum_data
museum_data = pd.read_csv(museum_filepath, index_col="Date", parse_dates=True)
# Run the line below with no changes to check that you've loaded the data correctly
step_1.check()
Correct
# Uncomment the line below to receive a hint
#step_1.hint()
# Uncomment the line below to see the solution
#step_1.solution()
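As a rough illustration of what the two read_csv arguments do, the sketch below reads a tiny in-memory CSV instead of the real file. The variable names csv_text and sample are made up for this example, and only two rows and two columns (copied from the tail of the dataset shown in this notebook) are included.

```python
import io

import pandas as pd

# Two rows and two columns copied from the dataset tail shown in this notebook,
# kept in memory so the example does not depend on the CSV file being present.
csv_text = """Date,Avila Adobe,Firehouse Museum
2018-10-01,19280,4622
2018-11-01,17163,4082
"""

# index_col="Date" uses the Date column as the row labels.
# parse_dates=True asks pandas to parse those labels as dates.
sample = pd.read_csv(io.StringIO(csv_text), index_col="Date", parse_dates=True)

# The row labels are now a DatetimeIndex rather than plain strings
print(type(sample.index).__name__)

# A row can be looked up by its date
print(sample.loc[pd.Timestamp("2018-11-01"), "Avila Adobe"])
```

With a date-typed index, plotting functions can place the rows on a proper time axis, which is why the same two arguments are used for museum_data.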
Use a Python command to print the last 5 rows of the data.
# Print the last five rows of the data
museum_data.tail()
Date | Avila Adobe | Firehouse Museum | Chinese American Museum | America Tropical Interpretive Center
---|---|---|---|---
2018-07-01 | 23136 | 4191 | 2620 | 4718
2018-08-01 | 20815 | 4866 | 2409 | 3891
2018-09-01 | 21020 | 4956 | 2146 | 3180
2018-10-01 | 19280 | 4622 | 2364 | 3775
2018-11-01 | 17163 | 4082 | 2385 | 4562
The last row (for 2018-11-01) tracks the number of visitors to each museum in November 2018, the next-to-last row (for 2018-10-01) tracks the number of visitors to each museum in October 2018, and so on.
Use the last 5 rows of the data to answer the questions below.
# Fill in the line below: How many visitors did the Chinese American Museum
# receive in July 2018?
ca_museum_jul18 = 2620
# Fill in the line below: In October 2018, how many more visitors did Avila
# Adobe receive than the Firehouse Museum?
avila_oct18 = 19280 - 4622
# Check your answers
step_2.check()
Correct
# Lines below will give you a hint or solution code
#step_2.hint()
step_2.solution()
Solution:
# Print the last five rows of the data
museum_data.tail()
# How many visitors did the Chinese American Museum
# receive in July 2018?
ca_museum_jul18 = 2620
# In October 2018, how many more visitors did Avila
# Adobe receive than the Firehouse Museum?
avila_oct18 = 14658
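The exercise expects you to read these values off the printed table, but the same answers can also be looked up in code. The sketch below is one possible way to do that, not part of the official solution. It rebuilds the five rows printed by museum_data.tail() by hand, and the names tail, ca_museum_jul18_lookup, and avila_oct18_lookup are introduced only for this illustration.

```python
import pandas as pd

# The five rows printed by museum_data.tail(), rebuilt by hand for illustration
tail = pd.DataFrame(
    {
        "Avila Adobe": [23136, 20815, 21020, 19280, 17163],
        "Firehouse Museum": [4191, 4866, 4956, 4622, 4082],
        "Chinese American Museum": [2620, 2409, 2146, 2364, 2385],
        "America Tropical Interpretive Center": [4718, 3891, 3180, 3775, 4562],
    },
    index=pd.to_datetime(
        ["2018-07-01", "2018-08-01", "2018-09-01", "2018-10-01", "2018-11-01"]
    ),
)
tail.index.name = "Date"

jul18 = pd.Timestamp("2018-07-01")
oct18 = pd.Timestamp("2018-10-01")

# .loc[row_label, column_label] returns a single cell
ca_museum_jul18_lookup = int(tail.loc[jul18, "Chinese American Museum"])

# Difference between two cells in the October 2018 row
avila_oct18_lookup = int(
    tail.loc[oct18, "Avila Adobe"] - tail.loc[oct18, "Firehouse Museum"]
)

print(ca_museum_jul18_lookup, avila_oct18_lookup)
```

On the full dataset, the same .loc lookups should work on museum_data directly, assuming it was loaded with the Date column parsed as the index.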
The Firehouse Museum claims they ran an event in 2014 that brought an incredible number of visitors, and that they should get extra budget to run a similar event again. The other museums think these types of events aren't that important, and budgets should be split purely based on recent visitors on an average day.
To show the museum board how the event compared to regular traffic at each museum, create a line chart that shows how the number of visitors to each museum evolved over time. Your figure should have four lines (one for each museum).
(Optional) Note: If you have some prior experience with plotting figures in Python, you might be familiar with the plt.show() command. If you decide to use this command, please place it after the line of code that checks your answer (in this case, place it after step_3.check() below) -- otherwise, the checking code will return an error!
# Line chart showing the number of visitors to each museum over time
plt.figure(figsize=(14,6))
plt.title("Monthly visitors of Museums 2014-2018")
sns.lineplot(data=museum_data)
# Check your answer
step_3.check()
Correct
# Lines below will give you a hint or solution code
#step_3.hint()
#step_3.solution_plot()
When meeting with the employees at Avila Adobe, you hear that one major pain point is that the number of museum visitors varies greatly with the seasons, with low seasons (when the employees are perfectly staffed and happy) and also high seasons (when the employees are understaffed and stressed). You realize that if you can predict these high and low seasons, you can plan ahead to hire some additional seasonal employees to help out with the extra work.
Create a line chart that shows how the number of visitors to Avila Adobe has evolved over time. (If your code returns an error, the first thing that you should check is that you've spelled the name of the column correctly! You must write the name of the column exactly as it appears in the dataset.)
# Line plot showing the number of visitors to Avila Adobe over time
plt.figure(figsize=(14,6))
plt.title("Monthly visitors to Avila Adobe 2014-2018")
sns.lineplot(data=museum_data['Avila Adobe'])
# Check your answer
step_4.a.check()
Thank you for creating a line chart! To see how your code compares to the official solution, please use the code cell below.
Correct
# Lines below will give you a hint or solution code
#step_4.a.hint()
step_4.a.solution_plot()
Solution:
# Set the width and height of the figure
plt.figure(figsize=(12,6))
# Add title
plt.title("Monthly Visitors to Avila Adobe")
# Line chart showing the number of visitors to Avila Adobe over time
sns.lineplot(data=museum_data['Avila Adobe'])
# Add label for horizontal axis
plt.xlabel("Date")
Does Avila Adobe get more visitors:
Using this information, when should the museum staff additional seasonal employees?
#step_4.b.hint()
# Check your answer (Run this code cell to receive credit!)
step_4.b.solution()
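Besides reading the seasonal pattern off the line chart, one way to check it numerically is to average the visitor counts by calendar month. The sketch below is an assumption about how that could be done, not part of the exercise. The helper average_by_calendar_month is hypothetical, and it is demonstrated only on the five Avila Adobe values from the table printed earlier, so each month has a single observation here.

```python
import pandas as pd


def average_by_calendar_month(visitors: pd.Series) -> pd.Series:
    """Mean visitor count per calendar month (1 = January, ..., 12 = December).

    Expects a Series indexed by dates (a DatetimeIndex).
    """
    return visitors.groupby(visitors.index.month).mean()


# Avila Adobe values from the last five rows of the dataset shown above
avila_tail = pd.Series(
    [23136, 20815, 21020, 19280, 17163],
    index=pd.to_datetime(
        ["2018-07-01", "2018-08-01", "2018-09-01", "2018-10-01", "2018-11-01"]
    ),
    name="Avila Adobe",
)

by_month = average_by_calendar_month(avila_tail)
print(by_month)

# Calendar month with the highest average in this small sample
print(by_month.idxmax())
```

Applied to museum_data['Avila Adobe'], the same helper would average each calendar month across all years in the data, which gives a compact summary of the high and low seasons to compare against the chart.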
Move on to learn about bar charts and heatmaps with a new dataset!
Have questions or comments? Visit the Learn Discussion forum to chat with other Learners.