{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**This notebook is an exercise in the [Pandas](https://www.kaggle.com/learn/pandas) course.  You can reference the tutorial at [this link](https://www.kaggle.com/residentmario/summary-functions-and-maps).**\n",
    "\n",
    "---\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Introduction\n",
    "\n",
    "Now you are ready to get a deeper understanding of your data.\n",
    "\n",
    "Run the following cell to load your data and some utility functions (including code to check your answers)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Setup complete.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>country</th>\n",
       "      <th>description</th>\n",
       "      <th>designation</th>\n",
       "      <th>points</th>\n",
       "      <th>price</th>\n",
       "      <th>province</th>\n",
       "      <th>region_1</th>\n",
       "      <th>region_2</th>\n",
       "      <th>taster_name</th>\n",
       "      <th>taster_twitter_handle</th>\n",
       "      <th>title</th>\n",
       "      <th>variety</th>\n",
       "      <th>winery</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Italy</td>\n",
       "      <td>Aromas include tropical fruit, broom, brimston...</td>\n",
       "      <td>Vulkà Bianco</td>\n",
       "      <td>87</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Sicily &amp; Sardinia</td>\n",
       "      <td>Etna</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Kerin O’Keefe</td>\n",
       "      <td>@kerinokeefe</td>\n",
       "      <td>Nicosia 2013 Vulkà Bianco  (Etna)</td>\n",
       "      <td>White Blend</td>\n",
       "      <td>Nicosia</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Portugal</td>\n",
       "      <td>This is ripe and fruity, a wine that is smooth...</td>\n",
       "      <td>Avidagos</td>\n",
       "      <td>87</td>\n",
       "      <td>15.0</td>\n",
       "      <td>Douro</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Roger Voss</td>\n",
       "      <td>@vossroger</td>\n",
       "      <td>Quinta dos Avidagos 2011 Avidagos Red (Douro)</td>\n",
       "      <td>Portuguese Red</td>\n",
       "      <td>Quinta dos Avidagos</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>US</td>\n",
       "      <td>Tart and snappy, the flavors of lime flesh and...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>87</td>\n",
       "      <td>14.0</td>\n",
       "      <td>Oregon</td>\n",
       "      <td>Willamette Valley</td>\n",
       "      <td>Willamette Valley</td>\n",
       "      <td>Paul Gregutt</td>\n",
       "      <td>@paulgwine</td>\n",
       "      <td>Rainstorm 2013 Pinot Gris (Willamette Valley)</td>\n",
       "      <td>Pinot Gris</td>\n",
       "      <td>Rainstorm</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>US</td>\n",
       "      <td>Pineapple rind, lemon pith and orange blossom ...</td>\n",
       "      <td>Reserve Late Harvest</td>\n",
       "      <td>87</td>\n",
       "      <td>13.0</td>\n",
       "      <td>Michigan</td>\n",
       "      <td>Lake Michigan Shore</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Alexander Peartree</td>\n",
       "      <td>NaN</td>\n",
       "      <td>St. Julian 2013 Reserve Late Harvest Riesling ...</td>\n",
       "      <td>Riesling</td>\n",
       "      <td>St. Julian</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>US</td>\n",
       "      <td>Much like the regular bottling from 2012, this...</td>\n",
       "      <td>Vintner's Reserve Wild Child Block</td>\n",
       "      <td>87</td>\n",
       "      <td>65.0</td>\n",
       "      <td>Oregon</td>\n",
       "      <td>Willamette Valley</td>\n",
       "      <td>Willamette Valley</td>\n",
       "      <td>Paul Gregutt</td>\n",
       "      <td>@paulgwine</td>\n",
       "      <td>Sweet Cheeks 2012 Vintner's Reserve Wild Child...</td>\n",
       "      <td>Pinot Noir</td>\n",
       "      <td>Sweet Cheeks</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    country                                        description  \\\n",
       "0     Italy  Aromas include tropical fruit, broom, brimston...   \n",
       "1  Portugal  This is ripe and fruity, a wine that is smooth...   \n",
       "2        US  Tart and snappy, the flavors of lime flesh and...   \n",
       "3        US  Pineapple rind, lemon pith and orange blossom ...   \n",
       "4        US  Much like the regular bottling from 2012, this...   \n",
       "\n",
       "                          designation  points  price           province  \\\n",
       "0                        Vulkà Bianco      87    NaN  Sicily & Sardinia   \n",
       "1                            Avidagos      87   15.0              Douro   \n",
       "2                                 NaN      87   14.0             Oregon   \n",
       "3                Reserve Late Harvest      87   13.0           Michigan   \n",
       "4  Vintner's Reserve Wild Child Block      87   65.0             Oregon   \n",
       "\n",
       "              region_1           region_2         taster_name  \\\n",
       "0                 Etna                NaN       Kerin O’Keefe   \n",
       "1                  NaN                NaN          Roger Voss   \n",
       "2    Willamette Valley  Willamette Valley        Paul Gregutt   \n",
       "3  Lake Michigan Shore                NaN  Alexander Peartree   \n",
       "4    Willamette Valley  Willamette Valley        Paul Gregutt   \n",
       "\n",
       "  taster_twitter_handle                                              title  \\\n",
       "0          @kerinokeefe                  Nicosia 2013 Vulkà Bianco  (Etna)   \n",
       "1            @vossroger      Quinta dos Avidagos 2011 Avidagos Red (Douro)   \n",
       "2           @paulgwine       Rainstorm 2013 Pinot Gris (Willamette Valley)   \n",
       "3                   NaN  St. Julian 2013 Reserve Late Harvest Riesling ...   \n",
       "4           @paulgwine   Sweet Cheeks 2012 Vintner's Reserve Wild Child...   \n",
       "\n",
       "          variety               winery  \n",
       "0     White Blend              Nicosia  \n",
       "1  Portuguese Red  Quinta dos Avidagos  \n",
       "2      Pinot Gris            Rainstorm  \n",
       "3        Riesling           St. Julian  \n",
       "4      Pinot Noir         Sweet Cheeks  "
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "pd.set_option(\"display.max_rows\", 5)\n",
    "reviews = pd.read_csv(\"../input/wine-reviews/winemag-data-130k-v2.csv\", index_col=0)\n",
    "\n",
    "from learntools.core import binder; binder.bind(globals())\n",
    "from learntools.pandas.summary_functions_and_maps import *\n",
    "print(\"Setup complete.\")\n",
    "\n",
    "reviews.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Exercises"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.\n",
    "\n",
    "What is the median of the `points` column in the `reviews` DataFrame?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/javascript": [
       "parent.postMessage({\"jupyterEvent\": \"custom.exercise_interaction\", \"data\": {\"outcomeType\": 1, \"valueTowardsCompletion\": 0.14285714285714285, \"interactionType\": 1, \"questionType\": 1, \"questionId\": \"1_MedianPoints\", \"learnToolsVersion\": \"0.3.4\", \"failureMessage\": \"\", \"exceptionClass\": \"\", \"trace\": \"\"}}, \"*\")"
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "<span style=\"color:#33cc33\">Correct</span>"
      ],
      "text/plain": [
       "Correct"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "median_points = reviews.points.median()\n",
    "\n",
    "# Check your answer\n",
    "q1.check()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/javascript": [
       "parent.postMessage({\"jupyterEvent\": \"custom.exercise_interaction\", \"data\": {\"interactionType\": 3, \"questionType\": 1, \"questionId\": \"1_MedianPoints\", \"learnToolsVersion\": \"0.3.4\", \"valueTowardsCompletion\": 0.0, \"failureMessage\": \"\", \"exceptionClass\": \"\", \"trace\": \"\", \"outcomeType\": 4}}, \"*\")"
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "<span style=\"color:#33cc99\">Solution:</span> \n",
       "```python\n",
       "median_points = reviews.points.median()\n",
       "```"
      ],
      "text/plain": [
       "Solution: \n",
       "```python\n",
       "median_points = reviews.points.median()\n",
       "```"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "#q1.hint()\n",
    "#q1.solution()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. \n",
    "What countries are represented in the dataset? (Your answer should not include any duplicates.)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/javascript": [
       "parent.postMessage({\"jupyterEvent\": \"custom.exercise_interaction\", \"data\": {\"outcomeType\": 1, \"valueTowardsCompletion\": 0.14285714285714285, \"interactionType\": 1, \"questionType\": 2, \"questionId\": \"2_UniqueCountries\", \"learnToolsVersion\": \"0.3.4\", \"failureMessage\": \"\", \"exceptionClass\": \"\", \"trace\": \"\"}}, \"*\")"
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "<span style=\"color:#33cc33\">Correct</span>"
      ],
      "text/plain": [
       "Correct"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "countries = reviews.country.unique()\n",
    "\n",
    "# Check your answer\n",
    "q2.check()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#q2.hint()\n",
    "#q2.solution()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3.\n",
    "How often does each country appear in the dataset? Create a Series `reviews_per_country` mapping countries to the count of reviews of wines from that country."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/javascript": [
       "parent.postMessage({\"jupyterEvent\": \"custom.exercise_interaction\", \"data\": {\"outcomeType\": 1, \"valueTowardsCompletion\": 0.14285714285714285, \"interactionType\": 1, \"questionType\": 1, \"questionId\": \"3_ReviewsPerCountry\", \"learnToolsVersion\": \"0.3.4\", \"failureMessage\": \"\", \"exceptionClass\": \"\", \"trace\": \"\"}}, \"*\")"
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "<span style=\"color:#33cc33\">Correct</span>"
      ],
      "text/plain": [
       "Correct"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "reviews_per_country = reviews.country.value_counts()\n",
    "\n",
    "# Check your answer\n",
    "q3.check()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#q3.hint()\n",
    "#q3.solution()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4.\n",
    "Create variable `centered_price` containing a version of the `price` column with the mean price subtracted.\n",
    "\n",
    "(Note: this 'centering' transformation is a common preprocessing step before applying various machine learning algorithms.) "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/javascript": [
       "parent.postMessage({\"jupyterEvent\": \"custom.exercise_interaction\", \"data\": {\"outcomeType\": 1, \"valueTowardsCompletion\": 0.14285714285714285, \"interactionType\": 1, \"questionType\": 1, \"questionId\": \"4_CenteredPrice\", \"learnToolsVersion\": \"0.3.4\", \"failureMessage\": \"\", \"exceptionClass\": \"\", \"trace\": \"\"}}, \"*\")"
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "<span style=\"color:#33cc33\">Correct</span>"
      ],
      "text/plain": [
       "Correct"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "mean_price = reviews.price.mean()\n",
    "centered_price = reviews.price.map(lambda r: r - mean_price)\n",
    "\n",
    "# Check your answer\n",
    "q4.check()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/javascript": [
       "parent.postMessage({\"jupyterEvent\": \"custom.exercise_interaction\", \"data\": {\"interactionType\": 2, \"questionType\": 1, \"questionId\": \"4_CenteredPrice\", \"learnToolsVersion\": \"0.3.4\", \"valueTowardsCompletion\": 0.0, \"failureMessage\": \"\", \"exceptionClass\": \"\", \"trace\": \"\", \"outcomeType\": 4}}, \"*\")"
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "<span style=\"color:#3366cc\">Hint:</span> To get the mean of a column in a Pandas DataFrame, use the `mean` function."
      ],
      "text/plain": [
       "Hint: To get the mean of a column in a Pandas DataFrame, use the `mean` function."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "q4.hint()\n",
    "#q4.solution()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5.\n",
    "I'm an economical wine buyer. Which wine is the \"best bargain\"? Create a variable `bargain_wine` with the title of the wine with the highest points-to-price ratio in the dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/javascript": [
       "parent.postMessage({\"jupyterEvent\": \"custom.exercise_interaction\", \"data\": {\"outcomeType\": 1, \"valueTowardsCompletion\": 0.14285714285714285, \"interactionType\": 1, \"questionType\": 2, \"questionId\": \"5_BargainWine\", \"learnToolsVersion\": \"0.3.4\", \"failureMessage\": \"\", \"exceptionClass\": \"\", \"trace\": \"\"}}, \"*\")"
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "<span style=\"color:#33cc33\">Correct</span>"
      ],
      "text/plain": [
       "Correct"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "bargain_idmax = (reviews.points / reviews.price).idxmax()\n",
    "bargain_wine = reviews.loc[bargain_idmax, 'title']\n",
    "\n",
    "# Check your answer\n",
    "q5.check()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/javascript": [
       "parent.postMessage({\"jupyterEvent\": \"custom.exercise_interaction\", \"data\": {\"interactionType\": 2, \"questionType\": 2, \"questionId\": \"5_BargainWine\", \"learnToolsVersion\": \"0.3.4\", \"valueTowardsCompletion\": 0.0, \"failureMessage\": \"\", \"exceptionClass\": \"\", \"trace\": \"\", \"outcomeType\": 4}}, \"*\")"
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "<span style=\"color:#3366cc\">Hint:</span> The `idxmax` method may be useful here."
      ],
      "text/plain": [
       "Hint: The `idxmax` method may be useful here."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/javascript": [
       "parent.postMessage({\"jupyterEvent\": \"custom.exercise_interaction\", \"data\": {\"interactionType\": 3, \"questionType\": 2, \"questionId\": \"5_BargainWine\", \"learnToolsVersion\": \"0.3.4\", \"valueTowardsCompletion\": 0.0, \"failureMessage\": \"\", \"exceptionClass\": \"\", \"trace\": \"\", \"outcomeType\": 4}}, \"*\")"
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "<span style=\"color:#33cc99\">Solution:</span> \n",
       "```python\n",
       "bargain_idx = (reviews.points / reviews.price).idxmax()\n",
       "bargain_wine = reviews.loc[bargain_idx, 'title']\n",
       "```"
      ],
      "text/plain": [
       "Solution: \n",
       "```python\n",
       "bargain_idx = (reviews.points / reviews.price).idxmax()\n",
       "bargain_wine = reviews.loc[bargain_idx, 'title']\n",
       "```"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "q5.hint()\n",
    "q5.solution()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 6.\n",
    "There are only so many words you can use when describing a bottle of wine. Is a wine more likely to be \"tropical\" or \"fruity\"? Create a Series `descriptor_counts` counting how many times each of these two words appears in the `description` column in the dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0    Aromas include tropical fruit, broom, brimston...\n",
       "1    This is ripe and fruity, a wine that is smooth...\n",
       "2    Tart and snappy, the flavors of lime flesh and...\n",
       "3    Pineapple rind, lemon pith and orange blossom ...\n",
       "4    Much like the regular bottling from 2012, this...\n",
       "Name: description, dtype: object"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "reviews.description.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/javascript": [
       "parent.postMessage({\"jupyterEvent\": \"custom.exercise_interaction\", \"data\": {\"outcomeType\": 1, \"valueTowardsCompletion\": 0.14285714285714285, \"interactionType\": 1, \"questionType\": 1, \"questionId\": \"6_DescriptorCounts\", \"learnToolsVersion\": \"0.3.4\", \"failureMessage\": \"\", \"exceptionClass\": \"\", \"trace\": \"\"}}, \"*\")"
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "<span style=\"color:#33cc33\">Correct</span>"
      ],
      "text/plain": [
       "Correct"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# descriptor_counts = reviews.description.map(lambda r: 'tropical' in r or 'fruity' in r).value_counts()\n",
    "n_tropical = reviews.description.map(lambda d: 'tropical' in d).sum()\n",
    "n_fruit = reviews.description.map(lambda d: 'fruity' in d).sum()\n",
    "descriptor_counts = pd.Series([n_tropical, n_fruit], index=['tropical', 'fruity'])\n",
    "\n",
    "# Check your answer\n",
    "q6.check()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/javascript": [
       "parent.postMessage({\"jupyterEvent\": \"custom.exercise_interaction\", \"data\": {\"interactionType\": 2, \"questionType\": 1, \"questionId\": \"6_DescriptorCounts\", \"learnToolsVersion\": \"0.3.4\", \"valueTowardsCompletion\": 0.0, \"failureMessage\": \"\", \"exceptionClass\": \"\", \"trace\": \"\", \"outcomeType\": 4}}, \"*\")"
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "<span style=\"color:#3366cc\">Hint:</span> Use a map to check each description for the string `tropical`, then count up the number of times this is `True`. Repeat this for `fruity`. Finally, create a `Series` combining the two values."
      ],
      "text/plain": [
       "Hint: Use a map to check each description for the string `tropical`, then count up the number of times this is `True`. Repeat this for `fruity`. Finally, create a `Series` combining the two values."
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/javascript": [
       "parent.postMessage({\"jupyterEvent\": \"custom.exercise_interaction\", \"data\": {\"interactionType\": 3, \"questionType\": 1, \"questionId\": \"6_DescriptorCounts\", \"learnToolsVersion\": \"0.3.4\", \"valueTowardsCompletion\": 0.0, \"failureMessage\": \"\", \"exceptionClass\": \"\", \"trace\": \"\", \"outcomeType\": 4}}, \"*\")"
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "<span style=\"color:#33cc99\">Solution:</span> \n",
       "```python\n",
       "n_trop = reviews.description.map(lambda desc: \"tropical\" in desc).sum()\n",
       "n_fruity = reviews.description.map(lambda desc: \"fruity\" in desc).sum()\n",
       "descriptor_counts = pd.Series([n_trop, n_fruity], index=['tropical', 'fruity'])\n",
       "```"
      ],
      "text/plain": [
       "Solution: \n",
       "```python\n",
       "n_trop = reviews.description.map(lambda desc: \"tropical\" in desc).sum()\n",
       "n_fruity = reviews.description.map(lambda desc: \"fruity\" in desc).sum()\n",
       "descriptor_counts = pd.Series([n_trop, n_fruity], index=['tropical', 'fruity'])\n",
       "```"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "q6.hint()\n",
    "q6.solution()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 7.\n",
    "We'd like to host these wine reviews on our website, but a rating system ranging from 80 to 100 points is too hard to understand - we'd like to translate them into simple star ratings. A score of 95 or higher counts as 3 stars, a score of at least 85 but less than 95 is 2 stars. Any other score is 1 star.\n",
    "\n",
    "Also, the Canadian Vintners Association bought a lot of ads on the site, so any wines from Canada should automatically get 3 stars, regardless of points.\n",
    "\n",
    "Create a series `star_ratings` with the number of stars corresponding to each review in the dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>country</th>\n",
       "      <th>description</th>\n",
       "      <th>designation</th>\n",
       "      <th>points</th>\n",
       "      <th>price</th>\n",
       "      <th>province</th>\n",
       "      <th>region_1</th>\n",
       "      <th>region_2</th>\n",
       "      <th>taster_name</th>\n",
       "      <th>taster_twitter_handle</th>\n",
       "      <th>title</th>\n",
       "      <th>variety</th>\n",
       "      <th>winery</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>454</th>\n",
       "      <td>Canada</td>\n",
       "      <td>An aromatic knockout with notes of peach, papa...</td>\n",
       "      <td>Reserve Icewine</td>\n",
       "      <td>92</td>\n",
       "      <td>30.0</td>\n",
       "      <td>Ontario</td>\n",
       "      <td>Niagara-On-The-Lake</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Sean P. Sullivan</td>\n",
       "      <td>@wawinereport</td>\n",
       "      <td>Pillitteri 2012 Reserve Icewine Vidal (Niagara...</td>\n",
       "      <td>Vidal</td>\n",
       "      <td>Pillitteri</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2616</th>\n",
       "      <td>Canada</td>\n",
       "      <td>A slightly earthy, spicy nose leads, followed ...</td>\n",
       "      <td>Fusion</td>\n",
       "      <td>83</td>\n",
       "      <td>12.0</td>\n",
       "      <td>Ontario</td>\n",
       "      <td>Niagara Peninsula</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Susan Kostrzewa</td>\n",
       "      <td>@suskostrzewa</td>\n",
       "      <td>Pillitteri 2004 Fusion Gewürztraminer-Riesling...</td>\n",
       "      <td>Gewürztraminer-Riesling</td>\n",
       "      <td>Pillitteri</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>129528</th>\n",
       "      <td>Canada</td>\n",
       "      <td>A delicious though somewhat reserved wine with...</td>\n",
       "      <td>Icewine</td>\n",
       "      <td>89</td>\n",
       "      <td>45.0</td>\n",
       "      <td>Ontario</td>\n",
       "      <td>Niagara Peninsula</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Sean P. Sullivan</td>\n",
       "      <td>@wawinereport</td>\n",
       "      <td>Henry of Pelham 2011 Icewine Vidal (Niagara Pe...</td>\n",
       "      <td>Vidal</td>\n",
       "      <td>Henry of Pelham</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>129581</th>\n",
       "      <td>Canada</td>\n",
       "      <td>Smooth and engaging, this offers classic varie...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>90</td>\n",
       "      <td>38.0</td>\n",
       "      <td>British Columbia</td>\n",
       "      <td>Okanagan Valley</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Paul Gregutt</td>\n",
       "      <td>@paulgwine</td>\n",
       "      <td>Burrowing Owl 2013 Syrah (Okanagan Valley)</td>\n",
       "      <td>Syrah</td>\n",
       "      <td>Burrowing Owl</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>257 rows × 13 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "       country                                        description  \\\n",
       "454     Canada  An aromatic knockout with notes of peach, papa...   \n",
       "2616    Canada  A slightly earthy, spicy nose leads, followed ...   \n",
       "...        ...                                                ...   \n",
       "129528  Canada  A delicious though somewhat reserved wine with...   \n",
       "129581  Canada  Smooth and engaging, this offers classic varie...   \n",
       "\n",
       "            designation  points  price          province             region_1  \\\n",
       "454     Reserve Icewine      92   30.0           Ontario  Niagara-On-The-Lake   \n",
       "2616             Fusion      83   12.0           Ontario    Niagara Peninsula   \n",
       "...                 ...     ...    ...               ...                  ...   \n",
       "129528          Icewine      89   45.0           Ontario    Niagara Peninsula   \n",
       "129581              NaN      90   38.0  British Columbia      Okanagan Valley   \n",
       "\n",
       "       region_2       taster_name taster_twitter_handle  \\\n",
       "454         NaN  Sean P. Sullivan         @wawinereport   \n",
       "2616        NaN   Susan Kostrzewa         @suskostrzewa   \n",
       "...         ...               ...                   ...   \n",
       "129528      NaN  Sean P. Sullivan         @wawinereport   \n",
       "129581      NaN      Paul Gregutt           @paulgwine    \n",
       "\n",
       "                                                    title  \\\n",
       "454     Pillitteri 2012 Reserve Icewine Vidal (Niagara...   \n",
       "2616    Pillitteri 2004 Fusion Gewürztraminer-Riesling...   \n",
       "...                                                   ...   \n",
       "129528  Henry of Pelham 2011 Icewine Vidal (Niagara Pe...   \n",
       "129581         Burrowing Owl 2013 Syrah (Okanagan Valley)   \n",
       "\n",
       "                        variety           winery  \n",
       "454                       Vidal       Pillitteri  \n",
       "2616    Gewürztraminer-Riesling       Pillitteri  \n",
       "...                         ...              ...  \n",
       "129528                    Vidal  Henry of Pelham  \n",
       "129581                    Syrah    Burrowing Owl  \n",
       "\n",
       "[257 rows x 13 columns]"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "reviews[reviews.country == 'Canada']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/javascript": [
       "parent.postMessage({\"jupyterEvent\": \"custom.exercise_interaction\", \"data\": {\"outcomeType\": 1, \"valueTowardsCompletion\": 0.14285714285714285, \"interactionType\": 1, \"questionType\": 1, \"questionId\": \"7_StarRatings\", \"learnToolsVersion\": \"0.3.4\", \"failureMessage\": \"\", \"exceptionClass\": \"\", \"trace\": \"\"}}, \"*\")"
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "<span style=\"color:#33cc33\">Correct</span>"
      ],
      "text/plain": [
       "Correct"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# este codigo me pasaba como correcto y esta mal 🤷‍\n",
    "# def calculate_stars_from_points(points):\n",
    "#     if points > 94:\n",
    "#         return 3\n",
    "#     if points > 84:\n",
    "#         return 2\n",
    "#     return 1\n",
    "\n",
    "# star_ratings = reviews.points.map(calculate_stars_from_points)\n",
    "\n",
    "def calculate_stars(row):\n",
    "    if row.country == 'Canada':\n",
    "        return 3\n",
    "    if row.points > 94:\n",
    "        return 3\n",
    "    if row.points > 84:\n",
    "        return 2\n",
    "    return 1\n",
    "\n",
    "star_ratings = reviews.apply(calculate_stars, axis='columns')\n",
    "\n",
    "\n",
    "\n",
    "# Check your answer\n",
    "q7.check()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/javascript": [
       "parent.postMessage({\"jupyterEvent\": \"custom.exercise_interaction\", \"data\": {\"interactionType\": 3, \"questionType\": 1, \"questionId\": \"7_StarRatings\", \"learnToolsVersion\": \"0.3.4\", \"valueTowardsCompletion\": 0.0, \"failureMessage\": \"\", \"exceptionClass\": \"\", \"trace\": \"\", \"outcomeType\": 4}}, \"*\")"
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "<span style=\"color:#33cc99\">Solution:</span> \n",
       "```python\n",
       "def stars(row):\n",
       "    if row.country == 'Canada':\n",
       "        return 3\n",
       "    elif row.points >= 95:\n",
       "        return 3\n",
       "    elif row.points >= 85:\n",
       "        return 2\n",
       "    else:\n",
       "        return 1\n",
       "    \n",
       "star_ratings = reviews.apply(stars, axis='columns')\n",
       "```"
      ],
      "text/plain": [
       "Solution: \n",
       "```python\n",
       "def stars(row):\n",
       "    if row.country == 'Canada':\n",
       "        return 3\n",
       "    elif row.points >= 95:\n",
       "        return 3\n",
       "    elif row.points >= 85:\n",
       "        return 2\n",
       "    else:\n",
       "        return 1\n",
       "    \n",
       "star_ratings = reviews.apply(stars, axis='columns')\n",
       "```"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "#q7.hint()\n",
    "q7.solution()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Keep going\n",
    "Continue to **[grouping and sorting](https://www.kaggle.com/residentmario/grouping-and-sorting)**."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "*Have questions or comments? Visit the [Learn Discussion forum](https://www.kaggle.com/learn-forum/161299) to chat with other Learners.*"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}