{ "cells": [ { "cell_type": "markdown", "metadata": { "papermill": { "duration": 0.011532, "end_time": "2020-10-01T01:25:33.370672", "exception": false, "start_time": "2020-10-01T01:25:33.359140", "status": "completed" }, "tags": [] }, "source": [ "# Introduction\n", "\n", "Data often comes to us with column names, index names, or other naming conventions that we are not satisfied with. In this tutorial, you'll learn how to use pandas functions to change the names of the offending entries to something better.\n", "\n", "You'll also explore how to combine data from multiple DataFrames and/or Series.\n", "\n", "**To start the exercise for this topic, please click [here](https://www.kaggle.com/kernels/fork/638064).**\n", "\n", "# Renaming\n", "\n", "The first function we'll introduce here is `rename()`, which lets you change index names and/or column names. For example, to change the `points` column in our dataset to `score`, we would do:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "_kg_hide-input": true, "execution": { "iopub.execute_input": "2020-10-01T01:25:33.406785Z", "iopub.status.busy": "2020-10-01T01:25:33.405704Z", "iopub.status.idle": "2020-10-01T01:25:34.484737Z", "shell.execute_reply": "2020-10-01T01:25:34.485436Z" }, "papermill": { "duration": 1.10373, "end_time": "2020-10-01T01:25:34.485656", "exception": false, "start_time": "2020-10-01T01:25:33.381926", "status": "completed" }, "tags": [] }, "outputs": [], "source": [ "\n", "import pandas as pd\n", "pd.set_option('display.max_rows', 5)\n", "reviews = pd.read_csv(\"../input/wine-reviews/winemag-data-130k-v2.csv\", index_col=0)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true, "execution": { "iopub.execute_input": "2020-10-01T01:25:34.512675Z", "iopub.status.busy": "2020-10-01T01:25:34.511845Z", "iopub.status.idle": "2020-10-01T01:25:34.568444Z", "shell.execute_reply": "2020-10-01T01:25:34.569096Z" }, "jupyter": { "outputs_hidden": true }, "papermill": 
{ "duration": 0.073222, "end_time": "2020-10-01T01:25:34.569312", "exception": false, "start_time": "2020-10-01T01:25:34.496090", "status": "completed" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " country description \\\n", "0 Italy Aromas include tropical fruit, broom, brimston... \n", "1 Portugal This is ripe and fruity, a wine that is smooth... \n", "... ... ... \n", "129969 France A dry style of Pinot Gris, this is crisp with ... \n", "129970 France Big, rich and off-dry, this is powered by inte... \n", "\n", " designation score price province \\\n", "0 Vulkà Bianco 87 NaN Sicily & Sardinia \n", "1 Avidagos 87 15.0 Douro \n", "... ... ... ... ... \n", "129969 NaN 90 32.0 Alsace \n", "129970 Lieu-dit Harth Cuvée Caroline 90 21.0 Alsace \n", "\n", " region_1 region_2 taster_name taster_twitter_handle \\\n", "0 Etna NaN Kerin O’Keefe @kerinokeefe \n", "1 NaN NaN Roger Voss @vossroger \n", "... ... ... ... ... \n", "129969 Alsace NaN Roger Voss @vossroger \n", "129970 Alsace NaN Roger Voss @vossroger \n", "\n", " title variety \\\n", "0 Nicosia 2013 Vulkà Bianco (Etna) White Blend \n", "1 Quinta dos Avidagos 2011 Avidagos Red (Douro) Portuguese Red \n", "... ... ... \n", "129969 Domaine Marcel Deiss 2012 Pinot Gris (Alsace) Pinot Gris \n", "129970 Domaine Schoffit 2012 Lieu-dit Harth Cuvée Car... Gewürztraminer \n", "\n", " winery \n", "0 Nicosia \n", "1 Quinta dos Avidagos \n", "... ... \n", "129969 Domaine Marcel Deiss \n", "129970 Domaine Schoffit \n", "\n", "[129971 rows x 13 columns]" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "reviews.rename(columns={'points': 'score'})" ] }, { "cell_type": "markdown", "metadata": { "papermill": { "duration": 0.010751, "end_time": "2020-10-01T01:25:34.592086", "exception": false, "start_time": "2020-10-01T01:25:34.581335", "status": "completed" }, "tags": [] }, "source": [ "Note that `rename()` returns a new DataFrame and leaves `reviews` itself unchanged.\n", "\n", "`rename()` lets you rename index _or_ column values by specifying an `index` or `columns` keyword parameter, respectively. It supports a variety of input formats, but usually a Python dictionary is the most convenient. 
Here is an example using it to rename some elements of the index." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": true, "execution": { "iopub.execute_input": "2020-10-01T01:25:34.621265Z", "iopub.status.busy": "2020-10-01T01:25:34.620445Z", "iopub.status.idle": "2020-10-01T01:25:34.742972Z", "shell.execute_reply": "2020-10-01T01:25:34.742252Z" }, "jupyter": { "outputs_hidden": true }, "papermill": { "duration": 0.139717, "end_time": "2020-10-01T01:25:34.743158", "exception": false, "start_time": "2020-10-01T01:25:34.603441", "status": "completed" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " country description \\\n", "firstEntry Italy Aromas include tropical fruit, broom, brimston... \n", "secondEntry Portugal This is ripe and fruity, a wine that is smooth... \n", "... ... ... \n", "129969 France A dry style of Pinot Gris, this is crisp with ... \n", "129970 France Big, rich and off-dry, this is powered by inte... \n", "\n", " designation points price province \\\n", "firstEntry Vulkà Bianco 87 NaN Sicily & Sardinia \n", "secondEntry Avidagos 87 15.0 Douro \n", "... ... ... ... ... \n", "129969 NaN 90 32.0 Alsace \n", "129970 Lieu-dit Harth Cuvée Caroline 90 21.0 Alsace \n", "\n", " region_1 region_2 taster_name taster_twitter_handle \\\n", "firstEntry Etna NaN Kerin O’Keefe @kerinokeefe \n", "secondEntry NaN NaN Roger Voss @vossroger \n", "... ... ... ... ... \n", "129969 Alsace NaN Roger Voss @vossroger \n", "129970 Alsace NaN Roger Voss @vossroger \n", "\n", " title \\\n", "firstEntry Nicosia 2013 Vulkà Bianco (Etna) \n", "secondEntry Quinta dos Avidagos 2011 Avidagos Red (Douro) \n", "... ... \n", "129969 Domaine Marcel Deiss 2012 Pinot Gris (Alsace) \n", "129970 Domaine Schoffit 2012 Lieu-dit Harth Cuvée Car... \n", "\n", " variety winery \n", "firstEntry White Blend Nicosia \n", "secondEntry Portuguese Red Quinta dos Avidagos \n", "... ... ... \n", "129969 Pinot Gris Domaine Marcel Deiss \n", "129970 Gewürztraminer Domaine Schoffit \n", "\n", "[129971 rows x 13 columns]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "reviews.rename(index={0: 'firstEntry', 1: 'secondEntry'})" ] }, { "cell_type": "markdown", "metadata": { "papermill": { "duration": 0.011758, "end_time": "2020-10-01T01:25:34.767371", "exception": false, "start_time": "2020-10-01T01:25:34.755613", "status": "completed" }, "tags": [] }, "source": [ "You'll probably rename columns very often, but rename index values very rarely. 
When you do need a different index, `set_index()` is usually more convenient.\n", "\n", "Both the row index and the column index can have their own `name` attribute. The complementary `rename_axis()` method may be used to change these names. For example:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": true, "execution": { "iopub.execute_input": "2020-10-01T01:25:34.799472Z", "iopub.status.busy": "2020-10-01T01:25:34.798660Z", "iopub.status.idle": "2020-10-01T01:25:34.883683Z", "shell.execute_reply": "2020-10-01T01:25:34.883009Z" }, "jupyter": { "outputs_hidden": true }, "papermill": { "duration": 0.104383, "end_time": "2020-10-01T01:25:34.883848", "exception": false, "start_time": "2020-10-01T01:25:34.779465", "status": "completed" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "fields country description \\\n", "wines \n", "0 Italy Aromas include tropical fruit, broom, brimston... \n", "1 Portugal This is ripe and fruity, a wine that is smooth... \n", "... ... ... \n", "129969 France A dry style of Pinot Gris, this is crisp with ... \n", "129970 France Big, rich and off-dry, this is powered by inte... \n", "\n", "fields designation points price province \\\n", "wines \n", "0 Vulkà Bianco 87 NaN Sicily & Sardinia \n", "1 Avidagos 87 15.0 Douro \n", "... ... ... ... ... \n", "129969 NaN 90 32.0 Alsace \n", "129970 Lieu-dit Harth Cuvée Caroline 90 21.0 Alsace \n", "\n", "fields region_1 region_2 taster_name taster_twitter_handle \\\n", "wines \n", "0 Etna NaN Kerin O’Keefe @kerinokeefe \n", "1 NaN NaN Roger Voss @vossroger \n", "... ... ... ... ... \n", "129969 Alsace NaN Roger Voss @vossroger \n", "129970 Alsace NaN Roger Voss @vossroger \n", "\n", "fields title variety \\\n", "wines \n", "0 Nicosia 2013 Vulkà Bianco (Etna) White Blend \n", "1 Quinta dos Avidagos 2011 Avidagos Red (Douro) Portuguese Red \n", "... ... ... \n", "129969 Domaine Marcel Deiss 2012 Pinot Gris (Alsace) Pinot Gris \n", "129970 Domaine Schoffit 2012 Lieu-dit Harth Cuvée Car... Gewürztraminer \n", "\n", "fields winery \n", "wines \n", "0 Nicosia \n", "1 Quinta dos Avidagos \n", "... ... 
\n", "129969 Domaine Marcel Deiss \n", "129970 Domaine Schoffit \n", "\n", "[129971 rows x 13 columns]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "reviews.rename_axis(\"wines\", axis='rows').rename_axis(\"fields\", axis='columns')" ] }, { "cell_type": "markdown", "metadata": { "papermill": { "duration": 0.012562, "end_time": "2020-10-01T01:25:34.909299", "exception": false, "start_time": "2020-10-01T01:25:34.896737", "status": "completed" }, "tags": [] }, "source": [ "# Combining\n", "\n", "When performing operations on a dataset, we will sometimes need to combine different DataFrames and/or Series in non-trivial ways. Pandas has three core methods for doing this. In order of increasing complexity, these are `concat()`, `join()`, and `merge()`. Most of what `merge()` can do can also be done more simply with `join()`, so we will omit it and focus on the first two functions here.\n", "\n", "The simplest combining method is `concat()`. Given a list of elements, this function will smush those elements together along an axis (by default, it stacks them row-wise).\n", "\n", "This is useful when we have data in different DataFrame or Series objects that share the same fields (columns). One example: the [YouTube Videos dataset](https://www.kaggle.com/datasnaek/youtube-new), which splits the data up based on country of origin (e.g. Canada and the UK, in this example). 
If we want to study multiple countries simultaneously, we can use `concat()` to smush them together:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": true, "execution": { "iopub.execute_input": "2020-10-01T01:25:34.946163Z", "iopub.status.busy": "2020-10-01T01:25:34.945318Z", "iopub.status.idle": "2020-10-01T01:25:36.519947Z", "shell.execute_reply": "2020-10-01T01:25:36.520995Z" }, "jupyter": { "outputs_hidden": true }, "papermill": { "duration": 1.598018, "end_time": "2020-10-01T01:25:36.521310", "exception": false, "start_time": "2020-10-01T01:25:34.923292", "status": "completed" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " video_id trending_date \\\n", "0 n1WpP7iowLc 17.14.11 \n", "1 0dBIkQ4Mz1M 17.14.11 \n", "... ... ... \n", "38914 -DRsfNObKIQ 18.14.06 \n", "38915 4YFo4bdMO8Q 18.14.06 \n", "\n", " title \\\n", "0 Eminem - Walk On Water (Audio) ft. Beyoncé \n", "1 PLUSH - Bad Unboxing Fan Mail \n", "... ... \n", "38914 Eleni Foureira - Fuego - Cyprus - LIVE - First... \n", "38915 KYLE - Ikuyo feat. 2 Chainz & Sophia Black [A... \n", "\n", " channel_title category_id publish_time \\\n", "0 EminemVEVO 10 2017-11-10T17:00:03.000Z \n", "1 iDubbbzTV 23 2017-11-13T17:00:00.000Z \n", "... ... ... ... \n", "38914 Eurovision Song Contest 24 2018-05-08T20:32:32.000Z \n", "38915 SuperDuperKyle 10 2018-05-11T04:06:35.000Z \n", "\n", " tags views likes \\\n", "0 Eminem|\"Walk\"|\"On\"|\"Water\"|\"Aftermath/Shady/In... 17158579 787425 \n", "1 plush|\"bad unboxing\"|\"unboxing\"|\"fan mail\"|\"id... 1014651 127794 \n", "... ... ... ... \n", "38914 Eurovision Song Contest|\"2018\"|\"Lisbon\"|\"Cypru... 14317515 151870 \n", "38915 Kyle|\"SuperDuperKyle\"|\"Ikuyo\"|\"2 Chainz\"|\"Soph... 607552 18271 \n", "\n", " dislikes comment_count \\\n", "0 43420 125882 \n", "1 1688 13030 \n", "... ... ... \n", "38914 45875 26766 \n", "38915 274 1423 \n", "\n", " thumbnail_link comments_disabled \\\n", "0 https://i.ytimg.com/vi/n1WpP7iowLc/default.jpg False \n", "1 https://i.ytimg.com/vi/0dBIkQ4Mz1M/default.jpg False \n", "... ... ... \n", "38914 https://i.ytimg.com/vi/-DRsfNObKIQ/default.jpg False \n", "38915 https://i.ytimg.com/vi/4YFo4bdMO8Q/default.jpg False \n", "\n", " ratings_disabled video_error_or_removed \\\n", "0 False False \n", "1 False False \n", "... ... ... \n", "38914 False False \n", "38915 False False \n", "\n", " description \n", "0 Eminem's new track Walk on Water ft. Beyoncé i... \n", "1 STill got a lot of packages. Probably will las... \n", "... ... \n", "38914 Eleni Foureira represented Cyprus at the first... 
\n", "38915 Debut album 'Light of Mine' out now: http://ky... \n", "\n", "[79797 rows x 16 columns]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "canadian_youtube = pd.read_csv(\"../input/youtube-new/CAvideos.csv\")\n", "british_youtube = pd.read_csv(\"../input/youtube-new/GBvideos.csv\")\n", "\n", "pd.concat([canadian_youtube, british_youtube])" ] }, { "cell_type": "markdown", "metadata": { "papermill": { "duration": 0.018724, "end_time": "2020-10-01T01:25:36.560674", "exception": false, "start_time": "2020-10-01T01:25:36.541950", "status": "completed" }, "tags": [] }, "source": [ "Next in terms of complexity is `join()`, which lets you combine different DataFrame objects that have an index in common. For example, to line up videos that happened to be trending on the same day in _both_ Canada and the UK, we could do the following. Note that `join()` performs a left join by default, so every row of the Canadian data is kept and the UK columns are `NaN` wherever there is no match:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": true, "execution": { "iopub.execute_input": "2020-10-01T01:25:36.605414Z", "iopub.status.busy": "2020-10-01T01:25:36.604185Z", "iopub.status.idle": "2020-10-01T01:25:37.725816Z", "shell.execute_reply": "2020-10-01T01:25:37.726503Z" }, "jupyter": { "outputs_hidden": true }, "papermill": { "duration": 1.147704, "end_time": "2020-10-01T01:25:37.726673", "exception": false, "start_time": "2020-10-01T01:25:36.578969", "status": "completed" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " video_id_CAN \\\n", "title trending_date \n", "!! THIS VIDEO IS NOTHING BUT PAIN !! | Getting ... 18.04.01 PNn8sECd7io \n", "#1 Fortnite World Rank - 2,323 Solo Wins! 18.09.03 DvPW66IFhMI \n", "... ... \n", "🚨 BREAKING NEWS 🔴 Raja Live all Slot Channels W... 18.07.05 Wt9Gkpmbt44 \n", "🚨Active Shooter at YouTube Headquarters - LIVE ... 18.04.04 Az72jrKbANA \n", "\n", " channel_title_CAN \\\n", "title trending_date \n", "!! THIS VIDEO IS NOTHING BUT PAIN !! | Getting ... 18.04.01 Markiplier \n", "#1 Fortnite World Rank - 2,323 Solo Wins! 18.09.03 AlexRamiGaming \n", "... ... \n", "🚨 BREAKING NEWS 🔴 Raja Live all Slot Channels W... 18.07.05 TheBigJackpot \n", "🚨Active Shooter at YouTube Headquarters - LIVE ... 18.04.04 Right Side Broadcasting Network \n", "\n", " category_id_CAN \\\n", "title trending_date \n", "!! THIS VIDEO IS NOTHING BUT PAIN !! | Getting ... 18.04.01 20 \n", "#1 Fortnite World Rank - 2,323 Solo Wins! 18.09.03 20 \n", "... ... \n", "🚨 BREAKING NEWS 🔴 Raja Live all Slot Channels W... 18.07.05 24 \n", "🚨Active Shooter at YouTube Headquarters - LIVE ... 18.04.04 25 \n", "\n", " publish_time_CAN \\\n", "title trending_date \n", "!! THIS VIDEO IS NOTHING BUT PAIN !! | Getting ... 18.04.01 2018-01-03T19:33:53.000Z \n", "#1 Fortnite World Rank - 2,323 Solo Wins! 18.09.03 2018-03-09T07:15:52.000Z \n", "... ... \n", "🚨 BREAKING NEWS 🔴 Raja Live all Slot Channels W... 18.07.05 2018-05-07T06:58:59.000Z \n", "🚨Active Shooter at YouTube Headquarters - LIVE ... 18.04.04 2018-04-03T23:12:37.000Z \n", "\n", " tags_CAN \\\n", "title trending_date \n", "!! THIS VIDEO IS NOTHING BUT PAIN !! | Getting ... 18.04.01 getting over it|\"markiplier\"|\"funny moments\"|\"... \n", "#1 Fortnite World Rank - 2,323 Solo Wins! 18.09.03 PS4 Battle Royale|\"PS4 Pro Battle Royale\"|\"Bat... \n", "... ... \n", "🚨 BREAKING NEWS 🔴 Raja Live all Slot Channels W... 18.07.05 Slot Machine|\"win\"|\"Gambling\"|\"Big Win\"|\"raja\"... 
\n", "🚨Active Shooter at YouTube Headquarters - LIVE ... 18.04.04 YouTube shooter|\"YouTube active shooter\"|\"acti... \n", "\n", " views_CAN \\\n", "title trending_date \n", "!! THIS VIDEO IS NOTHING BUT PAIN !! | Getting ... 18.04.01 835930 \n", "#1 Fortnite World Rank - 2,323 Solo Wins! 18.09.03 212838 \n", "... ... \n", "🚨 BREAKING NEWS 🔴 Raja Live all Slot Channels W... 18.07.05 28973 \n", "🚨Active Shooter at YouTube Headquarters - LIVE ... 18.04.04 103513 \n", "\n", " likes_CAN \\\n", "title trending_date \n", "!! THIS VIDEO IS NOTHING BUT PAIN !! | Getting ... 18.04.01 47058 \n", "#1 Fortnite World Rank - 2,323 Solo Wins! 18.09.03 5199 \n", "... ... \n", "🚨 BREAKING NEWS 🔴 Raja Live all Slot Channels W... 18.07.05 2167 \n", "🚨Active Shooter at YouTube Headquarters - LIVE ... 18.04.04 1722 \n", "\n", " dislikes_CAN \\\n", "title trending_date \n", "!! THIS VIDEO IS NOTHING BUT PAIN !! | Getting ... 18.04.01 1023 \n", "#1 Fortnite World Rank - 2,323 Solo Wins! 18.09.03 542 \n", "... ... \n", "🚨 BREAKING NEWS 🔴 Raja Live all Slot Channels W... 18.07.05 175 \n", "🚨Active Shooter at YouTube Headquarters - LIVE ... 18.04.04 181 \n", "\n", " comment_count_CAN \\\n", "title trending_date \n", "!! THIS VIDEO IS NOTHING BUT PAIN !! | Getting ... 18.04.01 8250 \n", "#1 Fortnite World Rank - 2,323 Solo Wins! 18.09.03 11 \n", "... ... \n", "🚨 BREAKING NEWS 🔴 Raja Live all Slot Channels W... 18.07.05 10 \n", "🚨Active Shooter at YouTube Headquarters - LIVE ... 18.04.04 76 \n", "\n", " thumbnail_link_CAN \\\n", "title trending_date \n", "!! THIS VIDEO IS NOTHING BUT PAIN !! | Getting ... 18.04.01 https://i.ytimg.com/vi/PNn8sECd7io/default.jpg \n", "#1 Fortnite World Rank - 2,323 Solo Wins! 18.09.03 https://i.ytimg.com/vi/DvPW66IFhMI/default.jpg \n", "... ... \n", "🚨 BREAKING NEWS 🔴 Raja Live all Slot Channels W... 18.07.05 https://i.ytimg.com/vi/Wt9Gkpmbt44/default.jpg \n", "🚨Active Shooter at YouTube Headquarters - LIVE ... 
18.04.04 https://i.ytimg.com/vi/Az72jrKbANA/default.jpg \n", "\n", " ... \\\n", "title trending_date ... \n", "!! THIS VIDEO IS NOTHING BUT PAIN !! | Getting ... 18.04.01 ... \n", "#1 Fortnite World Rank - 2,323 Solo Wins! 18.09.03 ... \n", "... ... \n", "🚨 BREAKING NEWS 🔴 Raja Live all Slot Channels W... 18.07.05 ... \n", "🚨Active Shooter at YouTube Headquarters - LIVE ... 18.04.04 ... \n", "\n", " tags_UK \\\n", "title trending_date \n", "!! THIS VIDEO IS NOTHING BUT PAIN !! | Getting ... 18.04.01 NaN \n", "#1 Fortnite World Rank - 2,323 Solo Wins! 18.09.03 NaN \n", "... ... \n", "🚨 BREAKING NEWS 🔴 Raja Live all Slot Channels W... 18.07.05 NaN \n", "🚨Active Shooter at YouTube Headquarters - LIVE ... 18.04.04 NaN \n", "\n", " views_UK \\\n", "title trending_date \n", "!! THIS VIDEO IS NOTHING BUT PAIN !! | Getting ... 18.04.01 NaN \n", "#1 Fortnite World Rank - 2,323 Solo Wins! 18.09.03 NaN \n", "... ... \n", "🚨 BREAKING NEWS 🔴 Raja Live all Slot Channels W... 18.07.05 NaN \n", "🚨Active Shooter at YouTube Headquarters - LIVE ... 18.04.04 NaN \n", "\n", " likes_UK \\\n", "title trending_date \n", "!! THIS VIDEO IS NOTHING BUT PAIN !! | Getting ... 18.04.01 NaN \n", "#1 Fortnite World Rank - 2,323 Solo Wins! 18.09.03 NaN \n", "... ... \n", "🚨 BREAKING NEWS 🔴 Raja Live all Slot Channels W... 18.07.05 NaN \n", "🚨Active Shooter at YouTube Headquarters - LIVE ... 18.04.04 NaN \n", "\n", " dislikes_UK \\\n", "title trending_date \n", "!! THIS VIDEO IS NOTHING BUT PAIN !! | Getting ... 18.04.01 NaN \n", "#1 Fortnite World Rank - 2,323 Solo Wins! 18.09.03 NaN \n", "... ... \n", "🚨 BREAKING NEWS 🔴 Raja Live all Slot Channels W... 18.07.05 NaN \n", "🚨Active Shooter at YouTube Headquarters - LIVE ... 18.04.04 NaN \n", "\n", " comment_count_UK \\\n", "title trending_date \n", "!! THIS VIDEO IS NOTHING BUT PAIN !! | Getting ... 18.04.01 NaN \n", "#1 Fortnite World Rank - 2,323 Solo Wins! 18.09.03 NaN \n", "... ... \n", "🚨 BREAKING NEWS 🔴 Raja Live all Slot Channels W... 
18.07.05 NaN \n", "🚨Active Shooter at YouTube Headquarters - LIVE ... 18.04.04 NaN \n", "\n", " thumbnail_link_UK \\\n", "title trending_date \n", "!! THIS VIDEO IS NOTHING BUT PAIN !! | Getting ... 18.04.01 NaN \n", "#1 Fortnite World Rank - 2,323 Solo Wins! 18.09.03 NaN \n", "... ... \n", "🚨 BREAKING NEWS 🔴 Raja Live all Slot Channels W... 18.07.05 NaN \n", "🚨Active Shooter at YouTube Headquarters - LIVE ... 18.04.04 NaN \n", "\n", " comments_disabled_UK \\\n", "title trending_date \n", "!! THIS VIDEO IS NOTHING BUT PAIN !! | Getting ... 18.04.01 NaN \n", "#1 Fortnite World Rank - 2,323 Solo Wins! 18.09.03 NaN \n", "... ... \n", "🚨 BREAKING NEWS 🔴 Raja Live all Slot Channels W... 18.07.05 NaN \n", "🚨Active Shooter at YouTube Headquarters - LIVE ... 18.04.04 NaN \n", "\n", " ratings_disabled_UK \\\n", "title trending_date \n", "!! THIS VIDEO IS NOTHING BUT PAIN !! | Getting ... 18.04.01 NaN \n", "#1 Fortnite World Rank - 2,323 Solo Wins! 18.09.03 NaN \n", "... ... \n", "🚨 BREAKING NEWS 🔴 Raja Live all Slot Channels W... 18.07.05 NaN \n", "🚨Active Shooter at YouTube Headquarters - LIVE ... 18.04.04 NaN \n", "\n", " video_error_or_removed_UK \\\n", "title trending_date \n", "!! THIS VIDEO IS NOTHING BUT PAIN !! | Getting ... 18.04.01 NaN \n", "#1 Fortnite World Rank - 2,323 Solo Wins! 18.09.03 NaN \n", "... ... \n", "🚨 BREAKING NEWS 🔴 Raja Live all Slot Channels W... 18.07.05 NaN \n", "🚨Active Shooter at YouTube Headquarters - LIVE ... 18.04.04 NaN \n", "\n", " description_UK \n", "title trending_date \n", "!! THIS VIDEO IS NOTHING BUT PAIN !! | Getting ... 18.04.01 NaN \n", "#1 Fortnite World Rank - 2,323 Solo Wins! 18.09.03 NaN \n", "... ... \n", "🚨 BREAKING NEWS 🔴 Raja Live all Slot Channels W... 18.07.05 NaN \n", "🚨Active Shooter at YouTube Headquarters - LIVE ... 
18.04.04 NaN \n", "\n", "[40900 rows x 28 columns]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "left = canadian_youtube.set_index(['title', 'trending_date'])\n", "right = british_youtube.set_index(['title', 'trending_date'])\n", "\n", "left.join(right, lsuffix='_CAN', rsuffix='_UK')" ] }, { "cell_type": "markdown", "metadata": { "papermill": { "duration": 0.015163, "end_time": "2020-10-01T01:25:37.756918", "exception": false, "start_time": "2020-10-01T01:25:37.741755", "status": "completed" }, "tags": [] }, "source": [ "The `lsuffix` and `rsuffix` parameters are necessary here because the British and Canadian datasets use the same column names. If they didn't (because, say, we'd renamed them beforehand), we wouldn't need the suffixes.\n", "\n", "# Your turn\n", "\n", "If you haven't started the exercise, you can **[get started here](https://www.kaggle.com/kernels/fork/638064)**." ] }, { "cell_type": "markdown", "metadata": { "papermill": { "duration": 0.015042, "end_time": "2020-10-01T01:25:37.786782", "exception": false, "start_time": "2020-10-01T01:25:37.771740", "status": "completed" }, "tags": [] }, "source": [ "---\n", "\n", "\n", "\n", "\n", "*Have questions or comments? 
Visit the [Learn Discussion forum](https://www.kaggle.com/learn-forum/161299) to chat with other Learners.*" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.6" }, "papermill": { "duration": 9.92106, "end_time": "2020-10-01T01:25:37.911519", "environment_variables": {}, "exception": null, "input_path": "__notebook__.ipynb", "output_path": "__notebook__.ipynb", "parameters": {}, "start_time": "2020-10-01T01:25:27.990459", "version": "2.1.0" } }, "nbformat": 4, "nbformat_minor": 4 }