Seaborn load_dataset

Python, Boxplot, Seaborn

Python Problem Overview


I am trying to get a grouped boxplot working using Seaborn as per the example

I can get the above example working, however the line:

tips = sns.load_dataset("tips")

is not explained at all. I have located the tips.csv file, but I can't seem to find adequate documentation on what load_dataset specifically does. I tried to create my own csv and load this, but to no avail. I also renamed the tips file and it still worked...

My question is thus:

Where is load_dataset actually looking for files? Can I actually use this for my own boxplots?

EDIT: I managed to get my own boxplots working using my own DataFrame, but I am still wondering whether load_dataset is used for anything more than mysterious tutorial examples.

Python Solutions


Solution 1 - Python

load_dataset looks for online csv files on https://github.com/mwaskom/seaborn-data. Here's the docstring:

> Load a dataset from the online repository (requires internet).
>
> Parameters
> ----------
> name : str
>     Name of the dataset (name.csv on
>     https://github.com/mwaskom/seaborn-data). You can obtain list of
>     available datasets using :func:`get_dataset_names`
> kws : dict, optional
>     Passed to pandas.read_csv

If you want to modify that online dataset or bring in your own data, you likely have to use pandas. load_dataset actually returns a pandas DataFrame object, which you can confirm with type(tips).

If you already created your own data in a csv file called, say, tips2.csv, and saved it in the same location as your script, use this (after installing pandas) to load it in:

import pandas as pd

tips2 = pd.read_csv('tips2.csv')
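As a minimal sketch of the point above, the following reads CSV text with pandas and checks that the result is an ordinary DataFrame, the same type that load_dataset returns. The column names and values are made up for illustration, and io.StringIO stands in for a file on disk such as tips2.csv.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice this would live in a file such as tips2.csv.
csv_text = """day,smoker,total_bill
Thur,No,12.5
Thur,Yes,20.1
Fri,No,15.0
Fri,Yes,18.3
"""

# pd.read_csv accepts a file path or any file-like object.
tips2 = pd.read_csv(io.StringIO(csv_text))

print(type(tips2))   # the DataFrame class
print(tips2.head())  # first rows, to check the columns parsed as expected
```

Swapping the StringIO object for a path like 'tips2.csv' should give the same kind of object, ready to pass to seaborn.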

Solution 2 - Python

Just to add to 'selwyth's' answer.

import pandas as pd
Data = pd.read_csv(r'Path\to\csv')  # raw string, so the backslashes are not treated as escapes
Data.head(10)

Once you have completed these steps successfully, plotting works like this.

Let's say you want to plot a bar plot.

import seaborn as sns
sns.barplot(x=Data.Year, y=Data.Salary)  # Year and Salary are columns in my dataset

The same approach works with the other plotting functions in seaborn.
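For the grouped boxplot from the question, a minimal sketch with a hand-built DataFrame might look like the following. The data and column names (day, smoker, total_bill) are invented for illustration; any DataFrame with one numeric column and two categorical columns should work the same way.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook

import pandas as pd
import seaborn as sns

# Made-up data: one numeric column and two grouping columns.
data = pd.DataFrame({
    "day": ["Thur", "Thur", "Thur", "Thur", "Fri", "Fri", "Fri", "Fri"],
    "smoker": ["No", "No", "Yes", "Yes", "No", "No", "Yes", "Yes"],
    "total_bill": [10.0, 14.5, 18.2, 22.0, 12.3, 16.8, 20.1, 25.4],
})

# x sets the groups along the axis; hue splits each group into side-by-side boxes.
ax = sns.boxplot(x="day", y="total_bill", hue="smoker", data=data)
ax.set_title("Grouped boxplot from a custom DataFrame")
```

Passing column names together with data= keeps the call identical to the tips example, so only the DataFrame needs to change.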

Moreover, we cannot add our own datasets to the seaborn-data repository on GitHub.

Solution 3 - Python

Download all the csv files (zipped) used in the examples from the seaborn-data repository (https://github.com/mwaskom/seaborn-data).

Extract the zip file to a local directory and launch your jupyter notebook from the same directory. Run the following commands in jupyter notebook:

import pandas as pd
tips = pd.read_csv('seaborn-data-master/tips.csv')

You're good to work with your example now!

Solution 4 - Python

You will need an internet connection: the csv files are not on your local computer, so it has to be online to download the dataset.
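To make the lookup concrete, here is a rough sketch of how a dataset name could map to a download URL. The raw-file base URL and the helper name dataset_url are assumptions for illustration, inferred from the docstring's "name.csv on https://github.com/mwaskom/seaborn-data"; this is not seaborn's actual code. It also suggests why renaming a local copy of tips.csv made no difference: the name is resolved against the online repository, not the working directory.

```python
# Assumed base URL for raw files in the seaborn-data repository (not copied from seaborn).
BASE_URL = "https://raw.githubusercontent.com/mwaskom/seaborn-data/master"


def dataset_url(name):
    """Hypothetical helper: build the URL of <name>.csv in the online repository."""
    return f"{BASE_URL}/{name}.csv"


print(dataset_url("tips"))
```

A URL like this can be passed straight to pandas.read_csv, which still requires being online.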

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Arsibalt | View Question on Stackoverflow
Solution 1 - Python | selwyth | View Answer on Stackoverflow
Solution 2 - Python | vegetarianCoder | View Answer on Stackoverflow
Solution 3 - Python | Rahul Deshmukh | View Answer on Stackoverflow
Solution 4 - Python | Tabot Charles Bessong | View Answer on Stackoverflow