# Plot two histograms on single chart with matplotlib

PythonMatplotlibPlotHistogram## Python Problem Overview

I created a histogram plot using data from a file and no problem. Now I wanted to superpose data from another file in the same histogram, so I do something like this

```
n,bins,patchs = ax.hist(mydata1,100)
n,bins,patchs = ax.hist(mydata2,100)
```

but the problem is that for each interval, only the bar with the highest value appears, and the other is hidden. I wonder how could I plot both histograms at the same time with different colors.

## Python Solutions

## Solution 1 - Python

Here you have a working example:

```
import random
import numpy
from matplotlib import pyplot
x = [random.gauss(3,1) for _ in range(400)]
y = [random.gauss(4,2) for _ in range(400)]
bins = numpy.linspace(-10, 10, 100)
pyplot.hist(x, bins, alpha=0.5, label='x')
pyplot.hist(y, bins, alpha=0.5, label='y')
pyplot.legend(loc='upper right')
pyplot.show()
```

## Solution 2 - Python

The accepted answers gives the code for a histogram with overlapping bars, but in case you want each bar to be side-by-side (as I did), try the variation below:

```
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('seaborn-deep')
x = np.random.normal(1, 2, 5000)
y = np.random.normal(-1, 3, 2000)
bins = np.linspace(-10, 10, 30)
plt.hist([x, y], bins, label=['x', 'y'])
plt.legend(loc='upper right')
plt.show()
```

Reference: http://matplotlib.org/examples/statistics/histogram_demo_multihist.html

EDIT [2018/03/16]: Updated to allow plotting of arrays of different sizes, as suggested by @stochastic_zeitgeist

## Solution 3 - Python

In the case you have different sample sizes, it may be difficult to compare the distributions with a single y-axis. For example:

```
import numpy as np
import matplotlib.pyplot as plt
#makes the data
y1 = np.random.normal(-2, 2, 1000)
y2 = np.random.normal(2, 2, 5000)
colors = ['b','g']
#plots the histogram
fig, ax1 = plt.subplots()
ax1.hist([y1,y2],color=colors)
ax1.set_xlim(-10,10)
ax1.set_ylabel("Count")
plt.tight_layout()
plt.show()
```

In this case, you can plot your two data sets on different axes. To do so, you can get your histogram data using matplotlib, clear the axis, and then re-plot it on two separate axes (shifting the bin edges so that they don't overlap):

```
#sets up the axis and gets histogram data
fig, ax1 = plt.subplots()
ax2 = ax1.twinx()
ax1.hist([y1, y2], color=colors)
n, bins, patches = ax1.hist([y1,y2])
ax1.cla() #clear the axis
#plots the histogram data
width = (bins[1] - bins[0]) * 0.4
bins_shifted = bins + width
ax1.bar(bins[:-1], n[0], width, align='edge', color=colors[0])
ax2.bar(bins_shifted[:-1], n[1], width, align='edge', color=colors[1])
#finishes the plot
ax1.set_ylabel("Count", color=colors[0])
ax2.set_ylabel("Count", color=colors[1])
ax1.tick_params('y', colors=colors[0])
ax2.tick_params('y', colors=colors[1])
plt.tight_layout()
plt.show()
```

## Solution 4 - Python

*As a completion to Gustavo Bezerra's answer:*

If you want **each histogram to be normalized** (* normed for mpl<=2.1 and density for mpl>=3.1*) you cannot just use

`normed/density=True`

, you need to set the weights for each value instead:```
import numpy as np
import matplotlib.pyplot as plt
x = np.random.normal(1, 2, 5000)
y = np.random.normal(-1, 3, 2000)
x_w = np.empty(x.shape)
x_w.fill(1/x.shape[0])
y_w = np.empty(y.shape)
y_w.fill(1/y.shape[0])
bins = np.linspace(-10, 10, 30)
plt.hist([x, y], bins, weights=[x_w, y_w], label=['x', 'y'])
plt.legend(loc='upper right')
plt.show()
```

As a comparison, the exact same `x`

and `y`

vectors with default weights and `density=True`

:

## Solution 5 - Python

You should use `bins`

from the values returned by `hist`

:

```
import numpy as np
import matplotlib.pyplot as plt
foo = np.random.normal(loc=1, size=100) # a normal distribution
bar = np.random.normal(loc=-1, size=10000) # a normal distribution
_, bins, _ = plt.hist(foo, bins=50, range=[-6, 6], normed=True)
_ = plt.hist(bar, bins=bins, alpha=0.5, normed=True)
```

## Solution 6 - Python

Here is a simple method to plot two histograms, with their bars side-by-side, on the same plot when the data has different sizes:

```
def plotHistogram(p, o):
"""
p and o are iterables with the values you want to
plot the histogram of
"""
plt.hist([p, o], color=['g','r'], alpha=0.8, bins=50)
plt.show()
```

## Solution 7 - Python

It sounds like you might want just a bar graph:

- http://matplotlib.sourceforge.net/examples/pylab_examples/bar_stacked.html
- http://matplotlib.sourceforge.net/examples/pylab_examples/barchart_demo.html

Alternatively, you can use subplots.

## Solution 8 - Python

Also an option which is quite similar to joaquin answer:

```
import random
from matplotlib import pyplot
#random data
x = [random.gauss(3,1) for _ in range(400)]
y = [random.gauss(4,2) for _ in range(400)]
#plot both histograms(range from -10 to 10), bins set to 100
pyplot.hist([x,y], bins= 100, range=[-10,10], alpha=0.5, label=['x', 'y'])
#plot legend
pyplot.legend(loc='upper right')
#show it
pyplot.show()
```

Gives the following output:

## Solution 9 - Python

Plotting two overlapping histograms (or more) can lead to a rather cluttered plot. I find that using step histograms (aka hollow histograms) improves the readability quite a bit. The only downside is that in matplotlib the default legend for a step histogram is not properly formatted, so it can be edited like in the following example:

```
import numpy as np # v 1.19.2
import matplotlib.pyplot as plt # v 3.3.2
from matplotlib.lines import Line2D
rng = np.random.default_rng(seed=123)
# Create two normally distributed random variables of different sizes
# and with different shapes
data1 = rng.normal(loc=30, scale=10, size=500)
data2 = rng.normal(loc=50, scale=10, size=1000)
# Create figure with 'step' type of histogram to improve plot readability
fig, ax = plt.subplots(figsize=(9,5))
ax.hist([data1, data2], bins=15, histtype='step', linewidth=2,
alpha=0.7, label=['data1','data2'])
# Edit legend to get lines as legend keys instead of the default polygons
# and sort the legend entries in alphanumeric order
handles, labels = ax.get_legend_handles_labels()
leg_entries = {}
for h, label in zip(handles, labels):
leg_entries[label] = Line2D([0], [0], color=h.get_facecolor()[:-1],
alpha=h.get_alpha(), lw=h.get_linewidth())
labels_sorted, lines = zip(*sorted(leg_entries.items()))
ax.legend(lines, labels_sorted, frameon=False)
# Remove spines
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
# Add annotations
plt.ylabel('Frequency', labelpad=15)
plt.title('Matplotlib step histogram', fontsize=14, pad=20)
plt.show()
```

As you can see, the result looks quite clean. This is especially useful when overlapping even more than two histograms. Depending on how the variables are distributed, this can work for up to around 5 overlapping distributions. More than that would require the use of another type of plot, such as one of those presented here.

## Solution 10 - Python

Just in case you have pandas (`import pandas as pd`

) or are ok with using it:

```
test = pd.DataFrame([[random.gauss(3,1) for _ in range(400)],
[random.gauss(4,2) for _ in range(400)]])
plt.hist(test.values.T)
plt.show()
```

## Solution 11 - Python

This question has been answered before, but wanted to add another quick/easy workaround that might help other visitors to this question.

```
import seasborn as sns
sns.kdeplot(mydata1)
sns.kdeplot(mydata2)
```

Some helpful examples are here for kde vs histogram comparison.

## Solution 12 - Python

Inspired by Solomon's answer, but to stick with the question, which is related to histogram, a clean solution is:

```
sns.distplot(bar)
sns.distplot(foo)
plt.show()
```

Make sure to plot the taller one first, otherwise you would need to set plt.ylim(0,0.45) so that the taller histogram is not chopped off.

## Solution 13 - Python

There is one caveat when you want to plot the histogram from a 2-d numpy array. You need to swap the 2 axes.

```
import numpy as np
import matplotlib.pyplot as plt
data = np.random.normal(size=(2, 300))
# swapped_data.shape == (300, 2)
swapped_data = np.swapaxes(x, axis1=0, axis2=1)
plt.hist(swapped_data, bins=30, label=['x', 'y'])
plt.legend()
plt.show()
```