Consistent width for geom_bar in the event of missing data

RGgplot2

R Problem Overview


Is there a way to set a constant width for geom_bar() in the event of missing data in the time series example below? I've tried setting width in aes() with no luck. Compare May '11 to June '11 width of bars in the plot below the code example.

colours <- c("#FF0000", "#33CC33", "#CCCCCC", "#FFA500", "#000000" )
iris$Month <- rep(seq(from=as.Date("2011-01-01"), to=as.Date("2011-10-01"), by="month"), 15)

colours <- c("#FF0000", "#33CC33", "#CCCCCC", "#FFA500", "#000000" )
iris$Month <- rep(seq(from=as.Date("2011-01-01"), to=as.Date("2011-10-01"), by="month"), 15)
d<-aggregate(iris$Sepal.Length, by=list(iris$Month, iris$Species), sum)
d$quota<-seq(from=2000, to=60000, by=2000)
colnames(d) <- c("Month", "Species", "Sepal.Width", "Quota")
d$Sepal.Width<-d$Sepal.Width * 1000
g1 <- ggplot(data=d, aes(x=Month, y=Quota, color="Quota")) + geom_line(size=1)
g1 + geom_bar(data=d[c(-1:-5),], aes(x=Month, y=Sepal.Width, width=10, group=Species, fill=Species), stat="identity", position="dodge") + scale_fill_manual(values=colours)

plot

R Solutions


Solution 1 - R

Some new options for position_dodge() and the new position_dodge2(), introduced in ggplot2 3.0.0 can help.

You can use preserve = "single" in position_dodge() to base the widths off a single element, so the widths of all bars will be the same.

ggplot(data = d, aes(x = Month, y = Quota, color = "Quota")) + 
     geom_line(size = 1) + 
     geom_col(data = d[c(-1:-5),], aes(y = Sepal.Width, fill = Species), 
              position = position_dodge(preserve = "single") ) + 
     scale_fill_manual(values = colours)

Using position_dodge2() changes the way things are centered, centering each set of bars at each x axis location. It has some padding built in, so use padding = 0 to remove.

ggplot(data = d, aes(x = Month, y = Quota, color = "Quota")) + 
     geom_line(size = 1) + 
     geom_col(data = d[c(-1:-5),], aes(y = Sepal.Width, fill = Species), 
              position = position_dodge2(preserve = "single", padding = 0) ) + 
     scale_fill_manual(values = colours)

Solution 2 - R

The easiest way is to supplement your data set so that every combination is present, even if it has NA as its value. Taking a simpler example (as yours has a lot of unneeded features):

dat <- data.frame(a=rep(LETTERS[1:3],3),
                  b=rep(letters[1:3],each=3),
                  v=1:9)[-2,]
                  
ggplot(dat, aes(x=a, y=v, colour=b)) +
  geom_bar(aes(fill=b), stat="identity", position="dodge")

enter image description here

This shows the behavior you are trying to avoid: in group "B", there is no group "a", so the bars are wider. Supplement dat with a dataframe with all the combinations of a and b:

dat.all <- rbind(dat, cbind(expand.grid(a=levels(dat$a), b=levels(dat$b)), v=NA))

ggplot(dat.all, aes(x=a, y=v, colour=b)) +
  geom_bar(aes(fill=b), stat="identity", position="dodge")  

enter image description here

Solution 3 - R

I had the same problem but was looking for a solution that works with the pipe (%>%). Using tidyr::spread and tidyr::gather from the tidyverse does the trick. I use the same data as @Brian Diggs, but with uppercase variable names to not end up with double variable names when transforming to wide:

library(tidyverse)

dat <- data.frame(A = rep(LETTERS[1:3], 3),
                  B = rep(letters[1:3], each = 3),
                  V = 1:9)[-2, ]
dat %>% 
  spread(key = B, value = V, fill = NA) %>% # turn data to wide, using fill = NA to generate missing values
  gather(key = B, value = V, -A) %>% # go back to long, with the missings
  ggplot(aes(x = A, y = V, fill = B)) +
  geom_col(position = position_dodge())

Edit:

There actually is a even simpler solution to that problem in combination with the pipe. Use tidyr::complete gives the same result in one line:

dat %>% 
  complete(A, B) %>% 
  ggplot(aes(x = A, y = V, fill = B)) +
  geom_col(position = position_dodge())

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
Questiontcash21View Question on Stackoverflow
Solution 1 - RaosmithView Answer on Stackoverflow
Solution 2 - RBrian DiggsView Answer on Stackoverflow
Solution 3 - RTinoView Answer on Stackoverflow