How to display only integer values on an axis using ggplot2


R Problem Overview


I have the following plot:

library(reshape)
library(ggplot2)
library(gridExtra)



data2<-structure(list(IR = structure(c(4L, 3L, 2L, 1L, 4L, 3L, 2L, 1L
), .Label = c("0.13-0.16", "0.17-0.23", "0.24-0.27", "0.28-1"
), class = "factor"), variable = structure(c(1L, 1L, 1L, 1L, 
2L, 2L, 2L, 2L), .Label = c("Real queens", "Simulated individuals"
), class = "factor"), value = c(15L, 11L, 29L, 42L, 0L, 5L, 21L, 
22L), Legend = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("Real queens", 
"Simulated individuals"), class = "factor")), .Names = c("IR", 
"variable", "value", "Legend"), row.names = c(NA, -8L), class = "data.frame")
p <- ggplot(data2, aes(x =factor(IR), y = value, fill = Legend, width=.15))


data3<-structure(list(IR = structure(c(4L, 3L, 2L, 1L, 4L, 3L, 2L, 1L
), .Label = c("0.13-0.16", "0.17-0.23", "0.24-0.27", "0.28-1"
), class = "factor"), variable = structure(c(1L, 1L, 1L, 1L, 
2L, 2L, 2L, 2L), .Label = c("Real queens", "Simulated individuals"
), class = "factor"), value = c(2L, 2L, 6L, 10L, 0L, 1L, 4L, 
4L), Legend = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("Real queens", 
"Simulated individuals"), class = "factor")), .Names = c("IR", 
"variable", "value", "Legend"), row.names = c(NA, -8L), class = "data.frame")
q<- ggplot(data3, aes(x =factor(IR), y = value, fill = Legend, width=.15))


##the plot##
q + geom_col(position = 'dodge', colour = 'black') +
  ylab('Frequency') + xlab('IR') +
  scale_fill_grey() +
  ggtitle('') +
  theme(axis.text.x = element_text(colour = "black"),
        axis.text.y = element_text(colour = "black"),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        panel.border = element_blank(),
        panel.background = element_blank(),
        axis.ticks.x = element_blank())

I want the y-axis to display only integers. Whether this is accomplished through rounding or through a more elegant method isn't really important to me.

R Solutions


Solution 1 - R

If you have the scales package, you can use pretty_breaks() without having to manually specify the breaks.

q + geom_col(position = 'dodge', colour = 'black') +
  scale_y_continuous(breaks = scales::pretty_breaks())
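
As a rough illustration of when this works: as far as I know, pretty_breaks() delegates to base R's pretty(), which prefers step sizes of 1, 2, or 5 times a power of ten. The sketch below calls pretty() directly on two made-up ranges (assumptions for illustration, not the question's data). The helper is_whole() is a hypothetical name introduced here.

```r
# Sketch using base R's pretty(); both ranges are illustrative assumptions.
is_whole <- function(b) abs(b - round(b)) < 1e-8  # hypothetical helper

wide   <- pretty(c(0, 10), n = 5)  # a reasonably wide range
narrow <- pretty(c(0, 1),  n = 5)  # a narrow range

all(is_whole(wide))    # TRUE: every break is a whole number
all(is_whole(narrow))  # FALSE: some breaks fall strictly between 0 and 1
```

So this approach is convenient for count-like data with a wide enough range, but it does not by itself guarantee integer breaks on narrow axes.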

Solution 2 - R

This is what I use:

ggplot(data3, aes(x = factor(IR), y = value, fill = Legend, width = .15)) +
  geom_col(position = 'dodge', colour = 'black') + 
  scale_y_continuous(breaks = function(x) unique(floor(pretty(seq(0, (max(x) + 1) * 1.1)))))
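
To see what this breaks function returns, it can be called directly on a pair of limits. The sketch below gives the anonymous function a name (floor_pretty_breaks, chosen here for illustration) and uses hypothetical limits rather than values taken from a real plot.

```r
# The breaks function from the answer above, named for illustration only.
floor_pretty_breaks <- function(x) {
  unique(floor(pretty(seq(0, (max(x) + 1) * 1.1))))
}

# Hypothetical axis limits, in the form ggplot2 passes to `breaks`.
b_small <- floor_pretty_breaks(c(0, 1))
b_large <- floor_pretty_breaks(c(0, 10))

# Both results contain only whole numbers, with no repeats.
all(b_small == round(b_small)) && !anyDuplicated(b_small)
all(b_large == round(b_large)) && !anyDuplicated(b_large)
```

Because the sequence always starts at 0, this version appears to assume a non-negative axis; with negative data it would only generate breaks from 0 upwards.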

Solution 3 - R

With scale_y_continuous() and argument breaks= you can set the breaking points for y axis to integers you want to display.

ggplot(data2, aes(x = factor(IR), y = value, fill = Legend, width = .15)) +
    geom_col(position = 'dodge', colour = 'black') +
    scale_y_continuous(breaks = c(1, 3, 7, 10))
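
Instead of typing the breaks by hand, they can be generated with seq(). This is a minimal sketch: the values are the value column of data2 from the question, while step is an assumption you would tune for your own plot.

```r
# Values from the `value` column of data2 in the question.
values <- c(15, 11, 29, 42, 0, 5, 21, 22)

# `step` is an assumption; pick whatever integer spacing suits the plot.
step <- 10
manual_breaks <- seq(0, max(values), by = step)
manual_breaks
# 0 10 20 30 40
```

The resulting vector can then be passed as scale_y_continuous(breaks = manual_breaks).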

Solution 4 - R

You can use a custom breaks function. For example, this function is guaranteed to produce only integer breaks:

int_breaks <- function(x, n = 5) {
  l <- pretty(x, n)
  l[abs(l %% 1) < .Machine$double.eps ^ 0.5] 
}

Use as

+ scale_y_continuous(breaks = int_breaks)

It works by taking the default breaks, and only keeping those that are integers. If it is showing too few breaks for your data, increase n, e.g.:

+ scale_y_continuous(breaks = function(x) int_breaks(x, n = 10))
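
The filtering step can be checked outside of ggplot2 by calling the function on a pair of limits. The limits below are hypothetical, chosen so that pretty() proposes some non-integer candidates.

```r
int_breaks <- function(x, n = 5) {
  l <- pretty(x, n)
  # keep only candidates whose fractional part is (numerically) zero
  l[abs(l %% 1) < .Machine$double.eps ^ 0.5]
}

limits <- c(0, 2.1)            # hypothetical axis limits

candidates <- pretty(limits, 5)  # may include non-integer values
kept       <- int_breaks(limits) # the whole-number subset of `candidates`

candidates
kept
```

Since the function only filters, it can never return more breaks than pretty() proposed, which is why a larger n helps when too few integers survive.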

Solution 5 - R

The other solutions did not work for me, and they did not explain how they work.

The breaks argument to the scale_*_continuous functions can be used with a custom function that takes the limits as input and returns breaks as output. By default, the axis limits will be expanded by 5% on each side for continuous data (relative to the range of data). The axis limits will likely not be integer values due to this expansion.

The solution I was looking for was to simply round the lower limit up to the nearest integer, round the upper limit down to the nearest integer, and then have breaks at integer values between these endpoints. Therefore, I used the breaks function:

brk <- function(x) seq(ceiling(x[1]), floor(x[2]), by = 1)

The required code snippet is:

scale_y_continuous(breaks = function(x) seq(ceiling(x[1]), floor(x[2]), by = 1))

The reproducible example from original question is:

data3 <-
  structure(
    list(
      IR = structure(
        c(4L, 3L, 2L, 1L, 4L, 3L, 2L, 1L),
        .Label = c("0.13-0.16", "0.17-0.23", "0.24-0.27", "0.28-1"),
        class = "factor"
      ),
      variable = structure(
        c(1L, 1L, 1L, 1L,
          2L, 2L, 2L, 2L),
        .Label = c("Real queens", "Simulated individuals"),
        class = "factor"
      ),
      value = c(2L, 2L, 6L, 10L, 0L, 1L, 4L,
                4L),
      Legend = structure(
        c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
        .Label = c("Real queens",
                   "Simulated individuals"),
        class = "factor"
      )
    ),
    row.names = c(NA,-8L),
    class = "data.frame"
  )

ggplot(data3, aes(
  x = factor(IR),
  y = value,
  fill = Legend,
  width = .15
)) +
  geom_col(position = 'dodge', colour = 'black') + ylab('Frequency') + xlab('IR') +
  scale_fill_grey() +
  scale_y_continuous(
    breaks = function(x) seq(ceiling(x[1]), floor(x[2]), by = 1),
    expand = expansion(mult = c(0, 0.05))
    ) +
  theme(axis.text.x=element_text(colour="black", angle = 45, hjust = 1), 
        axis.text.y=element_text(colour="Black"),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        panel.border = element_blank(),
        panel.background = element_blank(), 
        axis.ticks.x = element_blank())
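
Two edge cases are worth keeping in mind with the seq()-based breaks function: seq() raises an error when the limits contain no integer at all (ceiling(x[1]) is then greater than floor(x[2])), and a very wide axis gets one break per integer. The sketch below is a guarded variant under those assumptions. The name capped_int_breaks and the max_n parameter are hypothetical and not part of the original answer.

```r
# Hypothetical guarded variant of the breaks function above.
capped_int_breaks <- function(x, max_n = 10) {
  lo <- ceiling(x[1])
  hi <- floor(x[2])
  # no integer lies inside the limits: return no breaks instead of erroring
  if (lo > hi) return(numeric(0))
  # widen the integer step so that roughly max_n intervals are drawn at most
  step <- max(1, ceiling((hi - lo) / max_n))
  seq(lo, hi, by = step)
}

capped_int_breaks(c(-0.5, 10.5))  # every integer from 0 to 10
capped_int_breaks(c(0.2, 0.8))    # numeric(0)
capped_int_breaks(c(0, 1000))     # steps of 100 instead of 1001 breaks
```

It would be used the same way, e.g. scale_y_continuous(breaks = capped_int_breaks).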

Solution 6 - R

You can use the accuracy argument of scales::label_number() or scales::label_comma() for this:

fakedata <- data.frame(
  x = 1:5,
  y = c(0.1, 1.2, 2.4, 2.9, 2.2)
)

library(ggplot2)

# without the accuracy argument, you see .0 decimals
ggplot(fakedata, aes(x = x, y = y)) +
  geom_point() +
  scale_y_continuous(labels = scales::comma)

# with the accuracy argument, all displayed numbers are integers
ggplot(fakedata, aes(x = x, y = y)) +
  geom_point() +
  scale_y_continuous(labels = ~ scales::comma(.x, accuracy = 1))

# equivalent
ggplot(fakedata, aes(x = x, y = y)) +
  geom_point() +
  scale_y_continuous(labels = scales::label_comma(accuracy = 1))

# this works with scales::label_number() as well
ggplot(fakedata, aes(x = x, y = y)) +
  geom_point() +
  scale_y_continuous(labels = scales::label_number(accuracy = 1))

Created on 2021-08-27 by the reprex package (v2.0.0.9000)

Solution 7 - R

All of the existing answers seem to require custom functions or fail in some cases.

This line makes integer breaks:

bad_scale_plot +
  scale_y_continuous(breaks = scales::breaks_extended(Q = c(1, 5, 2, 4, 3)))

For more info, see the documentation ?labeling::extended (which is a function called by scales::breaks_extended).

Basically, the argument Q is a set of nice numbers that the algorithm tries to use for scale breaks. The original plot produces non-integer breaks (0, 2.5, 5, and 7.5) because the default value for Q includes 2.5: Q = c(1,5,2,2.5,4,3).

EDIT: as pointed out in a comment, non-integer breaks can occur when the y-axis has a small range. By default, breaks_extended() tries to make about n = 5 breaks, which is impossible when the range is too small. Quick testing shows that ranges wider than 0 < y < 2.5 give integer breaks (n can also be decreased manually).

Solution 8 - R

I found this solution from Joshua Cook, and it worked pretty well.

integer_breaks <- function(n = 5, ...) {
  fxn <- function(x) {
    breaks <- floor(pretty(x, n, ...))
    names(breaks) <- attr(breaks, "labels")
    breaks
  }
  return(fxn)
}

q + geom_col(position = 'dodge', colour = 'black') +
  scale_y_continuous(breaks = integer_breaks())

The source is: https://joshuacook.netlify.app/post/integer-values-ggplot-axis/
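
One thing to be aware of: because this approach floors every candidate from pretty(), several candidates can collapse onto the same integer when the axis range is narrow. The sketch below shows this on hypothetical limits, together with a variant that drops the repeats. The name unique_integer_breaks is an assumption introduced here and is not from the linked post.

```r
integer_breaks <- function(n = 5, ...) {
  function(x) {
    breaks <- floor(pretty(x, n, ...))
    names(breaks) <- attr(breaks, "labels")
    breaks
  }
}

narrow <- integer_breaks()(c(0, 1))  # hypothetical narrow limits
anyDuplicated(narrow) > 0            # TRUE: several candidates floor to 0

# Hypothetical variant that removes the repeated values.
unique_integer_breaks <- function(n = 5, ...) {
  function(x) unique(floor(pretty(x, n, ...)))
}
deduped <- unique_integer_breaks()(c(0, 1))
anyDuplicated(deduped) > 0           # FALSE
```

Whether the repeated breaks matter in practice may depend on the plot, so treat the variant as optional.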

Solution 9 - R

This answer builds on @Axeman's answer to address the comment by kory that, if the data only goes from 0 to 1, no break is shown at 1. This seems to be caused by floating-point inaccuracy in pretty(): an output that prints as 1 is not necessarily identical to 1 (see the example at the end).

Therefore if you use

int_breaks_rounded <- function(x, n = 5)  pretty(x, n)[round(pretty(x, n),1) %% 1 == 0]

with

+ scale_y_continuous(breaks = int_breaks_rounded)

both 0 and 1 are shown as breaks.

Example to illustrate difference from Axeman's

testdata <- data.frame(x = 1:5, y = c(0,1,0,1,1))

p1 <- ggplot(testdata, aes(x = x, y = y))+
  geom_point()


p1 + scale_y_continuous(breaks = int_breaks)
p1 + scale_y_continuous(breaks =  int_breaks_rounded)

Both will work with the data provided in the initial question.

Illustration of why rounding is required

pretty(c(0,1.05),5)
#> [1] 0.0 0.2 0.4 0.6 0.8 1.0 1.2
identical(pretty(c(0,1.05),5)[6],1)
#> [1] FALSE

Solution 10 - R

Google brought me to this question. I'm trying to use real numbers in a y scale. The y scale numbers are in Millions.

The scales package comma method introduces a comma to my large numbers. This post on R-Bloggers explains a simple approach using the comma method:

library(ggplot2)
library(scales)

big_numbers <- data.frame(x = 1:5, y = c(1000000:1000004))

big_numbers_plot <- ggplot(big_numbers, aes(x = x, y = y)) +
  geom_point()

big_numbers_plot + scale_y_continuous(labels = comma)

Enjoy R :)

Solution 11 - R

If your values are integers, here is another way of doing this with group = 1 and as.factor(value):

library(tidyverse)

data3 <- structure(list(
  IR = structure(c(4L, 3L, 2L, 1L, 4L, 3L, 2L, 1L),
                 .Label = c("0.13-0.16", "0.17-0.23", "0.24-0.27", "0.28-1"),
                 class = "factor"),
  variable = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
                       .Label = c("Real queens", "Simulated individuals"),
                       class = "factor"),
  value = c(2L, 2L, 6L, 10L, 0L, 1L, 4L, 4L),
  Legend = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
                     .Label = c("Real queens", "Simulated individuals"),
                     class = "factor")),
  .Names = c("IR", "variable", "value", "Legend"),
  row.names = c(NA, -8L), class = "data.frame")
data3 %>% 
  mutate(value = as.factor(value)) %>% 
  ggplot(aes(x =factor(IR), y = value, fill = Legend, width=.15)) +
  geom_col(position = 'dodge', colour='black', group = 1) 

Created on 2022-04-05 by the reprex package (v2.0.1)

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stackoverflow |
|---|---|---|
| Question | Atticus29 | View Question on Stackoverflow |
| Solution 1 - R | Sealander | View Answer on Stackoverflow |
| Solution 2 - R | Daniel Gardiner | View Answer on Stackoverflow |
| Solution 3 - R | Didzis Elferts | View Answer on Stackoverflow |
| Solution 4 - R | Axeman | View Answer on Stackoverflow |
| Solution 5 - R | Nat | View Answer on Stackoverflow |
| Solution 6 - R | Droplet | View Answer on Stackoverflow |
| Solution 7 - R | Nick | View Answer on Stackoverflow |
| Solution 8 - R | Bruno Vidigal | View Answer on Stackoverflow |
| Solution 9 - R | Sarah | View Answer on Stackoverflow |
| Solution 10 - R | Tony Cronin | View Answer on Stackoverflow |
| Solution 11 - R | bobloblawlawblog | View Answer on Stackoverflow |