How to group by all but one columns?

R Problem Overview

How do I tell group_by to group the data by all columns except a given one?

With aggregate, it would be aggregate(x ~ ., ...).

I tried group_by(data, -x), but that groups by the negative-of-x (i.e. the same as grouping by x).

R Solutions

Solution 1 - R

dplyr version 1.0+

In dplyr 1.0.0 coming up, the _at functions are falling into the superseded lifecycle (i.e. while they remain in dplyr for the foreseeable future, there are now better alternatives that are more actively developed). The new way to accomplish this is via the across function:

df %>%
  group_by(across(c(-hp)))

dplyr v 0.7+

A small update on this question because I stumbled across this myself and found an elegant solution with current version of dplyr (0.7.4): Inside group_by_at(), you can supply the names of columns the same way as in the select() function using vars(). This enables us to group by everything but one column (hp in this example) by writing:

library(dplyr)
df <- as_tibble(mtcars, rownames = "car")
df %>% group_by_at(vars(-hp))

Solution 2 - R

Building on the @eipi10's dplyr 0.7.0 edit, group_by_at appears to be the right function for this job. However, if you are simply looking to omit column "x", then you can use:

new2.0 <- dat %>%
  group_by_at(vars(-x)) %>%
  summarize(mean_value = mean(value))

Using @eipi10's example data:

# Fake data
set.seed(492)
dat <- data.frame(value = rnorm(1000),
             g1 = sample(LETTERS, 1000, replace = TRUE),
             g2 = sample(letters, 1000, replace = TRUE),
             g3 = sample(1:10, replace = TRUE),
             other = sample(c("red", "green", "black"), 1000, replace = TRUE))

new <- dat %>% 
  group_by_at(names(dat)[-grep("value", names(dat))]) %>%
  summarise(meanValue = mean(value))


new2.0 <- dat %>% 
  group_by_at(vars(-value)) %>% 
  summarize(meanValue = mean(value))

identical(new, new2.0)
# [1] TRUE

Solution 3 - R

You can do this using standard evaluation (group_by_ instead of group_by):

# Fake data
set.seed(492)
dat = data.frame(value=rnorm(1000), g1=sample(LETTERS,1000,replace=TRUE),
                 g2=sample(letters,1000,replace=TRUE), g3=sample(1:10, replace=TRUE),
                 other=sample(c("red","green","black"),1000,replace=TRUE))

dat %>% group_by_(.dots=names(dat)[-grep("value", names(dat))]) %>%
  summarise(meanValue=mean(value))

> g1 g2 g3 other meanValue > > 1 A a 2 green 0.89281475 > 2 A b 2 red -0.03558775 > 3 A b 5 black -1.79184218 > 4 A c 10 black 0.17518610 > 5 A e 5 black 0.25830392 > ...

See this vignette for more on standard vs. non-standard evaluation in dplyr.

UPDATE for `dplyr` 0.7.0

To address @ÖmerAn's comment: It looks like group_by_at is the way to go in dplyr 0.7.0 (someone please correct me if I'm wrong about this). For example:

dat %>% 
  group_by_at(setdiff(names(dat), "value")) %>%
  summarise(meanValue=mean(value))

> # Groups: g1, g2, g3 [?] > g1 g2 g3 other meanValue > > 1 A a 2 green 0.89281475 > 2 A b 2 red -0.03558775 > 3 A b 5 black -1.79184218 > 4 A c 10 black 0.17518610 > 5 A e 5 black 0.25830392 > 6 A e 5 red -0.81879788 > 7 A e 7 green 0.30836054 > 8 A f 2 green 0.05537047 > 9 A g 1 black 1.00156405 > 10 A g 10 black 1.26884303 > # ... with 949 more rows

Let's confirm both methods give the same output (in dplyr 0.7.0):

new = dat %>% 
  group_by_at(setdiff(names(dat), "value")) %>%
  summarise(meanValue=mean(value))

old = dat %>% 
  group_by_(.dots=names(dat)[-grep("value", names(dat))]) %>%
  summarise(meanValue=mean(value))

identical(old, new)
# [1] TRUE

Content Type	Original Author	Original Content on Stackoverflow
Question	Roman Cheplyaka	View Question on Stackoverflow
Solution 1 - R	Jannik Buhr	View Answer on Stackoverflow
Solution 2 - R	ZS27	View Answer on Stackoverflow
Solution 3 - R	eipi10	View Answer on Stackoverflow

How to group by all but one columns?

R Problem Overview

R Solutions

Solution 1 - R

dplyr version 1.0+

dplyr v 0.7+

Solution 2 - R

Solution 3 - R

UPDATE for `dplyr` 0.7.0

How to to return an image with Web API Get method

How to define Object of Objects type in typescript

Attributions

R Problem Overview

R Solutions

Solution 1 - R

dplyr version 1.0+

dplyr v 0.7+

Solution 2 - R

Solution 3 - R

UPDATE for dplyr 0.7.0

How to to return an image with Web API Get method

How to define Object of Objects type in typescript

Attributions

UPDATE for `dplyr` 0.7.0