Can dplyr join on multiple columns or composite key?

RDplyr

R Problem Overview


I realize that dplyr v3.0 allows you to join on different variables:

left_join(x, y, by = c("a" = "b") will match x.a to y.b

However, is it possible to join on a combination of variables or do I have to add a composite key beforehand?

Something like this:

left_join(x, y, by = c("a c" = "b d") to match the concatenation of [x.a and x.c] to [y.b and y.d]

R Solutions


Solution 1 - R

Updating to use tibble()

You can pass a named vector of length greater than 1 to the by argument of left_join():

library(dplyr)

d1 <- tibble(
  x = letters[1:3],
  y = LETTERS[1:3],
  a = rnorm(3)
  )

d2 <- tibble(
  x2 = letters[3:1],
  y2 = LETTERS[3:1],
  b = rnorm(3)
  )

left_join(d1, d2, by = c("x" = "x2", "y" = "y2"))

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionJasonAizkalnsView Question on Stackoverflow
Solution 1 - RdavechildersView Answer on Stackoverflow