Comparing gather (tidyr) to melt (reshape2)

RReshape2Tidyr

R Problem Overview


I love the reshape2 package because it made life so doggone easy. Typically Hadley has made improvements in his previous packages that enable streamlined, faster running code. I figured I'd give tidyr a whirl and from what I read I thought gather was very similar to melt from reshape2. But after reading the documentation I can't get gather to do the same task that melt does.

Data View

Here's a view of the data (actual data in dput form at end of post):

  teacher yr1.baseline     pd yr1.lesson1 yr1.lesson2 yr2.lesson1 yr2.lesson2 yr2.lesson3
1       3      1/13/09 2/5/09      3/6/09     4/27/09     10/7/09    11/18/09      3/4/10
2       7      1/15/09 2/5/09      3/3/09      5/5/09    10/16/09    11/18/09      3/4/10
3       8      1/27/09 2/5/09      3/3/09     4/27/09     10/7/09    11/18/09      3/5/10

Code

Here's the code in melt fashion, my attempt at gather. How can I make gather do the same thing as melt?

library(reshape2); library(dplyr); library(tidyr)

dat %>% 
   melt(id=c("teacher", "pd"), value.name="date") 

dat %>% 
   gather(key=c(teacher, pd), value=date, -c(teacher, pd)) 

Desired Output

   teacher     pd     variable     date
1        3 2/5/09 yr1.baseline  1/13/09
2        7 2/5/09 yr1.baseline  1/15/09
3        8 2/5/09 yr1.baseline  1/27/09
4        3 2/5/09  yr1.lesson1   3/6/09
5        7 2/5/09  yr1.lesson1   3/3/09
6        8 2/5/09  yr1.lesson1   3/3/09
7        3 2/5/09  yr1.lesson2  4/27/09
8        7 2/5/09  yr1.lesson2   5/5/09
9        8 2/5/09  yr1.lesson2  4/27/09
10       3 2/5/09  yr2.lesson1  10/7/09
11       7 2/5/09  yr2.lesson1 10/16/09
12       8 2/5/09  yr2.lesson1  10/7/09
13       3 2/5/09  yr2.lesson2 11/18/09
14       7 2/5/09  yr2.lesson2 11/18/09
15       8 2/5/09  yr2.lesson2 11/18/09
16       3 2/5/09  yr2.lesson3   3/4/10
17       7 2/5/09  yr2.lesson3   3/4/10
18       8 2/5/09  yr2.lesson3   3/5/10

Data

dat <- structure(list(teacher = structure(1:3, .Label = c("3", "7", 
    "8"), class = "factor"), yr1.baseline = structure(1:3, .Label = c("1/13/09", 
    "1/15/09", "1/27/09"), class = "factor"), pd = structure(c(1L, 
    1L, 1L), .Label = "2/5/09", class = "factor"), yr1.lesson1 = structure(c(2L, 
    1L, 1L), .Label = c("3/3/09", "3/6/09"), class = "factor"), yr1.lesson2 = structure(c(1L, 
    2L, 1L), .Label = c("4/27/09", "5/5/09"), class = "factor"), 
        yr2.lesson1 = structure(c(2L, 1L, 2L), .Label = c("10/16/09", 
        "10/7/09"), class = "factor"), yr2.lesson2 = structure(c(1L, 
        1L, 1L), .Label = "11/18/09", class = "factor"), yr2.lesson3 = structure(c(1L, 
        1L, 2L), .Label = c("3/4/10", "3/5/10"), class = "factor")), .Names = c("teacher", 
    "yr1.baseline", "pd", "yr1.lesson1", "yr1.lesson2", "yr2.lesson1", 
    "yr2.lesson2", "yr2.lesson3"), row.names = c(NA, -3L), class = "data.frame")

R Solutions


Solution 1 - R

Your gather line should look like:

dat %>% gather(variable, date, -teacher, -pd)

This says "Gather all variables except teacher and pd, calling the new key column 'variable' and the new value column 'date'."


As an explanation, note the following from the help(gather) page:

 ...: Specification of columns to gather. Use bare variable names.
      Select all variables between x and z with ‘x:z’, exclude y
      with-y’. For more options, see the select documentation.

Since this is an ellipsis, the specification of columns to gather is given as separate (bare name) arguments. We wish to gather all columns except teacher and pd, so we use -.

Solution 2 - R

In tidyr 1.0.0 this task is accomplished with the more flexible pivot_longer().

The equivalent syntax would be

library(tidyr)
dat %>% pivot_longer(cols = -c(teacher, pd), names_to = "variable", values_to = "date")

which says, correspondingly, "pivot everything longer except teacher and pd, calling the new variable column "variable" and the new value column "date".

Note that the long data comes back in order firstly of the columns of the previous data frame that were pivoted, unlike from gather, which came back in the order of the new variable column. To rearrange the resultant tibble, use dplyr::arrange().

Solution 3 - R

My solution

    dat%>%
    gather(!c(teacher,pd),key=variable,value=date)

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionTyler RinkerView Question on Stackoverflow
Solution 1 - RDavid RobinsonView Answer on Stackoverflow
Solution 2 - RJoeView Answer on Stackoverflow
Solution 3 - RSravanView Answer on Stackoverflow