Create a data.frame where a column is a list


R Problem Overview


I know how to add a list column:

> df <- data.frame(a=1:3)
> df$b <- list(1:1, 1:2, 1:3)
> df
  a       b
1 1       1
2 2    1, 2
3 3 1, 2, 3

This works, but the following does not:

> df <- data.frame(a=1:3, b=list(1:1, 1:2, 1:3))
Error in data.frame(1L, 1:2, 1:3, check.names = FALSE, stringsAsFactors = TRUE) : 
  arguments imply differing number of rows: 1, 2, 3

Why?

Also, is there a way to create df (above) in a single call to data.frame?

R Solutions


Solution 1 - R

Slightly obscurely, from ?data.frame:

> If a list or data frame or matrix is passed to ‘data.frame’ it is as if each component or column had been passed as a separate argument (except for matrices of class ‘"model.matrix"’ and those protected by ‘I’).

(emphasis added; the relevant part is the final clause about ‘I’).

So

data.frame(a=1:3,b=I(list(1,1:2,1:3)))

seems to work.
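
To see why the one-call version in the question fails, it may help to compare what data.frame() does with a bare list and with one wrapped in I(). This is a minimal sketch in base R; the names split_up, res and kept are made up for illustration.

```r
# A bare list is treated as if each element had been passed separately,
# so three length-1 elements become three columns of a one-row data frame.
split_up <- data.frame(b = list(1, 2, 3))
dim(split_up)

# With elements of length 1, 2 and 3 the would-be columns disagree on the
# number of rows, which is the error shown in the question.
res <- try(data.frame(a = 1:3, b = list(1:1, 1:2, 1:3)), silent = TRUE)
inherits(res, "try-error")

# I() protects the list, so it is kept as one list column with three rows.
kept <- data.frame(a = 1:3, b = I(list(1, 1:2, 1:3)))
dim(kept)
is.list(kept$b)
```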

Solution 2 - R

If you are working with data.tables, you can avoid the call to I():

library(data.table)
# the following works as intended
data.table(a=1:3,b=list(1,1:2,1:3))

   a     b
1: 1     1
2: 2   1,2
3: 3 1,2,3

Solution 3 - R

Tibbles (class tbl_df, also called tbl or, in older code, data_frames) natively support list columns in their constructor. To use them, load a package that provides them, such as tibble, dplyr or tidyverse. The example below uses the older data_frame() constructor, which has since been deprecated in favor of tibble().

> data_frame(abc = letters[1:3], lst = list(1:3, 1:3, 1:3))
# A tibble: 3 × 2
    abc       lst
  <chr>    <list>
1     a <int [3]>
2     b <int [3]>
3     c <int [3]>

They are actually data.frames under the hood, but somewhat modified. They can almost always be used as normal data.frames. The only exception I've found is that when people do inappropriate class checks, they cause problems:

> #no problem
> data.frame(x = 1:3, y = 1:3) %>% class
[1] "data.frame"
> data.frame(x = 1:3, y = 1:3) %>% class == "data.frame"
[1] TRUE
> #uh oh
> data_frame(x = 1:3, y = 1:3) %>% class
[1] "tbl_df"     "tbl"        "data.frame"
> data_frame(x = 1:3, y = 1:3) %>% class == "data.frame"
[1] FALSE FALSE  TRUE
> #don't use if() with a test that returns more than one value!
> if(data_frame(x = 1:3, y = 1:3) %>% class == "data.frame") "something"
Warning message:
In if (data_frame(x = 1:3, y = 1:3) %>% class == "data.frame") "something" :
  the condition has length > 1 and only the first element will be used
> #proper
> data_frame(x = 1:3, y = 1:3) %>% inherits("data.frame")
[1] TRUE
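
The class-check pitfall can be reproduced without loading tibble. The sketch below fakes the class vector on a plain data.frame (fake_tbl is purely illustrative, not a real tibble) and compares the fragile == test with inherits() and is.data.frame(). Note that since R 4.2.0 a condition of length greater than one in if() is an error rather than a warning.

```r
# Illustrative stand-in: a plain data.frame given the class vector a tibble has.
fake_tbl <- structure(data.frame(x = 1:3, y = 1:3),
                      class = c("tbl_df", "tbl", "data.frame"))

# Fragile: == compares every element of the class vector.
fragile <- class(fake_tbl) == "data.frame"
fragile

# Robust: both checks return a single logical value.
inherits(fake_tbl, "data.frame")
is.data.frame(fake_tbl)
```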

I recommend reading about them in R for Data Science (free).

Solution 4 - R

You can use list2DF (available since R 4.0.0) to create a data.frame where a column is a list.

x <- list2DF(list(a=1:3, b=list(1:1, 1:2, 1:3)))
#x <- data.frame(a=1:3, list2DF(list(b=list(1:1, 1:2, 1:3)))) #Alternative

x
#  a       b
#1 1       1
#2 2    1, 2
#3 3 1, 2, 3

str(x)
#'data.frame':   3 obs. of  2 variables:
# $ a: int  1 2 3
# $ b:List of 3
#  ..$ : int 1
#  ..$ : int  1 2
#  ..$ : int  1 2 3

With this approach the list column does not carry the AsIs class, which it would if it had been created with I().
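
A quick way to check that difference is to compare the class of the list column produced by each approach. This is a sketch that assumes R 4.0.0 or later, where list2DF() is available; via_I and via_list2DF are illustrative names.

```r
via_I       <- data.frame(a = 1:3, b = I(list(1L, 1:2, 1:3)))
via_list2DF <- list2DF(list(a = 1:3, b = list(1L, 1:2, 1:3)))

class_I       <- class(via_I$b)        # carries the AsIs marker
class_list2DF <- class(via_list2DF$b)  # a plain list

# If the AsIs marker is unwanted, it can be dropped after construction.
via_I$b <- unclass(via_I$b)
class(via_I$b)
```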

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type     Original Author   Original Content on Stackoverflow
Question         flodel            View Question on Stackoverflow
Solution 1 - R   Ben Bolker        View Answer on Stackoverflow
Solution 2 - R   mnel              View Answer on Stackoverflow
Solution 3 - R   CoderGuy123       View Answer on Stackoverflow
Solution 4 - R   GKi               View Answer on Stackoverflow