How to convert a list consisting of vector of different lengths to a usable data frame in R?

RVectorDataframe

R Problem Overview


I have a (fairly long) list of vectors. The vectors consist of Russian words that I got by using the strsplit() function on sentences.

The following is what head() returns:

[[1]]
[1] "модно"     "создавать" "резюме"    "в"         "виде"     

[[2]]
[1] "ты"        "начианешь" "работать"  "с"         "этими"    

[[3]]
[1] "модно"            "называть"         "блогер-рилейшенз" "―"                "начинается"       "задолго"         

[[4]]
[1] "видел" "по"    "сыну," "что"   "он"   

[[5]]
[1] "четырнадцать," "я"             "поселился"     "на"            "улице"        

[[6]]
[1] "широко"     "продолжали" "род."

Note the vectors are of different length.

What I want is to be able to read the first words from each sentence, the second word, the third, etc.

The desired result would be something like this:

    P1              P2           P3                 P4    P5           P6
[1] "модно"         "создавать"  "резюме"           "в"   "виде"       NA
[2] "ты"            "начианешь"  "работать"         "с"   "этими"      NA
[3] "модно"         "называть"   "блогер-рилейшенз" "―"   "начинается" "задолго"         
[4] "видел"         "по"         "сыну,"            "что" "он"         NA
[5] "четырнадцать," "я"          "поселился"        "на"  "улице"      NA
[6] "широко"        "продолжали" "род."             NA    NA           NA

I have tried to just use data.frame() but that didn't work because the rows are of different length. I also tried rbind.fill() from the plyr package, but that function can only process matrices.

I found some other questions here (that's where I got the plyr help from), but those were all about combining for instance two data frames of different size.

Thanks for your help.

R Solutions


Solution 1 - R

One liner with plyr

plyr::ldply(word.list, rbind)

Solution 2 - R

try this:

word.list <- list(letters[1:4], letters[1:5], letters[1:2], letters[1:6])
n.obs <- sapply(word.list, length)
seq.max <- seq_len(max(n.obs))
mat <- t(sapply(word.list, "[", i = seq.max))

the trick is, that,

c(1:2)[1:4]

returns the vector + two NAs

Solution 3 - R

Another option is stri_list2matrix from library(stringi)

library(stringi)
stri_list2matrix(l, byrow=TRUE)
#    [,1] [,2] [,3] [,4]
#[1,] "a"  "b"  "c"  NA  
#[2,] "a2" "b2" NA   NA  
#[3,] "a3" "b3" "c3" "d3"

NOTE: Data from @juba's post.

Or as @Valentin mentioned in the comments

sapply(l, "length<-", max(lengths(l)))

Solution 4 - R

You can do something like this :

## Example data
l <- list(c("a","b","c"), c("a2","b2"), c("a3","b3","c3","d3"))
## Compute maximum length
max.length <- max(sapply(l, length))
## Add NA values to list elements
l <- lapply(l, function(v) { c(v, rep(NA, max.length-length(v)))})
## Rbind
do.call(rbind, l)

Which gives :

     [,1] [,2] [,3] [,4]
[1,] "a"  "b"  "c"  NA  
[2,] "a2" "b2" NA   NA  
[3,] "a3" "b3" "c3" "d3"

Solution 5 - R

You could also use rbindlist() from the data.table package.

Convert vectors to data.tables or data.frames and transpose them (not sure if this reduces speed a lot) with the help of lapply(). Then bind them with rbindlist() - filling missing cells with NA.

require(data.table)

l = list(c("a","b","c"), c("a2","b2"), c("a3","b3","c3","d3"))
dt = rbindlist(
  lapply(l, function(x) data.table(t(x))),
  fill = TRUE
)

Solution 6 - R

Another option could be to define a function like this (it'd mimic rbind.fill) or use it directly from rowr package:

cbind.fill <- function(...){
  nm <- list(...) 
  nm <- lapply(nm, as.matrix)
  n <- max(sapply(nm, nrow)) 
  do.call(cbind, lapply(nm, function (x) 
    rbind(x, matrix(, n-nrow(x), ncol(x))))) 
}

This response is taken from here (and there're some usage examples).

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionIcoView Question on Stackoverflow
Solution 1 - RRamnathView Answer on Stackoverflow
Solution 2 - RadibenderView Answer on Stackoverflow
Solution 3 - RakrunView Answer on Stackoverflow
Solution 4 - RjubaView Answer on Stackoverflow
Solution 5 - RandscharView Answer on Stackoverflow
Solution 6 - RjgarcesView Answer on Stackoverflow