What is the difference between a data frame and a list in R?

R Problem Overview


What is the difference between a data frame and a list in R? Which one should be used when? Which is easier to loop over?

Exact problem: I have to first store 3 string elements like "a", "b", "c". Later for each of these, I need to append 3 more elements; for instance for "a" I have to add "a1", "a2", "a3". Later I have to use nested for loops to access these elements.

So I am unsure whether to use a data frame, a list, or some other data type in which I could first store the elements and then append to each of them (somewhat like columns).

Currently I am getting errors like "number of items to replace is not a multiple of replacement length".

R Solutions


Solution 1 - R

The question isn't as stupid as some people think it is. I know plenty of people who struggle with that difference and with what to use where. To summarize:

Lists are by far the most flexible data structure in R. They can be seen as a collection of elements without any restriction on the class, length or structure of each element. The only thing you need to take care of is that you don't give two elements the same name. That can cause a lot of confusion, and R doesn't raise an error for it:

> X <- list(a=1,b=2,a=3)
> X$a
[1] 1
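As a small illustration of why duplicate names are confusing (a sketch that reuses the X list from above; the variable names first_a, all_a and dup_at are made up for this example): `$` and `[[` return only the first match, so the second `a` has to be reached by position or by comparing against names(X).

```r
X <- list(a = 1, b = 2, a = 3)

first_a <- X[["a"]]                 # only the first element named "a"
all_a   <- X[names(X) == "a"]       # both elements named "a"
dup_at  <- anyDuplicated(names(X))  # index of the first repeated name, 0 if none

print(first_a)
print(length(all_a))
print(dup_at)
```

Checking anyDuplicated(names(X)) before relying on name-based access is one cheap way to catch this early.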

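Data frames are lists as well, but they have a few restrictions:

  • you can't use the same name for two different variables (by default, data.frame() renames duplicates to make them unique)
  • all elements of a data frame are vectors (list and matrix columns are possible, but uncommon)
  • all elements of a data frame have an equal length.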
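The restrictions above can be checked directly. This is a minimal sketch with made-up columns (id, label, x, y are illustrative only): it shows that a data frame is a list with one element per column, and that columns whose lengths do not fit together are rejected.

```r
# A data frame is a list under the hood, with one element per column.
df <- data.frame(id = 1:3, label = c("a", "b", "c"))
df_is_list <- is.list(df)   # TRUE
n_cols <- length(df)        # length() counts columns, not rows

# Columns of length 3 and 2 cannot be combined, so data.frame() fails.
# tryCatch() captures the error message instead of stopping the script.
unequal <- tryCatch(
  data.frame(x = 1:3, y = 1:2),
  error = function(e) conditionMessage(e)
)

print(df_is_list)
print(n_cols)
print(unequal)
```

Note that data.frame() recycles a shorter column when its length divides the longer one evenly, so lengths 4 and 2 would be accepted; the sketch uses 3 and 2 to trigger the error.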

Due to these restrictions and the resulting two-dimensional structure, data frames can mimic some of the behaviour of matrices. You can select rows and do operations on rows. You can't do that with lists, as a row is undefined there.
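A short sketch of that difference, using a hypothetical scores table: row indexing works on the data frame, while the same index on a plain list with identical contents fails because a list has no second dimension.

```r
scores <- data.frame(id = 1:3, score = c(10, 20, 30))

second_row <- scores[2, ]                 # one row, all columns
high <- scores[scores$score > 15, ]       # rows that satisfy a condition

# The same data as a plain list: there is no row to select.
as_list <- list(id = 1:3, score = c(10, 20, 30))
list_row <- tryCatch(
  as_list[2, ],
  error = function(e) "no rows in a plain list"
)

print(second_row)
print(high)
print(list_row)
```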

All this implies that you should use a data frame for any dataset that fits in that two-dimensional structure. Essentially, you use data frames for any dataset where a column coincides with a variable and a row coincides with a single observation in the broad sense of the word. For all other structures, lists are the way to go.

Note that if you want a nested structure, you have to use lists. As elements of a list can be lists themselves, you can create very flexible structured objects.
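For the problem in the question, one possible approach is a named list whose elements are character vectors. The sketch below is an assumption about what the asker needs, not the only way to do it; the names items and collected are made up. It stores the three keys, appends three values to each, and then walks the structure with nested for loops.

```r
# Start with three named, empty character vectors.
items <- list(a = character(0), b = character(0), c = character(0))

# Append three elements to each entry, e.g. "a1", "a2", "a3" for "a".
# [[ ]] replaces the whole element, so the new length does not matter.
for (key in names(items)) {
  items[[key]] <- c(items[[key]], paste0(key, 1:3))
}

# Nested loops: outer loop over the keys, inner loop over each key's values.
collected <- character(0)
for (key in names(items)) {
  for (value in items[[key]]) {
    collected <- c(collected, paste(key, value, sep = ":"))
  }
}

print(items$a)
print(collected)
```

The "number of items to replace is not a multiple of replacement length" error typically appears when several values are assigned into a slot that can hold fewer, for example with single-bracket assignment into a vector. Assigning with `[[` into a list element, as above, avoids that situation.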

Solution 2 - R

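Look at this example, which asks for the class of each column of iris. apply() first converts the data frame to a matrix, and because Species is a factor, every value in that matrix becomes a character string. sapply() treats the data frame as a list of columns, so each column keeps its own class.

apply(iris,2,class) # apply() works on the columns of the coerced matrix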
Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species 
"character"  "character"  "character"  "character"  "character" 

sapply(iris,class) # sapply() works on the columns as list elements
Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species 
"numeric"    "numeric"    "numeric"    "numeric"     "factor" 
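The reason for the difference can be made visible by doing the matrix conversion by hand. This is a sketch using the built-in iris data; the names m, matrix_type and column_classes are illustrative only.

```r
# apply() calls as.matrix() on a data frame. A matrix holds a single type,
# and the factor column Species forces everything to character.
m <- as.matrix(iris)
matrix_type <- typeof(m)

# Iterating over the data frame as a list keeps each column's own class.
# vapply() is used here as a stricter variant of sapply().
column_classes <- vapply(iris, function(col) class(col)[1], character(1))

print(matrix_type)
print(column_classes)
```

So for per-column work on a data frame, list-style functions such as sapply(), lapply() or vapply() are usually the safer choice than apply().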

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stack Overflow
Question | ShazSimple | View Question on Stack Overflow
Solution 1 - R | Joris Meys | View Answer on Stack Overflow
Solution 2 - R | user5220347 | View Answer on Stack Overflow