R define dimensions of empty data frame

R Problem Overview


I am trying to collect some data from multiple subsets of a data set and need to create a data frame to hold the results. My problem is that I don't know how to create an empty data frame with a defined number of columns without actually having data to put into it.

collect1 <- c()  ## i'd like to create empty df w/ 3 columns: `id`, `max1` and `min1`

for(i in 1:10){
collect1$id <- i
ss1 <- subset(df1, df1$id == i)
collect1$max1 <- max(ss1$value)
collect1$min1 <- min(ss1$value)
}

I feel very dumb asking this question (I almost feel like I've asked it on SO before but can't find it) but would greatly appreciate any help.

R Solutions


Solution 1 - R

Would a data frame of NAs work? Something like:

data.frame(matrix(NA, nrow = 2, ncol = 3))

If you need to be more specific about the data type, you may prefer NA_integer_, NA_real_, NA_complex_, or NA_character_ instead of plain NA, which is logical.

Something else that may be more specific than the NAs is:

data.frame(matrix(vector(mode = 'numeric',length = 6), nrow = 2, ncol = 3))

where the mode can be any vector type. See ?vector
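
As a minimal sketch of the typed-NA idea, assuming the three columns from the question (`id`, `max1`, `min1`) and a hypothetical row count `n`, each column can be pre-filled with the NA constant of the type it will hold:

```r
# Hypothetical row count; the question loops over 10 ids
n <- 10

# Pre-fill each column with a typed NA so the column class is fixed up front
collect1 <- data.frame(id   = rep(NA_integer_, n),
                       max1 = rep(NA_real_, n),
                       min1 = rep(NA_real_, n))

str(collect1)  # inspect the column classes
```

Rows that are never filled in stay NA, which makes them easy to spot afterwards.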

Solution 2 - R

Just create a data frame of empty vectors:

collect1 <- data.frame(id = integer(0), max1 = numeric(0), min1 = numeric(0))

But if you know how many rows you're going to have in advance, you should just create the data frame with that many rows to start with.
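
For the case where the number of rows is not known in advance, here is a rough sketch of filling a zero-row data frame by appending one row at a time with rbind. The toy `df1` is an assumption made up for illustration. Growing a data frame this way is slow for large loops, which is why preallocating is preferable when the row count is known.

```r
# Zero-row data frame with the column names from the question
collect1 <- data.frame(id = integer(0), max1 = numeric(0), min1 = numeric(0))

# Toy input standing in for the question's df1 (an assumption for illustration)
df1 <- data.frame(id = rep(1:3, each = 4),
                  value = c(5, 2, 9, 1, 7, 3, 8, 4, 6, 0, 2, 5))

for (i in 1:3) {
  ss1 <- df1[df1$id == i, ]
  # Append one row; the new row must use the same column names
  collect1 <- rbind(collect1,
                    data.frame(id = i, max1 = max(ss1$value), min1 = min(ss1$value)))
}

collect1
```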

Solution 3 - R

You can do something like:

N <- 10
collect1 <- data.frame(id   = integer(N),
                       max1 = numeric(N),
                       min1 = numeric(N))

Note that the loop in your code never uses the row index, so each iteration overwrites the previous result instead of filling the data.frame row by row. It should be:

for(i in seq_len(N)){
   collect1$id[i] <- i
   ss1 <- subset(df1, df1$id == i)
   collect1$max1[i] <- max(ss1$value)
   collect1$min1[i] <- min(ss1$value)
}

Finally, I would say that there are many alternatives for doing what you are trying to accomplish, some would be much more efficient and use a lot less typing. You could for example look at the aggregate function, or ddply from the plyr package.
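
As a sketch of the aggregate alternative mentioned above, the per-id maximum and minimum can be computed without a loop or a preallocated data frame. The toy `df1` is an assumption made up for illustration; only its `id` and `value` columns come from the question.

```r
# Toy input standing in for the question's df1 (an assumption for illustration)
df1 <- data.frame(id = rep(1:3, each = 4),
                  value = c(5, 2, 9, 1, 7, 3, 8, 4, 6, 0, 2, 5))

# One max and one min per id
max1 <- aggregate(value ~ id, data = df1, FUN = max)
min1 <- aggregate(value ~ id, data = df1, FUN = min)

# Combine into the layout the question asked for: id, max1, min1
collect1 <- data.frame(id = max1$id, max1 = max1$value, min1 = min1$value)
collect1
```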

Solution 4 - R

You may use NULL instead of NA. This creates a truly empty data frame.

Solution 5 - R

df = data.frame(matrix("", ncol = 3, nrow = 10))  

Solution 6 - R

Here is a solution if you want an empty data frame with a defined number of rows and NO columns:

df = data.frame(matrix(NA, ncol=1, nrow=10))[-1]
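
To see what such an object looks like in practice, this small sketch builds a 10-row data frame with no columns and then adds one. The `id` column is only an illustrative choice.

```r
# Build a one-column data frame, then drop that column; the 10 rows remain
df <- data.frame(matrix(NA, ncol = 1, nrow = 10))[-1]
dim(df)  # rows first, then columns

# Columns can then be added one at a time, each of length nrow(df)
df$id <- seq_len(nrow(df))
```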

Solution 7 - R

The solution given in another forum might help. Basically, it is:

Cols <- paste("A", 1:5, sep="")
DF <- read.table(textConnection(""), col.names = Cols,colClasses = "character")

> str(DF)
'data.frame':	0 obs. of  5 variables:
$ A1: chr
$ A2: chr
$ A3: chr
$ A4: chr
$ A5: chr

You can change the colClasses to fit your needs.

Original link is <https://stat.ethz.ch/pipermail/r-help/2008-August/169966.html>
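
As a sketch of changing colClasses, the same idea can give each column its own type. The column names below are borrowed from the question, and the `text = ""` argument is used here in place of `textConnection("")` as an assumption that your R version supports it.

```r
# Column names from the question, one class per column
Cols <- c("id", "max1", "min1")
DF <- read.table(text = "", col.names = Cols,
                 colClasses = c("integer", "numeric", "numeric"))

str(DF)  # zero rows, three typed columns
```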

Solution 8 - R

A more general method to create a zero-row data frame with an arbitrary number of columns is to create a 1-by-n data frame from a matrix of the same dimensions, then immediately drop the first row:

> v <- data.frame(matrix(NA, nrow=1, ncol=10))
> v <- v[-1, , drop=FALSE]
> v
 [1] X1  X2  X3  X4  X5  X6  X7  X8  X9  X10
<0 rows> (or 0-length row.names)
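
A possible shortcut, sketched here under the assumption that the columns should carry the names from the question, is to start from a matrix that already has zero rows and then rename the columns:

```r
# Hypothetical column names taken from the question
cols <- c("id", "max1", "min1")

# A matrix with zero rows converts directly to a zero-row data frame
v <- data.frame(matrix(NA, nrow = 0, ncol = length(cols)))
names(v) <- cols

dim(v)
```

The columns start out as logical because NA is logical, so their type changes once real values are added.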

Solution 9 - R

If only the column names are available, like:

cnms <- c("Nam1","Nam2","Nam3")

To create an empty data frame with the above variable names, first create a data.frame object:

emptydf <- data.frame()

Now assign a zero-length vector to each column name, thus creating an empty data frame with the given variable names:

for (i in seq_along(cnms)) {
     emptydf[[cnms[i]]] <- logical(0)  # zero-length column; pick the type you need
 }

Solution 10 - R

seq_len(nrow(df)) gives one index per row of your data, so you can use it to create a data.frame with the desired number of rows. Note that seq_along(df) on a data frame counts its columns, not its rows.

    listdf <- data.frame(ID=seq_len(nrow(df)),
                              var1=seq_len(nrow(df)), var2=seq_len(nrow(df)))

Solution 11 - R

I have come across the same problem and have a cleaner solution. Instead of creating an empty data.frame, you can save your data as a named list. Once you have added all results to this list, you convert it to a data.frame.

For the case of adding features one at a time this works best.

mylist = list()
for(column in 1:10) mylist[[paste0("col", column)]] = rnorm(10)  # mylist$column would overwrite a single element named "column"
mydf = data.frame(mylist)

For the case of adding rows one at a time this becomes tricky due to mixed types. If all types are the same it is easy.

mylist = list()
for(row in 1:10) mylist[[paste0("row", row)]] = rnorm(10)  # one named element per row
mydf = data.frame(do.call(rbind, mylist))

I haven't found a simple way to add rows of mixed types. In this case, if you must do it this way, the empty data.frame is probably the best solution.
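
One possible approach for mixed types, which is not part of the original answer, is to store each row as a one-row data frame and bind them at the end. The column names and values below are invented for illustration.

```r
rows <- list()
for (i in 1:3) {
  # Each element is a one-row data frame, so every column keeps its own type
  rows[[i]] <- data.frame(id = i, label = letters[i], score = i / 2)
}

# Stack the one-row data frames into a single data frame
mydf <- do.call(rbind, rows)
str(mydf)
```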

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | screechOwl | View Question on Stackoverflow
Solution 1 - R | aatrujillob | View Answer on Stackoverflow
Solution 2 - R | Hong Ooi | View Answer on Stackoverflow
Solution 3 - R | flodel | View Answer on Stackoverflow
Solution 4 - R | user3148065 | View Answer on Stackoverflow
Solution 5 - R | Amarjeet | View Answer on Stackoverflow
Solution 6 - R | Sally | View Answer on Stackoverflow
Solution 7 - R | Jose | View Answer on Stackoverflow
Solution 8 - R | Brendon | View Answer on Stackoverflow
Solution 9 - R | Vikram Venkat | View Answer on Stackoverflow
Solution 10 - R | FRANK Liu | View Answer on Stackoverflow
Solution 11 - R | Adam Waring | View Answer on Stackoverflow