How to add a row to a data frame in R?

R Dataframe

R Problem Overview


In R, how do you add a new row to a data frame once the data frame has already been initialized?

So far I have this:

df <- data.frame("hi", "bye")
names(df) <- c("hello", "goodbye")

#I am trying to add "hola" and "ciao" as a new row
de <- data.frame("hola", "ciao")

merge(df, de) # Adds to the same row as new columns

# Unfortunately, I couldn't find an rbind() solution that wouldn't give me an error

Any help would be appreciated

R Solutions


Solution 1 - R

Let's make it simple:

df[nrow(df) + 1,] = c("v1","v2")

Solution 2 - R

As @Khashaa and @Richard Scriven point out in the comments, all the data frames you want to append must have consistent column names.

Hence, you need to explicitly declare the column names for the second data frame, de, and then use rbind(). In the question, column names were set only for the first data frame, df:

df<-data.frame("hi","bye")
names(df)<-c("hello","goodbye")

de<-data.frame("hola","ciao")
names(de)<-c("hello","goodbye")

newdf <- rbind(df, de)
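
As a quick check of why the names matter, here is a minimal sketch using only base R. The tryCatch() wrapper and the variable named result are illustrative additions, not part of the original answer; setNames() is just a one-line alternative to the separate names(de) assignment.

```r
df <- data.frame("hi", "bye", stringsAsFactors = FALSE)
names(df) <- c("hello", "goodbye")
de <- data.frame("hola", "ciao", stringsAsFactors = FALSE)

# With mismatched column names, rbind() stops with an error;
# tryCatch() turns that error into a marker string for inspection
result <- tryCatch(rbind(df, de), error = function(e) "error")

# Renaming de on the fly lets rbind() line the columns up
newdf <- rbind(df, setNames(de, names(df)))
```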

Solution 3 - R

There's now add_row() from the tibble or tidyverse packages.

library(tidyverse)
df %>% add_row(hello = "hola", goodbye = "ciao")

Unspecified columns get an NA.
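
A short sketch of the NA behaviour, assuming the tibble package is installed. The data frame is rebuilt here so the snippet runs on its own, and the .before argument (which places the new row at a given position) is shown as an optional extra.

```r
library(tibble)

df <- data.frame(hello = "hi", goodbye = "bye", stringsAsFactors = FALSE)

# Only 'hello' is supplied, so 'goodbye' is filled with NA
partial <- add_row(df, hello = "hola")

# .before = 1 inserts the new row ahead of the existing first row
first <- add_row(df, hello = "hola", goodbye = "ciao", .before = 1)
```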

Solution 4 - R

Or, as inspired by @MatheusAraujo:

df[nrow(df) + 1,] = list("v1","v2")

This would allow for mixed data types.
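
To illustrate the difference, here is a small sketch in base R. The two-column data frame with a numeric column n is an assumption made for the example, since the question's data frame holds only strings.

```r
df <- data.frame(name = "a", n = 1, stringsAsFactors = FALSE)

# c() builds a character vector, so 2 becomes "2"
# and the whole n column is converted to character
df_c <- df
df_c[nrow(df_c) + 1, ] <- c("b", 2)

# list() keeps each element's own type, so n stays numeric
df_l <- df
df_l[nrow(df_l) + 1, ] <- list("b", 2)

class(df_c$n)
class(df_l$n)
```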

Solution 5 - R

I like list instead of c because it handles mixed data types better. Adding an additional column to the original poster's question:

#Create an empty data frame
df <- data.frame(hello=character(), goodbye=character(), volume=double())
de <- list(hello="hi", goodbye="bye", volume=3.0)
df = rbind(df,de, stringsAsFactors=FALSE)
de <- list(hello="hola", goodbye="ciao", volume=13.1)
df = rbind(df,de, stringsAsFactors=FALSE)

Note that the stringsAsFactors argument matters here: before R 4.0.0, strings were converted to factors by default, so set it explicitly if you need character columns.

Or using the original variables with the solution from MatheusAraujo/Ytsen de Boer:

df[nrow(df) + 1,] = list(hello="hallo",goodbye="auf wiedersehen", volume=20.2)

Note that this approach does not handle strings well on an empty data frame whose string columns are factors (the default before R 4.0.0): the new values are not valid factor levels and become NA.

Solution 6 - R

Not terribly elegant, but:

data.frame(rbind(as.matrix(df), as.matrix(de)))

From the documentation of the rbind function:

"For rbind column names are taken from the first argument with appropriate names: colnames for a matrix..."
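
One thing to keep in mind with the matrix route is that a matrix holds a single type, so mixed columns all end up as text. The sketch below assumes a data frame with one numeric column n, which is not in the original question, to make that visible.

```r
df <- data.frame(hello = "hi", n = 1, stringsAsFactors = FALSE)
de <- data.frame("hola", 2, stringsAsFactors = FALSE)  # default names differ from df

# Column names come from the first matrix, so de's names are ignored
out <- data.frame(rbind(as.matrix(df), as.matrix(de)),
                  stringsAsFactors = FALSE)

names(out)
class(out$n)  # no longer numeric after the round trip through a matrix
```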

Solution 7 - R

If you want to make an empty data frame and add contents in a loop, the following may help:

# Number of students in class
student.count <- 36

# Gather data about the students
student.age <- sample(14:17, size = student.count, replace = TRUE)
student.gender <- sample(c('male', 'female'), size = student.count, replace = TRUE)
student.marks <- sample(46:97, size = student.count, replace = TRUE)

# Create empty data frame
student.data <- data.frame()

# Populate the data frame using a for loop
for (i in 1 : student.count) {
    # Get the row data
    age <- student.age[i]
    gender <- student.gender[i]
    marks <- student.marks[i]
    
    # Populate the row
    new.row <- data.frame(age = age, gender = gender, marks = marks)
    
    # Add the row
    student.data <- rbind(student.data, new.row)
}

# Print the data frame
student.data

Hope it helps :)
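
When the column vectors already exist in full, as they do in this example, the loop is optional. A minimal sketch of the same result built in one data.frame() call follows; set.seed() is added only so the random sample is reproducible.

```r
set.seed(1)  # assumption: fixed seed for a repeatable example
student.count <- 36

student.age <- sample(14:17, size = student.count, replace = TRUE)
student.gender <- sample(c('male', 'female'), size = student.count, replace = TRUE)
student.marks <- sample(46:97, size = student.count, replace = TRUE)

# Build all 36 rows at once instead of appending one row per iteration
student.data <- data.frame(age = student.age,
                           gender = student.gender,
                           marks = student.marks,
                           stringsAsFactors = FALSE)
```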

Solution 8 - R

To build a data.frame in a loop:

df <- data.frame()
for(i in 1:10){
  df <- rbind(df, data.frame(str="hello", x=i, y=i*10))
}
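
Calling rbind() inside the loop copies the growing data frame on every iteration, which can become slow for many rows. As an alternative sketch (not the answer's method), the rows can be collected in a list and bound once at the end:

```r
# Pre-allocate a list with one slot per row
rows <- vector("list", 10)

for (i in 1:10) {
  rows[[i]] <- data.frame(str = "hello", x = i, y = i * 10,
                          stringsAsFactors = FALSE)
}

# Bind all one-row data frames together in a single call
df <- do.call(rbind, rows)
```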

Solution 9 - R

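I needed to add stringsAsFactors=FALSE when creating the dataframe (this applies to R versions before 4.0.0, where strings become factors by default). Without it, the new values are not valid factor levels: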

> df <- data.frame("hello"= character(0), "goodbye"=character(0))
> df
[1] hello   goodbye
<0 rows> (or 0-length row.names)
> df[nrow(df) + 1,] = list("hi","bye")
Warning messages:
1: In `[<-.factor`(`*tmp*`, iseq, value = "hi") :
  invalid factor level, NA generated
2: In `[<-.factor`(`*tmp*`, iseq, value = "bye") :
  invalid factor level, NA generated
> df
  hello goodbye
1  <NA>    <NA>
> 

With stringsAsFactors=FALSE, the same assignments work:

> df <- data.frame("hello"= character(0), "goodbye"=character(0), stringsAsFactors=FALSE)
> df
[1] hello   goodbye
<0 rows> (or 0-length row.names)
> df[nrow(df) + 1,] = list("hi","bye")
> df[nrow(df) + 1,] = list("hola","ciao")
> df[nrow(df) + 1,] = list(hello="hallo",goodbye="auf wiedersehen")
> df
  hello         goodbye
1    hi             bye
2  hola            ciao
3 hallo auf wiedersehen
> 

Solution 10 - R

Make certain to specify stringsAsFactors=FALSE when creating the dataframe:

> rm(list=ls())
> trigonometry <- data.frame(character(0), numeric(0), stringsAsFactors=FALSE)
> colnames(trigonometry) <- c("theta", "sin.theta")
> trigonometry
[1] theta     sin.theta
<0 rows> (or 0-length row.names)
> trigonometry[nrow(trigonometry) + 1, ] <- c("0", sin(0))
> trigonometry[nrow(trigonometry) + 1, ] <- c("pi/2", sin(pi/2))
> trigonometry
  theta sin.theta
1     0         0
2  pi/2         1
> typeof(trigonometry)
[1] "list"
> class(trigonometry)
[1] "data.frame"
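
One caveat about the transcript above: c("0", sin(0)) is a character vector, so sin.theta ends up stored as text even though it prints like a number. A minimal sketch of the same table built with list(), which keeps the column numeric:

```r
trigonometry <- data.frame(theta = character(0),
                           sin.theta = numeric(0),
                           stringsAsFactors = FALSE)

# list() keeps the string and the number as separate types
trigonometry[nrow(trigonometry) + 1, ] <- list("0", sin(0))
trigonometry[nrow(trigonometry) + 1, ] <- list("pi/2", sin(pi / 2))

str(trigonometry)
```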

In R versions before 4.0.0, failing to use stringsAsFactors=FALSE when creating the dataframe results in the following warning, and an NA in the new row, when attempting to add the new row:

> trigonometry[nrow(trigonometry) + 1, ] <- c("0", sin(0))
Warning message:
In `[<-.factor`(`*tmp*`, iseq, value = "0") :
  invalid factor level, NA generated

Solution 11 - R

There is a simpler way to append a record from one dataframe to another IF you know that the two dataframes share the same columns and types. To append one row from xx to yy just do the following where i is the i'th row in xx.

yy[nrow(yy)+1,] <- xx[i,]

Simple as that. No messy binds. If you need to append all of xx to yy, then either use a loop or take advantage of R's sequence abilities and do this:

yy[(nrow(yy)+1):(nrow(yy)+nrow(xx)),] <- xx[1:nrow(xx),]
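
Here is a small runnable sketch of appending all rows of one data frame to another by index, with rbind() shown alongside for comparison. The contents of xx and yy are made up for the example; the only assumption carried over from the answer is that both share the same columns and types.

```r
yy <- data.frame(id = 1:2, label = c("a", "b"), stringsAsFactors = FALSE)
xx <- data.frame(id = 3:5, label = c("c", "d", "e"), stringsAsFactors = FALSE)

# Index-based append: target the row positions just past the end of yy
by_index <- yy
by_index[(nrow(yy) + 1):(nrow(yy) + nrow(xx)), ] <- xx

# rbind() gives the same rows when the columns match
by_rbind <- rbind(yy, xx)
```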

Solution 12 - R

To formalize what someone else used setNames for:

add_row <- function(original_data, new_vals_list){ 
  # appends row to dataset while assuming new vals are ordered and classed appropriately. 
  # new_vals must be a list not a single vector. 
  rbind(
    original_data,
    setNames(data.frame(new_vals_list), colnames(original_data))
    )
  }

It preserves class when legal and passes errors elsewhere.

m <- mtcars[ ,1:3]
m$cyl <- as.factor(m$cyl)
str(m)

#'data.frame':	32 obs. of  3 variables:
# $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
# $ cyl : Factor w/ 3 levels "4","6","8": 2 2 1 2 3 2 3 1 1 2 ...
# $ disp: num  160 160 108 258 360 ...

Factor preserved when adding 4, even though it was passed as a numeric.

str(add_row(m, list(20,4,160)))
#'data.frame':	33 obs. of  3 variables:
# $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
# $ cyl : Factor w/ 3 levels "4","6","8": 2 2 1 2 3 2 3 1 1 2 ... 
# $ disp: num  160 160 108 258 360 ...

Attempting to pass a value other than 4, 6, or 8 does not raise an error: the row is still added, but with NA in the factor column and a warning that the factor level is invalid.

str(add_row(m, list(20,3,160)))
# 'data.frame':	33 obs. of  3 variables:
# $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
# $ cyl : Factor w/ 3 levels "4","6","8": 2 2 1 2 3 2 3 1 1 2 ...
# $ disp: num  160 160 108 258 360 ...
Warning message:
In `[<-.factor`(`*tmp*`, ri, value = 3) :
  invalid factor level, NA generated
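
If the new value is meant to be a legitimate level, one option is to extend the factor's levels before appending. The sketch below assumes that is acceptable for your data; the helper is renamed append_row (a hypothetical name) so it does not mask tibble's add_row().

```r
# Same idea as the helper above, under a different (hypothetical) name
append_row <- function(original_data, new_vals_list) {
  rbind(original_data,
        setNames(data.frame(new_vals_list), colnames(original_data)))
}

m <- mtcars[, 1:3]
m$cyl <- as.factor(m$cyl)

# Register "3" as a valid level first, so the new row is not turned into NA
levels(m$cyl) <- c(levels(m$cyl), "3")

m2 <- append_row(m, list(20, 3, 160))
```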

Solution 13 - R

I will add to the other suggestions. I use base R code to create a dataframe:

data_set_name <- data.frame(data_set)

I always suggest making a copy of the original data frame in case you need to go back or test something out:

data_set_name_copy <- data_set_name

Now if you wanted to add a new column the code would look like the following:

data_set_name_copy$Name_of_New_Column <- Data_for_New_Column

The $ operator refers to a column by name; assigning to a name that does not exist yet creates that column. Note that this adds a column rather than a row.

Solution 14 - R

I think,

rbind.data.frame(df, de)

should do the trick, provided both data frames have the same column names.

Solution 15 - R

In dplyr >= 1.0.0 you could use rows_insert:

df1 <- data.frame(hello = "hi", goodbye = "bye")
df2 <- data.frame(hello = "hola", goodbye = "ciao")

library(dplyr)

df1 %>% 
  rows_insert(df2)
Matching, by = "hello"
  hello goodbye
1    hi     bye
2  hola    ciao

Note: all columns in df2 must exist in df1, but not all columns in df1 have to be in df2.

For additional behavior, there are other rows_* functions. For example, you could use rows_upsert, which overwrites the values if the key already exists and inserts them otherwise:

df2 <- data.frame(hello = c("hi", "hola"), goodbye = c("goodbye", "ciao"))

library(dplyr)

df1 %>% 
  rows_upsert(df2)
Matching, by = "hello"
  hello goodbye
1    hi goodbye # bye updated to goodbye since "hi" was already in data frame
2  hola    ciao # inserted because "hola" was not in the data frame

These functions match rows on key columns. If the by argument is not specified, the default is to use the first column of the second data frame (df2 in this example) as the key and match it against the column of the same name in the first data frame (df1 in this example).
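
A brief sketch of naming the key column explicitly, assuming dplyr >= 1.0.0 is installed. Passing by = "hello" avoids relying on column order, and the tryCatch() wrapper (an illustrative addition) shows that rows_insert() refuses a key that is already present.

```r
library(dplyr)

df1 <- data.frame(hello = "hi", goodbye = "bye", stringsAsFactors = FALSE)
df2 <- data.frame(hello = "hola", goodbye = "ciao", stringsAsFactors = FALSE)

# Explicit key: rows are matched on the 'hello' column
inserted <- rows_insert(df1, df2, by = "hello")

# Inserting a key that already exists is an error by default
dup <- data.frame(hello = "hi", goodbye = "goodbye", stringsAsFactors = FALSE)
outcome <- tryCatch(rows_insert(df1, dup, by = "hello"),
                    error = function(e) "error")
```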

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Rilcon42 | View Question on Stackoverflow
Solution 1 - R | Matheus Araujo | View Answer on Stackoverflow
Solution 2 - R | Parfait | View Answer on Stackoverflow
Solution 3 - R | Joe | View Answer on Stackoverflow
Solution 4 - R | Ytsen de Boer | View Answer on Stackoverflow
Solution 5 - R | gsk9 | View Answer on Stackoverflow
Solution 6 - R | J. Win. | View Answer on Stackoverflow
Solution 7 - R | Edwin Pratt | View Answer on Stackoverflow
Solution 8 - R | Stuart Ball | View Answer on Stackoverflow
Solution 9 - R | nealei | View Answer on Stackoverflow
Solution 10 - R | OracleJavaNet | View Answer on Stackoverflow
Solution 11 - R | Patrick Champion | View Answer on Stackoverflow
Solution 12 - R | Carlos R. Mercado | View Answer on Stackoverflow
Solution 13 - R | specops223 | View Answer on Stackoverflow
Solution 14 - R | user17924177 | View Answer on Stackoverflow
Solution 15 - R | LMc | View Answer on Stackoverflow