assign headers based on existing row in dataframe in R

R, Dataframe, Names

R Problem Overview


After transforming a dataframe, I would like to assign headers/names to the columns based on an existing row. My headers are currently:

row.names	X2	X3	X4	X5	X6	X7	X8	X9	...

I would like to get rid of that and use the following row as column headers (without having to type them out since I have many).

The only solution I have for this is to export and re-load the data (with header=T).

R Solutions


Solution 1 - R

The key here is to unlist the row first.

colnames(DF) <- as.character(unlist(DF[1,]))
DF = DF[-1, ]
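
To illustrate why unlisting helps, here is a minimal sketch on a made-up data frame (the data and the `header` variable are assumptions for the example, not from the question). The columns are deliberately created as factors, since that is the case where a row is easiest to mishandle: `unlist()` on a row of factors returns a single factor, and `as.character()` then yields the labels rather than the integer codes.

```r
# Hypothetical example data: the intended header sits in the first row,
# and the columns are factors (as older read.csv defaults would produce).
DF <- data.frame(
  X1 = c("name", "alice", "bob"),
  X2 = c("city", "Oslo", "Lima"),
  stringsAsFactors = TRUE
)

# Flatten the one-row data frame into a plain character vector.
header <- as.character(unlist(DF[1, ]))

colnames(DF) <- header   # use the first row as column names
DF <- DF[-1, ]           # drop the row that held the names

print(colnames(DF))
```

The old header values remain as unused factor levels after the row is dropped; `droplevels(DF)` removes them if that matters.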

Solution 2 - R

Try this:

colnames(DF) = DF[1, ] # the first row will be the header
DF = DF[-1, ]          # removing the first row.

However, check that the data was read properly. If your data.frame has numeric variables but the first row held character values, every column will have been read as character. To avoid this problem, it is better to save the data and read it again with header=TRUE, as you suggest. You can also look at this question: https://stackoverflow.com/questions/17288197/reading-a-csv-file-organized-horizontally/17289991#17289991.
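
If writing the data out and reading it back is inconvenient, one alternative (not part of the answer above) is to let R re-infer the column types in place with `type.convert()`. The sketch below assumes made-up data in which every column arrived as character because the header was in the first row.

```r
# Hypothetical example data: header in row 1, so all columns are character.
DF <- data.frame(
  X1 = c("id", "1", "2"),
  X2 = c("score", "3.5", "4.25"),
  X3 = c("label", "a", "b"),
  stringsAsFactors = FALSE
)

# Promote the first row to column names, then drop it.
colnames(DF) <- as.character(unlist(DF[1, ]))
DF <- DF[-1, ]

# Re-infer each column's type from its character values;
# as.is = TRUE keeps non-numeric columns as character instead of factor.
DF[] <- lapply(DF, type.convert, as.is = TRUE)
rownames(DF) <- NULL     # renumber rows from 1

str(DF)
```

This only recovers what `type.convert()` can guess from the text, so dates or custom formats would still need explicit conversion.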

Solution 3 - R

The cleanest way is to use a function from the janitor package that is built for exactly this purpose.

janitor::row_to_names(DF,1)

If you want to use a row other than the first one, pass its row number as the second argument.
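
For readers who cannot install janitor, the same idea can be sketched in base R. The helper below, `promote_row`, is hypothetical (it is not janitor's implementation or API): it uses row `k` as the header and drops that row along with any rows above it. The example data is made up.

```r
# Hypothetical helper (not part of janitor): use row `k` as the header,
# then drop that row and everything above it.
promote_row <- function(df, k = 1) {
  stopifnot(is.data.frame(df), k >= 1, k <= nrow(df))
  # Convert each cell of row k to character (labels, not codes, for factors).
  new_names <- unname(vapply(df[k, , drop = FALSE], as.character, character(1)))
  names(df) <- new_names
  df <- df[-seq_len(k), , drop = FALSE]
  rownames(df) <- NULL   # renumber rows from 1
  df
}

# Made-up data: a junk line first, the real header in row 2.
raw <- data.frame(
  X1 = c("exported 2020", "id", "1", "2"),
  X2 = c("", "value", "10", "20"),
  stringsAsFactors = FALSE
)

clean <- promote_row(raw, 2)
print(clean)
```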

Solution 4 - R

Very similar to Vishnu's answer (Solution 1), but this uses lapply to convert all the data to character and then assigns the first row as the headers. This is really helpful if your data was imported as factors.

DF[] <- lapply(DF, as.character)
colnames(DF) <- DF[1, ]
DF <- DF[-1 ,]

Note that if you have a lot of numeric data or factors that you want to keep, you'll need to convert them back. In that case it may make sense to store a character copy of the data frame, extract the row you want from it, and then apply it to the original data frame:

tempDF <- DF
tempDF[] <- lapply(DF, as.character)
colnames(DF) <- tempDF[1, ]
DF <- DF[-1 ,]
tempDF <- NULL

Solution 5 - R

A new answer that uses dplyr and tidyr:

Extracts the desired column names and converts them to a vector

library(tidyverse)

col_names <- raw_dta %>% 
  slice(2) %>%
  pivot_longer(
    cols = "X2":"X10", # until last named column
    names_to = "old_names",
    values_to = "new_names") %>% 
  pull(new_names)

Removes the incorrect rows and adds the correct column names

dta <- raw_dta %>% 
  slice(-1, -2) %>% # Removes the rows containing new and original names
  set_names(., nm = col_names)

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type	Original Author	Original Content on Stackoverflow
Question	user3166363	View Question on Stackoverflow
Solution 1 - R	Vishnu Jayanand	View Answer on Stackoverflow
Solution 2 - R	Ricardo Oliveros-Ramos	View Answer on Stackoverflow
Solution 3 - R	Lazarus Thurston	View Answer on Stackoverflow
Solution 4 - R	blakiseskream	View Answer on Stackoverflow
Solution 5 - R	greg_s	View Answer on Stackoverflow