Convert data.frame column to a vector?

R, Dataframe, Vector, Type Conversion

R Problem Overview


I have a dataframe such as:

a1 = c(1, 2, 3, 4, 5)
a2 = c(6, 7, 8, 9, 10)
a3 = c(11, 12, 13, 14, 15)
aframe = data.frame(a1, a2, a3)

I tried the following to convert one of the columns to a vector, but it doesn't work:

avector <- as.vector(aframe['a2'])
class(avector) 
[1] "data.frame"

This is the only solution I could come up with, but I'm assuming there has to be a better way to do this:

class(aframe['a2']) 
[1] "data.frame"
avector = c()
for(atmp in aframe['a2']) { avector <- atmp }
class(avector)
[1] "numeric"

Note: My vocabulary above may be off, so please correct me if so. I'm still learning the world of R. Additionally, any explanation of what's going on here is appreciated (i.e. relating to Python or some other language would help!)

R Solutions


Solution 1 - R

I'm going to attempt to explain this without making any mistakes, but I'm betting this will attract a clarification or two in the comments.

A data frame is a list. When you subset a data frame using the name of a column and [, what you're getting is a sublist (or a sub data frame). If you want the actual atomic column, you could use [[, or somewhat confusingly (to me) you could do aframe[,2] which returns a vector, not a sublist.

So try running this sequence and maybe things will be clearer:

avector <- as.vector(aframe['a2'])
class(avector) 

avector <- aframe[['a2']]
class(avector)

avector <- aframe[,2]
class(avector)
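
To make the list analogy concrete, here is a minimal sketch that rebuilds a frame like the one in the question (with integer columns for brevity) and checks what each form of indexing hands back. Only base R functions are used; the variable names sub_frame, col_a and col_b are made up for illustration.

```r
aframe <- data.frame(a1 = 1:5, a2 = 6:10, a3 = 11:15)

# A data frame is a list of equal-length columns
is.list(aframe)                      # TRUE

# Single bracket keeps the list (data frame) wrapper
sub_frame <- aframe["a2"]
is.data.frame(sub_frame)             # TRUE

# Double bracket and [, j] both return the bare column
col_a <- aframe[["a2"]]
col_b <- aframe[, 2]
is.atomic(col_a)                     # TRUE
identical(col_a, col_b)              # TRUE
```

For readers coming from Python, a loose comparison is slicing a list (x[1:2] is still a list) versus indexing it (x[1] is the element itself): [ behaves like the slice and [[ like the index.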

Solution 2 - R

There's now an easy way to do this using dplyr.

dplyr::pull(aframe, a2)

Solution 3 - R

You could use $ extraction:

class(aframe$a1)
[1] "numeric"

or the double square bracket:

class(aframe[["a1"]])
[1] "numeric"

Solution 4 - R

You do not need as.vector(), but you do need correct indexing: avector <- aframe[ , "a2"]

The one other thing to be aware of is the drop=FALSE option to [:

R> aframe <- data.frame(a1=1:5, a2=6:10, a3=11:15)
R> aframe
  a1 a2 a3
1  1  6 11
2  2  7 12
3  3  8 13
4  4  9 14
5  5 10 15
R> avector <- aframe[, "a2"]
R> avector
[1]  6  7  8  9 10
R> avector <- aframe[, "a2", drop=FALSE]
R> avector
  a2
1  6
2  7
3  8
4  9
5 10
R> 

Solution 5 - R

You can try something like this:

as.vector(unlist(aframe$a2))

Solution 6 - R

Another advantage of the '[[' operator is that it works with both data.frame and data.table. So if a function has to work for both data.frame and data.table, and you want to extract a column as a vector, then

data[["column_name"]] 

is best.
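
As a rough sketch of that idea, the helper below wraps [[ in a function. The name get_column and its arguments are hypothetical, and the example only exercises a data.frame; the claim that the same call works on a data.table comes from the answer above and is not demonstrated here, since data.table is not loaded.

```r
# Hypothetical helper: extracts one column as a vector via [[
get_column <- function(data, column_name) {
  # Fail early if the column is missing instead of silently returning NULL
  stopifnot(column_name %in% names(data))
  data[[column_name]]
}

aframe <- data.frame(a1 = 1:5, a2 = 6:10, a3 = 11:15)
a2_vector <- get_column(aframe, "a2")
a2_vector
```

The point of relying on [[ rather than [ , "a2"] is that data.table treats the arguments of [ differently from base R, whereas [[ is plain list extraction.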

Solution 7 - R

as.vector(unlist(aframe['a2']))
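
A short sketch of why the as.vector() wrapper matters here: unlist() on a one-column data frame builds element names from the column name, and as.vector() strips them again. The variable names named, plain and direct are illustrative only.

```r
aframe <- data.frame(a1 = 1:5, a2 = 6:10, a3 = 11:15)

# unlist() on a one-column data frame derives names from the column name
named <- unlist(aframe["a2"])
names(named)                         # "a21" "a22" "a23" "a24" "a25"

# as.vector() drops those names
plain <- as.vector(named)
is.null(names(plain))                # TRUE

# use.names = FALSE avoids creating the names in the first place
direct <- unlist(aframe["a2"], use.names = FALSE)
identical(plain, direct)             # TRUE
```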

Solution 8 - R

a1 = c(1, 2, 3, 4, 5)
a2 = c(6, 7, 8, 9, 10)
a3 = c(11, 12, 13, 14, 15)
aframe = data.frame(a1, a2, a3)
avector <- as.vector(aframe['a2'])

avector<-unlist(avector)
# this returns a named numeric vector (type "double", not "integer"), with names a21, a22, ...

Solution 9 - R

If you just use the extract operator with matrix-style indexing, it will work. When you select a single column this way, [ defaults to drop=TRUE, which is what you want here. See ?'[' for more details.

>  a1 = c(1, 2, 3, 4, 5)
>  a2 = c(6, 7, 8, 9, 10)
>  a3 = c(11, 12, 13, 14, 15)
>  aframe = data.frame(a1, a2, a3)
> aframe[,'a2']
[1]  6  7  8  9 10
> class(aframe[,'a2'])
[1] "numeric"

Solution 10 - R

I use lists to filter dataframes by whether or not they have a value %in% a list.

I had been manually creating lists by exporting a 1 column dataframe to Excel where I would add " ", around each element, before pasting into R: list <- c("el1", "el2", ...) which was usually followed by FilteredData <- subset(Data, Column %in% list).

After searching stackoverflow and not finding an intuitive way to convert a 1 column dataframe into a list, I am now posting my first ever stackoverflow contribution:

# assuming you have a 1 column dataframe called "df"
list <- c()
for(i in 1:nrow(df)){
  list <- append(list, df[i,1])
}
View(list)
# The result is not a dataframe, it is a vector of values
# You can filter a dataframe using "subset([Data], [Column] %in% list)"
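
The loop above can probably be replaced by a single [[ extraction, as in the other solutions. The sketch below assumes made-up data: df, Data, Column and Value stand in for the objects named in this answer, and wanted is used instead of list to avoid masking the base function of that name.

```r
# Hypothetical data standing in for the one-column "df" and the "Data" frame
df <- data.frame(id = c("el1", "el3"), stringsAsFactors = FALSE)
Data <- data.frame(
  Column = c("el1", "el2", "el3", "el4"),
  Value = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

# [[1]] extracts the first column as a plain character vector, no loop needed
wanted <- df[[1]]

# Keep only the rows whose Column value appears in the vector
FilteredData <- subset(Data, Column %in% wanted)
FilteredData$Value
```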

Solution 11 - R

We can also convert data.frame columns generically to a simple vector. as.vector is not enough as it retains the data.frame class and structure, so we also have to pull out the first (and only) element:

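# aframe[2] keeps the data.frame wrapper (aframe[,2] would already return a vector)
df_column_object <- aframe[2]
# [[1]] pulls out the first (and only) column as a simple vector
simple_column <- df_column_object[[1]]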

All the solutions suggested so far require hardcoding column titles. This makes them non-generic (imagine applying this to function arguments).

Alternatively, you could of course read the column names from the data frame first and then insert them into the code from the other solutions.
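
One way to avoid hardcoding, sketched here under the assumption that the column is identified at run time: keep the name (or position) in a variable and pass it to [[. The variable names target, by_name and by_position are illustrative.

```r
aframe <- data.frame(a1 = 1:5, a2 = 6:10, a3 = 11:15)

# The column name lives in a variable instead of being typed into the code
target <- names(aframe)[2]
by_name <- aframe[[target]]

# A numeric position works the same way
by_position <- aframe[[2]]

identical(by_name, by_position)      # TRUE

# Note: aframe$target would look for a column literally called "target"
is.null(aframe$target)               # TRUE
```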

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Dolan Antenucci | View Question on Stackoverflow
Solution 1 - R | joran | View Answer on Stackoverflow
Solution 2 - R | Andrew Brēza | View Answer on Stackoverflow
Solution 3 - R | James | View Answer on Stackoverflow
Solution 4 - R | Dirk Eddelbuettel | View Answer on Stackoverflow
Solution 5 - R | Vaibhav Sharma | View Answer on Stackoverflow
Solution 6 - R | joel.wilson | View Answer on Stackoverflow
Solution 7 - R | Dr_Hope | View Answer on Stackoverflow
Solution 8 - R | shubham ranjan | View Answer on Stackoverflow
Solution 9 - R | Ari B. Friedman | View Answer on Stackoverflow
Solution 10 - R | Adrian DSouza | View Answer on Stackoverflow
Solution 11 - R | 0range | View Answer on Stackoverflow