How do I sort one vector based on values of another
Sorting Problem Overview
I have a vector x that I would like to sort based on the order of values in vector y. The two vectors are not of the same length.
x <- c(2, 2, 3, 4, 1, 4, 4, 3, 3)
y <- c(4, 2, 1, 3)
The expected result would be:
[1] 4 4 4 2 2 1 3 3 3
Sorting Solutions
Solution 1  Sorting
What about this one:
x[order(match(x,y))]
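To see why this works, the one-liner can be unpacked into its intermediate steps. This is a minimal sketch using the x and y from the question; the names pos and idx are introduced here only for illustration.

```r
x <- c(2, 2, 3, 4, 1, 4, 4, 3, 3)
y <- c(4, 2, 1, 3)

# match() gives, for each element of x, its position in y
pos <- match(x, y)
pos
# [1] 2 2 4 1 3 1 1 4 4

# order() returns the indices of x that put those positions in ascending order
idx <- order(pos)
idx
# [1] 4 6 7 1 2 5 3 8 9

# Indexing x with those indices reorders it to follow y
x[idx]
# [1] 4 4 4 2 2 1 3 3 3
```

Values of x that do not occur in y get NA from match(), and order() places NA last by default, so such values would end up at the end of the result.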
Solution 2  Sorting
You could convert x into an ordered factor:
x.factor <- factor(x, levels = y, ordered = TRUE)
sort(x)
sort(x.factor)
Obviously, changing your numbers into factors can radically change the way code downstream reacts to x. But since you didn't give us any context about what happens next, I thought I would suggest this as an option.
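If downstream code needs plain numbers again, one option is to convert the sorted factor back. This is a minimal sketch, assuming x holds numeric values as in the question; sorted_x is a name introduced here for illustration.

```r
x <- c(2, 2, 3, 4, 1, 4, 4, 3, 3)
y <- c(4, 2, 1, 3)

x.factor <- factor(x, levels = y, ordered = TRUE)

# Go through as.character() first: as.numeric() on a factor alone
# returns the internal level codes, not the original values
sorted_x <- as.numeric(as.character(sort(x.factor)))
sorted_x
# [1] 4 4 4 2 2 1 3 3 3
```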
Solution 3  Sorting
How about:
rep(y,table(x)[as.character(y)])
(Ian's is probably still better)
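The rep()/table() one-liner can be read in three steps, sketched here with the x and y from the question. The names counts and times are introduced for illustration, and the sketch assumes every value in y occurs at least once in x.

```r
x <- c(2, 2, 3, 4, 1, 4, 4, 3, 3)
y <- c(4, 2, 1, 3)

# Count how often each value occurs in x; the table is named by value
counts <- table(x)

# Look the counts up in the order given by y (table names are character strings)
times <- as.vector(counts[as.character(y)])
times
# [1] 3 2 1 3

# Repeat each element of y by its count in x
rep(y, times)
# [1] 4 4 4 2 2 1 3 3 3
```

Note that this rebuilds the result from y rather than reordering x itself, so values of x that are missing from y do not appear in the output.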
Solution 4  Sorting
In case you need to order by "y" regardless of whether it holds numbers or characters:
x[order(ordered(x, levels = y))]
4 4 4 2 2 1 3 3 3
By steps:
a <- ordered(x, levels = y) # Create an ordered factor from "x", with levels in the order of "y".
[1] 2 2 3 4 1 4 4 3 3
Levels: 4 < 2 < 1 < 3
b <- order(a) # Get the indices that put "x" in the order given by "y".
[1] 4 6 7 1 2 5 3 8 9
x[b] # Reorder "x" according to order in "y".
[1] 4 4 4 2 2 1 3 3 3
Solution 5  Sorting
[Edit: Clearly Ian has the right approach, but I will leave this in for posterity.]
You can do this without loops by indexing on your y vector. Add an incrementing numeric value to y and merge them:
y <- data.frame(index = seq_along(y), x = y)
x <- data.frame(x = x)
x <- merge(x, y)
x <- x[order(x$index), "x"]
x
[1] 4 4 4 2 2 1 3 3 3
Solution 6  Sorting
x <- c(2, 2, 3, 4, 1, 4, 4, 3, 3)
y <- c(4, 2, 1, 3)
z <- c() # start with an empty result so c(z, ...) works on the first pass
for(i in y) { z <- c(z, rep(i, sum(x==i))) }
The result in z: 4 4 4 2 2 1 3 3 3
The important steps:

for(i in y)  Loops over the elements of interest.

z <- c(z, ...)  Concatenates each subexpression in turn.

rep(i, sum(x==i))  Repeats i (the current element of interest) sum(x==i) times (the number of times we found i in x).
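The loop can also be wrapped in a small function so that z always starts out empty. This is only a sketch; sort_by_reference is a hypothetical helper name, not something from base R.

```r
# Hypothetical helper: builds the result by looping over the reference vector y
sort_by_reference <- function(x, y) {
  z <- c()  # start empty so c(z, ...) works on the first pass
  for (i in y) {
    # append i as many times as it occurs in x
    z <- c(z, rep(i, sum(x == i)))
  }
  z
}

sort_by_reference(c(2, 2, 3, 4, 1, 4, 4, 3, 3), c(4, 2, 1, 3))
# [1] 4 4 4 2 2 1 3 3 3
```

As with the loop itself, values of x that are not listed in y are dropped from the result.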
Solution 7  Sorting
You can also use sqldf and do it with a join in SQL, like the following:
library(sqldf)
x <- data.frame(x = c(2, 2, 3, 4, 1, 4, 4, 3, 3))
y <- data.frame(y = c(4, 2, 1, 3))
# No ORDER BY is given, so the row order depends on how the join is executed
result <- sqldf("SELECT x.x FROM y JOIN x on y.y = x.x")
ordered_x <- result[[1]]