How to tell what is in one vector and not another?

R, Vector

R Problem Overview


In MATLAB there is a way to find the values that are in one vector but not in another.

For example:

x <- c(1,2,3,4)
y <- c(2,3,4)

Is there a function that would tell me that the value in x that is not in y is 1?

R Solutions


Solution 1 - R

You can use the setdiff() (set difference) function:

> setdiff(x, y)
[1] 1

Solution 2 - R

Yes. For vectors you can simply use the %in% operator or the is.element() function.

> x[!(x %in% y)]
[1] 1

For a matrix, there are many different approaches. merge() is probably the most straightforward. I suggest looking at this question for that scenario: https://stackoverflow.com/questions/1299871/how-to-join-data-frames-in-r-inner-outer-left-right
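As a rough sketch of the merge() idea for tabular data, assuming two data frames that share the same columns, one option is to flag every row of the second data frame and then keep the rows of the first that pick up no flag after a left join. The data frames a and b and the helper column in_b are hypothetical names used only for illustration; they are not from the original answer.

```r
# Hypothetical example data (not from the original answer)
a <- data.frame(id = c(1, 2, 3, 4), grp = c("p", "q", "r", "s"))
b <- data.frame(id = c(2, 3, 4), grp = c("q", "r", "s"))

# Flag every row of b, then left-join a against b.
# With no 'by' argument, merge() joins on all columns the two share (id, grp).
b$in_b <- TRUE
joined <- merge(a, b, all.x = TRUE)

# Rows of a without a match in b have NA in the flag column
only_in_a <- joined[is.na(joined$in_b), names(a), drop = FALSE]
print(only_in_a)
```

This treats a whole row as the unit of comparison, so a row counts as "in b" only if every shared column matches.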

Solution 3 - R

The help file in R for setdiff, union, intersect, setequal, and is.element provides information on the standard set functions in R.

setdiff(x, y) returns the elements of x that are not in y.

As noted above, it is an asymmetric difference. So for example:

> x <- c(1,2,3,4)
> y <- c(2,3,4,5)
> 
> setdiff(x, y)
[1] 1
> setdiff(y, x)
[1] 5
> union(setdiff(x, y), setdiff(y, x))
[1] 1 5
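For completeness, here is a minimal sketch of the other set functions named in that help file, applied to the same two vectors. The variable names u, i, same and member are just illustrative.

```r
x <- c(1, 2, 3, 4)
y <- c(2, 3, 4, 5)

u <- union(x, y)            # values in x or y, duplicates removed
i <- intersect(x, y)        # values present in both x and y
same <- setequal(x, y)      # TRUE only if x and y contain the same set of values
member <- is.element(x, y)  # for each element of x: does it occur in y?

print(u)
print(i)
print(same)
print(member)
```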

Solution 4 - R

x[is.na(match(x,y))]
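The expression above works because match(x, y) returns, for each element of x, the position of its first match in y, or NA when there is none. A small sketch of the intermediate step, using the vectors from the question (pos is an illustrative name):

```r
x <- c(1, 2, 3, 4)
y <- c(2, 3, 4)

# Position of each element of x within y; NA means "not found in y"
pos <- match(x, y)
print(pos)

# Keep only the elements of x that had no match
not_in_y <- x[is.na(pos)]
print(not_in_y)
```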

Solution 5 - R

setdiff() is a tricky function because its output depends on the order of its arguments. You can instead write a simple function that does the exact opposite of intersect(): it returns the values found in only one of the two vectors (the symmetric difference), so the same values come back whichever vector is passed first.

> difference <- function(x, y) {
+   c(setdiff(x, y), setdiff(y, x))
+ }

> # Now let's test it.
> x <- c(1,2,3,4)
> y <- c(2,3,4,5)

> difference(x, y)
[1] 1 5
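One way to sanity-check a helper like this is to compare it against an equivalent formulation: everything in the union minus everything in the intersection. The sketch below assumes the same difference() definition; sym_diff is a hypothetical name introduced only for the comparison.

```r
difference <- function(x, y) {
  c(setdiff(x, y), setdiff(y, x))
}

# Hypothetical alternative: union minus intersection
sym_diff <- function(x, y) {
  setdiff(union(x, y), intersect(x, y))
}

x <- c(1, 2, 3, 4)
y <- c(2, 3, 4, 5)

d_xy <- difference(x, y)
d_yx <- difference(y, x)
s_xy <- sym_diff(x, y)

print(d_xy)
print(d_yx)
print(s_xy)
```

Swapping the arguments changes the order of the returned elements but not which values are returned, so setequal(d_xy, d_yx) holds.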

Solution 6 - R

If:

x <- c(1,2,3,4)
y <- c(2,3,4)

Any of these expressions:

setdiff(x, y)
x[!(x %in% y)]
x[is.na(match(x,y))]
x[!(is.element(x,y))]

will give you the right answer, [1] 1, if the goal is to find the values/characters in x that are not present in y.

However, applying the above expressions can be tricky and can give undesirable results, depending on the nature of the vectors and the position of x and y in the expression. For instance, if:

x <- c(1,1,2,2,3,4)
y <- c(2,3,4)

and the goal is just to find the unique values/characters in x that are not present in y, or vice versa, applying either of the following expressions will still give the right answer, [1] 1:

union(setdiff(x, y), setdiff(y, x))

Thanks to Jeromy Anglim for this contribution.

OR:

difference <- function(x, y) {
c(setdiff(x, y), setdiff(y, x))
}
difference(y,x)

Thanks to Workhorse for this contribution.
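To make the pitfall concrete, here is a small sketch of how the expressions behave on the vector with repeated values. The indexing forms keep every non-matching element, while setdiff() applies set semantics and drops duplicates. The variable names are illustrative only.

```r
x <- c(1, 1, 2, 2, 3, 4)
y <- c(2, 3, 4)

by_index <- x[!(x %in% y)]          # keeps duplicates
by_match <- x[is.na(match(x, y))]   # same result as by_index
by_set   <- setdiff(x, y)           # duplicates removed
reversed <- setdiff(y, x)           # argument order matters: nothing in y is missing from x
deduped  <- unique(by_index)        # indexing form made to agree with setdiff()

print(by_index)
print(by_set)
print(reversed)
```

So if duplicates should be dropped, either use setdiff() or wrap the indexing form in unique().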

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Tony Stark | View Question on Stackoverflow
Solution 1 - R | Xela | View Answer on Stackoverflow
Solution 2 - R | Shane | View Answer on Stackoverflow
Solution 3 - R | Jeromy Anglim | View Answer on Stackoverflow
Solution 4 - R | George Dontas | View Answer on Stackoverflow
Solution 5 - R | Workhorse | View Answer on Stackoverflow
Solution 6 - R | William | View Answer on Stackoverflow