Opposite of %in%: exclude rows with values specified in a vector

R, Dataframe, Subset

R Problem Overview


A categorical variable V1 in a data frame D1 can take values represented by the letters A to Z. I want to create a subset D2 that excludes some values, say B, N and T. Basically, I want a command that is the opposite of %in%, which in the following call keeps only those values:

D2 = subset(D1, V1 %in% c("B", "N", "T"))

R Solutions


Solution 1 - R

You can use the ! operator, which turns every TRUE into FALSE and every FALSE into TRUE. So:

D2 = subset(D1, !(V1 %in% c('B','N','T')))

EDIT: You can also make an operator yourself:

`%!in%` <- function(x, y) !(x %in% y)

c(1,3,11) %!in% 1:10
# [1] FALSE FALSE  TRUE
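
To see the negated filter applied to actual rows, here is a minimal sketch. The sample data frame, its value column and the name drop_values are made up for illustration; only V1 and the excluded letters come from the question.

```r
# Hypothetical sample data: one row per letter A..Z (not from the original question)
D1 <- data.frame(V1 = LETTERS, value = seq_along(LETTERS), stringsAsFactors = FALSE)

drop_values <- c("B", "N", "T")

# Keep the rows whose V1 is NOT one of drop_values
D2 <- subset(D1, !(V1 %in% drop_values))

# The same filter written with plain logical indexing instead of subset()
D2_idx <- D1[!(D1$V1 %in% drop_values), ]

nrow(D2)                         # 23 rows remain (26 letters minus 3)
any(D2$V1 %in% drop_values)      # FALSE
identical(D2$V1, D2_idx$V1)      # TRUE
```

Both forms rely on the same logical vector, so either can be used depending on whether you prefer subset() or bracket indexing.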

Solution 2 - R

How about:

`%ni%` <- Negate(`%in%`)
c(1,3,11) %ni% 1:10
# [1] FALSE FALSE  TRUE

Solution 3 - R

Here is a version using filter in dplyr that applies the same technique as the accepted answer by negating the logical with !:

library(dplyr)  # makes the %>% pipe available
D2 <- D1 %>% dplyr::filter(!V1 %in% c('B','N','T'))
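
The call above omits the parentheses around the %in% expression. That is safe because R's %any% operators bind more tightly than !, which the following base R sketch checks. The vectors x and y are made up for illustration.

```r
x <- c("A", "B", "C")   # hypothetical values to test
y <- c("B", "N", "T")   # hypothetical values to exclude

# Without parentheses, R parses this as !(x %in% y), not (!x) %in% y
no_parens   <- !x %in% y
with_parens <- !(x %in% y)

identical(no_parens, with_parens)   # TRUE
no_parens                           # TRUE FALSE TRUE

# The parse tree shows that `!` is the outermost call
outer_call <- as.character(quote(!x %in% y)[[1]])
outer_call                          # "!"
```

Adding the parentheses anyway, as in Solution 1, costs nothing and makes the intent obvious to readers who do not remember the precedence table.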

Solution 4 - R

If you look at the code of %in%:

function (x, table) match(x, table, nomatch = 0L) > 0L

then you should be able to write your own version of the opposite. I use

`%not in%` <- function (x, table) is.na(match(x, table, nomatch=NA_integer_))

Another way is:

function (x, table) match(x, table, nomatch = 0L) == 0L
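
As a quick check, the two match()-based definitions can be compared side by side. This is only a sketch: the function names not_in_na and not_in_zero and the test vectors are invented here, not part of the answer.

```r
# Hypothetical names for the two match()-based definitions above
not_in_na   <- function(x, table) is.na(match(x, table, nomatch = NA_integer_))
not_in_zero <- function(x, table) match(x, table, nomatch = 0L) == 0L

x     <- c("A", "B", NA, "Q")   # made-up input, including a missing value
table <- c("B", "N", "T")

res_na   <- not_in_na(x, table)     # TRUE FALSE TRUE TRUE
res_zero <- not_in_zero(x, table)   # TRUE FALSE TRUE TRUE
res_neg  <- !(x %in% table)         # TRUE FALSE TRUE TRUE

# A plain != comparison, by contrast, propagates the missing value
res_neq <- x != "B"                 # TRUE FALSE NA TRUE
```

All three match()-based variants return a plain TRUE or FALSE for the NA element, which matters when the result is used to index rows.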

Solution 5 - R

Using negate from purrr also does the trick quickly and neatly:

`%not_in%` <- purrr::negate(`%in%`)

Then usage is, for example,

c("cat", "dog") %not_in% c("dog", "mouse")

Solution 6 - R

purrr::compose() is another quick way to define this for later use, as in:

`%!in%` <- purrr::compose(`!`, `%in%`)

Solution 7 - R

Another solution is to use setdiff:

D1 = c("A",..., "Z") ; D0 = c("B","N","T")

D2 = setdiff(D1, D0)

D2 is your desired subset.
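
Note that setdiff() works on vectors and returns distinct values, so it behaves differently from a row filter when the data contain duplicates. The sketch below illustrates this with a small made-up data frame. The contents of D1 here are an assumption for the example, not the data from the question.

```r
# Hypothetical data frame with a repeated value in V1
D1 <- data.frame(V1 = c("A", "B", "A", "N", "C"), stringsAsFactors = FALSE)
D0 <- c("B", "N", "T")

# setdiff() on the column returns the distinct remaining values, not rows
remaining_values <- setdiff(D1$V1, D0)
remaining_values                       # "A" "C"

# Row-wise filtering keeps duplicates and any other columns
D2 <- D1[!(D1$V1 %in% D0), , drop = FALSE]
D2$V1                                  # "A" "A" "C"
```

If the goal is a subset of a data frame, as in the question, the negated %in% filter is the closer fit. setdiff() is handy when only the set of remaining values is needed.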

Solution 8 - R

Solution 9 - R

library(roperators)

1 %ni% 2:10

If you frequently need to use custom infix operators, it is easier to just have them in a package rather than declaring the same exact functions over and over in each script or project.

Solution 10 - R

The help page for %in%, help("%in%"), includes this definition of "not in" in its Examples section:

"%w/o%" <- function(x, y) x[!x %in% y] #-- x without y

Let's try it:

c(2,3,4) %w/o% c(2,8,9)
# [1] 3 4

Alternatively, to get a logical vector instead of the remaining values:

"%w/o%" <- function(x, y) !x %in% y #-- TRUE where x is not in y
c(2,3,4) %w/o% c(2,8,9)
# [1] FALSE  TRUE  TRUE

Solution 11 - R

require(TSDT)

c(1,3,11) %nin% 1:10
# [1] FALSE FALSE  TRUE

For more information, you can refer to: https://cran.r-project.org/web/packages/TSDT/TSDT.pdf

Solution 12 - R

The package collapse has it built in: %!in%.

Solution 13 - R

Hmisc, Frank Harrell's package of R utility functions, has a %nin% (not in) operator that does exactly what the original question asked. No need to reinvent the wheel.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | user702432 | View Question on Stackoverflow
Solution 1 - R | Sacha Epskamp | View Answer on Stackoverflow
Solution 2 - R | Spencer Castro | View Answer on Stackoverflow
Solution 3 - R | user29609 | View Answer on Stackoverflow
Solution 4 - R | Marek | View Answer on Stackoverflow
Solution 5 - R | EllaK | View Answer on Stackoverflow
Solution 6 - R | edavidaja | View Answer on Stackoverflow
Solution 7 - R | user3373954 | View Answer on Stackoverflow
Solution 8 - R | Matt | View Answer on Stackoverflow
Solution 9 - R | Benbob | View Answer on Stackoverflow
Solution 10 - R | Tony Ladson | View Answer on Stackoverflow
Solution 11 - R | Vishal Sharma | View Answer on Stackoverflow
Solution 12 - R | Marcio Rodrigues | View Answer on Stackoverflow
Solution 13 - R | Jim Hunter | View Answer on Stackoverflow