Opposite of %in%: exclude rows with values specified in a vector
Tags: R, Dataframe, Subset
R Problem Overview
A categorical variable V1 in a data frame D1 can take values represented by the letters A to Z. I want to create a subset D2 that excludes some values, say B, N and T. Basically, I want a command that does the opposite of %in%, i.e. the reverse of:
D2 = subset(D1, V1 %in% c("B", "N", "T"))
R Solutions
Solution 1 - R
You can use the ! operator to turn every TRUE into FALSE and every FALSE into TRUE, so:
D2 = subset(D1, !(V1 %in% c('B','N','T')))
EDIT: You can also define an operator yourself:
`%!in%` <- function(x, y) !(x %in% y)
c(1,3,11) %!in% 1:10
# [1] FALSE FALSE TRUE
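To see the negation at work on an actual data frame, here is a minimal sketch. The toy data frame D1 and its id column are made up for illustration; only V1 and the excluded letters come from the question.

```r
# Toy data, invented for illustration: one row per letter A..Z
D1 <- data.frame(V1 = LETTERS, id = seq_along(LETTERS),
                 stringsAsFactors = FALSE)

excluded <- c("B", "N", "T")

# Keep the rows whose V1 is NOT one of the excluded values
D2 <- subset(D1, !(V1 %in% excluded))

# The same filter with plain bracket indexing
D2_alt <- D1[!(D1$V1 %in% excluded), ]

nrow(D2)                   # 23 rows remain (26 letters minus 3)
any(D2$V1 %in% excluded)   # FALSE
```

Note that %in% never returns NA, so rows where V1 is NA pass this filter and are kept.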
Solution 2 - R
How about:
`%ni%` <- Negate(`%in%`)
c(1,3,11) %ni% 1:10
# [1] FALSE FALSE TRUE
Solution 3 - R
Here is a version using filter in dplyr that applies the same technique as the accepted answer, negating the logical with !:
library(dplyr)  # also provides the %>% pipe
D2 <- D1 %>% dplyr::filter(!V1 %in% c('B','N','T'))
Solution 4 - R
If you look at the code of %in%:
function (x, table) match(x, table, nomatch = 0L) > 0L
then you should be able to write your own version of the opposite. I use:
`%not in%` <- function (x, table) is.na(match(x, table, nomatch=NA_integer_))
Another way is:
function (x, table) match(x, table, nomatch = 0L) == 0L
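As a quick check, both definitions can be compared against the plain ! negation. This is only a sketch: the operator name %notin2% is hypothetical, chosen here so the second definition can be called as an operator, and the test vector is made up.

```r
# First variant from the answer: no match means match() returns NA
`%not in%` <- function(x, table) is.na(match(x, table, nomatch = NA_integer_))

# Second variant, under a hypothetical name chosen for this sketch
`%notin2%` <- function(x, table) match(x, table, nomatch = 0L) == 0L

x <- c(1, 3, 11, NA)

r1 <- x %not in% 1:10   # FALSE FALSE TRUE TRUE
r2 <- x %notin2% 1:10   # FALSE FALSE TRUE TRUE
r3 <- !(x %in% 1:10)    # FALSE FALSE TRUE TRUE
```

All three agree, including for the NA element, which counts as "not in" because 1:10 contains no NA.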
Solution 5 - R
Using negate from purrr also does the trick quickly and neatly:
`%not_in%` <- purrr::negate(`%in%`)
Usage is then, for example:
c("cat", "dog") %not_in% c("dog", "mouse")
# [1] TRUE FALSE
Solution 6 - R
purrr::compose() is another quick way to define this for later use, as in:
`%!in%` <- purrr::compose(`!`, `%in%`)
Solution 7 - R
Another solution could be to use setdiff:
D1 = c("A",..., "Z") ; D0 = c("B","N","T")
D2 = setdiff(D1, D0)
D2 is your desired subset.
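A runnable variant of this idea might look as follows. It assumes the values are the 26 uppercase letters, so R's built-in LETTERS constant stands in for the elided vector. The data frame df is hypothetical and only shows how the result can be used to subset rows.

```r
D1 <- LETTERS             # assumption: the values "A" to "Z"
D0 <- c("B", "N", "T")

D2 <- setdiff(D1, D0)     # values of D1 that are not in D0
length(D2)                # 23

# setdiff() works on vectors of values and drops duplicates,
# so to filter the rows of a data frame, combine it with %in%
df <- data.frame(V1 = c("A", "B", "A", "T", "C"),
                 stringsAsFactors = FALSE)   # hypothetical data
kept <- df[df$V1 %in% setdiff(df$V1, D0), , drop = FALSE]
kept$V1                   # "A" "A" "C"
```

Because setdiff() returns unique values, it suits a vector of distinct categories better than a column with repeats, which is why the row filter above goes through %in%.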
Solution 8 - R
Hmisc has a %nin% function, which should do this:
https://www.rdocumentation.org/packages/Hmisc/versions/4.4-0/topics/%25nin%25
Solution 9 - R
library(roperators)
1 %ni% 2:10
If you frequently need to use custom infix operators, it is easier to just have them in a package rather than declaring the same exact functions over and over in each script or project.
Solution 10 - R
The help for %in%, help("%in%"), includes in its Examples section this definition of "not in", which returns the elements of x that are not in y:
"%w/o%" <- function(x, y) x[!x %in% y] #-- x without y
Let's try it:
c(2,3,4) %w/o% c(2,8,9)
# [1] 3 4
Alternatively, to get a logical vector instead of the elements themselves:
"%w/o%" <- function(x, y) !x %in% y #-- TRUE where x is not in y
c(2,3,4) %w/o% c(2,8,9)
# [1] FALSE TRUE TRUE
Solution 11 - R
require(TSDT)
c(1,3,11) %nin% 1:10
# [1] FALSE FALSE TRUE
For more information, you can refer to: https://cran.r-project.org/web/packages/TSDT/TSDT.pdf
Solution 12 - R
The package collapse has it built in: %!in%.
Solution 13 - R
In Frank Harrell's package of R utility functions, he has a %nin% (not in) which does exactly what the original question asked. No need for wheel reinvention.