Omit rows containing NA in specific columns

R Problem Overview

I want to know how to omit NA values in a data frame, but only in some columns I am interested in.

For example,

DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA), z=c(NA, 33, 22))

but I only want to omit the rows where y is NA, so the result should be

  x  y  z
1 1  0 NA
2 2 10 33

na.omit seems to delete every row that contains any NA.

Can somebody help me with this simple question?

But suppose I now change the question to:

DF <- data.frame(x = c(1, 2, 3, NA), y = c(1, 0, 10, NA), z = c(43, NA, 33, NA))

If I want to omit only the rows where x is NA or z is NA, where do I put the | in the function call?

R Solutions

Solution 1 - R

Use is.na

DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA), z=c(NA, 33, 22))
DF[!is.na(DF$y),]
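
To make the indexing step explicit, here is a minimal sketch that splits the one-liner into its parts. The names na_in_y, keep and result are illustrative only and are not part of the original answer.

```r
DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA), z = c(NA, 33, 22))

na_in_y <- is.na(DF$y)   # FALSE FALSE TRUE: marks the rows where y is NA
keep <- !na_in_y         # TRUE TRUE FALSE: the rows to keep
result <- DF[keep, ]     # rows 1 and 2; the NA in z on row 1 is left alone
```

Because only y is tested, NA values in other columns do not affect which rows are kept.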

Solution 2 - R

Hadley's tidyr has a handy function for this, drop_na:

library(tidyr)
DF %>% drop_na(y)
#   x  y  z
# 1 1  0 NA
# 2 2 10 33

Solution 3 - R

You could use the complete.cases function and put it into a function thusly:

DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA), z=c(NA, 33, 22))

completeFun <- function(data, desiredCols) {
  completeVec <- complete.cases(data[, desiredCols])
  return(data[completeVec, ])
}

completeFun(DF, "y")
#   x  y  z
# 1 1  0 NA
# 2 2 10 33

completeFun(DF, c("y", "z"))
#   x  y  z
# 2 2 10 33

EDIT: Only return rows with no NAs

If you want to eliminate all rows with at least one NA in any column, just use the complete.cases function straight up:

DF[complete.cases(DF), ]
#   x  y  z
# 2 2 10 33

Or if completeFun is already ingrained in your workflow ;)

completeFun(DF, names(DF))
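
For the follow-up question (drop rows where x or z is NA), the same complete.cases idea can be applied directly. This is a sketch only: DF2 and keep are names chosen here for illustration, and drop = FALSE is an optional safeguard that keeps a one-column selection as a data frame.

```r
DF2 <- data.frame(x = c(1, 2, 3, NA), y = c(1, 0, 10, NA), z = c(43, NA, 33, NA))

# TRUE for rows where both x and z are non-NA; y is not checked
keep <- complete.cases(DF2[, c("x", "z"), drop = FALSE])
DF2[keep, ]
#   x  y  z
# 1 1  1 43
# 3 3 10 33
```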

Solution 4 - R

Use 'subset'

DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA), z=c(NA, 33, 22))
subset(DF, !is.na(y))

Solution 5 - R

data.table provides an na.omit method with a cols argument, so this works when the data is a data.table:

library(data.table)
na.omit(as.data.table(DF), cols = c("x", "z"))

Solution 6 - R

Omit a row if either of two specific columns contains NA.

DF[!is.na(DF$x) & !is.na(DF$z), ]
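
As for where the | goes: a sketch, using the data frame from the follow-up question, is to build the "x is NA or z is NA" condition first and negate the whole thing. The names drop_rows, with_or and with_and are illustrative.

```r
DF <- data.frame(x = c(1, 2, 3, NA), y = c(1, 0, 10, NA), z = c(43, NA, 33, NA))

drop_rows <- is.na(DF$x) | is.na(DF$z)          # TRUE where x or z is NA
with_or  <- DF[!drop_rows, ]                    # negate the combined condition
with_and <- DF[!is.na(DF$x) & !is.na(DF$z), ]   # the & form, for comparison

identical(with_or, with_and)
# [1] TRUE
```

The two forms are equivalent: "not (A or B)" is the same as "(not A) and (not B)".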

Solution 7 - R

Try this:

cc <- is.na(DF$y)   # TRUE where y is NA
DF <- DF[!cc, ]     # keep the other rows; also correct when y has no NA
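
One thing worth knowing when removing rows by position with which(): if no row matches, which() returns integer(0), and indexing with -integer(0) selects zero rows instead of all of them. The sketch below uses a hypothetical data frame DF_ok with no NA in y to show the difference from a logical mask.

```r
DF_ok <- data.frame(x = c(1, 2), y = c(0, 10))   # hypothetical data, no NA in y

m <- which(is.na(DF_ok$y))             # integer(0): there is nothing to remove
by_position <- DF_ok[-m, ]             # zero rows, not the full data frame
by_mask <- DF_ok[!is.na(DF_ok$y), ]    # both rows are kept

nrow(by_position)
# [1] 0
nrow(by_mask)
# [1] 2
```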

Solution 8 - R

As an update, here is a tidyverse approach with dplyr (your_data_frame and region_column are placeholders for your own data frame and column):

library(dplyr)

your_data_frame %>% 
  filter(!is.na(region_column))

Solution 9 - R

Just try this:

DF %>% t %>% na.omit %>% t

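This transposes the data frame, omits the rows that contain NA (which were columns before the transposition), and then transposes the result back. Note that it therefore drops columns that contain NA rather than rows, returns a matrix rather than a data frame, and needs a package that provides %>% (such as magrittr or dplyr) to be loaded.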
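
To check what this approach returns on the example data, here is a small sketch of the same steps in base R, written without the pipe. The names res and res_df are illustrative.

```r
DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA), z = c(NA, 33, 22))

res <- t(na.omit(t(DF)))       # transpose, drop rows with NA, transpose back
is.matrix(res)                 # TRUE: the result is a matrix
colnames(res)                  # "x": y and z each held an NA and were dropped
res_df <- as.data.frame(res)   # convert back if a data frame is needed
```

On this data all three rows survive and only column x remains, so this is a way to drop columns with NA, not rows.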

Content Type	Original Author	Original Content on Stackoverflow
Question	user1489975	View Question on Stackoverflow
Solution 1 - R	mnel	View Answer on Stackoverflow
Solution 2 - R	amrrs	View Answer on Stackoverflow
Solution 3 - R	BenBarnes	View Answer on Stackoverflow
Solution 4 - R	Rnoob	View Answer on Stackoverflow
Solution 5 - R	Droney	View Answer on Stackoverflow
Solution 6 - R	M.Viking	View Answer on Stackoverflow
Solution 7 - R	rockswap	View Answer on Stackoverflow
Solution 8 - R	Vinícius Félix	View Answer on Stackoverflow
Solution 9 - R	Luchao Qi	View Answer on Stackoverflow