Determine the number of NA values in a column

R Dataframe

R Problem Overview


I want to count the number of NA values in a data frame column. Say my data frame is called df, and the column I am interested in is called col. This is what I have come up with:

sapply(df$col, function(x) sum(length(which(is.na(x)))))  

Is this a good/most efficient way to do this?

R Solutions


Solution 1 - R

You're over-thinking the problem:

sum(is.na(df$col))
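To see why this works: is.na() returns a logical vector, and sum() treats each TRUE as 1. A minimal sketch, assuming a small made-up data frame (the values below are illustrative and not from the question):

```r
# Hypothetical example data; df and col stand in for your own names
df <- data.frame(col = c(1, NA, 3, NA, 5))

na_flags <- is.na(df$col)  # logical vector, TRUE where the value is NA
n_na <- sum(na_flags)      # TRUE counts as 1, FALSE as 0
print(n_na)
```

Because is.na() is vectorised, there is no need to wrap it in sapply() for a single column.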

Solution 2 - R

If you are looking for NA counts for each column in a dataframe, then (with x being the dataframe):

na_count <- sapply(x, function(y) sum(length(which(is.na(y)))))

should give you a named integer vector with the counts for each column.

na_count <- data.frame(na_count)

This should output the counts neatly in a dataframe, like:

--------------------------
| row.names | na_count   |
--------------------------
| column_1  | count      |
--------------------------
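A sketch of the two steps above on a small made-up data frame. The column names column_1 and column_2 and their values are assumptions chosen only for illustration:

```r
# Hypothetical example data
x <- data.frame(column_1 = c(1, NA, 3),
                column_2 = c(NA, NA, "a"))

# Per-column NA counts, exactly as written in the answer
na_count <- sapply(x, function(y) sum(length(which(is.na(y)))))

# Wrap the named vector in a data frame; the column names become row names
na_count <- data.frame(na_count)
print(na_count)
```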

Solution 3 - R

Try the colSums function

df <- data.frame(x = c(1,2,NA), y = rep(NA, 3))

colSums(is.na(df))

#x y 
#1 3 

Solution 4 - R

A quick and easy Tidyverse way to get an NA count for all columns is summarise_all(), which I think is much easier to read than the purrr or sapply solutions:

library(tidyverse)
# Example data
df <- tibble(col1 = c(1, 2, 3, NA), 
             col2 = c(NA, NA, "a", "b"))

df %>% summarise_all(~ sum(is.na(.)))
#> # A tibble: 1 x 2
#>    col1  col2
#>   <int> <int>
#> 1     1     2

Or using the more modern across() function:

df %>% summarise(across(everything(), ~ sum(is.na(.))))

Solution 5 - R

If you are looking to count the number of NAs in the entire dataframe you could also use

sum(is.na(df))
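Called on a whole data frame, is.na() returns a logical matrix with one cell per value, so sum() adds up every NA at once. A minimal sketch, reusing the kind of example data shown in Solution 3:

```r
# Example data: one NA in x, three NAs in y
df <- data.frame(x = c(1, 2, NA), y = rep(NA, 3))

na_matrix <- is.na(df)      # logical matrix, same shape as df
total_na <- sum(na_matrix)  # total number of NA cells in the data frame
print(total_na)
```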

Solution 6 - R

The summary() output also reports the number of NAs for each variable, so you can use it when you want NA counts for several variables at once.
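A minimal sketch of what that looks like, assuming a made-up data frame with numeric columns. As far as I know, summary() reports an "NA's" entry for numeric, logical, and factor columns, but not for plain character columns, so treat it as a quick inspection tool rather than a way to extract counts:

```r
# Hypothetical example data with numeric columns
df <- data.frame(x = c(1, 2, NA), y = c(NA, NA, 4))

s <- summary(df)  # a table of formatted summary strings, one column per variable
print(s)          # each column containing NAs gets an "NA's" row
```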

Solution 7 - R

A tidyverse way to count the number of NA values in every column of a dataframe:

library(tidyverse)
library(purrr)

df %>%
    map_df(function(x) sum(is.na(x))) %>%
    gather(feature, num_nulls) %>%
    print(n = 100)

Solution 8 - R

This form, slightly changed from Kevin Ogoro's answer (Solution 2):

na_count <- function(x) sapply(x, function(y) sum(is.na(y)))

returns the NA counts as a named integer vector.

Solution 9 - R

# df is the name of your data frame
sapply(df, function(x) sum(is.na(x)))

Solution 10 - R

User rrs's answer is right, but it only tells you the number of NA values in the one column you pass in. To get the number of NA values for every column of the data frame, try this:

# df is the name of your data frame; MARGIN = 2 applies the function to each column
apply(df, 2, function(x) sum(is.na(x)))

This does the trick.

Solution 11 - R

Try this:

length(df$col[is.na(df$col)])

Solution 12 - R

I read a CSV file from a local directory into df. The following code works for me:

# columnName holds the name of the column you want, as a string

# number of rows where the column is NA
sum(is.na(df[, c(columnName)]))

# number of rows where the column is not NA
sum(!is.na(df[, c(columnName)]))

Solution 13 - R

Similar to hute37's answer but using the purrr package. I think this tidyverse approach is simpler than the answer proposed by AbiK.

library(purrr)
map_dbl(df, ~sum(is.na(.)))

Note: the tilde (~) creates an anonymous function, and the '.' refers to the input of that function, which here is each column of the data.frame df in turn.

Solution 14 - R

If you want the NA count of each column printed one after the other, you can use this simple solution. It returns a list with one element per column.

lapply(df, function(x) { length(which(is.na(x)))})

Solution 15 - R

You can use this to count number of NA or blanks in every column

colSums(is.na(data_set_name)|data_set_name == '')
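A minimal sketch of how this behaves, assuming a made-up data frame with one character and one numeric column. Comparing an NA cell with '' yields NA rather than FALSE, but TRUE | NA evaluates to TRUE, so NA cells are still counted:

```r
# Hypothetical example data: column a has one blank and one NA, column b has one NA
data_set_name <- data.frame(a = c("x", "", NA),
                            b = c(1, NA, 3),
                            stringsAsFactors = FALSE)

# TRUE where a cell is NA or an empty string
missing_or_blank <- is.na(data_set_name) | data_set_name == ""

counts <- colSums(missing_or_blank)  # per-column count of NA or blank cells
print(counts)
```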

Solution 16 - R

For the sake of completeness, you can also use the useNA argument of table(). For example, table(df$col, useNA = "always") counts each non-NA value as well as the NAs.
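A minimal sketch, assuming a made-up column of strings. The NA count appears as the last entry of the table, under an NA name, and can be pulled out by testing the names:

```r
# Hypothetical example data
df <- data.frame(col = c("a", "b", NA, "a", NA))

tab <- table(df$col, useNA = "always")  # adds an NA entry even when its count is 0
print(tab)

# Extract just the NA count from the table
n_na <- as.vector(tab[is.na(names(tab))])
print(n_na)
```

With useNA = "ifany" the NA entry is only shown when at least one NA is present.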

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type     | Original Author      | Original Content on Stackoverflow |
| Question         | user3274289          | View Question on Stackoverflow    |
| Solution 1 - R   | rrs                  | View Answer on Stackoverflow      |
| Solution 2 - R   | Kevin Ogoro          | View Answer on Stackoverflow      |
| Solution 3 - R   | Tony Ladson          | View Answer on Stackoverflow      |
| Solution 4 - R   | Moohan               | View Answer on Stackoverflow      |
| Solution 5 - R   | bkielstr             | View Answer on Stackoverflow      |
| Solution 6 - R   | Shahin               | View Answer on Stackoverflow      |
| Solution 7 - R   | Abi K                | View Answer on Stackoverflow      |
| Solution 8 - R   | hute37               | View Answer on Stackoverflow      |
| Solution 9 - R   | UTKARSH              | View Answer on Stackoverflow      |
| Solution 10 - R  | iec2011007           | View Answer on Stackoverflow      |
| Solution 11 - R  | Rabish kumar Singh   | View Answer on Stackoverflow      |
| Solution 12 - R  | reza.cse08           | View Answer on Stackoverflow      |
| Solution 13 - R  | Chris Kiniry         | View Answer on Stackoverflow      |
| Solution 14 - R  | Prakhar Rathi        | View Answer on Stackoverflow      |
| Solution 15 - R  | Prakhar Srivastava   | View Answer on Stackoverflow      |
| Solution 16 - R  | dpel                 | View Answer on Stackoverflow      |