Number of non-repeating lines - unique count

Bash, Shell, Line, Unique

Bash Problem Overview


Here is my problem: any number of lines of text is given on standard input. The output should be the number of non-repeating lines, i.e. lines that occur exactly once.

INPUT:

She is wearing black shoes.
My name is Johny.
I hate mondays.
My name is Johny.
I don't understand you.
She is wearing black shoes.

OUTPUT:

2

Bash Solutions


Solution 1 - Bash

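You could try using uniq (see man uniq). Because uniq only compares adjacent lines, sort the input first; uniq -u then prints only the lines that are not repeated, and wc -l counts them. Replace file with your input file, or omit it to read from standard input: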

sort file | uniq -u | wc -l
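To see why the sort step matters, here is a minimal sketch that runs the pipeline on the sample input from the question, once with sort and once without. The function name count_unique_lines and the variable sample are illustrative assumptions, not part of the original answer.

```shell
#!/usr/bin/env bash
# Hypothetical helper (the name is an assumption): count the lines on
# standard input that occur exactly once.
count_unique_lines() {
    # sort puts identical lines next to each other, uniq -u keeps only
    # the lines that are not repeated, wc -l counts what is left.
    sort | uniq -u | wc -l
}

# Sample input from the question.
sample="She is wearing black shoes.
My name is Johny.
I hate mondays.
My name is Johny.
I don't understand you.
She is wearing black shoes."

printf '%s\n' "$sample" | count_unique_lines    # sorted first: 2
printf '%s\n' "$sample" | uniq -u | wc -l       # not sorted: 6
```

Without sort, none of the duplicates in the sample are adjacent, so uniq -u treats all six lines as unique. Some wc implementations pad the number with leading spaces.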

Solution 2 - Bash

Here's how I'd solve the problem:

... | awk '{n[$0]++} END {for (line in n) if (n[line]==1) num++; print num+0}'

But that's pretty opaque. Here's a (slightly) more legible way to look at it (requires bash version 4 or later):

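... | {
    declare -A count    # count is an associative array

    # Iterate over each line of the input and accumulate the number of
    # times we've seen this line.
    #
    # The construct "IFS= read -r line" ensures we capture the line exactly:
    # no whitespace trimming and no backslash processing.

    while IFS= read -r line; do
        # Prefix the key with "x" so that a blank line does not produce an
        # empty (invalid) subscript; assigning instead of using (( ... ++ ))
        # avoids the line's text being re-evaluated as arithmetic.
        key="x$line"
        count[$key]=$(( ${count[$key]:-0} + 1 ))
    done

    # Now add up the number of lines whose count is exactly 1.
    num=0
    for c in "${count[@]}"; do
        if (( c == 1 )); then
            (( num++ ))
        fi
    done

    echo "$num"
}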
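For comparison, the awk approach from the start of this solution can also be laid out over several lines with comments. This is only a sketch: the wrapper name count_once_awk and the array name seen are assumptions chosen for illustration, and it prints num + 0 so that input without any unique line yields 0 rather than an empty line.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (the name is an assumption) around the awk approach.
count_once_awk() {
    awk '
        # Tally every input line, using the whole line as the array key.
        { seen[$0]++ }

        END {
            # Count the keys that were seen exactly once.
            for (line in seen)
                if (seen[line] == 1)
                    num++
            # Adding 0 forces a numeric result when num was never set.
            print num + 0
        }
    '
}

printf '%s\n' a b a c | count_once_awk    # b and c occur once: prints 2
printf '%s\n' a a     | count_once_awk    # nothing occurs once: prints 0
```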

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stack Overflow
Question | john blackwood | View Question on Stack Overflow
Solution 1 - Bash | Ding | View Answer on Stack Overflow
Solution 2 - Bash | glenn jackman | View Answer on Stack Overflow