How to print the number of characters in each line of a text file

Shell, Unix, Sed, Awk

Shell Problem Overview


I would like to print the number of characters in each line of a text file using a Unix command. I know it is simple with PowerShell:

gc abc.txt | % {$_.length}

but I need a Unix command.

Shell Solutions


Solution 1 - Shell

Use Awk.

awk '{ print length }' abc.txt
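If it helps to see which line each count belongs to, the same one-liner can be extended to print the line number next to the length. The following is only a sketch: the function name line_lengths is an illustrative choice, and a here-document stands in for abc.txt so the snippet runs on its own. Awk implementations can differ on whether length counts bytes or characters for non-ASCII text, so the sample input is plain ASCII.

```shell
# Hypothetical helper (name chosen for this example): print
# "line-number: length" for every line read from standard input.
line_lengths() {
    # NR is the current line number; length($0) is the length of the whole line.
    awk '{ print NR ": " length($0) }'
}

# Sample input, including an empty line.
line_lengths <<'EOF'
hello
hi there

x
EOF
```

For this sample input it prints 1: 5, 2: 8, 3: 0 and 4: 1, one per line. To run it on a real file, redirect the file instead: line_lengths < abc.txt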

Solution 2 - Shell

while IFS= read -r line; do echo "${#line}"; done < abc.txt

It uses only POSIX shell features, so it should work in any POSIX-compliant shell.

Edit: Added -r as suggested by William.

Edit: Beware of Unicode handling. Bash and zsh, with a correctly set locale, will show the number of code points, but dash will show bytes, so you have to check what your shell does. There are also many other possible definitions of length in Unicode, so it depends on what you actually want.

Edit: Prefix with IFS= to avoid losing leading and trailing spaces.
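Two of the caveats above can be checked directly. The sketch below is an illustration, not part of the original answer: the helper name count_lines is made up, and the `|| [ -n "$line" ]` guard is an addition that also reports a final line with no terminating newline (read returns non-zero there but still fills in the variable). The second half is a quick way to see whether your shell measures length in code points or in bytes.

```shell
# Hypothetical helper: same loop as above, but it also counts a last line
# that does not end with a newline.
count_lines() {
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "${#line}"
    done
}

# Leading/trailing spaces are kept thanks to IFS=, and the unterminated
# last line is still reported.
printf '  padded  \nlast line, no newline' | count_lines

# The two bytes below are the UTF-8 encoding of "é" (one code point).
# A shell that counts code points prints 1 here; one that counts bytes prints 2.
s=$(printf '\303\251')
printf 'shell length: %s\n' "${#s}"
printf 'byte length: %s\n' "$(printf '%s' "$s" | wc -c | tr -d ' ')"
```

The first command prints 10 and 21. The value reported as "shell length" depends on the shell and the locale, which is exactly the difference described above.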

Solution 3 - Shell

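Here is an example using xargs (the -d option is a GNU extension). The line is passed to sh as a positional parameter and printed with printf, so the count does not include an extra newline and the line's contents are never interpreted as shell code. Note that wc -c counts bytes; use wc -m to count characters.

$ xargs -d '\n' -I{} sh -c 'printf %s "$1" | wc -c' sh {} < file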

Solution 4 - Shell

I've tried the other answers listed above, but they are very far from decent solutions when dealing with large files -- especially once a single line's size occupies more than ~1/4 of available RAM.

Both bash and awk slurp the entire line, even though for this problem it's not needed. Bash will error out once a line is too long, even if you have enough memory.

I've implemented an extremely simple, fairly unoptimized Python script that reads one byte at a time instead of slurping whole lines. Tested with large files (~4 GB per line), it is by far a better solution than those given.

If this is time critical code for production, you can rewrite the ideas in C or perform better optimizations on the read call (instead of only reading a single byte at a time), after testing that this is indeed a bottleneck.

The code assumes the newline is a linefeed character, which is a good assumption for Unix, but YMMV on Mac OS/Windows. Make sure the file ends with a linefeed; otherwise the last line's count is not printed. Note that the script counts bytes, not characters.

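from sys import stdin

counter = 0  # bytes seen so far on the current line
while True:
    # Read a single byte so a long line is never held in memory.
    byte = stdin.buffer.read(1)
    if not byte:  # end of input
        break
    if byte == b'\n':
        print(counter)
        counter = 0
    else:
        counter += 1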

Solution 5 - Shell

Try this:

while IFS= read -r line
do
    printf '%s' "$line" | wc -m
done < abc.txt

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type        Original Author   Original Content on Stack Overflow
Question            vikas368          View Question on Stack Overflow
Solution 1 - Shell  Fred Foo          View Answer on Stack Overflow
Solution 2 - Shell  Jan Hudec         View Answer on Stack Overflow
Solution 3 - Shell  kenorb            View Answer on Stack Overflow
Solution 4 - Shell  user2875414       View Answer on Stack Overflow
Solution 5 - Shell  Rahul             View Answer on Stack Overflow