Total size of the contents of all the files in a directory

LinuxShell

Linux Problem Overview


When I use ls or du, I get the amount of disk space each file is occupying.

I need the sum total of all the data in files and subdirectories I would get if I opened each file and counted the bytes. Bonus points if I can get this without opening each file and counting.

Linux Solutions


Solution 1 - Linux

If you want the 'apparent size' (that is the number of bytes in each file), not size taken up by files on the disk, use the -b or --bytes option (if you got a Linux system with GNU coreutils):

% du -sbh <directory>

Solution 2 - Linux

Use du -sb:

du -sb DIR

Optionally, add the h option for more user-friendly output:

du -sbh DIR

Solution 3 - Linux

cd to directory, then:

du -sh

ftw!

Originally wrote about it here: https://ao.ms/get-the-total-size-of-all-the-files-in-a-directory/

Solution 4 - Linux

Just an alternative:

ls -lAR | grep -v '^d' | awk '{total += $5} END {print "Total:", total}'

grep -v '^d' will exclude the directories.

Solution 5 - Linux

stat's "%s" format gives you the actual number of bytes in a file.

 find . -type f |
 xargs stat --format=%s |
 awk '{s+=$1} END {print s}'

Feel free to substitute your https://stackoverflow.com/questions/450799/linux-command-to-sum-integers-one-per-line">favourite method for summing numbers.

Solution 6 - Linux

If you use busybox's "du" in emebedded system, you can not get a exact bytes with du, only Kbytes you can get.

BusyBox v1.4.1 (2007-11-30 20:37:49 EST) multi-call binary

Usage: du [-aHLdclsxhmk] [FILE]...

Summarize disk space used for each FILE and/or directory.
Disk space is printed in units of 1024 bytes.

Options:
        -a      Show sizes of files in addition to directories
        -H      Follow symbolic links that are FILE command line args
        -L      Follow all symbolic links encountered
        -d N    Limit output to directories (and files with -a) of depth < N
        -c      Output a grand total
        -l      Count sizes many times if hard linked
        -s      Display only a total for each argument
        -x      Skip directories on different filesystems
        -h      Print sizes in human readable format (e.g., 1K 243M 2G )
        -m      Print sizes in megabytes
        -k      Print sizes in kilobytes(default)

Solution 7 - Linux

When a folder is created, many Linux filesystems allocate 4096 bytes to store some metadata about the directory itself. This space is increased by a multiple of 4096 bytes as the directory grows.

du command (with or without -b option) take in count this space, as you can see typing:

mkdir test && du -b test

you will have a result of 4096 bytes for an empty dir. So, if you put 2 files of 10000 bytes inside the dir, the total amount given by du -sb would be 24096 bytes.

If you read carefully the question, this is not what asked. The questioner asked:

> the sum total of all the data in files and subdirectories I would get if I opened each file and counted the bytes

that in the example above should be 20000 bytes, not 24096.

So, the correct answer IMHO could be a blend of Nelson answer and hlovdal suggestion to handle filenames containing spaces:

find . -type f -print0 | xargs -0 stat --format=%s | awk '{s+=$1} END {print s}'

Solution 8 - Linux

For Win32 DOS, you can:

c:> dir /s c:\directory\you\want

and the penultimate line will tell you how many bytes the files take up.

I know this reads all files and directories, but works faster in some situations.

Solution 9 - Linux

There are at least three ways to get the "sum total of all the data in files and subdirectories" in bytes that work in both Linux/Unix and Git Bash for Windows, listed below in order from fastest to slowest on average. For your reference, they were executed at the root of a fairly deep file system (docroot in a Magento 2 Enterprise installation comprising 71,158 files in 30,027 directories).

1.

$ time find -type f -printf '%s\n' | awk '{ total += $1 }; END { print total" bytes" }'
748660546 bytes

real    0m0.221s
user    0m0.068s
sys     0m0.160s

2.

$ time echo `find -type f -print0 | xargs -0 stat --format=%s | awk '{total+=$1} END {print total}'` bytes
748660546 bytes

real    0m0.256s
user    0m0.164s
sys     0m0.196s

3.

$ time echo `find -type f -exec du -bc {} + | grep -P "\ttotal$" | cut -f1 | awk '{ total += $1 }; END { print total }'` bytes
748660546 bytes

real    0m0.553s
user    0m0.308s
sys     0m0.416s


These two also work, but they rely on commands that don't exist on Git Bash for Windows:

1.

$ time echo `find -type f -printf "%s + " | dc -e0 -f- -ep` bytes
748660546 bytes

real    0m0.233s
user    0m0.116s
sys     0m0.176s

2.

$ time echo `find -type f -printf '%s\n' | paste -sd+ | bc` bytes
748660546 bytes

real    0m0.242s
user    0m0.104s
sys     0m0.152s


If you only want the total for the current directory, then add -maxdepth 1 to find.


Note that some of the suggested solutions don't return accurate results, so I would stick with the solutions above instead.

$ du -sbh
832M    .

$ ls -lR | grep -v '^d' | awk '{total += $5} END {print "Total:", total}'
Total: 583772525

$ find . -type f | xargs stat --format=%s | awk '{s+=$1} END {print s}'
xargs: unmatched single quote; by default quotes are special to xargs unless you use the -0 option
4390471

$ ls -l| grep -v '^d'| awk '{total = total + $5} END {print "Total" , total}'
Total 968133

Solution 10 - Linux

du is handy, but find is useful in case if you want to calculate the size of some files only (for example, using filter by extension). Also note that find themselves can print the size of each file in bytes. To calculate a total size we can connect dc command in the following manner:

find . -type f -printf "%s + " | dc -e0 -f- -ep

Here find generates sequence of commands for dc like 123 + 456 + 11 +. Although, the completed program should be like 0 123 + 456 + 11 + p (remember postfix notation).

So, to get the completed program we need to put 0 on the stack before executing the sequence from stdin, and print the top number after executing (the p command at the end). We achieve it via dc options:

  1. -e0 is just shortcut for -e '0' that puts 0 on the stack,
  2. -f- is for read and execute commands from stdin (that generated by find here),
  3. -ep is for print the result (-e 'p').

To print the size in MiB like 284.06 MiB we can use -e '2 k 1024 / 1024 / n [ MiB] p' in point 3 instead (most spaces are optional).

Solution 11 - Linux

Use:

$ du -ckx <DIR> | grep total | awk '{print $1}'

Where <DIR> is the directory you want to inspect.

The '-c' gives you grand total data which is extracted using the 'grep total' portion of the command, and the count in Kbytes is extracted with the awk command.

The only caveat here is if you have a subdirectory containing the text "total" it will get spit out as well.

Solution 12 - Linux

This may help:

ls -l| grep -v '^d'| awk '{total = total + $5} END {print "Total" , total}'

The above command will sum total all the files leaving the directories size.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionArthur UlfeldtView Question on Stackoverflow
Solution 1 - LinuxArkadyView Answer on Stackoverflow
Solution 2 - LinuxrobView Answer on Stackoverflow
Solution 3 - LinuxAO_View Answer on Stackoverflow
Solution 4 - LinuxBarunView Answer on Stackoverflow
Solution 5 - LinuxNelsonView Answer on Stackoverflow
Solution 6 - LinuxSam LiaoView Answer on Stackoverflow
Solution 7 - LinuxbytepanView Answer on Stackoverflow
Solution 8 - LinuxSunView Answer on Stackoverflow
Solution 9 - LinuxthdoanView Answer on Stackoverflow
Solution 10 - LinuxruvimView Answer on Stackoverflow
Solution 11 - LinuxRob JonesView Answer on Stackoverflow
Solution 12 - LinuxAtaul HaqueView Answer on Stackoverflow