Add up a column of numbers at the Unix shell
Linux Problem Overview
Given a list of files in files.txt, I can get a list of their sizes like this:
cat files.txt | xargs ls -l | cut -c 23-30
which produces something like this:
151552
319488
1536000
225280
How can I get the total of all those numbers?
Linux Solutions
Solution 1 - Linux
... | paste -sd+ - | bc
is the shortest one I've found (from the UNIX Command Line blog).
Edit: added the - argument for portability, thanks @Dogbert and @Owen.
Solution 2 - Linux
Here goes
cat files.txt | xargs ls -l | cut -c 23-30 |
awk '{total = total + $1}END{print total}'
Solution 3 - Linux
The cat | xargs pipeline will not work if there are spaces in filenames, because xargs splits its input on whitespace. Here is a Perl one-liner instead.
perl -nle 'chomp; $x+=(stat($_))[7]; END{print $x}' files.txt
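If Perl is not at hand, the same idea can be sketched in plain shell by reading files.txt one line at a time, so that a name containing spaces stays a single argument. This is only a sketch: total_size is a made-up function name, it assumes one filename per line, and it uses wc -c to get each size in bytes.

```shell
# total_size LIST: sum the byte sizes of the files named in LIST (one per line).
# total_size is a hypothetical helper name, not a standard command.
total_size() {
    total=0
    # IFS= and -r keep leading spaces and backslashes in each name intact;
    # the extra -n test picks up a last line that lacks a trailing newline.
    while IFS= read -r file || [ -n "$file" ]; do
        [ -f "$file" ] || continue      # skip names that are not regular files
        size=$(wc -c < "$file")         # byte count only, no filename echoed
        total=$((total + size))
    done < "$1"
    echo "$total"
}

# Usage: total_size files.txt
```

Names that do not point at a regular file are silently skipped here; whether that or an error message is preferable depends on the script.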
Solution 4 - Linux
Instead of using cut to get the file size from the output of ls -l, you can use awk directly:
$ cat files.txt | xargs ls -l | awk '{total += $5} END {print "Total:", total, "bytes"}'
Awk interprets "$5" as the fifth column. This is the column from ls -l that gives you the file size.
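To see what the $5 selection does without touching real files, the awk part can be fed a few lines shaped like ls -l output. The listing below is invented for illustration (only the sizes come from the question), and sum_sizes is a made-up name.

```shell
# sum_sizes: add up field 5 (the size column of ls -l) read from stdin.
# sum_sizes is a hypothetical helper name used only for this sketch.
sum_sizes() {
    awk '{ total += $5 } END { print total + 0 }'
}

# Invented sample of ls -l output; only the fifth column matters here.
sum_sizes <<'EOF'
-rw-r--r-- 1 user group 151552 Jan  1 12:00 a.bin
-rw-r--r-- 1 user group 319488 Jan  1 12:00 b.bin
-rw-r--r-- 1 user group 1536000 Jan  1 12:00 c.bin
-rw-r--r-- 1 user group 225280 Jan  1 12:00 d.bin
EOF
```

For the four sizes in the question this prints 2232320.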
Solution 5 - Linux
python3 -c"import os; print(sum(os.path.getsize(f) for f in open('files.txt').read().split()))"
Or if you just want to sum the numbers, pipe into:
python3 -c"import sys; print(sum(int(x) for x in sys.stdin))"
Solution 6 - Linux
If you don't have bc installed, try
echo $(( $(... | paste -sd+ -) ))
instead of
... | paste -sd+ - | bc
$( ) <-- return the value of executing the command
$(( 1+2 )) <-- return the evaluated results
echo <-- echo it to the screen
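Put together on the sizes from the question, the pieces look roughly like this. It is a sketch that assumes a POSIX shell; printf stands in for the elided pipeline, and expr_text is just an illustrative variable name.

```shell
# printf stands in for the real pipeline and emits one number per line.
expr_text=$(printf '%s\n' 151552 319488 1536000 225280 | paste -sd+ -)

echo "$expr_text"          # the joined expression: 151552+319488+1536000+225280
echo $(( $expr_text ))     # the shell evaluates it: 2232320
```

Writing $expr_text with the dollar sign inside $(( )) makes the shell substitute the text first and then evaluate it, which avoids relying on how a particular shell treats a variable that holds a whole expression.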
Solution 7 - Linux
You can use the following script if you just want to use shell scripting without awk or other interpreters:
#!/bin/bash
total=0
for number in `cat files.txt | xargs ls -l | cut -c 23-30`; do
let total=$total+$number
done
echo $total
Solution 8 - Linux
TMTOWTDI: Perl has a file size operator (-s)
perl -lne '$t+=-s;END{print $t}' files.txt
Solution 9 - Linux
Piping ls -l through cut is rather convoluted when you have stat. It also depends on the exact format of ls -l (it didn't work for me until I changed the column numbers for cut).
This version also removes the useless use of cat.
<files.txt xargs stat -c %s | paste -sd+ - | bc
Solution 10 - Linux
cat files.txt | awk '{ total += $1} END {print total}'
You can use awk to do the same; it handles decimals, and lines that are not numbers count as 0:
$ cat files.txt
1
2.3
3.4
ew
1
$ cat files.txt | awk '{ total += $1} END {print total}'
7.7
Or you can use the ls command and print the total in human-readable form:
$ ls -l | awk '{ sum += $5} END {hum[1024^3]="Gb"; hum[1024^2]="Mb"; hum[1024]="Kb"; for (x=1024^3; x>=1024; x/=1024) { if (sum>=x) { printf "%.2f %s\n",sum/x,hum[x]; break; } } if (sum<1024) print "1kb"; }'
15.69 Mb
$ ls -l *.txt | awk '{ sum += $5} END {hum[1024^3]="Gb"; hum[1024^2]="Mb"; hum[1024]="Kb"; for (x=1024^3; x>=1024; x/=1024) { if (sum>=x) { printf "%.2f %s\n",sum/x,hum[x]; break; } } if (sum<1024) print "1kb"; }'
2.10 Mb
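One caveat when relying on awk's conversion: a value such as 12abc is not ignored, it counts as 12, because awk uses the leading numeric part of a string. If that matters, a stricter variant can test each line against a pattern first. A sketch, where sum_numeric is a made-up name and the pattern is only an assumption about what counts as a number (optional sign, digits, optional fractional part):

```shell
# sum_numeric: add up only those lines whose first field is a plain decimal number.
# sum_numeric is a hypothetical helper name for this sketch.
sum_numeric() {
    awk '$1 ~ /^[-+]?[0-9]+(\.[0-9]+)?$/ { total += $1 } END { print total + 0 }'
}

# "ew" and "12abc" fail the pattern and are left out of the total.
printf '%s\n' 1 2.3 3.4 ew 12abc 1 | sum_numeric
```

With this input the result is 7.7, the same as in the example above, since the two non-matching lines contribute nothing.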
Solution 11 - Linux
I would use "du" instead.
$ cat files.txt | xargs du -c | tail -1
4480 total
If you just want the number:
cat files.txt | xargs du -c | tail -1 | awk '{print $1}'
Solution 12 - Linux
In ksh:
echo "$(ls -l $(<files.txt) | awk '{print $5}' | tr '\n' '+') 0" | bc
Solution 13 - Linux
Here's mine
cat files.txt | xargs ls -l | cut -c 23-30 | sed -e :a -e '$!N;s/\n/+/;ta' | bc
Solution 14 - Linux
... |xargs|tr \  +|bc
... |paste -sd+ -|bc
The first command is just one symbol longer (note, it must have two spaces after the backslash!), but it handles the cases with empty lines in a column, whereas the second command results in an invalid expression with extra pluses.
E.g.:
echo "2
3
5
" | paste -sd+ -
results in
2+3+5+
which bc cannot handle, whereas
echo "2
3
5
" | xargs | tr \  +
gives a valid expression
2+3+5
which can be piped into bc to get the final result
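If paste is preferred, another way around the empty-line problem is to drop blank lines before joining. A sketch, with join_plus as a made-up helper name; the result is evaluated with shell arithmetic here only to keep the example independent of bc:

```shell
# join_plus: join the non-blank lines of stdin with "+".
# join_plus is a hypothetical helper name for this sketch.
join_plus() {
    # grep -v drops lines that are empty or contain only whitespace.
    grep -v '^[[:space:]]*$' | paste -sd+ -
}

expr_text=$(printf '2\n3\n\n5\n\n' | join_plus)
echo "$expr_text"          # 2+3+5
echo $(( $expr_text ))     # 10
```

The joined text can equally be piped into bc, as in the commands above.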
Solution 15 - Linux
Pipe to gawk:
cat files.txt | xargs ls -l | cut -c 23-30 | gawk 'BEGIN { sum = 0 } // { sum = sum + $0 } END { print sum }'
Solution 16 - Linux
#
# @(#) addup.sh 1.0 90/07/19
#
# Copyright (C) <heh> SjB, 1990
# Adds up a column (default=last) of numbers in a file.
# 95/05/16 updated to allow (999) negative style numbers.
case $1 in
-[0-9])
COLUMN=`echo $1 | tr -d -`
shift
;;
*)
COLUMN="NF"
;;
esac
echo "Adding up column .. $COLUMN .. of file(s) .. $*"
nawk ' OFMT="%.2f" # 1 "%12.2f"
{ x = '$COLUMN' # 2
neg = index($x, "$") # 3
if (neg > 0) X = gsub("\\$", "", $x)
neg = index($x, ",") # 4
if (neg > 1) X = gsub(",", "", $x)
neg = index($x, "(") # 8 neg (123 & change
if (neg > 0) X = gsub("\\(", "", $x)
if (neg > 0) $x = (-1 * $x) # it to "-123.00"
neg = index($x, "-") # 5
if (neg > 1) $x = (-1 * $x) # 6
t += $x # 7
print "x is <<<", $x+0, ">>> running balance:", t
} ' $*
# 1. set numeric format to eliminate rounding errors
# 1.1 had to reset numeric format from 12.2f to .2f 95/05/16
# when a computed number is assigned to a variable ( $x = (-1 * $x) )
# it causes $x to use the OFMT so -1.23 = "________-1.23" vs "-1.23"
# and that causes my #5 (negative check) to not work correctly because
# the index returns a number >1 and so the neg neg then becomes a positive
# this only occurs if the number happened to be a "(" neg number
# 2. find the field we want to add up (comes from the shell or defaults
# to the last field "NF") in the file
# 3. check for a dollar sign ($) in the number - if there get rid of it
# so we may add it correctly - $12 $1$2 $1$2$ $$1$$2$$ all = 12
# 4. check for a comma (,) in the number - if there get rid of it so we
# may add it correctly - 1,2 12, 1,,2 1,,2,, all = 12 (,12=0)
# 5. check for negative numbers
# 6. if x is a negative number in the form 999- "make" it a recognized
# number like -999 - if x is a negative number like -999 already
# the test fails (y is not >1) and this "true" negative is not made
# positive
# 7. accumulate the total
# 8. if x is a negative number in the form (999) "make it a recognized
# number like -999
# * Note that a (-9) (neg neg number) returns a positive
# * Mite not work rite with all forms of all numbers using $-,+. etc. *
Solution 17 - Linux
I like to use....
echo "
1
2
3 " | sed -e 's,$, + p,g' | dc
This prints the running sum after each line.
Applied to this situation:
ls -ld $(< files.txt) | awk '{print $5}' | sed -e 's,$, + p,g' | dc
The total is the last value printed.
Solution 18 - Linux
Pure bash
total=0; for i in $(cat files.txt | xargs ls -l | cut -c 23-30); do
total=$(( $total + $i )); done; echo $total
Solution 19 - Linux
In my opinion, the simplest solution to this is "expr" unix command:
s=0;
for i in `cat files.txt | xargs ls -l | cut -c 23-30`
do
s=`expr $s + $i`
done
echo $s
Solution 20 - Linux
sizes=( $(cat files.txt | xargs ls -l | cut -c 23-30) )
total=$(( $(IFS="+"; echo "${sizes[*]}") ))
Or you could just sum them as you read the sizes:
declare -i total=0
while read x; do total+=x; done < <( cat files.txt | xargs ls -l | cut -c 23-30 )
If you don't care about byte sizes and blocks are OK, then just:
declare -i total=0
while read s junk; do total+=s; done < <( cat files.txt | xargs ls -s )
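The read loop itself does not depend on bash-only features. A sketch of the same idea in plain POSIX sh, with the sizes supplied by a here-document instead of the ls pipeline (sum_lines is a made-up name):

```shell
# sum_lines: add up integers from stdin, one per line.
# sum_lines is a hypothetical helper name for this sketch.
sum_lines() {
    total=0
    # read strips the leading and trailing blanks that cut may leave behind.
    while read -r x; do
        [ -n "$x" ] || continue      # ignore blank lines
        total=$((total + x))
    done
    echo "$total"
}

sum_lines <<'EOF'
151552
319488
1536000
225280
EOF
```

For the sizes in the question this prints 2232320.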
Solution 21 - Linux
If you have R, you can use:
> ... | Rscript -e 'print(sum(scan("stdin")));'
Read 4 items
[1] 2232320
Since I'm comfortable with R, I actually have several aliases for things like this so I can use them in bash without having to remember this syntax. For instance:
alias Rsum=$'Rscript -e \'print(sum(scan("stdin")));\''
which lets me do
> ... | Rsum
Read 4 items
[1] 2232320
Inspiration: Is there a way to get the min, max, median, and average of a list of numbers in a single command?
Solution 22 - Linux
The most popular answer doesn't work right when the start of the pipe can produce 0 lines, because it ends up outputting nothing rather than 0. You can get correct behavior by always adding 0:
... | (cat && echo 0) | paste -sd+ - | bc
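The awk-based answers have a similar quirk: with no input lines, print total prints an empty line, because total was never set. Adding 0 in the END block forces a numeric result. A small sketch (sum_or_zero is a made-up name):

```shell
# sum_or_zero: sum the first field of each line; prints 0 for empty input.
# sum_or_zero is a hypothetical helper name for this sketch.
sum_or_zero() {
    awk '{ total += $1 } END { print total + 0 }'
}

printf '' | sum_or_zero               # no input lines: prints 0
printf '%s\n' 2 3 5 | sum_or_zero     # ordinary input: prints 10
```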
Solution 23 - Linux
With GNU paste the - is not required (it is still worth keeping for portability to other implementations). The following will do as long as files.txt contains one or more valid file names:
<files.txt xargs stat -c %s | paste -sd+ | bc
Nor is cat required to insert a 0 when there are no files. Without the extra pipe, which is perhaps more convenient in a script, you could use:
(xargs -a files.txt stat -c %s || echo 0) | paste -sd+ | bc