Bash: Capture output of command run in background

Bash Problem Overview


I'm trying to write a bash script that gets the output of a command run in the background. Unfortunately I can't get it to work: the variable I assign the output to is empty. If I replace the assignment with an echo command, everything works as expected, though.

#!/bin/bash

function test {
    echo "$1"
}

echo $(test "echo") &
wait

a=$(test "assignment") &
wait

echo $a

echo done

This code produces the output:

echo

done

Changing the assignment to

a=`echo $(test "assignment") &`

works, but it seems like there should be a better way of doing this.
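
As far as I can tell, the reason the assignment stays empty is that & runs the whole assignment in a subshell, so the parent shell's a is never set. A minimal sketch showing this:

a=$(echo hello) &   # the assignment happens in a subshell
wait
echo "a='$a'"       # prints a='' -- the parent shell never saw the assignment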

Bash Solutions


Solution 1 - Bash

Bash indeed has a feature, called process substitution, to accomplish this.

$ echo <(yes)
/dev/fd/63

Here, the expression <(yes) is replaced with the pathname of a (pseudo-device) file that is connected to the standard output of an asynchronous job yes (which prints the string y in an endless loop).

Now let's try to read from it:

$ cat /dev/fd/63
cat: /dev/fd/63: No such file or directory

The problem here is that the yes process terminated in the meantime because it received a SIGPIPE (it had no readers on stdout).

The solution is the following construct:

$ exec 3< <(yes)  # Save stdout of the 'yes' job as (input) fd 3.

This opens the file as input fd 3 before the background job is started.

You can now read from the background job whenever you prefer. As a trivial example:

$ for i in 1 2 3; do read <&3 line; echo "$line"; done
y
y
y

Note that this has slightly different semantics than having the background job write to a drive-backed file: the background job is blocked when the pipe buffer is full (you empty the buffer by reading from the fd). By contrast, writing to a drive-backed file only blocks when the drive doesn't respond.

Process substitution is not a POSIX sh feature.
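
On plain POSIX sh, a named FIFO gives much the same effect; a minimal sketch (the filename mypipe is arbitrary):

mkfifo mypipe      # create a named FIFO
yes > mypipe &     # the writer blocks in open(2) until a reader appears
exec 3< mypipe     # open the FIFO for reading as fd 3
rm mypipe          # safe to unlink once both ends are open
read line <&3
echo "$line"       # prints: y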

Here's a quick hack to give an asynchronous job drive backing (almost) without assigning a filename to it:

$ yes > backingfile &  # Start job in background writing to a new file. Do also look at `mktemp(3)` and the `sh` option `set -o noclobber`
$ exec 3< backingfile  # open the file for reading in the current shell, as fd 3
$ rm backingfile       # remove the file. It will disappear from the filesystem, but there is still a reader and a writer attached to it which both can use it.

$ for i in 1 2 3; do read <&3 line; echo "$line"; done
y
y
y

Linux also recently added the O_TMPFILE open flag, which makes such hacks possible without the file ever being visible at all. I don't know whether bash already supports it.

UPDATE:

@rthur, if you want to capture the whole output from fd 3, then use

output=$(cat <&3)

But note that you can't capture binary data in general: it's only a defined operation if the output is text in the POSIX sense. The implementations I know of simply filter out all NUL bytes. Furthermore, POSIX specifies that all trailing newlines must be removed.

(Note also that capturing the whole output will exhaust memory if the writer never stops, and yes never stops. Naturally the same problem holds even for read if the line separator is never written.)
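
Putting the pieces together for a job that does terminate (a small sketch, with seq 5 standing in for a finite background command):

exec 3< <(seq 5)     # start a finite job, stdout saved as fd 3
output=$(cat <&3)    # blocks until the job closes its stdout
echo "$output"       # prints 1..5, one per line (trailing newline stripped)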

Solution 2 - Bash

One very robust way to deal with coprocesses in Bash is to use... the coproc builtin.

Suppose you have a script or function called banana that you wish to run in the background, capturing all of its output while doing some other stuff, and waiting until it's done. I'll simulate it with this:

banana() {
    for i in {1..4}; do
        echo "gorilla eats banana $i"
        sleep 1
    done
    echo "gorilla says thank you for the delicious bananas"
}

stuff() {
    echo "I'm doing this stuff"
    sleep 1
    echo "I'm doing that stuff"
    sleep 1
    echo "I'm done doing my stuff."
}

You then run banana with coproc like so:

coproc bananafd { banana; }

This is like running banana & but with the following extras: it creates two file descriptors, stored in the array bananafd (index 0 for output and index 1 for input). You capture the output of banana with the read builtin:

IFS= read -r -d '' -u "${bananafd[0]}" banana_output

Try it:

#!/bin/bash

banana() {
    for i in {1..4}; do
        echo "gorilla eats banana $i"
        sleep 1
    done
    echo "gorilla says thank you for the delicious bananas"
}

stuff() {
    echo "I'm doing this stuff"
    sleep 1
    echo "I'm doing that stuff"
    sleep 1
    echo "I'm done doing my stuff."
}

coproc bananafd { banana; }

stuff

IFS= read -r -d '' -u "${bananafd[0]}" banana_output

echo "$banana_output"

Caveat: you must be done with stuff before banana ends! If the gorilla is quicker than you:

#!/bin/bash

banana() {
    for i in {1..4}; do
        echo "gorilla eats banana $i"
    done
    echo "gorilla says thank you for the delicious bananas"
}

stuff() {
    echo "I'm doing this stuff"
    sleep 1
    echo "I'm doing that stuff"
    sleep 1
    echo "I'm done doing my stuff."
}

coproc bananafd { banana; }

stuff

IFS= read -r -d '' -u "${bananafd[0]}" banana_output

echo "$banana_output"

In this case, you'll obtain an error like this one:

./banana: line 22: read: : invalid file descriptor specification

You can check whether it's too late (i.e., whether you've taken too long doing your stuff): after the coproc is done, bash removes the values from the array bananafd, and that's why we got the previous error.

#!/bin/bash

banana() {
    for i in {1..4}; do
        echo "gorilla eats banana $i"
    done
    echo "gorilla says thank you for the delicious bananas"
}

stuff() {
    echo "I'm doing this stuff"
    sleep 1
    echo "I'm doing that stuff"
    sleep 1
    echo "I'm done doing my stuff."
}

coproc bananafd { banana; }

stuff

if [[ -n ${bananafd[@]} ]]; then
    IFS= read -r -d '' -u "${bananafd[0]}" banana_output
    echo "$banana_output"
else
    echo "oh no, I took too long doing my stuff..."
fi

Finally, if you really don't want to miss any of gorilla's moves, even if you take too long for your stuff, you could copy banana's file descriptor to another fd, 3 for example, do your stuff and then read from 3:

#!/bin/bash

banana() {
    for i in {1..4}; do
        echo "gorilla eats banana $i"
        sleep 1
    done
    echo "gorilla says thank you for the delicious bananas"
}

stuff() {
    echo "I'm doing this stuff"
    sleep 1
    echo "I'm doing that stuff"
    sleep 1
    echo "I'm done doing my stuff."
}

coproc bananafd { banana; }

# Duplicate the coprocess's output fd bananafd[0] as fd 3
exec 3>&${bananafd[0]}

stuff

IFS= read -r -d '' -u 3 output
echo "$output"

This works very well! The last read also plays the role of wait, so output will contain the complete output of banana.
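
If you also need banana's exit status: bash stores the coprocess PID in the variable bananafd_PID (coproc NAME sets NAME_PID), so you can wait for it after the read; a short sketch:

IFS= read -r -d '' -u 3 output   # as above: drains the coprocess output
wait "$bananafd_PID"             # reap the coprocess, collecting its status
echo "banana exited with status $?"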

That was great: no temp files to deal with (bash handles everything silently) and 100% pure bash!

Hope this helps!

Solution 3 - Bash

One way to capture a background command's output is to redirect its output to a file, and to collect the output from the file after the background process has ended:

test "assignment" > /tmp/_out &
wait
a=$(</tmp/_out)
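
A slightly safer sketch of the same idea, using mktemp so the temporary filename can't collide (this assumes the test function from the question):

tmp=$(mktemp) || exit 1        # unique temporary file instead of a fixed name
trap 'rm -f "$tmp"' EXIT       # clean up on exit
test "assignment" > "$tmp" &
wait
a=$(<"$tmp")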

Solution 4 - Bash

I also use file descriptor redirections. Like this:

exec 3< <({ sleep 2; echo 12; })  # Launch as a background job, stdout -> fd 3
cat <&3                           # Blocking read from fd 3

A more realistic case: if I want the output of 4 parallel workers (toto, titi, tata and tutu), I redirect each one to a different file descriptor (stored in the fd variable). Reading from these file descriptors then blocks until EOF, i.e. until the pipe is closed because the command completed:

#!/usr/bin/env bash

# Declare data to be forked
a_value=(toto titi tata tutu)
msg=""

# Spawn child sub-processes
for i in {0..3}; do
  ((fd=50+i))
  echo "1/ Launching command: ${a_value[$i]} with file descriptor: $fd!"
  eval "exec $fd< <({ sleep $i; echo ${a_value[$i]}; })"
  a_pid+=($!)  # Store the pid of the process substitution (unused here; see Note 2)
done

# Join children: wait for them all and collect their stdout
for i in {0..3}; do
  ((fd=50+i))
  echo "2/ Getting result of: ${a_value[$i]} with file descriptor: $fd!"
  msg+="$(cat <&$fd)\n"
done

# Print result
echo -e "===========================\nResult:"
echo -e "$msg"

Should output:

1/ Launching command: toto with file descriptor: 50!
1/ Launching command: titi with file descriptor: 51!
1/ Launching command: tata with file descriptor: 52!
1/ Launching command: tutu with file descriptor: 53!
2/ Getting result of: toto with file descriptor: 50!
2/ Getting result of: titi with file descriptor: 51!
2/ Getting result of: tata with file descriptor: 52!
2/ Getting result of: tutu with file descriptor: 53!
===========================
Result:
toto
titi
tata
tutu

Note 1: coproc supports only a single coprocess, not multiple ones.

Note 2: the wait command is buggy in older bash versions (4.2) and cannot retrieve the status of the jobs launched here. It works well in bash 5, but the file descriptor redirection works in all versions.
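
As an aside, on bash 4.1 and later the eval can be avoided: the {varname}< redirection form lets bash allocate a free file descriptor itself. A sketch of the spawn/join loops from the script above under that assumption:

a_fd=()
for i in {0..3}; do
  exec {fd}< <({ sleep "$i"; echo "${a_value[$i]}"; })  # bash picks a free fd and stores it in fd
  a_fd+=("$fd")
done

for fd in "${a_fd[@]}"; do
  msg+="$(cat <&"$fd")\n"  # blocks until that worker's output is complete
done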

Solution 5 - Bash

Just group the commands when you run them in the background, and then wait for both.

{ echo a & echo b & wait; } | nl

Output will be:

     1  a
     2  b

But notice that the output can be out of order if the second task runs faster than the first.

{ { sleep 1; echo a; } & echo b & wait; } | nl

Reverse output:

     1  b
     2  a

If the output of the two background jobs must be kept separate, the output has to be buffered somewhere, typically in a file. Example:

#! /bin/bash

t0=$(date +%s)                               # Get start time

trap 'rm -f "$ta" "$tb"' EXIT                # Remove temp files on exit.

ta=$(mktemp)                                 # Create temp file for job a.
tb=$(mktemp)                                 # Create temp file for job b.

{ exec >$ta; echo a1; sleep 2; echo a2; } &  # Run job a.
{ exec >$tb; echo b1; sleep 3; echo b2; } &  # Run job b.

wait                                         # Wait for the jobs to finish.

cat "$ta"                                    # Print output of job a.
cat "$tb"                                    # Print output of job b.

t1=$(date +%s)                               # Get end time

echo "t1 - t0: $((t1-t0))"                   # Display execution time.

The overall runtime of the script is three seconds, although the combined sleeping time of both background jobs is five seconds. And the output of the background jobs is in order.

a1
a2
b1
b2
t1 - t0: 3

You can also use a memory buffer to store the output of your jobs. But this works only, if your buffer is big enough to store the whole output of your jobs.

#! /bin/bash

t0=$(date +%s)

trap 'rm -f /tmp/{a,b}' EXIT
mkfifo /tmp/{a,b}

buffer() { dd of="$1" status=none iflag=fullblock bs=1K; }

pids=()
{ echo a1; sleep 2; echo a2; } > >(buffer /tmp/a) &
pids+=($!)
{ echo b1; sleep 3; echo b2; } > >(buffer /tmp/b) &
pids+=($!)

# Wait only for the jobs but not for the buffering `dd`.
wait "${pids[@]}" 

# This will wait for `dd`.
cat /tmp/{a,b}

t1=$(date +%s)

echo "t1 - t0: $((t1-t0))"

The above will also work with cat instead of dd. But then you cannot control the buffer size.
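
For reference, the cat variant of the buffer function would simply be:

buffer() { cat > "$1"; }  # same role as dd, but with no control over the buffer size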

Solution 6 - Bash

If you have GNU Parallel you can probably use parset:

myfunc() {
  sleep 3
  echo "The input was"
  echo "$@"
}
export -f myfunc
parset a,b,c myfunc ::: myarg-a "myarg  b" myarg-c
echo "$a"
echo "$b"
echo "$c"

See: https://www.gnu.org/software/parallel/parset.html

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Question: rthur (view question on Stackoverflow)
Solution 1 - Bash: Jo So (view answer on Stackoverflow)
Solution 2 - Bash: gniourf_gniourf (view answer on Stackoverflow)
Solution 3 - Bash: anubhava (view answer on Stackoverflow)
Solution 4 - Bash: Tinmarino (view answer on Stackoverflow)
Solution 5 - Bash: ceving (view answer on Stackoverflow)
Solution 6 - Bash: Ole Tange (view answer on Stackoverflow)