How can I loop over the output of a shell command?

Linux, Bash, Shell

Linux Problem Overview


I want to write a script that loops through the output (array possibly?) of a shell command, ps.

Here is the command and the output:

$ ps -ewo pid,cmd,etime | grep python | grep -v grep | grep -v sh
 3089 python /var/www/atm_securit       37:02
17116 python /var/www/atm_securit       00:01
17119 python /var/www/atm_securit       00:01
17122 python /var/www/atm_securit       00:01
17125 python /var/www/atm_securit       00:00

Converted into a bash script (snippet):

for tbl in $(ps -ewo pid,cmd,etime | grep python | grep -v grep | grep -v sh)
do
   echo $tbl
done

But the output becomes:

3089
python
/var/www/atm_securit
38:06
17438
python
/var/www/atm_securit
00:02
17448
python
/var/www/atm_securit
00:01

How do I loop through every row like in the shell output, but in a bash script?

Linux Solutions


Solution 1 - Linux

Never use a for loop over the results of a shell command if you want to process them line by line, unless you change the value of the internal field separator $IFS to a newline. Otherwise the lines are subject to word splitting, which leads to the results you are seeing. For example, if you have a file like this:

foo bar
hello world

The following for loop

for i in $(cat file); do
    echo "$i"
done

gives you:

foo
bar
hello
world

Even if you use IFS=$'\n', the lines might still be subject to filename expansion.
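A small sketch of that pitfall, assuming bash and a made-up scratch directory with made-up file names: with IFS set to a newline the for loop keeps "foo bar" intact, but a line containing a glob character is still expanded unless globbing is switched off with set -f.

```shell
#!/bin/bash
# Scratch directory and file names are made up for this demo.
tmpdir=$(mktemp -d) || exit 1
cd "$tmpdir" || exit 1
touch a.txt b.txt
printf '%s\n' 'foo bar' '*.txt' > input

old_ifs=$IFS
IFS=$'\n'           # split on newlines only

# Globbing still active: the line "*.txt" is replaced by matching file names.
with_glob=$(for line in $(cat input); do echo "[$line]"; done)

set -f              # disable filename expansion
without_glob=$(for line in $(cat input); do echo "[$line]"; done)
set +f

IFS=$old_ifs        # restore the previous field separator
cd / && rm -rf "$tmpdir"

echo "$with_glob"
echo "---"
echo "$without_glob"
```

The first loop prints the two file names instead of the literal line, the second prints the line as written.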


I recommend using while + read instead, because read reads line by line.

Furthermore, I would use pgrep if you are searching for pids belonging to a certain binary. However, since python might appear as different binaries, like python2.7 or python3.4, I suggest passing -f to pgrep, which makes it search the whole command line rather than just searching for binaries called python. But this will also find any other process whose command line contains python, for example cat run on a file with python in its name. You have been warned! In the end you can refine the regex passed to pgrep as you wish.

Example:

pgrep -f python | while read -r pid ; do
    echo "$pid"
done

or if you also want the process name:

pgrep -af python | while read -r line ; do
    echo "$line"
done

If you want the process name and the pid in separate variables:

pgrep -af python | while read -r pid cmd ; do
    echo "pid: $pid, cmd: $cmd"
done

You see, read offers a flexible and stable way to process the output of a command line-by-line.
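How much of the line survives depends on the options given to read. The sketch below assumes bash (it uses here-strings) and a made-up input line rather than live ps output; it compares plain read, read -r, and IFS= read -r.

```shell
#!/bin/bash
# One made-up input line with leading spaces and backslashes.
sample='   3089 python C:\temp\x.py'

# Plain read: trims leading whitespace and treats backslashes as escapes.
read plain <<< "$sample"

# read -r: keeps backslashes, still trims leading/trailing whitespace.
read -r raw <<< "$sample"

# IFS= read -r: keeps the line exactly as it was written.
IFS= read -r exact <<< "$sample"

printf '[%s]\n' "$plain" "$raw" "$exact"
```

For ps output, whose pid column is right-aligned with leading spaces, IFS= read -r is the variant to pick when the original column layout must be preserved.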


Btw, if you prefer your ps .. | grep command line over pgrep, use the following loop:

ps -ewo pid,etime,cmd | grep python | grep -v grep | grep -v sh \
  | while read -r pid etime cmd ; do
    echo "$pid $cmd $etime"
done

Note how I changed the order of etime and cmd. This makes it possible to read cmd, which can contain whitespace, into a single variable. It works because read splits the line into as many fields as you specified variables. The remaining part of the line, possibly including whitespace, gets assigned to the last variable specified on the command line.
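To see the "last variable gets the rest" behaviour without depending on running processes, here is a minimal sketch. The sample rows are adapted from the output in the question and rearranged into pid,etime,cmd order.

```shell
#!/bin/bash
# Sample rows in pid,etime,cmd order (values adapted from the question's output).
rows=' 3089       37:02 python /var/www/atm_securit
17116       00:01 python /var/www/atm_securit'

# pid and etime take one word each, cmd receives the rest of the line.
result=$(printf '%s\n' "$rows" | while read -r pid etime cmd; do
    echo "pid=$pid etime=$etime cmd=[$cmd]"
done)

echo "$result"
```

The brackets make it visible that cmd holds both "python" and the script path, including the space between them.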

Solution 2 - Linux

I found you can do this just by using double quotes:

while read -r proc; do
     #do work
done <<< "$(ps -ewo pid,cmd,etime | grep python | grep -v grep | grep -v sh)"

This reads each whole line into proc, rather than each individual word.
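One detail worth knowing about the here-string form: when the command prints nothing, the here-string still delivers a single empty line, so the loop body runs once. A minimal sketch, assuming bash and using a hypothetical list_procs function in place of the ps pipeline:

```shell
#!/bin/bash
# list_procs is a hypothetical stand-in for the ps | grep pipeline.
list_procs() {
    printf '%s\n' '3089 python /var/www/a' '17116 python /var/www/b'
}

count=0
while read -r proc; do
    [ -n "$proc" ] || continue   # skip the empty line produced for empty output
    count=$((count + 1))
    echo "row $count: $proc"
done <<< "$(list_procs)"

# With no output at all, the here-string still delivers one empty line.
empty_iterations=0
while read -r proc; do
    empty_iterations=$((empty_iterations + 1))
done <<< "$(true)"

echo "rows: $count, iterations on empty output: $empty_iterations"
```

The guard on the first line of the loop body is one way to ignore that empty iteration.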

Solution 3 - Linux

When you use a for loop in bash, it splits the given list on whitespace by default. This can be changed with the so-called Internal Field Separator, or IFS for short.

> IFS The Internal Field Separator that is used for word splitting after expansion and to split lines into words with the read builtin command. The default value is "<space><tab><newline>".

For your example, we need to tell IFS to split on newlines only.

IFS=$'\n'

for tbl in $(ps -ewo pid,cmd,etime | grep python | grep -v grep | grep -v sh)
do
   echo "$tbl"
done

This example returns the following output on my machine.

  668 /usr/bin/python /usr/bin/ud    03:05:54
27892 python                            00:01
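Setting IFS at the top level changes word splitting for the rest of the script. One way to contain that, sketched here under the assumption that bash is used, is to make the change local to a function. The each_line helper is hypothetical, and the sample text is adapted from the output above.

```shell
#!/bin/bash
# each_line is a hypothetical helper: it prints every line of its argument in brackets.
each_line() {
    local IFS=$'\n'   # newline-only splitting, limited to this function
    local line
    set -f            # no filename expansion while the unquoted expansion is split
    for line in $1; do
        echo "[$line]"
    done
    set +f            # note: re-enables globbing even if the caller had disabled it
}

each_line $'  668 /usr/bin/python /usr/bin/ud    03:05:54\n27892 python   00:01'

# Outside the function IFS still has its previous value.
printf 'IFS is now: %q\n' "$IFS"
```

Inside the function each row stays intact, including its leading spaces, and the caller's IFS is left untouched.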

Solution 4 - Linux

Here is another bash-based solution, inspired by a comment from @Gordon Davisson.

This requires bash v1.13.5 (1992) or later, because it uses process substitution, as in while read var; do { ... }; done < <(...);.

#!/bin/bash
while IFS= read -r oL ; do {  # reads one line, unmodified, into oL (-r keeps backslashes)
	echo "${oL}";  # prints that line
};
done < <(ps -ewo pid,cmd,etime | grep python | grep -v grep | grep -v sh);
unset oL;

Note: You can use any simple or complex command/command-set inside the <(...) which may have multiple output lines.

And here is a one-liner version:
while IFS= read -r oL ; do { echo "${oL}"; }; done < <(ps -ewo pid,cmd,etime | grep python | grep -v grep | grep -v sh); unset oL;
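A practical difference between this form and piping into while, which the answer does not state explicitly: with bash's default settings (lastpipe not enabled), the loop on the right-hand side of a pipe runs in a subshell, so variables set inside it are lost afterwards. A minimal sketch, where gen is a hypothetical stand-in for any multi-line command:

```shell
#!/bin/bash
# gen is a hypothetical stand-in for any command that prints several lines.
gen() { printf '%s\n' one two three; }

piped=0
gen | while IFS= read -r line; do
    piped=$((piped + 1))               # runs in a subshell, the change is lost afterwards
done

substituted=0
while IFS= read -r line; do
    substituted=$((substituted + 1))   # runs in the current shell
done < <(gen)

echo "pipe: $piped, process substitution: $substituted"
```

If the loop needs to collect results into variables for later use, the process substitution form is the more convenient of the two.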

(Process substitution is not part of POSIX yet, so it is not supported in many POSIX-compliant shells or in the POSIX mode of bash. It has existed in bash since 1992 (28 years ago as of 2020) and in ksh86 (before 1985), so POSIX should have included it.)
If you want something similar to process substitution in a POSIX-compliant shell (e.g. sh, ash, dash, pdksh/mksh), look into named pipes.
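A rough sketch of the named-pipe approach, assuming mkfifo and mktemp -d are available (mktemp is common but not specified by POSIX). The paths and the printf producer are made up; the producer stands in for the ps pipeline.

```shell
#!/bin/sh
# Paths and the sample producer are made up for this sketch.
tmpdir=$(mktemp -d) || exit 1
fifo="$tmpdir/lines.fifo"
mkfifo "$fifo" || exit 1

# Producer writes into the pipe in the background (stand-in for the ps pipeline).
printf '%s\n' 'first line' 'second line' > "$fifo" &

count=0
while IFS= read -r line; do
    count=$((count + 1))
    echo "line $count: $line"
done < "$fifo"

wait                # reap the background producer
rm -rf "$tmpdir"
echo "total: $count"
```

Because the loop reads from a redirection rather than a pipeline, it runs in the current shell, so count is still set after the loop ends.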

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type        Original Author   Original Content on Stackoverflow
Question            adic26            View Question on Stackoverflow
Solution 1 - Linux  hek2mgl           View Answer on Stackoverflow
Solution 2 - Linux  jkdba             View Answer on Stackoverflow
Solution 3 - Linux  flazzarini        View Answer on Stackoverflow
Solution 4 - Linux  atErik            View Answer on Stackoverflow