Python3 subprocess output

Tags: Python, Python 3.x, Subprocess

Python Problem Overview


I want to run the Linux word count utility wc to determine the number of lines currently in /var/log/syslog, so that I can detect when it is growing. I've tried various tests, and while I get results back from wc, they include both the line count and the file name (e.g., /var/log/syslog).

So it's returning 1338 /var/log/syslog, but I only want the line count: I want to strip off the /var/log/syslog portion and keep just 1338.

I have tried converting the result from a byte string to a string and then stripping it, but no joy. Decoding and similar attempts all fail to produce the output I'm looking for.

These are some examples of what I get, with 1338 lines in syslog:

  • b'1338 /var/log/syslog\n'
  • 1338 /var/log/syslog

Here's some test code I've written to try and crack this nut, but no solution:

import subprocess

# check_output returns a byte string
stdoutdata = subprocess.check_output("wc --lines /var/log/syslog", shell=True)
print("2A stdoutdata: " + str(stdoutdata))
stdoutdata = stdoutdata.decode("utf-8")
print("2B stdoutdata: " + str(stdoutdata))
stdoutdata = stdoutdata.strip()
print("2C stdoutdata: " + str(stdoutdata))

The output from this is:

  • 2A stdoutdata: b'1338 /var/log/syslog\n'

  • 2B stdoutdata: 1338 /var/log/syslog

  • 2C stdoutdata: 1338 /var/log/syslog

  • 2D stdoutdata: 1338 /var/log/syslog

Python Solutions


Solution 1 - Python

I suggest that you use subprocess.getoutput(), as it does exactly what you want: it runs a command in a shell and returns its output as a string (as opposed to a byte string). You can then split on whitespace and take the first element of the resulting list of strings.

Try this:

import subprocess
stdoutdata = subprocess.getoutput("wc --lines /var/log/syslog")
print("stdoutdata: " + stdoutdata.split()[0])
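The split-and-take-first step can be pulled out into a small helper, which makes it easy to check against the sample output from the question. This is only a sketch: the name parse_wc_count is hypothetical and not part of the subprocess module.

```python
def parse_wc_count(output):
    """Return the leading line count from wc-style output (str or bytes)."""
    # split() with no arguments splits on any run of whitespace and
    # ignores leading and trailing whitespace, including the final newline.
    first_field = output.split()[0]
    # int() accepts both str and bytes that contain ASCII digits.
    return int(first_field)


print(parse_wc_count("1338 /var/log/syslog\n"))  # prints 1338
```

Because the helper returns an int rather than a string, two successive counts can be compared directly to see whether the file has grown.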

Solution 2 - Python

Since Python 3.6 you can make check_output() return a str instead of bytes by giving it an encoding parameter:

from subprocess import check_output
check_output(['wc', '--lines', '/var/log/syslog'], encoding='UTF-8')

But since you just want the count, and both split() and int() are usable with bytes, you don't need to bother with the encoding:

linecount = int(check_output(['wc', '-l', '/var/log/syslog']).split()[0])

While some things might be easier with an external program (e.g., counting log line entries printed by journalctl), in this particular case you don't need to use an external program. The simplest Python-only solution is:

with open('/var/log/syslog', 'rt') as f:
    linecount = len(f.readlines())

This has the disadvantage of reading the entire file into memory. For a huge file, initialize linecount = 0 before opening the file and use a for line in f: linecount += 1 loop instead of readlines(), so that only a small part of the file is in memory at a time.
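The loop-based variant might look like the sketch below. The function name count_lines is hypothetical, and opening the file in binary mode is an assumption of this sketch (it sidesteps decoding errors on unusual bytes in a log file); the answer itself uses text mode.

```python
def count_lines(path):
    """Count lines by iterating over the file instead of loading it whole."""
    linecount = 0
    # Binary mode: no decoding is needed just to count lines.
    with open(path, 'rb') as f:
        # The file object yields one line at a time, so memory use stays small.
        for line in f:
            linecount += 1
    return linecount


# Example call (the path is the one from the question):
# linecount = count_lines('/var/log/syslog')
```

One difference from wc -l: a final line without a trailing newline is counted here, whereas wc -l counts newline characters only.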

Solution 3 - Python

To avoid invoking a shell and decoding filenames that might be an arbitrary byte sequence (except '\0') on *nix, you could pass the file as stdin:

import subprocess

with open(b'/var/log/syslog', 'rb') as file:
    nlines = int(subprocess.check_output(['wc', '-l'], stdin=file))
print(nlines)

Or you could ignore any decoding errors:

import subprocess

stdoutdata = subprocess.check_output(['wc', '-l', '/var/log/syslog'])
nlines = int(stdoutdata.decode('ascii', 'ignore').partition(' ')[0])
print(nlines)

Solution 4 - Python

This one is equivalent to Curt J. Sampson's answer and also returns a string:

subprocess.check_output('wc -l /path/to/your/file | cut -d " " -f1', universal_newlines=True, shell=True)

From the docs:

> If encoding or errors are specified, or text is true, file objects for stdin, stdout and stderr are opened in text mode using the specified encoding and errors or the io.TextIOWrapper default. The universal_newlines argument is equivalent to text and is provided for backwards compatibility. By default, file objects are opened in binary mode.

Something similar, but a bit more complex using subprocess.run():

subprocess.run(command, shell=True, check=True, universal_newlines=True, stdout=subprocess.PIPE).stdout

This works because subprocess.check_output(...) is equivalent to subprocess.run(..., check=True, stdout=subprocess.PIPE).stdout.
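The relationship between the two calls can be checked with a small sketch. To keep it portable, it assumes a stand-in command (the current Python interpreter printing a wc-style line) rather than wc itself, and it passes the command as a list without shell=True; both choices are for illustration only.

```python
import subprocess
import sys

# Hypothetical stand-in for the wc call: prints a wc-style line.
command = [sys.executable, "-c", "print('1338 /var/log/syslog')"]

via_check_output = subprocess.check_output(command, universal_newlines=True)
via_run = subprocess.run(
    command,
    check=True,                # raise CalledProcessError on a non-zero exit
    universal_newlines=True,   # return str instead of bytes
    stdout=subprocess.PIPE,    # capture standard output
).stdout

# Both results are str with the same text, trailing newline included.
print(via_check_output == via_run)
```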

Solution 5 - Python

getoutput (and its closer relative getstatusoutput) is not a direct replacement for check_output: there are security changes in 3.x that prevent some previously working commands from working that way (my script was trying to work with iptables and failed with the new commands). It is better to adapt to the new Python 3 output and add the argument universal_newlines=True:

check_output(command, universal_newlines=True)

This call behaves as you would expect check_output to, but returns string output instead of bytes. It's a direct replacement.
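A quick way to see the difference in return type is the sketch below. It assumes a stand-in command (the current Python interpreter printing a number) so that it does not depend on any particular external program being installed.

```python
import subprocess
import sys

# Hypothetical stand-in command that prints a line count.
command = [sys.executable, "-c", "print(1338)"]

raw = subprocess.check_output(command)                            # bytes
text = subprocess.check_output(command, universal_newlines=True)  # str

print(type(raw).__name__, type(text).__name__)
# int() tolerates surrounding whitespace, so the trailing newline is harmless.
print(int(raw), int(text))
```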

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | user2565677 | View Question on Stackoverflow
Solution 1 - Python | Joseph Dunn | View Answer on Stackoverflow
Solution 2 - Python | cjs | View Answer on Stackoverflow
Solution 3 - Python | jfs | View Answer on Stackoverflow
Solution 4 - Python | Catalin B. | View Answer on Stackoverflow
Solution 5 - Python | tk421storm | View Answer on Stackoverflow