Grep and Python
Tags: Python, Regex, Grep
Python Problem Overview
I need a way of searching a file, grep-style, using a regular expression from the Unix command line. For example, when I type on the command line:
python pythonfile.py 'RE' 'file-to-be-searched'
I need the regular expression 'RE' to be searched for in the file, with the matching lines printed out.
Here's the code I have:
import re
import sys
search_term = sys.argv[1]
f = sys.argv[2]
for line in open(f, 'r'):
    if re.search(search_term, line):
        print line,
        if line == None:
            print 'no matches found'
But when I enter a word which isn't present, 'no matches found' doesn't print.
Python Solutions
Solution 1 - Python
The natural question is why not just use grep?! But assuming you can't...
import re
import sys
file = open(sys.argv[2], "r")
for line in file:
    if re.search(sys.argv[1], line):
        print line,
Things to note:
- search instead of match, to find the pattern anywhere in the string
- the comma (,) after print suppresses the extra newline that print would add (the line already ends with one)
- argv includes the Python file name, so the arguments start at index 1
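The first two notes can be checked directly. The sketch below uses Python 3 syntax, where print is a function and end="" plays the role of the trailing comma. The sample line is made up for illustration.

```python
import re

line = "error: disk full\n"  # hypothetical sample line, newline included

# re.match only succeeds if the pattern matches at the start of the string.
at_start = re.match("disk", line)   # None: "disk" is not at position 0

# re.search scans the whole string, which is the behaviour grep has.
anywhere = re.search("disk", line)  # a match object

if anywhere:
    # end="" stops print from adding a second newline after the line's own.
    print(line, end="")
```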
This doesn't handle multiple arguments (like grep does) or expand wildcards (like the Unix shell would). If you wanted this functionality you could get it using the following:
import re
import sys
import glob
for arg in sys.argv[2:]:
    for file in glob.iglob(arg):
        for line in open(file, 'r'):
            if re.search(sys.argv[1], line):
                print line,
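For Python 3, the same loop might look like the sketch below. The function name grep_globs is hypothetical, and the pattern is compiled once rather than on every line. This is one possible arrangement, not the answer's original code.

```python
import glob
import re
import sys


def grep_globs(pattern, file_patterns):
    """Yield each line matching `pattern` from every file the globs expand to."""
    regex = re.compile(pattern)
    for file_pattern in file_patterns:
        for path in glob.iglob(file_pattern):
            with open(path, "r") as handle:  # the file is closed when done
                for line in handle:
                    if regex.search(line):
                        yield line


# Only runs when a pattern and at least one file argument are given.
if __name__ == "__main__" and len(sys.argv) > 2:
    for matched in grep_globs(sys.argv[1], sys.argv[2:]):
        print(matched, end="")
```

Quoting the wildcard on the command line (for example '*.txt') leaves the expansion to glob rather than to the shell.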
Solution 2 - Python
Concise and memory efficient:
#!/usr/bin/env python
# file: grep.py
import re, sys, collections
collections.deque(map(sys.stdout.write,(l for l in sys.stdin if re.search(sys.argv[1],l))),maxlen=0)
It works like egrep (without too much error handling), e.g.:
cat input-file | grep.py "RE"
And here is the one-liner:
cat input-file | python -c "import re,sys,collections;collections.deque(map(sys.stdout.write,(l for l in sys.stdin if re.search(sys.argv[1],l))),maxlen=0)" "RE"
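Note that collections.deque is required in Python 3 because map now returns a lazy iterator: nothing is written until something consumes it, and a deque with maxlen=0 consumes it without storing anything.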
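To see why the deque is needed, the sketch below shows that in Python 3 map does nothing until it is iterated, and that a deque with maxlen=0 drains an iterator without keeping its items. The sample list is illustrative only.

```python
import collections

seen = []
lazy = map(seen.append, ["a", "b", "c"])  # Python 3: nothing is called yet
before = list(seen)

# A zero-length deque pulls every item from the iterator and discards it,
# so the side effect (seen.append) runs once per element.
collections.deque(lazy, maxlen=0)
after = list(seen)

print(before, after)
```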
Solution 3 - Python
Adapted from a grep in python.
Accepts a list of filenames via sys.argv[2:], does no exception handling:
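#!/usr/bin/env python
import re, sys, os
# Only regular files are searched; other arguments are skipped.
for f in filter(os.path.isfile, sys.argv[2:]):
    for line in open(f).readlines():
        # re.search finds the pattern anywhere in the line, as grep does
        # (re.match would only match at the start of the line).
        if re.search(sys.argv[1], line):
            # Trailing comma: the line already ends with a newline.
            print line,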
sys.argv[1] and sys.argv[2:] respectively work if you run it as a standalone executable, meaning you need to chmod +x the script first.
Solution 4 - Python
- use sys.argv to get the command-line parameters
- use open(), read() to manipulate the file
- use the Python re module to match lines
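Put together, those three steps could look like the sketch below (Python 3). The function name matching_lines is made up for illustration, and read() loads the whole file into memory, so this suits small files only.

```python
import re
import sys


def matching_lines(pattern, path):
    """Return the lines of the file at `path` that contain a match for `pattern`."""
    with open(path, "r") as handle:
        text = handle.read()              # step 2: open() and read()
    return [line for line in text.splitlines()
            if re.search(pattern, line)]  # step 3: the re module


# Only runs when both a pattern and a file name are given.
if __name__ == "__main__" and len(sys.argv) > 2:
    # step 1: sys.argv[1] is the pattern, sys.argv[2] the file name
    for line in matching_lines(sys.argv[1], sys.argv[2]):
        print(line)
```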
Solution 5 - Python
You might be interested in pyp. Citing my other answer:
> "The Pyed Piper", or pyp, is a linux command line text manipulation tool similar to awk or sed, but which uses standard python string and list methods as well as custom functions evolved to generate fast results in an intense production environment.
Solution 6 - Python
The real problem is that the variable line always has a value. The test for "no matches found" is whether there is a match, so the code "if line == None:" should be replaced with "else:".
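Note that an else attached to the inner if would run once for every non-matching line. One way to print the message a single time, sketched here in Python 3 with a hypothetical helper name, is to collect the matches first and then check whether any were found.

```python
import re


def grep_or_message(pattern, lines):
    """Return the matching lines, or a single 'no matches found' entry."""
    regex = re.compile(pattern)
    matches = [line.rstrip("\n") for line in lines if regex.search(line)]
    # The emptiness check happens once, after every line has been examined.
    return matches if matches else ["no matches found"]


# Illustrative input: neither line contains "pear".
for out in grep_or_message("pear", ["apple\n", "banana\n"]):
    print(out)
```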
Solution 7 - Python
You can use python-textops3:
from textops import *
print('\n'.join(cat(f) | grep(search_term)))
With python-textops3 you can use Unix-like commands with pipes.
Solution 8 - Python
Not sure if your question was clear to me but to fix your code just change your if expression like the following:
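import re
import sys
search_term = sys.argv[1]
f = sys.argv[2]
found = False  # becomes True once any line matches
n = 0
for line in open(f, 'r'):
    n = n + 1
    if re.search(search_term, line):
        found = True
        # rstrip() drops trailing whitespace, including the line's newline
        print(f"{line.rstrip()} found at line {n}")
# checked once, after the whole file has been read
if not found:
    print('no matches found')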
PS: I tested it on Python 3.8.10
If you want to use grep itself, you could run:
grep -E '(.*)word(.*)' file.txt || echo "pattern not found"
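The || echo fallback works because grep exits with a non-zero status when nothing matches. A Python script can follow the same convention. The sketch below (Python 3, hypothetical function name) returns 0 when at least one line matched and 1 otherwise.

```python
import re
import sys


def grep_status(pattern, lines, out=sys.stdout):
    """Write matching lines to `out`; return 0 if any matched, else 1."""
    regex = re.compile(pattern)
    status = 1
    for line in lines:
        if regex.search(line):
            out.write(line)
            status = 0
    return status


# Only runs when both a pattern and a file name are given.
if __name__ == "__main__" and len(sys.argv) > 2:
    with open(sys.argv[2], "r") as handle:
        sys.exit(grep_status(sys.argv[1], handle))
```

With that in place, python pythonfile.py 'RE' file.txt || echo "pattern not found" would behave like the grep command above.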