Can you write a str.replace() using dictionary values in Python?

Python, Dictionary, Str Replace

Python Problem Overview


I have to replace north, south, etc. with N, S, etc. in address fields.

If I have

list = {'NORTH':'N','SOUTH':'S','EAST':'E','WEST':'W'}
address = "123 north anywhere street"

Can I iterate over my dictionary with a for loop to replace values in my address field?

for dir in list[]:
   address.upper().replace(key,value)

I know I'm not even close! But any input would be appreciated if you can use dictionary values like this.

Python Solutions


Solution 1 - Python

address = "123 north anywhere street"

for word, initial in {"NORTH":"N", "SOUTH":"S" }.items():
    address = address.replace(word.lower(), initial)
print(address)

Nice, concise, and readable too.

Solution 2 - Python

you are close, actually:

dictionary = {"NORTH":"N", "SOUTH":"S" } 
for key in dictionary.iterkeys():
    address = address.upper().replace(key, dictionary[key])

Note: Python 3 users should use .keys() instead of .iterkeys(), which no longer exists there:

dictionary = {"NORTH":"N", "SOUTH":"S" } 
for key in dictionary.keys():
    address = address.upper().replace(key, dictionary[key])
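
One side effect of this loop is worth seeing: because .upper() is applied to the whole string, the rest of the address is upper-cased as well. A minimal sketch, assuming the sample address from the question (it is not defined in the snippet above, so it is repeated here):

```python
# Sample data from the question, repeated so the sketch runs on its own.
dictionary = {"NORTH": "N", "SOUTH": "S"}
address = "123 north anywhere street"

for key in dictionary:  # iterating over a dict yields its keys
    address = address.upper().replace(key, dictionary[key])

print(address)  # 123 N ANYWHERE STREET
```

If the original casing matters, one of the regex-based solutions below may be a better fit.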

Solution 3 - Python

One option I don't think anyone has yet suggested is to build a regular expression containing all of the keys and then simply do one replace on the string:

>>> import re
>>> l = {'NORTH':'N','SOUTH':'S','EAST':'E','WEST':'W'}
>>> pattern = '|'.join(sorted(re.escape(k) for k in l))
>>> address = "123 north anywhere street"
>>> re.sub(pattern, lambda m: l.get(m.group(0).upper()), address, flags=re.IGNORECASE)
'123 N anywhere street'

This has the advantage that the regular expression can ignore the case of the input string without modifying it.

If you want to operate only on complete words then you can do that too with a simple modification of the pattern:

>>> pattern = r'\b({})\b'.format('|'.join(sorted(re.escape(k) for k in l)))
>>> address2 = "123 north anywhere southstreet"
>>> re.sub(pattern, lambda m: l.get(m.group(0).upper()), address2, flags=re.IGNORECASE)
'123 N anywhere southstreet'
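
Both variants can be wrapped in one small helper. This is only a sketch: the function name abbreviate and the whole_words parameter are hypothetical, and the keys are tried longest first rather than alphabetically, which is a choice made here and not part of the answer. It assumes all dictionary keys are upper-case.

```python
import re

def abbreviate(text, mapping, whole_words=True):
    """Replace every key of `mapping` found in `text`, ignoring case.

    Hypothetical helper; assumes the keys of `mapping` are upper-case.
    """
    keys = sorted(mapping, key=len, reverse=True)  # longest alternative first
    alternation = "|".join(re.escape(k) for k in keys)
    # With whole_words=True the keys only match between word boundaries.
    pattern = r"\b(?:{})\b".format(alternation) if whole_words else alternation
    return re.sub(pattern, lambda m: mapping[m.group(0).upper()], text,
                  flags=re.IGNORECASE)

directions = {"NORTH": "N", "SOUTH": "S", "EAST": "E", "WEST": "W"}
print(abbreviate("123 North anywhere southstreet", directions))
# 123 N anywhere southstreet
print(abbreviate("123 North anywhere southstreet", directions, whole_words=False))
# 123 N anywhere Sstreet
```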

Solution 4 - Python

You are probably looking for iteritems() (items() in Python 3):

d = {'NORTH':'N','SOUTH':'S','EAST':'E','WEST':'W'}
address = "123 north anywhere street"

# Python 3; on Python 2 use d.iteritems()
for k, v in d.items():
    address = address.upper().replace(k, v)

address is now '123 N ANYWHERE STREET'


Well, if you want to preserve case, whitespace, and nested words (e.g. Southstreet should not be converted to Sstreet), consider using this simple generator expression:

import re

l = {'NORTH':'N','SOUTH':'S','EAST':'E','WEST':'W'}

address = "North 123 East Anywhere Southstreet    West"

new_address = ''.join(l[p.upper()] if p.upper() in l else p for p in re.split(r'(\W+)', address))

new_address is now

N 123 E Anywhere Southstreet    W

Solution 5 - Python

"Translating" a string with a dictionary is a very common requirement. I propose a function that you might want to keep in your toolkit:

def translate(text, conversion_dict, before=None):
    """
    Translate words from a text using a conversion dictionary

    Arguments:
        text: the text to be translated
        conversion_dict: the conversion dictionary
        before: a function to transform the input
        (by default it converts the text to lowercase)
    """
    # if empty:
    if not text: return text
    # preliminary transformation:
    before = before or str.lower
    t = before(text)
    for key, value in conversion_dict.items():
        t = t.replace(key, value)
    return t

Then you can write:

>>> a = {'hello':'bonjour', 'world':'tout-le-monde'}
>>> translate('hello world', a)
'bonjour tout-le-monde'
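
The before argument matters when the dictionary keys are not lowercase. A short sketch, assuming the direction dictionary from the question; the function body is repeated unchanged so the example runs on its own:

```python
def translate(text, conversion_dict, before=None):
    # Same logic as the function above, repeated so the sketch is self-contained.
    if not text:
        return text
    before = before or str.lower
    t = before(text)
    for key, value in conversion_dict.items():
        t = t.replace(key, value)
    return t

directions = {"NORTH": "N", "SOUTH": "S", "EAST": "E", "WEST": "W"}

# The keys are upper-case, so the input must be upper-cased before matching.
print(translate("123 north anywhere street", directions, before=str.upper))
# 123 N ANYWHERE STREET

# With the default (str.lower), none of the upper-case keys match.
print(translate("123 north anywhere street", directions))
# 123 north anywhere street
```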

Solution 6 - Python

I would suggest using a regular expression instead of a simple replace. With replace, there is a risk that parts of words are replaced, which may not be what you want.

import json
import re

with open('filePath.txt') as f:
   data = f.read()

with open('filePath.json') as f:
   glossar = json.load(f)

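for word, initial in glossar.items():
   # re.escape() keeps regex metacharacters in a glossary word from being read as pattern syntax
   data = re.sub(r'\b' + re.escape(word) + r'\b', initial, data)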

print(data)
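
The risk with plain replace mentioned above can be seen in a small comparison. This is a sketch with a made-up one-entry glossary and input string, not the files from the answer:

```python
import re

glossary = {"north": "N"}           # hypothetical glossary entry
text = "123 north northern avenue"  # hypothetical input

plain = text
bounded = text
for word, initial in glossary.items():
    # str.replace() also hits "north" inside "northern".
    plain = plain.replace(word, initial)
    # \b restricts the match to the standalone word.
    bounded = re.sub(r"\b" + re.escape(word) + r"\b", initial, bounded)

print(plain)    # 123 N Nern avenue
print(bounded)  # 123 N northern avenue
```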

Solution 7 - Python

def replace_values_in_string(text, args_dict):
    for key in args_dict.keys():
        text = text.replace(key, str(args_dict[key]))
    return text

Solution 8 - Python

Try:

import re
l = {'NORTH':'N','SOUTH':'S','EAST':'E','WEST':'W'}

address = "123 north anywhere street"

for k, v in l.items():  # Python 3; on Python 2 use l.iteritems()
    t = re.compile(re.escape(k), re.IGNORECASE)
    address = t.sub(v, address)
print(address)

Solution 9 - Python

All of these answers are good, but they miss Python's string substitution. It's simple and quick, but requires your string to be formatted correctly.

address = "123 %(direction)s anywhere street"
print(address % {"direction": "N"})

Solution 10 - Python

Neither replace() nor format() is very precise:

data =  '{content} {address}'
for k,v in {"{content}":"some {address}", "{address}":"New York" }.items():
    data = data.replace(k,v)
# results: some New York New York

'{ {content} {address}'.format(**{'content':'str1', 'address':'str2'})
# results: ValueError: unexpected '{' in field name

It is better to translate with re.sub() if you need precise placement:

import re
def translate(text, kw, ignore_case=False):
    search_keys = map(lambda x:re.escape(x), kw.keys())
    if ignore_case:
        kw = {k.lower():kw[k] for k in kw}
        regex = re.compile('|'.join(search_keys), re.IGNORECASE)
        res = regex.sub( lambda m:kw[m.group().lower()], text)
    else:
        regex = re.compile('|'.join(search_keys))
        res = regex.sub( lambda m:kw[m.group()], text)

    return res

#'score: 99.5% name:%(name)s' %{'name':'foo'}
res = translate( 'score: 99.5% name:{name}', {'{name}':'foo'})
print(res)

res = translate( 'score: 99.5% name:{NAME}', {'{name}':'foo'}, ignore_case=True)
print(res)
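
To see why the single regex pass is more precise, the first example of this answer can be run through both approaches. This is a sketch; translate_once is a hypothetical helper that mirrors only the case-sensitive branch of the function above:

```python
import re

def translate_once(text, kw):
    # Hypothetical helper: one alternation over all keys, one pass over the text.
    regex = re.compile("|".join(re.escape(k) for k in kw))
    return regex.sub(lambda m: kw[m.group()], text)

mapping = {"{content}": "some {address}", "{address}": "New York"}
template = "{content} {address}"

# Chained str.replace() rescans text that an earlier replacement inserted.
chained = template
for k, v in mapping.items():
    chained = chained.replace(k, v)

print(chained)                            # some New York New York
print(translate_once(template, mapping))  # some {address} New York
```

The regex version only looks at the original text, so the "{address}" that arrives as part of a replacement value is left alone.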

Solution 11 - Python

If you're looking for a concise way, you can go for reduce from functools:

from functools import reduce

str_to_replace = "The string for replacement."
replacement_dict = {"The ": "A new ", "for ": "after "}

str_replaced = reduce(lambda x, y: x.replace(*y), [str_to_replace, *list(replacement_dict.items())])
print(str_replaced)
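
An equivalent way to write this, shown here only as a sketch, passes the string as the initializer argument of reduce instead of prepending it to the list. It also makes clear that reduce is just a compact form of the usual loop:

```python
from functools import reduce

str_to_replace = "The string for replacement."
replacement_dict = {"The ": "A new ", "for ": "after "}

# reduce(function, iterable, initializer): the string is the starting value and
# each (old, new) pair is applied in dictionary order.
via_reduce = reduce(lambda text, pair: text.replace(*pair),
                    replacement_dict.items(), str_to_replace)

# The same thing written as an explicit loop.
via_loop = str_to_replace
for old, new in replacement_dict.items():
    via_loop = via_loop.replace(old, new)

print(via_reduce)  # A new string after replacement.
```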

Solution 12 - Python

The advantage of Duncan's approach (Solution 3) is that it is careful not to overwrite previous replacements. For example, if you have {"Shirt": "Tank Top", "Top": "Sweater"}, the other approaches replace "Shirt" with "Tank Sweater".

The following code extends that approach, but sorts the keys so that the longest one is always tried first, and it uses named groups to look up the replacement while searching case-insensitively.

import re
root_synonyms = {'NORTH':'N','SOUTH':'S','EAST':'E','WEST':'W'}
# put the longest search term first. This means the system does not replace "top" before "tank top"
synonym_keys = sorted(root_synonyms.keys(),key=len,reverse=True)
# the groups will be named w0, w1, ... . Determine what each of them should become
number_mapping = {f'w{i}':root_synonyms[key] for i,key in enumerate(synonym_keys) }
# make a regex for each word where "tank top" or "tank  top" are the same
# (escape each word separately and join with \s+; passing r'\s+' as the repl of re.sub raises re.error on Python 3.7+)
search_terms = [r'\s+'.join(re.escape(part) for part in k.split()) for k in synonym_keys]
# wrap each search term in a named group w0, w1, ... with word boundaries
search_terms = [f'(?P<w{i}>\\b{key}\\b)' for i,key in enumerate(search_terms)]
# make one huge regex
search_terms = '|'.join(search_terms)
# compile it for speed
search_re = re.compile(search_terms,re.IGNORECASE)

query = "123 north anywhere street"
result = re.sub(search_re,lambda x: number_mapping[x.lastgroup],query)
print(result)
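
The overwriting problem described at the start of this answer can be reproduced with the Shirt/Top dictionary. This is a simplified sketch without named groups or word boundaries; the input string is made up for illustration:

```python
import re

synonyms = {"Shirt": "Tank Top", "Top": "Sweater"}
text = "Shirt and Top"  # hypothetical input

# Sequential replacement: the "Top" produced by the first step is replaced again.
chained = text
for old, new in synonyms.items():
    chained = chained.replace(old, new)

# Single regex pass: only the original text is scanned, longest key first.
keys = sorted(synonyms, key=len, reverse=True)
pattern = re.compile("|".join(re.escape(k) for k in keys))
single_pass = pattern.sub(lambda m: synonyms[m.group(0)], text)

print(chained)      # Tank Sweater and Sweater
print(single_pass)  # Tank Top and Sweater
```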

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | user1947457 | View Question on Stackoverflow
Solution 1 - Python | Django Doctor | View Answer on Stackoverflow
Solution 2 - Python | Samuele Mattiuzzo | View Answer on Stackoverflow
Solution 3 - Python | Duncan | View Answer on Stackoverflow
Solution 4 - Python | sloth | View Answer on Stackoverflow
Solution 5 - Python | fralau | View Answer on Stackoverflow
Solution 6 - Python | Trafalgar | View Answer on Stackoverflow
Solution 7 - Python | Artem Malikov | View Answer on Stackoverflow
Solution 8 - Python | Adem Öztaş | View Answer on Stackoverflow
Solution 9 - Python | cacti5 | View Answer on Stackoverflow
Solution 10 - Python | ahuigo | View Answer on Stackoverflow
Solution 11 - Python | m7s | View Answer on Stackoverflow
Solution 12 - Python | Jelmer | View Answer on Stackoverflow