Finding and replacing elements in a list

PythonListReplace

Python Problem Overview


I have to search through a list and replace all occurrences of one element with another. So far my attempts in code are getting me nowhere, what is the best way to do this?

For example, suppose my list has the following integers

>>> a = [1,2,3,4,5,1,2,3,4,5,1]

and I need to replace all occurrences of the number 1 with the value 10 so the output I need is

>>> a = [10, 2, 3, 4, 5, 10, 2, 3, 4, 5, 10]

Thus my goal is to replace all instances of the number 1 with the number 10.

Python Solutions


Solution 1 - Python

Try using a list comprehension and a conditional expression.

>>> a=[1,2,3,1,3,2,1,1]
>>> [4 if x==1 else x for x in a]
[4, 2, 3, 4, 3, 2, 4, 4]

Solution 2 - Python

You can use the built-in enumerate to get both index and value while iterating the list. Then, use the value to test for a condition and the index to replace that value in the original list:

>>> a = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1]
>>> for i, n in enumerate(a):
...   if n == 1:
...      a[i] = 10
...
>>> a
[10, 2, 3, 4, 5, 10, 2, 3, 4, 5, 10]

Solution 3 - Python

If you have several values to replace, you can also use a dictionary:

a = [1, 2, 3, 4, 1, 5, 3, 2, 6, 1, 1]
replacements = {1:10, 2:20, 3:'foo'}
replacer = replacements.get  # For faster gets.

print([replacer(n, n) for n in a])

> [10, 20, 'foo', 4, 10, 5, 'foo', 20, 6, 10, 10]

Note that this approach works only if the elements to be replaced are hashable. This is because dict keys are required to be hashable.

Solution 4 - Python

List comprehension works well, and looping through with enumerate can save you some memory (b/c the operation's essentially being done in place).

There's also functional programming. See usage of map:

>>> a = [1,2,3,2,3,4,3,5,6,6,5,4,5,4,3,4,3,2,1]
>>> map(lambda x: x if x != 4 else 'sss', a)
[1, 2, 3, 2, 3, 'sss', 3, 5, 6, 6, 5, 'sss', 5, 'sss', 3, 'sss', 3, 2, 1]

Solution 5 - Python

On long lists and rare occurrences its about 3x faster using list.index() - compared to single step iteration methods presented in the other answers.

def list_replace(lst, old=1, new=10):
    """replace list elements (inplace)"""
    i = -1
    try:
        while True:
            i = lst.index(old, i + 1)
            lst[i] = new
    except ValueError:
        pass

Solution 6 - Python

>>> a=[1,2,3,4,5,1,2,3,4,5,1]
>>> item_to_replace = 1
>>> replacement_value = 6
>>> indices_to_replace = [i for i,x in enumerate(a) if x==item_to_replace]
>>> indices_to_replace
[0, 5, 10]
>>> for i in indices_to_replace:
...     a[i] = replacement_value
... 
>>> a
[6, 2, 3, 4, 5, 6, 2, 3, 4, 5, 6]
>>> 

Solution 7 - Python

I know this is a very old question and there's a myriad of ways to do it. The simpler one I found is using numpy package.

import numpy

arr = numpy.asarray([1, 6, 1, 9, 8])
arr[ arr == 8 ] = 0 # change all occurrences of 8 by 0
print(arr)

Solution 8 - Python

My usecase was replacing None with some default value.

I've timed approaches to this problem that were presented here, including the one by @kxr - using str.count.

Test code in ipython with Python 3.8.1:

def rep1(lst, replacer = 0):
    ''' List comprehension, new list '''

    return [item if item is not None else replacer for item in lst]


def rep2(lst, replacer = 0):
    ''' List comprehension, in-place '''    
    lst[:] =  [item if item is not None else replacer for item in lst]

    return lst


def rep3(lst, replacer = 0):
    ''' enumerate() with comparison - in-place '''
    for idx, item in enumerate(lst):
        if item is None:
            lst[idx] = replacer
    
    return lst


def rep4(lst, replacer = 0):
    ''' Using str.index + Exception, in-place '''

    idx = -1
    # none_amount = lst.count(None)
    while True:
        try:
            idx = lst.index(None, idx+1)
        except ValueError:
            break
        else:
            lst[idx] = replacer

    return lst


def rep5(lst, replacer = 0):
    ''' Using str.index + str.count, in-place '''

    idx = -1
    for _ in range(lst.count(None)):
        idx = lst.index(None, idx+1)
        lst[idx] = replacer

    return lst


def rep6(lst, replacer = 0):
    ''' Using map, return map iterator '''

    return map(lambda item: item if item is not None else replacer, lst)


def rep7(lst, replacer = 0):
    ''' Using map, return new list '''

    return list(map(lambda item: item if item is not None else replacer, lst))


lst = [5]*10**6
# lst = [None]*10**6

%timeit rep1(lst)    
%timeit rep2(lst)    
%timeit rep3(lst)    
%timeit rep4(lst)    
%timeit rep5(lst)    
%timeit rep6(lst)    
%timeit rep7(lst)    

I get:

26.3 ms ± 163 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
29.3 ms ± 206 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
33.8 ms ± 191 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
11.9 ms ± 37.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
11.9 ms ± 60.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
260 ns ± 1.84 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
56.5 ms ± 204 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Using the internal str.index is in fact faster than any manual comparison.

I didn't know if the exception in test 4 would be more laborious than using str.count, the difference seems negligible.

Note that map() (test 6) returns an iterator and not an actual list, thus test 7.

Solution 9 - Python

The answers for this old but relevant question are wildly variable in speed.

The fastest of the solution posted by kxr.

However, this is even faster and otherwise not here:

def f1(arr, find, replace):
	# fast and readable
	base=0
	for cnt in range(arr.count(find)):
		offset=arr.index(find, base)
		arr[offset]=replace
		base=offset+1

Here is timing for the various solutions. The faster ones are 3X faster than accepted answer and 5X faster than the slowest answer here.

To be fair, all methods needed to do inlace replacement of the array sent to the function.

Please see timing code below:

def f1(arr, find, replace):
	# fast and readable
	base=0
	for cnt in range(arr.count(find)):
		offset=arr.index(find, base)
		arr[offset]=replace
		base=offset+1
		
def f2(arr,find,replace):
	# accepted answer
	for i,e in enumerate(arr):
		if e==find: 
			arr[i]=replace
		
def f3(arr,find,replace):
	# in place list comprehension
	arr[:]=[replace if e==find else e for e in arr]
	
def f4(arr,find,replace):
	# in place map and lambda -- SLOW
	arr[:]=list(map(lambda x: x if x != find else replace, arr))
	
def f5(arr,find,replace):
	# find index with comprehension
	for i in [i for i, e in enumerate(arr) if e==find]:
		arr[i]=replace
		
def f6(arr,find,replace):
	# FASTEST but a little les clear
	try:
		while True:
			arr[arr.index(find)]=replace
	except ValueError:
		pass	

def f7(lst, old, new):
	"""replace list elements (inplace)"""
	i = -1
	try:
		while 1:
			i = lst.index(old, i + 1)
			lst[i] = new
	except ValueError:
		pass
	
	
import time 	

def cmpthese(funcs, args=(), cnt=1000, rate=True, micro=True):
	"""Generate a Perl style function benchmark"""                   
	def pprint_table(table):
		"""Perl style table output"""
		def format_field(field, fmt='{:,.0f}'):
			if type(field) is str: return field
			if type(field) is tuple: return field[1].format(field[0])
			return fmt.format(field)     

		def get_max_col_w(table, index):
			return max([len(format_field(row[index])) for row in table])         

		col_paddings=[get_max_col_w(table, i) for i in range(len(table[0]))]
		for i,row in enumerate(table):
			# left col
			row_tab=[row[0].ljust(col_paddings[0])]
			# rest of the cols
			row_tab+=[format_field(row[j]).rjust(col_paddings[j]) for j in range(1,len(row))]
			print(' '.join(row_tab))                

	results={}
	for i in range(cnt):
		for f in funcs:
			start=time.perf_counter_ns()
			f(*args)
			stop=time.perf_counter_ns()
			results.setdefault(f.__name__, []).append(stop-start)
	results={k:float(sum(v))/len(v) for k,v in results.items()}		
	fastest=sorted(results,key=results.get, reverse=True)
	table=[['']]
	if rate: table[0].append('rate/sec')
	if micro: table[0].append('\u03bcsec/pass')
	table[0].extend(fastest)
	for e in fastest:
		tmp=[e]
		if rate:
			tmp.append('{:,}'.format(int(round(float(cnt)*1000000.0/results[e]))))

		if micro:
			tmp.append('{:,.1f}'.format(results[e]/float(cnt)))

		for x in fastest:
			if x==e: tmp.append('--')
			else: tmp.append('{:.1%}'.format((results[x]-results[e])/results[e]))
		table.append(tmp) 

	pprint_table(table)                    



if __name__=='__main__':
	import sys
	import time 
	print(sys.version)
	cases=(
		('small, found', 9, 100),
		('small, not found', 99, 100),
		('large, found', 9, 1000),
		('large, not found', 99, 1000)
	)
	for txt, tgt, mul in cases:
		print(f'\n{txt}:')
		arr=[1,2,3,4,5,6,7,8,9,0]*mul 
		args=(arr,tgt,'X')
		cmpthese([f1,f2,f3, f4, f5, f6, f7],args)	

And the results:

3.9.1 (default, Feb  3 2021, 07:38:02) 
[Clang 12.0.0 (clang-1200.0.32.29)]

small, found:
   rate/sec μsec/pass     f4     f3     f5     f2     f6     f7     f1
f4  133,982       7.5     -- -38.8% -49.0% -52.5% -78.5% -78.6% -82.9%
f3  219,090       4.6  63.5%     -- -16.6% -22.4% -64.8% -65.0% -72.0%
f5  262,801       3.8  96.1%  20.0%     --  -6.9% -57.8% -58.0% -66.4%
f2  282,259       3.5 110.7%  28.8%   7.4%     -- -54.6% -54.9% -63.9%
f6  622,122       1.6 364.3% 184.0% 136.7% 120.4%     --  -0.7% -20.5%
f7  626,367       1.6 367.5% 185.9% 138.3% 121.9%   0.7%     -- -19.9%
f1  782,307       1.3 483.9% 257.1% 197.7% 177.2%  25.7%  24.9%     --

small, not found:
   rate/sec μsec/pass     f4     f5     f2     f3     f6     f7     f1
f4   13,846      72.2     -- -40.3% -41.4% -47.8% -85.2% -85.4% -86.2%
f5   23,186      43.1  67.5%     --  -1.9% -12.5% -75.2% -75.5% -76.9%
f2   23,646      42.3  70.8%   2.0%     -- -10.8% -74.8% -75.0% -76.4%
f3   26,512      37.7  91.5%  14.3%  12.1%     -- -71.7% -72.0% -73.5%
f6   93,656      10.7 576.4% 303.9% 296.1% 253.3%     --  -1.0%  -6.5%
f7   94,594      10.6 583.2% 308.0% 300.0% 256.8%   1.0%     --  -5.6%
f1  100,206      10.0 623.7% 332.2% 323.8% 278.0%   7.0%   5.9%     --

large, found:
   rate/sec μsec/pass     f4     f2     f5     f3     f6     f7     f1
f4      145   6,889.4     -- -33.3% -34.8% -48.6% -85.3% -85.4% -85.8%
f2      218   4,593.5  50.0%     --  -2.2% -22.8% -78.0% -78.1% -78.6%
f5      223   4,492.4  53.4%   2.3%     -- -21.1% -77.5% -77.6% -78.2%
f3      282   3,544.0  94.4%  29.6%  26.8%     -- -71.5% -71.6% -72.3%
f6      991   1,009.5 582.4% 355.0% 345.0% 251.1%     --  -0.4%  -2.8%
f7      995   1,005.4 585.2% 356.9% 346.8% 252.5%   0.4%     --  -2.4%
f1    1,019     981.3 602.1% 368.1% 357.8% 261.2%   2.9%   2.5%     --

large, not found:
   rate/sec μsec/pass     f4     f5     f2     f3     f6     f7     f1
f4      147   6,812.0     -- -35.0% -36.4% -48.9% -85.7% -85.8% -86.1%
f5      226   4,424.8  54.0%     --  -2.0% -21.3% -78.0% -78.1% -78.6%
f2      231   4,334.9  57.1%   2.1%     -- -19.6% -77.6% -77.7% -78.2%
f3      287   3,484.0  95.5%  27.0%  24.4%     -- -72.1% -72.2% -72.8%
f6    1,028     972.3 600.6% 355.1% 345.8% 258.3%     --  -0.4%  -2.7%
f7    1,033     968.2 603.6% 357.0% 347.7% 259.8%   0.4%     --  -2.3%
f1    1,057     946.2 619.9% 367.6% 358.1% 268.2%   2.8%   2.3%     --

Solution 10 - Python

I might be a dumb-dumb, but I would write a separate, simple function for this:

def convertElements( oldlist, convert_dict ):
  newlist = []
  for e in oldlist:
    if e in convert_dict:
      newlist.append(convert_dict[e])
    else:
      newlist.append(e)
  return newlist

And then call this as needed like so:

a = [1,2,3,4,5,1,2,3,4,5,1]
a_new = convertElements(a, {1: 10})
## OUTPUT: a_new=[10, 2, 3, 4, 5, 10, 2, 3, 4, 5, 10]

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionJamesView Question on Stackoverflow
Solution 1 - PythonoutisView Answer on Stackoverflow
Solution 2 - Pythonghostdog74View Answer on Stackoverflow
Solution 3 - PythonroipoussiereView Answer on Stackoverflow
Solution 4 - PythondamzamView Answer on Stackoverflow
Solution 5 - PythonkxrView Answer on Stackoverflow
Solution 6 - PythonJohn La RooyView Answer on Stackoverflow
Solution 7 - PythonTiago VieiraView Answer on Stackoverflow
Solution 8 - PythonJayView Answer on Stackoverflow
Solution 9 - PythondawgView Answer on Stackoverflow
Solution 10 - PythonJerry ChenView Answer on Stackoverflow