Python 2.x gotchas and landmines

Python Problem Overview

The purpose of my question is to strengthen my knowledge base with Python and get a better picture of it, which includes knowing its faults and surprises. To keep things specific, I'm only interested in the CPython interpreter.

I'm looking for something similar to what I learned from my PHP landmines question (https://stackoverflow.com/questions/512120/php-landmines-in-general), where some of the answers were well known to me but a couple were borderline horrifying.

Update: Apparently one, maybe two, people are upset that I asked a question that's already partially answered outside of Stack Overflow. As some sort of compromise, here's the URL: http://www.ferg.org/projects/python_gotchas.html

Note that one or two of the answers here are already original, not taken from the site referenced above.

Python Solutions

Solution 1 - Python

Expressions in default arguments are calculated when the function is defined, not when it’s called.

Example: consider defaulting an argument to the current time:

>>> import time
>>> def report(when=time.time()):
...     print when
...
>>> report()
1210294387.19
>>> time.sleep(5)
>>> report()
1210294387.19

The when argument doesn't change. It is evaluated when you define the function. It won't change until the application is re-started.

Strategy: you won't trip over this if you default arguments to None and then do something useful when you see it:

>>> def report(when=None):
...     if when is None:
...         when = time.time()
...     print when
...
>>> report()
1210294762.29
>>> time.sleep(5)
>>> report()
1210294772.23

Exercise: to make sure you've understood: why is this happening?

>>> def spam(eggs=[]):
...     eggs.append("spam")
...     return eggs
...
>>> spam()
['spam']
>>> spam()
['spam', 'spam']
>>> spam()
['spam', 'spam', 'spam']
>>> spam()
['spam', 'spam', 'spam', 'spam']
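
One possible answer to the exercise, as a minimal sketch: the list default is created once, when the function is defined, and every call that omits the argument shares that same list. The None strategy from above applies here too. The snippet is written so it runs unchanged on Python 2 and Python 3.

```python
def spam(eggs=None):
    # Create a fresh list on every call instead of sharing one default object.
    if eggs is None:
        eggs = []
    eggs.append("spam")
    return eggs

first = spam()
second = spam()
print(first)   # ['spam']
print(second)  # ['spam']
```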

Solution 2 - Python

You should be aware of how class variables are handled in Python. Consider the following class hierarchy:

class AAA(object):
    x = 1

class BBB(AAA):
    pass

class CCC(AAA):
    pass

Now, check the output of the following code:

>>> print AAA.x, BBB.x, CCC.x
1 1 1
>>> BBB.x = 2
>>> print AAA.x, BBB.x, CCC.x
1 2 1
>>> AAA.x = 3
>>> print AAA.x, BBB.x, CCC.x
3 2 3

Surprised? You won't be if you remember that class variables are internally handled as dictionaries of a class object. For read operations, if a variable name is not found in the dictionary of current class, the parent classes are searched for it. So, the following code again, but with explanations:

# AAA: {'x': 1}, BBB: {}, CCC: {}
>>> print AAA.x, BBB.x, CCC.x
1 1 1
>>> BBB.x = 2
# AAA: {'x': 1}, BBB: {'x': 2}, CCC: {}
>>> print AAA.x, BBB.x, CCC.x
1 2 1
>>> AAA.x = 3
# AAA: {'x': 3}, BBB: {'x': 2}, CCC: {}
>>> print AAA.x, BBB.x, CCC.x
3 2 3

Same goes for handling class variables in class instances (treat this example as a continuation of the one above):

>>> a = AAA()
# a: {}, AAA: {'x': 3}
>>> print a.x, AAA.x
3 3
>>> a.x = 4
# a: {'x': 4}, AAA: {'x': 3}
>>> print a.x, AAA.x
4 3
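
If you want to check this yourself rather than take the comments on faith, a small sketch using vars() (which exposes a class's own attribute dictionary) shows where the name actually lives. The class names mirror the example above; the variable names before and after are just for illustration.

```python
class AAA(object):
    x = 1

class BBB(AAA):
    pass

# Before assignment, BBB has no 'x' of its own; the lookup falls back to AAA.
before = 'x' in vars(BBB)

BBB.x = 2
# Assignment stores 'x' directly on BBB, shadowing the inherited value.
after = 'x' in vars(BBB)

print(before)  # False
print(after)   # True
print(AAA.x)   # 1
print(BBB.x)   # 2
```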

Solution 3 - Python

Loops and lambdas (or any closure, really): variables are bound by name

funcs = []
for x in range(5):
  funcs.append(lambda: x)

[f() for f in funcs]
# output:
# [4, 4, 4, 4, 4]

A workaround is either to create a separate function or to bind the current value as a default argument:

funcs = []
for x in range(5):
  funcs.append(lambda x=x: x)
[f() for f in funcs]
# output:
# [0, 1, 2, 3, 4]
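
The "separate function" variant could look something like this sketch. The helper name make_func is made up for the example; the point is that each call gets its own scope, so each lambda closes over a different variable.

```python
def make_func(value):
    # 'value' is local to this call, so every lambda gets its own binding.
    return lambda: value

funcs = []
for x in range(5):
    funcs.append(make_func(x))

results = [f() for f in funcs]
print(results)  # [0, 1, 2, 3, 4]
```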

Solution 4 - Python

Dynamic binding makes typos in your variable names surprisingly hard to find. It's easy to spend half an hour fixing a trivial bug.

EDIT: an example...

for item in some_list:
    ... # lots of code
... # more code
for tiem in some_other_list:
    process(item) # oops!

Solution 5 - Python

One of the biggest surprises I ever had with Python is this one:

a = ([42],)
a[0] += [43, 44]

This works as one might expect, except for raising a TypeError after updating the first entry of the tuple! So a will be ([42, 43, 44],) after executing the += statement, but there will be an exception anyway. If you try this on the other hand

a = ([42],)
b = a[0]
b += [43, 44]

you won't get an error.
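
The reason, roughly: a[0] += [43, 44] first extends the list in place and then tries to store the result back into a[0], and it is that final store into the tuple that fails. A small sketch that catches the exception shows both effects at once (the name error is only there to hold the exception for inspection):

```python
a = ([42],)
error = None
try:
    # Step 1: the list is extended in place (succeeds).
    # Step 2: the result is assigned back to a[0] (fails, tuples are immutable).
    a[0] += [43, 44]
except TypeError as exc:
    error = exc

print(a)                     # ([42, 43, 44],)
print(type(error).__name__)  # TypeError
```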

Solution 6 - Python

try:
    int("z")
except IndexError, ValueError:
    pass

The reason this doesn't work is that IndexError is the type of exception you're catching, and ValueError is the name of the variable the exception gets assigned to.

Correct code to catch multiple exceptions is:

try:
    int("z")
except (IndexError, ValueError):
    pass
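
If you also need the exception object, the "as" form (available from Python 2.6 onward, and the only form Python 3 accepts) avoids the comma ambiguity entirely. A minimal sketch; the name caught is just for illustration:

```python
caught = None
try:
    int("z")
except (IndexError, ValueError) as exc:
    # The tuple lists the types to catch; 'exc' is the caught instance.
    caught = exc

print(type(caught).__name__)  # ValueError
```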

Solution 7 - Python

There was a lot of discussion on hidden language features a while back (hidden-features-of-python), where some pitfalls were mentioned (and some of the good stuff too).

Also you might want to check out Python Warts.

But for me, integer division's a gotcha:

>>> 5/2
2

You probably wanted:

>>> 5*1.0/2
2.5

If you really want this (C-like) behaviour, you should write:

>>> 5//2
2

As that will work with floats too (and it will work when you eventually go to Python 3):

>>> 5*1.0//2
2.0
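
One caveat worth a quick sketch: // is floor division, so it is only "C-like" for non-negative operands. With a negative operand it rounds toward negative infinity rather than toward zero. The variable names below are illustrative only.

```python
quotient = 5 // 2         # floor of 2.5
negative = -5 // 2        # floor of -2.5, not truncation toward zero
float_floor = 5.0 // 2    # floor division keeps the float type
true_quotient = 5 / 2.0   # a float operand gives true division

print(quotient)       # 2
print(negative)       # -3
print(float_floor)    # 2.0
print(true_quotient)  # 2.5
```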

GvR explains how integer division came to work the way it does on his History of Python blog.

Solution 8 - Python

Not including an __init__.py in your packages. That one still gets me sometimes.

Solution 9 - Python

List slicing has caused me a lot of grief. I actually consider the following behavior a bug.

Define a list x

>>> x = [10, 20, 30, 40, 50]

Access index 2:

>>> x[2]
30

As you expect.

Slice the list from index 2 and to the end of the list:

>>> x[2:]
[30, 40, 50]

As you expect.

Access index 7:

>>> x[7]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range

Again, as you expect.

However, try to slice the list from index 7 until the end of the list:

>>> x[7:]
[]

???

The remedy is to put a lot of tests when using list slicing. I wish I'd just get an error instead. Much easier to debug.
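
One way to centralise those tests is a small wrapper that raises instead of returning an empty list. This is only a sketch: strict_tail is a made-up helper, and it assumes a non-negative start index.

```python
def strict_tail(seq, start):
    """Return seq[start:], but raise IndexError if start is not a valid index."""
    if not 0 <= start < len(seq):
        raise IndexError("slice start %d out of range" % start)
    return seq[start:]

x = [10, 20, 30, 40, 50]
tail = strict_tail(x, 2)
print(tail)  # [30, 40, 50]

try:
    strict_tail(x, 7)
    raised = False
except IndexError:
    raised = True
print(raised)  # True
```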

Solution 10 - Python

The only gotcha/surprise I've dealt with is CPython's GIL. If for whatever reason you expect Python threads in CPython to run Python code in parallel... well, they don't, and this is pretty well documented by the Python crowd and even Guido himself.

Here is a long but thorough explanation of CPython threading, some of the things going on under the hood, and why true concurrency with CPython isn't possible: http://jessenoller.com/2009/02/01/python-threads-and-the-global-interpreter-lock/

Solution 11 - Python

James Dumay eloquently reminded me of another Python gotcha:

Not all of Python's “included batteries” are wonderful.

James’ specific example was the HTTP libraries: httplib, urllib, urllib2, urlparse, mimetools, and ftplib. Some of the functionality is duplicated, and some of the functionality you'd expect is completely absent, e.g. redirect handling. Frankly, it's horrible.

If I ever have to grab something via HTTP these days, I use the urlgrabber module forked from the Yum project.

Solution 12 - Python

Floats are not printed at full precision by default (without repr):

x = 1.0 / 3
y = 0.333333333333
print x  #: 0.333333333333
print y  #: 0.333333333333
print x == y  #: False

repr prints too many digits:

print repr(x)  #: 0.33333333333333331
print repr(y)  #: 0.33333333333300003
print x == 0.3333333333333333  #: True
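
If what you actually want is "close enough", comparing with an explicit tolerance sidesteps the printing question. A minimal sketch; the tolerance of 1e-9 is an arbitrary choice for this example, not a recommendation.

```python
x = 1.0 / 3
y = 0.333333333333

exactly_equal = (x == y)
# Compare the absolute difference against a tolerance chosen for the data.
close_enough = abs(x - y) < 1e-9

print(exactly_equal)  # False
print(close_enough)   # True
```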

Solution 13 - Python

Unintentionally mixing oldstyle and newstyle classes can cause seemingly mysterious errors.

Say you have a simple class hierarchy consisting of superclass A and subclass B. When B is instantiated, A's constructor must be called first. The code below correctly does this:

class A(object):
    def __init__(self):
        self.a = 1

class B(A):
    def __init__(self):
        super(B, self).__init__()
        self.b = 1

b = B()

But if you forget to make A a newstyle class and define it like this:

class A:
    def __init__(self):
        self.a = 1

you get this traceback:

Traceback (most recent call last):
  File "AB.py", line 11, in <module>
    b = B()
  File "AB.py", line 7, in __init__
    super(B, self).__init__()
TypeError: super() argument 1 must be type, not classobj

Two other questions relating to this issue are 489269 and 770134
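
If A comes from code you can't change to a newstyle class, calling the parent initializer explicitly is one way around super(). A sketch using the same class names as above (in Python 3 every class is newstyle, so the snippet also runs there):

```python
class A:
    def __init__(self):
        self.a = 1

class B(A):
    def __init__(self):
        # Name the parent class directly instead of going through super().
        A.__init__(self)
        self.b = 1

b = B()
print(b.a)  # 1
print(b.b)  # 1
```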

Solution 14 - Python

def f():
    x += 1

x = 42
f()

results in an UnboundLocalError, because local names are detected statically. A different example would be

def f():
    print x
    x = 43

x = 42
f()
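
To see the second example fail without crashing the script, here is a sketch that catches the error inside the function. The helper g is an illustrative alternative that takes the value as a parameter instead of reaching for the outer name.

```python
x = 42

def f():
    try:
        value = x  # x is local here because of the assignment further down
    except UnboundLocalError as exc:
        return type(exc).__name__
    x = 43
    return value

def g(x):
    # Passing the value in avoids the local/outer name clash altogether.
    return x + 1

result = f()
fixed = g(x)
print(result)  # UnboundLocalError
print(fixed)   # 43
```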

Solution 15 - Python

You cannot use locals()['x'] = whatever to change local variable values as you might expect.

This works:

>>> x = 1
>>> x
1
>>> locals()['x'] = 2
>>> x
2

BUT:

>>> def test():
...     x = 1
...     print x
...     locals()['x'] = 2
...     print x  # *** prints 1, not 2 ***
...
>>> test()
1
1

This actually burnt me in an answer here on SO, since I had tested it outside a function and got the change I wanted. Afterwards, I found it mentioned and contrasted to the case of globals() in "Dive Into Python." See example 8.12. (Though it does not note that the change via locals() will work at the top level as I show above.)
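
The function case as a runnable sketch, using a return value instead of print so it behaves the same on Python 2 and 3. If you need name-to-value lookups you can update, an ordinary dict of your own is the safer tool; the name settings below is made up for the example.

```python
def test():
    x = 1
    # Writing to the dict returned by locals() does not rebind the local x.
    locals()['x'] = 2
    return x

result = test()
print(result)  # 1

# An explicit dict behaves the way you hoped locals() would.
settings = {'x': 1}
settings['x'] = 2
print(settings['x'])  # 2
```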

Solution 16 - Python

`x += [...]` is not the same as `x = x + [...]` when `x` is a list

>>> x = y = [1,2,3]
>>> x = x + [4]
>>> x == y
False

>>> x = y = [1,2,3]
>>> x += [4]
>>> x == y
True

`x = x + [4]` creates a new list, while `x += [4]` modifies the existing list in place.
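
Checking identity with "is" makes the difference explicit. A short sketch; the flag names are only for illustration.

```python
x = y = [1, 2, 3]
x = x + [4]            # builds a new list and rebinds x
rebound_shares = x is y
y_after_plus = list(y)

x = y = [1, 2, 3]
x += [4]               # extends the existing list; y sees the change
inplace_shares = x is y

print(rebound_shares)  # False
print(y_after_plus)    # [1, 2, 3]
print(inplace_shares)  # True
print(y)               # [1, 2, 3, 4]
```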

Solution 17 - Python

List repetition with nested lists

This caught me out today and wasted an hour of my time debugging:

>>> x = [[]]*5
>>> x[0].append(0)

# Expect x equals [[0], [], [], [], []]
>>> x
[[0], [0], [0], [0], [0]]   # Oh dear

Explanation: https://stackoverflow.com/questions/1959744/python-list-problem
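
In short, the repetition copies references, not lists. A quick sketch that checks this with "is", and builds independent inner lists with a comprehension instead:

```python
x = [[]] * 5
# Every slot refers to the very same inner list object.
shared = all(item is x[0] for item in x)

# A comprehension evaluates [] once per iteration, giving five distinct lists.
y = [[] for _ in range(5)]
y[0].append(0)

print(shared)  # True
print(y)       # [[0], [], [], [], []]
```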

Solution 18 - Python

Using class variables when you want instance variables. Most of the time this doesn't cause problems, but if it's a mutable value it causes surprises.

class Foo(object):
    x = {}

But:

>>> f1 = Foo()
>>> f2 = Foo()
>>> f1.x['a'] = 'b'
>>> f2.x
{'a': 'b'}

You almost always want instance variables, which require you to assign inside __init__:

class Foo(object):
    def __init__(self):
        self.x = {}

Solution 19 - Python

Python 2 has some surprising behaviour with comparisons:

>>> print x
0
>>> print y
1
>>> x < y
False

What's going on? repr() to the rescue:

>>> print "x: %r, y: %r" % (x, y)
x: '0', y: 1
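
Python 2 will happily order a str against an int, whereas Python 3 raises TypeError for the same comparison. Converting explicitly before comparing works on both. A minimal sketch with the values from the session above:

```python
x = '0'  # a string that merely looks like a number
y = 1

# Convert first so that numbers are compared with numbers.
compared = int(x) < y
print(compared)  # True
```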

Solution 20 - Python

If you assign to a variable inside a function, Python assumes that the variable is defined inside that function:

>>> x = 1
>>> def increase_x():
...     x += 1
... 
>>> increase_x()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in increase_x
UnboundLocalError: local variable 'x' referenced before assignment

Use global x (or nonlocal x in Python 3) to declare you want to set a variable defined outside your function.

Solution 21 - Python

The values of range(end_val) are not only strictly smaller than end_val, but strictly smaller than int(end_val). For a float argument to range, this might be an unexpected result:

from future.builtins import range
list(range(2.89))
[0, 1]

Solution 22 - Python

Due to 'truthiness' this makes sense:

>>> bool(1)
True

but you might not expect it to go the other way:

>>> float(True)
1.0

This can be a gotcha if you're converting strings to numeric and your data has True/False values.
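
If booleans should never pass as numbers in your data, one option is a converter that rejects them up front. This is only a sketch; to_float_strict is a made-up helper, not something from the standard library.

```python
def to_float_strict(value):
    # bool is a subclass of int, so it has to be rejected explicitly.
    if isinstance(value, bool):
        raise TypeError("refusing to convert bool %r to float" % (value,))
    return float(value)

converted = to_float_strict("2.5")
print(converted)  # 2.5

try:
    to_float_strict(True)
    rejected = False
except TypeError:
    rejected = True
print(rejected)  # True
```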

Solution 23 - Python

If you create a list of lists this way:

arr = [[2]] * 5 
print arr 
[[2], [2], [2], [2], [2]]

Then this creates a list whose elements all point to the same object! This can cause real confusion. Consider this:

arr[0][0] = 5

then if you print arr

print arr
[[5], [5], [5], [5], [5]]

The proper way of initializing the array is for example with a list comprehension:

arr = [[2] for _ in range(5)]

arr[0][0] = 5

print arr

[[5], [2], [2], [2], [2]]

Content Type	Original Author	Original Content on Stackoverflow
Question	David	View Question on Stackoverflow
Solution 1 - Python	Garth Kidd	View Answer on Stackoverflow
Solution 2 - Python	DzinX	View Answer on Stackoverflow
Solution 3 - Python	Richard Levasseur	View Answer on Stackoverflow
Solution 4 - Python	Algorias	View Answer on Stackoverflow
Solution 5 - Python	Sven Marnach	View Answer on Stackoverflow
Solution 6 - Python	user537122	View Answer on Stackoverflow
Solution 7 - Python	Tom Dunham	View Answer on Stackoverflow
Solution 8 - Python	Jason Baker	View Answer on Stackoverflow
Solution 9 - Python	Viktiglemma	View Answer on Stackoverflow
Solution 10 - Python	David	View Answer on Stackoverflow
Solution 11 - Python	Garth Kidd	View Answer on Stackoverflow
Solution 12 - Python	pts	View Answer on Stackoverflow
Solution 13 - Python	Dawie Strauss	View Answer on Stackoverflow
Solution 14 - Python	Sven Marnach	View Answer on Stackoverflow
Solution 15 - Python	Anon	View Answer on Stackoverflow
Solution 16 - Python	mchen	View Answer on Stackoverflow
Solution 17 - Python	mchen	View Answer on Stackoverflow
Solution 18 - Python	Wilfred Hughes	View Answer on Stackoverflow
Solution 19 - Python	Wilfred Hughes	View Answer on Stackoverflow
Solution 20 - Python	Wilfred Hughes	View Answer on Stackoverflow
Solution 21 - Python	jolvi	View Answer on Stackoverflow
Solution 22 - Python	Bryan S	View Answer on Stackoverflow
Solution 23 - Python	Bendriss Jaâfar	View Answer on Stackoverflow
