Associativity of "in" in Python?

PythonSyntax

Python Problem Overview


I'm making a Python parser, and this is really confusing me:

>>> 1 in [] in 'a'
False

>>> (1 in []) in 'a'
TypeError: 'in <string>' requires string as left operand, not bool

>>> 1 in ([] in 'a')
TypeError: 'in <string>' requires string as left operand, not list

How exactly does in work in Python, with regards to associativity, etc.?

Why do no two of these expressions behave the same way?

Python Solutions


Solution 1 - Python

1 in [] in 'a' is evaluated as (1 in []) and ([] in 'a')

Since the first condition (1 in []) is False, the whole condition is evaluated as False; ([] in 'a') is never actually evaluated, so no error is raised.

We can see how Python executes each statement using the dis module:

>>> from dis import dis
>>> dis("1 in [] in 'a'")
  1           0 LOAD_CONST               0 (1)
              2 BUILD_LIST               0
              4 DUP_TOP
              6 ROT_THREE
              8 CONTAINS_OP              0        # `in` is the contains operator
             10 JUMP_IF_FALSE_OR_POP    18        # skip to 18 if the first 
                                                  # comparison is false
             12 LOAD_CONST               1 ('a')  # 12-16 are never executed
             14 CONTAINS_OP              0        # so no error here (14)
             16 RETURN_VALUE
        >>   18 ROT_TWO
             20 POP_TOP
             22 RETURN_VALUE
>>> dis("(1 in []) in 'a'")
  1           0 LOAD_CONST               0 (1)
              2 LOAD_CONST               1 (())
              4 CONTAINS_OP              0        # perform 1 in []
              6 LOAD_CONST               2 ('a')  # now load 'a'
              8 CONTAINS_OP              0        # check if result of (1 in []) is in 'a'
                                                  # throws Error because (False in 'a')
                                                  # is a TypeError
             10 RETURN_VALUE
>>> dis("1 in ([] in 'a')")
  1           0 LOAD_CONST               0 (1)
              2 BUILD_LIST               0
              4 LOAD_CONST               1 ('a')
              6 CONTAINS_OP              0        # perform ([] in 'a'), which is 
                                                  # incorrect, so it throws a TypeError
              8 CONTAINS_OP              0        # if no Error then this would 
                                                  # check if 1 is in the result of ([] in 'a')
             10 RETURN_VALUE

  1. Except that [] is only evaluated once. This doesn't matter in this example but if you (for example) replaced [] with a function that returned a list, that function would only be called once (at most). The documentation explains also this.

Solution 2 - Python

Python does special things with chained comparisons.

The following are evaluated differently:

x > y > z   # in this case, if x > y evaluates to true, then
            # the value of y is used, again, and compared with z

(x > y) > z # the parenthesized form, on the other hand, will first
            # evaluate x > y. And, compare the evaluated result
            # with z, which can be "True > z" or "False > z"

In both cases though, if the first comparison is False, the rest of the statement won't be looked at.

For your particular case,

1 in [] in 'a'   # this is false because 1 is not in []

(1 in []) in a   # this gives an error because we are
                 # essentially doing this: False in 'a'

1 in ([] in 'a') # this fails because you cannot do
                 # [] in 'a'

Also to demonstrate the first rule above, these are statements that evaluate to True.

1 in [1,2] in [4,[1,2]] # But "1 in [4,[1,2]]" is False

2 < 4 > 1               # and note "2 < 1" is also not true

Precedence of Python operators: https://docs.python.org/3/reference/expressions.html#comparisons

Solution 3 - Python

From the documentation:

> Comparisons can be chained arbitrarily, e.g., x < y <= z is equivalent to x < y and y <= z, except that y is evaluated only once (but in both cases z is not evaluated at all when x < y is found to be false).

What this means is, that there no associativity in x in y in z!

The following are equivalent:

1 in  []  in 'a'
# <=>
middle = []
#            False          not evaluated
result = (1 in middle) and (middle in 'a')


(1 in  []) in 'a'
# <=>
lhs = (1 in []) # False
result = lhs in 'a' # False in 'a' - TypeError


1 in  ([] in 'a')
# <=>
rhs = ([] in 'a') # TypeError
result = 1 in rhs

Solution 4 - Python

The short answer, since the long one is already given several times here and in excellent ways, is that the boolean expression is short-circuited, this is has stopped evaluation when a change of true in false or vice versa cannot happen by further evaluation.

(see http://en.wikipedia.org/wiki/Short-circuit_evaluation)

It might be a little short (no pun intended) as an answer, but as mentioned, all other explanation is allready done quite well here, but I thought the term deserved to be mentioned.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
Questionuser541686View Question on Stackoverflow
Solution 1 - PythonAshwini ChaudharyView Answer on Stackoverflow
Solution 2 - PythonAlexander ChenView Answer on Stackoverflow
Solution 3 - Pythonphant0mView Answer on Stackoverflow
Solution 4 - PythonPeterView Answer on Stackoverflow