Why was p[:] designed to work differently in these two situations?
Python Problem Overview
p = [1,2,3]
print(p) # [1, 2, 3]
q=p[:] # supposed to do a shallow copy
q[0]=11
print(q) #[11, 2, 3]
print(p) #[1, 2, 3]
# above confirms that q is not p, and is a distinct copy
del p[:] # why is this not creating a copy and deleting that copy ?
print(p) # []
The above confirms that p[:] doesn't work the same way in these two situations, doesn't it?
Considering that in the following code I expect to be working directly with p and not a copy of p:
p[0] = 111
p[1:3] = [222, 333]
print(p) # [111, 222, 333]
I feel del p[:] is consistent with p[:], all of them referencing the original list, but q=p[:] is confusing (to novices like me), as p[:] in this case results in a new list!
My novice expectation would be that q=p[:] should be the same as q=p. Why did the creators allow this special behavior to result in a copy instead?
Python Solutions
Solution 1 - Python
del and assignment are designed consistently; they're just not designed the way you expected them to be. del never deletes objects, it deletes names/references (object deletion only ever happens indirectly: it's the refcount/garbage collector that deletes the objects). Similarly, the assignment operator never copies objects; it always creates or updates names/references.
The del statement and the assignment operator take a reference specification (similar to the concept of an lvalue in C, though the details differ). This reference specification is either a variable name (plain identifier), a __setitem__ key (object in square brackets; __delitem__ in the case of del), or a __setattr__ name (identifier after a dot; __delattr__ in the case of del). This lvalue is not evaluated like an expression, as doing that would make it impossible to assign or delete anything.
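A minimal sketch of the three kinds of reference specification described above. The class name Box and the variable names are hypothetical, chosen only for illustration. In each case del removes a reference, and the list object itself survives as long as something else still refers to it.

```python
class Box:
    """Hypothetical empty class, used only to have an attribute to delete."""


data = [1, 2, 3]
other = data            # a second name for the same list object

del data                # name target: removes the name, not the object
try:
    data
    name_gone = False
except NameError:
    name_gone = True

box = Box()
box.value = other       # attribute target: binds box.value to the list
del box.value           # attribute target: removes the attribute only

table = {"k": other}
del table["k"]          # subscript target: removes the key only

# The list is untouched because the name `other` still refers to it.
assert other == [1, 2, 3]
```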
Consider the symmetry between:
p[:] = [1, 2, 3]
and
del p[:]
In both cases, p[:] works identically because they are both evaluated as an lvalue. On the other hand, in the following code, p[:] is an expression that is fully evaluated into an object:
q = p[:]
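The difference can be observed with a second name bound to the same list. This is only a sketch; the name alias is an assumption added for illustration.

```python
p = [1, 2, 3]
alias = p                 # second reference to the same list object

p[:] = [4, 5, 6]          # p[:] as a target: the existing list is changed in place
assert alias is p
assert alias == [4, 5, 6]

q = p[:]                  # p[:] as an expression: evaluated into a new list
assert q is not p
assert q == [4, 5, 6]

del p[:]                  # p[:] as a target again: the existing list is emptied
assert alias == []
assert q == [4, 5, 6]     # the copy made earlier is unaffected
```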
Solution 2 - Python
del on a container item is just a call to __delitem__ with the index as its argument, just as subscripting with [n] is a call to the __getitem__ method of the container instance with index n.
So when you evaluate p[:] you are creating a new sequence of items, and when you run del p[:] the deletion is passed to __delitem__ with the slice as its argument, which removes every item that the slice selects.
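Spelled out with explicit method calls, the three forms look roughly like this. It is a sketch for a plain list; the variable name everything is hypothetical, and the interpreter actually looks these methods up on the type rather than the instance, which makes no difference here.

```python
p = [1, 2, 3]
everything = slice(None, None, None)   # the object a bare ":" in the brackets stands for

q = p.__getitem__(everything)          # same effect as: q = p[:]
assert q == [1, 2, 3]
assert q is not p

p.__setitem__(everything, [4, 5])      # same effect as: p[:] = [4, 5]
assert p == [4, 5]

p.__delitem__(everything)              # same effect as: del p[:]
assert p == []
assert q == [1, 2, 3]                  # q was built before p was changed
```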
Solution 3 - Python
As others have stated, del p[:] deletes all items in p, BUT will not affect q. To go into further detail, the list docs refer to just this:
> All slice operations return a new list containing the requested elements. This means that the following slice returns a new (shallow) copy of the list:
>
> >>> squares = [1, 4, 9, 16, 25]
> ...
> >>> squares[:]
> [1, 4, 9, 16, 25]
So q=p[:] creates a (shallow) copy of p as a separate list, and upon further inspection it does point to a completely separate location in memory.
>>> p = [1,2,3]
>>> q=p[:]
>>> id(q)
139646232329032
>>> id(p)
139646232627080
This is explained better in the copy module:
> A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.
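The documentation of the del statement is also relevant here:
> Deletion of a target list recursively deletes each target, from left to right.
Note that "target list" in that sentence means the comma-separated targets of a del statement (as in del a, b), not a Python list object. For a slicing target such as p[:], the deletion is passed to the list itself, which removes the selected items.
So if we use del p[:] we are deleting the contents of p in place, whereas q is not altered: as stated earlier, it references a separate list, although it started out with the same items: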
>>> del p[:]
>>> p
[]
>>> q
[1, 2, 3]
In fact this is also referenced in the list docs, in the list.copy and list.clear methods:
> list.copy()
> Return a shallow copy of the list. Equivalent to a[:].
> list.clear()
> Remove all items from the list. Equivalent to del a[:].
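A short sketch of the two documented equivalences, using a nested list so that the "shallow" part is visible. The variable names are chosen here for illustration only.

```python
a = [[1, 2], [3, 4]]

b = a.copy()
c = a[:]
assert b == c == a
assert b is not a and c is not a     # both are new outer lists
assert b[0] is a[0]                  # shallow: the inner lists are shared
assert c[0] is a[0]

a[0].append(99)                      # a change to a shared inner list...
assert c[0] == [1, 2, 99]            # ...is visible through the copy

alias = a
a.clear()                            # empties the existing list in place
assert alias == []

d = [1, 2, 3]
alias_d = d
del d[:]                             # same observable effect as d.clear()
assert alias_d == []

assert b == [[1, 2, 99], [3, 4]]     # the copies keep their own outer list
```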
Solution 4 - Python
Basically the slice syntax can be used in 3 different contexts:
- Accessing, i.e. x = foo[:]
- Setting, i.e. foo[:] = x
- Deleting, i.e. del foo[:]
And in these contexts the values put in the square brackets just select the items. It is designed so that the "slice" is used consistently in each of these cases:
- So x = foo[:] gets all elements in foo and assigns them to x. This is basically a shallow copy.
- But foo[:] = x will replace all elements in foo with the elements in x.
- And when deleting, del foo[:] will delete all elements in foo.
However this behavior is customizable as explained by 3.3.7. Emulating container types:
> ## object.__getitem__(self, key)
>
> Called to implement evaluation of self[key]. For sequence types, the accepted keys should be integers and slice objects. Note that the special interpretation of negative indexes (if the class wishes to emulate a sequence type) is up to the __getitem__() method. If key is of an inappropriate type, TypeError may be raised; if of a value outside the set of indexes for the sequence (after any special interpretation of negative values), IndexError should be raised. For mapping types, if key is missing (not in the container), KeyError should be raised.
>
> ### Note
>
> for loops expect that an IndexError will be raised for illegal indexes to allow proper detection of the end of the sequence.
>
> ## object.__setitem__(self, key, value)
>
> Called to implement assignment to self[key]. Same note as for __getitem__(). This should only be implemented for mappings if the objects support changes to the values for keys, or if new keys can be added, or for sequences if elements can be replaced. The same exceptions should be raised for improper key values as for the __getitem__() method.
>
> ## object.__delitem__(self, key)
>
> Called to implement deletion of self[key]. Same note as for __getitem__(). This should only be implemented for mappings if the objects support removal of keys, or for sequences if elements can be removed from the sequence. The same exceptions should be raised for improper key values as for the __getitem__() method.
(Emphasis mine)
So in theory any container type could implement this however it wants. In practice, many container types follow the list implementation.
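To see which of the three methods each syntax reaches, one option is a list subclass that records the calls. LoggingList and its calls attribute are hypothetical names invented for this sketch; they are not part of any library.

```python
class LoggingList(list):
    """Hypothetical list subclass that records which special method was called."""

    def __init__(self, *args):
        super().__init__(*args)
        self.calls = []          # (method name, key) pairs, in call order

    def __getitem__(self, key):
        self.calls.append(("__getitem__", key))
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        self.calls.append(("__setitem__", key))
        super().__setitem__(key, value)

    def __delitem__(self, key):
        self.calls.append(("__delitem__", key))
        super().__delitem__(key)


foo = LoggingList([1, 2, 3])
x = foo[:]         # accessing
foo[:] = [4, 5]    # setting
del foo[:]         # deleting

# Each statement reached a different method, all with the same slice key.
for name, key in foo.calls:
    print(name, key)
```

In all three cases the key is the same slice object; only the method that receives it differs, which is why a container type is free to give each context its own behavior.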
Solution 5 - Python
I'm not sure if you want this sort of answer. In words, p[:] means "select all elements of p". If you use it in
q=p[:]
Then it can be read as "take all elements of p, collect them in a new list, and bind that list to q". On the other hand, using
q=p
Just means "assign the address of p to q" or "make q a pointer to p", which can be confusing if you come from languages that handle pointers explicitly.
Therefore, using it in del, like
del p[:]
Just means "delete all elements of p".
Hope this helps.
Solution 6 - Python
Historical reasons, mainly.
In early versions of Python, iterators and generators weren't really a thing. Most ways of working with sequences just returned lists: range(), for example, returned a fully-constructed list containing the numbers.
So it made sense for slices, when used on the right-hand side of an assignment, to return a list. a[i:j:s] returned a new list containing selected elements from a. And so a[:] on the right-hand side of an assignment would return a new list containing all the elements of a, that is, a shallow copy: this was perfectly consistent at the time.
On the other hand, brackets on the left side of an assignment always modified the original list: that was the precedent set by a[i] = d, and that precedent was followed by del a[i], and then by del a[i:j].
Time passed, and copying values and instantiating new lists all over the place was seen as unnecessary and expensive. Nowadays, range() returns a lazy range object that produces each number only as it's requested, and iterating over a slice could potentially work the same way, but the idiom of copy = original[:] is too well-entrenched as a historical artifact.
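For comparison, lazy iteration over part of a list is already available through itertools.islice in the standard library. This sketch only illustrates the contrast with the copying slice; it is not something this answer proposes as a replacement, and the variable names are chosen for illustration.

```python
from itertools import islice

r = range(10)
assert not isinstance(r, list)      # range() no longer builds a list up front
assert r[2] == 2                    # but it still supports indexing

original = [10, 20, 30, 40]

copied = original[1:3]              # eager: a new list is built immediately
lazy = islice(original, 1, 3)       # lazy: items are produced on demand

assert copied == [20, 30]
assert list(lazy) == [20, 30]
assert list(lazy) == []             # an islice iterator is exhausted after one pass
```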
In NumPy, by the way, this isn't the case: ref = original[:] will make a view onto the same data rather than a copy, which is consistent with how assignment to array slices works.
>>> import numpy as np
>>> a = np.array([1,2,3,4])
>>> b = a[:]
>>> a[1] = 7
>>> b
array([1, 7, 3, 4])
Python 4, if it ever happens, may follow suit. It is, as you've observed, much more consistent with other behavior.