What is the underlying data structure for Python lists?

PythonListData Structures

Python Problem Overview


What is the typical underlying data structure used to implement Python's built-in list data type?

Python Solutions


Solution 1 - Python

> List objects are implemented as > arrays. They are optimized for fast > fixed-length operations and incur O(n) > memory movement costs for pop(0) and > insert(0, v) operations which change > both the size and position of the > underlying data representation.

See also: http://docs.python.org/library/collections.html#collections.deque

Btw, I find it interesting that the Python tutorial on data structures recommends using pop(0) to simulate a queue but does not mention O(n) or the deque option.

http://docs.python.org/tutorial/datastructures.html#using-lists-as-queues

Solution 2 - Python

CPython:

typedef struct {
    PyObject_VAR_HEAD
    /* Vector of pointers to list elements.  list[0] is ob_item[0], etc. */
    PyObject **ob_item;

    /* ob_item contains space for 'allocated' elements.  The number
     * currently in use is ob_size.
     * Invariants:
     *     0 <= ob_size <= allocated
     *     len(list) == ob_size
     *     ob_item == NULL implies ob_size == allocated == 0
     * list.sort() temporarily sets allocated to -1 to detect mutations.
     *
     * Items must normally not be NULL, except during construction when
     * the list is not yet visible outside the function that builds it.
     */
    Py_ssize_t allocated;
} PyListObject;

As can be seen on the following line, the list is declared as an array of pointers to PyObjects.

PyObject **ob_item;

Solution 3 - Python

In the Jython implementation, it's an ArrayList<PyObject>.

Solution 4 - Python

Although it may be obvious, worth stating that Python lists are Dynamic arrays (as opposed to Static arrays). This is an important distinction that comes up in job interview questions/academia.

Because the array is dynamic, Python reserves an amount of memory upon declaration, eg:

somelist = []

Because extra memory already is set aside, performing somelist.append() simply writes to the next reserved memory slot(s) and hence is O(1) most of the time. For a static array, typically the array is full (ie. if have 4 bytes, then array size is 4) and appends would always be O(n) because they require reserving an entirely new set of memory (maybe 5 bytes now) and copying contents over.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionNixuzView Question on Stackoverflow
Solution 1 - Pythone1i45View Answer on Stackoverflow
Solution 2 - PythonGeorg SchöllyView Answer on Stackoverflow
Solution 3 - PythonnategoodView Answer on Stackoverflow
Solution 4 - PythonAdam HughesView Answer on Stackoverflow