How to test if a list contains another list as a contiguous subsequence?

Python Problem Overview

How can I test if a list contains another list (ie. it's a contiguous subsequence). Say there was a function called contains:

contains([1,2], [-1, 0, 1, 2]) # Returns [2, 3] (contains returns [start, end])
contains([1,3], [-1, 0, 1, 2]) # Returns False
contains([1, 2], [[1, 2], 3]) # Returns False
contains([[1, 2]], [[1, 2], 3]) # Returns [0, 0]

Edit:

contains([2, 1], [-1, 0, 1, 2]) # Returns False
contains([-1, 1, 2], [-1, 0, 1, 2]) # Returns False
contains([0, 1, 2], [-1, 0, 1, 2]) # Returns [1, 3]

Python Solutions

Solution 1 - Python

If all items are unique, you can use sets.

>>> items = set([-1, 0, 1, 2])
>>> set([1, 2]).issubset(items)
True
>>> set([1, 3]).issubset(items)
False

Solution 2 - Python

There's an all() and any() function to do this. To check if big contains ALL elements in small

result = all(elem in big for elem in small)

To check if small contains ANY elements in big

result = any(elem in big for elem in small)

the variable result would be boolean (TRUE/FALSE).

Solution 3 - Python

Here is my version:

def contains(small, big):
    for i in xrange(len(big)-len(small)+1):
        for j in xrange(len(small)):
            if big[i+j] != small[j]:
                break
        else:
            return i, i+len(small)
    return False

It returns a tuple of (start, end+1) since I think that is more pythonic, as Andrew Jaffe points out in his comment. It does not slice any sublists so should be reasonably efficient.

One point of interest for newbies is that it uses the else clause on the for statement - this is not something I use very often but can be invaluable in situations like this.

This is identical to finding substrings in a string, so for large lists it may be more efficient to implement something like the Boyer-Moore algorithm.

Note: If you are using Python3, change xrange to range.

Solution 4 - Python

May I humbly suggest the Rabin-Karp algorithm if the big list is really big. The link even contains almost-usable code in almost-Python.

Solution 5 - Python

This works and is fairly fast since it does the linear searching using the builtin list.index() method and == operator:

def contains(sub, pri):
    M, N = len(pri), len(sub)
    i, LAST = 0, M-N+1
    while True:
        try:
            found = pri.index(sub[0], i, LAST) # find first elem in sub
        except ValueError:
            return False
        if pri[found:found+N] == sub:
            return [found, found+N-1]
        else:
            i = found+1

Solution 6 - Python

After OP's edit:

def contains(small, big):
    for i in xrange(1 + len(big) - len(small)):
        if small == big[i:i+len(small)]:
            return i, i + len(small) - 1
    return False

Solution 7 - Python

If we refine the problem talking about testing if a list contains another list with as a sequence, the answer could be the next one-liner:

def contains(subseq, inseq):
    return any(inseq[pos:pos + len(subseq)] == subseq for pos in range(0, len(inseq) - len(subseq) + 1))

Here unit tests I used to tune up this one-liner:

https://gist.github.com/anonymous/6910a85b4978daee137f

Solution 8 - Python

a=[[1,2] , [3,4] , [0,5,4]]
print(a.__contains__([0,5,4]))

It provides true output.

a=[[1,2] , [3,4] , [0,5,4]]
print(a.__contains__([1,3]))

It provides false output.

Solution 9 - Python

Here's a straightforward algorithm that uses list methods:

#!/usr/bin/env python

def list_find(what, where):
    """Find `what` list in the `where` list.

    Return index in `where` where `what` starts
    or -1 if no such index.

    >>> f = list_find
    >>> f([2, 1], [-1, 0, 1, 2])
    -1
    >>> f([-1, 1, 2], [-1, 0, 1, 2])
    -1
    >>> f([0, 1, 2], [-1, 0, 1, 2])
    1
    >>> f([1,2], [-1, 0, 1, 2])
    2
    >>> f([1,3], [-1, 0, 1, 2])
    -1
    >>> f([1, 2], [[1, 2], 3])
    -1
    >>> f([[1, 2]], [[1, 2], 3])
    0
    """
    if not what: # empty list is always found
        return 0
    try:
        index = 0
        while True:
            index = where.index(what[0], index)
            if where[index:index+len(what)] == what:
                return index # found
            index += 1 # try next position
    except ValueError:
        return -1 # not found

def contains(what, where):
    """Return [start, end+1] if found else empty list."""
    i = list_find(what, where)
    return [i, i + len(what)] if i >= 0 else [] #NOTE: bool([]) == False

if __name__=="__main__":
    import doctest; doctest.testmod()

Solution 10 - Python

Smallest code:

def contains(a,b):
    str(a)[1:-1].find(str(b)[1:-1])>=0

Solution 11 - Python

Here is my answer. This function will help you to find out whether B is a sub-list of A. Time complexity is O(n).

`def does_A_contain_B(A, B): #remember now A is the larger list
    b_size = len(B)
    for a_index in range(0, len(A)):
        if A[a_index : a_index+b_size]==B:
            return True
    else:
        return False`

Solution 12 - Python

> I've Summarized and evaluated Time taken by different techniques

Used methods are:

def containsUsingStr(sequence, element:list):
    return str(element)[1:-1] in str(sequence)[1:-1]


def containsUsingIndexing(sequence, element:list):
    lS, lE = len(sequence), len(element)
    for i in range(lS - lE + 1):
        for j in range(lE):
            if sequence[i+j] != element[j]: break
        else: return True
    return False


def containsUsingSlicing(sequence, element:list):
    lS, lE = len(sequence), len(element)
    for i in range(lS - lE + 1):
        if sequence[i : i+lE] == element: return True
    return False


def containsUsingAny(sequence:list, element:list):
    lE = len(element)
    return any(element == sequence[i:i+lE] for i in range(len(sequence)-lE+1))

Code for Time analysis (averaging over 1000 iterations):

from time import perf_counter

functions = (containsUsingStr, containsUsingIndexing, containsUsingSlicing, containsUsingAny)
fCount = len(functions)


for func in functions:
    print(str.ljust(f'Function : {func.__name__}', 32), end='   ::   Return Values: ')
    print(func([1,2,3,4,5,5], [3,4,5,5]) , end=', ')
    print(func([1,2,3,4,5,5], [1,3,4,5]))



avg_times = [0]*fCount
for _ in range(1000):
    perf_times = []
    for func in functions:
        startTime = perf_counter()
        func([1,2,3,4,5,5], [3,4,5,5])
        timeTaken = perf_counter()-startTime
        perf_times.append(timeTaken)
        

    for t in range(fCount): avg_times[t] += perf_times[t]

minTime = min(avg_times)
print("\n\n Ratio of Time of Executions : ", ' : '.join(map(lambda x: str(round(x/minTime, 4)), avg_times)))

Output:

Conclusion: In this case, Slicing operation proves to be the fastest

Solution 13 - Python

I tried to make this as efficient as possible.

It uses a generator; those unfamiliar with these beasts are advised to check out their documentation and that of yield expressions.

Basically it creates a generator of values from the subsequence that can be reset by sending it a true value. If the generator is reset, it starts yielding again from the beginning of sub.

Then it just compares successive values of sequence with the generator yields, resetting the generator if they don't match.

When the generator runs out of values, i.e. reaches the end of sub without being reset, that means that we've found our match.

Since it works for any sequence, you can even use it on strings, in which case it behaves similarly to str.find, except that it returns False instead of -1.

As a further note: I think that the second value of the returned tuple should, in keeping with Python standards, normally be one higher. i.e. "string"[0:2] == "st". But the spec says otherwise, so that's how this works.

It depends on if this is meant to be a general-purpose routine or if it's implementing some specific goal; in the latter case it might be better to implement a general-purpose routine and then wrap it in a function which twiddles the return value to suit the spec.

def reiterator(sub):
    """Yield elements of a sequence, resetting if sent ``True``."""
    it = iter(sub)
    while True:
        if (yield it.next()):
            it = iter(sub)

def find_in_sequence(sub, sequence):
    """Find a subsequence in a sequence.

    >>> find_in_sequence([2, 1], [-1, 0, 1, 2])
    False
    >>> find_in_sequence([-1, 1, 2], [-1, 0, 1, 2])
    False
    >>> find_in_sequence([0, 1, 2], [-1, 0, 1, 2])
    (1, 3)
    >>> find_in_sequence("subsequence",
    ...                  "This sequence contains a subsequence.")
    (25, 35)
    >>> find_in_sequence("subsequence", "This one doesn't.")
    False

    """
    start = None
    sub_items = reiterator(sub)
    sub_item = sub_items.next()
    for index, item in enumerate(sequence):
        if item == sub_item:
            if start is None: start = index
        else:
            start = None
        try:
            sub_item = sub_items.send(start is None)
        except StopIteration:
            # If the subsequence is depleted, we win!
            return (start, index)
    return False

Solution 14 - Python

I think this one is fast...

def issublist(subList, myList, start=0):
    if not subList: return 0
    lenList, lensubList = len(myList), len(subList)
    try:
        while lenList - start >= lensubList:
            start = myList.index(subList[0], start)
            for i in xrange(lensubList):
                if myList[start+i] != subList[i]:
                    break
            else:
                return start, start + lensubList - 1
            start += 1
        return False
    except:
        return False

Solution 15 - Python

The problem of most of the answers, that they are good for unique items in list. If items are not unique and you still want to know whether there is an intersection, you should count items:

from collections import Counter as count

def listContains(l1, l2):
  list1 = count(l1)
  list2 = count(l2)

  return list1&list2 == list1

print( listContains([1,1,2,5], [1,2,3,5,1,2,1]) ) # Returns True
print( listContains([1,1,2,8], [1,2,3,5,1,2,1]) ) # Returns False

You can also return the intersection by using ''.join(list1&list2)

Solution 16 - Python

Here a solution with less line of code and easily understandable (or at least I like to think so).

If you want to keep order (match only if the smaller list is found in the same order on the bigger list):

def is_ordered_subset(l1, l2):
    # First check to see if all element of l1 are in l2 (without checking order)
    if not set(l1).issubset(l2): 
        return False

    length = len(l1)
    # Make sublist of same size than l1
    list_of_sublist = [l2[i:i+length] for i, x in enumerate(l2)]
    #Check if one of this sublist is l1
    return l1 in list_of_sublist

Solution 17 - Python

Dave answer is good. But I suggest this implementation which is more efficient and doesn't use nested loops.

def contains(small_list, big_list):
	"""
	Returns index of start of small_list in big_list if big_list
	contains small_list, otherwise -1.
	"""
	loop = True
	i, curr_id_small= 0, 0
	while loop and i<len(big_list):
		if big_list[i]==small_list[curr_id_small]:
			if curr_id_small==len(small_list)-1:
				loop = False
			else:
				curr_id_small += 1
		else:
			curr_id_small = 0
		i=i+1
	if not loop:
		return i-len(small_list)
	else:
		return -1

Solution 18 - Python

Here's a simple and efficient function to check whether big list contains a small one in matching order:

def contains(big, small):
    i = 0
    for value in big:
        if value == small[i]:
            i += 1
            if i == len(small):
                return True
        else:
            i = 1 if value == small[0] else 0
    return False

Usage:

"""
>>> contains([1,2,3,4,5], [2,3,4])
True
>>> contains([4,2,3,2,4], [2,3,4])
False
>>> contains([1,2,3,2,3,2,2,4,3], [2,4,3])
True
"""

Solution 19 - Python

You can use numpy:

def contains(l1, l2):
   """ returns True if l2 conatins l1 and False otherwise """

   if len(np.intersect1d(l1,l2))==len(l1):
      return = True
   else:
      return = False

Content Type	Original Author	Original Content on Stackoverflow
Question	None	View Question on Stackoverflow
Solution 1 - Python	Thomas O	View Answer on Stackoverflow
Solution 2 - Python	ericyan3000	View Answer on Stackoverflow
Solution 3 - Python	Dave Kirby	View Answer on Stackoverflow
Solution 4 - Python	9000	View Answer on Stackoverflow
Solution 5 - Python	martineau	View Answer on Stackoverflow
Solution 6 - Python	eumiro	View Answer on Stackoverflow
Solution 7 - Python	Oleksiy	View Answer on Stackoverflow
Solution 8 - Python	Sihat Afnan	View Answer on Stackoverflow
Solution 9 - Python	jfs	View Answer on Stackoverflow
Solution 10 - Python	Bart Mensfort	View Answer on Stackoverflow
Solution 11 - Python	Akila D. Perera	View Answer on Stackoverflow
Solution 12 - Python	HIMANSHU PANDEY	View Answer on Stackoverflow
Solution 13 - Python	intuited	View Answer on Stackoverflow
Solution 14 - Python	ChessMaster	View Answer on Stackoverflow
Solution 15 - Python	mafonya	View Answer on Stackoverflow
Solution 16 - Python	thibaultbl	View Answer on Stackoverflow
Solution 17 - Python	HLeb	View Answer on Stackoverflow
Solution 18 - Python	Granitosaurus	View Answer on Stackoverflow
Solution 19 - Python	Atila Abdula	View Answer on Stackoverflow