R function rep() in Python (replicates elements of a list/vector)

Python Problem Overview


The R function rep() replicates the elements of a vector:

> rep(c("A","B"), times=2)
[1] "A" "B" "A" "B"

This is like the list multiplication in Python:

>>> ["A","B"]*2
['A', 'B', 'A', 'B']

But with the rep() R function it is also possible to specify the number of repeats for each element of the vector:

> rep(c("A","B"), times=c(2,3))
[1] "A" "A" "B" "B" "B"

Is there such a function available in Python? Otherwise, how could one define it? By the way, I'm also interested in such a function for duplicating rows of an array.

Python Solutions


Solution 1 - Python

Use numpy arrays and the numpy.repeat function:

import numpy as np

x = np.array(["A", "B"])
print(np.repeat(x, [2, 3], axis=0))

['A' 'A' 'B' 'B' 'B']

Solution 2 - Python

Not sure if there's a built-in available for this, but you can try something like this:

>>> lis = ["A", "B"]
>>> times = (2, 3)
>>> sum(([x]*y for x,y in zip(lis, times)),[])
['A', 'A', 'B', 'B', 'B']

Note that sum() concatenates the lists one at a time, so it runs in quadratic time and is not the recommended way. A linear-time alternative uses itertools (on Python 2, use itertools.izip in place of zip):

>>> from itertools import chain, starmap
>>> from operator import mul
>>> list(chain.from_iterable(starmap(mul, zip(lis, times))))
['A', 'A', 'B', 'B', 'B']

Timing comparison (figures from the original Python 2 run, which used izip):

>>> lis = ["A", "B"] * 1000
>>> times = (2, 3) * 1000
>>> %timeit list(chain.from_iterable(starmap(mul, zip(lis, times))))
1000 loops, best of 3: 713 µs per loop
>>> %timeit sum(([x]*y for x,y in zip(lis, times)),[])
100 loops, best of 3: 15.4 ms per loop
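A nested list comprehension is another linear-time option that needs no imports. The sketch below is not part of the original answer, and the helper name rep_each is hypothetical:

```python
def rep_each(items, times):
    """Repeat items[i] times[i] times (hypothetical helper, not a built-in)."""
    # The inner loop emits each item `count` times, so the total work
    # is proportional to the length of the output.
    return [item for item, count in zip(items, times) for _ in range(count)]


lis = ["A", "B"]
times = (2, 3)
print(rep_each(lis, times))  # ['A', 'A', 'B', 'B', 'B']
```

Unlike the starmap(mul, ...) version, this does not require the elements themselves to support multiplication.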

Solution 3 - Python

Since you say "array" and mention R, you may want to use numpy arrays anyway, and then use:

import numpy as np
np.repeat(np.array([1,2]), [2,3])

EDIT: Since you mention you want to repeat rows as well, I think you should use numpy. np.repeat has an axis argument to do this.
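As a minimal sketch of the row case (the 2-D array a is made up for illustration), passing axis=0 repeats whole rows, and np.tile covers the case where the whole array is repeated as a block:

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])

# axis=0 repeats whole rows: row 0 twice, row 1 three times.
rows = np.repeat(a, [2, 3], axis=0)
print(rows)

# A single integer repeats every row the same number of times.
doubled = np.repeat(a, 2, axis=0)
print(doubled)

# np.tile repeats the whole array instead, like rep(x, times=2) in R.
tiled = np.tile(np.array(["A", "B"]), 2)
print(tiled)
```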

Other than that, maybe:

from itertools import chain, repeat
list(chain(*(repeat(a, b) for a, b in zip([1, 2], [2, 3]))))  # [1, 1, 2, 2, 2]

This does not assume that the elements are lists or strings that can be multiplied. Admittedly, unpacking everything as arguments into chain is not ideal, so chain.from_iterable or writing your own iterator may be better.

Solution 4 - Python

l = ['A','B']
n = [2, 4]

Your example uses strings, which are already iterable, so you can build a result string that behaves much like a list of characters.

''.join([e * m for e, m in zip(l, n)])
'AABBBB'

Update: the list comprehension is not required here:

''.join(e * m for e, m in zip(l, n))
'AABBBB'
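One caveat worth illustrating: the joined string only corresponds to the desired list when every element is a single character. The sketch below uses made-up multi-character elements (words, counts) to show the difference:

```python
l = ['A', 'B']
n = [2, 4]

joined = ''.join(e * m for e, m in zip(l, n))
print(list(joined))  # ['A', 'A', 'B', 'B', 'B', 'B']

# With multi-character elements, list() splits the string into characters,
# so the boundaries between the original elements are lost.
words = ['ab', 'cd']
counts = [2, 1]
joined_words = ''.join(w * m for w, m in zip(words, counts))
print(list(joined_words))  # ['a', 'b', 'a', 'b', 'c', 'd']
```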

Solution 5 - Python

What do you think about this way?

To repeat a value:

>>> repetitions = []
>>> torep = 3
>>> nrep = 5
>>> for _ in range(nrep):
...     repetitions.append(torep)
...
>>> repetitions
[3, 3, 3, 3, 3]

To repeat a sequence:

>>> repetitions = []
>>> torep = [1, 2, 3, 4]
>>> nrep = 2
>>> for _ in range(nrep):
...     repetitions = repetitions + torep
...
>>> print(repetitions)
[1, 2, 3, 4, 1, 2, 3, 4]

Solution 6 - Python

The following might work for you:

>>> [['a','b'],['A','B']]*5
[['a', 'b'], ['A', 'B'], ['a', 'b'], ['A', 'B'], ['a', 'b'], ['A', 'B'], ['a', 'b'], ['A', 'B'], ['a', 'b'], ['A', 'B']]
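Keep in mind that list multiplication repeats references, not copies, so the repeated rows are the same objects. A short sketch of the effect, and of one way to get independent rows (the variable names are only for illustration):

```python
rows = [['a', 'b'], ['A', 'B']] * 2

# The outer list holds repeated references to the same two inner lists.
print(rows[0] is rows[2])  # True

rows[0].append('c')
print(rows)
# [['a', 'b', 'c'], ['A', 'B'], ['a', 'b', 'c'], ['A', 'B']]

# Copying each inner list gives independent rows.
independent = [list(row) for row in [['a', 'b'], ['A', 'B']] * 2]
independent[0].append('c')
print(independent[2])  # ['a', 'b']
```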

Solution 7 - Python

numpy.repeat has been mentioned, and that's clearly the equivalent of what you want. But for completeness' sake, there's also repeat from the itertools standard library module. However, this is intended for iterables in general, so it doesn't allow repetitions by index (because iterables in general do not have an index defined).

We can use the code given in its documentation as a rough equivalent

def repeat(object, times=None):
    # repeat(10, 3) --> 10 10 10
    if times is None:
        while True:
            yield object
    else:
        for i in range(times):
            yield object

to define our own generalised repeat:

def repeat_generalised(object, times=None):
    # repeat_generalised("AB", [2, 3]) --> A A B B B
    if times is None:
        while True:
            yield object
    else:
        for reps, elem in zip(times, object):
            for i in range(reps):
                yield elem

The problem, of course, is that there are a lot of possible edge cases you have to define (what should happen if object and times have a different number of elements?), and that would depend on your individual use case.
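One possible policy, sketched below, is to treat mismatched lengths and negative counts as errors. Both choices are assumptions made for illustration, and repeat_strict is a hypothetical name, not a library function:

```python
def repeat_strict(items, times):
    """Yield items[i] times[i] times (hypothetical helper that rejects ambiguous input)."""
    items = list(items)
    times = list(times)
    # Assumed policy: mismatched lengths are an error, not recycled or truncated.
    if len(items) != len(times):
        raise ValueError("items and times must have the same length")
    for elem, reps in zip(items, times):
        # Assumed policy: negative counts are rejected.
        if reps < 0:
            raise ValueError("repeat counts must be non-negative")
        for _ in range(reps):
            yield elem


print(list(repeat_strict("AB", [2, 3])))  # ['A', 'A', 'B', 'B', 'B']
```

Because this is a generator, the checks run when the first item is requested, not when repeat_strict is called.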

Solution 8 - Python

Here is my attempt at a clone of R rep:

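def rep(x, times=1, each=1, length_out=None):
    # Returns a string: every element is converted with str() and concatenated.
    if not isinstance(times, list):
        times = [times]

    # Apply `each` first: repeat every element `each` times.
    res = ''.join([str(i) * each for i in x])

    if len(times) > 1:
        # Per-element counts (note that `each` is ignored in this branch).
        res = ''.join(str(i) * m for i, m in zip(x, times))
    else:
        # A single count repeats the whole result.
        res = res * times[0]

    if length_out is None:
        return res
    else:
        return res[0:length_out]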

This reproduces the R examples (the results are returned as strings, e.g. '01230123' for the first call):

rep(range(4), times = 2)
rep(range(4), each = 2)
rep(range(4), times = [2,2,2,2])
rep(range(4), each = 2, length_out = 4)
rep(range(4), each = 2, times = 3)

with the exception that there is no recycling of shorter vectors/lists (imo this is the worst feature of R).
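If a list is preferred over a string, the same idea can be sketched as follows. This is a hypothetical variant (rep_list is not from the original answer) and not a faithful clone of R's rep(): it applies each before per-element times, never recycles, and uses length_out only to truncate.

```python
def rep_list(x, times=1, each=1, length_out=None):
    """Hypothetical list-returning variant of the rep() clone above."""
    # Apply `each` first: repeat every element `each` times.
    expanded = [item for item in x for _ in range(each)]

    if isinstance(times, int):
        # A single count repeats the whole expanded sequence.
        result = expanded * times
    else:
        # Per-element counts, paired with the expanded elements; nothing is recycled.
        result = [item for item, count in zip(expanded, times) for _ in range(count)]

    # Truncate only; padding by recycling is deliberately left out.
    return result if length_out is None else result[:length_out]


print(rep_list(range(4), times=2))               # [0, 1, 2, 3, 0, 1, 2, 3]
print(rep_list(range(4), each=2))                # [0, 0, 1, 1, 2, 2, 3, 3]
print(rep_list(["A", "B"], times=[2, 3]))        # ['A', 'A', 'B', 'B', 'B']
print(rep_list(range(4), each=2, length_out=4))  # [0, 0, 1, 1]
```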

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Stéphane Laurent | View Question on Stackoverflow
Solution 1 - Python | Lukas Graf | View Answer on Stackoverflow
Solution 2 - Python | Ashwini Chaudhary | View Answer on Stackoverflow
Solution 3 - Python | seberg | View Answer on Stackoverflow
Solution 4 - Python | tzelleke | View Answer on Stackoverflow
Solution 5 - Python | DavidDz | View Answer on Stackoverflow
Solution 6 - Python | Gopi Krishna Nuti | View Answer on Stackoverflow
Solution 7 - Python | BurnNote | View Answer on Stackoverflow
Solution 8 - Python | jsta | View Answer on Stackoverflow