When is StringIO used, as opposed to joining a list of strings?

Python Problem Overview


Using StringIO as a string buffer is slower than using a list as a buffer.

When is StringIO used?

from io import StringIO


def meth1(string):
    a = []
    for i in range(100):
        a.append(string)
    return ''.join(a)

def meth2(string):
    a = StringIO()
    for i in range(100):
        a.write(string)
    return a.getvalue()


if __name__ == '__main__':
    from timeit import Timer
    string = "This is test string"
    print(Timer("meth1(string)", "from __main__ import meth1, string").timeit())
    print(Timer("meth2(string)", "from __main__ import meth2, string").timeit())

Results:

16.7872819901
18.7160351276

Python Solutions


Solution 1 - Python

The main advantage of StringIO is that it can be used wherever a file is expected. For example (Python 2):

import sys
import StringIO

out = StringIO.StringIO()
sys.stdout = out
print "hi, I'm going out"
sys.stdout = sys.__stdout__
print out.getvalue()
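
On Python 3 the same idea can be sketched with io.StringIO and contextlib.redirect_stdout, which restores sys.stdout automatically when the block exits. The helper name capture_output is hypothetical and only used for illustration:

```python
import contextlib
import io


def capture_output(func, *args, **kwargs):
    """Run func and return everything it printed to stdout as one string."""
    buffer = io.StringIO()
    # redirect_stdout swaps sys.stdout for the buffer and restores it on exit.
    with contextlib.redirect_stdout(buffer):
        func(*args, **kwargs)
    return buffer.getvalue()


captured = capture_output(print, "hi, I'm going out")
print(repr(captured))
```

Because print looks up sys.stdout at call time, its output lands in the buffer instead of the terminal.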

Solution 2 - Python

If speed is what you are measuring, you should use cStringIO (Python 2).

From the docs:

> The module cStringIO provides an interface similar to that of the StringIO module. Heavy use of StringIO.StringIO objects can be made more efficient by using the function StringIO() from this module instead.

But the point of StringIO is to be a file-like object, for when something expects a file and you don't want to use actual files.
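
As a minimal sketch of that file-like use, assuming Python 3: csv.writer expects an object with a write() method, and io.StringIO can stand in for a real file. The function name rows_to_csv is made up for this example:

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize an iterable of rows to CSV text without touching the disk."""
    # newline="" keeps the line endings written by csv.writer untouched.
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerows(rows)
    return buffer.getvalue()


text = rows_to_csv([["name", "count"], ["spam", 3]])
print(repr(text))
```

The csv module never learns that it is writing to memory, which is the situation StringIO was designed for.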

Edit: I noticed you use from io import StringIO, so you are probably on Python 3, or at least 2.6. The separate StringIO and cStringIO modules are gone in Python 3. I'm not sure which implementation was used to provide io.StringIO. There is io.BytesIO too.
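
A short sketch of the difference between the two io classes, assuming Python 3: io.StringIO holds text (str), while io.BytesIO holds binary data (bytes). The variable names are illustrative only:

```python
import io

text_buffer = io.StringIO()
text_buffer.write("caf\u00e9")  # StringIO accepts str only

byte_buffer = io.BytesIO()
byte_buffer.write("caf\u00e9".encode("utf-8"))  # BytesIO accepts bytes-like objects only

text_value = text_buffer.getvalue()  # a str with 4 characters
byte_value = byte_buffer.getvalue()  # a bytes object with 5 bytes (the accented letter takes 2 in UTF-8)

print(type(text_value).__name__, len(text_value))
print(type(byte_value).__name__, len(byte_value))
```

Writing bytes to a StringIO, or str to a BytesIO, raises TypeError, so the choice between them follows from the kind of data the consumer expects.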

Solution 3 - Python

Well, I don't know if I would call that using it as a "buffer". You are just repeating a string 100 times, in two complicated ways. Here is an uncomplicated way:

def meth3(string):
    return string * 100

If we add that to your test:

if __name__ == '__main__':
    
    from timeit import Timer
    string = "This is test string"
    # Make sure it all does the same:
    assert(meth1(string) == meth3(string))
    assert(meth2(string) == meth3(string))
    print(Timer("meth1(string)", "from __main__ import meth1, string").timeit())
    print(Timer("meth2(string)", "from __main__ import meth2, string").timeit())
    print(Timer("meth3(string)", "from __main__ import meth3, string").timeit())

It turns out to be way faster as a bonus:

21.0300650597
22.4869811535
0.811429977417

If you want to create a bunch of strings and then join them, meth1() is the correct way. There is no point in writing them to StringIO, which is something completely different, namely a string with a file-like stream interface.
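
To illustrate what "a file-like stream interface" means, here is a small sketch, assuming Python 3. The function first_non_comment_line is a hypothetical helper written against any text stream; StringIO lets it run on an in-memory string:

```python
import io


def first_non_comment_line(stream):
    """Return the first line not starting with '#', read from any text stream."""
    for line in stream:  # StringIO supports line iteration like a file
        if not line.startswith("#"):
            return line.rstrip("\n")
    return None


stream = io.StringIO("# header\nvalue = 1\nother = 2\n")
result = first_non_comment_line(stream)
print(result)

# Like a file, the stream has a position that can be rewound.
stream.seek(0)
first_line = stream.readline()
print(repr(first_line))
```

None of this (iteration by line, seek, readline) is available on a list of strings, which is why the two are not interchangeable even though both can be used to assemble text.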

Solution 4 - Python

Another approach, based on Lennart Regebro's answer. It is faster than the list method (meth1), because the string is built once with string * 100 and passed to StringIO in a single call:

def meth4(string):
    a = StringIO(string * 100)
    contents = a.getvalue()
    a.close()
    return contents

if __name__ == '__main__':
    from timeit import Timer
    string = "This is test string"
    print(Timer("meth1(string)", "from __main__ import meth1, string").timeit())
    print(Timer("meth2(string)", "from __main__ import meth2, string").timeit())
    print(Timer("meth3(string)", "from __main__ import meth3, string").timeit())
    print(Timer("meth4(string)", "from __main__ import meth4, string").timeit())

Results (sec.):

meth1 = 7.731315963647944
meth2 = 9.609279402186985
meth3 = 0.26534052061106195
meth4 = 2.915035489152274

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | simha | View Question on Stackoverflow
Solution 1 - Python | TryPyPy | View Answer on Stackoverflow
Solution 2 - Python | plundra | View Answer on Stackoverflow
Solution 3 - Python | Lennart Regebro | View Answer on Stackoverflow
Solution 4 - Python | Jagadeesh Sali | View Answer on Stackoverflow