When is StringIO used, as opposed to joining a list of strings?
Python Problem Overview
Using StringIO as a string buffer is slower than using a list as the buffer.
So when should StringIO be used?
from io import StringIO

def meth1(string):
    a = []
    for i in range(100):
        a.append(string)
    return ''.join(a)

def meth2(string):
    a = StringIO()
    for i in range(100):
        a.write(string)
    return a.getvalue()

if __name__ == '__main__':
    from timeit import Timer
    string = "This is test string"
    print(Timer("meth1(string)", "from __main__ import meth1, string").timeit())
    print(Timer("meth2(string)", "from __main__ import meth2, string").timeit())
Results:
16.7872819901
18.7160351276
Python Solutions
Solution 1 - Python
The main advantage of StringIO is that it can be used where a file is expected. For example, in Python 2 you can do:
import sys
import StringIO
out = StringIO.StringIO()
sys.stdout = out
print "hi, I'm going out"
sys.stdout = sys.__stdout__
print out.getvalue()
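For readers on Python 3, the same idea can be sketched with io.StringIO and contextlib.redirect_stdout, which restores sys.stdout automatically. This is only an illustrative equivalent, not part of the original answer; the helper names capture_output and greet are made up for the example.

```python
import contextlib
import io


def capture_output(func):
    """Run func() and return whatever it printed to stdout (hypothetical helper)."""
    buffer = io.StringIO()
    # redirect_stdout swaps sys.stdout for the buffer and restores it on exit.
    with contextlib.redirect_stdout(buffer):
        func()
    return buffer.getvalue()


def greet():
    # Stand-in for any code that prints instead of returning a value.
    print("hi, I'm going out")


captured = capture_output(greet)
print(repr(captured))
```

The StringIO object works here only because it offers the write() method that print() expects from a file.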
Solution 2 - Python
If you measure for speed, you should use cStringIO.
From the docs:
> The module cStringIO provides an interface similar to that of the StringIO module. Heavy use of StringIO.StringIO objects can be made more efficient by using the function StringIO() from this module instead.
But the point of StringIO is to be a file-like object, for when something expects such and you don't want to use actual files.
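As a rough illustration of "something expects a file", the sketch below hands a StringIO to the standard csv module in Python 3, so CSV text is produced and parsed without touching the disk. The function name rows_to_csv_text and the sample rows are assumptions made for the example.

```python
import csv
import io


def rows_to_csv_text(rows):
    """Serialize rows to CSV text in memory (hypothetical helper)."""
    # newline="" keeps the csv module's own line terminators untouched.
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)  # csv.writer only needs an object with write()
    writer.writerows(rows)
    return buffer.getvalue()


text = rows_to_csv_text([["name", "count"], ["spam", 3]])

# Reading works the same way: csv.reader iterates over the StringIO like a file.
parsed = list(csv.reader(io.StringIO(text, newline="")))
print(parsed)
```

A list of strings joined with ''.join() could not be passed to csv.writer, which is the practical difference between the two approaches.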
Edit: I noticed you use from io import StringIO, so you are probably on Python >= 3 or at least 2.6. The separate StringIO and cStringIO modules are gone in Py3. I'm not sure which implementation they used to provide io.StringIO. There is io.BytesIO too.
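A small sketch of how the two in-memory streams differ in Python 3: io.BytesIO holds bytes, while io.StringIO holds text and rejects bytes. The variable names are illustrative only.

```python
import io

# BytesIO is the binary counterpart: it accepts bytes, not str.
binary_buffer = io.BytesIO()
binary_buffer.write(b"\x00\x01")
binary_buffer.write("text".encode("utf-8"))  # str must be encoded first
data = binary_buffer.getvalue()

# StringIO is text-only: writing bytes raises TypeError.
text_buffer = io.StringIO()
try:
    text_buffer.write(b"bytes")
except TypeError:
    rejected = True
else:
    rejected = False

print(data, rejected)
```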
Solution 3 - Python
Well, I don't know if I would call that using it as a "buffer"; you are just multiplying a string 100 times, in two complicated ways. Here is an uncomplicated way:
def meth3(string):
    return string * 100
If we add that to your test:
if __name__ == '__main__':
    from timeit import Timer
    string = "This is test string"
    # Make sure it all does the same:
    assert meth1(string) == meth3(string)
    assert meth2(string) == meth3(string)
    print(Timer("meth1(string)", "from __main__ import meth1, string").timeit())
    print(Timer("meth2(string)", "from __main__ import meth2, string").timeit())
    print(Timer("meth3(string)", "from __main__ import meth3, string").timeit())
It turns out to be way faster as a bonus:
21.0300650597
22.4869811535
0.811429977417
If you want to create a bunch of strings, and then join them, meth1() is the correct way. There is no point in writing it to StringIO, which is something completely different, namely a string with a file-like stream interface.
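To make "a string with a file-like stream interface" concrete, here is a minimal Python 3 sketch of the operations a StringIO supports that a list of strings does not: a current position, seek(), readline(), and line iteration. The variable names are just for illustration.

```python
import io

stream = io.StringIO()
stream.write("first line\n")
stream.write("second line\n")

# Like a file, the stream has a current position; rewind before reading.
stream.seek(0)
first = stream.readline()  # reads up to and including the newline
rest = stream.read()       # reads everything after the current position

# It can also be iterated line by line, exactly like an open text file.
stream.seek(0)
lines = [line.rstrip("\n") for line in stream]

print(repr(first), repr(rest), lines)
```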
Solution 4 - Python
Another approach, based on Lennart Regebro's. This is faster than the list method (meth1):
def meth4(string):
    a = StringIO(string * 100)
    contents = a.getvalue()
    a.close()
    return contents

if __name__ == '__main__':
    from timeit import Timer
    string = "This is test string"
    print(Timer("meth1(string)", "from __main__ import meth1, string").timeit())
    print(Timer("meth2(string)", "from __main__ import meth2, string").timeit())
    print(Timer("meth3(string)", "from __main__ import meth3, string").timeit())
    print(Timer("meth4(string)", "from __main__ import meth4, string").timeit())
Results (sec.):
meth1 = 7.731315963647944
meth2 = 9.609279402186985
meth3 = 0.26534052061106195
meth4 = 2.915035489152274