Should I avoid converting to a string if a value is already a string?


Python Problem Overview


Sometimes you have to use a list comprehension to convert every element to a string, including elements that are already strings:

b = [str(a) for a in l]

But do I have to do:

b = [a if type(a)==str else str(a) for a in l]

I was wondering whether str() applied to a string is optimized enough to avoid creating another copy of the string.

I have tried:

>>> x="aaaaaa"
>>> str(x) is x
True

but that may be because Python caches some strings and reuses them. Is that behaviour guaranteed for any string value?

Python Solutions


Solution 1 - Python

Testing if an object is already a string is slower than just always converting to a string.

That's because str() itself makes the exact same test (is the object already a string?). You are a) doing double the work, and b) your test is slower to boot.
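
As a rough illustration of that built-in check, the sketch below builds a string at runtime (so constant caching and interning are not a factor) and tests whether str() hands back the very same object. The variable names are made up for this example, and the identity result is what CPython does for exact str instances; treat it as an implementation detail rather than something to depend on.

```python
# Build the string at runtime so the result does not depend on
# compile-time constant caching or interning.
parts = ["spam", "eggs", str(12345)]
x = "-".join(parts)

# For an exact str instance, CPython returns the object itself.
same_object = str(x) is x

# The plain comprehension from the question, on a mixed list.
values = [x, 42, None, 3.5]
converted = [str(v) for v in values]

# The string element is reused as-is; the others are converted.
first_is_reused = converted[0] is x

print(same_object, first_is_reused, converted)
```

So the plain `[str(a) for a in l]` form should not copy elements that are already exact strings, at least on CPython.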

Note: for Python 2, using str() on unicode objects includes an implicit encode to ASCII, and this can fail. You may still have to special-case such objects. In Python 3, there is no need to worry about that edge case.

As there is some discussion around this:

  • isinstance(s, str) has a different meaning when s can be a subclass of str. As subclasses are treated exactly like any other type of object by str() (either __str__ or __repr__ is called on the object), this difference matters here.

  • You should use type(s) is str for exact type checks. Types are singletons, so take advantage of this; the is operator is faster:

      >>> import timeit
      >>> timeit.timeit("type(s) is str", "s = ''")
      0.10074466899823165
      >>> timeit.timeit("type(s) == str", "s = ''")
      0.1110201120027341
    
  • Using s if type(s) is str else str(s) is significantly slower for the non-string case:

      >>> import timeit
      >>> timeit.timeit("str(s)", "s = None")
      0.1823573520014179
      >>> timeit.timeit("s if type(s) is str else str(s)", "s = None")
      0.29589492800005246
      >>> timeit.timeit("str(s)", "s = ''")
      0.11716728399915155
      >>> timeit.timeit("s if type(s) is str else str(s)", "s = ''")
      0.12032335300318664
    

    (The timings for the s = '' cases are very close and keep swapping places).
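
The isinstance() point in the first bullet can be made concrete with a small sketch. `Tagged`, `keep_if_instance` and `keep_if_exact` are hypothetical names invented for this example; the subclass overrides __str__ so the difference between the two checks is visible.

```python
class Tagged(str):
    """Hypothetical str subclass with a custom __str__."""

    def __str__(self):
        # Concatenating with plain strings yields an exact str.
        return "<" + self + ">"


def keep_if_instance(s):
    # Subclass instances pass this test and are returned unchanged.
    return s if isinstance(s, str) else str(s)


def keep_if_exact(s):
    # Only exact str instances are returned unchanged; everything
    # else, including subclasses, goes through str().
    return s if type(s) is str else str(s)


t = Tagged("abc")
via_isinstance = keep_if_instance(t)  # still the Tagged object
via_exact = keep_if_exact(t)          # Tagged.__str__ was called

print(type(via_isinstance).__name__, repr(via_exact))
```

With isinstance() the subclass's __str__ is never invoked, so the two helpers can return different values for the same input. The exact-type check matches what a bare str(s) call would produce.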

All timings in this post were measured on Python 3.6.0 on a MacBook Pro 15" (Mid 2015), OS X 10.12.3.
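
Since numbers like these vary by machine and Python version, here is a minimal sketch for rerunning the comparison yourself. The `measure` helper and its `number` and `repeat` defaults are arbitrary choices for this example, not part of the original measurements; it takes the best of several runs to reduce noise.

```python
import timeit

# The two expressions and two inputs compared in the timings above.
STATEMENTS = ["str(s)", "s if type(s) is str else str(s)"]
SETUPS = ["s = None", "s = ''"]


def measure(number=200_000, repeat=5):
    """Return {(setup, statement): best time in seconds}."""
    results = {}
    for setup in SETUPS:
        for stmt in STATEMENTS:
            runs = timeit.repeat(stmt, setup, number=number, repeat=repeat)
            results[(setup, stmt)] = min(runs)
    return results


if __name__ == "__main__":
    for (setup, stmt), seconds in measure().items():
        print(f"{setup:10} {stmt:35} {seconds:.4f}s")
```

Absolute values will not match the figures quoted above, because `number` differs from the timeit default of one million iterations; only the relative ordering is worth comparing.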

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stack Overflow
Question | Jean-François Fabre | View Question on Stack Overflow
Solution 1 - Python | Martijn Pieters | View Answer on Stack Overflow