Why is the shovel operator (<<) preferred over plus-equals (+=) when building a string in Ruby?

Ruby, String, Optimization

Ruby Problem Overview


I am working through Ruby Koans.

The test_the_shovel_operator_modifies_the_original_string Koan in about_strings.rb includes the following comment:

> Ruby programmers tend to favor the shovel operator (<<) over the plus equals operator (+=) when building up strings. Why?

My guess is it involves speed, but I don't understand the action under the hood that would cause the shovel operator to be faster.

Would someone be able to please explain the details behind this preference?

Ruby Solutions


Solution 1 - Ruby

Proof:

a = 'foo'
a.object_id #=> 2154889340
a << 'bar'
a.object_id #=> 2154889340
a += 'quux'
a.object_id #=> 2154742560

So << alters the original string rather than creating a new one. The reason is that in Ruby, a += b is syntactic shorthand for a = a + b (the same goes for the other <op>= operators): a + b builds a new string, and the assignment then rebinds a to it. On the other hand, << works like concat(): it appends to the receiver in place.
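
The same point can be checked with equal?, which compares object identity rather than content. This is only a minimal sketch: the variable names are illustrative, and .dup is used just so the string stays mutable even if string literals are frozen.

```ruby
# Start from an unfrozen copy so the example also works when
# string literals are frozen (e.g. under `# frozen_string_literal: true`).
a = 'foo'.dup
original = a                             # second reference to the same object

a << 'bar'                               # appends in place
same_after_shovel = a.equal?(original)   # still the same object

a += 'quux'                              # shorthand for a = a + 'quux'
same_after_plus = a.equal?(original)     # a now names a new String

puts same_after_shovel  # prints true
puts same_after_plus    # prints false
puts original           # prints foobar
puts a                  # prints foobarquux
```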

Solution 2 - Ruby

Performance proof:

#!/usr/bin/env ruby

require 'benchmark'

Benchmark.bmbm do |x|
  x.report('+= :') do
    s = ""
    10000.times { s += "something " }
  end
  x.report('<< :') do
    s = ""
    10000.times { s << "something " }
  end
end

# Rehearsal ----------------------------------------
# += :   0.450000   0.010000   0.460000 (  0.465936)
# << :   0.010000   0.000000   0.010000 (  0.009451)
# ------------------------------- total: 0.470000sec
# 
#            user     system      total        real
# += :   0.270000   0.010000   0.280000 (  0.277945)
# << :   0.000000   0.000000   0.000000 (  0.003043)
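
A rough back-of-the-envelope model may help explain the gap. Each s += piece builds a new string by copying everything accumulated so far plus the new piece, so the total number of bytes written grows quadratically with the number of appends, whereas << only has to write the new piece. The sketch below is a simplification, not a description of the interpreter's exact behavior: it ignores buffer regrowth for <<, and the helper names are hypothetical.

```ruby
# Simplified cost model (an assumption, not a measurement):
# count the bytes written by each strategy for n appends of a piece.

# += : the i-th append produces a fresh string of i * piece_size bytes.
def bytes_copied_plus_equals(n, piece_size)
  (1..n).sum { |i| i * piece_size }
end

# << : each append writes only the new piece into the existing buffer
# (buffer reallocation is ignored in this model).
def bytes_copied_shovel(n, piece_size)
  n * piece_size
end

piece_size = 'something '.bytesize   # 10 bytes
n = 10_000                           # same count as the benchmark above

puts bytes_copied_plus_equals(n, piece_size)  # prints 500050000
puts bytes_copied_shovel(n, piece_size)       # prints 100000
```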

Solution 3 - Ruby

A friend who is learning Ruby as his first programming language asked me this same question while working through the strings section of the Ruby Koans. I explained it to him using the following analogy:

You have a glass of water that is half full and you need to refill your glass.

The first way is to take a new glass, fill it halfway with water from the tap, and then use this second half-full glass to refill your drinking glass. You do this every time you need a refill.

The second way is to take your half-full glass and refill it straight from the tap.

At the end of the day, you will have more glasses to clean if you pick a new glass every time you need a refill.

The same applies to the shovel operator and the plus-equals operator. The plus-equals operator picks a new 'glass' every time it needs a refill, while the shovel operator takes the same glass and refills it. At the end of the day, the plus-equals operator leaves more 'glasses' to collect.

Solution 4 - Ruby

This is an old question, but I just ran across it and I'm not fully satisfied with the existing answers. There are lots of good points about the shovel << being faster than concatenation +=, but there is also a semantic consideration.

The accepted answer from @noodl shows that << modifies the existing object in place, whereas += creates a new object. So you need to consider if you want all references to the string to reflect the new value, or do you want to leave the existing references alone and create a new string value to use locally. If you need all references to reflect the updated value, then you need to use <<. If you want to leave other references alone, then you need to use +=.

A very common case is that there is only a single reference to the string. In this case, the semantic difference does not matter and it is natural to prefer << because of its speed.
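
The semantic difference tends to show up when a string is passed to a method. As a small sketch (the method names append_in_place and append_copy are made up for illustration):

```ruby
# Appends with <<, so the caller's string object is modified too.
def append_in_place(str)
  str << '!'
end

# Appends with +=, which rebinds only the local variable `str`;
# the caller's string object is left untouched.
def append_copy(str)
  str += '!'
  str
end

greeting = 'hello'.dup   # .dup keeps the example mutable if literals are frozen

result_copy = append_copy(greeting)
puts greeting            # prints hello
puts result_copy         # prints hello!

append_in_place(greeting)
puts greeting            # prints hello!
```

With a single reference, both methods give the caller the same result; the choice only matters once another reference to the string exists.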

Solution 5 - Ruby

Because it's faster: << does not create a copy of the string, so it leaves no extra garbage for the garbage collector to clean up.
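
One way to see the extra garbage is to compare how many objects each approach allocates. This is a rough sketch that assumes CRuby, where GC.stat(:total_allocated_objects) reports a running allocation count. The helper name allocations is hypothetical, and exact numbers vary by Ruby version, so none are shown.

```ruby
# Assumes CRuby: GC.stat(:total_allocated_objects) is a running counter
# of objects allocated so far.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

piece = 'something '.freeze   # frozen once, so the loops do not allocate the piece itself
n = 10_000

plus_allocs = allocations do
  s = String.new
  n.times { s += piece }      # each iteration allocates a new String for the result
end

shovel_allocs = allocations do
  s = String.new
  n.times { s << piece }      # appends into the same String object
end

puts "+= allocated #{plus_allocs} objects"
puts "<< allocated #{shovel_allocs} objects"
```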

Solution 6 - Ruby

While a majority of the answers cover that += is slower because it creates a new copy, it's important to keep in mind that += and << are not interchangeable! You want to use each in different cases.

Using << also alters every other variable that points to the same string as b. Here we also mutate a when we may not want to.

2.3.1 :001 > a = "hello"
 => "hello"
2.3.1 :002 > b = a
 => "hello"
2.3.1 :003 > b << " world"
 => "hello world"
2.3.1 :004 > a
 => "hello world"

Because += makes a new copy, it leaves any other variables that point to the original string unchanged.

2.3.1 :001 > a = "hello"
 => "hello"
2.3.1 :002 > b = a
 => "hello"
2.3.1 :003 > b += " world"
 => "hello world"
2.3.1 :004 > a
 => "hello"

Understanding this distinction can save you a lot of headaches when you're dealing with loops!

Solution 7 - Ruby

While not a direct answer to your question, why's The Fully Upturned Bin has always been one of my favorite Ruby articles. It also contains some information on strings with regard to garbage collection.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | erinbrown | View Question on Stackoverflow
Solution 1 - Ruby | noodl | View Answer on Stackoverflow
Solution 2 - Ruby | Nemo157 | View Answer on Stackoverflow
Solution 3 - Ruby | Kibet Yegon | View Answer on Stackoverflow
Solution 4 - Ruby | Tony | View Answer on Stackoverflow
Solution 5 - Ruby | grosser | View Answer on Stackoverflow
Solution 6 - Ruby | Joseph Cho | View Answer on Stackoverflow
Solution 7 - Ruby | Michael Kohl | View Answer on Stackoverflow