Redundant comparison & "if" before assignment


C# Problem Overview


Here is the example:

if(value != ageValue) {
  ageValue = value;
}

I mean, if we assign the value of one variable to another, why would we need to check whether they already have the same value?

That confuses me. Here is the broader context:

private double ageValue;
public double Age {
  get {
    return ageValue;
  }

  set {
    if(value != ageValue) {
      ageValue = value;
    }
  }
}

C# Solutions


Solution 1 - C#

Here is a code sample when the check is quite useful:

 public class MyClass {
    ...
    int ageValue = 0;

    public int AgeValue {
      get {
        return ageValue;
      }
      protected set {
        ... // value validation here

        // your code starts
        if (value != ageValue) { 
          ageValue = value; 
        }
        // your code ends
        else
          return; // do nothing since value == ageValue

        // ageValue has been changed
        // Time (or / and memory) consuming process
        SaveToRDBMS();
        InvalidateCache(); 
        ...
      } 
    } 

 ... 

A more natural implementation, however, is to check at the very beginning in order to avoid unnecessary computation.

    protected set {
      if (ageValue == value)
        return;

      ... // value validation here
      ageValue = value; 

      // ageValue has been changed
      // Time (or / and memory) consuming process
      SaveToRDBMS();
      InvalidateCache();  
      ...
    }
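As a minimal sketch of the same early-return idea, here is a setter that raises a change notification only when the value really changes. The Person class is hypothetical and not part of the original answer; the PropertyChanged event stands in for the expensive follow-up work (SaveToRDBMS, InvalidateCache) mentioned above.

```csharp
using System.ComponentModel;

// Hypothetical class for illustration; not taken from the original answer.
public class Person : INotifyPropertyChanged
{
    private int ageValue;

    public event PropertyChangedEventHandler PropertyChanged;

    public int Age
    {
        get { return ageValue; }
        set
        {
            // Guard first: nothing changed, so skip the follow-up work entirely.
            if (ageValue == value)
                return;

            ageValue = value;

            // Stands in for the expensive work that should run only on a real change.
            PropertyChanged?.Invoke(this, new PropertyChangedEventArgs(nameof(Age)));
        }
    }
}
```

Assigning the same value twice in a row notifies subscribers only once, so listeners never react to a change that did not happen.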

Solution 2 - C#

In a WinForms control, we set the BackgroundColor to a specific color:

myControl.BackgroundColor = Color.White;

Under specific circumstances this could happen in a tight loop and lead to a frozen UI. After some performance analysis we found that this call was the reason for the frozen UI and so we simply changed it to:

if (myControl.BackgroundColor != Color.White)
    myControl.BackgroundColor = Color.White;

And the performance of our tool was back on track (and then we eliminated the cause of the tight loop).

So this check is not always redundant, especially if the target is a property whose setter does more than simply store the value in a backing field.
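To make the effect concrete, here is a small sketch. FakeControl is a hypothetical stand-in and does not use the real WinForms API; its setter simply counts a simulated repaint on every assignment, which is an assumption about what an expensive setter might do.

```csharp
// Hypothetical stand-in for a UI control; it does not use the real WinForms API.
public class FakeControl
{
    private string backgroundColor = "White";

    // Counts how often the simulated expensive repaint ran.
    public int RepaintCount { get; private set; }

    public string BackgroundColor
    {
        get { return backgroundColor; }
        set
        {
            backgroundColor = value;
            RepaintCount++; // simulated side effect: every assignment repaints
        }
    }
}

public static class ColorDemo
{
    // Assigns the same color 'iterations' times, with or without the guard,
    // and returns how many simulated repaints that caused.
    public static int CountRepaints(bool guarded, int iterations)
    {
        var control = new FakeControl();
        for (int i = 0; i < iterations; i++)
        {
            if (!guarded || control.BackgroundColor != "White")
                control.BackgroundColor = "White";
        }
        return control.RepaintCount;
    }
}
```

In this sketch the unguarded loop repaints on every iteration, while the guarded loop never enters the setter because the color already matches.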

Solution 3 - C#

On inspection, the if is not necessarily redundant. It depends on the rest of the implementation. Note that in C#, != can be overloaded, which means that evaluating it can have side effects. Furthermore, the checked variables could be implemented as properties, which can also have side effects when evaluated.
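A sketch of what that could look like, using made-up types (Age and Holder are hypothetical names, and the counter is just one easily visible kind of side effect):

```csharp
// Hypothetical type for illustration only.
public struct Age
{
    // Incremented by the comparison operators as a visible side effect.
    public static int ComparisonCount;

    public int Years;

    public Age(int years) { Years = years; }

    public static bool operator ==(Age a, Age b)
    {
        ComparisonCount++;
        return a.Years == b.Years;
    }

    public static bool operator !=(Age a, Age b)
    {
        ComparisonCount++;
        return a.Years != b.Years;
    }

    public override bool Equals(object obj) { return obj is Age other && other.Years == Years; }
    public override int GetHashCode() { return Years; }
}

public class Holder
{
    private Age age;

    public Age Value
    {
        get { return age; }
        set
        {
            // The comparison itself runs user code, so removing the if changes behavior.
            if (value != age)
            {
                age = value;
            }
        }
    }
}
```

Here every assignment to Holder.Value bumps Age.ComparisonCount, so dropping the check would change what the program observably does, even though the stored value ends up the same.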

Solution 4 - C#

This question has attracted quite a few comments, but so far all answers try to reframe the question to address issues with operator overloading or side effects of the setter.

If the setter is used by multiple threads, the check can make a real difference. The check-before-set pattern can be useful (you should measure) when multiple threads iterate over and alter the same data. The textbook name for the underlying phenomenon is false sharing. If you read the data and verify that it already matches the target value, you can omit the write.

If you omit the write, the CPU does not need to flush the cache line (a 64-byte block on Intel CPUs) to ensure that other cores see the changed value. If another core was about to read some other data from that 64-byte block, an unnecessary write would have slowed down your core and increased cross-core traffic to synchronize memory contents between CPU caches.

The following sample application demonstrates this effect. It contains this check-before-write condition:

 if (tmp1 != checkValue)  // set only if not equal to checkvalue
 {
    values[i] = checkValue;
 }

Here is the full code:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

class Program
{
    static void Main(string[] args)
    {
        const int N = 500_000_000;
        int[] values = new int[N]; // 2 GB
        for (int nThreads = 1; nThreads < Environment.ProcessorCount; nThreads++)
        {
            SetArray(values, checkValue: 1, nTimes: 10, nThreads: nThreads);
            SetArray(values, checkValue: 2, nTimes: 10, nThreads: nThreads);
            SetArrayNoCheck(values, checkValue: 2, nTimes: 10, nThreads: nThreads);
        }
    }

    private static void SetArray(int[] values, int checkValue, int nTimes, int nThreads)
    {
        List<double> ms = new List<double>();

        for (int k = 0; k < nTimes; k++)  // repeat the measurement nTimes
        {
            for (int i = 0; i < values.Length; i++)  // reset array values to 1
            {
                values[i] = 1;
            }

            var sw = Stopwatch.StartNew();
            Action acc = () =>
            {
                int tmp1 = 0;
                for (int i = 0; i < values.Length; i++)
                {
                    tmp1 = values[i];
                    if (tmp1 != checkValue)  // set only if not equal to checkvalue
                    {
                        values[i] = checkValue;
                    }
                }
            };

            Parallel.Invoke(Enumerable.Repeat(acc, nThreads).ToArray());  // run nThreads copies of the action in parallel

            sw.Stop();
            ms.Add(sw.Elapsed.TotalMilliseconds);
            //  Console.WriteLine($"Set {values.Length * 4 / (1_000_000_000.0f):F1} GB of Memory in {sw.Elapsed.TotalMilliseconds:F0} ms. Initial Value 1. Set Value {checkValue}");
        }
        string descr = checkValue == 1 ? "Conditional Not Set" : "Conditional Set";
        Console.WriteLine($"{descr}, {ms.Average():F0}, ms, nThreads, {nThreads}");

    }

    private static void SetArrayNoCheck(int[] values, int checkValue, int nTimes, int nThreads)
    {
        List<double> ms = new List<double>();
        for (int k = 0; k < nTimes; k++)  // repeat the measurement nTimes
        {
            for (int i = 0; i < values.Length; i++)  // reset array values to 1
            {
                values[i] = 1;
            }

            var sw = Stopwatch.StartNew();
            Action acc = () =>
            {
                for (int i = 0; i < values.Length; i++)
                {
                    values[i] = checkValue;
                }
            };

            Parallel.Invoke(Enumerable.Repeat(acc, nThreads).ToArray());  // run nThreads copies of the action in parallel

            sw.Stop();
            ms.Add(sw.Elapsed.TotalMilliseconds);
            //Console.WriteLine($"Unconditional Set {values.Length * 4 / (1_000_000_000.0f):F1} GB of Memory in {sw.Elapsed.TotalMilliseconds:F0} ms. Initial Value 1. Set Value {checkValue}");
        }
        Console.WriteLine($"Unconditional Set, {ms.Average():F0}, ms, nThreads, {nThreads}");
    }
}

If you run that with the commented-out per-iteration Console.WriteLine in SetArray enabled, you get values like:

// Value not set
Set 2.0 GB of Memory in 439 ms. Initial Value 1. Set Value 1
Set 2.0 GB of Memory in 420 ms. Initial Value 1. Set Value 1
Set 2.0 GB of Memory in 429 ms. Initial Value 1. Set Value 1
Set 2.0 GB of Memory in 393 ms. Initial Value 1. Set Value 1
Set 2.0 GB of Memory in 404 ms. Initial Value 1. Set Value 1
Set 2.0 GB of Memory in 395 ms. Initial Value 1. Set Value 1
Set 2.0 GB of Memory in 419 ms. Initial Value 1. Set Value 1
Set 2.0 GB of Memory in 421 ms. Initial Value 1. Set Value 1
Set 2.0 GB of Memory in 442 ms. Initial Value 1. Set Value 1
Set 2.0 GB of Memory in 422 ms. Initial Value 1. Set Value 1
// Value written
Set 2.0 GB of Memory in 519 ms. Initial Value 1. Set Value 2
Set 2.0 GB of Memory in 582 ms. Initial Value 1. Set Value 2
Set 2.0 GB of Memory in 543 ms. Initial Value 1. Set Value 2
Set 2.0 GB of Memory in 484 ms. Initial Value 1. Set Value 2
Set 2.0 GB of Memory in 523 ms. Initial Value 1. Set Value 2
Set 2.0 GB of Memory in 540 ms. Initial Value 1. Set Value 2
Set 2.0 GB of Memory in 552 ms. Initial Value 1. Set Value 2
Set 2.0 GB of Memory in 527 ms. Initial Value 1. Set Value 2
Set 2.0 GB of Memory in 535 ms. Initial Value 1. Set Value 2
Set 2.0 GB of Memory in 581 ms. Initial Value 1. Set Value 2

In this run, skipping the write takes about 22% less time (roughly 418 ms versus 539 ms on average), which can be significant in high-performance number-crunching scenarios.

To answer the question as it was written:

You can remove the if statement if access to the memory is strictly single-threaded. If multiple threads are working on the same or nearby data, false sharing can happen, which can cost you up to about 20% of memory access performance.

Update 1: I have run more tests and created a chart to show the cross-core chit-chat. It includes a simple set (Unconditional Set), as noted by commenter Frank Hopkins. Conditional Not Set contains the if, which never sets the value. And last but not least, Conditional Set sets the value inside the if condition.

[Figure: Performance vs Cores]

Solution 5 - C#

I've actually coded stuff like this a few times, for different reasons. They're kinda hard to explain, so bear with me.

The main thing is that you don't set a new reference if the value at the reference is logically equal to the prior reference's value. In comments above, users have criticized the obnoxiousness of this scenario – and it is obnoxious to have to deal with – but it is still essentially necessary in some cases.

I'd try to split up use cases like this:

  1. The value is an abstract data type, where you may have different constructed instances representing the same logical value.
     * This happens a lot in math programs, e.g. Mathematica, where you can't use primitive numerics, so you can end up with different objects meant to represent the same value.
  2. The reference of the value is useful to a caching logic.
     * This can also pop up when using abstract numerics. For example, if you expect other parts of the program to have cached data about a reference, then you don't want to replace it with a logically equivalent reference, as that would invalidate the caches used elsewhere.
  3. You're using a reactive evaluator, where setting a new value may force a chain reaction of updates.
     * Exactly how and why this matters varies depending on the context.

The big conceptual point is that, in some cases, you can have the same logical value stored at different references, but you want to try to minimize the number of degenerate references for two big reasons:

  1. Having the same logical value stored multiple times hogs more memory.

  2. A lot of the run-time can use reference-checking as a shortcut, e.g. through caching, which can be more efficient if you avoid allowing redundant references to the same logical value to propagate.

For another random example, .NET's garbage collector is "generational", meaning that it puts more effort into checking if a value can be collected when it's newer. So, the garbage collector can experience gains if you preferentially retain the older reference, as it's in a more privileged generation, allowing the newer reference to get garbage collected sooner.

Another use case, again with abstract data types, is where you might have lazily evaluated properties attached to them. For example, say you have an abstract class Number that has properties like .IsRational, .IsEven, etc. Then, you might not calculate those immediately, but rather generate them on demand, caching the results. In a scenario like this, you may prefer to retain older Number instances of the same logical value, as they may have more stuff attached to them, whereas a new value may have less information associated with it, even if it's logically ==.
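Roughly, that scenario could look like the following sketch. Number here is a simplified, non-abstract stand-in, and NumberHolder, LogicallyEquals and EvenComputations are names made up for this illustration.

```csharp
// Hypothetical sketch; Number and NumberHolder are illustrative names.
public sealed class Number
{
    private readonly long value;
    private bool? isEven; // lazily computed, then cached on this instance

    public Number(long value) { this.value = value; }

    // Counts how often the "expensive" property had to be computed.
    public int EvenComputations { get; private set; }

    public bool IsEven
    {
        get
        {
            if (isEven == null)
            {
                EvenComputations++;
                isEven = value % 2 == 0;
            }
            return isEven.Value;
        }
    }

    public bool LogicallyEquals(Number other)
    {
        return other != null && other.value == value;
    }
}

public class NumberHolder
{
    private Number current;

    public Number Current
    {
        get { return current; }
        set
        {
            // Keep the older instance (and its cached results) when the new one is logically equal.
            if (current != null && current.LogicallyEquals(value))
                return;
            current = value;
        }
    }
}
```

Assigning a second Number with the same logical value leaves the original instance in place, so anything it has already cached does not need to be computed again.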

It's kinda hard to think of how to sum up the various reasons why this can make sense in some cases, but it's basically an optimization that can make sense if you have a reason to use it. If you don't have any reason to use it, then probably best to not worry about it until some motivation arises.

Solution 6 - C#

Performance is not a big deal here; it just depends on what your logic needs.

Solution 7 - C#

Yes, this if is useless. You check whether the values are the same (and set the field if not).

When the != operator is not overloaded, then this:

private double ageValue; 

public double Age 
{ 
    get { return ageValue; } 

    set
    { 
        if (value != ageValue) 
        { 
            ageValue = value; 
        } 
    }
} 

is the same as

private double ageValue; 

public double Age 
{ 
    get { return ageValue; } 
    set { ageValue = value; }
} 
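A quick way to convince yourself is to feed both variants the same inputs, including double.NaN, for which != always evaluates to true, so the guarded setter assigns anyway. The class names below are made up for this sketch. One caveat under IEEE 754: assigning -0.0 over 0.0 is skipped by the guarded version, which still compares as equal with == but differs in sign.

```csharp
// Hypothetical classes for illustration: the two setter variants side by side.
public class GuardedAge
{
    private double ageValue;

    public double Age
    {
        get { return ageValue; }
        set { if (value != ageValue) { ageValue = value; } }
    }
}

public class PlainAge
{
    private double ageValue;

    public double Age
    {
        get { return ageValue; }
        set { ageValue = value; }
    }
}

public static class EquivalenceDemo
{
    // Applies each input to both variants and checks that they read back equal
    // after every assignment (NaN is treated as equal to NaN for this check).
    public static bool SameResult(double[] inputs)
    {
        var guarded = new GuardedAge();
        var plain = new PlainAge();
        foreach (double input in inputs)
        {
            guarded.Age = input;
            plain.Age = input;
            bool bothNaN = double.IsNaN(guarded.Age) && double.IsNaN(plain.Age);
            if (!bothNaN && guarded.Age != plain.Age)
                return false;
        }
        return true;
    }
}
```

For example, EquivalenceDemo.SameResult(new[] { 1.0, 1.0, double.NaN, double.NaN, 2.5 }) reports that both variants agree after every assignment.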

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | TheOrlexx | View Question on Stackoverflow
Solution 1 - C# | Dmitry Bychenko | View Answer on Stackoverflow
Solution 2 - C# | Oliver | View Answer on Stackoverflow
Solution 3 - C# | Codor | View Answer on Stackoverflow
Solution 4 - C# | Alois Kraus | View Answer on Stackoverflow
Solution 5 - C# | Nat | View Answer on Stackoverflow
Solution 6 - C# | grant sun | View Answer on Stackoverflow
Solution 7 - C# | akop | View Answer on Stackoverflow