Avoiding if statement inside a for loop?

C++C++11For LoopDesign Patterns

C++ Problem Overview


I have a class called Writer that has a function writeVector like so:

void Drawer::writeVector(vector<T> vec, bool index=true)
{
    for (unsigned int i = 0; i < vec.size(); i++) {
        if (index) {
            cout << i << "\t";
        }
        cout << vec[i] << "\n";
    }
}

I'm trying not to have a duplicate code, while still worrying about the performance. In the function, I'm doing the if (index) check on every round of my for-loop, even though the result is always the same. This is against "worrying about the performance".

I could easily avoid this by placing the check outside of my for-loop. However, I'll get loads of duplicate code:

void Drawer::writeVector(...)
{
    if (index) {
        for (...) {
            cout << i << "\t" << vec[i] << "\n";
        }
    }
    else {
        for (...) {
            cout << vec[i] << "\n";
        }
    }
}

So these are both "bad" solutions for me. What I've been thinking, is two private functions, one of them outs the index and then calls the other. The other one only outs the value. However, I can't figure out how to use it with my program, I'd still need the if check to see which one to call...

According to the problem, polymorphism seems like a correct solution. But I can't see how should I use it here. What would be the preferred way to solve this kind of problem?

This is not a real program, I'm just interested in learning how this kind of problem should be solved.

C++ Solutions


Solution 1 - C++

Pass in the body of the loop as a functor. It gets inlined at compile-time, no performance penalty.

The idea of passing in what varies is ubiquitous in the C++ Standard Library. It is called the strategy pattern.

If you are allowed to use C++11, you can do something like this:

#include <iostream>
#include <set>
#include <vector>

template <typename Container, typename Functor, typename Index = std::size_t>
void for_each_indexed(const Container& c, Functor f, Index index = 0) {

    for (const auto& e : c)
        f(index++, e);
}

int main() {

    using namespace std;

    set<char> s{'b', 'a', 'c'};

    // indices starting at 1 instead of 0
    for_each_indexed(s, [](size_t i, char e) { cout<<i<<'\t'<<e<<'\n'; }, 1u);

    cout << "-----" << endl;

    vector<int> v{77, 88, 99};

    // without index
    for_each_indexed(v, [](size_t , int e) { cout<<e<<'\n'; });
}

This code is not perfect but you get the idea.

In old C++98 it looks like this:

#include <iostream>
#include <vector>
using namespace std;

struct with_index {
  void operator()(ostream& out, vector<int>::size_type i, int e) {
    out << i << '\t' << e << '\n';
  }
};

struct without_index {
  void operator()(ostream& out, vector<int>::size_type i, int e) {
    out << e << '\n';
  }
};


template <typename Func>
void writeVector(const vector<int>& v, Func f) {
  for (vector<int>::size_type i=0; i<v.size(); ++i) {
	f(cout, i, v[i]);
  }
}

int main() {

  vector<int> v;
  v.push_back(77);
  v.push_back(88);
  v.push_back(99);

  writeVector(v, with_index());

  cout << "-----" << endl;

  writeVector(v, without_index());

  return 0;
}

Again, the code is far from perfect but it gives you the idea.

Solution 2 - C++

> In the function, I'm doing the if (index) check on every round of my for-loop, even though the result is always the same. This is against "worrying about the performance".

If this is indeed the case, the branch predictor will have no problem in predicting the (constant) result. As such, this will only cause a mild overhead for mispredictions in the first few iterations. It's nothing to worry about in terms of performance

In this case I advocate for keeping the test inside the loop for clarity.

Solution 3 - C++

To expand on Ali's answer, which is perfectly correct but still duplicates some code (part of the loop body, this is unfortunately hardly avoidable when using the strategy pattern)...

Granted in this particular case the code duplication is not much but there's a way to reduce it even more, which comes in handy if the function body is bigger than just a few instructions.

The key is to use the compiler's ability to perform constant folding / dead code elimination. We can do that by manually mapping the runtime value of index to a compile-time value (easy to do when there are only a limited number of cases -- two in this case) and use a non-type template argument which is known at compile-time:

template<bool index = true>
//                  ^^^^^^ note: the default value is now part of the template version
//                         see below to understand why
void writeVector(const vector<int>& vec) {
    for (size_t i = 0; i < vec.size(); ++i) {
        if (index) { // compile-time constant: this test will always be eliminated
            cout << i << "\t"; // this will only be kept if "index" is true
        }
        cout << vec[i] << "\n";
    }
}

void writeVector(const vector<int>& vec, bool index)
//                                            ^^^^^ note: no more default value, otherwise
//                                            it would clash with the template overload
{
    if (index) // runtime decision
        writeVector<true>(vec);
        //          ^^^^ map it to a compile-time constant
    else
        writeVector<false>(vec);
}

This way we end up with compiled code which is equivalent to your second code example (outer if / inner for) but without duplicating the code ourselves. Now we can make the template version of writeVector as complicated as we want, there will always be a single piece of code to maintain.

Note how the template version (which takes a compile-time constant in the form of a non-type template argument) and the non-template version (which takes a runtime variable as a function argument) are overloaded. This allows you to choose the most relevant version depending on your needs, having a rather similar, easy to remember syntax in both cases:

writeVector<true>(vec);   // you already know at compile-time which version you want
                          // no need to go through the non-template runtime dispatching

writeVector(vec, index);  // you don't know at compile-time what "index" will be
                          // so you have to use the non-template runtime dispatching

writeVector(vec);         // you can even use your previous syntax using a default argument
                          // it will call the template overload directly

Solution 4 - C++

In most of cases, your code is already good for performance and readability. A good compiler is capable to detect loop invariants and do appropriate optimizations. Consider the following example which is very close to your code:

#include <cstdio>
#include <iterator>

void write_vector(int* begin, int* end, bool print_index = false) {
	unsigned index = 0;
	for(int* it = begin; it != end; ++it) {
		if (print_index) {
			std::printf("%d: %d\n", index, *it);
		} else {
			std::printf("%d\n", *it);
		}
		++index;
	}
}

int my_vector[] = {
	1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
};
	

int main(int argc, char** argv) {
	write_vector(std::begin(my_vector), std::end(my_vector));
}

I am using the following command line to compile it:

g++ --version
g++ (GCC) 4.9.1
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
g++ -O3 -std=c++11 main.cpp

Then, let's dump assembly:

objdump -d a.out | c++filt > main.s

The result assembly of write_vector is:

00000000004005c0 <write_vector(int*, int*, bool)>:
  4005c0:	48 39 f7             	cmp    %rsi,%rdi
  4005c3:	41 54                	push   %r12
  4005c5:	49 89 f4             	mov    %rsi,%r12
  4005c8:	55                   	push   %rbp
  4005c9:	53                   	push   %rbx
  4005ca:	48 89 fb             	mov    %rdi,%rbx
  4005cd:	74 25                	je     4005f4 <write_vector(int*, int*, bool)+0x34>
  4005cf:	84 d2                	test   %dl,%dl
  4005d1:	74 2d                	je     400600 <write_vector(int*, int*, bool)+0x40>
  4005d3:	31 ed                	xor    %ebp,%ebp
  4005d5:	0f 1f 00             	nopl   (%rax)
  4005d8:	8b 13                	mov    (%rbx),%edx
  4005da:	89 ee                	mov    %ebp,%esi
  4005dc:	31 c0                	xor    %eax,%eax
  4005de:	bf a4 06 40 00       	mov    $0x4006a4,%edi
  4005e3:	48 83 c3 04          	add    $0x4,%rbx
  4005e7:	83 c5 01             	add    $0x1,%ebp
  4005ea:	e8 81 fe ff ff       	callq  400470 <printf@plt>
  4005ef:	49 39 dc             	cmp    %rbx,%r12
  4005f2:	75 e4                	jne    4005d8 <write_vector(int*, int*, bool)+0x18>
  4005f4:	5b                   	pop    %rbx
  4005f5:	5d                   	pop    %rbp
  4005f6:	41 5c                	pop    %r12
  4005f8:	c3                   	retq   
  4005f9:	0f 1f 80 00 00 00 00 	nopl   0x0(%rax)
  400600:	8b 33                	mov    (%rbx),%esi
  400602:	31 c0                	xor    %eax,%eax
  400604:	bf a8 06 40 00       	mov    $0x4006a8,%edi
  400609:	48 83 c3 04          	add    $0x4,%rbx
  40060d:	e8 5e fe ff ff       	callq  400470 <printf@plt>
  400612:	49 39 dc             	cmp    %rbx,%r12
  400615:	75 e9                	jne    400600 <write_vector(int*, int*, bool)+0x40>
  400617:	eb db                	jmp    4005f4 <write_vector(int*, int*, bool)+0x34>
  400619:	0f 1f 80 00 00 00 00 	nopl   0x0(%rax)

We can see, that at the begging of the function, we check the value and jump to one of two possible loops:

  4005cf:	84 d2                	test   %dl,%dl
  4005d1:	74 2d                	je     400600 <write_vector(int*, int*, bool)+0x40>

Of course, this only works if a compiler is capable to detect that a condition is actual invariant. Usually, it perfectly works for flags and simple inline functions. But if the condition is "complex", consider using approaches from other answers.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionSkamah OneView Question on Stackoverflow
Solution 1 - C++AliView Answer on Stackoverflow
Solution 2 - C++Marc ClaesenView Answer on Stackoverflow
Solution 3 - C++syamView Answer on Stackoverflow
Solution 4 - C++ivaigultView Answer on Stackoverflow