Why is there no transform_if in the C++ standard library?

C++C++ Standard-LibraryStl Algorithm

C++ Problem Overview


A use case emerged when wanting to do a contitional copy (1. doable with copy_if) but from a container of values to a container of pointers to those values (2. doable with transform).

With the available tools I can't do it in less than two steps :

#include <vector>
#include <algorithm>

using namespace std;

struct ha { 
    int i;
	explicit ha(int a) : i(a) {}
};

int main() 
{
	vector<ha> v{ ha{1}, ha{7}, ha{1} }; // initial vector
	// GOAL : make a vector of pointers to elements with i < 2
	vector<ha*> ph; // target vector
	vector<ha*> pv; // temporary vector
	// 1. 
	transform(v.begin(), v.end(), back_inserter(pv), 
		[](ha &arg) { return &arg; }); 
	// 2. 
	copy_if(pv.begin(), pv.end(), back_inserter(ph),
		[](ha *parg) { return parg->i < 2;  }); // 2. 
	
	return 0;
}

Ofcourse we could call remove_if on pv and eliminate the need for a temporary, better yet though, it's not difficult to implement (for unary operations) something like this :

template <
	class InputIterator, class OutputIterator, 
	class UnaryOperator, class Pred
>
OutputIterator transform_if(InputIterator first1, InputIterator last1,
                            OutputIterator result, UnaryOperator op, Pred pred)
{
	while (first1 != last1) 
	{
		if (pred(*first1)) {
			*result = op(*first1);
			++result;
		}
		++first1;
	}
	return result;
}

// example call 
transform_if(v.begin(), v.end(), back_inserter(ph), 
[](ha &arg) { return &arg;      }, // 1. 
[](ha &arg) { return arg.i < 2; });// 2.
  1. Is there a more elegant workaround with the available C++ standard library tools ?
  2. Is there a reason why transform_if does not exist in the library? Is the combination of the existing tools a sufficient workaround and/or considered performance wise well behaved ?

C++ Solutions


Solution 1 - C++

The standard library favours elementary algorithms.

Containers and algorithms should be independent of each other if possible.

Likewise, algorithms that can be composed of existing algorithms are only rarely included, as shorthand.

If you require a transform if, you can trivially write it. If you want it /today/, composing of ready-mades and not incur overhead, you can use a range library that has lazy ranges, such as Boost.Range, e.g.:

v | filtered(arg1 % 2) | transformed(arg1 * arg1 / 7.0)

As @hvd points out in a comment, transform_if double result in a different type (double, in this case). Composition order matters, and with Boost Range you could also write:

 v | transformed(arg1 * arg1 / 7.0) | filtered(arg1 < 2.0)

resulting in different semantics. This drives home the point:

> it makes very little sense to include std::filter_and_transform, std::transform_and_filter, std::filter_transform_and_filter etc. etc. into the standard library.

See a sample Live On Coliru

#include <boost/range/algorithm.hpp>
#include <boost/range/adaptors.hpp>

using namespace boost::adaptors;

// only for succinct predicates without lambdas
#include <boost/phoenix.hpp>
using namespace boost::phoenix::arg_names;

// for demo
#include <iostream>

int main()
{
    std::vector<int> const v { 1,2,3,4,5 };

    boost::copy(
            v | filtered(arg1 % 2) | transformed(arg1 * arg1 / 7.0),
            std::ostream_iterator<double>(std::cout, "\n"));
}

Solution 2 - C++

Sorry to resurrect this question after so long. I had a similar requirement recently. I solved it by writing a version of back_insert_iterator that takes a boost::optional:

template<class Container>
struct optional_back_insert_iterator
: public std::iterator< std::output_iterator_tag,
void, void, void, void >
{
    explicit optional_back_insert_iterator( Container& c )
    : container(std::addressof(c))
    {}
    
    using value_type = typename Container::value_type;
    
    optional_back_insert_iterator<Container>&
    operator=( const boost::optional<value_type> opt )
    {
        if (opt) {
            container->push_back(std::move(opt.value()));
        }
        return *this;
    }
    
    optional_back_insert_iterator<Container>&
    operator*() {
        return *this;
    }
    
    optional_back_insert_iterator<Container>&
    operator++() {
        return *this;
    }
    
    optional_back_insert_iterator<Container>&
    operator++(int) {
        return *this;
    }
    
protected:
    Container* container;
};

template<class Container>
optional_back_insert_iterator<Container> optional_back_inserter(Container& container)
{
    return optional_back_insert_iterator<Container>(container);
}

used like this:

transform(begin(s), end(s),
          optional_back_inserter(d),
          [](const auto& s) -> boost::optional<size_t> {
              if (s.length() > 1)
                  return { s.length() * 2 };
              else
                  return { boost::none };
          });

Solution 3 - C++

The new for loop notation in many ways reduces the need for algorithms that access every element of the collection where it is now cleaner to just write a loop and put the logic inplace.

std::vector< decltype( op( begin(coll) ) > output;
for( auto const& elem : coll )
{
   if( pred( elem ) )
   {
        output.push_back( op( elem ) );
   }
}

Does it really provide a lot of value now to put in an algorithm? Whilst yes, the algorithm would have been useful for C++03 and indeed I had one for it, we don't need one now so no real advantage in adding it.

Note that in practical use your code won't always look exactly like that either: you don't necessarily have functions "op" and "pred" and may have to create lambdas to make them "fit" into algorithms. Whilst it is nice to separate out concerns if the logic is complex, if it is just a matter of extracting a member from the input type and checking its value or adding it to the collection, it's a lot simpler once again than using an algorithm.

In addition, once you are adding some kind of transform_if, you have to decide whether to apply the predicate before or after the transform, or even have 2 predicates and apply it in both places.

So what are we going to do? Add 3 algorithms? (And in the case that the compiler could apply the predicate on either end of the convert, a user could easily pick the wrong algorithm by mistake and the code still compile but produce wrong results).

Also, if the collections are large, does the user want to loop with iterators or map/reduce? With the introduction of map/reduce you get even more complexities in the equation.

Essentially, the library provides the tools, and the user is left here to use them to fit what they want to do, not the other way round as was often the case with algorithms. (See how the user above tried to twist things using accumulate to fit what they really wanted to do).

For a simple example, a map. For each element I will output the value if the key is even.

std::vector< std::string > valuesOfEvenKeys
    ( std::map< int, std::string > const& keyValues )
{
    std::vector< std::string > res;
    for( auto const& elem: keyValues )
    {
        if( elem.first % 2 == 0 )
        {
            res.push_back( elem.second );
        }
    }
    return res;
}         

Nice and simple. Fancy fitting that into a transform_if algorithm?

Solution 4 - C++

After just finding this question again after some time, and devising a whole slew of potentially useful generic iterator adaptors I realized that the original question required NOTHING more than std::reference_wrapper.

Use it instead of a pointer, and you're good:

Live On Coliru

#include <algorithm>
#include <functional> // std::reference_wrapper
#include <iostream>
#include <vector>

struct ha {
    int i;
};

int main() {
    std::vector<ha> v { {1}, {7}, {1}, };

    std::vector<std::reference_wrapper<ha const> > ph; // target vector
    copy_if(v.begin(), v.end(), back_inserter(ph), [](const ha &parg) { return parg.i < 2; });

    for (ha const& el : ph)
        std::cout << el.i << " ";
}

Prints

1 1 

Solution 5 - C++

The standard is designed in such a way as to minimise duplication.

In this particular case you can achieve the algoritm's aims in a more readable and succinct way with a simple range-for loop.

// another way

vector<ha*> newVec;
for(auto& item : v) {
    if (item.i < 2) {
        newVec.push_back(&item);
    }
}

I have modified the example so that it compiles, added some diagnostics and presented both the OP's algorithm and mine side by side.

#include <vector>
#include <algorithm>
#include <iostream>
#include <iterator>

using namespace std;

struct ha { 
    explicit ha(int a) : i(a) {}
    int i;   // added this to solve compile error
};

// added diagnostic helpers
ostream& operator<<(ostream& os, const ha& t) {
    os << "{ " << t.i << " }";
    return os;
}

ostream& operator<<(ostream& os, const ha* t) {
    os << "&" << *t;
    return os;
}

int main() 
{
    vector<ha> v{ ha{1}, ha{7}, ha{1} }; // initial vector
    // GOAL : make a vector of pointers to elements with i < 2
    vector<ha*> ph; // target vector
    vector<ha*> pv; // temporary vector
    // 1. 
    transform(v.begin(), v.end(), back_inserter(pv), 
        [](ha &arg) { return &arg; }); 
    // 2. 
    copy_if(pv.begin(), pv.end(), back_inserter(ph),
        [](ha *parg) { return parg->i < 2;  }); // 2. 
        
    // output diagnostics
    copy(begin(v), end(v), ostream_iterator<ha>(cout));
    cout << endl;
    copy(begin(ph), end(ph), ostream_iterator<ha*>(cout));
    cout << endl;
    
    
    // another way
    
    vector<ha*> newVec;
    for(auto& item : v) {
        if (item.i < 2) {
            newVec.push_back(&item);
        }
    }

    // diagnostics
    copy(begin(newVec), end(newVec), ostream_iterator<ha*>(cout));
    cout << endl;
    return 0;
}

Solution 6 - C++

C++20 brought ranges and with them a new set of algorithms to operate on them. One of the most powerful tools in this addition is that of views:

  • They support lazy evaluation, which means elements are generated upon request and not upon construction. So performance considerations are put to rest (the original question mentions how creating temporary vectors with intermediate results is sub-optimal).
  • They are composable, which means that operations can easily chained together without loss of performance or expressiveness.

Armed with those new tools, a transform if operation to:

  • "transform a vector v using function A
  • only if an element satisfies condition B

becomes as simple as:

v | std::views::filter(B) | std::views::transform(A)

It's now fair to say that there is a pretty straight-forward way to do "transform if" using the Standard library.

What was originally asked can be written as:

struct ha { 
    int i;
    explicit ha(int a) : i(a) {}
};

int main() 
{
    std::vector<ha> v{ ha{1}, ha{7}, ha{1}, ha{4}, ha{3}, ha{0} };

    auto less4 =  [](ha const& h) { return h.i < 4; };
    auto pnter =  [](ha const& h) { return std::addressof(h); };
 
    for (auto vp : v | std::views::filter(less4) 
                     | std::views::transform(pnter)) 
    {
        std::cout << vp->i << ' ';
    }    
}

Demo

Solution 7 - C++

You may use copy_if along. Why not? Define OutputIt (see [copy][1]):

struct my_inserter: back_insert_iterator<vector<ha *>>
{
  my_inserter(vector<ha *> &dst)
    : back_insert_iterator<vector<ha *>>(back_inserter<vector<ha *>>(dst))
  {
  }
  my_inserter &operator *()
  {
    return *this;
  }
  my_inserter &operator =(ha &arg)
  {
    *static_cast< back_insert_iterator<vector<ha *>> &>(*this) = &arg;
    return *this;
  }
};

and rewrite your code:

int main() 
{
    vector<ha> v{ ha{1}, ha{7}, ha{1} }; // initial vector
    // GOAL : make a vector of pointers to elements with i < 2
    vector<ha*> ph; // target vector
    
    my_inserter yes(ph);
    copy_if(v.begin(), v.end(), yes,
        [](const ha &parg) { return parg.i < 2;  });

    return 0;
}

[1]: http://en.cppreference.com/w/cpp/algorithm/copy "copy_if"

Solution 8 - C++

template <class InputIt, class OutputIt, class BinaryOp>
OutputIt
transform_if(InputIt it, InputIt end, OutputIt oit, BinaryOp op)
{
    for(; it != end; ++it, (void) ++oit)
        op(oit, *it);
    return oit;
}

Usage: (Note that CONDITION and TRANSFORM are not macros, they are placeholders for whatever condition and transformation you want to apply)

std::vector a{1, 2, 3, 4};
std::vector b;

return transform_if(a.begin(), a.end(), b.begin(),
    [](auto oit, auto item)             // Note the use of 'auto' to make life easier
    {
        if(CONDITION(item))             // Here's the 'if' part
            *oit++ = TRANSFORM(item);   // Here's the 'transform' part
    }
);

Solution 9 - C++

This is just an answer to question 1 "Is there a more elegant workaround with the available C++ standard library tools ?".

If you can use c++17 then you can use std::optional for a simpler solution using only C++ standard library functionality. The idea is to return std::nullopt in case there is no mapping:

See live on Coliru

#include <iostream>
#include <optional>
#include <vector>

template <
	class InputIterator, class OutputIterator, 
	class UnaryOperator
>
OutputIterator filter_transform(InputIterator first1, InputIterator last1,
							OutputIterator result, UnaryOperator op)
{
	while (first1 != last1) 
	{
		if (auto mapped = op(*first1)) {
			*result = std::move(mapped.value());
			++result;
		}
		++first1;
	}
	return result;
}

struct ha { 
	int i;
	explicit ha(int a) : i(a) {}
};

int main()
{
	std::vector<ha> v{ ha{1}, ha{7}, ha{1} }; // initial vector
	
	// GOAL : make a vector of pointers to elements with i < 2
	std::vector<ha*> ph; // target vector
	filter_transform(v.begin(), v.end(), back_inserter(ph), 
		[](ha &arg) { return arg.i < 2 ? std::make_optional(&arg) : std::nullopt; });

	for (auto p : ph)
		std::cout << p->i << std::endl;
	
	return 0;
}

Note that I just implemented Rust's approach in C++ here.

Solution 10 - C++

You can use std::accumulate which operates on a pointer to the destination container:

Live On Coliru

#include <numeric>
#include <iostream>
#include <vector>

struct ha
{
    int i;
};

// filter and transform is here
std::vector<int> * fx(std::vector<int> *a, struct ha const & v)
{
    if (v.i < 2)
    {
        a->push_back(v.i);
    }

    return a;
}

int main()
{
    std::vector<ha> v { {1}, {7}, {1}, };

    std::vector<int> ph; // target vector

    std::accumulate(v.begin(), v.end(), &ph, fx);
    
    for (int el : ph)
    {
        std::cout << el << " ";
    }
}

Prints

1 1 

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionNikos AthanasiouView Question on Stackoverflow
Solution 1 - C++seheView Answer on Stackoverflow
Solution 2 - C++Richard HodgesView Answer on Stackoverflow
Solution 3 - C++CashCowView Answer on Stackoverflow
Solution 4 - C++seheView Answer on Stackoverflow
Solution 5 - C++Richard HodgesView Answer on Stackoverflow
Solution 6 - C++Nikos AthanasiouView Answer on Stackoverflow
Solution 7 - C++dyomasView Answer on Stackoverflow
Solution 8 - C++user5406764View Answer on Stackoverflow
Solution 9 - C++Jonny DeeView Answer on Stackoverflow
Solution 10 - C++Wojciech MigdaView Answer on Stackoverflow