What is the C++ equivalent of Python's "in" operator?
C++ArraysC++ Problem Overview
What is the C++ way of checking if an element is contained in an array/list, similar to what the in
operator does in Python?
if x in arr:
print "found"
else
print "not found"
How does the time complexity of the C++ equivalent compare to Python's in
operator?
C++ Solutions
Solution 1 - C++
The time complexity of Python's in
operator varies depending on the data structure it is actually called with. When you use it with a list, complexity is linear (as one would expect from an unsorted array without an index). When you use it to look up set membership or presence of a dictionary key complexity is constant on average (as one would expect from a hash table based implementation):
In C++ you can use std::find
to determine whether or not an item is contained in a std::vector
. Complexity is said to be linear (as one would expect from an unsorted array without an index). If you make sure the vector is sorted, you can also use std::binary_search
to achieve the same in logarithmic time.
- http://en.cppreference.com/w/cpp/algorithm/find
- https://stackoverflow.com/q/24139428/1025391
- https://stackoverflow.com/q/19215027/1025391
- http://en.cppreference.com/w/cpp/algorithm/binary_search
The associative containers provided by the standard library (std::set
, std::unordered_set
, std::map
, ...) provide the member functions find()
and count()
and contains()
(C++20) for this. These will perform better than linear search, i.e., logarithmic or constant time depending on whether you have picked the ordered or the unordered alternative. Which one of these functions to prefer largely depends on what you want to achieve with that info afterwards, but also a bit on personal preference. (Lookup the documentation for details and examples.)
- https://stackoverflow.com/q/1701067/1025391
- https://stackoverflow.com/q/3886593/1025391
- https://en.wikipedia.org/wiki/Associative_containers
- http://en.cppreference.com/w/cpp/container
If you want to, you can use some template magic to write a wrapper function that picks the correct method for the container at hand, e.g., as presented in this answer.
Solution 2 - C++
You can approach this in two ways:
You can use std::find
from <algorithm>
:
auto it = std::find(container.begin(), container.end(), value);
if (it != container.end())
return it;
or you can iterate through every element in your containers with for ranged loops:
for(const auto& it : container)
{
if(it == value)
return it;
}
Solution 3 - C++
Python does different things for in
depending on what kind of container it is. In C++, you'd want the same mechanism. Rule of thumb for the standard containers is that if they provide a find()
, it's going to be a better algorithm than std::find()
(e.g. find()
for std::unordered_map
is O(1), but std::find()
is always O(N)).
So we can write something to do that check ourselves. The most concise would be to take advantage of C++17's if constexpr
and use something like Yakk's can_apply
:
template <class C, class K>
using find_t = decltype(std::declval<C const&>().find(std::declval<K const&>()));
template <class Container, class Key>
bool in(Container const& c, Key const& key) {
if constexpr (can_apply<find_t, Container, Key>{}) {
// the specialized case
return c.find(key) != c.end();
} else {
// the general case
using std::begin; using std::end;
return std::find(begin(c), end(c), key) != end(c);
}
}
In C++11, we can take advantage of expression SFINAE:
namespace details {
// the specialized case
template <class C, class K>
auto in_impl(C const& c, K const& key, int )
-> decltype(c.find(key), true) {
return c.find(key) != c.end();
}
// the general case
template <class C, class K>
bool in_impl(C const& c, K const& key, ...) {
using std::begin; using std::end;
return std::find(begin(c), end(c), key) != end(c);
}
}
template <class Container, class Key>
bool in(Container const& c, Key const& key) {
return details::in_impl(c, key, 0);
}
Note that in both cases we have the using std::begin; using std::end;
two-step in order to handle all the standard containers, raw arrays, and any use-provided/adapted containers.
Solution 4 - C++
I guess one might make use of this thread and create a custom version of in
function.
The main idea is to use SFINAE (Substitution Failure Is Not An Error) to differentiate associative containers (which have key_type member) from sequence containers (which have no key_type member).
Here is a possible implementation:
namespace detail
{
template<typename, typename = void>
struct is_associative : std::false_type {};
template<typename T>
struct is_associative<T,
std::enable_if_t<sizeof(typename T::key_type) != 0>> : std::true_type {};
template<typename C, typename T>
auto in(const C& container, const T& value) ->
std::enable_if_t<is_associative<C>::value, bool>
{
using std::cend;
return container.find(value) != cend(container);
}
template<typename C, typename T>
auto in(const C& container, const T& value) ->
std::enable_if_t<!is_associative<C>::value, bool>
{
using std::cbegin;
using std::cend;
return std::find(cbegin(container), cend(container), value) != cend(container);
}
}
template<typename C, typename T>
auto in(const C& container, const T& value)
{
return detail::in(container, value);
}
Small usage example on WANDBOX.
Solution 5 - C++
This gives you an infix *in*
operator:
namespace notstd {
namespace ca_helper {
template<template<class...>class, class, class...>
struct can_apply:std::false_type{};
template<class...>struct voider{using type=void;};
template<class...Ts>using void_t=typename voider<Ts...>::type;
template<template<class...>class Z, class...Ts>
struct can_apply<Z,void_t<Z<Ts...>>, Ts...>:std::true_type{};
}
template<template<class...>class Z, class...Ts>
using can_apply = ca_helper::can_apply<Z,void,Ts...>;
namespace find_helper {
template<class C, class T>
using dot_find_r = decltype(std::declval<C>().find(std::declval<T>()));
template<class C, class T>
using can_dot_find = can_apply< dot_find_r, C, T >;
template<class C, class T>
constexpr std::enable_if_t<can_dot_find<C&, T>{},bool>
find( C&& c, T&& t ) {
using std::end;
return c.find(std::forward<T>(t)) != end(c);
}
template<class C, class T>
constexpr std::enable_if_t<!can_dot_find<C&, T>{},bool>
find( C&& c, T&& t ) {
using std::begin; using std::end;
return std::find(begin(c), end(c), std::forward<T>(t)) != end(c);
}
template<class C, class T>
constexpr bool finder( C&& c, T&& t ) {
return find( std::forward<C>(c), std::forward<T>(t) );
}
}
template<class C, class T>
constexpr bool find( C&& c, T&& t ) {
return find_helper::finder( std::forward<C>(c), std::forward<T>(t) );
}
struct finder_t {
template<class C, class T>
constexpr bool operator()(C&& c, T&& t)const {
return find( std::forward<C>(c), std::forward<T>(t) );
}
constexpr finder_t() {}
};
constexpr finder_t finder{};
namespace named_operator {
template<class D>struct make_operator{make_operator(){}};
template<class T, char, class O> struct half_apply { T&& lhs; };
template<class Lhs, class Op>
half_apply<Lhs, '*', Op> operator*( Lhs&& lhs, make_operator<Op> ) {
return {std::forward<Lhs>(lhs)};
}
template<class Lhs, class Op, class Rhs>
auto operator*( half_apply<Lhs, '*', Op>&& lhs, Rhs&& rhs )
-> decltype( named_invoke( std::forward<Lhs>(lhs.lhs), Op{}, std::forward<Rhs>(rhs) ) )
{
return named_invoke( std::forward<Lhs>(lhs.lhs), Op{}, std::forward<Rhs>(rhs) );
}
}
namespace in_helper {
struct in_t:notstd::named_operator::make_operator<in_t> {};
template<class T, class C>
bool named_invoke( T&& t, in_t, C&& c ) {
return ::notstd::find(std::forward<C>(c), std::forward<T>(t));
}
}
in_helper::in_t in;
}
On a flat container, like a vector array or string, it is O(n).
On an associative sorted container, like a std::map
, std::set
, it is O(lg(n)).
On an unordered associated container, like std::unordered_set
, it is O(1).
Test code:
std::vector<int> v{1,2,3};
if (1 *in* v)
std::cout << "yes\n";
if (7 *in* v)
std::cout << "no\n";
std::map<std::string, std::string, std::less<>> m{
{"hello", "world"}
};
if ("hello" *in* m)
std::cout << "hello world\n";
C++14, but mainly for enable_if_t
.
So what is going on here?
Well, can_apply
is a bit of code that lets me write can_dot_find
, which detects (at compile time) if container.find(x)
is a valid expression.
This lets me dispatch the searching code to use member-find if it exists. If it doesn't exist, a linear search using std::find
is used instead.
Which is a bit of a lie. If you define a free function find(c, t)
in the namespace of your container, it will use that rather than either of the above. But that is me being fancy (and it lets you extend 3rd party containers with *in*
support).
That ADL (argument dependent lookup) extensibity (the 3rd party extension ability) is why we have three different functions named find
, two in a helper namespace and one in notstd
. You are intended to call notstd::find
.
Next, we want a python-like in
, and what is more python like than an infix operator? To do this in C++ you need to wrap your operator name in other operators. I chose *
, so we get an infix *in*
named operator.
TL;DR
You do using notstd::in;
to import the named operator in
.
After that, t *in* c
first checks if find(t,c)
is valid. If not, it checks if c.find(t)
is valid. If that fails, it does a linear search of c
using std::begin
std::end
and std::find
.
This gives you very good performance on a wide variety of std
containers.
The only thing it doesn't support is
if (7 *in* {1,2,3})
as operators (other than =
) cannot deduce initializer lists I believe. You could get
if (7 *in* il(1,2,3))
to work.
Solution 6 - C++
You can use std::find
from <algorithm>
, but this works only for datatypes like: std::map
and std::vector
(etc).
Also note that this will return, iterator to the first element that is found equal to the value you pass, unlike the in
operator in Python that returns a bool.
Solution 7 - C++
I think one of the nice features of the "in" operator in python is that it can be used with different data types (strings v/s strings, numbers v/s lists, etc).
I am developing a toy library for using python constructions in C++. It includes an "in" operator.
It is based on the same technique used to implement the in operator posted in a previous answer, in which make_operator
- Searching a string inside a string
- Searching an element inside a vector
- Implementing a "not_in" operator
It works by defining several overloads for a function: bool in__(T1 &v1, T2 &v2), in which T1 and T2 consider different possible types of objects. Also, overloads for a function: bool not_in__(T1 &v1, T2 &v2) are defined. Then, the operators "in" and "not_in" call those functions for working.
The main problem with this kind of aproaches is that compilation errors are incredibly verbose and uninformative...
The toy implementation is in this repository: