What are the pitfalls of ADL?

C++NamespacesOverload ResolutionArgument Dependent-Lookup

C++ Problem Overview


Some time ago I read an article that explained several pitfalls of argument dependent lookup, but I cannot find it anymore. It was about gaining access to things that you should not have access to or something like that. So I thought I'd ask here: what are the pitfalls of ADL?

C++ Solutions


Solution 1 - C++

There is a huge problem with argument-dependent lookup. Consider, for example, the following utility:

#include <iostream>

namespace utility
{
    template <typename T>
    void print(T x)
    {
        std::cout << x << std::endl;
    }

    template <typename T>
    void print_n(T x, unsigned n)
    {
        for (unsigned i = 0; i < n; ++i)
            print(x);
    }
}

It's simple enough, right? We can call print_n() and pass it any object and it will call print to print the object n times.

Actually, it turns out that if we only look at this code, we have absolutely no idea what function will be called by print_n. It might be the print function template given here, but it might not be. Why? Argument-dependent lookup.

As an example, let's say you have written a class to represent a unicorn. For some reason, you've also defined a function named print (what a coincidence!) that just causes the program to crash by writing to a dereferenced null pointer (who knows why you did this; that's not important):

namespace my_stuff
{
    struct unicorn { /* unicorn stuff goes here */ };

    std::ostream& operator<<(std::ostream& os, unicorn x) { return os; }

    // Don't ever call this!  It just crashes!  I don't know why I wrote it!
    void print(unicorn) { *(int*)0 = 42; }
}

Next, you write a little program that creates a unicorn and prints it four times:

int main()
{
    my_stuff::unicorn x;
    utility::print_n(x, 4);
}

You compile this program, run it, and... it crashes. "What?! No way," you say: "I just called print_n, which calls the print function to print the unicorn four times!" Yes, that's true, but it hasn't called the print function you expected it to call. It's called my_stuff::print.

Why is my_stuff::print selected? During name lookup, the compiler sees that the argument to the call to print is of type unicorn, which is a class type that is declared in the namespace my_stuff.

Because of argument-dependent lookup, the compiler includes this namespace in its search for candidate functions named print. It finds my_stuff::print, which is then selected as the best viable candidate during overload resolution: no conversion is required to call either of the candidate print functions and nontemplate functions are preferred to function templates, so the nontemplate function my_stuff::print is the best match.

(If you don't believe this, you can compile the code in this question as-is and see ADL in action.)

Yes, argument-dependent lookup is an important feature of C++. It is essentially required to achieve the desired behavior of some language features like overloaded operators (consider the streams library). That said, it's also very, very flawed and can lead to really ugly problems. There have been several proposals to fix argument-dependent lookup, but none of them have been accepted by the C++ standards committee.

Solution 2 - C++

The accepted answer is simply wrong - this is not a bug of ADL. It shows an careless anti-pattern to use function calls in daily coding - ignorance of dependent names and relying on unqualified function names blindly.

In short, if you are using unqualified name in the postfix-expression of a function call, you should have acknowledged that you have granted the ability that the function can be "overridden" elsewhere (yes, this is a kind of static polymorphism). Thus, the spelling of the unqualified name of a function in C++ is exactly a part of the interface.

In the case of the accepted answer, if the print_n really need ADL print (i.e. allowing it to be overridden), it should have been documented with the use of unqualified print as an explicit notice, thus clients would receive a contract that print should be carefully declared and the misbehavior would be all of the responsibility of my_stuff. Otherwise, it is a bug of print_n. The fix is simple: qualify print with prefix utility::. This is indeed a bug of print_n, but hardly a bug of the ADL rules in the language.

However, there do exist unwanted things in the language specification, and technically, not only one. They are realized more than 10 years, but nothing in the language is fixed yet. They are missed by the accepted answer (except that the last paragraph is solely correct till now). See this paper for details.

I can append one real case against the name lookup nasty. I was implementing is_nothrow_swappable where __cplusplus < 201703L. I found it impossible to rely on ADL to implementing such feature once I have a declared swap function template in my namespace. Such swap would always found together with std::swap introduced by a idiomatic using std::swap; to use ADL under the ADL rules, and then there would come ambiguity of swap where the swap template (which would instantiate is_nothrow_swappable to get the proper noexcept-specification) is called. Combined with 2-phase lookup rules, the order of declarations does not count, once the library header containing the swap template is included. So, unless I overload all my library types with specialized swap function (to supress any candidate generic templates swap being matched by overloading resolution after ADL), I cannot declare the template. Ironically, the swap template declared in my namespace is exactly to utilize ADL (consider boost::swap) and it is one of the most significant direct client of is_nothrow_swappable in my library (BTW, boost::swap does not respects the exception specification). This perfectly beat my purpose up, sigh...

#include <type_traits>
#include <utility>
#include <memory>
#include <iterator>

namespace my
{

#define USE_MY_SWAP_TEMPLATE true
#define HEY_I_HAVE_SWAP_IN_MY_LIBRARY_EVERYWHERE false

namespace details
{

using ::std::swap;

template<typename T>
struct is_nothrow_swappable
	: std::integral_constant<bool, noexcept(swap(::std::declval<T&>(), ::std::declval<T&>()))>
{};

} // namespace details

using details::is_nothrow_swappable;

#if USE_MY_SWAP_TEMPLATE
template<typename T>
void
swap(T& x, T& y) noexcept(is_nothrow_swappable<T>::value)
{
	// XXX: Nasty but clever hack?
	std::iter_swap(std::addressof(x), std::addressof(y));
}
#endif

class C
{};

// Why I declared 'swap' above if I can accept to declare 'swap' for EVERY type in my library?
#if !USE_MY_SWAP_TEMPLATE || HEY_I_HAVE_SWAP_IN_MY_LIBRARY_EVERYWHERE
void
swap(C&, C&) noexcept
{}
#endif

} // namespace my

int
main()
{
	my::C a, b;
#if USE_MY_SWAP_TEMPLATE

	my::swap(a, b); // Even no ADL here...
#else
	using std::swap; // This merely works, but repeating this EVERYWHERE is not attractive at all... and error-prone.

	swap(a, b); // ADL rocks?
#endif
}

Try https://wandbox.org/permlink/4pcqdx0yYnhhrASi and turn USE_MY_SWAP_TEMPLATE to true to see the ambiguity.

Update 2018-11-05:

Aha, I am bitten by ADL this morning again. This time it even has nothing to do with function calls!

Today I am finishing the work of porting ISO C++17 std::polymorphic_allocator to my codebase. Since some container class templates have been introduced long ago in my code (like this), this time I just replace the declarations with alias templates like:

namespace pmr = ystdex::pmr;
template<typename _tKey, typename _tMapped, typename _fComp
	= ystdex::less<_tKey>, class _tAlloc
	= pmr::polymorphic_allocator<std::pair<const _tKey, _tMapped>>>
using multimap = std::multimap<_tKey, _tMapped, _fComp, _tAlloc>;

... so it can use my implementation of polymorphic_allocator by default. (Disclaimer: it has some known bugs. Fixes of the bugs would be committed in a few days.)

But it suddenly does not work, with hundreds of lines of cryptic error messages...

The error begins from this line. It roughly complains that the declared BaseType is not a base of the enclosing class MessageQueue. That seems very strange because the alias is declared with exactly the same tokens to those in the base-specifier-list of the class definition, and I am sure nothing of them can be macro-expanded. So why?

The answer is... ADL sucks. The line inroducing BaseType is hard-coded with a std name as a template argument, so the template would be looked up per ADL rules in the class scope. Thus, it finds std::multimap, which differs to the result of lookup in as the actual base class declared in the enclosing namespace scope. Since std::multimap uses std::allocator instance as the default template argument, BaseType is not the same type to the actual base class which have an instance of polymorphic_allocator, even multimap declared in the enclosing namespace is redirected to std::multimap. By adding the enclosing qualification as the prefix right to the =, the bug is fixed.

I'd admit I am lucky enough. The error messages are heading the problem to this line. There are only 2 similar problems and the other is without any explicit std (where string is my own one being adapted to ISO C++17's string_view change, not std one in pre-C++17 modes). I would not figure out the bug is about ADL so quickly.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionfredoverflowView Question on Stackoverflow
Solution 1 - C++James McNellisView Answer on Stackoverflow
Solution 2 - C++FrankHBView Answer on Stackoverflow