Have there ever been silent behavior changes in C++ with new standard versions?

C++, Language Lawyer, Standardization

C++ Problem Overview


(I'm looking for an example or two to prove the point, not a list.)

Has a change in the C++ standard (e.g. from 98 to 11, or 11 to 14) ever silently changed the behavior of existing, well-formed, defined-behavior user code, that is, with no warnings or errors when compiling with the newer standard version?

Notes:

  • I'm asking about standards-mandated behavior, not about implementer/compiler author choices.
  • The less contrived the code, the better (as an answer to this question).
  • I don't mean code with version detection such as #if __cplusplus >= 201103L.
  • Answers involving the memory model are fine.

C++ Solutions


Solution 1 - C++

In C++17, std::string gained a non-const overload of data() that returns char*; before that, data() always returned const char*, even for a non-const string. That can certainly make a difference:

#include <iostream>
#include <string>
using namespace std;

void func(char* data)
{
    cout << data << " is not const\n";
}

void func(const char* data)
{
    cout << data << " is const\n";
}

int main()
{
    string s = "xyz";
    func(s.data());
}

A bit contrived, but this legal program changes its output going from C++14 (where it prints "xyz is const") to C++17 (where it prints "xyz is not const").

Solution 2 - C++

Another Stack Overflow answer shows how initializing a vector with a single size_type value can behave differently in C++03 and C++11.

std::vector<Something> s(10);

C++03 default-constructs a temporary object of the element type Something and copy-constructs each element in the vector from that temporary.

C++11 default-constructs each element in the vector.

In many (most?) cases these result in an equivalent final state, but there is no reason they have to. It depends on the implementation of Something's default and copy constructors.

See this contrived example:

class Something {
private:
    static int counter;

public:
    Something() : v(counter++) {
        std::cout << "default " << v << '\n';
    }

    Something(Something const & other) : v(counter++) {
        std::cout << "copy " << other.v << " to " << v << '\n';
    }

    ~Something() {
        std::cout << "dtor " << v << '\n';
    }

private:
    int v;
};

int Something::counter = 0;

C++03 will default-construct one Something with v == 0 then copy-construct ten more from that one. At the end, the vector contains ten objects whose v values are 1 through 10, inclusive.

C++11 will default-construct each element. No copies are made. At the end, the vector contains ten objects whose v values are 0 through 9, inclusive.

Solution 3 - C++

The standard has a list of breaking changes in Annex C [diff]. Many of these changes can lead to silent behavior change.

An example:

int f(const char*); // #1
int f(bool);        // #2

int x = f(u8"foo"); // until C++20: calls #1; since C++20: calls #2

Since C++20, u8"foo" has type const char8_t[4] rather than const char[4]. A const char8_t* does not convert to const char*, but any pointer converts to bool, so #2 is silently selected.

Solution 4 - C++

This can happen every time new methods (and often free functions) are added to the standard library.

Suppose you have a standard library type:

struct example {
  void do_stuff() const;
};

Pretty simple. In some later standard revision, a new method, an overload, or almost anything else is added:

struct example {
  void do_stuff() const;
  void method(); // a new method
};

This can silently change the behavior of existing C++ programs.

This is because C++'s currently limited reflection capabilities are sufficient to detect if such a method exists, and run different code based on it.

#include <iostream>
#include <type_traits>  // std::void_t requires C++17

template<class T, class=void>
struct detect_new_method : std::false_type {};

template<class T>
struct detect_new_method< T, std::void_t< decltype( &T::method ) > > : std::true_type {};

This is just one relatively simple way to detect the new method; there are myriad others.

void task( std::false_type ) {
  std::cout << "old code";
}
void task( std::true_type ) {
  std::cout << "new code";
}

int main() {
  task( detect_new_method<example>{} );
}

The same can happen when you remove methods from classes.

While this example directly detects the existence of a method, the same thing happening indirectly can be less contrived. As a concrete example, you might have a serialization engine that decides whether something can be serialized as a container based on whether it is iterable, or whether it has a data() member pointing to raw bytes and a size() member, with one preferred over the other.

The standard goes and adds a .data() method to a container, and suddenly the type changes which path it uses for serialization.

All the C++ standard can do, if it doesn't want to freeze, is to make the kind of code that silently breaks rare or somehow unreasonable.

Solution 5 - C++

Oh boy... The link cpplearner provided is scary.

Among other things, C++20 disallowed the C-style typedef declaration of unnamed structs that have C++-only members such as member functions.

typedef struct
{
  void member_foo(); // Ill-formed since C++20
} m_struct;

If you were taught to write structs like that (and people who teach "C with classes" teach exactly that), you're screwed.

Solution 6 - C++

Here's an example that prints 3 in C++03 but 0 in C++11:

#include <iostream>

template<int I> struct X   { static int const c = 2; };
template<> struct X<0>     { typedef int c; };
template<class T> struct Y { static int const c = 3; };
static int const c = 4;
int main() { std::cout << (Y<X< 1>>::c >::c>::c) << '\n'; }

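This change in behavior was caused by special handling for >>. Prior to C++11, >> was always the right-shift operator. Since C++11, >> can also close two nested template argument lists. In C++03 the expression parses as Y< X<(1 >> ::c)>::c >::c, which is Y<X<0>::c>::c, i.e. Y<int>::c, i.e. 3. In C++11 it parses as (Y<X<1>>::c > ::c) > ::c, which is (3 > 4) > 4, i.e. 0.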

Solution 7 - C++

Trigraphs dropped

Source files are encoded in a physical character set that is mapped in an implementation-defined way to the source character set, which is defined in the standard. To accommodate mappings from some physical character sets that didn't natively have all of the punctuation needed by the source character set, the language defined trigraphs—sequences of three common characters that could be used in place of a less common punctuation character. The preprocessor and compiler were required to handle these.

In C++17, trigraphs were removed. So some source files will not be accepted by newer compilers unless they are first translated from the physical character set to some other physical character set that maps one-to-one to the source character set. (In practice, most compilers just made interpretation of trigraphs optional.) This isn't a subtle behavior change, but a breaking change that prevents previously acceptable source files from being compiled without an external translation process.

More constraints on char

The standard also refers to the execution character set, which is implementation defined, but must contain at least the entire source character set plus a small number of control codes.

The C++ standard defined char as a possibly-unsigned integral type that can efficiently represent every value in the execution character set. With the representation from a language lawyer, you can argue that a char has to be at least 8 bits.

If your implementation uses an unsigned value for char, then you know it can range from 0 to 255, and is thus suitable for storing every possible byte value.

But if your implementation uses a signed value, it has options.

Most would use two's complement, giving char a minimum range of -128 to 127. That's 256 unique values.

But another option was sign+magnitude, where one bit is reserved to indicate whether the number is negative and the other seven bits indicate the magnitude. That would give char a range of -127 to 127, which is only 255 unique values. (Because you lose one useful bit combination to represent -0.)

I'm not sure the committee ever explicitly designated this as a defect, but it was one, because you couldn't rely on the standard to guarantee that a round-trip from unsigned char to char and back would preserve the original value. (In practice, all implementations did because they all used two's complement for signed integral types.)

Only recently (C++17?) was the wording fixed to ensure round-tripping. That fix, along with all the other requirements on char, effectively mandated two's complement for signed char without saying so explicitly (even as the standard continued to allow sign+magnitude representations for other signed integral types). C++20 then went further and requires two's complement for all signed integral types.

So this one is sort of the opposite of what you're looking for, because it gives previously incorrect, overly presumptuous code a retroactive fix.

Solution 8 - C++

I'm not sure if you'd consider this a breaking change to correct code, but ...

Before C++17, compilers were allowed, but not required, to elide copies in certain circumstances, even when the copy constructor has observable side effects. Since C++17, copy elision is guaranteed in some of those cases, such as initializing an object from a temporary returned by a function. The behavior essentially went from optional, at the implementation's discretion, to required.

This means that your copy constructor side effects may have occurred with older versions, but will never occur with newer ones. You could argue that correct code shouldn't rely on results left up to the implementation, but I don't think that's quite the same as saying such code is incorrect.

Solution 9 - C++

The behavior of a failed read of (numeric) data from a stream changed in C++11.

For example, reading an integer from a stream that does not contain one:

#include <iostream>
#include <sstream>
#include <string>

int main(int, char **) 
{
    int a = 12345;
    std::string s = "abcd";         // not an integer, so will fail
    std::stringstream ss(s);
    ss >> a;
    std::cout << "fail = " << ss.fail() << " a = " << a << std::endl;        // since c++11: a == 0, before a still 12345 
}

Since C++11, a failed read sets the target integer to 0; before C++11 the integer was left unchanged. That said, gcc always shows the new behavior, at least since version 4.4.7, even when the standard is forced back to C++98 (with -std=c++98).

(Imho the old behavior was actually better: why change the value to 0, which is by itself valid, when nothing could be read?)

Reference: see https://en.cppreference.com/w/cpp/locale/num_get/get

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | einpoklum | View Question on Stackoverflow
Solution 1 - C++ | john | View Answer on Stackoverflow
Solution 2 - C++ | cdhowie | View Answer on Stackoverflow
Solution 3 - C++ | cpplearner | View Answer on Stackoverflow
Solution 4 - C++ | Yakk - Adam Nevraumont | View Answer on Stackoverflow
Solution 5 - C++ | Noone AtAll | View Answer on Stackoverflow
Solution 6 - C++ | Waxrat | View Answer on Stackoverflow
Solution 7 - C++ | Adrian McCarthy | View Answer on Stackoverflow
Solution 8 - C++ | Adrian McCarthy | View Answer on Stackoverflow
Solution 9 - C++ | DanRechtsaf | View Answer on Stackoverflow