Why do the C++ language designers keep re-using keywords?

Tags: C++, C++11, Keyword, Language Design

C++ Problem Overview


What is the main argument in favor of re-using short keywords (and adding context-dependent meanings) instead of just adding more keywords?

Is it just that you want to avoid breaking existing code that may already be using a proposed new keyword, or is there a deeper reason?

The new "enum class" in C++11 got me thinking about this, but this is a general language design question.

C++ Solutions


Solution 1 - C++

> Is it just that you want to avoid breaking existing code that may already be using a proposed new keyword, or is there a deeper reason?

No, that's the reason.

Keywords, by definition, are always considered keywords wherever they occur in the source, so they cannot be used for other purposes. Making something a keyword breaks any code that might be using that token as a variable, function, or type name.

The C committee take a different approach and add new keywords using _Reserved names, e.g. _Atomic, _Bool, _Noreturn. For some of these they then add a new header (<stdbool.h>, <stdnoreturn.h>) with a nicer macro, so that you can choose whether to include the header to get the name bool or noreturn, but it won't be declared automatically and won't break code that happens to be using those names already.

The C++ committee don't like macros and want new features to be proper keywords, so they either re-use existing ones (such as auto), add context-dependent "keywords" (which are not really keywords but "identifiers with special meaning", such as override, so they can still be used for other things), or use strange spellings that are unlikely to clash with user code (such as decltype instead of the widely supported typeof extension).

Solution 2 - C++

Some old languages did not have reserved keywords at all, in particular PL/1, where

IF IF=THEN THEN BEGIN;
  /* some more code */
END;

was a legal piece of code, but completely unreadable. (Look also at APL as an example of a write-mostly programming language, which is completely cryptic to read a few months later, even for the code's original author.)

The C and C++ language family has a set of keywords defined by the language specification. But these are very widely used languages with billions of lines of legacy source code. If you (or their standardization committee) add a new keyword, there is a chance of collision with some existing program, and as you guessed and others answered, this is bad. So if the standard added, for instance, enum_class as a new keyword, chances are that someone would already have used it as an identifier, and that person or organization would be unhappy to have to change their code when adopting a new C++ standard.

Also, C++ is widely known to be slow to parse (in particular because standard headers like <vector> pull in tens of thousands of lines of source code, because modules were not yet in C++ when this was written, and because the syntax is highly ambiguous), so complicating the parser to handle new syntax is not a big deal (parsing C++ has always been horrible anyway). For example, the GCC community is working much harder on new optimizations than on new C++ features (apparently, recent features of the C++ standard library require much more work than parsing new syntax), even if the jump from C++03 to C++11 was huge and required a lot of work in the C++ front end. This is less true for the C++11 to C++14 jump.

Some other languages permit the redefinition of existing keywords, e.g. some dialects of Lisp such as Common Lisp and some Scheme implementations, where you could redefine a let or if macro. (Macros in homoiconic languages like these operate on ASTs, so they are very different from the crude textual substitution mechanism in C or C++; read also about hygienic macros.) But this can make the source code difficult to understand a few months later.

Solution 3 - C++

I think it's mainly because adding keywords will break existing code that happens to use this keyword in other contexts, as you suggest.

Solution 4 - C++

> Is it just that you want to avoid breaking existing code that may already be using a proposed new keyword, or is there a deeper reason?

By definition, a keyword is a special token which cannot be used anywhere else; as a result, introducing a keyword breaks any code that happened to use an identifier with the given spelling.

Some languages use the term contextual keyword to refer to spellings that are only interpreted as keywords in specific contexts. If no "wild" identifier could previously be used in this context, then it is guaranteed that the introduction of the contextual keyword will not break existing code. For example, since no identifier can appear immediately after the closing parenthesis in a function signature, this is a place where one can introduce so-called contextual keywords (such as override or final).

On the other hand, in places where any identifier was previously allowed, adding a keyword poses a risk. For example:

  • struct H { my_type f; enum { g }; };: enum class was chosen rather than a new keyword because any new word could be mistaken for the start of a data member declaration in this context; only a keyword is unambiguous (in LL(1)), and introducing a new one could break code.
  • void h() { my_type f; auto x = g(); }: auto was chosen rather than a new keyword because any new word could clash with an existing type. It is still a surprising choice, since auto was already a keyword usable in this position in C (defaulting to type int) and its meaning was altered (the justification was the low probability of its usage).

As some have mentioned, languages can be designed without keywords entirely (Haskell comes pretty close), or made in such a way that keywords can be introduced seamlessly (for example, if every declaration already starts with a keyword, then introducing a new keyword cannot clash). It just so happens that C and C++ were not made so, and neither were many C-like languages.

Solution 5 - C++

A mistaken enthusiasm for "less is more". It is thought (incorrectly) that with fewer keywords, programmers have less to learn and can be productive sooner. But this only creates confusion about the syntax.

> "Real Perl programmers prefer things to be visually distinct." ---- Larry Wall

In other words, use a keyword for one task only.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stackoverflow |
| --- | --- | --- |
| Question | Andrew Wagner | View Question on Stackoverflow |
| Solution 1 - C++ | Jonathan Wakely | View Answer on Stackoverflow |
| Solution 2 - C++ | Basile Starynkevitch | View Answer on Stackoverflow |
| Solution 3 - C++ | Stefan Haustein | View Answer on Stackoverflow |
| Solution 4 - C++ | Matthieu M. | View Answer on Stackoverflow |
| Solution 5 - C++ | shawnhcorey | View Answer on Stackoverflow |