Why are string literals l-value while all other literals are r-value?

C++CLiteralsString Literals

C++ Problem Overview


C++03 5.1 Primary expressions §2 says:

> A literal is a primary expression. Its type depends on its form (2.13). A string literal is an lvalue; all other literals are rvalues.

Similarly, C99 6.5.1 §4 says:

> A string literal is a primary expression. It is an lvalue with type as detailed in 6.4.5.

What is the rationale behind this?

As I understand, string literals are objects, while all other literals are not. And an l-value always refers to an object.

But the question then is why are string literals objects while all other literals are not? This rationale seems to me more like an egg or chicken problem.

I understand the answer to this may be related to hardware architecture rather than C/C++ as programming languages, nevertheless I would like to hear the same.

C++ Solutions


Solution 1 - C++

A string literal is a literal with array type, and in C there is no way for an array type to exist in an expression except as an lvalue. String literals could have been specified to have pointer type (rather than array type that usually decays to a pointer) pointing to the string "contents", but this would make them rather less useful; in particular, the sizeof operator could not be applied to them.

Note that C99 introduced compound literals, which are also lvalues, so having a literal be an lvalue is no longer a special exception; it's closer to being the norm.

Solution 2 - C++

String literals are arrays - objects of inherently unpredictable size (i.e of user-defined and possibly large size). In general case, there's simply no other way to represent such literals except as objects in memory, i.e. as lvalues. In C99 this also applies to compound literals, which are also lvalues.

Any attempts to artificially hide the fact that string literals are lvalues at the language level would produce a considerable number of completely unnecessary difficulties, since the ability to point to a string literal with a pointer as well as the ability to access it as an array relies critically on its lvalue-ness being visible at the language level.

Meanwhile, literals of scalar types have fixed compile-time size. At the same time, such literals are very likely to be embedded directly into the machine commands on the given hardware architecture. For example, when you write something like i = i * 5 + 2, the literal values 5 and 2 become explicit (or even implicit) parts of the generated machine code. They don't exist and don't need to exist as standalone locations in data storage. There's simply no point in storing values 5 and 2 in the data memory.

It is also worth noting that on many (if not most, or all) hardware architectures floating-point literals are actually implemented as "hidden" lvalues (even though the language does not expose them as such). On platforms like x86 machine commands from floating-point group do not support embedded immediate operands. This means that virtually every floating-point literal has to be stored in (and read from) data memory by the compiler. E.g. when you write something like i = i * 5.5 + 2.1 it is translated into something like

const double unnamed_double_5_5 = 5.5;
const double unnamed_double_2_1 = 2.1;
i = i * unnamed_double_5_5 + unnamed_double_2_1;

In other words, floating-point literals often end up becoming "unofficial" lvalues internally. However, it makes perfect sense that language specification did not make any attempts to expose this implementation detail. At language level, arithmetic literals make more sense as rvalues.

Solution 3 - C++

I'd guess that the original motive was mainly a pragmatic one: a string literal must reside in memory and have an address. The type of a string literal is an array type (char[] in C, char const[] in C++), and array types convert to pointers in most contexts. The language could have found other ways to define this (e.g. a string literal could have pointer type to begin with, with special rules concerning what it pointed to), but just making the literal an lvalue is probably the easiest way of defining what is concretely needed.

Solution 4 - C++

An lvalue in C++ does not always refer to an object. It can refer to a function too. Moreover, objects do not have to be referred to by lvalues. They may be referred to by rvalues, including for arrays (in C++ and C). However, in old C89, the array to pointer conversion did not apply for rvalues arrays.

Now, an rvalue denotes no, limited or soon to be an expired lifetime. A string literal, however, lives for the entire program.

So string literals being lvalues is exactly right.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionAlok SaveView Question on Stackoverflow
Solution 1 - C++R.. GitHub STOP HELPING ICEView Answer on Stackoverflow
Solution 2 - C++AnTView Answer on Stackoverflow
Solution 3 - C++James KanzeView Answer on Stackoverflow
Solution 4 - C++Johannes Schaub - litbView Answer on Stackoverflow