Why is the dereference operator (*) also used to declare a pointer?
C++ Problem Overview
I'm not sure if this is a proper programming question, but it's something that has always bothered me, and I wonder if I'm the only one.
When initially learning C++, I understood the concept of references, but pointers had me confused. Why, you ask? Because of how you declare a pointer.
Consider the following:
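#include <cstddef>  // defines NULL

void foo(int* bar)
{
}

int main()
{
    int x = 5;
    int* y = NULL;  // y is a pointer to int, initially pointing at nothing
    y = &x;         // y now holds the address of x
    *y = 15;        // write 15 to x through y
    foo(y);         // pass the pointer itself, not the int it points to
}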
The function foo(int*) takes an int pointer as its parameter. Since I've declared y as an int pointer, I can pass y to foo. But when I was first learning C++, I associated the * symbol with dereferencing, so I figured a dereferenced int needed to be passed. I would try to pass *y into foo, which obviously doesn't work.
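
The mix-up described above can be reproduced in a few lines. This is only a sketch: set_to, read_value, and demo_pass_pointer are hypothetical helpers, not part of the original question.

```cpp
// Hypothetical helpers for illustration; none of these names come from the question.
void set_to(int* target, int value)   // '*' in a declaration: target is a pointer to int
{
    *target = value;                  // '*' in an expression: the int that target points to
}

int read_value(int value)             // takes a plain int, not a pointer
{
    return value;
}

int demo_pass_pointer()
{
    int x = 5;
    int* y = &x;

    set_to(y, 15);          // OK: y is an int*, which matches the parameter type
    // set_to(*y, 15);      // error: *y is an int, and an int does not convert to int*

    return read_value(*y);  // OK: *y is the int that y points to, which is now 15
}
```

Calling demo_pass_pointer() from main returns 15. The commented-out line is the call the question describes trying: it fails to compile because the argument is an int where a pointer is expected.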
Wouldn't it have been easier to have a separate operator for declaring a pointer (or for dereferencing)? For example:
void test(int@ x)
{
}
C++ Solutions
Solution 1 - C++
In The Development of the C Language, Dennis Ritchie explains his reasoning thusly:
> The second innovation that most clearly distinguishes C from its
> predecessors is this fuller type structure and especially its
> expression in the syntax of declarations... given an object of any
> type, it should be possible to describe a new object that gathers
> several into an array, yields it from a function, or is a pointer to
> it.... [This] led to a
> declaration syntax for names mirroring that of the expression syntax
> in which the names typically appear. Thus,
>
> int i, *pi, **ppi;
>
> declare an integer, a pointer to an integer, a
> pointer to a pointer to an integer. The syntax of these declarations
> reflects the observation that i, *pi, and **ppi all yield an int type
> when used in an expression.
>
> Similarly, int f(), *f(), (*f)(); declare
> a function returning an integer, a function returning a pointer to an
> integer, a pointer to a function returning an integer. int *api[10], (*pai)[10];
> declare an array of pointers to integers, and a pointer to
> an array of integers.
>
> In all these cases the declaration of a
> variable resembles its usage in an expression whose type is the one
> named at the head of the declaration.
>
> An accident of syntax contributed to the perceived complexity of the
> language. The indirection operator, spelled * in C, is syntactically a
> unary prefix operator, just as in BCPL and B. This works well in
> simple expressions, but in more complex cases, parentheses are
> required to direct the parsing. For example, to distinguish
> indirection through the value returned by a function from calling a
> function designated by a pointer, one writes *fp() and (*pf)()
> respectively. The style used in expressions carries through to
> declarations, so the names might be declared
>
> int *fp(); int (*pf)();
>
> In more ornate but still realistic cases,
> things become worse: int *(*pfp)(); is a pointer to a function
> returning a pointer to an integer.
>
> There are two effects occurring.
> Most important, C has a relatively rich set of ways of describing
> types (compared, say, with Pascal). Declarations in languages as
> expressive as C—Algol 68, for example—describe objects equally hard to
> understand, simply because the objects themselves are complex. A
> second effect owes to details of the syntax. Declarations in C must be
> read in an 'inside-out' style that many find difficult to grasp.
> Sethi [Sethi 81] observed that many of the nested
> declarations and expressions would become simpler if the indirection
> operator had been taken as a postfix operator instead of prefix, but
> by then it was too late to change.
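
The function-pointer declarations in the quote can be tried out directly. The sketch below reuses Ritchie's names fp, pf, and pfp; value, forty_two, and use_like_declared are made up for illustration.

```cpp
// value, forty_two and use_like_declared are hypothetical names for this sketch.
int value = 42;

int *fp() { return &value; }   // fp: function returning a pointer to int
int forty_two() { return 42; }

int (*pf)() = forty_two;       // pf: pointer to a function returning int
int *(*pfp)() = fp;            // pfp: pointer to a function returning a pointer to int

int use_like_declared()
{
    // Each expression has the same shape as the matching declarator,
    // and each one yields an int.
    int a = *fp();       // call fp, then dereference the result
    int b = (*pf)();     // dereference pf, then call the function
    int c = *(*pfp)();   // dereference pfp, call, then dereference the result
    return a + b + c;    // 42 + 42 + 42
}
```

In each case the expression that produces an int is spelled the same way as the declarator, which is the "declaration mirrors use" rule the quote describes.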
Solution 2 - C++
The reason is clearer if you write it like this:
int x, *y;
That is, both x and *y are ints. Thus y is an int *.
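
A compiler can confirm this reading. The following is a small sketch (the function name read_through_y is hypothetical) that uses static_assert to check the types produced by that one-line declaration.

```cpp
#include <type_traits>

// read_through_y is a hypothetical name used only for this sketch.
int read_through_y()
{
    int x, *y;   // one declaration: x is an int, y is a pointer to int

    static_assert(std::is_same<decltype(x), int>::value, "x is an int");
    static_assert(std::is_same<decltype(y), int*>::value, "y is an int*");

    x = 7;
    y = &x;
    return *y;   // the expression *y is an int, just as the declaration reads
}
```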
Solution 3 - C++
That is a language decision that predates C++, as C++ inherited it from C. I once heard that the motivation was that the declaration and the use would be equivalent, that is, given a declaration int *p; the expression *p is of type int in the same way that with int i; the expression i is of type int.
Solution 4 - C++
Because the committee, and those that developed C++ in the decades before its standardisation, decided that * should retain its original three meanings:
- A pointer type
- The dereference operator
- Multiplication
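
All three meanings can appear within a few lines of each other. A minimal sketch, with the function name three_meanings invented for illustration:

```cpp
// three_meanings is a hypothetical name used only for this sketch.
int three_meanings()
{
    int n = 6;
    int* p = &n;          // 1. in a declaration: p has the type "pointer to int"
    int v = *p;           // 2. as a unary operator: dereference p, giving 6
    int product = v * 7;  // 3. as a binary operator: multiplication, giving 42
    return product * *p;  // binary '*' followed by unary '*': 42 * 6
}
```

The compiler tells the cases apart by position: in a declaration the asterisk is part of the declarator, a prefix asterisk in an expression is a dereference, and an asterisk between two operands is multiplication.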
You're right to suggest that the multiple meanings of * (and, similarly, &) are confusing. I've been of the opinion for some years that they are a significant barrier to understanding for language newcomers.
Why not choose another symbol for C++?
Backwards compatibility is the root cause: it was better to reuse existing symbols in a new context than to break C programs by giving new meanings to characters that were not previously operators.
Why not choose another symbol for C?
It's impossible to know for sure, but there are several arguments that can be — and have been — made. Foremost is the idea that:
> when [an] identifier appears in an expression of the same form as the declarator, it yields an object of the specified type. {K&R, p216}
This is also why C programmers tend to[citation needed] prefer aligning their asterisks to the right rather than to the left, i.e.:
int *ptr1; // roughly C-style
int* ptr2; // roughly C++-style
though both varieties are found in programs of both languages, varyingly.
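
Whichever style is used, the spacing is purely cosmetic, and the asterisk always belongs to the declarator rather than to the type name. The sketch below (spacing_is_cosmetic is a hypothetical name) checks this with static_assert and shows the classic pitfall of the left-aligned style.

```cpp
#include <type_traits>

// spacing_is_cosmetic is a hypothetical name used only for this sketch.
bool spacing_is_cosmetic()
{
    int *ptr1 = nullptr;   // asterisk next to the name
    int* ptr2 = nullptr;   // asterisk next to the type
    static_assert(std::is_same<decltype(ptr1), decltype(ptr2)>::value,
                  "both declarations produce an int*");

    int* a = nullptr, b = 0;   // a is an int*, but b is a plain int
    static_assert(std::is_same<decltype(a), int*>::value, "a is a pointer");
    static_assert(std::is_same<decltype(b), int>::value, "b is not a pointer");

    return ptr1 == ptr2 && a == nullptr && b == 0;
}
```

The second declaration is one reason some programmers keep the asterisk next to the name: it makes it visible that only a is a pointer.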
Solution 5 - C++
Page 65 of Expert C Programming: Deep C Secrets includes the following:
> And then, there is the C philosophy that the declaration of an object should look like its use.
Page 216 of The C Programming Language, 2nd edition (aka K&R) includes:
> A declarator is read as an assertion that when its identifier appears in an expression of the same form as the declarator, it yields an object of the specified type.
I prefer the way van der Linden puts it.
Solution 6 - C++
Haha, I feel your pain, I had the exact same problem.
I thought a pointer should be declared as &int because it makes sense that a pointer is an address of something.
After a while I worked it out for myself: every type in C has to be read backwards, so int * const a is, for me, a constant something that, when dereferenced, equals an int.
Something that has to be dereferenced has to be a pointer.
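
The backwards reading also explains where const applies. A short sketch, assuming the made-up names const_pointer_demo, first, second, and b (only a comes from the answer above):

```cpp
// const_pointer_demo, first, second and b are hypothetical names for this sketch.
int const_pointer_demo()
{
    int first = 1;
    int second = 2;

    int * const a = &first;   // read right to left: a is a const pointer to int
    *a = 10;                  // OK: the int being pointed to is not const
    // a = &second;           // error: a itself is const and cannot point elsewhere

    const int * b = &first;   // b is a (non-const) pointer to a const int
    b = &second;              // OK: b itself can be changed
    // *b = 20;               // error: cannot modify an int through a pointer to const

    return first + *b;        // 10 + 2
}
```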