How to explain C pointers (declaration vs. unary operators) to a beginner?


C Problem Overview


I have had the recent pleasure to explain pointers to a C programming beginner and stumbled upon the following difficulty. It might not seem like an issue at all if you already know how to use pointers, but try to look at the following example with a clear mind:

int foo = 1;
int *bar = &foo;
printf("%p\n", (void *)&foo);
printf("%i\n", *bar);

To the absolute beginner the output might be surprising. In line 2 he/she had just declared *bar to be &foo, but in line 4 it turns out *bar is actually foo instead of &foo!

The confusion, you might say, stems from the ambiguity of the * symbol: In line 2 it is used to declare a pointer. In line 4 it is used as a unary operator which fetches the value the pointer points at. Two different things, right?

However, this "explanation" doesn't help a beginner at all. It introduces a new concept by pointing out a subtle discrepancy. This can't be the right way to teach it.

So, how did Kernighan and Ritchie explain it?

> The unary operator * is the indirection or dereferencing operator; when applied to a pointer, it accesses the object the pointer points to. […]
>
> The declaration of the pointer ip, int *ip is intended as a mnemonic; it says that the expression *ip is an int. The syntax of the declaration for a variable mimics the syntax of expressions in which the variable might appear.

int *ip should be read like "*ip will return an int"? But why then doesn't the assignment after the declaration follow that pattern? What if a beginner wants to initialize the variable? int *ip = 1 (read: *ip will return an int and the int is 1) won't work as expected. The conceptual model just doesn't seem coherent. Am I missing something here?


Edit: I tried to summarize the answers here.

C Solutions


Solution 1 - C

The reason why the shorthand:

int *bar = &foo;

in your example can be confusing is that it's easy to misread it as being equivalent to:

int *bar;
*bar = &foo;    // error: use of uninitialized pointer bar!

when it actually means:

int *bar;
bar = &foo;

Written out like this, with the variable declaration and assignment separated, there is no such potential for confusion, and the use ↔ declaration parallelism described in your K&R quote works perfectly:

  • The first line declares a variable bar, such that *bar is an int.

  • The second line assigns the address of foo to bar, making *bar (an int) an alias for foo (also an int).

When introducing C pointer syntax to beginners, it may be helpful to initially stick to this style of separating pointer declarations from assignments, and only introduce the combined shorthand syntax (with appropriate warnings about its potential for confusion) once the basic concepts of pointer use in C have been adequately internalized.

Solution 2 - C

For your student to understand the meaning of the * symbol in different contexts, they must first understand that the contexts are indeed different. Once they understand that the contexts are different (i.e. the difference between the left hand side of an assignment and a general expression) it isn't too much of a cognitive leap to understand what the differences are.

Firstly explain that the declaration of a variable cannot contain operators (demonstrate this by showing that putting a - or + symbol in a variable declaration simply causes an error). Then go on to show that an expression (i.e. on the right hand side of an assignment) can contain operators. Make sure the student understands that an expression and a variable declaration are two completely different contexts.

When they understand that the contexts are different, you can go on to explain that when the * symbol is in a variable declaration in front of the variable identifier, it means 'declare this variable as a pointer'. Then you can explain that when used in an expression (as a unary operator) the * symbol is the 'dereference operator' and it means 'the value at the address of' rather than its earlier meaning.

To truly convince your student, explain that the creators of C could have used any symbol to mean the dereference operator (i.e. they could have used @ instead) but for whatever reason they made the design decision to use *.

All in all, there's no way around explaining that the contexts are different. If the student doesn't understand the contexts are different, they can't understand why the * symbol can mean different things.

Solution 3 - C

Short on declarations

It is nice to know the difference between declaration and initialization. We declare variables as types and initialize them with values. If we do both at the same time we often call it a definition.

1. int a; a = 42;

int a;
a = 42;

We declare an int named a. Then we initialize it by giving it a value 42.

2. int a = 42;

We declare an int named a and give it the value 42. It is initialized with 42. A definition.

3. a = 43;

When we use the variables we say we operate on them. a = 43 is an assignment operation. We assign the number 43 to the variable a.

By saying

int *bar;

we declare bar to be a pointer to an int. By saying

int *bar = &foo;

we declare bar and initialize it with the address of foo.

After we have initialized bar we can use the same operator, the asterisk, to access and operate on the value of foo. Without the operator we access and operate on the address the pointer is pointing to.

Besides that I let the picture speak.


A simplified ASCIIMATION on what is going on. (And here a player version if you want to pause etc.)


Solution 4 - C

The 2nd statement int *bar = &foo; can be viewed pictorially in memory as,

   bar           foo
  +-----+      +-----+
  |0x100| ---> |  1  |
  +-----+      +-----+ 
   0x200        0x100

Now bar is a pointer to int containing the address of foo (&foo). Using the unary operator * we dereference bar to retrieve the value contained in foo.

EDIT: My approach with beginners is to explain the memory address of a variable i.e

Memory Address: Every variable has an address associated with it provided by the OS. In int a;, &a is address of variable a.

Continue explaining basic types of variables in C as,

Types of variables: Variables can hold values of respective types but not addresses.

int a = 10; float b = 10.8; char ch = 'c'; `a, b, ch` are variables. 

Introducing pointers: As said above variables, for example

 int a = 10; // a contains value 10
 int b; 
 b = &a;      // ERROR

Assigning b = a is possible, but b = &a is not, since the variable b can hold a value but not an address. Hence we require pointers.

Pointer or Pointer variables : If a variable contains an address it is known as a pointer variable. Use * in the declaration to inform that it is a pointer.

• Pointer can hold address but not value
• Pointer contains the address of an existing variable.
• Pointer points to an existing variable

Solution 5 - C

Looking at the answers and comments here, there seems to be a general agreement that the syntax in question can be confusing for a beginner. Most of them propose something along these lines:

  • Before showing any code, use diagrams, sketches or animations to illustrate how pointers work.
  • When presenting the syntax, explain the two different roles of the asterisk symbol. Many tutorials are missing or evading that part. Confusion ensues ("When you break an initialized pointer declaration up into a declaration and a later assignment, you have to remember to remove the *" – comp.lang.c FAQ). I hoped to find an alternative approach, but I guess this is the way to go.

You may write int* bar instead of int *bar to highlight the difference. This means you won't follow the K&R "declaration mimics use" approach, but the Stroustrup C++ approach:

We don't declare *bar to be an integer. We declare bar to be an int*. If we want to initialize a newly created variable in the same line, it is clear that we are dealing with bar, not *bar.

int* bar = &foo;

The drawbacks:

  • You have to warn your student about the multiple pointer declaration issue (int* foo, bar vs int *foo, *bar).
  • You have to prepare them for a world of hurt. Many programmers want to see the asterisk adjacent to the name of the variable, and they will go to great lengths to justify their style. And many style guides enforce this notation explicitly (Linux kernel coding style, NASA C Style Guide, etc.).

Edit: A different approach that has been suggested, is to go the K&R "mimic" way, but without the "shorthand" syntax (see here). As soon as you omit doing a declaration and an assignment in the same line, everything will look much more coherent.

However, sooner or later the student will have to deal with pointers as function arguments. And pointers as return types. And pointers to functions. You will have to explain the difference between int *func(); and int (*func)();. I think sooner or later things will fall apart. And maybe sooner is better than later.

Solution 6 - C

There's a reason why K&R style favours int *p and Stroustrup style favours int* p; both are valid (and mean the same thing) in each language, but as Stroustrup put it:

> The choice between "int* p;" and "int *p;" is not about right and wrong, but about style and emphasis. C emphasized expressions; declarations were often considered little more than a necessary evil. C++, on the other hand, has a heavy emphasis on types.

Now, since you're trying to teach C here, that would suggest you should be emphasising expressions more than types, but some people can more readily grok one emphasis quicker than the other, and that's about them rather than the language.

Therefore some people will find it easier to start with the idea that an int* is a different thing than an int and go from there.

If someone does quickly grok the way of looking at it that uses int* bar to have bar as a thing that is not an int, but a pointer to int, then they'll quickly see that *bar is doing something to bar, and the rest will follow. Once you've that done you can later explain why C coders tend to prefer int *bar.

Or not. If there was one way that everybody first understood the concept you wouldn't have had any problems in the first place, and the best way to explain it to one person will not necessarily be the best way to explain it to another.

Solution 7 - C

tl;dr:

> Q: How to explain C pointers (declaration vs. unary operators) to a beginner?

A: don't. Explain pointers to the beginner, and show them how to represent their pointer concepts in C syntax after.


> I have had the recent pleasure to explain pointers to a C programming beginner and stumbled upon the following difficulty.

IMO the C syntax isn't awful, but isn't wonderful either: it's neither a great hindrance if you already understand pointers, nor any help in learning them.

Therefore: start by explaining pointers, and make sure they really understand them:

  • Explain them with box-and-arrow diagrams. You can do it without hex addresses, if they're not relevant, just show the arrows pointing either to another box, or to some nul symbol.

  • Explain with pseudocode: just write address of foo and value stored at bar.

  • Then, when your novice understands what pointers are, and why, and how to use them; then show the mapping onto C syntax.

I suspect the reason the K&R text doesn't provide a conceptual model is that they already understood pointers, and probably assumed every other competent programmer at the time did too. The mnemonic is just a reminder of the mapping from the well-understood concept, to the syntax.

Solution 8 - C

This issue is somewhat confusing when starting to learn C.

Here are the basic principles that might help you get started:

  1. There are only a few basic types in C:

    • char: an integer value with the size of 1 byte.

    • short: an integer value, typically with the size of 2 bytes.

    • long: an integer value, typically with the size of 4 or 8 bytes.

    • long long: an integer value, typically with the size of 8 bytes.

    • float: a non-integer value, typically with the size of 4 bytes.

    • double: a non-integer value, typically with the size of 8 bytes.

    Note that, apart from char, the exact size of each type is chosen by the compiler; the standard only sets minimum ranges.

    The integer types short, long and long long are usually followed by int.

    It is not a must, however, and you can use them without the int.

    Alternatively, you can just state int, but its size may differ between compilers.

    So to summarize this:

    • short is the same as short int but not necessarily the same as int.

    • long is the same as long int but not necessarily the same as int.

    • long long is the same as long long int but not necessarily the same as int.

    • On a given compiler, int often has the same size as short int or long int, but it always remains a distinct type.

  2. If you declare a variable of some type, then you can also declare another variable pointing to it.

    For example:

    int a;

    int* b = &a;

    So in essence, for each basic type, we also have a corresponding pointer type.

    For example: short and short*.

    There are two ways to "look at" variable b (that's what probably confuses most beginners):

    • You can consider b as a variable of type int*.

    • You can consider *b as a variable of type int.

    Hence, some people would declare int* b, whereas others would declare int *b.

    But the fact of the matter is that these two declarations are identical (the spaces are meaningless).

    You can use either b as a pointer to an integer value, or *b as the actual pointed integer value.

    You can get (read) the pointed value: int c = *b.

    And you can set (write) the pointed value: *b = 5.

  3. A pointer can point to any memory address, and not only to the address of some variable that you have previously declared. However, you must be careful when using pointers in order to get or set the value located at the pointed memory address.

    For example:

    int* a = (int*)0x8000000;

    Here, we have variable a pointing to memory address 0x8000000.

    If this memory address is not mapped within the memory space of your program, then any read or write operation using *a will most likely cause your program to crash, due to a memory access violation.

    You can safely change the value of a, but you should be very careful changing the value of *a.

  4. Type void* is exceptional in the fact that it doesn't have a corresponding "value type" which can be used (i.e., you cannot declare void a). This type is used only as a general pointer to a memory address, without specifying the type of data that resides in that address.

Solution 9 - C

Perhaps stepping through it just a bit more makes it easier:

#include <stdio.h>

int main()
{
    int foo = 1;
    int *bar = &foo;
    printf("%i\n", foo);
    printf("%p\n", &foo);
    printf("%p\n", (void *)&foo);
    printf("%p\n", &bar);
    printf("%p\n", bar);
    printf("%i\n", *bar);
    return 0;
}

Have them tell you what they expect the output to be on each line, then have them run the program and see what turns up. Explain their questions (the naked version in there will certainly prompt a few -- but you can worry about style, strictness and portability later). Then, before their mind turns to mush from overthinking or they become an after-lunch-zombie, write a function that takes a value, and the same one that takes a pointer.

In my experience it's getting over that "why does this print that way?" hump, and then immediately showing why this is useful in function parameters by hands-on toying (as a prelude to some basic K&R material like string parsing/array processing) that makes the lesson not just make sense but stick.

The next step is to get them to explain to you how i[0] relates to &i. If they can do that, they won't forget it and you can start talking about structs, even a little ahead of time, just so it sinks in.

The recommendations above about boxes and arrows are good too, but they can also wind up digressing into a full-blown discussion about how memory works -- which is a talk that must happen at some point, but can distract from the point immediately at hand: how to interpret pointer notation in C.

Solution 10 - C

The type of the expression *bar is int; thus, the type of the variable (and expression) bar is int *. Since the variable has pointer type, its initializer must also have pointer type.

There is an inconsistency between pointer variable initialization and assignment; that's just something that has to be learned the hard way.

Solution 11 - C

int *bar = &foo;

Question 1: What is bar?

Ans: It is a pointer variable (a pointer to int). A pointer should point to a valid memory location, and it is later dereferenced (*bar) with the unary operator * in order to read the value stored at that location.

Question 2: What is &foo?

Ans: foo is a variable of type int, which is stored at some valid memory location. The & operator gives us that location, so &foo is a valid memory location.

Putting the two together: the pointer needs a valid memory location, and &foo provides one, so the initialization is good.

Now the pointer bar points to a valid memory location, and the value stored there can be obtained by dereferencing it, i.e. *bar.

Solution 12 - C

I'd rather read the first * as applying to int rather than to bar.

int  foo = 1;           // foo is an integer (int) with the value 1
int* bar = &foo;        // bar is a pointer to an integer (int*). It points to foo.
                        // bar's value is foo's address
                        // *bar's value is foo's value = 1

printf("%p\n", (void *)&foo);   // print the address of foo
printf("%p\n", (void *)bar);    // print the address of foo
printf("%i\n", foo);            // print foo's value
printf("%i\n", *bar);           // print foo's value

Solution 13 - C

You should point out to a beginner that * has a different meaning in a declaration than in an expression. As you know, * in an expression is a unary operator, whereas * in a declaration is not an operator but a piece of syntax that combines with the type to tell the compiler that this is a pointer type. It is better to tell a beginner: "* has different meanings. To understand what a given * means, look at where it is used."

Solution 14 - C

I think the devil is in the space.

I would write (not only for the beginner, but for myself as well): int* bar = &foo; instead of int *bar = &foo;

This should make the relationship between syntax and semantics evident.

Solution 15 - C

It was already noted that * has multiple roles.

There's another simple idea that may help a beginner to grasp things:

Think that "=" has multiple roles as well.

When assignment is used on the same line with declaration, think of it as a constructor call, not an arbitrary assignment.

When you see:

int *bar = &foo;

Think that it's nearly equivalent to the following (this is C++ initializer syntax and not valid C):

int *bar(&foo);

Parentheses take precedence over the asterisk, so "&foo" is much more easily and intuitively attributed to "bar" rather than "*bar".

Solution 16 - C

If the problem is the syntax, it may be helpful to show equivalent code with template/using.

template<typename T>
using ptr = T*;

This can then be used as

ptr<int> bar = &foo;

After that, compare the normal/C syntax with this C++ only approach. This is also useful for explaining const pointers.

Solution 17 - C

The confusion arises from the fact that the * symbol can have different meanings in C, depending on the context in which it is used. To explain pointers to a beginner, the meaning of the * symbol in each context should be explained.

In the declaration

int *bar = &foo;  

the * symbol is not the indirection operator. Instead, it helps to specify the type of bar informing the compiler that bar is a pointer to an int. On the other hand, when it appears in a statement the * symbol (when used as a unary operator) performs indirection. Therefore, the statement

*bar = &foo;

would be wrong as it assigns the address of foo to the object that bar points to, not to bar itself.

Solution 18 - C

"Maybe writing it as int* bar makes it more obvious that the star is actually part of the type, not part of the identifier." So I do. And I say that it is something like a type, but only for one pointer name.

" Of course this runs you into different problems with unintuitive stuff like int* a, b."

Solution 19 - C

I saw this question a few days ago, and then happened to be reading the explanation of Go's type declaration on the Go Blog. It starts off by giving an account of C type declarations, which seems like a useful resource to add to this thread, even though I think that there are more complete answers already given.

> C took an unusual and clever approach to declaration syntax. Instead of describing the types with special syntax, one writes an expression involving the item being declared, and states what type that expression will have. Thus
>
> int x;
>
> declares x to be an int: the expression 'x' will have type int. In general, to figure out how to write the type of a new variable, write an expression involving that variable that evaluates to a basic type, then put the basic type on the left and the expression on the right.
>
> Thus, the declarations
>
> int *p;
> int a[3];
>
> state that p is a pointer to int because '*p' has type int, and that a is an array of ints because a[3] (ignoring the particular index value, which is punned to be the size of the array) has type int.

(It goes on to describe how to extend this understanding to function pointers etc)

This is a way that I've not thought about it before, but it seems like a pretty straightforward way of accounting for the overloading of the syntax.

Solution 20 - C

Here you have to use, understand and explain the compiler logic, not the human logic (I know, you are a human, but here you must mimic the computer ...).

When you write

int *bar = &foo;

the compiler groups that as

{ int * } bar = &foo;

That is : here is a new variable, its name is bar, its type is pointer to int, and its initial value is &foo.

And you must add: the = above denotes an initialization, not an assignment, whereas in a later expression such as *bar = 2; it is an assignment.

Edit per comment:

Beware : in case of multiple declaration the * is only related to the following variable :

int *bar = &foo, b = 2;

bar is a pointer to int initialized by the address of foo, b is an int initialized to 2, and in

int *bar=&foo, **p = &bar;

bar is still a pointer to int, and p is a pointer to a pointer to an int, initialized to the address of bar.

Solution 21 - C

Basically, a pointer is not an indication of an array. A beginner easily thinks that a pointer looks like an array, because most string examples use

"char *pstr", which looks similar to

"char str[80]"

But the important thing is that a pointer is treated as just an integer at the lower levels of the compiler.

Let's look at an example:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	char str[] = "This is Pointer examples!"; // assume str[] is located at address 0x80001000

	char *pstr0 = str;   // the array name decays to a pointer to its first element
	// or, equivalently
	char *pstr1 = &str[0];

	// keep only the low 32 bits of the address so that %08x is a valid format
	unsigned int straddr = (unsigned int)(uintptr_t)pstr0;

	printf("Pointer examples: pstr0 = %08x\n", (unsigned int)(uintptr_t)pstr0);
	printf("Pointer examples: &str[0] = %08x\n", (unsigned int)(uintptr_t)&str[0]);
	printf("Pointer examples: str = %08x\n", (unsigned int)(uintptr_t)str);
	printf("Pointer examples: straddr = %08x\n", straddr);
	printf("Pointer examples: str[0] = %c\n", str[0]);

	return 0;
}

The results will look like this, where 0x2a6b7ed0 is the address of str[]:

~/work/test_c_code$ ./testptr
Pointer examples: pstr0 = 2a6b7ed0
Pointer examples: &str[0] = 2a6b7ed0
Pointer examples: str = 2a6b7ed0
Pointer examples: straddr = 2a6b7ed0
Pointer examples: str[0] = T

So, basically, keep in mind that a pointer is a kind of integer that represents an address.

Solution 22 - C

I would explain that ints are objects, as are floats etc. A pointer is a type of object whose value represents an address in memory (hence why a pointer that points at nothing is set to NULL).

When you first declare a pointer you use the type-pointer-name syntax. It's read as an "integer-pointer called name that can point to the address of any integer object". We only use this syntax during declaration, similar to how we declare an int as 'int num1' but we only use 'num1' when we want to use that variable, not 'int num1'.

int x = 5; // an integer object with a value of 5

int * ptr; // a pointer to an integer; a local pointer is not set to NULL automatically

To make a pointer point to an address of an object we use the '&' symbol which can be read as "the address of".

ptr = &x; // now value is the address of 'x'

As the pointer is only the address of the object, to get the actual value held at that address we must use the '*' symbol which when used before a pointer means "the value at the address pointed to by".

printf("%d\n", *ptr); // print out the value at the address

You can explain briefly that '*' is an 'operator' that returns different results with different types of objects. When used with a pointer, the '*' operator doesn't mean "multiplied by" anymore.

It helps to draw a diagram showing how a variable has a name and a value and a pointer has an address (the name) and a value and show that the value of the pointer will be the address of the int.

Solution 23 - C

A pointer is just a variable used to store addresses.

Memory in a computer is made up of bytes (a byte consists of 8 bits) arranged in a sequential manner. Each byte has a number associated with it, just like an index or subscript in an array, which is called the address of the byte. Byte addresses run from 0 to one less than the size of memory. For example, in 64MB of RAM there are 64 * 2^20 = 67108864 bytes. Therefore the addresses of these bytes run from 0 to 67108863.


Let’s see what happens when you declare a variable.

int marks;

As we know, an int occupies 4 bytes of data (assuming we are using a 32-bit compiler), so the compiler reserves 4 consecutive bytes of memory to store an integer value. The address of the first of the 4 allocated bytes is known as the address of the variable marks. Let’s say that the addresses of the 4 consecutive bytes are 5004, 5005, 5006 and 5007; then the address of the variable marks will be 5004.

Declaring pointer variables

As already said a pointer is a variable that stores a memory address. Just like any other variables you need to first declare a pointer variable before you can use it. Here is how you can declare a pointer variable.

Syntax: data_type *pointer_name;

data_type is the type of the pointer (also known as the base type of the pointer). pointer_name is the name of the variable, which can be any valid C identifier.

Let’s take some examples:

int *ip;

float *fp;

int *ip means that ip is a pointer variable capable of pointing to variables of type int. In other words, a pointer variable ip can store the address of variables of type int only. Similarly, the pointer variable fp can only store the address of a variable of type float. The type of variable (also known as base type) ip is a pointer to int and the type of fp is a pointer to float. A pointer variable of type pointer to int can be symbolically represented as (int *). Similarly, a pointer variable of type pointer to float can be represented as (float *).

After declaring a pointer variable the next step is to assign some valid memory address to it. You should never use a pointer variable without assigning some valid memory address to it, because just after declaration it contains garbage value and it may be pointing to anywhere in the memory. The use of an unassigned pointer may give an unpredictable result. It may even cause the program to crash.

int *ip, i = 10;
float *fp, f = 12.2;

ip = &i;
fp = &f;

Source: thecguru is by far the simplest yet detailed explanation I have ever found.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | armin | View Question on Stackoverflow
Solution 1 - C | Ilmari Karonen | View Answer on Stackoverflow
Solution 2 - C | Pharap | View Answer on Stackoverflow
Solution 3 - C | Morpfh | View Answer on Stackoverflow
Solution 4 - C | Sunil Bojanapally | View Answer on Stackoverflow
Solution 5 - C | armin | View Answer on Stackoverflow
Solution 6 - C | Jon Hanna | View Answer on Stackoverflow
Solution 7 - C | Useless | View Answer on Stackoverflow
Solution 8 - C | barak manos | View Answer on Stackoverflow
Solution 9 - C | zxq9 | View Answer on Stackoverflow
Solution 10 - C | John Bode | View Answer on Stackoverflow
Solution 11 - C | Gopi | View Answer on Stackoverflow
Solution 12 - C | grorel | View Answer on Stackoverflow
Solution 13 - C | Yongkil Kwon | View Answer on Stackoverflow
Solution 14 - C | rpaulin56 | View Answer on Stackoverflow
Solution 15 - C | morfizm | View Answer on Stackoverflow
Solution 16 - C | MI3Guy | View Answer on Stackoverflow
Solution 17 - C | haccks | View Answer on Stackoverflow
Solution 18 - C | Павел Бивойно | View Answer on Stackoverflow
Solution 19 - C | Andy Turner | View Answer on Stackoverflow
Solution 20 - C | Serge Ballesta | View Answer on Stackoverflow
Solution 21 - C | cpplover - Slw Essencial | View Answer on Stackoverflow
Solution 22 - C | user2796283 | View Answer on Stackoverflow
Solution 23 - C | Cody | View Answer on Stackoverflow