C/C++ int[] vs int* (pointers vs. array notation). What is the difference?
C++ Problem Overview
I know that arrays in C are just pointers to sequentially stored data, but what differences follow from the difference between the [] and * notations? I mean in ALL possible usage contexts. For example:
char c[] = "test";
If you write this declaration in a function body, it allocates the string on the stack, while
char* c = "test";
points to a (read-only) data segment.
Can you list all the differences between these two notations in ALL usage contexts, to form a clear general view?
C++ Solutions
Solution 1 - C++
According to the C99 standard:
> An array type describes a contiguously allocated nonempty set of objects with a particular member object type, called the element type. Array types are characterized by their element type and by the number of elements in the array. An array type is said to be derived from its element type, and if its element type is T, the array type is sometimes called array of T. The construction of an array type from an element type is called array type derivation.
> A pointer type may be derived from a function type, an object type, or an incomplete type, called the referenced type. A pointer type describes an object whose value provides a reference to an entity of the referenced type. A pointer type derived from the referenced type T is sometimes referred to as a pointer to T. The construction of a pointer type from a referenced type is called pointer type derivation.
According to the standard, the declarations…
char s[] = "abc", t[3] = "abc";
char s[] = { 'a', 'b', 'c', '\0' }, t[] = { 'a', 'b', 'c' };
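…are identical. (This holds in C; C++ rejects t[3] = "abc" because the array has no room for the terminating null character.) The contents of the arrays are modifiable. On the other hand, the declaration…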
const char *p = "abc";
…defines p with the type pointer to constant char and initializes it to point to an object with type constant array of char (in C++) with length 4 whose elements are initialized with a character string literal. If an attempt is made to use p to modify the contents of the array, the behavior is undefined.
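The sizes involved can be checked directly. The following is only a minimal sketch, assuming a C++11 compiler; the function names are invented for illustration and do not come from the standard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Array initialized from the literal "abc": the terminating '\0' is
// copied as well, so the array has 4 elements.
std::size_t size_of_literal_array() {
    char s[] = "abc";
    return sizeof s;
}

// Array initialized element by element, with no terminator: 3 elements.
std::size_t size_of_brace_array() {
    char t[] = { 'a', 'b', 'c' };
    return sizeof t;
}

// p only points at the literal. Its pointee is const, so a write such as
// p[0] = 'x'; would be rejected by the compiler.
std::size_t length_via_pointer() {
    const char *p = "abc";
    return std::strlen(p);
}

// Usage: call from main() to run the checks.
void run_size_checks() {
    assert(size_of_literal_array() == 4);
    assert(size_of_brace_array() == 3);
    assert(length_via_pointer() == 3);
}
```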
According to 6.5.2.1 Array subscripting, dereferencing and array subscripting are identical:
> The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))).
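A short sketch of that identity, assuming built-in types only (overloaded operator[] does not follow this rule); the helper names are made up for the example.

```cpp
#include <cassert>

// Reads element i in three spellings. By the definition quoted above they
// are the same expression, and because addition commutes i[a] works too.
bool subscript_forms_agree(const int *a, int i) {
    return a[i] == *(a + i) && *(a + i) == i[a];
}

// The same identity at the address level: &a[i] is a + i.
bool address_forms_agree(const int *a, int i) {
    return &a[i] == a + i;
}

// Usage: call from main() to run the checks.
void run_subscript_checks() {
    int data[] = { 10, 20, 30 };
    assert(subscript_forms_agree(data, 2));  // the array decays to a pointer here
    assert(address_forms_agree(data, 1));
    assert(2[data] == 30);
}
```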
The differences between arrays and pointers are:
- a pointer carries no information about the size of the memory behind it (there is no portable way to get it)
- an array of incomplete type cannot be constructed
- a pointer type may be derived from an incomplete type
- a pointer can define a recursive structure (a consequence of the previous two points)
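The points in this list can be sketched in a few lines. This is an illustration under the assumption of a C++11 compiler; Opaque, Node, element_count and sum_list are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Incomplete type: declared here but never defined in this sketch.
struct Opaque;

// A pointer to an incomplete type is fine...
using OpaquePtr = Opaque *;
// ...but an array of it would not compile:
// using OpaqueArr = Opaque[4];   // error: incomplete element type

// Recursive structure: Node is still incomplete inside its own definition,
// so it can contain a Node* but not a Node or an array of Node.
struct Node {
    int value;
    Node *next;
};

// The element count can be recovered from an array type...
template <typename T, std::size_t N>
std::size_t element_count(T (&)[N]) { return N; }

// ...but a pointer needs an external convention (here: a null terminator).
int sum_list(const Node *head) {
    int total = 0;
    for (; head != nullptr; head = head->next) total += head->value;
    return total;
}

// Usage: call from main() to run the checks.
void run_difference_checks() {
    int arr[5] = {};
    int *ptr = arr;
    assert(element_count(arr) == 5);
    assert(sizeof arr == 5 * sizeof(int));
    assert(sizeof ptr == sizeof(int *));  // size of the pointer, not of the data
    Node c{3, nullptr}, b{2, &c}, a{1, &b};
    assert(sum_list(&a) == 6);
}
```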
More helpful information on the subject can be found at http://www.cplusplus.com/forum/articles/9/
Solution 2 - C++
char c[] = "test";
This creates an array containing the string test, so you can modify any character, say
c[2] = 'p';
but,
char *c = "test";
Here c points to a string literal, whose characters must not be modified (in C++ the literal has type array of const char).
Any attempt to modify the literal is undefined behavior and typically causes a segfault. So
c[2] = 'p';
is illegal now.
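A minimal sketch of the safe variants, assuming C++11; patched_copy and read_literal are names invented for this example. The write goes to the array copy only, and the pointer version is declared const so that the compiler rejects a write instead of leaving it to a runtime crash.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Replaces the character at index i in a writable copy of "test".
// The literal itself is never touched.
std::string patched_copy(std::size_t i, char ch) {
    char c[] = "test";       // array: 5 writable chars copied from the literal
    c[i] = ch;
    return std::string(c);
}

// Reading through a pointer to the literal is always fine.
char read_literal(std::size_t i) {
    const char *c = "test";  // pointer: refers to the literal's own storage
    return c[i];
}

// Usage: call from main() to run the checks.
void run_literal_checks() {
    assert(patched_copy(2, 'p') == "tept");
    assert(read_literal(2) == 's');  // the literal is unchanged
}
```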
Solution 3 - C++
char [] denotes the type "array of unknown bound of char", while char * denotes the type "pointer to char". As you've observed, when a definition of a variable of type "array of unknown bound of char" is initialised with a string literal, the type is converted to "array[N] of char" where N is the appropriate size. The same applies in general to initialisation from an array aggregate:
int arr[] = { 0, 1, 2 };
arr is converted to type "array[3] of int".
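The completed types can be confirmed at compile time. This is a sketch assuming C++11 and the <type_traits> header; the variable later and the helper arr_length are hypothetical additions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

int arr[] = { 0, 1, 2 };     // bound deduced from the initializer list
char str[] = "test";         // 4 characters plus the terminating '\0'
extern int later[];          // no initializer: the bound stays unknown

static_assert(std::is_same<decltype(arr), int[3]>::value, "arr is int[3]");
static_assert(std::is_same<decltype(str), char[5]>::value, "str is char[5]");
static_assert(std::is_same<decltype(later), int[]>::value, "later is int[]");

// Number of elements in arr, taken from its completed type.
std::size_t arr_length() { return std::extent<decltype(arr)>::value; }

// Usage: call from main() to run the checks.
void run_bound_checks() {
    assert(arr_length() == 3);
    assert(sizeof str == 5);
    assert(sizeof arr == 3 * sizeof(int));
}
```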
In a user-defined type definition (struct, class or union), array-of-unknown-bound types are prohibited in C++, although in C (C99 and later) they are allowed as the last member of a struct, where they can be used to access allocated memory past the end of the struct; this usage is called a "flexible array member".
Recursive type construction is another difference: one can construct both pointers to and arrays of char * (e.g. char ** and char *[10]), but one cannot construct an array whose elements are arrays of unknown bound. Only the first bound may be omitted, so char [10][] is ill-formed, while char [][10] (array of unknown bound of char[10]) and char (*)[] (pointer to array of unknown bound) are fine.
Finally, cv-qualification operates differently; given typedef char *ptr_to_char and typedef char array_of_unknown_bound_of_char[], cv-qualifying the pointer version will behave as expected, while cv-qualifying the array version will migrate the cv-qualification to the element type: that is, const array_of_unknown_bound_of_char is equivalent to const char [] and not the fictional char (const) []. This means that in a function definition, where array-to-pointer decay operates on the arguments prior to constructing the prototype,
void foo (int const a[]) {
a = 0;
}
is legal; in C++ there is no way to make the array-of-unknown-bound parameter itself non-modifiable (C99 offers the int a[const] syntax for this purpose).
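Both effects can be checked with type traits. The following is a sketch assuming C++11; the two typedefs are the ones from the text, while parameter_can_be_reassigned is a hypothetical stand-in for foo that returns a value so the behaviour can be observed.

```cpp
#include <cassert>
#include <type_traits>

typedef char *ptr_to_char;
typedef char array_of_unknown_bound_of_char[];

// const on the pointer typedef qualifies the pointer itself...
static_assert(std::is_same<const ptr_to_char, char *const>::value,
              "const pointer to mutable char");
// ...while const on the array typedef lands on the element type.
static_assert(std::is_same<const array_of_unknown_bound_of_char, const char[]>::value,
              "array of const char");

// Inside the function the parameter is an ordinary, reassignable pointer.
bool parameter_can_be_reassigned(int const a[]) {
    int const *before = a;
    a = nullptr;             // legal: only the pointee is const
    return before != a;
}

// The array parameter was adjusted to a pointer in the function type.
static_assert(std::is_same<decltype(parameter_can_be_reassigned),
                           bool(const int *)>::value,
              "int const a[] is adjusted to const int *");

// Usage: call from main() to run the check.
void run_cv_checks() {
    int data[2] = { 1, 2 };
    assert(parameter_can_be_reassigned(data));
}
```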
Solution 4 - C++
The whole lot becomes clear once you know that declaring a pointer variable does not create an object of the type it points at. It creates only the pointer variable itself.
So, in practice, if you need a string then you need to define an array of characters; a pointer can be used to refer to it later on.
Solution 5 - C++
In most expressions an array name behaves like a constant pointer to its first element, although the two are distinct types.
Also, char c[] allocates memory for the array, whose base address is c itself. No separate memory is allocated for storing that address.
Writing char *c = "test" stores the string elsewhere (in static storage) and puts its base address in c. A separate memory location is used to store c itself.
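The storage difference can be observed by comparing addresses. This is a rough sketch assuming C++11; the function names are invented for the example.

```cpp
#include <cassert>

// For an array, the array object and its first element share an address:
// no separate object holds "the address of the array".
bool array_has_no_separate_pointer() {
    char c[] = "test";
    return static_cast<void *>(&c) == static_cast<void *>(&c[0]);
}

// A pointer is an object of its own, stored apart from what it points at.
bool pointer_is_separate_object() {
    const char *c = "test";
    return static_cast<const void *>(&c) != static_cast<const void *>(c);
}

// sizeof reflects the same distinction: whole array versus one pointer.
bool sizes_differ_in_meaning() {
    char arr[] = "test";
    const char *ptr = "test";
    return sizeof arr == 5 && sizeof ptr == sizeof(const char *);
}

// Usage: call from main() to run the checks.
void run_storage_checks() {
    assert(array_has_no_separate_pointer());
    assert(pointer_is_separate_object());
    assert(sizes_differ_in_meaning());
}
```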