What is the difference between returning a char* and a char[] from a function?

C, String, Pointers

C Problem Overview


Why does the first function return the string "Hello, World!" while the second function returns nothing? I thought the return value of both functions would be undefined, since they return data that is out of scope.

#include <stdio.h>
// This successfully returns "Hello, World!"
char* function1()
{
    char* string = "Hello, World!";
    return string;
}
// This returns nothing
char* function2()
{
    char string[] = "Hello, World!";
    return string; 
}

int main()
{
    char* foo1 = function1();
    printf("%s\n", foo1); // Prints "Hello, World!"
    printf("------------\n");
    char* foo2 = function2();
    printf("%s\n", foo2); // Prints nothing
    return 0;
}

C Solutions


Solution 1 - C

> the second function returns nothing

The string array in the second function:

char string[] = "Hello, World!";

has automatic storage duration. It does not exist after the control flow has returned from the function.

Whereas string in the first function:

char* string = "Hello, World!";

points to a string literal, which has static storage duration. That means the string still exists after the function returns. What you are returning from the function is a pointer to this string literal.

Solution 2 - C

The first thing you need to learn about strings is that a string literal is really an array of read-only characters whose lifetime is the full program. That means it never ceases to exist; it is available throughout the execution of the program.

What the first function (function1) does is return a pointer to the first element of such an array.

With the second function (function2) things are a little different. Here the variable string is a local array within the function. As such, it will go out of scope and cease to exist once the function returns. The function returns a pointer to the first element of that array, but that pointer immediately becomes invalid, since it points to something that no longer exists. Dereferencing it (which happens when you pass it to printf) leads to undefined behavior.

Solution 3 - C

A very important thing to remember when coding in C or other stack-based languages is that when a function returns, it (and all its local storage) is gone. This means that if you want someone else to be able to see the results of your function's hard work, you have to put them somewhere that will still exist after your function has ceased to, and that means you need an understanding of where C stores things and how.

You probably already know how an array operates in C: indexing is just a memory address incremented by a multiple of the element size. You probably also know that C does no bounds checking, so if you access the 11th element of a ten-element array, no one is going to stop you (it is still undefined behavior, even if you only read). What you may not know is that C extends this idea to the way it handles functions and variables. On a typical implementation, each function call gets an area of memory on a stack, and the storage for its local variables is just offsets from that location. Your function returned a pointer to a local variable, specifically the address of the stack location that held the 'H' of "Hello, World!", but when you then called another function (printf), that memory was reused by printf to do what it needed. You can see this easily enough (DO NOT DO THIS IN PRODUCTION CODE!!!):

char* foo2 = function2();
char ch = foo2[0];  // Do not do this in live code! Undefined behavior.
printf("%s\n", foo2);  // stack used by function2 now reused by printf()
printf("ch is %c\n", ch);  // will often have the value 'H'
 

Solution 4 - C

>I thought the return value of both of the functions would be undefined since they are returning data that is out of scope.

No. That's not the case.

In function1 you are returning a pointer to a string literal. Returning a pointer to a string literal is fine because string literals have static storage duration. That is not true of an automatic local variable.

In function function2 the array string is an automatic local variable and the statement

return string; 

returns a pointer to an automatic local variable. Once the function returns, the variable string no longer exists. Dereferencing the returned pointer causes undefined behavior.

Solution 5 - C

"Hello, World!" is a string literal, which has a static storage duration, so the problem is elsewhere. Your first function returns the value of string, which is fine. The second function however returns the address of a local variable (string is the same as &string[0]), resulting in undefined behavior. Your second printf statement could print nothing, or "Hello, World!", or something else entirely. On my machine, the program just gets a segmentation fault.

Always take a look at messages your compiler outputs. For your example, gcc gives:

file.c:12:12: warning: function returns address of local variable [-Wreturn-local-addr]
    return string; 
           ^

which is pretty much self-explanatory.

Solution 6 - C

>I thought the return value of both of the functions would be undefined since they are returning data that is out of scope.

Both functions return a pointer. What matters is the lifetime (storage duration) of the referent.

In function1, the referent is the string literal "Hello, World!", which has static storage duration. string is a local variable which points to that string, and conceptually, a copy of that pointer is returned (in practice, the compiler will avoid unnecessarily copying the value).

In function2, conceptually the referent is the local array string, which has been automatically sized (at compile time) to be big enough to hold the string literal (including a null terminator, of course), and been initialized with a copy of the string. The function would return a pointer to that array, except that the array has automatic storage duration and thus no longer exists after exiting the function (it is indeed "out of scope", in more familiar terminology). Since this is undefined behaviour, the compiler may in practice do all sorts of things.

>Does that mean that all char* are static?

Again, you need to distinguish between the pointer and the referent. Pointers point at data; they don't themselves "contain" the data.

You have reached a point where you should properly study what arrays and pointers actually are in C - unfortunately, it's a bit of a mess. The best reference I can offer offhand is this, in Q&A format.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Tobs | View Question on Stackoverflow
Solution 1 - C | ネロク | View Answer on Stackoverflow
Solution 2 - C | Some programmer dude | View Answer on Stackoverflow
Solution 3 - C | Paul Smith | View Answer on Stackoverflow
Solution 4 - C | haccks | View Answer on Stackoverflow
Solution 5 - C | Dmitry Grigoryev | View Answer on Stackoverflow
Solution 6 - C | Karl Knechtel | View Answer on Stackoverflow