Where can you and can't you declare new variables in C?


C Problem Overview


I heard (probably from a teacher) that one should declare all variables on top of the program/function, and that declaring new ones among the statements could cause problems.

But then I was reading K&R and I came across this sentence: "Declarations of variables (including initializations) may follow the left brace that introduces any compound statement, not just the one that begins a function". The book follows with an example:

if (n > 0){
    int i;
    for (i=0;i<n;i++)
    ...
}

I played a bit with the concept, and it works even with arrays. For example:

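#include <stdio.h>

int main(){
    int x = 0 ;

    while (x<10){
        if (x>5){
            int y[x];   /* variable-length array, sized at run time */
            y[0] = 10;
            printf("%d %d\n",y[0],y[4]);   /* y[4] was never assigned, so its value is indeterminate */
        }
        x++;
    }
}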

So when exactly am I not allowed to declare variables? For example, what if my variable declaration is not right after the opening brace? Like here:

#include <stdio.h>

int main(){
    int x = 10;

    x++;
    printf("%d\n",x);

    int z = 6;
    printf("%d\n",z);
}

Could this cause trouble depending on the program/machine?

C Solutions


Solution 1 - C

I also often hear that putting variables at the top of the function is the best way to do things, but I strongly disagree. I prefer to confine variables to the smallest scope possible so they have less chance of being misused and so I have less stuff filling up my mental space on each line of the program.

While all versions of C allow lexical block scope, where you can declare the variables depends on the version of the C standard that you are targeting:

C99 onwards or C++

Modern C compilers such as gcc and clang support the C99 and C11 standards, which allow you to place a declaration anywhere inside a block, mixed freely with statements. The variable's scope runs from the point of declaration to the end of the enclosing block (the next matching closing brace).

if( x < 10 ){
   printf("%d", 17);  // z is not in scope in this line
   int z = 42;
   printf("%d", z);   // z is in scope in this line
}

You can also declare variables inside for loop initializers. The variable will exist only inside the loop.

for(int i=0; i<10; i++){
    printf("%d", i);
}

ANSI C (C90)

If you are targeting the older ANSI C standard, then you are limited to declaring variables immediately after an opening brace [1].

This doesn't mean you have to declare all your variables at the top of your functions though. In C you can put a brace-delimited block anywhere a statement could go (not just after things like if or for) and you can use this to introduce new variable scopes. The following is the ANSI C version of the previous C99 examples:

if( x < 10 ){
   printf("%d", 17);  /* z is not in scope in this line */

   {
       int z = 42;
       printf("%d", z);   /* z is in scope in this line */
   }
}

{
    int i;
    for(i=0; i<10; i++){
        printf("%d", i);
    }
}

[1] Note that if you are using gcc you need to pass the -pedantic flag (or -pedantic-errors, to turn the warnings into errors) to make it actually enforce the C90 standard and complain that the variables are declared in the wrong place. If you just use -std=c90, gcc accepts a superset of C90 that also allows the more flexible C99 variable declarations.

Solution 2 - C

Solution 1 covers what ANSI C allows, but it doesn't address why your teachers told you to declare your variables at the top of your functions. Declaring variables in odd places can make your code harder to read, and that can cause bugs.

Take the following code as an example.

#include <stdio.h>

int main() {
    int i, j;
    i = 20;
    j = 30;
    
    printf("(1) i: %d, j: %d\n", i, j);
    
    {
        int i;
        i = 88;
        j = 99;
        printf("(2) i: %d, j: %d\n", i, j);
    }
    
    printf("(3) i: %d, j: %d\n", i, j);
    
    return 0;
}

As you can see, I've declared i twice. Well, to be more precise, I've declared two variables, both with the name i. You might think this would cause an error, but it doesn't, because the two i variables are in different scopes. You can see this more clearly when you look at the output of this program.

(1) i: 20, j: 30
(2) i: 88, j: 99
(3) i: 20, j: 99

First, we assign 20 and 30 to i and j respectively. Then, inside the curly braces, we assign 88 and 99. So why does j keep its new value while i goes back to being 20? It's because of the two different i variables.

Between the inner set of curly braces the i variable with the value 20 is hidden and inaccessible, but since we have not declared a new j, we are still using the j from the outer scope. When we leave the inner set of curly braces, the i holding the value 88 goes away, and we again have access to the i with the value 20.

Sometimes this behavior is a good thing, other times, maybe not, but it should be clear that if you use this feature of C indiscriminately, you can really make your code confusing and hard to understand.

Solution 3 - C

If your compiler allows it, then it's fine to declare variables anywhere you want. In fact, the code is more readable (IMHO) when you declare a variable where you use it instead of at the top of a function, because it makes it easier to spot errors, e.g. forgetting to initialize the variable or accidentally hiding the variable.

Solution 4 - C

A post shows the following code:

//C99
printf("%d", 17);
int z=42;
printf("%d", z);

//ANSI C
printf("%d", 17);
{
    int z=42;
    printf("%d", z);
}

and I think the implication is that these are equivalent. They are not. If int z is placed at the bottom of this code snippet, it causes a redefinition error against the first z definition but not against the second.

However, repeating the following line several times in the same block:

//C99
for(int i=0; i<10; i++){}

does work, which shows the subtlety of this C99 rule.

Personally, I passionately shun this C99 feature.

The argument that it narrows the scope of a variable is false, as shown by these examples. Under the new rule, you cannot safely declare a variable until you have scanned the entire block, whereas formerly you only needed to understand what was going on at the head of each block.

Solution 5 - C

As per The C Programming Language by K&R:

In C, all variables must be declared before they are used, usually at the beginning of the function before any executable statements.

Note the word "usually": it does not say "must".

Solution 6 - C

With clang and gcc (gcc version 8.2.1 20181011, clang version 6.0.1), I encountered major issues with the following:

  {
    char f1[]="This_is_part1 This_is_part2";
    char f2[64]; char f3[64];
    sscanf(f1,"%s %s",f2,f3);      //split part1 to f2, part2 to f3 
  }

Neither compiler liked f1, f2 or f3 being declared within the block. I had to relocate f1, f2 and f3 to the top of the function definition. The compilers did not mind the definition of an integer within the block.

Solution 7 - C

Internally, all variables local to a function are allocated on the stack or in CPU registers, and the generated machine code moves values between the registers and the stack (called register spilling) if the compiler is poor or the CPU doesn't have enough registers to keep all the balls in the air.

To allocate space on the stack, many CPUs have two special registers: the Stack Pointer (SP) and the Base Pointer (BP), also called the frame pointer (meaning the stack frame local to the current function scope). SP points to the current top of the stack, while BP points to a fixed position in the frame, with the function's working data on one side of it and the function arguments on the other. When a function is invoked, it pushes the BP of the caller/parent function onto the stack (at the location SP points to), sets the current SP as the new BP, then adjusts SP by the number of bytes it needs for locals and spilled registers (on x86 the stack grows downward, so SP is decreased), does its computation, and on return restores its parent's BP by popping it from the stack.

Generally, keeping your variables inside their own {} scope could speed up compilation and improve the generated code by reducing the size of the graph the compiler has to walk to determine which variables are used where and how. In some cases (especially when goto is involved) the compiler can miss the fact that a variable won't be used anymore unless you explicitly tell it the variable's scope. Compilers may also have a time or depth limit when searching the program graph.

The compiler could place variables declared near each other in the same stack area, which means loading one will pull the others into the cache. In the same way, declaring a variable with the register keyword could give the compiler a hint that you want to avoid spilling that variable to the stack.

The strict C90 standard requires declarations to come right after an opening {, while C99 (following C++ and a GCC extension) allows declaring variables further into the body, which complicates goto and case statements. C99, like C++, also allows declaring variables inside the for loop initialization, where they are limited to the scope of the loop.

Last but not least, for another human being reading your code, it is overwhelming to see the top of a function littered with half a hundred variable declarations instead of having them localized at the places where they are used. Localizing them also makes it easier to comment out their use.

TLDR: using {} to explicitly state a variable's scope can help both the compiler and the human reader.

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stack Overflow
Question | Daniel Scocco | View Question on Stack Overflow
Solution 1 - C | hugomg | View Answer on Stack Overflow
Solution 2 - C | haydenmuhl | View Answer on Stack Overflow
Solution 3 - C | AndersK | View Answer on Stack Overflow
Solution 4 - C | user2073625 | View Answer on Stack Overflow
Solution 5 - C | Gagandeep kaur | View Answer on Stack Overflow
Solution 6 - C | Leslie Satenstein | View Answer on Stack Overflow
Solution 7 - C | SmugLispWeenie | View Answer on Stack Overflow