What is the simplest standard-conforming way to produce a segfault in C?


C Problem Overview


I think the question says it all. An example covering most standards from C89 to C11 would be helpful. I thought of this one, but I guess it is just undefined behaviour:

#include <stdio.h>

int main( int argc, char* argv[] )
{
  const char *s = NULL;
  printf( "%c\n", s[0] );
  return 0;
}

EDIT:

As some votes requested clarification: I wanted to have a program with a usual programming error (the simplest I could think of was a segfault) that is guaranteed (by the standard) to abort. This is a bit different from the minimal segfault question, which doesn't care about this guarantee.

C Solutions


Solution 1 - C

raise() can be used to raise a segfault:

raise(SIGSEGV);

Solution 2 - C

A segmentation fault is an implementation-defined behavior. The standard does not define how the implementation should deal with undefined behavior, and in fact the implementation could optimize out undefined behavior and still be compliant. To be clear, implementation-defined behavior is behavior which is not specified by the standard but which the implementation should document. Undefined behavior is code that is non-portable or erroneous and whose behavior is unpredictable and therefore cannot be relied on.

If we look at the C99 draft standard, §3.4.3 undefined behavior, which comes under the Terms, definitions and symbols section, says in paragraph 1 (emphasis mine going forward):

>behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements

and in paragraph 2 says:

>NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).

If, on the other hand, you simply want a method defined in the standard that will cause a segmentation fault on most Unix-like systems then raise(SIGSEGV) should accomplish that goal. Although, strictly speaking, SIGSEGV is defined as follows:

>SIGSEGV an invalid access to storage

and §7.14 Signal handling <signal.h> says:

>An implementation need not generate any of these signals, except as a result of explicit calls to the raise function. Additional signals and pointers to undeclarable functions, with macro definitions beginning, respectively, with the letters SIG and an uppercase letter or with SIG_ and an uppercase letter, may also be specified by the implementation. The complete set of signals, their semantics, and their default handling is implementation-defined; all signal numbers shall be positive.

Solution 3 - C

The standard only mentions undefined behavior. It knows nothing about memory segmentation. Also note that the code that produces the error is not standard-conformant. Your code cannot invoke undefined behavior and be standard conformant at the same time.

Nonetheless, the shortest way to produce a segmentation fault on architectures that do generate such faults would be:

int main()
{
    *(int*)0 = 0;
}

Why does this produce a segfault on those architectures? Because the system traps every access to memory address 0; it can never be a valid access (at least not from userspace code).

Note, of course, that not all architectures work the same way. On some of them, the above might not crash at all but produce other kinds of errors instead. Or the statement could even be perfectly valid, with memory location 0 being accessible. This is one of the reasons why the standard doesn't actually define what happens.

Solution 4 - C

A correct program doesn't produce a segfault. And you cannot describe deterministic behaviour of an incorrect program.

A "segmentation fault" is a thing that an x86 CPU does. You get it by attempting to reference memory in an incorrect way. It can also refer to a situation where memory access causes a page fault (i.e. trying to access memory that's not loaded into the page tables) and the OS decides that you had no right to request that memory. To trigger those conditions, you need to program directly for your OS and your hardware. It is nothing that is specified by the C language.

Solution 5 - C

If we assume we are not raising a signal by calling raise, a segmentation fault is likely to come from undefined behavior. Undefined behavior is undefined, and a compiler is free to refuse to translate the program, so no answer that relies on undefined behavior is guaranteed to fail on all implementations. Moreover, a program which invokes undefined behavior is an erroneous program.

But this one is the shortest I can get that segfaults on my system:

main(){main();}

(I compile with gcc and -std=c89 -O0).

And by the way, does this program really invoke undefined behavior?

Solution 6 - C

 main;

That's it.

Really.

Essentially, this defines main as a variable (with implicit type int, as C89 allows). To the linker, variables and functions are both just symbols, that is, names for addresses in memory, so it does not distinguish between them, and the build does not fail.

However, the problem rests in how the system runs executables. In a nutshell, the toolchain links an environment-preparing entry point into every hosted C executable, which basically boils down to "call main".

In this particular case, however, main is a variable, so it is placed in a non-executable section of memory called .bss, intended for variables (as opposed to .text for the code). Trying to execute code in .bss violates its specific segmentation, so the system throws a segmentation fault.

To illustrate, here's (part of) an objdump of the resulting file:

# (unimportant)

Disassembly of section .text:

0000000000001020 <_start>:
    1020:	f3 0f 1e fa          	endbr64 
    1024:	31 ed                	xor    %ebp,%ebp
    1026:	49 89 d1             	mov    %rdx,%r9
    1029:	5e                   	pop    %rsi
    102a:	48 89 e2             	mov    %rsp,%rdx
    102d:	48 83 e4 f0          	and    $0xfffffffffffffff0,%rsp
    1031:	50                   	push   %rax
    1032:	54                   	push   %rsp
    1033:	4c 8d 05 56 01 00 00 	lea    0x156(%rip),%r8        # 1190 <__libc_csu_fini>
    103a:	48 8d 0d df 00 00 00 	lea    0xdf(%rip),%rcx        # 1120 <__libc_csu_init>
    
    # This is where the program should call main
    1041:	48 8d 3d e4 2f 00 00 	lea    0x2fe4(%rip),%rdi      # 402c <main> 
    1048:	ff 15 92 2f 00 00    	callq  *0x2f92(%rip)          # 3fe0 <__libc_start_main@GLIBC_2.2.5>
    104e:	f4                   	hlt    
    104f:	90                   	nop

# (nice things we still don't care about)

Disassembly of section .data:

0000000000004018 <__data_start>:
    ...

0000000000004020 <__dso_handle>:
    4020:	20 40 00             	and    %al,0x0(%rax)
    4023:	00 00                	add    %al,(%rax)
    4025:	00 00                	add    %al,(%rax)
    ...

Disassembly of section .bss:

0000000000004028 <__bss_start>:
    4028:	00 00                	add    %al,(%rax)
    ...

# main is in .bss (variables) instead of .text (code)

000000000000402c <main>:
    402c:	00 00                	add    %al,(%rax)
    ...

# aaand that's it! 

PS: This won't work if you compile to a flat executable. Instead, you will cause undefined behaviour.

Solution 7 - C

On some platforms, a standard-conforming C program can fail with a segmentation fault if it requests too many resources from the system. For instance, allocating a large object with malloc can appear to succeed, but later, when the object is accessed, it will crash.

Note that such a program is not strictly conforming; programs which meet that definition have to stay within each of the minimum implementation limits.

A standard-conforming C program cannot produce a segmentation fault otherwise, because the only other ways are via undefined behavior.

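The SIGSEGV signal can be raised explicitly with raise(SIGSEGV), since <signal.h> defines the SIGSEGV macro, but the standard does not require an implementation to generate that signal on its own for any invalid access.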

(In this answer, "standard-conforming" means: "Uses only the features described in some version of the ISO C standard, avoiding unspecified, implementation-defined or undefined behavior, but not necessarily confined to the minimum implementation limits.")

Solution 8 - C

The simplest form considering the smallest number of characters is:

++*(int*)0;

Solution 9 - C

Most of the answers to this question are talking around the key point, which is: the C standard does not include the concept of a segmentation fault. (It does include the signal number SIGSEGV, and has done so since C89, but it does not define any circumstance in which that signal is delivered other than raise(SIGSEGV), which, as discussed in other answers, doesn't count.)

Therefore, there is no "strictly conforming" program (i.e. program that uses only constructs whose behavior is fully defined by the C standard, alone) that is guaranteed to cause a segmentation fault.

Segmentation faults are defined by a different standard, POSIX. This program is guaranteed to provoke either a segmentation fault, or the functionally equivalent "bus error" (SIGBUS), on any system that is fully conforming with POSIX.1-2008 including the Memory Protection and Advanced Realtime options, provided that the calls to sysconf, posix_memalign and mprotect succeed. My reading of C99 is that this program has implementation-defined (not undefined!) behavior considering only that standard, and therefore it is conforming but not strictly conforming.

#define _XOPEN_SOURCE 700
#include <sys/mman.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>

int main(void)
{
    size_t pagesize = sysconf(_SC_PAGESIZE);
    if (pagesize == (size_t)-1) {
        fprintf(stderr, "sysconf: %s\n", strerror(errno));
        return 1;
    }
    void *page;
    int err = posix_memalign(&page, pagesize, pagesize);
    if (err || !page) {
        fprintf(stderr, "posix_memalign: %s\n", strerror(err));
        return 1;
    }
    if (mprotect(page, pagesize, PROT_NONE)) {
        fprintf(stderr, "mprotect: %s\n", strerror(errno));
        return 1;
    }
    *(long *)page = 0xDEADBEEF;
    return 0;
}

Solution 10 - C

It's hard to define a method that makes a program segfault on unspecified platforms. "Segmentation fault" is a loose term that is not defined for all platforms (e.g. simple small computers).

Considering only the operating systems that support processes, processes can receive notification that a segmentation fault occurred.

Further, limiting operating systems to Unix-like OSes, a reliable method for a process to receive a SIGSEGV signal is kill(getpid(), SIGSEGV).

As is the case in most cross-platform problems, each platform may (and usually does) have a different definition of seg-faulting.

But to be practical, current macOS, Linux and Windows systems will segfault on

*(int*)0 = 0;

Further, it's not bad behaviour to cause a segfault. Some implementations of assert() cause a SIGSEGV signal which might produce a core file. Very useful when you need to autopsy.

What's worse than causing a segfault is hiding it:

try
{
     anyfunc();
}
catch (...) 
{
     printf("?\n");
}

which hides the origin of an error and all you've got to go on is:

?


Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | math | View Question on Stackoverflow
Solution 1 - C | msam | View Answer on Stackoverflow
Solution 2 - C | Shafik Yaghmour | View Answer on Stackoverflow
Solution 3 - C | Nikos C. | View Answer on Stackoverflow
Solution 4 - C | Kerrek SB | View Answer on Stackoverflow
Solution 5 - C | ouah | View Answer on Stackoverflow
Solution 6 - C | TheSola10 | View Answer on Stackoverflow
Solution 7 - C | Kaz | View Answer on Stackoverflow
Solution 8 - C | Enock Gomes Neto | View Answer on Stackoverflow
Solution 9 - C | zwol | View Answer on Stackoverflow
Solution 10 - C | effbiae | View Answer on Stackoverflow