What makes a C standard library function dangerous, and what is the alternative?

CFunction

C Problem Overview


While learning C I regularly come across resources which recommend that some functions (e.g. gets()) are never to be used, because they are either difficult or impossible to use safely.

If the C standard library contains a number of these "never-use" functions, it would seem necessary to learn a list of them, what makes them unsafe, and what to do instead.

So far, I've learned that functions which:

  • Cannot be prevented from overwriting memory
  • Are not guaranteed to null-terminate a string
  • Maintain internal state between calls

are commonly regarded as being unsafe to use. Is there a list of functions which exhibit these behaviours? Are there other types of functions which are impossible to use safely?

C Solutions


Solution 1 - C

In the old days, most of the string functions had no bounds checking. Of course they couldn't just delete the old functions, or modify their signatures to include an upper bound, that would break compatibility. Now, for almost every one of those functions, there is an alternative "n" version. For example:

strcpy -> strncpy
strlen -> strnlen
strcmp -> strncmp
strcat -> strncat
strdup -> strndup
sprintf -> snprintf
wcscpy -> wcsncpy
wcslen -> wcsnlen

And more.

See also https://github.com/leafsr/gcc-poison which is a project to create a header file that causes gcc to report an error if you use an unsafe function.

Solution 2 - C

Yes, fgets(..., ..., STDIN) is a good alternative to gets(), because it takes a size parameter (gets() has in fact been removed from the C standard entirely in C11). Note that fgets() is not exactly a drop-in replacement for gets(), because the former will include the terminating \n character if there was room in the buffer for a complete line to be read.

scanf() is considered problematic in some cases, rather than straight-out "bad", because if the input doesn't conform to the expected format it can be impossible to recover sensibly (it doesn't let you rewind the input and try again). If you can just give up on badly formatted input, it's useable. A "better" alternative here is to use an input function like fgets() or fgetc() to read chunks of input, then scan it with sscanf() or parse it with string handling functions like strchr() and strtol(). Also see below for a specific problem with the "%s" conversion specifier in scanf().

It's not a standard C function, but the BSD and POSIX function mktemp() is generally impossible to use safely, because there is always a TOCTTOU race condition between testing for the existence of the file and subsequently creating it. mkstemp() or tmpfile() are good replacements.

strncpy() is a slightly tricky function, because it doesn't null-terminate the destination if there was no room for it. Despite the apparently generic name, this function was designed for creating a specific style of string that differs from ordinary C strings - strings stored in a known fixed width field where the null terminator is not required if the string fills the field exactly (original UNIX directory entries were of this style). If you don't have such a situation, you probably should avoid this function.

atoi() can be a bad choice in some situations, because you can't tell when there was an error doing the conversion (e.g., if the number exceeded the range of an int). Use strtol() if this matters to you.

strcpy(), strcat() and sprintf() suffer from a similar problem to gets() - they don't allow you to specify the size of the destination buffer. It's still possible, at least in theory, to use them safely - but you are much better off using strncat() and snprintf() instead (you could use strncpy(), but see above). Do note that whereas the n for snprintf() is the size of the destination buffer, the n for strncat() is the maximum number of characters to append and does not include the null terminator. Another alternative, if you have already calculated the relevant string and buffer sizes, is memmove() or memcpy().

On the same theme, if you use the scanf() family of functions, don't use a plain "%s" - specify the size of the destination e.g. "%200s".

Solution 3 - C

strtok() is generally considered to be evil because it stores state information between calls. Don't try running THAT in a multithreaded environment!

Solution 4 - C

Strictly speaking, there is one really dangerous function. It is gets() because its input is not under the control of the programmer. All other functions mentioned here are safe in and of themselves. "Good" and "bad" boils down to defensive programming, namely preconditions, postconditions and boilerplate code.

Let's take strcpy() for example. It has some preconditions that the programmer must fulfill before calling the function. Both strings must be valid, non-NULL pointers to zero terminated strings, and the destination must provide enough space with a final string length inside the range of size_t. Additionally, the strings are not allowed to overlap.

That is quite a lot of preconditions, and none of them is checked by strcpy(). The programmer must be sure they are fulfilled, or he must explicitly test them with additional boilerplate code before calling strcpy():

n = DST_BUFFER_SIZE;
if ((dst != NULL) && (src != NULL) && (strlen(dst)+strlen(src)+1 <= n))
{
    strcpy(dst, src);
}

Already silently assuming the non-overlap and zero-terminated strings.

strncpy() does include some of these checks, but it adds another postcondition the programmer must take care for after calling the function, because the result may not be zero-terminated.

strncpy(dst, src, n);
if (n > 0)
{
    dst[n-1] = '\0';
}

Why are these functions considered "bad"? Because they would require additional boilerplate code for each call to really be on the safe side when the programmer assumes wrong about the validity, and programmers tend to forget this code.

Or even argue against it. Take the printf() family. These functions return a status that indicate error and success. Who checks if the output to stdout or stderr succeeded? With the argument that you can't do anything at all when the standard channels are not working. Well, what about rescuing the user data and terminating the program with an error-indicating exit code? Instead of the possible alternative of crash and burn later with corrupted user data.

In a time- and money-limited environment it is always the question of how much safety nets you really want and what is the resulting worst case scenario? If it is a buffer overflow as in case of the str-functions, then it makes sense to forbid them and probably provide wrapper functions with the safety nets already within.

One final question about this: What makes you sure that your "good" alternatives are really good?

Solution 5 - C

Any function that does not take a maximum length parameter and instead relies on an end-of- marker to be present (such as many 'string' handling functions).

Any method that maintains state between calls.

Solution 6 - C

  • sprintf is bad, does not check size, use snprintf
  • gmtime, localtime -- use gmtime_r, localtime_r

Solution 7 - C

To add something about strncpy most people here forgot to mention. strncpy can result in performance problems as it clears the buffer to the length given.

char buff[1000];
strncpy(buff, "1", sizeof buff);

will copy 1 char and overwrite 999 bytes with 0

Another reason why I prefer strlcpy (I know strlcpy is a BSDism but it is so easy to implement that there's no excuse to not use it).

Solution 8 - C

View page 7 (PDF page 9) SAFECode Dev Practices

Edit: From the page -

strcpy family
strncpy family
strcat family
scanf family
sprintf family
gets family

Solution 9 - C

strcpy - again!

Most people agree that strcpy is dangerous, but strncpy is only rarely a useful replacement. It is usually important that you know when you've needed to truncate a string in any case, and for this reason you usually need to examine the length of the source string anwyay. If this is the case, usually memcpy is the better replacement as you know exactly how many characters you want copied.

e.g. truncation is error:

n = strlen( src );

if( n >= buflen )
    return ERROR;

memcpy( dst, src, n + 1 );

truncation allowed, but number of characters must be returned so caller knows:

n = strlen( src );

if( n >= buflen )
    n = buflen - 1;

memcpy( dst, src, n );
dst[n] = '\0';

return n;

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionCarson MyersView Question on Stackoverflow
Solution 1 - CAdam BatkinView Answer on Stackoverflow
Solution 2 - CcafView Answer on Stackoverflow
Solution 3 - CDave MarkleView Answer on Stackoverflow
Solution 4 - CSecureView Answer on Stackoverflow
Solution 5 - CMitch WheatView Answer on Stackoverflow
Solution 6 - CArtyomView Answer on Stackoverflow
Solution 7 - CPatrick SchlüterView Answer on Stackoverflow
Solution 8 - CDan McGrathView Answer on Stackoverflow
Solution 9 - CCB BaileyView Answer on Stackoverflow