Going through a text file line by line in C

CGetlineScanf

C Problem Overview


I have been working on a small exercise for my CIS class and am very confused by the methods C uses to read from a file. All that I really need to do is read through a file line by line and use the information gathered from each line to do a few manipulations. I tried using the getline method and others with no luck. My code is currently as follows:

int main(char *argc, char* argv[]){
      const char *filename = argv[0];
      FILE *file = fopen(filename, "r");
      char *line = NULL;
    
      while(!feof(file)){
        sscanf(line, filename, "%s");
        printf("%s\n", line);
      }
    return 1;
}

Right now I am getting a seg fault with the sscanf method and I am not sure why. I am a total C noob and just wondering if there was some big picture thing that I was missing. Thanks

C Solutions


Solution 1 - C

So many problems in so few lines. I probably forget some:

  • argv[0] is the program name, not the first argument;
  • if you want to read in a variable, you have to allocate its memory
  • one never loops on feof, one loops on an IO function until it fails, feof then serves to determinate the reason of failure,
  • sscanf is there to parse a line, if you want to parse a file, use fscanf,
  • "%s" will stop at the first space as a format for the ?scanf family
  • to read a line, the standard function is fgets,
  • returning 1 from main means failure

So

#include <stdio.h>

int main(int argc, char* argv[])
{
    char const* const fileName = argv[1]; /* should check that argc > 1 */
    FILE* file = fopen(fileName, "r"); /* should check the result */
    char line[256];

    while (fgets(line, sizeof(line), file)) {
        /* note that fgets don't strip the terminating \n, checking its
           presence would allow to handle lines longer that sizeof(line) */
        printf("%s", line); 
    }
    /* may check feof here to make a difference between eof and io failure -- network
       timeout for instance */

    fclose(file);

    return 0;
}

Solution 2 - C

To read a line from a file, you should use the fgets function: It reads a string from the specified file up to either a newline character or EOF.

The use of sscanf in your code would not work at all, as you use filename as your format string for reading from line into a constant string literal %s.

The reason for SEGV is that you write into the non-allocated memory pointed to by line.

Solution 3 - C

In addition to the other answers, on a recent C library (Posix 2008 compliant), you could use getline. See this answer (to a related question).

Solution 4 - C

Say you're dealing with some other delimiter, such as a \t tab, instead of a \n newline.

A more general approach to delimiters is the use of getc(), which grabs one character at a time.

Note that getc() returns an int, so that we can test for equality with EOF.

Secondly, we define an array line[BUFFER_MAX_LENGTH] of type char, in order to store up to BUFFER_MAX_LENGTH-1 characters on the stack (we have to save that last character for a \0 terminator character).

Use of an array avoids the need to use malloc and free to create a character pointer of the right length on the heap.

#define BUFFER_MAX_LENGTH 1024

int main(int argc, char* argv[])
{
    FILE *file = NULL;
    char line[BUFFER_MAX_LENGTH];
    int tempChar;
    unsigned int tempCharIdx = 0U;

    if (argc == 2)
         file = fopen(argv[1], "r");
    else {
         fprintf(stderr, "error: wrong number of arguments\n"
                         "usage: %s textfile\n", argv[0]);
         return EXIT_FAILURE;
    }

    if (!file) {
         fprintf(stderr, "error: could not open textfile: %s\n", argv[1]);
         return EXIT_FAILURE;
    }

    /* get a character from the file pointer */
    while(tempChar = fgetc(file))
    {
        /* avoid buffer overflow error */
        if (tempCharIdx == BUFFER_MAX_LENGTH) {
            fprintf(stderr, "error: line is too long. increase BUFFER_MAX_LENGTH.\n");
            return EXIT_FAILURE;
        }

        /* test character value */
        if (tempChar == EOF) {
            line[tempCharIdx] = '\0';
            fprintf(stdout, "%s\n", line);
            break;
        }
        else if (tempChar == '\n') {
            line[tempCharIdx] = '\0';
            tempCharIdx = 0U;
            fprintf(stdout, "%s\n", line);
            continue;
        }
        else
            line[tempCharIdx++] = (char)tempChar;
    }

    return EXIT_SUCCESS;
}

If you must use a char *, then you can still use this code, but you strdup() the line[] array, once it is filled up with a line's worth of input. You must free this duplicated string once you're done with it, or you'll get a memory leak:

#define BUFFER_MAX_LENGTH 1024

int main(int argc, char* argv[])
{
    FILE *file = NULL;
    char line[BUFFER_MAX_LENGTH];
    int tempChar;
    unsigned int tempCharIdx = 0U;
    char *dynamicLine = NULL;

    if (argc == 2)
         file = fopen(argv[1], "r");
    else {
         fprintf(stderr, "error: wrong number of arguments\n"
                         "usage: %s textfile\n", argv[0]);
         return EXIT_FAILURE;
    }

    if (!file) {
         fprintf(stderr, "error: could not open textfile: %s\n", argv[1]);
         return EXIT_FAILURE;
    }

    while(tempChar = fgetc(file))
    {
        /* avoid buffer overflow error */
        if (tempCharIdx == BUFFER_MAX_LENGTH) {
            fprintf(stderr, "error: line is too long. increase BUFFER_MAX_LENGTH.\n");
            return EXIT_FAILURE;
        }

        /* test character value */
        if (tempChar == EOF) {
            line[tempCharIdx] = '\0';
            dynamicLine = strdup(line);
            fprintf(stdout, "%s\n", dynamicLine);
            free(dynamicLine);
            dynamicLine = NULL;
            break;
        }
        else if (tempChar == '\n') {
            line[tempCharIdx] = '\0';
            tempCharIdx = 0U;
            dynamicLine = strdup(line);
            fprintf(stdout, "%s\n", dynamicLine);
            free(dynamicLine);
            dynamicLine = NULL;
            continue;
        }
        else
            line[tempCharIdx++] = (char)tempChar;
    }

    return EXIT_SUCCESS;
}

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionDan BradburyView Question on Stackoverflow
Solution 1 - CAProgrammerView Answer on Stackoverflow
Solution 2 - CAbrixas2View Answer on Stackoverflow
Solution 3 - CBasile StarynkevitchView Answer on Stackoverflow
Solution 4 - CAlex ReynoldsView Answer on Stackoverflow