19

Which is the fastest way to get the lines of an ASCII file?

    In a .txt file, for example. Basically, I need to count the newlines. Commented Nov 25, 2010 at 16:02

5 Answers

23

Normally you read files in C using fgets. You can also use scanf("%[^\n]"), but quite a few people reading the code are likely to find that confusing and foreign.

Edit: on the other hand, if you really do just want to count lines, a slightly modified version of the scanf approach can work quite nicely:

while (EOF != (scanf("%*[^\n]"), scanf("%*c"))) 
    ++lines;

The advantage of this is that with the '*' in each conversion, scanf reads and matches the input, but does nothing with the result. That means we don't have to waste memory on a large buffer to hold the content of a line that we don't care about (and still take a chance of getting a line that's even larger than that, so our count ends up wrong unless we go to even more work to figure out whether the input we read ended with a newline).

Unfortunately, we do have to break up the scanf into two pieces like this. scanf stops scanning when a conversion fails, and if the input contains a blank line (two consecutive newlines) we expect the first conversion to fail. Even if that fails, however, we want the second conversion to happen, to read the next newline and move on to the next line. Therefore, we attempt the first conversion to "eat" the content of the line, and then do the %c conversion to read the newline (the part we really care about). We continue doing both until the second call to scanf returns EOF (which will normally be at the end of the file, though it can also happen in case of something like a read error).
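For reference, here is a minimal sketch of the same two-call idea wrapped in a function that reads from an arbitrary FILE * via fscanf instead of stdin. The function name count_newlines_scanf is made up for illustration and is not part of the answer above.

```c
#include <stdio.h>

/* Hypothetical helper: counts newline characters using the two-call approach. */
unsigned long count_newlines_scanf(FILE *fp)
{
    unsigned long lines = 0;

    for (;;) {
        /* Skip the rest of the current line. On a blank line this matches
           nothing and fails, which is fine: the result is ignored. */
        fscanf(fp, "%*[^\n]");

        /* Consume one character. After the scan set above, that character
           can only be the newline. EOF here ends the loop. */
        if (fscanf(fp, "%*c") == EOF)
            break;

        ++lines;
    }
    return lines;
}
```

Note that a final line with no trailing newline is not counted, which is the same behavior as the loop above.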

Edit2: Of course, there is another possibility that's (at least arguably) simpler and easier to understand:

int ch;

while (EOF != (ch=getchar()))
    if (ch=='\n')
        ++lines;

The only part of this that some people find counterintuitive is that ch must be defined as an int, not a char for the code to work correctly.
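A small sketch of why the int matters, assuming a typical platform where EOF is -1 and converting the byte 0xFF to signed char yields -1 (both are assumptions, not guarantees of the C standard). The two function names are made up for illustration. The second function is wrong on purpose: it stops at the first 0xFF byte because that byte becomes indistinguishable from EOF. If plain char happens to be unsigned instead, the opposite failure occurs and the loop never ends.

```c
#include <stdio.h>

/* Correct: the result of fgetc() is kept in an int, so EOF stays distinct
   from every possible byte value. */
unsigned long count_bytes_int(FILE *fp)
{
    int ch;
    unsigned long n = 0;

    while ((ch = fgetc(fp)) != EOF)
        ++n;
    return n;
}

/* Buggy on purpose: narrowing to signed char makes byte 0xFF compare
   equal to EOF (assuming EOF == -1), so the loop stops too early. */
unsigned long count_bytes_signed_char(FILE *fp)
{
    signed char ch;
    unsigned long n = 0;

    while ((ch = (signed char)fgetc(fp)) != EOF)
        ++n;
    return n;
}
```

A pure ASCII file never contains 0xFF, so the bug can stay hidden for a long time, which is part of what makes it counterintuitive.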


16 Comments

Can I just loop until fgets returns NULL? Like this: while(fgets(szTmp, 256, pfFile)) nLines++;
@Sunscreen: no, you can't. fgets() will return fragments if your line is longer than 256 characters and your count will be too high. You have to check for the EOL character.
@Sunscreen: see edits. Now that it's clear what you really want, there is an approach I think is a bit cleaner than using fgets.
A big +1! This answer (1) gives detailed explanations of everything the code does and how it deals with input cases, (2) avoids all failure cases by not using any buffers, and (3) demonstrates one of the rare correct uses of the scanf family.
I think there's one corner-case you've missed - what about a file where the last line doesn't end in newline?
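If you do want to stay with fgets, the two concerns raised in these comments (fragments of over-long lines, and a last line without a newline) can both be handled by checking whether each fragment ends in '\n'. This is only a sketch: the function name count_lines_fgets and the 256-byte buffer are illustrative choices, and it assumes the file contains no embedded NUL bytes, since strlen is used to find the end of each fragment.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: counts lines with fgets, treating fragments of an
   over-long line as one line and counting a final unterminated line. */
unsigned long count_lines_fgets(FILE *fp)
{
    char buf[256];
    unsigned long lines = 0;
    int in_line = 0;    /* nonzero while we are partway through a line */

    while (fgets(buf, sizeof buf, fp) != NULL) {
        size_t len = strlen(buf);

        if (len > 0 && buf[len - 1] == '\n') {
            ++lines;        /* this fragment completes a line */
            in_line = 0;
        } else {
            in_line = 1;    /* line continues in the next fragment, or EOF */
        }
    }

    if (in_line)
        ++lines;            /* last line had no trailing newline */
    return lines;
}
```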
6

Here's a solution based on fgetc() which will work for lines of any length and doesn't require you to allocate a buffer.

#include <stdio.h>

int main()
{
    FILE                *fp = stdin;    /* or use fopen to open a file */
    int                 c;              /* Nb. int (not char) for the EOF */
    unsigned long       newline_count = 0;

    /* count the newline characters */
    while ( (c=fgetc(fp)) != EOF ) {
        if ( c == '\n' )
            newline_count++;
    }

    printf("%lu newline characters\n", newline_count);
    return 0;
}

1 Comment

I've tried a million different ways to count newlines, using all of the methods suggested above, and yours was the only one that worked for me! So thank you.
2

Maybe I'm missing something, but why not simply:

#include <stdio.h>
int main(void) {
  int n = 0;
  int c;
  while ((c = getchar()) != EOF) {
    if (c == '\n')
      ++n;
  }
  printf("%d\n", n);
}

If you also want to count a partial last line (i.e. [^\n]EOF, a final line that ends at EOF without a '\n'):

#include <stdio.h>
int main(void) {
  int n = 0;
  int pc = EOF;
  int c;
  while ((c = getchar()) != EOF) {
    if (c == '\n')
      ++n;
    pc = c;
  }
  if (pc != EOF && pc != '\n')
    ++n;
  printf("%d\n", n);
}

1 Comment

+1 IMO, this is the best getchar() answer as it deals with the last line not terminated with '\n'. Suggested minor simplification: int pc = '\n'; while (..) { ...} if (pc != '\n') ++n;
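The simplification suggested in this comment could look roughly like the sketch below. It is written as a function on a FILE * (with fgetc instead of getchar) so it can be reused; the name count_lines_with_partial is made up for illustration. Starting pc at '\n' means an empty file falls through the final check without needing a separate EOF test.

```c
#include <stdio.h>

/* Hypothetical helper: counts lines, including a final line that is not
   terminated by '\n'. */
unsigned long count_lines_with_partial(FILE *fp)
{
    unsigned long n = 0;
    int pc = '\n';      /* previous character; '\n' means "at start of a line" */
    int c;

    while ((c = fgetc(fp)) != EOF) {
        if (c == '\n')
            ++n;
        pc = c;
    }

    if (pc != '\n')     /* input ended in the middle of a line */
        ++n;
    return n;
}
```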
2

Come on, why compare every character yourself? It is very slow: on a 10 MB file it takes ~3 s.
The solution below is faster.

unsigned long count_lines_of_file(char *file_patch) {
    FILE *fp = fopen(file_patch, "r");
    unsigned long line_count = 0;

    if(fp == NULL){
        return 0;
    }
    while ( fgetline(fp) )
        line_count++;

    fclose(fp);
    return line_count;
}

6 Comments

It depends on the length of the line. For my task it was ~400 times faster.
Why is it faster? The internal implementation of fgetline() also has to compare every character to find the newline...
In practice, that is the difference I measured.
Note: fgetline() is neither in C99 nor C11 spec.
if (fp == NULL) fclose(fp) If the pointer is NULL, doesn't this imply the file wasn't found or something? Anyway, it means it wasn't opened in the first place. Why do you need to call fclose? (I only know for sure that this applies in C++. Is it the same in C?)
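Since fgetline() is not standard C, here is a sketch of the same loop using POSIX getline() instead. This is a swapped-in function, not what the answer used, and it assumes a POSIX.1-2008 system; getline() is not part of ISO C either. The name count_lines_getline is made up for illustration, and no claim is made here about its speed relative to the character-by-character loops.

```c
#define _POSIX_C_SOURCE 200809L  /* assumption: POSIX.1-2008; exposes getline() */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: counts lines by reading them one at a time. */
unsigned long count_lines_getline(FILE *fp)
{
    char *line = NULL;  /* getline() allocates and grows this buffer */
    size_t cap = 0;
    unsigned long line_count = 0;

    /* getline() returns -1 at end of file or on error. */
    while (getline(&line, &cap, fp) != -1)
        line_count++;

    free(line);
    return line_count;
}
```

Because getline() also returns a final line that lacks a trailing newline, this counts that partial line as well.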
1

What about this?

#include <stdio.h>
#include <string.h>

#define BUFFER_SIZE 4096

int main(int argc, char** argv)
{
    int count;
    int bytes;
    FILE* f;
    char buffer[BUFFER_SIZE + 1];
    char* ptr;

    if (argc != 2 || !(f = fopen(argv[1], "r")))
    {
        return -1;
    }

    count = 0;
    while(!feof(f))
    {
        bytes = fread(buffer, sizeof(char), BUFFER_SIZE, f);
        if (bytes <= 0)
        {
            return -1;
        }

        buffer[bytes] = '\0';
        for (ptr = buffer; ptr; ptr = strchr(ptr, '\n'))
        {
            ++count;
            ++ptr;
        }
    }

    fclose(f);

    printf("%d\n", count - 1);

    return 0;
}

2 Comments

no reason to buffer stdio's buffered input. Also, this will report -1 on an empty file.
Do not recommend this. It exits with -1 on any file whose length is a multiple of BUFFER_SIZE including an empty file as noted by @vlabrecque.
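For anyone who still wants the block-reading approach, here is a sketch that avoids the problems mentioned in these comments. It loops on the return value of fread rather than on feof, so an empty file or a file whose size is a multiple of the chunk size simply ends the loop, and it uses memchr on the bytes actually read, so no terminating '\0' is needed. The names count_newlines_chunked and CHUNK_SIZE are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

#define CHUNK_SIZE 4096     /* hypothetical chunk size */

/* Hypothetical helper: returns the number of '\n' bytes, or -1 on a read error. */
long count_newlines_chunked(FILE *f)
{
    char buffer[CHUNK_SIZE];
    size_t bytes;
    long count = 0;

    /* fread returns 0 at end of file (or on error), which ends the loop. */
    while ((bytes = fread(buffer, 1, sizeof buffer, f)) > 0) {
        const char *p = buffer;
        const char *end = buffer + bytes;

        /* Find each newline within the bytes that were actually read. */
        while ((p = memchr(p, '\n', (size_t)(end - p))) != NULL) {
            ++count;
            ++p;    /* continue searching just past this newline */
        }
    }

    return ferror(f) ? -1 : count;
}
```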
