11
\$\begingroup\$

I have been working on a project for several days. It is a command line program for viewing files in hexadecimal. It is usually very fast, but is there any way I could make it more efficient?

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]){
    if(argc == 2){
        FILE *in = fopen(argv[1], "rb");
        if(in == NULL){
            printf("Could not open %s\n", argv[1]);
        }else{
            int i = 0, j, br, ln[8];
            char l[16];
            while((br = fread(&l, 1, 16, in)) > 0){
                printf("%08x ", i);
                i += br;
                for(j = 0; j < br; j++){
                    if(l[j] > 0){
                        printf("%02x ", l[j]);
                    }else{
                        printf("?? ");
                    }
                }
                for(j = 0; j < 16 - br; j++){
                    printf("   ");
                }
                for(j = 0; j < br; j++){
                    if(l[j] > 31 && l[j] < 127){
                        printf("%c", l[j]);
                    }else{
                        printf(".");
                    }
                }
                printf("\n");
            }
            fclose(in);
        }
    }else{
        printf("Please use the syntax 'hex [file]'\n");
    }
    return 0;
}
\$\endgroup\$
1
  • \$\begingroup\$ How about: od -t x1 <file> \$\endgroup\$ Commented Jan 3, 2016 at 0:22

2 Answers 2

13
\$\begingroup\$

If a program fails for some reason then it should

  • Write a diagnostic message to standard error (not standard output), so that it can not be mistaken for normal output, e.g. when the program runs in a pipe. strerror(errno) gives you an error message string corresponding to the last error, e.g. "No such file or directory" or "Permission denied".
  • Terminate with a non-zero status code. You can use predefined exit codes from <sysexits.h> if that is available on your system, or make up your own.

Combining this with the "Avoiding deep nesting" advice from Caridorc's answer, the overall structure would be

int main(int argc, char *argv[]) {

    if (argc != 2) {
        fprintf(stderr, "Usage: hex [file]\n");
        exit(EX_USAGE);
    }

    FILE *in = fopen(argv[1], "rb");
    if (in == NULL) {
        fprintf(stderr, "Cannot not open %s: %s\n", argv[1], strerror(errno));
        exit(EX_NOINPUT);
    }

    // ... print hex dump ...

    fclose(in);

    exit(EXIT_SUCCESS); // EXIT_SUCCESS == 0
}

Your variable names are too short and non-descriptive.

  • i is the current offset into the file. The correct type for a file offset is off_t.
  • br is the amount of bytes read. The correct type would be size_t (which is what fread() returns, so this avoids a warning when compiling on a 64-bit architecture).
  • l is a line buffer.

j is OK as a iteration variable, but the scope should be limited to the for-loop.

The main loop would then start like this:

off_t offset = 0;
size_t bytesRead;
char line[16];

while ((bytesRead = fread(&line, 1, 16, in)) > 0){
    printf("%08llx ", offset);
    offset += bytesRead;
    for (size_t j = 0; j < bytesRead; j++){
        printf("%02x ", line[j]);
    }

I don't know why you want to print a NUL-byte as ??, therefore I have removed that part.

The padding can be done by printing an empty string with a field width:

    printf("%*.s", 3*(16 - (int)bytesRead), "");

The remaining part of the main loop is OK, only that I would generally use more space, at least around parentheses and keywords:

    for (int j = 0; j < bytesRead; j++) {
        if (line[j] > 31 && line[j] < 127) {
            printf("%c", line[j]);
        } else {
            printf(".");
        }
    }
    printf("\n");
}
\$\endgroup\$
11
\$\begingroup\$

Compiler diagnostic

My first suggestion is very easy to follow, just compile your code with:

gcc -Wall -pedantic

This will give you all sorts of nice warnings for possible errors, your code only raises one, kudos!

hex.c: In function ‘main’:
hex.c:10:31: warning: unused variable ‘ln’ [-Wunused-variable]
         int i = 0, j, br, ln[8];

Just remove , ln[8] from your code to simplify and silence the warning.

Avoiding deep nesting

Your code now looks like:

if proper_condition1() {
    if proper_condition2() {
        // do_stuff()
    } else ()
} else()

It is deeply nested. But beware that deep nesting increases complexity and I cannot clearly see how trouble-some situations are handled, I suggest the pattern:

if NOT proper_condition1() {
    handle_error()
}
if NOT proper_condition2() {
    handle_error()
}
// do_stuff()

After flattening the error conditions, your code is:

int main(int argc, char *argv[]){
    if(argc != 2) {
            printf("Please use the syntax 'hex [file]'\n");
            return 1;
        }

    FILE *in = fopen(argv[1], "rb");

    if(in == NULL){
      printf("Could not open %s\n", argv[1]);
      return 2;
    }

    int i = 0, j, br;
    char l[16];
    while((br = fread(&l, 1, 16, in)) > 0){
            printf("%08x ", i);
            i += br;
            for(j = 0; j < br; j++){
                    if(l[j] > 0){
                            printf("%02x ", l[j]);
                    }else{
                            printf("?? ");
                    }
            }
            for(j = 0; j < 16 - br; j++){
                    printf("   ");
            }
            for(j = 0; j < br; j++){
                    if(l[j] > 31 && l[j] < 127){
                            printf("%c", l[j]);
                    }else{
                            printf(".");
                    }
            }
            printf("\n");
    }
    fclose(in);
    return 0;
}

That I like not only for putting in evidence the strategy to deal with errors and for being less nested, but also because it gives different return values for different outcomes:

  • 1 if the call syntax was invalid
  • 2 if the file is not readable
  • 0 is everything goes fine

This way a script can easily check if things went right or wrong and more importantly in which way it failed that may be useful for further processing.

Built-in usage

Even if C is mid-level, and thus has much less built-ins than say Python or Ruby, there are still some to take advantage of, an example relevant to this code is the isprint function, that may be used in place of l[j] > 31 && l[j] < 127 to improve readability.

Avoid overkill usage of printf

printf is a very powerful function, and I like avoiding using powerful functions when smaller and less powerful options are available:

Ternary simplification

Till the end of your code, after making my modifications, you always use putchar just the argument changes, this makes a great opportunity to simplify with a ternary:

for(j = 0; j < br; j++){
    putchar(isprint(l[j]) ? l[j] : '.');
}

This is much shorter then if and else and also more readable as the ternary is less powerful in that it only allows expressions (if also allows statements). I also like that we got rid of the duplication of putchar.

  • printf("?? ") becomes fputs("?? ", stdout)
  • printf(".") becomes putchar('.')
  • printf("\n") becomes putchar('\n')
  • printf("%c", l[j]) becomes putchar(l[j])

Small helper function

Given the loop:

            for(j = 0; j < 16 - br; j++){
                printf("   ");
            }

It is not so obvious at first glance how many spaces it prints, I defined a small helper function:

void print_spaces(int n) {
    for (int i = 0; i < n; i++) {
        putchar(' ');
    }
}

So that calling it makes the matter much more obvious:

print_spaces(4 * (16 - bytes_read) );

I always think that too many functions is a much better problem than too little functions, so I err on the side of writing many, even tiny, helpers.

\$\endgroup\$
7
  • \$\begingroup\$ @Daniel Gee please un-accept this answer, we are different from SO in that a good answer may take a day or more to arrive. After a day or two you may accept an answer. \$\endgroup\$ Commented Jan 2, 2016 at 21:05
  • \$\begingroup\$ I do hope that you don't mind some overlap between our answers. I had prepared those arguments already while you posted and amended your answer. \$\endgroup\$ Commented Jan 2, 2016 at 21:39
  • \$\begingroup\$ @MartinR Surely not, on a given piece of code there is only so much to say and I am sure you came up with your ideas by yourself, so no problem. Your answer also contains many sensible improvments not found in mine, so you have my upvote. \$\endgroup\$ Commented Jan 2, 2016 at 21:41
  • \$\begingroup\$ @Caridorc should I post my updated code under the question? \$\endgroup\$ Commented Jan 3, 2016 at 9:44
  • 2
    \$\begingroup\$ @DanielGee: No, you should not edit the question after receiving an answer. Have a look at meta.codereview.stackexchange.com/questions/1763/… for the possible options (such as a follow-up question). \$\endgroup\$ Commented Jan 3, 2016 at 10:34

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.