Skip to main content
Spelling and grammar
Source Link
Toby Speight
  • 88.3k
  • 14
  • 104
  • 327

The program could be more useful if it accepted arguments for the input file name,name; currently the input file name is hardwired into main().

When doing C program development it is always a good practice to compile with the -Wall and -Wextra flags. These flags will help you find possible errors in the code. When I was learning how to program in C on Unix we had a program callcalled lint which helped identify possible problems in the code. I always ran lint on my code. To some extent the -Wall and -Wextra flags have replaced lint.

If the program had been compilecompiled with the -Wall andd -Wextra flags it would have found the following issues:

nn.c: In function ‘tData’:
nn.c:78:21: warning: format ‘%f’ expects argument of type ‘float *’, but argument 3 has type ‘IO_Neuron *’ [-Wformat=]
   78 |         fscanf(fp,"%f",&ret.training_in[i][inIndex]);
      |                    ~^  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                     |  |
      |                     |  IO_Neuron *
      |                     float *
nn.c:91:21: warning: format ‘%f’ expects argument of type ‘float *’, but argument 3 has type ‘IO_Neuron *’ [-Wformat=]
   91 |         fscanf(fp,"%f",&ret.training_out[i][outIndex]);
      |                    ~^  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                     |  |
      |                     |  IO_Neuron *
      |                     float *
nn.c: In function ‘train’:
nn.c:189:23: warning: unused parameter ‘input_layer’ [-Wunused-parameter]
  189 | void train(IO_Neuron* input_layer,Neuron* hidden_layer,IO_Neuron* output_layer,IO_Neuron** input_training,IO_Neuron** output_training,int training_samples,int iterations)
      |            ~~~~~~~~~~~^~~~~~~~~~~
nn.c: In function ‘main’:
nn.c:272:9: warning: unused variable ‘j’ [-Wunused-variable]
  272 |   int i,j;
      |         ^
nn.c: In function ‘tData’:
nn.c:78:21: warning: format ‘%f’ expects argument of type ‘float *’, but argument 3 has type ‘IO_Neuron *’ [-Wformat=]
   78 |         fscanf(fp,"%f",&ret.training_in[i][inIndex]);
      |                    ~^  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                     |  |
      |                     |  IO_Neuron *
      |                     float *
nn.c:91:21: warning: format ‘%f’ expects argument of type ‘float *’, but argument 3 has type ‘IO_Neuron *’ [-Wformat=]
   91 |         fscanf(fp,"%f",&ret.training_out[i][outIndex]);
      |                    ~^  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                     |  |
      |                     |  IO_Neuron *
      |                     float *
nn.c: In function ‘train’:
nn.c:189:23: warning: unused parameter ‘input_layer’ [-Wunused-parameter]
  189 | void train(IO_Neuron* input_layer,Neuron* hidden_layer,IO_Neuron* output_layer,IO_Neuron** input_training,IO_Neuron** output_training,int training_samples,int iterations)
      |            ~~~~~~~~~~~^~~~~~~~~~~
nn.c: In function ‘main’:
nn.c:272:9: warning: unused variable ‘j’ [-Wunused-variable]
  272 |   int i,j;
      |         ^

To prevent this undefined behavior a best practice is to always follow the memory allocation statement with a test that the pointer that was returned is not NULLnull.

In the above memory allocations, since the code seems to be allocating arrays of IO_NeuronsIO_Neuron or NeuronsNeuron it might be better to use calloc(size_t num, size_t size)calloc(size_t num, size_t size) rather than malloc(). One of the benefits of using calloc(size_t num, size_t size) idis that all the memory allocated is zeroed out.

When using malloc()malloc(), calloc()calloc() or realloc()realloc() in C a common convention is to sizeof(*PTR)sizeof *PTR rather sizeof(PTR_TYPE), this makesizeof (PTR_TYPE). This makes the code easier to maintain and less error prone, since less editing is required if the type of the pointer changes.

Input Optimization

#Input Optimization
ItIt would be better to input an entire line at one time using getline(char **lineptr, size_t *n, FILE *stream)getline(char **lineptr, size_t *n, FILE *stream) rather than using fscanf() multiple times per line. The getline() fuctionfunction will input the entire line at once and then it can be processed using string functions or character manipulation. The optimization is that lessfewer system calls are used to get the input. It itmightmight also be easier to create a function to get a line of input and process it this way.

There is a programming principle called the Don't Repeat Yourself Principle sometimes referred to as DRY code. If you find yourself repeating the same code mutiplemultiple times it is better to encapsulate it in a function. If it is possible to loop through the code, that can reduce repetition as well.

Three of the functions, main(), TData tData(const char* filename) and void train(...) are too complex (do too much). All three of these functions should be broken into smaller functions.

The program could be more useful if it accepted arguments for the input file name, currently the input file name is hardwired into main().

When doing C program development it is always a good practice to compile with the -Wall and -Wextra flags. These flags will help you find possible errors in the code. When I was learning how to program in C on Unix we had a program call lint which helped identify possible problems in the code. I always ran lint on my code. To some extent the -Wall and -Wextra flags have replaced lint.

If the program had been compile with the -Wall andd -Wextra flags it would have found the following issues:

nn.c: In function ‘tData’:
nn.c:78:21: warning: format ‘%f’ expects argument of type ‘float *’, but argument 3 has type ‘IO_Neuron *’ [-Wformat=]
   78 |         fscanf(fp,"%f",&ret.training_in[i][inIndex]);
      |                    ~^  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                     |  |
      |                     |  IO_Neuron *
      |                     float *
nn.c:91:21: warning: format ‘%f’ expects argument of type ‘float *’, but argument 3 has type ‘IO_Neuron *’ [-Wformat=]
   91 |         fscanf(fp,"%f",&ret.training_out[i][outIndex]);
      |                    ~^  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                     |  |
      |                     |  IO_Neuron *
      |                     float *
nn.c: In function ‘train’:
nn.c:189:23: warning: unused parameter ‘input_layer’ [-Wunused-parameter]
  189 | void train(IO_Neuron* input_layer,Neuron* hidden_layer,IO_Neuron* output_layer,IO_Neuron** input_training,IO_Neuron** output_training,int training_samples,int iterations)
      |            ~~~~~~~~~~~^~~~~~~~~~~
nn.c: In function ‘main’:
nn.c:272:9: warning: unused variable ‘j’ [-Wunused-variable]
  272 |   int i,j;
      |         ^

To prevent this undefined behavior a best practice is to always follow the memory allocation statement with a test that the pointer that was returned is not NULL.

In the above memory allocations, since the code seems to be allocating arrays of IO_Neurons or Neurons it might be better to use calloc(size_t num, size_t size) rather than malloc(). One of the benefits of using calloc(size_t num, size_t size) id that all the memory allocated is zeroed out.

When using malloc(), calloc() or realloc() in C a common convention is to sizeof(*PTR) rather sizeof(PTR_TYPE), this make the code easier to maintain and less error prone, since less editing is required if the type of the pointer changes.

#Input Optimization
It would be better to input an entire line at one time using getline(char **lineptr, size_t *n, FILE *stream) rather than using fscanf() multiple times per line. The getline() fuction will input the entire line at once and then it can be processed using string functions or character manipulation. The optimization is that less system calls are used to get the input. It itmight also be easier to create a function to get a line of input and process it this way.

There is a programming principle called the Don't Repeat Yourself Principle sometimes referred to as DRY code. If you find yourself repeating the same code mutiple times it is better to encapsulate it in a function. If it is possible to loop through the code that can reduce repetition as well.

Three of the functions, main(), TData tData(const char* filename) and void train(...) are too complex (do too much). All three of these functions should be broken into smaller functions.

The program could be more useful if it accepted arguments for the input file name; currently the input file name is hardwired into main().

When doing C program development it is always a good practice to compile with the -Wall and -Wextra flags. These flags will help you find possible errors in the code. When I was learning how to program in C on Unix we had a program called lint which helped identify possible problems in the code. I always ran lint on my code. To some extent the -Wall and -Wextra flags have replaced lint.

If the program had been compiled with the -Wall andd -Wextra flags it would have found the following issues:

nn.c: In function ‘tData’:
nn.c:78:21: warning: format ‘%f’ expects argument of type ‘float *’, but argument 3 has type ‘IO_Neuron *’ [-Wformat=]
   78 |         fscanf(fp,"%f",&ret.training_in[i][inIndex]);
      |                    ~^  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                     |  |
      |                     |  IO_Neuron *
      |                     float *
nn.c:91:21: warning: format ‘%f’ expects argument of type ‘float *’, but argument 3 has type ‘IO_Neuron *’ [-Wformat=]
   91 |         fscanf(fp,"%f",&ret.training_out[i][outIndex]);
      |                    ~^  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                     |  |
      |                     |  IO_Neuron *
      |                     float *
nn.c: In function ‘train’:
nn.c:189:23: warning: unused parameter ‘input_layer’ [-Wunused-parameter]
  189 | void train(IO_Neuron* input_layer,Neuron* hidden_layer,IO_Neuron* output_layer,IO_Neuron** input_training,IO_Neuron** output_training,int training_samples,int iterations)
      |            ~~~~~~~~~~~^~~~~~~~~~~
nn.c: In function ‘main’:
nn.c:272:9: warning: unused variable ‘j’ [-Wunused-variable]
  272 |   int i,j;
      |         ^

To prevent this undefined behavior a best practice is to always follow the memory allocation statement with a test that the pointer that was returned is not null.

In the above memory allocations, since the code seems to be allocating arrays of IO_Neuron or Neuron it might be better to use calloc(size_t num, size_t size) rather than malloc(). One of the benefits of using calloc() is that all the memory allocated is zeroed out.

When using malloc(), calloc() or realloc() in C a common convention is to sizeof *PTR rather sizeof (PTR_TYPE). This makes the code easier to maintain and less error prone, since less editing is required if the type of the pointer changes.

Input Optimization

It would be better to input an entire line at one time using getline(char **lineptr, size_t *n, FILE *stream) rather than using fscanf() multiple times per line. The getline() function will input the entire line at once and then it can be processed using string functions or character manipulation. The optimization is that fewer system calls are used to get the input. It might also be easier to create a function to get a line of input and process it this way.

There is a programming principle called the Don't Repeat Yourself Principle sometimes referred to as DRY code. If you find yourself repeating the same code multiple times it is better to encapsulate it in a function. If it is possible to loop through the code, that can reduce repetition as well.

Three of the functions, main(), TData tData(const char* filename) and void train(...) are too complex (do too much). All three of these should be broken into smaller functions.

Source Link
pacmaninbw
  • 26.1k
  • 13
  • 47
  • 114

General Observations

The program could be more useful if it accepted arguments for the input file name, currently the input file name is hardwired into main().

When doing C program development it is always a good practice to compile with the -Wall and -Wextra flags. These flags will help you find possible errors in the code. When I was learning how to program in C on Unix we had a program call lint which helped identify possible problems in the code. I always ran lint on my code. To some extent the -Wall and -Wextra flags have replaced lint.

If the program had been compile with the -Wall andd -Wextra flags it would have found the following issues:

nn.c: In function ‘tData’:
nn.c:78:21: warning: format ‘%f’ expects argument of type ‘float *’, but argument 3 has type ‘IO_Neuron *’ [-Wformat=]
   78 |         fscanf(fp,"%f",&ret.training_in[i][inIndex]);
      |                    ~^  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                     |  |
      |                     |  IO_Neuron *
      |                     float *
nn.c:91:21: warning: format ‘%f’ expects argument of type ‘float *’, but argument 3 has type ‘IO_Neuron *’ [-Wformat=]
   91 |         fscanf(fp,"%f",&ret.training_out[i][outIndex]);
      |                    ~^  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                     |  |
      |                     |  IO_Neuron *
      |                     float *
nn.c: In function ‘train’:
nn.c:189:23: warning: unused parameter ‘input_layer’ [-Wunused-parameter]
  189 | void train(IO_Neuron* input_layer,Neuron* hidden_layer,IO_Neuron* output_layer,IO_Neuron** input_training,IO_Neuron** output_training,int training_samples,int iterations)
      |            ~~~~~~~~~~~^~~~~~~~~~~
nn.c: In function ‘main’:
nn.c:272:9: warning: unused variable ‘j’ [-Wunused-variable]
  272 |   int i,j;
      |         ^

Test for Possible Memory Allocation Errors

In modern high-level languages such as C++, memory allocation errors throw an exception that the programmer can catch. This is not the case in the C programming language. While it is rare in modern computers because there is so much memory, memory allocation can fail, especially if the code is working in a limited memory application such as embedded control systems. In the C programming language when memory allocation fails, the functions malloc(), calloc() and realloc() return NULL. Referencing any memory address through a NULL pointer results in undefined behavior (UB).

Possible unknown behavior in this case can be a memory page error (in Unix this would be call Segmentation Violation), corrupted data in the program and in very old computers it could even cause the computer to reboot (corruption of the stack pointer).

To prevent this undefined behavior a best practice is to always follow the memory allocation statement with a test that the pointer that was returned is not NULL.

Example of Current Code:

  //allocate neural network
  IO_Neuron* input_layer = malloc(sizeof(IO_Neuron)*INPUT_COUNT);
  IO_Neuron* hidden_layer = malloc(sizeof(Neuron)*HIDDEN_COUNT);
  IO_Neuron* output_layer = malloc(sizeof(IO_Neuron)*OUTPUT_COUNT);

Example of Current Code with Test:

  //allocate neural network
  IO_Neuron* input_layer = malloc(sizeof(IO_Neuron)*INPUT_COUNT);
  if (input_layer == NULL)
  {
    fprintf(stderr, "malloc of input_layer FAILED\n");
    return EXIT_FAILURE;
  }

  IO_Neuron* hidden_layer = malloc(sizeof(Neuron)*HIDDEN_COUNT);
  if (hidden_layer == NULL)
  {
    fprintf(stderr, "malloc of hidden_layer FAILED\n");
    return EXIT_FAILURE;
  }

  IO_Neuron* output_layer = malloc(sizeof(IO_Neuron)*OUTPUT_COUNT);
  if (output_layer == NULL)
  {
    fprintf(stderr, "malloc of output_layer FAILED\n");
    return EXIT_FAILURE;
  }

In the above memory allocations, since the code seems to be allocating arrays of IO_Neurons or Neurons it might be better to use calloc(size_t num, size_t size) rather than malloc(). One of the benefits of using calloc(size_t num, size_t size) id that all the memory allocated is zeroed out.

Convention When Using Memory Allocation in C

When using malloc(), calloc() or realloc() in C a common convention is to sizeof(*PTR) rather sizeof(PTR_TYPE), this make the code easier to maintain and less error prone, since less editing is required if the type of the pointer changes.

#Input Optimization
It would be better to input an entire line at one time using getline(char **lineptr, size_t *n, FILE *stream) rather than using fscanf() multiple times per line. The getline() fuction will input the entire line at once and then it can be processed using string functions or character manipulation. The optimization is that less system calls are used to get the input. It itmight also be easier to create a function to get a line of input and process it this way.

DRY Code

There is a programming principle called the Don't Repeat Yourself Principle sometimes referred to as DRY code. If you find yourself repeating the same code mutiple times it is better to encapsulate it in a function. If it is possible to loop through the code that can reduce repetition as well.

In main() these 3 lines should be a loop:

  input_layer[0].value = 0;
  input_layer[1].value = 0;
  input_layer[2].value = 0;

In the TData tData(const char* filename) function the code below should be modified so that it can be made into a function that can be called twice with different input.

    for (j =0; j < (INPUT_COUNT*2 - 1);j++)
    {
      if (j % 2 == 1)
      {
        fscanf(fp,",");
      }
      else
      {
        fscanf(fp,"%f",&ret.training_in[i][inIndex].value);
        inIndex += 1;
      }
    }
    fscanf(fp," ");
    for (j =0; j < (OUTPUT_COUNT*2 - 1);j++)
    {
      if (j % 2 == 1)
      {
        fscanf(fp,",");
      }
      else
      {
        fscanf(fp,"%f",&ret.training_out[i][outIndex].value);
        outIndex += 1;
      }
    }
  }

Complexity

Three of the functions, main(), TData tData(const char* filename) and void train(...) are too complex (do too much). All three of these functions should be broken into smaller functions.

As programs grow in size the use of main() should be limited to calling functions that parse the command line, calling functions that set up for processing, calling functions that execute the desired function of the program, and calling functions to clean up after the main portion of the program.

There is also a programming principle called the Single Responsibility Principle that applies here. The Single Responsibility Principle states:

that every module, class, or function should have responsibility over a single part of the functionality provided by the software, and that responsibility should be entirely encapsulated by that module, class or function.