0

I have to read from a file which has a unknown number of students records in it written in binary, then sort the students by their GPA and send to stdout.

Our sort function had to be like

void insertion_sort(Student **, int);

That's why I choose to use an pointer to a pointer to Student (probably not the best solution? I think I could have just sent an pointer to Student like this (&p_to_Student, n) ?)

The code is bellow, the problem is that when I print the first element of what p is pointing to (the first students name) I get gibberish, the other students are fine.

I checked the value of p, and it does change after realloc() is called, and because it's also the address of the first element of p (right?).

Also checked with Valgrind and it returns a bunch of errors about memory leaks!

The code runs fine when there is no realloc() call, also when I initialize p after I'm done reading the file. So it must be something to do with not using realloc() correctly.

Bonus question: is this a proper way to read an unknown number of data entries from a file?

#include <stdio.h>
#include <stdlib.h>

struct student {
    char name[30];
    char surname[30];
    double GPA;
};

typedef struct student Student;

void insertion_sort(Student **arr, int n)
{
    int i,j;

    for (i=1; i<n; i++)
    {
        Student *tmp = arr[i];

        for (j=i; j>0 && (tmp->GPA > arr[j-1]->GPA); j--)
            arr[j] = arr[j-1];

        arr[j] = tmp;
    }
}


int main(int argc, char **argv)
{
    FILE *in;
    Student s, *arr, **p;
    size_t ret;
    int i = 1, n=0, c=2;

    in = fopen(argv[1], "rb"); 
    if (in == NULL)
        return printf("Can't open file!\n"), 1;

    arr = (Student*) malloc(c*sizeof(Student*));
    p = (Student**) malloc(c*sizeof(Student*));

    do
    {
        ret = fread(&s, sizeof(Student), 1, in);

        if (ret)
        {
            if (n == c)
            {
                arr = (Student*) realloc(arr, (c*=2)*sizeof(Student));
                p = (Student**) realloc(p, c*sizeof(Student*));
            }
            // when I print the value of pointer p
            // the values is changed when realloc() is called
            printf("p = %p\n", p);

            arr[n] = s;
            p[n] = arr+n;
            n++;
        }

    } while (ret);

    fclose(in);

    // If I do this instead the program runs correctly        
    //p = (Student**) malloc(c*sizeof(Student));
    //for (int i=0; i<n; i++)
    //{
        //p[i] = arr+i;
    //}

    insertion_sort(p, n);

    for (i=0; i<n; i++)
    {
        printf("%2d. %-20s %-20s %7.2lf\n", i+1, p[i]->name,
            p[i]->surname, p[i]->GPA);
    }

    free(arr);
    free(p);

    return 0;
}
5
  • 2
    realloc may change the pointer. That means all pointers into that pointer may become invalid. In your case, p holds pointers into arr. Because you need p only for your sorting, the approach you have commented out - to allocate p at one go before sorting - is better. realloc is useful only if you don't now beforehand how big your array is. Commented Apr 9, 2016 at 11:26
  • @Pemdas Believe me I have. If you could please point to what part of realloc() behaviour I have missed please do. Commented Apr 9, 2016 at 13:19
  • @MOehm The value of p changes, the rest (p[0], p[1] ...) does not. Is that what you are referring to? And because of that it cannot free that memory? Is there a simpler example than this where this is obvious? Commented Apr 9, 2016 at 13:40
  • No. it is okay that the value of p changes. If that happens, realloc guarantees that the contents of p up to the old size are the same as before, even if they are now stored at another place. Your problem is that the value of arr changes and therefore your values of p[0] and p[1] are no longer valid. You have moved arr away from under p's feet, so to speak. I've written an answer to illustrate. Commented Apr 9, 2016 at 14:07
  • @MOehm "You have moved arr away from under p's feet" now this sentence makes sense! :) Commented Apr 18, 2016 at 14:13

1 Answer 1

1

realloc may change the pointer. That means all pointers into that pointer may become invalid. In your case, p holds pointers into arr.

Your problem is not that the value of p changes, but that the old values of p are no longer valid when the value of arr changes.

To illustrate (all pointer and size values are made up):

sizeof(stud) == 16;
allocate arr: arr == 0x00100000;

1st value:    arr[0] = stud1;      p[0] = &arr[0];  // 0x00100000
2nd value:    arr[1] = stud2;      p[1] = &arr[1];  // 0x00100010

reallocate arr: arr == 0x00200000;
old address of arr is no longer valid!

3rd value:    arr[0] = stud1;      p[2] = &arr[2];  // 0x00200020

Now your pointer array looks like this:

p[0] == 0x00100000      // no longer valid!
p[0] == 0x00100010      // no longer valid!
p[0] == 0x00200020      // okay

Because you need p only for your sorting, the approach you have commented out – to allocate p at one go before sorting – is better.

realloc is useful only if you don't now beforehand how big your array is, so you should use it as long as you are building the array. When you are done building the array and you can be sure that arr will stay the same you should create the array of pointers, p.

Sign up to request clarification or add additional context in comments.

2 Comments

So, p has the same contents, but some of it is no longer valid, because arr is now a new chunk of memory, but because p is initializes along the way it does not reflect that fact? realloc is doing what it's supposed to do. That would explain why I get partial results (the latest addresses of arr are valid).
Exactly. That's a problem with realloc you should be aware of. But it's usually easy to separate the input phase, where the vector may still grow, and the working phase, where the size of the allocated memory is frozen.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.