0

I am currently learning C, as someone, who has quite a bit of experience in more high-level languages than that.

After considering some of your comments and doing a bit more testing and fidelling, I came up with a better solution (scroll down).

Advice and comments are still very welcome though.

That said, I struggle a bit with dynamic array handling in C. I got it down to a point, where I quite like, simply terminating every array with NULL and using realloc where neccessary.

Basically I am in search of advice: Is there any downside, of doing it that way (see example code below), besides the obvious (can't use NULL values as data value) ?

#include "stdlib.h"

int array_length(void* array)
{
    int i;
    for (i = 0; ((void**)array)[i]; i++);
    return i;
}

void array_add(void* array, void* item)
{
    int size = array_length(array);
    realloc(array, sizeof(item) * (size + 2));

    ((void**)array)[size] = item;
    ((void**)array)[size + 1] = NULL;
}

void* array_new(){
    void** array = malloc(sizeof(NULL));
    array[0] = NULL;
    return array;
}

#include "stdio.h"

void print_items(char** array){
    printf("%i - [", array_length(array));
    for (int i = 0; array[i]; i++){
        printf("\"%s\"", array[i]);
        if (array[i+1])
            printf(", ");
    }
    printf("]\n");
}

int main(){
    char** str_list = array_new();

    print_items(str_list);
    array_add(str_list, "Something!");
    print_items(str_list);
    array_add(str_list, "Another Thing.");
    print_items(str_list);
}

Better Version (Update!)

After reading through your answers and debugging a bit more myself, I came up with a new version, which is mainly based on the struct approach mentioned below.

Have a Look, and tell me, what I am doing wrong now. :)

#include "stdlib.h"
#include "assert.h"

typedef void* Any;

typedef struct {
    Any* items;
    int count;
} List;

List* list_new(){
    List* list = malloc(sizeof(List));
    list->items = NULL;
    list->count = 0;
    return list;
}

void list_add(List* list, Any element){
    int count = list->count + 1;
    list->items = realloc(list->items, count * sizeof(element));
    list->items[count-1] = element;
    list->count = count;
}

void list_free(List* list){
    if (list){
        if (list->items) free(list->items);
        free(list);
    }
}

void list_each(List* list, void (*callback)(List*, Any, int)){
    for (int i = 0; i < list->count; i++){
        callback(list, list->items[i], i);
    }
}

// Sample usage

#include "stdio.h"

void debug_item(List* list, Any item, int index){
    if(index > 0) printf(", ");
    printf("\"%s\"", item);
}

void debug(List* list){
    printf("%i : [", list->count);
    list_each(list, debug_item);
    printf("]\n");
}

int main(){
    List* cats = list_new();
    debug(cats);
    list_add(cats, "Baltazar");
    list_add(cats, "Scar");
    list_add(cats, "Garfield");
    debug(cats);
    list_free(cats);
}
6
  • 2
    malloc(sizeof(NULL)) will fail miserably if sizeof(NULL) < sizeof(void *). A safer idiom is T *p = malloc(sizeof *p) for some type T. Anyway, this 0-terminated array thing is very ugly, error-prone, and linear-time length querying can become a performance problem for large arrays. The usual practice when implementing a dynamic array is to maintain a base pointer and an explicit size. Commented Dec 31, 2014 at 0:24
  • Hey, thanks for the comment. 1. Why would malloc(sizeof(NULL))fail? Basically I do only need the space to store a integer 0 at the end of the stream, right? No matter the actual datatype. 2. I buy the performance argument, that makes sence, thanks. 3. Can you elaborate a bit more on the error phrone bit? What are the errors I could expect? Why is it so ugly? Commented Dec 31, 2014 at 0:26
  • (Not the original replier) I think in most implementations of vector, the capacity is the power of 2 >= the size. You may want to realloc in a similar fashion. I can see the error-prone issue, if you do for some reason need to store a NULL. In a large project, this would probably happen due to someone's carelessness. The performance, too, if it matters in your application. Seems like you could also use a stack, and keep null at the bottom, then just find it when you want to read from the bottom, if you are not wanting to store another pointer/var and will not be seeking the bottom often. Commented Dec 31, 2014 at 0:45
  • This will be deadly slow. Common vector implementations in C do a couple of things: first, keep the current allocation size, current occupied size, a pointer to the memory block, and other data in a struct. Also, they initially allocate some small amount of space like for 8 or ten items, then realloc more when needed rather than every time. A common strategy is to grow 1.5X when needed, that way performance on inserts scales well. Commented Dec 31, 2014 at 1:24
  • @Hr.Rabe "Basically I do only need the space to store a integer 0 at the end of the stream, right? No matter the actual datatype." - no, that's wrong. In C, arrays are homogenous. If you are 0-terminating an array of Ts, then you need to terminate it with (T)0. "Can you elaborate a bit more on the error phrone bit? What are the errors I could expect?" - for one, you may forget the 0-terminator. (People forget to NUL-terminate their strings all over the place, even experts.) "Why is it so ugly?" - because it breaks symmetry, and you lose part of the domain… Commented Dec 31, 2014 at 2:21

2 Answers 2

2

Your array_length is O(N). This will lead to quadratic runtimes if you add lots of items in an array, which can be quite slow.

What you can so instead is use a struct to store the size of the array

struct dyn_array {
    size_t array_size; // how many elements fit in the array?
    size_t nelems;     // how much array space are we using now? 
    int *elems; //or void*, you know the drill...
}

One of the benefits of this approach is that you don't need to use the NULL terminator anymore. It also lets you use arrays of ints (or whatever else) instead of just arrays of pointers.

Sign up to request clarification or add additional context in comments.

6 Comments

This approach could be expanded to keep track of the allocated size versus the currently in use size, so that it's not necessary to call realloc every single time an item is added to the array.
How would the start value of dyn_array.elems look like then? As far as I understood, I can only use realloc on pointers that do actually occupy stack space (read: Have been created via malloc at some point) anyway? Otherwise at least Clang gives me a nice error of: pointer being realloc'd was not allocated... So: How would you initialize such a structed array?
The usual strategy is to keep allocated size and filled size, then whenever you add an item when full, realloc to 1.5X allocated size. This keeps insert performance in the N*log(N) range as you scale.
@Hr.Rabe: Use malloc in array_new and realloc in array_add, just like you re doing now. @Lee: That strategy is actually O(N) amortized, so its better than you say. You ca use different multipliers if you want though (time/space tradeoff)
If I get you right, that implies, that there is basically no meaningful way in C, to start an empty array of with actually no elements in memory. I would (If I understood correctly) always need at least a NULL element, only with the struct approach I would be able to fill that on the first call to àrray_add. Correct?
|
0

This is what the Croissant probably meant:

#define NULL 0 // in some include file.

This is unusual, but possible. Now assume 0 is a 4-byte int and void**array is an 8-byte pointer:

void* array_new(){
    void** array = malloc(sizeof(NULL)); // this works
    array[0] = NULL;                     // this write 8 bytes
    return array;
}

array[0]=NULLinterprets 0 as an 8-byte pointer and writes it into the allocated 4 bytes and beyond.

7 Comments

So what you are saying is, that sizeof(NULL) can actually give a size different to what NULL has been defined to mean? Can this really happen? oO
Maybe NULL has no meaning. Stroustrup strongly advised not to use anything that looks like "NULL" and is defined in an include file. He says if you need a 0, write 0, and if you need a 0 pointer, write 0. If he would be forced to define NULL, it would be define NULL 0 which, for me, is quite reasonable. And 0 can always be interpreted as a pointer, which may change the size. Thus in the instance you assign NULL to a variable of pointer type, it changes it's type and size.
@Hans_Klünder: it's not really appropriate to apply Stroustrup's reasoning here because we're talking about C, not C++ and NULL is subtly different between the two languages.
@HansKlünder That's a bit misleading. Stroustrup advised against NULL in favour of 0 in C++, and C++ 11 now has nullptr. Besides Kernighan and Ricthie used NULL.
Yes, K&R use NULL, and it's a pointer. stroustrup prefers 0 for C++. I think it's an unnecessary risk to make the integrity of the memory depend on sizeof(NULL) == sizeof(void*), while it's much safer to write void** array = malloc(sizeof(*array)) - this is what the Croissant suggested to do, and I agree. It's a good habit which makes sure that the memory returned by malloc is large enough to store one item of the type being used in that line.
|

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.