Advice for learning
Don’t do this.
The container you are trying to implement is std::vector. You don’t call it that, and you never say explicitly that std::vector is what you are trying to make… but that’s what it is.
It is very common for beginners to want to re-implement std::vector. It’s not hard to see why. It is one of the very first parts of the standard library they learn, and it… seems… simple enough. It’s certainly simple to understand, so it should be simple to implement, right?
No.
In fact, std::vector is monstrously hard to implement correctly. It is one of the hardest parts of the entire standard library to implement; even things like std::thread are much easier to implement than std::vector. Even most experienced C++ professionals can’t do it correctly. Even before I looked at your code, I could guarantee with 100% certainty that it would be wrong. And, having looked… yes, it is wrong.
But it gets worse. It’s not just that std::vector is stupidly hard to get right. You might think, “okay, so it’s hard, but at least I’ll learn a lot from it!”
No, no you won’t.
In fact, most of the skills and knowledge you need to re-implement std::vector are absolutely useless for anything… except re-implementing std::vector.
That’s the nasty truth about std::vector. Despite seeming simple, and tempting many a beginner to try and re-implement it, it is in fact insanely difficult to implement correctly. You will fail. (You did fail.) And you won’t even learn anything useful from the attempt.
My advice: Don’t do this. Don’t try to re-implement std::vector as a learning project. You’ll either do it wrong and get a bad lesson out of it, or you will do it correctly and learn useless, trivial minutia that you can’t actually use anywhere else. It is a waste of your time.
Instead, find a project that actually interests you. (Because, come on, do you really want to implement containers? Is that really your goal for learning C++?) Interested in gaming? Make a simple game! (Yes, making a whole game is actually way, way easier than re-implementing std::vector.) Interested in physics? Make a simulation! Even something smaller in scale, like a simple program that reads an image file and reports information about it, would be easier and more useful as a learning project than trying to re-implement std::vector.
An easy way to illustrate the problem
Okay, so let’s say you don’t believe me when I say that std::vector is monstrously hard to implement correctly, and you think you got pretty close with what you have so far.
Let me show you an easy way to test the inner workings of a class like Container. What you need is a “barking” class (or some call it a “meowing” class; I won’t judge). A “barking” class is a very simple class that does little or nothing other than make noise (“bark”) whenever something interesting happens. Typically you want the class to “bark” whenever it is constructed, destructed, copied, or moved. So, for example:
#include <iostream>

struct barker
{
barker() { std::cout << "default constructor\n"; }
~barker() { std::cout << "destructor\n"; }
barker(const barker&) { std::cout << "copy constructor\n"; }
barker(barker&&) { std::cout << "move constructor\n"; }
auto operator=(barker const&) -> barker&
{
std::cout << "copy assignment\n";
return *this;
}
auto operator=(barker&&) -> barker&
{
std::cout << "move assignment\n";
return *this;
}
};
You can get more advanced, and, for example, give each barker object a unique ID to keep track of individual objects. But let’s work with this for now.
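If you do want that fancier version, a rough sketch might look like the following. Each object, including every copy and every move target, gets its own id, so you can follow individual objects through their lifetimes. The counter and member names here are just one way to do it, nothing canonical:
struct barker
{
    inline static int next_id = 0;
    int id = next_id++;

    barker() { std::cout << "default constructor (#" << id << ")\n"; }
    ~barker() { std::cout << "destructor (#" << id << ")\n"; }
    barker(const barker& other) { std::cout << "copy constructor (#" << id << " from #" << other.id << ")\n"; }
    barker(barker&& other) { std::cout << "move constructor (#" << id << " from #" << other.id << ")\n"; }
    auto operator=(const barker& other) -> barker&
    {
        std::cout << "copy assignment (#" << id << " = #" << other.id << ")\n";
        return *this;
    }
    auto operator=(barker&& other) -> barker&
    {
        std::cout << "move assignment (#" << id << " = #" << other.id << ")\n";
        return *this;
    }
};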
Let’s write a very simple program:
auto main() -> int
{
Container<barker> c;
std::cout << "Number of barker objects in container: " << c.Size() << '\n';
}
Now, riddle me this: What should the output of this program be?
You would probably guess that the output would be “Number of barker objects in container: 0”.
You would be half right.
This is the actual output:
default constructor
default constructor
default constructor
default constructor
default constructor
default constructor
default constructor
default constructor
default constructor
default constructor
Number of barker objects in container: 0
destructor
destructor
destructor
destructor
destructor
destructor
destructor
destructor
destructor
destructor
It says there are zero barker objects in the container so… where are all those barker objects that are being constructed and destroyed?
The answer is: They are in the container. Where else could they be? The container is lying when it says it has zero objects.
Here is the actual problem. Your container is basically this:
template <typename T>
class Container
{
T* data_;
std::size_t size_;
std::size_t capacity_;
};
data_ is a pointer to an array of T. The number of elements in that array is capacity_. So what is size_?
Well, frankly, size_ is bullshit. It is a lie. It is a fiction you want to tell yourself. Even though the actual number of objects in your container is capacity_, you want to pretend the number is size_. But it’s not. The number of objects is capacity_, because that is the size of the array you created.
Let’s look at it another way. When you default construct a container, it calls Initialize(0) (it actually calls Initialize(), but there is a hidden default argument of 0 for some reason). Initialize() looks like this:
void Initialize(const size_t init_size = 0) {
if (data_ != nullptr) {
return;
}
UpdateCapacity(init_size);
T* ptr = new T[capacity_];
if (ptr == nullptr) {
throw std::runtime_error("Memory not allocated");
}
size_ = init_size;
data_ = ptr;
}
When default constructing, data_ will always start as nullptr and capacity_ will start as 10 (DEFAULT_CAPACITY_). UpdateCapacity() will ultimately do nothing (except return false). So, stripping away the error checking (which is pointless anyway) and simplifying, you get:
void Initialize(...)
{
capacity_ = 10;
size_ = 0;
data_ = new T[capacity_];
}
Now, this should make the fact that size_ is a lie plainly obvious. You are literally and quite obviously and explicitly creating an array of 10 (capacity_) objects. The number of objects in your container is clearly not zero.
Okay, now that you can see and understand the problem, you are probably asking: How do you fix it?
The answer to that is extremely complicated, and it’s actually easier to use allocators than to do it with raw memory allocations. But… and I stress, this is a VERY basic explanation, so stripped down and simplified that it is more wrong than right… the idea is that you don’t create an array of capacity_ T objects… you create an array of bytes that is large enough (and properly aligned) to hold capacity_ T objects.
The class will basically look the same:
template <typename T>
class Container
{
T* data_;
std::size_t size_;
std::size_t capacity_;
};
But when you initialize, you do NOT do data_ = new T[capacity_]. That would create an array already full of capacity_ T objects. Instead you need to allocate a buffer of bytes that is large enough to hold capacity_ objects of sizeof(T) bytes each, and is properly aligned for T. So your Initialize() function might look like:
// THIS FUNCTION IS STILL NOT CORRECT. In particular, the error
// handling is broken. It is just to illustrate the ideas.
auto Initialize(std::size_t n)
{
try
{
size_ = n;
capacity_ = std::max<std::size_t>(size_, 10);
data_ = reinterpret_cast<T*>(new (std::align_val_t{alignof(T)}) std::byte[capacity_ * sizeof(T)]);
std::ranges::uninitialized_default_construct_n(data_, size_);
}
catch (...)
{
delete[] data_;
throw;
}
}
As you can see, first we allocate enough memory—properly aligned—in raw bytes (not an array of T, but an array of std::byte). Then we actually construct the correct number of objects (using uninitialized_default_construct_n()) in that memory. So if we only want 4 objects, we allocate enough space for 10, but only construct 4 (rather than allocating 10 and just pretending we have 4).
This is just the tip of a very large, very ugly iceberg that—again I stress—really isn’t worth your time to learn. There is a lot more going on, and a lot more you’d have to get right in order to make everything work properly. Especially if you want to correctly handle errors (note that the code above doesn’t!).
So I repeat: Don’t do this. Find a better, more interesting, more useful project, and you will learn a lot more, and have a lot more fun.
Code review
#pragma once
Don’t use #pragma once. It is not standard, and for a very good reason. Use include guards.
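That is, something like this (the guard name is just an example; pick something unique to your project):
#ifndef MY_PROJECT_CONTAINER_HPP
#define MY_PROJECT_CONTAINER_HPP

// ... the contents of the header ...

#endif // MY_PROJECT_CONTAINER_HPP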
using std::literals::string_literals::operator""s;
You should never, ever do this kind of thing in a header file, and certainly not at global scope.
It is also completely unnecessary. The only place you actually use the std::string literal suffix is in the stream inserter, so, if anything, this using declaration should be localized to just the stream inserter. (But even there it is silly. We’ll get to that.)
You should have everything in your own namespace. It is bad manners to pollute the global namespace.
template <typename T>
class Container {
Starting in C++20, one of the most important and useful skills you can learn is how to properly constrain your templates.
In this case, you don’t want a container that can hold any T. For example, Container<void> is obviously nonsense. So is Container<int&> (though it is admittedly harder to understand why). So you probably want to ensure that T satisfies std::is_object_v<T>.
The way your container is implemented, you also require a type that is default-constructible. (That is probably not a requirement you want… but it is a requirement that you have unless you re-design the class.) That means you also need std::default_initializable. You probably also want it to be copyable and/or movable. So maybe std::semiregular.
You should think about what constraints are necessary, and then apply them like so:
template <typename T>
requires std::is_object_v<T>
and std::semiregular<T>
class Container {
Note that you may not necessarily want or need std::semiregular. Be careful not to over-constrain.
Container() {
Initialize();
}
Your default constructor always allocates at least 10 T objects. That is both wasteful, and risky. There is no need to allocate anything. Instead your default constructor could just leave the size as zero, the capacity as zero, and the data pointer as nullptr. Why not, after all?
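In other words, assuming the data members get default member initializers (as in the sketches later in this review), the default constructor can simply be:
// Nothing to allocate, nothing that can fail:
constexpr Container() noexcept = default;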
Container(const Container& from) {
Initialize(from.size_);
*this = from;
}
Okay, but… why? Why initialize here if you are just going to clear and then re-initialize again in the copy assignment operator? You are literally just allocating a ton of default-constructed objects (as many as there are objects in the from container), then throwing them all away immediately, then allocating the same number of default-constructed objects again… and then copying over them all. It’s massively wasteful.
What you want to do is just allocate a buffer large enough to hold the number of objects in from, then do uninitialized_copy().
In fact, the pattern you are using is back-asswards. You implement the copy assignment operator, and then implement the copy constructor in terms of the copy assignment operator. That is wrong-headed. What you should do is implement the copy constructor, and then implement the copy assignment operator in terms of the copy constructor (using the copy-and-swap idiom).
Container(const size_t size, const T& with = {}) {
Initialize(size);
for (size_t index = 0; index < size; ++index) {
data_[index] = with;
}
}
A warning: This is a poorly designed and dangerous constructor. I am aware that std::vector has the exact same constructor (more or less), and that it is even part of the container requirements, but that is generally considered to be a mistake. This is the constructor that causes confusion and bugs when you can’t decide whether to use Container{3, 4} or Container(3, 4).
That aside, your implementation is not incorrect, but it is quite wasteful. Consider what you are doing. First you initialize the container with size default-constructed T objects… then you copy over them. Aside from the inefficiency, it means this will only work if T is default-constructible. But that shouldn’t be necessary.
Instead what you should do is allocate the memory as a buffer of raw bytes, then use uninitialized_fill_n(). Instead of default constructing then copy assigning, this will just copy construct the objects.
Also, and this is a general problem throughout your code… it is spelled std::size_t, not just size_t. You also need to be sure to include a header that defines std::size_t, like <cstddef>. For historical reasons, most compilers and standard libraries will quietly accept incorrect code that just assumes size_t is a built-in type… but be forewarned, that will change when modules become the norm.
Container(const std::initializer_list<T>& list) {
Initialize(list.size());
size_t index{ 0 };
for (const auto& item : list) {
data_[index] = item;
++index;
}
}
First, you should never take initializer lists by reference. They are meant to be cheap, trivially-copyable views.
This function has basically the same problems as the previous one. The only difference is that instead of uninitialized_fill_n(), you would use uninitialized_copy().
Container& operator=(const Container& right) {
Clear();
Initialize(right.size_);
CopyData(data_, right.data_, right.size_);
resize_count_ = right.resize_count_;
capacity_ = right.capacity_;
return *this;
return *this;
}
Consider what will happen if the allocation in Initialize() fails. You’ve already wiped out all the existing data in Container with Clear(). Or, if Initialize() succeeds, but CopyData() fails, same basic idea: you’ve lost data.
The standard pattern for situations like this is the copy-and-swap idiom. First you implement a proper, efficient copy constructor. Then you implement a proper, efficient swap function. Then your assignment operator is just:
auto operator=(Container const& right) -> Container&
{
auto temp = right; // copy construction
std::ranges::swap(*this, temp); // swap
// *this now contains the copy of right
// temp contains the previous contents of *this
// but temp is about to be destroyed
return *this;
}
Now, suppose the first line (the copy construction) fails. No biggie, nothing has been changed yet, so both *this and right still have their original, untouched contents.
Now suppose the second line fails. Ideally, swap operations should never fail, but let’s assume this one does. If it is properly written, then it should either succeed or leave everything in the original state (aka, the strong exception guarantee). If that is the case, again, no biggie. *this will be in its original state… and so will right… so, again, everything still has the original, untouched contents.
And then you get to the third line, which cannot fail. Which means if you get here, you’ve succeeded. Which means everything always works, even in the event of a failure. That is the copy-and-swap idiom.
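For completeness, here is a rough sketch of the kind of swap the idiom relies on, assuming the three data members shown earlier. Written as a hidden friend, std::ranges::swap(*this, temp) will find it via argument-dependent lookup:
// Swapping three scalar members cannot fail, hence noexcept.
friend void swap(Container& a, Container& b) noexcept
{
    std::ranges::swap(a.data_, b.data_);
    std::ranges::swap(a.size_, b.size_);
    std::ranges::swap(a.capacity_, b.capacity_);
}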
Now, back to your code:
Container& operator=(const Container& right) {
Clear();
Initialize(right.size_);
CopyData(data_, right.data_, right.size_);
resize_count_ = right.resize_count_;
capacity_ = right.capacity_;
return *this;
return *this;
}
What is the sense in setting capacity_ to right.capacity_? this->capacity_ was set, properly, within Initialize(), and may not necessarily be the same as right.capacity_.
(There also appears to be a copy-paste error in returning twice.)
T& operator[](const size_t index) {
if (index < size_) {
return data_[index];
}
else {
throw std::out_of_range("Out of range");
}
}
const T& At(const size_t index) const {
if (index < size_) {
return data_[index];
}
else {
throw std::out_of_range("Out of range");
}
}
In std::vector, operator[] does not do the bounds check. This is important, because element access is a very common operation, and should be as fast as possible. If I’ve already made sure that the access is not out-of-bounds, it is cruel to make me pay for it again.
So these two functions should not be identical. Ideally, the direct access with operator[] should be fast (by skipping the bounds check).
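Something along these lines, for example:
// Unchecked: the caller promises the index is valid.
auto operator[](const std::size_t index) noexcept -> T&
{
    return data_[index];
}

// Checked: pays for the test, throws on a bad index.
auto At(const std::size_t index) const -> const T&
{
    if (index >= size_)
    {
        throw std::out_of_range("Out of range");
    }
    return data_[index];
}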
size_t Size() const{
return size_;
}
Consider that there is no possible way for this function to fail. Even if the container is empty, it will just return zero… which is the correct behaviour.
When a function cannot possibly fail, it should be marked noexcept.
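That is something like:
auto Size() const noexcept -> std::size_t
{
    return size_;
}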
void PushBack(const T& item) {
Resize(size_ + 1);
data_[size_] = item;
}
Continuing the trend, this is yet another case where you are unnecessarily default-constructing a T, then copying over it. The correct behaviour would be to reallocate the raw memory (if necessary), then do a construct_at() (uninitialized_copy() would also be correct).
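A very rough sketch of that idea, assuming data_ points at raw storage in which only the first size_ elements have actually been constructed, and assuming a hypothetical Reserve() helper that reallocates that raw storage:
void PushBack(const T& item)
{
    if (size_ == capacity_)
    {
        Reserve(capacity_ == 0 ? 1 : capacity_ * 2); // hypothetical: grows the raw buffer
    }
    std::construct_at(data_ + size_, item); // copy-construct in place; no default construction
    ++size_;
}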
void Resize(const size_t new_size) {
if (!UpdateCapacity(new_size)) {
size_ = new_size;
return;
}
T* ptr = new T[capacity_];
if (ptr == nullptr) {
throw std::runtime_error("Memory not allocated");
}
CopyData(ptr, data_, (new_size > size_ ? size_ : new_size));
delete[] data_;
data_ = ptr;
size_ = new_size;
}
There are some bugs lurking here.
UpdateCapacity() does nothing but change the value of capacity_ (and resize_count_, but that seems totally pointless). Okay… but… the value of capacity_ has (possibly) changed… but the actual capacity has not.
And then you allocate a new buffer with T* ptr = new T[capacity_];. And what if that fails? Well, now your container is hopelessly broken, believing it has a capacity that it does not.
There’s also the problem of what happens if copying fails (in CopyData()). That not only leaves you with a bogus capacity_, it also means that the memory allocated for ptr will be leaked.
(Let’s also not miss the fact that if the existing capacity was enough, no objects are actually created or destroyed. You just change the value of size_. That should be a glaring red flag for the fact that size_ is a fiction. It has no connection to the actual number of objects in the container.)
I should also point out that you seem to have a misunderstanding of how new works. If that allocation fails, it does not return nullptr. It throws std::bad_alloc. So the next line that checks whether ptr is nullptr is specious. ptr will never be nullptr. Either the allocation succeeded, or it threw std::bad_alloc.
Now, what you have here is almost correct. Setting aside the problem of changing capacity_ before you actually… well, change the capacity… the basic structure is pretty much the copy-and-swap idiom:
auto Resize(std::size_t new_size)
{
// Let's just ignore this bit:
/*
if (!UpdateCapacity(new_size)) {
size_ = new_size;
return;
}
*/
// Let's assume instead of changing capacity_ (which causes
// the bug), we use new_capacity:
auto const new_capacity = /* ... */; // figure out the new capacity
// USE SMART POINTERS!!!
auto ptr = std::make_unique_for_overwrite<T[]>(new_capacity);
// This next bit is specious:
/*
if (ptr == nullptr) {
throw std::runtime_error("Memory not allocated");
}
*/
// Don't really know what purpose this function serves:
//CopyData(ptr, data_, (new_size > size_ ? size_ : new_size));
// This does the same thing, after all:
std::ranges::copy_n(data_, std::ranges::min(size_, new_size), ptr.get());
// Everything up to this point is the "copy" part of copy-and-swap.
// If the copying failed, ptr is automatically freed.
// Now comes the "swap" part.
// None of the following ops can fail.
auto const temp = ptr.release();
ptr.reset(data_);
data_ = temp;
size_ = new_size;
capacity_ = new_capacity;
// Old data is automatically deleted when ptr is destroyed
}
See? Copy-and-swap.
void Clear() {
delete[] data_;
data_ = nullptr;
size_ = 0;
resize_count_ = DEFAULT_RESIZE_COUNT_;
capacity_ = DEFAULT_CAPACITY_;
}
This can’t possibly fail (unless delete[] can fail, which should never happen), so it should probably be noexcept.
const unsigned short RESIZE_MULT_{ 2 };
const unsigned int DEFAULT_RESIZE_COUNT_{ 10 };
const unsigned int DEFAULT_CAPACITY_{ 10 };
Don’t use SCREAMING_SNAKE_CASE for anything but macros.
And then, don’t use macros.
These should probably also be constexpr, and static as well, since they are class constants. That way they will not take up space in each container instance.
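For example (keeping your values, just changing the spelling and the storage):
static constexpr unsigned short resize_mult_{ 2 };
static constexpr unsigned int default_resize_count_{ 10 };
static constexpr unsigned int default_capacity_{ 10 };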
unsigned int resize_count_{ DEFAULT_RESIZE_COUNT_ };
So, this seems to be part of your resize strategy. However, it doesn’t seem like a good idea to have this extra variable in every single container instance.
Typically, vectors just grow by a constant factor: usually 2, but I think MSVC uses 1.5. Using a constant factor means you don’t have to pay for that extra variable in every instance.
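A sketch of what a constant-factor policy might look like (the function name is just an example):
// No per-instance state: the new capacity is a pure function of the
// current capacity and the size you need to reach.
auto GrowCapacity(const std::size_t required) const noexcept -> std::size_t
{
    return std::max(required, capacity_ == 0 ? std::size_t{ 1 } : capacity_ * 2);
}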
void Initialize(const size_t init_size = 0) {
if (data_ != nullptr) {
return;
}
UpdateCapacity(init_size);
T* ptr = new T[capacity_];
if (ptr == nullptr) {
throw std::runtime_error("Memory not allocated");
}
size_ = init_size;
data_ = ptr;
}
Having an “initialize” function, even buried in your class’s private guts, is a code smell.
If you aren’t aware of this, it is possible to chain constructors. That is almost always a better idea than having an “initialize” function.
In this case, for example, all this “initialize” function really does is allocate space for at least init_size T objects. Then the actual constructors just need to copy the data into the allocated space. So, something like this:
// Tag type:
struct with_capacity_t
{
constexpr explicit with_capacity_t() noexcept = default;
};
inline constexpr auto with_capacity = with_capacity_t{};
template <typename T>
class Container
{
T* data_ = nullptr;
std::size_t size_ = 0;
std::size_t capacity_ = 0;
public:
// Default constructor is simple:
constexpr Container() noexcept = default;
// This could also be a private constructor, if you prefer:
constexpr Container(with_capacity_t, std::size_t cap)
{
// Allocate memory, and set data_ and capacity_.
// size_ is left at zero.
}
// This first runs the "with_capacity" constructor, which allocates
// the space, then runs this constructor's body, which fills the
// space (by copying with uninitialized_copy_n()).
constexpr Container(Container const& from)
: Container(with_capacity, from.size_)
{
std::ranges::uninitialized_copy_n(from.data_, from.size_, data_, data_ + capacity_);
size_ = from.size_;
}
// This first runs the "with_capacity" constructor, which allocates
// the space, then runs this constructor's body, which fills the
// space (by filling with uninitialized_fill_n()).
constexpr Container(std::size_t n, T const& t = {})
: Container(with_capacity, n)
{
std::ranges::uninitialized_fill_n(data_, n, t);
size_ = n;
}
// This first runs the "with_capacity" constructor, which allocates
// the space, then runs this constructor's body, which fills the
// space (by copying with uninitialized_copy()).
constexpr Container(std::initializer_list<T> list)
: Container(with_capacity, list.size())
{
std::ranges::uninitialized_copy(list.begin(), list.end(), data_, data_ + capacity_);
size_ = list.size();
}
};
Making the “initialize” function an actual constructor removes the temptation to use it elsewhere (for example, in your copy assignment operator).
void CopyData(T* destination, const T* source, size_t size) {
for (size_t index = 0; index < size; ++index) {
destination[index] = source[index];
}
}
This is literally just the standard copy_n() algorithm. Don’t reinvent the wheel.
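The entire body collapses to a single call (given <algorithm>):
std::copy_n(source, size, destination);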
bool UpdateCapacity(const size_t feature_size) {
if (feature_size > capacity_) {
capacity_ = feature_size + resize_count_;
resize_count_ *= RESIZE_MULT_;
return true;
}
else if ((feature_size + resize_count_) < capacity_) {
capacity_ = feature_size + DEFAULT_RESIZE_COUNT_;
resize_count_ = DEFAULT_RESIZE_COUNT_;
return true;
}
return false;
}
This function is in dire need of some comments to explain the logic of what is going on. What is “feature size”, for example? What is a “resize count”? What is going on here? And why?
template <typename T>
std::ostream& operator<<(std::ostream& os, const Container<T>& right) {
size_t size = right.Size();
os << "Size: "s << size << std::endl;
os << "--------------------------------"s << std::endl;
for (size_t index{ 0 }; index < size; ++index) {
os << '<' << (index+1) << "> "s << right.At(index) << std::endl;
}
os << "--------------------------------" << std::endl;
return os;
}
There are quite a few problems here.
First, there is a standard pattern for implementing stream inserters, and it uses a technique called “hidden friends”.
What you do is put the entire inserter, body and all, inside the class, as a friend, like so:
template <typename T>
class Container
{
public:
friend auto operator<<(std::ostream& os, Container const& right) -> std::ostream&
{
// ... body ...
}
};
This has a couple of beneficial effects: the inserter becomes an associated function of the class, found only via argument-dependent lookup, so it doesn’t clutter the enclosing namespace or slow down unrelated overload resolution.
The next problem is: never use std::endl. It is always wrong. Always, always. If you want a newline, just use a newline ('\n').
The next problem is that you use the std::string literal operator on your string literals. In this situation, that is remarkably silly. Every time you do "..."s, you are constructing a std::string, which may require a memory allocation. And for what? The string contents are already static character array data, which already work just fine in stream operations (and even better, as you will see in a moment). You are literally just wasting cycles.
So let’s fix everything mentioned so far:
template <typename T>
class Container
{
public:
friend auto operator<<(std::ostream& os, Container const& right) -> std::ostream&
{
auto const size = right.Size();
os << "Size: " << size << '\n';
os << "--------------------------------\n";
for (std::size_t index = 0; index < size; ++index)
{
os << '<' << (index + 1) << "> " << right[index] << '\n';
}
os << "--------------------------------\n";
return os;
}
};
Now here’s a neat little fact about IOStreams that most people aren’t aware of: All output streams support char and char const*. You’re probably thinking, “yeah, duh, how else would you print characters and strings”. No, you’re not getting it. ALL output streams support char and char const*. ALL of them. Even output streams that aren’t char streams.
For example, std::wcout is a wchar_t stream. Naturally, it can write wchar_t and wchar_t const*. However, and this is the magical point: it, too, supports char and char const*.
Even when a stream doesn’t support std::string (like std::wcout does not), it will still support char const*. So since we have now converted all those unnecessary std::string literals into char const* literals, they will now work with ALL output stream types.
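A quick demonstration of the point; both of these lines compile, and the narrow literal is widened as it is inserted:
std::wcout << L"wide literal: " << 42 << '\n';
std::wcout << "narrow literal: " << 42 << '\n';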
We can take advantage of this, and do:
template <typename T>
class Container
{
public:
template <typename Char, typename Traits>
friend auto operator<<(std::basic_ostream<Char, Traits>& os, Container const& right) -> std::basic_ostream<Char, Traits>&
{
auto const size = right.Size();
os << "Size: " << size << '\n';
os << "--------------------------------\n";
for (std::size_t index = 0; index < size; ++index)
{
os << '<' << (index + 1) << "> " << right[index] << '\n';
}
os << "--------------------------------\n";
return os;
}
};
As you can see, we did not change a single thing in the body of the function. Yet we automatically, for free, get support for ALL output stream types. We can stream to wide-character streams. We can stream to Unicode streams. We can stream to any output stream type. For no extra work.
Extra stuff?
Now, this is where I might normally suggest a bunch of extra stuff you could add to this class to extend its usability.
However, I don’t think you should pursue this project. It will take an ungodly amount of work to get it correct, and you won’t learn many useful skills from it.
Instead, I suggest finding another type of project. Or, if you really want to make containers, pick a better container; don’t try to re-implement std::vector, try std::list or std::forward_list instead. Or try one of the associative containers. Or, hell, try something else, like a tree or a graph or something exotic.
But not std::vector.