Return to Revisions

4 of 4

added 2549 characters in body

edited Jul 21, 2023 at 20:32

97.7k
5
126
341

As vector is such a common thing we review here I have written up a series of articles about implementing a vector.

Overview

Very good.

I found one major bug in push_back(). You have potential memory leaks in reserve() and shrinktofit() that are easy to fix. You can simplify your assignment operator (currently you have copy and move versions) they can be combined into a single version that works for both.

Minor comments about some missing functionaity that is in std::vector.

Code review

Sure you can have empty vectors.

    Vector(): data{nullptr}, m_size{0}, m_capacity{0} {}

But I am wondering if the best strategy is not to simply always allocate capacity for minimal size. How often is a vector allocated but not used?

But pointed out by @chrysante below, the std::vector default constructor is noexcept so can't have any memory allocation (as that can potentially throw). So if you want to go that route you can mark this default constructor noexcept.

    Vector()  noexcept
        : data{nullptr}
        , m_size{0}
        , m_capacity{0}
    {}

One comment on style. Like in variable declarations in the code blocks its nice to have member initialization in the constructor one per line (its easier to read). You are not trying to save vertical space.

Slight deviation from the interface of std::vector! Sure you can do it. But it will confuse people. Also I use it to simplify things below. I have the same constructor in my class but mine is private so it can only be used internally.

    explicit Vector(size_type capacity) : Vector() {
        resize(capacity);
    }

You can simplify these two assignments into a single method:

    Vector& operator=(const Vector& other) {
        Vector temp = other;
        swap(temp);
        return *this;
    }

    Vector& operator=(Vector&& other) noexcept {
        swap(other);
        return *this;
    }

Easy to write as:

    // Notice the parameter is value.
    // If passed a RValue it is moved into the parameter.
    // If passed an LValue it is copied.
    // So you get the same effect with less code.
    Vector& operator=(Vector other) noexcept
    {
        swap(other);
        return *this;
    }

This is good:

    reference operator[](size_type index) {
        return data[index];
    }

But what about accesses to a const Vector? Just because you can't modify does not mean you can't use the operator[] on it.

    const_reference operator[](size_type index) const
    {
        return data[index];
    }

While we are here: Why is there no at() method?

    void resize(size_type count, const_reference value = T()) {
                         // Not valid here            ^^^^^^
                         // That should be in the header only

This is fine. But std::vector destroys them from back to front. Just like how it deletes elements during destruction. This is to mimic the behavior of the C-style array (objects are destroyed in reverse order of creation).

        if (count < m_size) {
            for (size_type i = count; i < m_size; i++) {
                data[i].~T();
            }

        } else if (count > m_size) {
            if (count > m_capacity) {
                reserve(count);
            }

            for (size_type i = m_size; i < count; i++) {
                new (data + i) T(value);
            }
        }

        m_size = count;
    }

    void reserve(size_type count) {
        if (m_capacity < count) {
            pointer newData = allocate(count);

            // Note sure why you need to check against 0 here.
            // Copy zero data is a NOOP so would not be more expensive.
            // If I had to guess this is actually a pesimization as you
            // code has an extra branch.
            if (m_size > 0) {
                // A new function to me.
                // Thats really cool.
                std::uninitialized_move(data, data + m_size, newData);
            }

            deallocate(data);
            data = newData;
            m_capacity = count;
        }
    }

The one thing here I would watch is that you have a potential leak.

It's hard to spot. But if the type T does not support move construction then the compiler will use the copy constructor during the std::uninitialized_move. If one of the copy constructors fails (i.e. throws) then you will leave this function needing to clean up newData. Though your function does provide the strong exception guarantee.

You can make this simpler by re-using the Vector :-)

    void reserve(size_type count) {
        if (m_capacity < count) {
            Vector   temp(count)           // Use your vector with pre-reserved size.

            // Remember that Vector is a friend of Vector
            // So you can reach into the other class and mess with
            // its members (just remember to unit test).
            std::uninitialized_move(data, data + m_size, temp.data);
            temp.m_size = m_size;
            swap(temp);
        }
    }

This is broken. You are pushing into uninitialized memory so you need to construct the object in place.

    void push_back(const_reference item) {
        reserve_if_needed();
        data[m_size++] = item;

        // Should be this.
        new (std::addressof(data[m_size++])) T(item);
    }

What about pushing an RVALUE?

Replace the above with:

    // Note: Pass by value.
    //       RVALUE are moved into item then moved to container.
    //       LVALUE are copied into item then moved to the container.
    void push_back(T item) {
        reserve_if_needed();
        new (std::addressof(data[m_size++])) T(std::move(item));
    }

        if (m_size > 0) {
            // Why not swap the next two lines?
            // This would make the line to call the destructor
            // simpler and easier to read as you don't need the -1
            data[m_size - 1].~T();
            m_size -= 1;
        }

Same issue as reserve()

    void shrink_to_fit() {
        
        if (m_capacity > m_size) {
            pointer new_data = allocate(m_size);

            if (m_size > 0) {
                std::uninitialized_move(data, data + m_size, new_data);
            }
            
            deallocate(data);

            data = new_data;
            m_capacity = m_size; 
        }
    }

Same solution:

    void shrink_to_fit() {
        if (m_capacity > m_size) {
            Vector   temp(m_size);      // Vector with reserved capacity
            std::uninitialized_move(data, data + m_size, temp.data);
            temp.m_size = m_size;   
            swap(temp);
        }
    }

Sure standard iterators:

    // Iterator support 
    iterator begin() const {
        return data;
    }

    iterator end() const {
        return data + m_size;
    }

But what about const iterators or reverse iterators or const reverse iterators? Also when calling begin() on a const object you will get a const_iterator.

    iterator               begin()         { // normal begin
    const_iterator         begin()   const { // begin on const object
    const_iterator         cbegin()  const { // explicitly asking for const
    reverse_iterator       rbegin()        { // normal rbegin
    const_reverse_iterator rbegin()  const { // rbegin on const object
    const_reverse_iterator crbegin() const { // explicitly asking for re

etc

To be similar to C-style arrays (and the C++ standard idiom that objects are destroyed in reverse order of creation), you should destroy the members in reverse order.

    void clear() noexcept {
        for (std::size_t i = 0; i < m_size; i++) {
            data[i].~T();
        }

        m_size = 0;
    }

No need to check for null pointers here!

    void deallocate(pointer p) {
        if (p != nullptr) {
            ::operator delete(p);
        }
    }

Questions:

What is the slight deviation from the standard interface in the explicit Vector(size_type capacity)? Is it that the param is named 'capacity' and not 'count?

If you look at the standard (I link to a non standard but reputable source) std::vector::vector you will see there is no constructor that takes a "capacity" (ie. you have allocated space but zero size). There is one that takes a size and fills it with values (but you have that one).

When you say "T()" should be in the header only, do you mean the template declaration?

No. Default Parameters should be defined in the class header file only. They are not part of the declaration:

class Vector
{
    void resize(size_type count, const_reference value = T());
    // This is good.
    // Reading the interface you see you can have a default value.
};


// Don't define the default parameters here.
// It should generate an error if you do this.
void Vector::resize(size_type count, const_reference value) 
{
    // STUFF
}

How does constructing the item in place in push_back() resolve the uninitialized memory issue?

This is a problem becasue:

  // This code uses the assignment operator.
  data[m_size++] = item;

  // The assignment operator assumes that the object
  // referenced on the left hand side has already been constructed
  // but in your case that is not true, this is unintialized memory.
  // So you are using assignment to uninitialized memory
  // which could be anything and thus with non trivial T
  // will cause an issue.

This is solved by constructing in place:

        // Should be this.
        new (std::addressof(data[m_size++])) T(item);

The constructor assumes the memory has not been constructed before. You pass the address to operator new it will not allocate space but simply use the pointer provided as the location and then call the constructor for the type T to correctly initialize the memory.

do you have any references or documentation on the "pass by value" idiom for avoiding the two assignment operators?

Nope. There are lots of questions on Stackoverflow that go over this though. Should be simple to find.

And does the SFINAE approach involve checking std::is_nothrow_move_constructible or something in overload resolution? I should look into how this is done in some compiler implementation

Yep.

answered Jul 21, 2023 at 1:46

Loki Astari

97.7k
5
126
341