deleted 641 characters in body

edited Sep 25, 2016 at 21:44

user116966

In hash_table(std::size_t m): The body is simply zero-initialization of the mark array. The same can be achieved with:

 hash_table(std::size_t m)
     : M(power_of_2_ceiling(m)),
     table(new std::pair<K, V>[M]),
     mark(new bool[M]{}), N(0)
 { }

In hash_table(hash_table<K, V>&& map): Is there a reason to set M and N to zero? A moved-from object only needs to be properly destructible, but the destructor does not use M or N.
You can refer to the current instantiation of a template class without specifying the arguments, i.e. hash_table(hash_table&& map) would be fine for the move constructor, too.
To use std::unordered_map you need to include <unordered_map>.
hash_table(std::unordered_map<K, V>&& map): You have an rvalue reference constructor here, but none taking a const reference. That seems inconsistent.

hash_table(std::unordered_map<K, V>&& map) initially does the same as hash_table(std::size_t), so you could simply delegate to it:

 hash_table(std::unordered_map<K, V>&& map)
     : hash_table(map.size())
 {
     for (auto &p : map) {
         insert(std::move(p));
     }
 }

hash_table<K, V>& operator=(hash_table<K, V> other): There is no reason to pass by value here. There should be one copy assignment operator taking a const reference and one move assignment operator taking an rvalue reference. swap(*this, other); would then also be the wrong behavior. You need to explicitly handle memory reallocations here. (dropped)
With a proper move assignment operator you would not really need to define a swap. The default implementation of std::swap will already do the correct thing, probably without much of a performance difference, see here. (dropped)
You could use just one array of type std::tuple<K,V,bool> (or a custom struct) instead of the two arrays table and mark. Then you would not have to repeat the memory management code twice, and related data would sit closer together in memory, which suits the access pattern. Together with the std::vector suggestion (point 2), this would reduce overhead even more.
Your only method that returns elements, find, returns a const std::pair<K,V>*, which makes it impossible to modify the value of an entry in place. I think that is a rather strong limitation. Of course the key must not be changed in place, but the value should be modifiable. Therefore the return type should be std::pair<const K,V>*, and table should also hold elements of type std::pair<const K,V>.
The maximal size of your hash_table is chosen somewhat arbitrarily, and once it is full an exception is thrown. This might have been outside the scope of your intentions, but really the table should grow and rehash once a certain load is reached.
Your insert takes an rvalue reference. This means that only temporaries or explicitly moved elements can be inserted. There should also be an overload taking a const reference (which copies instead of moving).
size() should be const.

answered Sep 25, 2016 at 19:40

user116966

Here are a few things I noticed reading your code:

You should probably use more descriptive identifiers for M and N here.
Instead of dynamic allocation you could also use std::vector<...> for table and mark. That would simplify your code considerably. For example, you could drop the custom copy/move constructors, the assignment operators and the destructor and use the implicitly defined ones instead. You could also use the .size() method of std::vector to get M instead of storing it explicitly.

If you avoided std::vector for performance reasons, I doubt it would be a problem. Element access goes through one indirection either way, and you have no resize operations. Simply construct the vectors with the proper size in your constructor so the memory is allocated immediately (reserve alone only allocates and does not create elements that can be indexed). Only the memory required for std::vector might be a bit larger, since it additionally stores the current size and the reserved capacity.

Even if you still do not want to use std::vector, you should consider writing an additional class (heap_array?) to handle the memory management, or you could use std::unique_ptr, which would at least make your custom move constructor, move assignment operator and destructor redundant.
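To illustrate the std::vector suggestion, here is a minimal sketch. The class name vector_backed_table and the members slots_, used_, count_ and put_at are made up for this example and do not come from your code. It uses std::vector<unsigned char> for the marks to sidestep the std::vector<bool> specialization, which is my own choice rather than a requirement.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch: all names here are illustrative assumptions.
template <typename K, typename V>
class vector_backed_table {
public:
    // Both vectors are constructed with their final size, so every slot
    // exists and can be indexed right away.
    explicit vector_backed_table(std::size_t capacity)
        : slots_(capacity), used_(capacity), count_(0) {}

    // No destructor, copy/move constructor or assignment operator is
    // declared: the implicitly defined ones copy and move the vectors.

    std::size_t capacity() const { return slots_.size(); }  // replaces M
    std::size_t size() const { return count_; }              // replaces N

    // Helper so the sketch can be exercised; a real table would derive
    // the index from the hash of the key.
    void put_at(std::size_t index, K key, V value) {
        if (!used_[index]) ++count_;
        slots_[index] = std::pair<K, V>(std::move(key), std::move(value));
        used_[index] = 1;
    }

    bool occupied(std::size_t index) const { return used_[index] != 0; }
    const std::pair<K, V>& at(std::size_t index) const { return slots_[index]; }

private:
    std::vector<std::pair<K, V>> slots_;
    std::vector<unsigned char> used_;  // value-initialized to 0 (empty)
    std::size_t count_;
};
```

Copying or moving such an object works without any hand-written special member functions, which is the main point of the suggestion.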

In hash_table(std::size_t m): The body is simply zero-initialization of the mark array. The same can be achieved with:

 hash_table(std::size_t m)
     : M(power_of_2_ceiling(m)),
     table(new std::pair<K, V>[M]),
     mark(new bool[M]{}), N(0)
 { }

In hash_table(hash_table<K, V>&& map): Is there a reason to set M and N to zero? A moved-from object only needs to be properly destructible, but the destructor does not use M or N.
You can refer to the current instantiation of a template class without specifying the arguments, i.e. hash_table(hash_table&& map) would be fine for the move constructor, too.
To use std::unordered_map you need to include <unordered_map>.
hash_table(std::unordered_map<K, V>&& map): You have an rvalue reference constructor here, but none taking a const reference. That seems inconsistent.

hash_table(std::unordered_map<K, V>&& map) initially does the same as hash_table(std::size_t), so you could simply delegate to it:

 hash_table(std::unordered_map<K, V>&& map)
     : hash_table(map.size())
 {
     for (auto &p : map) {
         insert(std::move(p));
     }
 }

hash_table<K, V>& operator=(hash_table<K, V> other): There is no reason to pass by value here. There should be one copy assignment operator taking a const reference and one move assignment operator taking an rvalue reference. swap(*this, other); would then also be the wrong behavior. You need to explicitly handle memory reallocations here.
With a proper move assignment operator you would not really need to define a swap. The default implementation of std::swap will already do the correct thing, probably without much of a performance difference, see here.
You could use just one array of type std::tuple<K,V,bool> (or a custom struct) instead of the two arrays table and mark. Then you would not have to repeat the memory management code twice, and related data would sit closer together in memory, which suits the access pattern. Together with the std::vector suggestion (point 2), this would reduce overhead even more.
Your only method that returns elements, find, returns a const std::pair<K,V>*, which makes it impossible to modify the value of an entry in place. I think that is a rather strong limitation. Of course the key must not be changed in place, but the value should be modifiable. Therefore the return type should be std::pair<const K,V>*, and table should also hold elements of type std::pair<const K,V>.
The maximal size of your hash_table is chosen somewhat arbitrarily, and once it is full an exception is thrown. This might have been outside the scope of your intentions, but really the table should grow and rehash once a certain load is reached.
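As a rough sketch of what growing by rehashing could look like: the class growing_table, the nested Slot struct, the initial capacity of 8 and the maximum load of 1/2 are all assumptions for illustration, not taken from your code. It uses linear probing and keeps the capacity a power of two so that masking can replace the modulo.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical sketch of growth by rehashing; names and constants are assumed.
template <typename K, typename V>
class growing_table {
    struct Slot {
        std::pair<K, V> entry;
        bool used = false;
    };

public:
    growing_table() : slots_(8), count_(0) {}  // capacity stays a power of two

    void insert(std::pair<K, V> entry) {
        // Grow before the table would become more than half full.
        if (2 * (count_ + 1) > slots_.size()) rehash(2 * slots_.size());
        place(slots_, std::move(entry), count_);
    }

    const std::pair<K, V>* find(const K& key) const {
        std::size_t mask = slots_.size() - 1;
        for (std::size_t i = std::hash<K>{}(key) & mask; slots_[i].used;
             i = (i + 1) & mask) {
            if (slots_[i].entry.first == key) return &slots_[i].entry;
        }
        return nullptr;  // reached an empty slot, so the key is absent
    }

    std::size_t size() const { return count_; }
    std::size_t capacity() const { return slots_.size(); }

private:
    // Linear probing; overwrites the value if the key is already present.
    static void place(std::vector<Slot>& slots, std::pair<K, V> entry,
                      std::size_t& count) {
        std::size_t mask = slots.size() - 1;
        std::size_t i = std::hash<K>{}(entry.first) & mask;
        while (slots[i].used && !(slots[i].entry.first == entry.first)) {
            i = (i + 1) & mask;
        }
        if (!slots[i].used) ++count;
        slots[i].entry = std::move(entry);
        slots[i].used = true;
    }

    // Allocate a larger array and re-insert every occupied slot into it.
    void rehash(std::size_t new_capacity) {
        std::vector<Slot> fresh(new_capacity);
        std::size_t moved = 0;
        for (Slot& s : slots_) {
            if (s.used) place(fresh, std::move(s.entry), moved);
        }
        slots_.swap(fresh);
    }

    std::vector<Slot> slots_;
    std::size_t count_;
};
```

Because the load never exceeds 1/2, there is always an empty slot, so the probing loops terminate and insert never has to throw for a full table.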
Your insert takes an rvalue reference. This means that only temporaries or explicitly moved elements can be inserted. There should also be an overload taking a const reference (which copies instead of moving).
size() should be const.
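
The last few points (one array instead of two, a find that allows modifying the value, both insert overloads, a const size()) could fit together roughly as follows. This is only a sketch under assumptions: small_table and its members are hypothetical names, it requires C++17, and it swaps std::optional in for the std::tuple<K,V,bool> idea so that the "occupied" flag and the entry live in one element and a std::pair<const K,V> can be constructed in place.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical sketch (C++17): fixed capacity, linear probing, no erase.
template <typename K, typename V>
class small_table {
public:
    using value_type = std::pair<const K, V>;

    explicit small_table(std::size_t capacity) : slots_(capacity), count_(0) {}

    // Copying overload for lvalues.
    void insert(const value_type& entry) { emplace(entry.first, entry.second); }
    // Moving overload for temporaries and explicitly moved values.
    void insert(value_type&& entry) { emplace(entry.first, std::move(entry.second)); }

    // The value can be modified through the result, the key cannot.
    value_type* find(const K& key) {
        std::size_t i = locate(key);
        return (i < slots_.size() && slots_[i]) ? &*slots_[i] : nullptr;
    }
    const value_type* find(const K& key) const {
        std::size_t i = locate(key);
        return (i < slots_.size() && slots_[i]) ? &*slots_[i] : nullptr;
    }

    std::size_t size() const { return count_; }

private:
    template <typename Key, typename Value>
    void emplace(Key&& key, Value&& value) {
        std::size_t i = locate(key);
        if (i == slots_.size()) throw std::length_error("small_table is full");
        if (slots_[i]) {  // key already present: overwrite the value only
            slots_[i]->second = std::forward<Value>(value);
            return;
        }
        slots_[i].emplace(std::forward<Key>(key), std::forward<Value>(value));
        ++count_;
    }

    // Index of the slot holding key, or of the first empty slot on its probe
    // path; returns slots_.size() if the table is full and key is absent.
    std::size_t locate(const K& key) const {
        if (slots_.empty()) return slots_.size();
        std::size_t start = std::hash<K>{}(key) % slots_.size();
        for (std::size_t step = 0; step < slots_.size(); ++step) {
            std::size_t i = (start + step) % slots_.size();
            if (!slots_[i] || slots_[i]->first == key) return i;
        }
        return slots_.size();
    }

    std::vector<std::optional<value_type>> slots_;  // empty optional = free slot
    std::size_t count_;
};
```

With this shape, t.find(key)->second = new_value; compiles on a non-const table, while the same call on a const table only yields read access.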