12

I'm trying to understand java.util.Collection and java.util.Map a little deeper but I have some doubts about HashSet funcionality:

In the documentation, it says: This class implements the Set interface, backed by a hash table (actually a HashMap instance). Ok, so I can see that a HashSet always has a Hashtable working in background. A Hashtable is a struct that asks for a key and a value everytime you want to add a new element to it. Then, the value and the key are stored in a bucket based on the key hashCode. If the hashcodes of two keys are the same, they add both key values to the same bucket, using a LinkedList. Please, correct me if I said something wrong.

So, my question is: If a HashSet always has a Hashtable acting in background, then everytime we add a new element to the HashSet using HashSet.add() method, the HashSet should add it to its internal Hashtable. But, the Hashtable asks for a value and a key, so what key does it use? Does it just uses the value we're trying to add also as a key and then take its hashCode? Please, correct me if I said something wrong about HashSet implementation.

Another question that I have is: In general, what classes can use the hashCode() method of an java object? I'm asking this because, in the documentation, it says that everytime we override equals() method we need to override hashCode() method. Ok, it really makes sense, but my doubt is if it's just a recommendation we should do to keep everything 'nice and perfect' (putting in this way), or if it's really necessary, because maybe a lot of Java defaults classes will constantly uses hashCode() method of your objects. In my vision, I can't see other classes using this method instead of those classes related to Collections. Thank you very much guys

1
  • 1
    The "values" you add to the set are actually the keys to the map. The values that HashSet puts in the map are irrelevant. The value it actually uses is a dummy object reference to an object named PRESENT. Commented Jul 14, 2014 at 18:08

2 Answers 2

15

If you look at the actual javacode of HashSet you can see what it does:

 // Dummy value to associate with an Object in the backing Map
private static final Object PRESENT = new Object();
...

 public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}

So the element you are adding is the Key in the backing hashmap with a dummy value as the value. this dummy value is never actually used by the HashSet.

Your second question regarding overriding equals and hashcode:

It is really necessary to always override both if you want to override either one. This is because the contract for hashCode() says equal objects must have the same hashcode. the default implementation of hashcode will give different values for each instance.

Therefore, if you override equals() but not hashCode() This could happen

object1.equals(object2) //true

MySet.add(object1);

MySet.contains(object2); //false but should be true if we overrode hashcode()

Since contains will use hashcode to find the bucket to search in we might get a different bucket back and not find the equal object.

Sign up to request clarification or add additional context in comments.

3 Comments

Thank you very much! So, based on what you've said, if the values we add to the HashSet are used as keys in its internal HashMap, everytime we create a HashSet of a specific type of Object, then the hashCode() method of this type should be Overrided and working well, if we want to get maximum perfomance, right? One last question is: The hashcode() is also important to lists, as well?(for example ArrayList)And what about sets that doesn't have 'Hash' in the name, for example TreeSet? I guess TreeSet is implemented using a tree instead of hashtable, so the hashCode() is relevant or not?
The hash code is not used by ArrayList, LinkedList, or TreeSet. It's used by the hash-based collections HashMap, HashSet, ConcurrentHashMap, etc. You can always check the javadoc for a collection class. It should document if it is hash based.
@David Conrad Thank you very much! You guys solved all my doubts :)
4

If you look at the source for HashSet (the source comes with the JDK and is very informative), you will see that it creates an object to use as the value:

// Dummy value to associate with an Object in the backing Map
private static final Object PRESENT = new Object();

Each value that is added to the HashSet is used as a key to the backing HashMap with this PRESENT object as the value.

Regarding overriding equals() whenever you override hashCode() (and vice versa), it is very important that these two methods return consistent results. That is, they should agree with one another. For more details, see the book Effective Java by Josh Bloch.

1 Comment

Actually I can't, I need 15 reputation :P

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.