
Looking through the implementation of the Java HashMap here: http://www.docjar.com/html/api/java/util/HashMap.java.html, I noticed the following:

The internal data structure is an array, where each index stores a reference to the first entry of a linked list. The array index is derived from the key's hashcode, and the linked list represents the bucket for that hashcode. What I found interesting is the method indexFor(int h, int length), which, for a given key, determines which bucket in the array to look in. But the implementation, return h & (length - 1), looks odd to me: for an indeterminate number of hashcodes that do not line up with a given array index, the method will return 0. So no matter what unique hashcode you implement for your object, bucket 0 of the array will most likely fill up with objects, and you lose the benefit a unique hashcode is supposed to give you, namely faster data access.
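For reference, the method in question is essentially a one-liner; roughly (a sketch based on the linked source):

    static int indexFor(int h, int length) {
        // length is always a power of two, so length - 1 is a mask of low 1 bits
        return h & (length - 1);
    }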

Am I missing something?

Cristian

2 Answers


You are missing the following Javadoc from the HashMap source code:

/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry<K,V>[] table;

This means that table.length - 1 will always be a sequence of 1 bits in binary, so h & (table.length - 1) simply keeps the low-order bits of the hash and spreads entries across all buckets rather than piling them into bucket 0.
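To make that concrete, here is a minimal sketch (with a hypothetical table length of 16) showing how the mask selects the low bits of the hash instead of mapping everything to bucket 0:

    public class MaskDemo {
        public static void main(String[] args) {
            int length = 16;            // power of two, as the Javadoc requires
            int mask = length - 1;      // binary 1111 -- all low bits set
            for (int h : new int[] {7, 23, 100, 12345}) {
                // only the low 4 bits of the hash decide the bucket,
                // so hashes are distributed across all 16 slots
                System.out.println(h + " -> bucket " + (h & mask));
            }
        }
    }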


2 Comments

I have another question, this time about the Hashtable class. In this implementation, docjar.com/html/api/java/util/Hashtable.java.html, the bucket index in the array is calculated as key.hashcode % array.length, and here array.length is not a power of two. Let's say we start with an array of length 11 and insert a key with hashcode 1, so the array index is 1, and then insert another key with hashcode 12, so the array index is also 1. Wouldn't this mean that the objects for these two keys, which have different hashcodes, are stored in the same bucket?
Correct, though if you specify a good initial capacity (e.g. 15) the resizing formula oldCapacity*2 + 1 will keep it good. A prime number is the typical recommendation in these cases and works reasonably well. If you specify a poor initial size, it will get better with the resizes. In the end it is not perfect; no data structure ever is. If you care that much about efficiency, I suggest you check labs.carrotsearch.com/hppc.html - it gives you enough rope to do whatever you want :-)
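A small sketch of the modulo indexing discussed in these comments (the 0x7FFFFFFF mask, which keeps the index non-negative, mirrors what the linked Hashtable source does before taking the remainder; the length of 11 is just the example from the comment above):

    public class ModuloDemo {
        public static void main(String[] args) {
            int length = 11;  // not a power of two, as in the Hashtable example
            for (int hash : new int[] {1, 12}) {
                int index = (hash & 0x7FFFFFFF) % length;
                // 1 % 11 == 1 and 12 % 11 == 1, so both keys land in bucket 1
                System.out.println("hashcode " + hash + " -> bucket " + index);
            }
        }
    }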

I can't quite understand what you believe the problem to be.

The expression h & (length - 1) is a simple way to calculate h % n where n is a power of two. There isn't, in my understanding, any reason why h % n should produce an unnaturally large number of zeros.
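A quick sketch that checks this equivalence (Math.floorMod is used for the comparison because Java's % operator can return a negative value for a negative hash, while the mask never does):

    public class EquivalenceCheck {
        public static void main(String[] args) {
            int n = 16;  // any power of two works
            for (int h : new int[] {0, 5, 31, -7, Integer.MIN_VALUE}) {
                int masked = h & (n - 1);
                // for a power-of-two n, masking equals the mathematical h mod n
                System.out.println(h + ": mask = " + masked
                        + ", floorMod = " + Math.floorMod(h, n));
            }
        }
    }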

