The goal is to distribute approximately 100 million variable-length strings, averaging 100 characters each, uniformly among 100 million buckets. Perfection is not required, just no egregious clumping. The strings are URLs. They tend to begin and end in much the same way (e.g. "http://" or "www." at the start, and ".com", ".edu", "aspx" and so forth at the end) and show their greatest variation in the latter half of the string, except for the final few characters.
What's a good algorithm that would accept the string and the number of slots as inputs, return a number between 0 and SlotCount-1, and satisfy the uniform-distribution requirement?
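To make the requested interface concrete, here is a minimal sketch of one candidate, not a definitive answer. It assumes 64-bit FNV-1a over the UTF-8 bytes of the URL, followed by an XOR fold of the high half into the low half before the modulo. The name `url_to_slot`, the choice of FNV-1a, and the fold step are all assumptions made for illustration. Because every byte is mixed into the state, the shared prefixes and suffixes should not by themselves cause clumping.

```python
# Illustrative sketch only: FNV-1a (64-bit) plus an XOR fold, reduced modulo
# the slot count. The function name and algorithm choice are assumptions.

FNV_OFFSET_BASIS_64 = 0xCBF29CE484222325  # standard 64-bit FNV offset basis
FNV_PRIME_64 = 0x100000001B3              # standard 64-bit FNV prime
MASK_64 = 0xFFFFFFFFFFFFFFFF              # keep arithmetic within 64 bits


def url_to_slot(url: str, slot_count: int) -> int:
    """Map a URL to a slot index in the range 0 .. slot_count - 1."""
    if slot_count <= 0:
        raise ValueError("slot_count must be positive")

    h = FNV_OFFSET_BASIS_64
    for byte in url.encode("utf-8"):
        h ^= byte                          # FNV-1a: XOR the byte in first...
        h = (h * FNV_PRIME_64) & MASK_64   # ...then multiply by the prime

    # Fold the high 32 bits into the low 32 bits so that the result of the
    # modulo does not depend only on the weaker low-order bits.
    h ^= h >> 32

    return h % slot_count


if __name__ == "__main__":
    # Hypothetical URL, used only to show the call shape.
    print(url_to_slot("http://www.example.com/some/path/page.aspx", 100_000_000))
```

With 100 million slots the bias introduced by the modulo on a 64-bit value is negligible. Whether the spread is good enough on the real URL set is something to measure on a sample rather than assume.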