Peter Stuifzand

Hash function performance

In "Hash function performance", Dave Winer writes:

It’s been a long time since we looked at the object database in Frontier, so today Brent wrote a little app that checks out how the hash function distributes objects across the buckets for each table in the database.

[… result distribution among buckets …]

This looks pretty random to us.

And he’s right, in the sense that the result is just a bunch of random numbers. That is not something you would want from a hash function.
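To make concrete what a check like Brent's might look like, here is a minimal sketch in Python. It is not Frontier's code: the post doesn't say which hash function or how many buckets Frontier uses, so `zlib.crc32` is only a stand-in, and the 11 buckets and the `item0`, `item1`, ... keys are made up for illustration.

```python
import zlib
from collections import Counter

def bucket_counts(keys, num_buckets):
    """Count how many keys land in each bucket.

    CRC32 is a stand-in hash function (an assumption, not Frontier's).
    """
    counts = Counter(zlib.crc32(k.encode("utf-8")) % num_buckets for k in keys)
    # Return one count per bucket, including buckets that stayed empty.
    return [counts.get(b, 0) for b in range(num_buckets)]

# Hypothetical keys and bucket count, chosen only for the example.
keys = [f"item{i}" for i in range(1000)]
counts = bucket_counts(keys, 11)
print(counts)
```

With 1000 keys and 11 buckets, a perfectly even spread would put about 91 keys in each bucket, so the printed counts show at a glance how far the function strays from that.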

Meanwhile, on the Hacker News page about this, people are talking about a few other things that they thought they should read into the article:

  1. Ignorance. I think it’s sarcasm actually.
  2. Performance. Performance isn’t always about speed. Here it is about how well the function distributes objects across the buckets.
  3. Hash function (as in cryptography). In cryptography you want functions for which finding two documents that collide is practically impossible. In an object database, collisions aren’t a problem; they are assumed to happen.
  4. Using a built-in hash function. If you built your own platform and language, there isn’t always a built-in hash function. Sometimes you need to write one yourself.
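
Points 3 and 4 can be sketched together in a few lines of Python. This is an illustration only, not what Frontier does: FNV-1a is just one well-known, simple non-cryptographic hash that is easy to write by hand, and `ChainedTable` is a hypothetical table in which keys that collide simply share a bucket.

```python
FNV_OFFSET_BASIS = 0x811C9DC5  # 32-bit FNV offset basis
FNV_PRIME = 0x01000193         # 32-bit FNV prime

def fnv1a_32(data: bytes) -> int:
    """Return the 32-bit FNV-1a hash of data."""
    h = FNV_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME) & 0xFFFFFFFF  # keep the result within 32 bits
    return h

class ChainedTable:
    """Hypothetical hash table; colliding keys share a bucket's list."""

    def __init__(self, num_buckets=11):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key: str):
        return self.buckets[fnv1a_32(key.encode("utf-8")) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))       # a collision just extends the list

    def get(self, key, default=None):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return default

table = ChainedTable(num_buckets=1)  # one bucket forces every key to collide
table.put("scripts", 1)
table.put("system", 2)
print(table.get("scripts"), table.get("system"))  # prints: 1 2
```

Even with every key in the same bucket, lookups still return the right values. Collisions only make the table slower, which is why the distribution across buckets matters more here than collision resistance.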
© 2023 Peter Stuifzand