
Redis uses a single thread, not just a single process, so if HyperDex is multi-threaded you are comparing a single core against multiple cores. As you can see with memcached, which is able to use multiple cores (a feature that Redis is going to implement soon), this leads to a big performance improvement in this kind of benchmark.

EDIT: (I checked that HyperDex actually uses threads and multiple cores.) If you want a fair comparison you should run HyperDex on a single core as well, or you can run N instances of Redis (one per core) and write the benchmark so that it uses all the instances.
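One way a benchmark client could use all N instances is to hash each key to an instance index. The sketch below is only an illustration under stated assumptions: the function names and the choice of CRC32 are hypothetical, not something YCSB or Redis prescribes, and it only computes the mapping without opening any connection.

```python
import zlib


def instance_for_key(key: str, num_instances: int) -> int:
    """Map a key to one of num_instances Redis processes (0-based index)."""
    if num_instances < 1:
        raise ValueError("num_instances must be at least 1")
    # crc32 is stable across runs, unlike Python's built-in hash() for str.
    return zlib.crc32(key.encode("utf-8")) % num_instances


def group_keys_by_instance(keys, num_instances):
    """Group keys by the instance that would own them."""
    groups = {i: [] for i in range(num_instances)}
    for key in keys:
        groups[instance_for_key(key, num_instances)].append(key)
    return groups
```

With one instance per core, each benchmark client would send a command for a given key to the connection at index `instance_for_key(key, N)`, so all N processes receive a share of the load.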

I'll check the YCSB Redis bindings, I never looked at them before. Thanks for the reply.

EDIT2: there are also problems with the YCSB Redis bindings:

1) It basically forces an object-store data model on Redis, so it only uses hashes to store objects, and every time an object is stored or deleted, a sorted set is updated as well. This is a possible use case of Redis but not a very idiomatic / representative one.

2) Even for benchmarks not involving searching, the sorted set is updated anyway.

3) There is no pipelining used to alter the object and store the sorted set. Every operation pays 2x the Round Trip Time in the Redis bindings.

4) Even worse, there is no pipelining in the "search" operation, so you may pay the RTT a lot of times when you do a scan operation with these bindings.
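To make the cost in points 3 and 4 concrete, here is a back-of-the-envelope model of time spent waiting on the network. It is a rough sketch, not a measurement: the function name and the example figures are hypothetical, and it ignores server processing time and assumes one client issuing operations sequentially.

```python
def total_network_time_ms(num_ops: int, commands_per_op: int,
                          rtt_ms: float, pipelined: bool) -> float:
    """Network wait time only; server processing time is ignored.

    Without pipelining every command pays its own round trip.
    With pipelining all commands of one operation share one round trip.
    """
    round_trips_per_op = 1 if pipelined else commands_per_op
    return num_ops * round_trips_per_op * rtt_ms


# Hypothetical figures: 1000 writes, each touching the hash and the
# sorted set (2 commands), with an assumed 0.5 ms round trip time.
unpipelined = total_network_time_ms(1000, 2, 0.5, pipelined=False)
with_pipeline = total_network_time_ms(1000, 2, 0.5, pipelined=True)
```

The same model applies to a scan: if a scan issued one command per returned record, `commands_per_op` would grow with the scan length, and so would the unpipelined cost.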

The minimal change is to modify the YCSB bindings to use pipelining when possible (almost always, actually). Still, I think that an intermediate layer to turn Redis into an automatically-indexing object store does not make sense. Another big problem is that you are comparing multiple cores vs a single core.

So if you really are interested in a comparison between HyperDex and Redis you should pick a use case and model it using the best tools each database offers.

What I would recommend is to use the following use cases.

1) Populate the two DBs with 10 million hashes, then write a benchmark where 50 clients simultaneously get and set specific fields.

2) Like "1" but increment a field by 10 at every write query.

3) Simulate a leaderboard where 50 clients simultaneously update the scores of the different "players" in one operation, and ask for the top 10 users in the leaderboard with another operation.

4) Simulate a capped collection where you always add the latest news on a web site, and you can get the top-10 to show on the home page. Each of the 50 clients should write a single item and fetch the top-10 items.

And so forth. Always use 10 million objects and mixed reads and writes with 50 clients at the same time. Write the best code for both DBs (I can help with Redis).
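As a rough idea of what the Redis side of use cases 2, 3 and 4 might look like, here is a minimal sketch. It assumes a redis-py style client object is passed in (with `pipeline()`, `zadd`, `zrevrange`, `lpush`, `ltrim`, `lrange` and `hincrby`); the function names, key names and the cap of 1000 items are hypothetical choices for illustration, not a tuned benchmark.

```python
def increment_field(client, key, field, amount=10):
    """Use case 2: increment one hash field with a single HINCRBY."""
    return client.hincrby(key, field, amount)


def update_score_and_get_top10(client, board, player, score):
    """Use case 3: update a score and read the top 10 in one round trip."""
    pipe = client.pipeline()
    pipe.zadd(board, {player: score})             # set the player's score
    pipe.zrevrange(board, 0, 9, withscores=True)  # 10 highest scores
    _, top10 = pipe.execute()
    return top10


def push_news_and_get_top10(client, key, item, cap=1000):
    """Use case 4: capped list of latest news, newest first."""
    pipe = client.pipeline()
    pipe.lpush(key, item)        # newest item goes to the head
    pipe.ltrim(key, 0, cap - 1)  # keep only the newest `cap` items
    pipe.lrange(key, 0, 9)       # 10 latest items for the home page
    _, _, latest = pipe.execute()
    return latest
```

Each of the 50 benchmark clients would call one of these in a loop with its own connection. Sending the write and the read through one pipeline means each iteration pays a single round trip.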

This time you are truly comparing the two DBs in a real world scenario.



I think YCSB is meant so that experts in each technology can tweak it for their system. So, it looks to me like you're in a great place to make that happen, and I'd love to see it!

Also, it's extremely difficult to get a speedup of 4x on 4 cores, so I'll believe that argument when I see it. It seems to me that with the current implementation of Redis, you'd run into some serious problems with memory management if you run 4 Redis nodes on one 4-core machine.
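For reference, the usual way to quantify how far short of 4x a 4-core run can fall is Amdahl's law. The sketch below is my illustration, not something measured on Redis or HyperDex: the parallel fractions in the example are hypothetical, and whether N independent Redis processes behave like a partially serial program is exactly the open question.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Amdahl's law: speedup = 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Hypothetical workloads on 4 cores.
fully_parallel = amdahl_speedup(1.0, 4)   # only this case reaches 4x
mostly_parallel = amdahl_speedup(0.9, 4)  # 10% serial work already costs a lot
```

Under this model a 4x speedup on 4 cores requires that essentially nothing is serialized, for example no shared lock, memory allocator or network bottleneck between the instances.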

Also, you say: "This is a possible use case of Redis but not a very idiomatic / representative one." Can you elaborate? What is idiomatic? From the front page of the Redis site: "Redis is an open source, advanced key-value store."

Also, on a slightly different topic, I can't seem to find any real documentation about the consistency guarantees of Redis, so I thought you might be able to point me in the right direction. It appears that the master/slave replication scheme in Redis just backs up onto the slaves eventually, but the master immediately returns. Is that true? If a master goes down (disk and all), could a write have been confirmed to the client which is missing on a slave? What replication protocol does it use? I find the documentation lacking in this regard and it'd be great if you could point me to some actual technical specifications. Thanks!



