So I've decided to cut another CR on 3.0.0. I wasn't originally planning on it and this would have gone gold save for the fact that a few important JIRAs did get in, which have pretty significant impact on performance (going northwards, of course - JBCACHE-1419 is notable, which contributed to a huge performance boost on async replication).
Anyway, to cut a long story short, download this final CR here, provide feedback here. Changelog is in JIRA, and all-new documentation is also available at last. If all goes well, I hope to cut a GA towards the end of next week, so I would appreciate as much feedback as possible - both on the release, as well as the documentation, tutorials, sample configs, etc.
And now for the fun stuff - I've finally found the time to put together some replicated benchmarks, to go with the standalone benchmarks I published a while back.
Kit used
The benchmarks were run on Red Hat's lab cluster of 8 servers, connected via gigabit ethernet.
- Quad-CPU Intel Xeon 3GHz servers with 4GB of RAM each
- RHEL 4 x86_64 with kernel version 2.6.9-42.0.10.ELsmp
- SUN JDK 1.5.0_11-b03 (32-bit)
- Benchmarks generated using the CacheBenchFramework
- A single thread on each server instance, reading 90% of the time and writing 10% of the time
Benchmark 1: Comparing 2.2.0 "Poblano" and 3.0.0 "Naga" (synchronous replication)
This benchmark compares 2.2.0.GA with 3.0.0.CR2, using synchronous replication and pessimistic locking (Poblano) and MVCC (Naga).
The performance gain is between 10 and 20% for synchronous replication, and a pretty consistent 20% when using buddy replication. A large part of this is MVCC locking, but also other improvements in the code base play a part, including more efficient marshalling.
The total aggregate throughput chart shows that buddy replication (still) scales almost linearly. Increasing cluster size directly benefits the overall throughput the entire cluster can handle.
Benchmark 2: Comparing 2.2.0 "Poblano" and 3.0.0 "Naga" (asynchronous replication)
This had to be a separate benchmark from the synchronous one since the throughput is so much higher than sync replication it made a combined charts unreadable!
JBCACHE-1419 is the main contributor to the phenomenal performance gains in async replication in Naga. For folks who use JBoss Cache with async replication, I strongly encourage you to give Naga a try! :-)
Benchmark 3: Competitive analysis
This is something a lot of people have been asking me for. For this benchmark, I've pitted JBoss Cache 3.0.0.CR2 against EHCache 1.5.0, Terracotta 2.5.0 and a popular commercial, closed-source distributed cache which will have to remain unnamed. I've called this Cache X in my charts. I've used synchronous replication throughout since this is all that Cache X supported.
Both EHCache and Cache X slightly outperforms Naga on a 2-node cluster (not so slightly with EHCache), but these quickly fall behind as the cluster size increases, particularly in the case of EHCache. This is also reflected in the overall throughput chart. Buddy replication still shows linear scalability.
All of these benchmarks are reproducible using the cache benchmark framework, but as with all benchmarks, these should be used as a guideline only. Real performance can only be measured in your environment, with your specific use case and data access patterns.
Enjoy!
Manik