rkyv is faster than {bincode, capnp, cbor, flatbuffers, postcard, prost, serde_json}
And I have the benchmarks to prove it
Mar 11, 2021 – 10 min read
Rust, Rkyv, Serialization

I've been working on rkyv, a zero-copy deserialization library, since November of 2020. rkyv is similar to Cap'n Proto and FlatBuffers, but has a handful of different design choices that make it stand out:

But design goals alone aren't enough; you need results to back them up. With that in mind, I should disclose up front that I am the creator and maintainer of rkyv. The last thing I want is to be biased, so I put together some benchmarks that I hope will convince you on their own merits.

Benchmarks

A few benchmarks are already available, but in general they fall short in a few ways:

They test with too little data

This leads to highly variable results and can make it difficult to see whether one library is really faster than another

They test only with simple data

The library may perform completely differently with complex and highly structured data

They test only serialization and deserialization

For most serialization formats, all you can do is serialize and deserialize data. But zero-copy deserialization libraries can access and traverse data without deserializing it first. Knowing how these operations compare with each other is essential to evaluating their relative performance.
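To make the difference between accessing and deserializing concrete, here is a minimal sketch using only the standard library. The 8-byte record layout and the names `Record`, `serialize`, `deserialize`, and `read_score` are assumptions made for illustration; this is not how rkyv or any of the benchmarked libraries lay out data.

```rust
/// Hypothetical fixed layout: a little-endian u32 id followed by a
/// little-endian u32 score, 8 bytes per record.
const RECORD_SIZE: usize = 8;

/// An owned record, as a conventional deserializer would produce.
#[derive(Debug, PartialEq)]
struct Record {
    id: u32,
    score: u32,
}

/// Serialize records into one flat byte buffer.
fn serialize(records: &[Record]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(records.len() * RECORD_SIZE);
    for r in records {
        buf.extend_from_slice(&r.id.to_le_bytes());
        buf.extend_from_slice(&r.score.to_le_bytes());
    }
    buf
}

/// Deserialize: walk the whole buffer and build owned values.
fn deserialize(buf: &[u8]) -> Vec<Record> {
    buf.chunks_exact(RECORD_SIZE)
        .map(|c| Record {
            id: u32::from_le_bytes([c[0], c[1], c[2], c[3]]),
            score: u32::from_le_bytes([c[4], c[5], c[6], c[7]]),
        })
        .collect()
}

/// Access and read: compute an offset and read one field in place,
/// without allocating or touching any other record.
fn read_score(buf: &[u8], index: usize) -> Option<u32> {
    let start = index.checked_mul(RECORD_SIZE)?.checked_add(4)?;
    let b = buf.get(start..start.checked_add(4)?)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

fn main() {
    let records: Vec<Record> = (0..1000u32)
        .map(|i| Record { id: i, score: i * 2 })
        .collect();
    let buf = serialize(&records);

    // Zero-copy style: one bounds check and a 4-byte read.
    println!("score via access: {:?}", read_score(&buf, 500));

    // Conventional style: rebuild all 1000 records to look at one.
    let owned = deserialize(&buf);
    println!("record via deserialize: id={} score={}", owned[500].id, owned[500].score);
}
```

A benchmark that only times `serialize` and `deserialize` never exercises the `read_score` path, which is the path a zero-copy format is built around.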

rust_serialization_benchmark

With these shortcomings in mind, I set off to make my own benchmarks. The goal was to be thorough and complete, and I think I did a pretty good job.

You can run the benchmarks yourself or look over the raw data in the GitHub repo. I'll summarize the results here.

Rules

Each library got tested on three different data sets:

Each data set is randomly generated from an RNG seeded with the first 20 digits of pi, so the data tested is identical for every run. For each data set, a library was measured for the following:

Additionally, zero-copy deserialization libraries were tested for:

There are a few footnotes that need explaining:

  1. Abomonation requires a mutable backing buffer to access and read serialized data. This means that it's not viable for some use cases.
  2. Abomonation does not support buffer mutation, so this wasn't tested.
  3. While FlatBuffers and Cap'n Proto support buffer mutation in their main (usually C++) libraries, the Rust counterparts do not, so they couldn't be tested for this.
  4. None of the other zero-copy deserialization frameworks provide deserialization capabilities by default. Writing and benchmarking my own deserialization code for them would be somewhat meaningless. You can get an idea of what sort of deserialization performance you'd get by looking at the read benchmark.
  5. Abomonation's decode counts as access rather than deserialize because it yields an immutable reference instead of a mutable object. A simple Clone would suffice to deserialize this object, but I'm not here to write and benchmark my own deserialization code.
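The fixed-seed rule is what makes runs comparable: the same seed always yields the same data. The sketch below shows the idea with a hand-written SplitMix64 generator so that it needs no external crates. The generator choice, the names `SplitMix64` and `generate`, and the `SEED` constant are all assumptions for illustration; the constant holds only the first 19 digits of pi because 20 digits overflow a `u64`, and how the benchmark itself packs its seed is not covered here.

```rust
/// Illustrative seed: the first 19 digits of pi (20 would overflow u64).
const SEED: u64 = 3_141_592_653_589_793_238;

/// SplitMix64, a small deterministic pseudo-random generator.
struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    fn new(seed: u64) -> Self {
        Self { state: seed }
    }

    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Generate `n` values from a fresh generator with the given seed.
fn generate(seed: u64, n: usize) -> Vec<u64> {
    let mut rng = SplitMix64::new(seed);
    (0..n).map(|_| rng.next_u64()).collect()
}

fn main() {
    // Two independent runs with the same seed produce identical data.
    let first = generate(SEED, 8);
    let second = generate(SEED, 8);
    assert_eq!(first, second);
    println!("both runs start with {:#018x}", first[0]);
}
```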

Results

These results are directly from the benchmark repo.

log

This data set is composed of HTTP request logs that are small and contain many strings.

Raw data:

For operations, time per iteration; for size, bytes. Lower is better.

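| Format / Lib | Serialize | Access | Read | Update | Deserialize | Size | Size (zlib) |
|---|---|---|---|---|---|---|---|
| abomonation | 315.13 us | 36.773 us* | 58.999 us* | † | ‡ | 1705800 | 507971 |
| bincode | 640.51 us | | | | 4.2787 ms | 1045784 | 374305 |
| capnp | 1.8558 ms | 259.95 ns | 711.84 us | § | ‡ | 1843240 | 537966 |
| cbor | 1.9698 ms | | | | 8.9702 ms | 1407835 | 407372 |
| flatbuffers | 2.6780 ms | 2.9815 ns | 162.95 us | § | ‡ | 1276368 | 469962 |
| postcard | 714.70 us | | | | 4.4387 ms | 765778 | 312739 |
| prost | 5.4927 ms | | | | 5.1024 ms | 764951 | 269811 |
| rkyv | 422.92 us | 1.3616 ns | 18.962 us | 71.321 us | 3.2492 ms | 1065784 | 333895 |
| serde_json | 4.4054 ms | | | | 10.148 ms | 1827461 | 474358 |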

Comparison:

Relative to best. Higher is better.

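| Format / Lib | Serialize | Access | Read | Update | Deserialize | Size | Size (zlib) |
|---|---|---|---|---|---|---|---|
| abomonation | 100.00% | 0.00%* | 32.14%* | † | ‡ | 44.84% | 53.12% |
| bincode | 49.20% | | | | 75.94% | 73.15% | 72.08% |
| capnp | 16.98% | 0.52% | 2.66% | § | ‡ | 41.50% | 50.15% |
| cbor | 16.00% | | | | 36.22% | 54.34% | 66.23% |
| flatbuffers | 11.77% | 45.67% | 11.64% | § | ‡ | 59.93% | 57.41% |
| postcard | 44.09% | | | | 73.20% | 99.89% | 86.27% |
| prost | 5.74% | | | | 63.68% | 100.00% | 100.00% |
| rkyv | 74.51% | 100.00% | 100.00% | 100.00% | 100.00% | 71.77% | 80.81% |
| serde_json | 7.15% | | | | 32.02% | 41.86% | 56.88% |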
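The comparison figures can be reproduced from the raw data. Since lower raw values are better, "relative to best" works out to the best (lowest) value in a column divided by each library's value. The sketch below applies that to four of the log sizes from the raw table; `relative_to_best` is an illustrative helper name, not something taken from the benchmark repo.

```rust
/// Score relative to the best (lowest) value in a column, as a percentage.
fn relative_to_best(value: f64, best: f64) -> f64 {
    best / value * 100.0
}

fn main() {
    // Size column (bytes) for a few libraries, copied from the raw log data.
    let sizes = [
        ("bincode", 1_045_784.0),
        ("postcard", 765_778.0),
        ("prost", 764_951.0),
        ("rkyv", 1_065_784.0),
    ];

    // The best size is the smallest one.
    let best = sizes
        .iter()
        .map(|&(_, size)| size)
        .fold(f64::INFINITY, f64::min);

    for &(name, size) in sizes.iter() {
        println!("{}: {:.2}%", name, relative_to_best(size, best));
    }
}
```

For bincode this gives 764951 / 1045784, or about 73.15%, which matches its size entry in the comparison.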

mesh

The data set is a single mesh. The mesh contains an array of triangles, each of which has three vertices and a normal vector.

Raw data:

For operations, time per iteration; for size, bytes. Lower is better.

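| Format / Lib | Serialize | Access | Read | Update | Deserialize | Size | Size (zlib) |
|---|---|---|---|---|---|---|---|
| abomonation | 430.61 us | 2.4135 ns* | 177.87 us* | † | ‡ | 6000024 | 5380836 |
| bincode | 7.0288 ms | | | | 12.294 ms | 6000008 | 5380823 |
| capnp | 15.854 ms | 247.35 ns | 8.9442 ms | § | ‡ | 16000056 | 6780527 |
| cbor | 43.109 ms | | | | 70.247 ms | 13122324 | 7527423 |
| flatbuffers | 1.9518 ms | 2.9588 ns | 152.39 us | § | ‡ | 6000024 | 5380800 |
| postcard | 6.6844 ms | | | | 8.9408 ms | 6000003 | 5380817 |
| prost | 34.037 ms | | | | 20.232 ms | 8750000 | 6683814 |
| rkyv | 1.1217 ms | 1.4006 ns | 172.20 us | 649.18 us | 1.9594 ms | 6000008 | 4263104 |
| serde_json | 105.86 ms | | | | 83.016 ms | 26192883 | 9612105 |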

Comparison:

Relative to best. Higher is better.

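| Format / Lib | Serialize | Access | Read | Update | Deserialize | Size | Size (zlib) |
|---|---|---|---|---|---|---|---|
| abomonation | 100.00% | 58.03%* | 85.67%* | † | ‡ | 100.00% | 79.23% |
| bincode | 6.13% | | | | 15.94% | 100.00% | 79.23% |
| capnp | 2.72% | 0.57% | 1.70% | § | ‡ | 37.50% | 62.87% |
| cbor | 1.00% | | | | 2.79% | 45.72% | 56.63% |
| flatbuffers | 22.06% | 47.34% | 100.00% | § | ‡ | 100.00% | 79.23% |
| postcard | 6.44% | | | | 21.92% | 100.00% | 79.23% |
| prost | 1.27% | | | | 9.68% | 68.57% | 63.78% |
| rkyv | 38.39% | 100.00% | 88.50% | 100.00% | 100.00% | 100.00% | 100.00% |
| serde_json | 0.41% | | | | 2.36% | 22.91% | 44.35% |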
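The mesh sizes can be sanity-checked with a little arithmetic. The sketch below assumes that every vertex and normal is three `f32` values and that bincode writes an 8-byte length prefix ahead of the triangle array; neither detail is stated in this article, and `Vector3`, `Triangle`, and `triangle_count` are illustrative names. Under those assumptions, bincode's 6000008 bytes would correspond to 125,000 triangles of 48 bytes each, and dividing the other sizes by that count gives a rough per-triangle cost for each format.

```rust
use std::mem::size_of;

/// Assumed vector layout: three 32-bit floats (12 bytes).
#[allow(dead_code)]
#[repr(C)]
struct Vector3 {
    x: f32,
    y: f32,
    z: f32,
}

/// A triangle as described in the text: three vertices and a normal.
#[allow(dead_code)]
#[repr(C)]
struct Triangle {
    v0: Vector3,
    v1: Vector3,
    v2: Vector3,
    normal: Vector3,
}

/// Number of triangles implied by a serialized size, if the payload
/// (total minus prefix) is an exact multiple of the triangle size.
fn triangle_count(total_bytes: usize, prefix_bytes: usize) -> Option<usize> {
    let payload = total_bytes.checked_sub(prefix_bytes)?;
    let per_triangle = size_of::<Triangle>();
    if payload % per_triangle == 0 {
        Some(payload / per_triangle)
    } else {
        None
    }
}

fn main() {
    // bincode size from the raw mesh data, with an assumed 8-byte prefix.
    let count = triangle_count(6_000_008, 8).expect("payload is not a whole number of triangles");
    println!("bytes per triangle in memory: {}", size_of::<Triangle>());
    println!("implied triangle count: {}", count);

    // Rough per-triangle cost of a few other formats from the same data.
    let sizes = [("prost", 8_750_000usize), ("capnp", 16_000_056), ("serde_json", 26_192_883)];
    for &(name, size) in sizes.iter() {
        println!("{}: {:.2} bytes per triangle", name, size as f64 / count as f64);
    }
}
```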

minecraft_savedata

The data set is composed of Minecraft player saves that contain highly-structured data.

Raw data:

For operations, time per iteration; for size, bytes. Lower is better.

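| Format / Lib | Serialize | Access | Read | Update | Deserialize | Size | Size (zlib) |
|---|---|---|---|---|---|---|---|
| abomonation | 368.23 us | 40.823 us* | 41.413 us* | † | ‡ | 1290592 | 393696 |
| bincode | 806.73 us | | | | 3.4132 ms | 569975 | 240897 |
| capnp | 863.41 us | 256.55 ns | 5.3431 us | § | ‡ | 835784 | 342099 |
| cbor | 2.4356 ms | | | | 8.8797 ms | 1109821 | 347562 |
| flatbuffers | 38.683 ms | 2.9212 ns | 3.9676 us | § | ‡ | 849472 | 349208 |
| postcard | 774.37 us | | | | 3.7533 ms | 356311 | 213270 |
| prost | 5.8678 ms | | | | 5.4083 ms | 596811 | 306728 |
| rkyv | 843.80 us | 1.3837 ns | 282.88 ns | 6.5422 us | 2.4810 ms | 725176 | 334238 |
| serde_json | 4.3501 ms | | | | 10.699 ms | 1623197 | 472162 |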

Comparison:

Relative to best. Higher is better.

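| Format / Lib | Serialize | Access | Read | Update | Deserialize | Size | Size (zlib) |
|---|---|---|---|---|---|---|---|
| abomonation | 100.00% | 0.00%* | 0.68%* | † | ‡ | 27.61% | 54.17% |
| bincode | 45.64% | | | | 72.69% | 62.51% | 88.53% |
| capnp | 42.65% | 0.54% | 5.29% | § | ‡ | 42.63% | 62.34% |
| cbor | 15.12% | | | | 27.94% | 32.11% | 61.36% |
| flatbuffers | 0.95% | 47.37% | 7.13% | § | ‡ | 41.94% | 61.07% |
| postcard | 47.55% | | | | 66.10% | 100.00% | 100.00% |
| prost | 6.28% | | | | 45.87% | 59.70% | 69.53% |
| rkyv | 43.64% | 100.00% | 100.00% | 100.00% | 100.00% | 49.13% | 63.81% |
| serde_json | 8.46% | | | | 23.19% | 21.95% | 45.17% |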

Footnotes

* abomonation requires a mutable backing to access data

† abomonation does not support buffer mutation

‡ do not provide deserialization capabilities, but the user can write their own

§ supports buffer mutation, but not in the Rust implementation

Analysis

CBOR / serde_json

Unsurprisingly, these two had very similar performance because they're almost the same format. CBOR did a bit better than serde_json in every benchmark, but both consistently trailed all the other frameworks, in some cases by a wide margin.

Prost

Prost was the chosen representative for protobuf-style serialization. Its performance was average-to-lackluster on every benchmark, with the exception of the log size benchmark. There it beat out postcard, which otherwise performed extremely well in the size and zlib categories. This shows just how much the format was optimized for stringy data and minimizing wire size.

bincode / postcard

Despite being completely different libraries, bincode and postcard had very similar benchmark results. Serialize and deserialize speed were very close for both of them, and the main difference between the two was usually the final size. Postcard consistently beat bincode on size and zlib. I suspect that they are using very similar techniques, but that postcard has a few more tricks up its sleeve that don't cost much to perform but give it a sizeable advantage.
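One trick that compact formats commonly use to shrink output at little cost is variable-length integer encoding, where small values take fewer bytes than their fixed-width form. The sketch below is a plain LEB128-style encoder written against the standard library only. It is offered as an example of the kind of technique that could explain such a gap, not as a description of what postcard or bincode actually do in these benchmarks; `encode_varint` and `decode_varint` are illustrative names.

```rust
/// Append `value` to `out` using 7 bits per byte, low bits first.
/// The high bit of each byte signals that more bytes follow.
fn encode_varint(mut value: u64, out: &mut Vec<u8>) {
    loop {
        let byte = (value & 0x7F) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            break;
        }
        out.push(byte | 0x80);
    }
}

/// Decode one varint from the front of `bytes`.
/// Returns the value and the number of bytes consumed.
fn decode_varint(bytes: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &b) in bytes.iter().enumerate() {
        if i >= 10 {
            return None; // a u64 never needs more than 10 bytes
        }
        value |= u64::from(b & 0x7F) << (7 * i);
        if b & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None // input ended in the middle of a value
}

fn main() {
    // A small length or counter: 8 bytes fixed-width, 2 bytes as a varint.
    let mut compact = Vec::new();
    encode_varint(300, &mut compact);
    let fixed = 300u64.to_le_bytes();
    println!("fixed: {} bytes, varint: {} bytes", fixed.len(), compact.len());
    println!("decoded: {:?}", decode_varint(&compact));
}
```

Data with many short strings and small integers benefits most from this kind of encoding, since every length and counter shrinks.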

Cap'n Proto

Cap'n Proto had a good showing, and it proved its worth as a replacement for protobuf. Compared to prost, it was faster to serialize and supported comparatively fast zero-copy deserialization. These two features are absolutely killer. Unfortunately, it didn't stack up nearly as well against the other zero-copy frameworks. It consistently had disappointing access and read times compared to its competitors, and failed pretty miserably on the mesh size benchmarks. This makes sense as it wasn't built to handle large amounts of raw data, but it was disappointing to see so much wasted space compared to FlatBuffers.

FlatBuffers

FlatBuffers is the comparison point for zero-copy deserialization. It's got a lot of usage, was built specifically for performance, and proves out the zero-copy concept. It did well in all categories on most of the tests, but had a major stumbling block. In the minecraft_savedata test, its serialization performance was by far the worst, even worse than serde_json (which had to write twice as much data!). This highlights a major weakness of FlatBuffers: its very poor serialization performance on highly-structured data. It's possible (even probable) that my FlatBuffers bench could be written better, but the result is enough that I wouldn't recommend its use for general-purpose data.

Abomonation

Abomonation was definitely a bright spot in the benchmarks. It proved out its insanely fast serialization on every bench, and didn't suffer from some of the size traps that its competitors fell into. It would be an easy library to recommend if it didn't come with so many caveats. It's very unsafe, non-portable, requires mutable backing to access its data, and doesn't support mutations. Nonetheless, abomonation was a really impressive contender in every benchmark.

rkyv

I went into these benchmarks not knowing how rkyv would perform relative to its peers, but confident that it would make a good showing. It ended up doing much better than I expected. It won nearly every performance category, and was highly competitive with the winner when it didn't. It also did so without compromising on size, where it was also highly competitive. Finally, it showed exceptional scalability, performing equally well on all different kinds of data where its zero-copy competitors all had shortcomings on one or more of the data sets. Unlike abomonation, it's also a safe, highly-portable format that doesn't need mutable backing and has more feature support than other competitors.

Conclusion

I welcome and encourage anyone to run the benchmarks for themselves and open pull requests to improve or clean up whatever they want. I am confident in the validity of these results, and will happily update the tables as changes are made. I will update my analyses if there are any major changes.

My hope is that this article has not only convinced you that rkyv is one of the best-performing serializers available, but also helped you understand the relationships between the different serialization solutions available in Rust today.

If you're interested in rkyv, I encourage you to contribute to the request for feedback for planning its future.

Thanks to burntsushi for the article title inspiration