I've been working on rkyv, a zero-copy deserialization library, since November of 2020. rkyv is similar to Cap'n Proto and FlatBuffers, but has a handful of different design choices that make it stand out:
#[derive]
But just having design goals isn't good enough; you need results to back them up. With that in mind, I can't stress enough that I am the creator and maintainer of rkyv. However, the last thing I want is to be biased, so I made some benchmarks that will hopefully convince you on their own merits.
There are a couple different benchmarks already available, but in general they fail in a couple different ways:
This leads to highly variable results and can make it difficult to see whether one library is really faster than another
The library may perform completely differently with complex and highly structured data
For most serialization formats, all you can do is serialize and deserialize data. But zero-copy deserialization libraries can access and traverse data without deserializing it first. Knowing how these operations compare with each other is essential to evaluating their relative performance.
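To make the access/deserialize distinction concrete, here is a minimal sketch in plain Rust. The wire format (a 4-byte little-endian length followed by UTF-8 bytes) and the function names are hypothetical and chosen only for illustration; this is not the layout or API of rkyv or any other benchmarked library.

```rust
/// Hypothetical wire format for this sketch only: a 4-byte little-endian
/// length followed by that many bytes of UTF-8.
fn serialize(s: &str) -> Vec<u8> {
    let mut buf = Vec::with_capacity(4 + s.len());
    buf.extend_from_slice(&(s.len() as u32).to_le_bytes());
    buf.extend_from_slice(s.as_bytes());
    buf
}

/// "Access": validate the buffer and hand back a view that borrows from it.
/// The payload is neither copied nor allocated.
fn access(buf: &[u8]) -> Option<&str> {
    let header = buf.get(..4)?;
    let len = u32::from_le_bytes([header[0], header[1], header[2], header[3]]) as usize;
    let end = 4usize.checked_add(len)?;
    let payload = buf.get(4..end)?;
    std::str::from_utf8(payload).ok()
}

/// "Deserialize": build an owned value that can outlive the buffer.
fn deserialize(buf: &[u8]) -> Option<String> {
    access(buf).map(str::to_owned)
}

fn main() {
    let buf = serialize("GET /index.html");
    let view = access(&buf).expect("valid buffer");
    let owned = deserialize(&buf).expect("valid buffer");
    // The view points into the buffer itself; the owned copy does not.
    assert_eq!(view.as_ptr(), buf[4..].as_ptr());
    assert_ne!(owned.as_ptr(), view.as_ptr());
    println!("{} / {}", view, owned);
}
```

The cost of `access` here does not grow with the amount of data copied, because nothing is copied, which is why access and read are worth measuring separately from deserialize.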
rust_serialization_benchmark
With these shortcomings in mind, I set off to make my own benchmarks. The goal was to be thorough and complete, and I think I did a pretty good job.
You can run the benchmarks yourself or look over the raw data from the GitHub repo. I'll summarize the results.
Each library got tested on three different data sets:
- log: a data set of HTTP request logs that are small and contain many strings
- mesh: a single mesh composed of triangles, each of which has three vertices and a normal
- minecraft_savedata: a highly-structured data set modeled after Minecraft player savedata

Each data set is randomly generated from an RNG seeded with the first 20 digits of pi, so the data tested is identical for every run. For each data set, a library was measured for the following:
Additionally, zero-copy deserialization libraries were tested for:
There are a couple footnotes that need explaining:
decode qualified as access, not deserialize, because it yields an immutable reference instead of a mutable object. In order to deserialize this object, a simple Clone would suffice, but I'm not here to write and benchmark my own deserialization code.
These results are directly from the benchmark repo.
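The fixed-seed data generation described earlier can be sketched as follows. This is a dependency-free illustration, not the benchmark repo's code: the SplitMix64 generator, the FNV-1a seed folding, and the `generate` helper are all assumptions made so the example runs with only the standard library. The repo may well use a different generator.

```rust
/// SplitMix64: a small, well-known PRNG, used here only to keep the sketch
/// dependency-free. The benchmark repo may use a different generator.
struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    /// Fold an arbitrary byte seed into the 64-bit state with FNV-1a.
    fn from_seed(seed: &[u8]) -> Self {
        let mut state: u64 = 0xcbf2_9ce4_8422_2325;
        for &b in seed {
            state ^= u64::from(b);
            state = state.wrapping_mul(0x0000_0100_0000_01b3);
        }
        SplitMix64 { state }
    }

    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9e37_79b9_7f4a_7c15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xbf58_476d_1ce4_e5b9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94d0_49bb_1331_11eb);
        z ^ (z >> 31)
    }
}

/// The first 20 digits of pi, as ASCII bytes.
const PI_SEED: &[u8] = b"31415926535897932384";

/// Hypothetical stand-in for a data set generator.
fn generate(count: usize) -> Vec<u64> {
    let mut rng = SplitMix64::from_seed(PI_SEED);
    (0..count).map(|_| rng.next_u64()).collect()
}

fn main() {
    // Two independent runs produce identical data.
    assert_eq!(generate(1_000), generate(1_000));
    println!("runs are reproducible");
}
```

Because every run starts from the same seed, any difference between two benchmark runs comes from the library or the machine, not from the input.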
log
This data set is composed of HTTP request logs that are small and contain many strings.
Raw data:
For operations, time per iteration; for size, bytes. Lower is better.
Format / Lib | Serialize | Access | Read | Update | Deserialize | Size | Size (zlib) |
---|---|---|---|---|---|---|---|
abomonation | 315.13 us | 36.773 us* | 58.999 us* | † | ‡ | 1705800 | 507971 |
bincode | 640.51 us | | | | 4.2787 ms | 1045784 | 374305 |
capnp | 1.8558 ms | 259.95 ns | 711.84 us | § | ‡ | 1843240 | 537966 |
cbor | 1.9698 ms | | | | 8.9702 ms | 1407835 | 407372 |
flatbuffers | 2.6780 ms | 2.9815 ns | 162.95 us | § | ‡ | 1276368 | 469962 |
postcard | 714.70 us | | | | 4.4387 ms | 765778 | 312739 |
prost | 5.4927 ms | | | | 5.1024 ms | 764951 | 269811 |
rkyv | 422.92 us | 1.3616 ns | 18.962 us | 71.321 us | 3.2492 ms | 1065784 | 333895 |
serde_json | 4.4054 ms | | | | 10.148 ms | 1827461 | 474358 |
Comparison:
Relative to best. Higher is better.
Format / Lib | Serialize | Access | Read | Update | Deserialize | Size | Size (zlib) |
---|---|---|---|---|---|---|---|
abomonation | 100.00% | 0.00%* | 32.14%* | † | ‡ | 44.84% | 53.12% |
bincode | 49.20% | | | | 75.94% | 73.15% | 72.08% |
capnp | 16.98% | 0.52% | 2.66% | § | ‡ | 41.50% | 50.15% |
cbor | 16.00% | | | | 36.22% | 54.34% | 66.23% |
flatbuffers | 11.77% | 45.67% | 11.64% | § | ‡ | 59.93% | 57.41% |
postcard | 44.09% | | | | 73.20% | 99.89% | 86.27% |
prost | 5.74% | | | | 63.68% | 100.00% | 100.00% |
rkyv | 74.51% | 100.00% | 100.00% | 100.00% | 100.00% | 71.77% | 80.81% |
serde_json | 7.15% | | | | 32.02% | 41.86% | 56.88% |
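The "relative to best" figures appear to be the best (lowest) raw value in a column divided by each library's value. A minimal sketch of that calculation, using three serialize times from the log table above; the function name `relative_to_best` is made up for this example:

```rust
/// Score each measurement against the best (smallest) one, as a percentage.
fn relative_to_best(values: &[f64]) -> Vec<f64> {
    let best = values.iter().copied().fold(f64::INFINITY, f64::min);
    values.iter().map(|v| best / v * 100.0).collect()
}

fn main() {
    // Serialize times from the log table, in microseconds:
    // abomonation, bincode, rkyv.
    let scores = relative_to_best(&[315.13, 640.51, 422.92]);
    for s in &scores {
        println!("{:.2}%", s); // prints 100.00%, 49.20%, 74.51%
    }
}
```

The same formula applies to the size columns, since lower is better there as well.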
mesh
The data set is a single mesh. The mesh contains an array of triangles, each of which has three vertices and a normal vector.
Raw data:
For operations, time per iteration; for size, bytes. Lower is better.
Format / Lib | Serialize | Access | Read | Update | Deserialize | Size | Size (zlib) |
---|---|---|---|---|---|---|---|
abomonation | 430.61 us | 2.4135 ns* | 177.87 us* | † | ‡ | 6000024 | 5380836 |
bincode | 7.0288 ms | | | | 12.294 ms | 6000008 | 5380823 |
capnp | 15.854 ms | 247.35 ns | 8.9442 ms | § | ‡ | 16000056 | 6780527 |
cbor | 43.109 ms | | | | 70.247 ms | 13122324 | 7527423 |
flatbuffers | 1.9518 ms | 2.9588 ns | 152.39 us | § | ‡ | 6000024 | 5380800 |
postcard | 6.6844 ms | | | | 8.9408 ms | 6000003 | 5380817 |
prost | 34.037 ms | | | | 20.232 ms | 8750000 | 6683814 |
rkyv | 1.1217 ms | 1.4006 ns | 172.20 us | 649.18 us | 1.9594 ms | 6000008 | 4263104 |
serde_json | 105.86 ms | | | | 83.016 ms | 26192883 | 9612105 |
Comparison:
Relative to best. Higher is better.
Format / Lib | Serialize | Access | Read | Update | Deserialize | Size | Size (zlib) |
---|---|---|---|---|---|---|---|
abomonation | 100.00% | 58.03%* | 85.67%* | † | ‡ | 100.00% | 79.23% |
bincode | 6.13% | | | | 15.94% | 100.00% | 79.23% |
capnp | 2.72% | 0.57% | 1.70% | § | ‡ | 37.50% | 62.87% |
cbor | 1.00% | | | | 2.79% | 45.72% | 56.63% |
flatbuffers | 22.06% | 47.34% | 100.00% | § | ‡ | 100.00% | 79.23% |
postcard | 6.44% | | | | 21.92% | 100.00% | 79.23% |
prost | 1.27% | | | | 9.68% | 68.57% | 63.78% |
rkyv | 38.39% | 100.00% | 88.50% | 100.00% | 100.00% | 100.00% | 100.00% |
serde_json | 0.41% | | | | 2.36% | 22.91% | 44.35% |
minecraft_savedata
The data set is composed of Minecraft player saves that contain highly-structured data.
Raw data:
For operations, time per iteration; for size, bytes. Lower is better.
Format / Lib | Serialize | Access | Read | Update | Deserialize | Size | Size (zlib) |
---|---|---|---|---|---|---|---|
abomonation | 368.23 us | 40.823 us* | 41.413 us* | † | ‡ | 1290592 | 393696 |
bincode | 806.73 us | | | | 3.4132 ms | 569975 | 240897 |
capnp | 863.41 us | 256.55 ns | 5.3431 us | § | ‡ | 835784 | 342099 |
cbor | 2.4356 ms | | | | 8.8797 ms | 1109821 | 347562 |
flatbuffers | 38.683 ms | 2.9212 ns | 3.9676 us | § | ‡ | 849472 | 349208 |
postcard | 774.37 us | | | | 3.7533 ms | 356311 | 213270 |
prost | 5.8678 ms | | | | 5.4083 ms | 596811 | 306728 |
rkyv | 843.80 us | 1.3837 ns | 282.88 ns | 6.5422 us | 2.4810 ms | 725176 | 334238 |
serde_json | 4.3501 ms | | | | 10.699 ms | 1623197 | 472162 |
Comparison:
Relative to best. Higher is better.
Format / Lib | Serialize | Access | Read | Update | Deserialize | Size | Size (zlib) |
---|---|---|---|---|---|---|---|
abomonation | 100.00% | 0.00%* | 0.68%* | † | ‡ | 27.61% | 54.17% |
bincode | 45.64% | | | | 72.69% | 62.51% | 88.53% |
capnp | 42.65% | 0.54% | 5.29% | § | ‡ | 42.63% | 62.34% |
cbor | 15.12% | | | | 27.94% | 32.11% | 61.36% |
flatbuffers | 0.95% | 47.37% | 7.13% | § | ‡ | 41.94% | 61.07% |
postcard | 47.55% | | | | 66.10% | 100.00% | 100.00% |
prost | 6.28% | | | | 45.87% | 59.70% | 69.53% |
rkyv | 43.64% | 100.00% | 100.00% | 100.00% | 100.00% | 49.13% | 63.81% |
serde_json | 8.46% | | | | 23.19% | 21.95% | 45.17% |
* abomonation requires a mutable backing to access data
† abomonation does not support buffer mutation
‡ do not provide deserialization capabilities, but the user can write their own
§ supports buffer mutation, but not in the Rust implementation
Unsurprisingly, serde_json and CBOR had very similar performance because they're almost the same format. CBOR did a bit better than serde_json in every benchmark, but these two consistently trailed behind all the other frameworks (in some cases, very considerably behind).
Prost was the chosen representative for protobuf-style serialization. Its performance was average-to-lackluster on every benchmark, with the exception of the log size benchmark. It beat out postcard, which consistently performed extremely well in the size/zlib categories. This shows just how much the format was optimized for stringy data and minimizing wire size.
Despite being completely different libraries, bincode and postcard had very similar benchmark results. Serialize and deserialize speed were very close for both of them, and the main difference between the two was usually the final size. Postcard consistently beat bincode on size and zlib. I suspect that they are using very similar techniques, but that postcard has a few more tricks up its sleeve that don't cost much to perform but give it a sizeable advantage.
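One plausible candidate for such a trick is variable-length integer encoding. The mesh sizes are consistent with this reading: 6000003 and 6000008 bytes would be a 3-byte and an 8-byte length prefix on the same 6,000,000-byte payload. This is an inference from the numbers, not something the benchmarks confirm. The sketch below assumes 48 bytes per triangle (four 3-component `f32` vectors), giving 125,000 triangles, and compares a LEB128-style varint against a fixed 8-byte integer. The function names are hypothetical and not taken from either library.

```rust
/// Encode an unsigned integer as a LEB128-style varint: 7 payload bits per
/// byte, with the high bit set on every byte except the last.
fn encode_varint(mut value: u64) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return out;
        }
        out.push(byte | 0x80);
    }
}

/// Fixed-width alternative: always 8 bytes, regardless of the value.
fn encode_fixed(value: u64) -> [u8; 8] {
    value.to_le_bytes()
}

fn main() {
    // Assumed triangle count: 6,000,000 payload bytes / 48 bytes per triangle.
    let triangles: u64 = 125_000;
    println!(
        "varint: {} bytes, fixed: {} bytes",
        encode_varint(triangles).len(),
        encode_fixed(triangles).len()
    ); // prints "varint: 3 bytes, fixed: 8 bytes"
}
```

On the mesh data the saving is negligible, but on data sets with many short strings and small collections, every length prefix shrinks, which would fit the larger gaps seen in the log and minecraft_savedata sizes.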
Cap'n Proto had a good showing, and it proved its worth as a replacement for protobuf. Compared to prost, it was faster to serialize, and supported comparatively fast zero-copy deserialization. These two features are absolutely killer. Unfortunately, it didn't stack up nearly as well against the other zero-copy frameworks. It consistently had disappointing access and read times compared to its competitors, and failed pretty miserably on the mesh size benchmarks. This makes sense as it wasn't built to handle large amounts of raw data, but it was disappointing to see so much wasted space compared to FlatBuffers.
FlatBuffers is the comparison point for zero-copy deserialization. It's got a lot of usage, was built specifically for performance, and proves out the zero-copy concept. It did well in all categories on most of the tests, but had a major stumbling block. In the minecraft_savedata test, its serialization performance was by far the worst, even worse than serde_json (which had to write twice as much data!). This highlights a major weakness of FlatBuffers: its very poor serialization performance on highly-structured data. It's possible (even probable) that this bench could be written better than I wrote it, but the gap is large enough that I wouldn't recommend its use for general-purpose data.
Abomonation was definitely a bright spot in the benchmarks. It proved out its insanely fast serialization on every bench, and didn't suffer from some of the size traps that its competitors fell into. It would be an easy library to recommend if it didn't come with so many caveats. It's very unsafe, non-portable, requires mutable backing to access its data, and doesn't support mutations. Nonetheless, abomonation was a really impressive contender in every benchmark.
I went into these benchmarks not knowing how rkyv would perform relative to its peers, but confident that it would make a good showing. It ended up doing much better than I expected. It won nearly every performance category, and was highly competitive with the winner when it didn't. It also did so without compromising on size, where it was also highly competitive. Finally, it showed exceptional scalability, performing equally well on all different kinds of data where its zero-copy competitors all had shortcomings on one or more of the data sets. Unlike abomonation, it's also a safe, highly-portable format that doesn't need mutable backing and has more feature support than other competitors.
I welcome and encourage anyone to run the benchmarks for themselves and open pull requests to improve or clean up whatever they want. I am confident in the validity of these results, and will happily update the tables as changes are made. I will update my analyses if there are any major changes.
My hope is that this article not only convinced you that rkyv is one of the best-performing serializers available, but that it also helped you understand the relationships between the different serialization solutions available in Rust today.
If you're interested in rkyv, I encourage you to contribute to the request for feedback for planning its future.
Thanks to burntsushi for the article title inspiration