Ruby data object comparison (or why you should never, ever use OpenStruct)

While profiling some code recently, we found a spot where we were using OpenStruct to handle parsed JSON coming back from a web service. OpenStruct is nice because it lets you address the returned object like an instance of a normal Ruby class using dot notation (e.g. my_class.name = "foo"). We had read the following warning in the docs:

This should be a consideration if there is a concern about the performance of the objects that are created, as there is much more overhead in the setting of these properties compared to using a Hash or a Struct.

However, we didn't realize just how bad performance would be. Check out this benchmark comparing Hash, OpenStruct, Struct, and Class:
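A minimal sketch of a benchmark along these lines is shown here. It is not the exact code behind the numbers in this post: the OBJECT_COUNT value, the PersonStruct and PersonClass names, the sample values, and the BUILDERS table are all assumptions made for illustration. Each builder creates one object and sets two properties, which is the operation the docs warn about.

```ruby
require 'benchmark'
require 'ostruct'

# Assumed iteration count; raise it to get more stable timings.
OBJECT_COUNT = 100_000

# Hypothetical data types used only for this comparison.
PersonStruct = Struct.new(:name, :email)

class PersonClass
  attr_accessor :name, :email
end

# One builder per data type. Each creates a fresh object and sets two
# properties. Every builder pays the same lambda-call overhead.
BUILDERS = {
  'hash:' => lambda {
    h = {}
    h[:name] = 'User'
    h[:email] = 'user@example.com'
    h
  },
  'openstruct:' => lambda {
    o = OpenStruct.new
    o.name = 'User'
    o.email = 'user@example.com'
    o
  },
  'struct:' => lambda {
    s = PersonStruct.new
    s.name = 'User'
    s.email = 'user@example.com'
    s
  },
  'class:' => lambda {
    c = PersonClass.new
    c.name = 'User'
    c.email = 'user@example.com'
    c
  }
}

# Prints user/system/total/real timings, one row per data type.
Benchmark.bm(13) do |x|
  BUILDERS.each do |label, build|
    x.report(label) { OBJECT_COUNT.times { build.call } }
  end
end
```

Absolute timings will vary with the Ruby version and the machine, so only the relative ordering is worth comparing.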

OpenStruct uses Ruby's method_missing and define_method to simulate "normal" objects, so if you are using these methods frequently in code, check to see if they are really necessary.
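To illustrate the mechanism, here is a simplified sketch of a class that intercepts unknown setters with method_missing and then defines accessors on the fly. DynamicRecord is a hypothetical name and this is not OpenStruct's actual source. It also assumes define_singleton_method as the per-object counterpart of define_method.

```ruby
# Hypothetical class illustrating the method_missing technique.
class DynamicRecord
  def initialize
    @table = {}
  end

  # Called by Ruby whenever a method is not found on the object.
  def method_missing(name, *args)
    key = name.to_s
    if key.end_with?('=') && args.length == 1
      attr = key.chomp('=').to_sym
      @table[attr] = args.first
      # Define real accessors on this one object so later calls
      # no longer go through method_missing.
      define_singleton_method(attr) { @table[attr] }
      define_singleton_method("#{attr}=") { |value| @table[attr] = value }
      args.first
    elsif @table.key?(name)
      @table[name]
    else
      super # unknown reader: raise the usual NoMethodError
    end
  end

  # Keeps respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    name.to_s.end_with?('=') || @table.key?(name) || super
  end
end

record = DynamicRecord.new
record.name = 'foo' # first call is routed through method_missing
record.name         # now served by the generated singleton method
```

The first assignment pays for a failed method lookup plus two method definitions, which is the kind of per-object setup cost that plain classes and Structs avoid.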

Seeing that classes outperformed hashes by a significant margin, I decided to investigate using Ruby 2.0's new keyword arguments when I needed optional parameters on methods. My assumption was that calling a method and passing in the arguments as a hash would incur some performance penalty as each method invocation would result in a new hash being instantiated.

I was surprised to see that using hashes outperformed Ruby 2.0's native keyword arguments by quite a bit:
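A comparison of this kind can be sketched as follows. This is not the exact benchmark code from this post: the method names, default values, and CALL_COUNT are assumptions. One method takes its optional parameters as a hash and the other uses keyword arguments.

```ruby
require 'benchmark'

# Assumed iteration count; raise it to get more stable timings.
CALL_COUNT = 100_000

# Optional parameters passed as a single hash (pre-2.0 style).
def greet_with_hash(opts = {})
  name = opts[:name] || 'nobody'
  email = opts[:email] || 'none'
  "#{name} <#{email}>"
end

# Optional parameters declared as keyword arguments (Ruby 2.0+).
def greet_with_keywords(name: 'nobody', email: 'none')
  "#{name} <#{email}>"
end

# Prints user/system/total/real timings for each calling style.
Benchmark.bm(10) do |x|
  x.report('hash:') do
    CALL_COUNT.times { greet_with_hash({ name: 'User', email: 'user@example.com' }) }
  end
  x.report('keywords:') do
    CALL_COUNT.times { greet_with_keywords(name: 'User', email: 'user@example.com') }
  end
end
```

Which style wins depends on the Ruby version, so it is worth rerunning this on the interpreter you actually deploy.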

My testing was done with Ruby 2.0.0p353 on a MacBook Retina (2012).

Update - June 2015

Turns out there have been some improvements to the Ruby 2.x codebase that make keyword args perform much better. They now beat hashes by roughly 3x:
# Using Ruby 2.2.1p85 on a MacBook Pro (Retina Early 2015) w/ 3.1 GHz i7
                 user     system      total        real
hash:        7.980000   0.510000   8.490000 (  8.631545)
keywords:    2.940000   0.060000   3.000000 (  3.038953)

And just for reference, here is an updated run of the Hash vs OpenStruct vs Struct vs Class comparison:
                    user     system      total        real
hash:           5.490000   0.540000   6.030000 (  6.396844)
openstruct:   127.070000   1.610000 128.680000 (131.303417)
struct:         3.630000   0.040000   3.670000 (  3.691814)
class:          2.350000   0.010000   2.360000 (  2.415393)

6 responses
Wow thanks for the heads up. I was considering OpenStruct for some small data storage in an app I am looking at but will probably go with Struct now.
As of Ruby 2.2.2p95, the relative speed of keywords now far exceeds Hashes. Using your benchmark code on a Mid-2011 iMac (3.1 GHz Intel i5 CPU, 16 GB RAM) yields:
                 user     system      total        real
hash:        9.650000   0.200000   9.850000 ( 10.429284)
keywords:    3.170000   0.010000   3.180000 (  3.325435)
Hashes now take 3.136x as long as keywords.
@jeff: Thanks! This is actually great news. I've been working in Python and have come to love the optional arguments. My analysis agrees:
                 user     system      total        real
hash:        7.980000   0.510000   8.490000 (  8.631545)
keywords:    2.940000   0.060000   3.000000 (  3.038953)
It is worth noting that most of the overhead is in the ostruct _creation_. Accessing the ostruct is still slower but not excessively so. Running the same test with a single object creation (of each type), but 10 million iterations of COUNT.times { name = .name; email = .email } gives these results:
hash:      1.480000   0.000000   1.480000 (  1.476232)
ostruct:   3.500000   0.000000   3.500000 (  3.500542)
struct:    1.470000   0.000000   1.470000 (  1.466314)
class:     1.460000   0.000000   1.460000 (  1.460085)