RubyGuides
Ruby Under The Hood: Memory Layout of an Object

If you enjoy seeing how things work under the hood I think you are going to love this post…

…because we are going to explore together how a Ruby object is laid out in memory & how you can manipulate that to do some cool stuff.

So fasten your seatbelts because this is going to be quite a journey into the depths of the Ruby interpreter.

Memory Layout of Arrays

When you create an array, Ruby has to back that up with some system memory & a little bit of metadata (like the array size). Since the main Ruby interpreter (MRI) is written in C there are no objects, but there is something else: structs.

A struct in C helps you store related data together, and this is used a lot in MRI’s source code to represent things like Arrays, Strings & other kinds of objects.

By looking at one of those structs we can infer the memory layout of an object.

So let’s look at the struct for Array, called RArray:

I know this can look a bit intimidating if you are not familiar with C, but don’t worry! I will help you break this down into easy to digest bits of info 🙂

The first thing we have is this RBasic thing, which is also a struct:

This is something that most Ruby objects have & it contains a few things like the class for this object & some binary flags that say if this object is frozen or not (and other things like the ‘tainted’ attribute).

In other words: RBasic contains the generic metadata for the object.
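To see that the class really lives in this header, here is a minimal sketch that reads it back with Fiddle. It assumes MRI and the layout described above (a flags word followed by a klass word, each the size of a pointer). Fiddle.dlwrap and Fiddle.dlunwrap convert between an object and its address; the variable names are only for illustration.

```ruby
require 'fiddle'

word = Fiddle::SIZEOF_VOIDP          # size of one VALUE: 4 bytes on 32-bit, 8 on 64-bit

array   = [1, 2, 3]
address = Fiddle.dlwrap(array)       # where this object's header starts
header  = Fiddle::Pointer.new(address)

# Assumption: the second word of the header is RBasic's klass field.
klass_address = header[word, word].unpack1('J')   # 'J' = pointer-sized unsigned integer
klass = Fiddle.dlunwrap(klass_address)            # turn the address back into an object

puts klass.inspect
```

This only reads memory, so it is a safe way to check the layout on your own Ruby build before trying to change anything.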

After that we have another struct, which contains the length of the array (len).

The union expression is saying that aux can be either capa (for capacity) or shared. This is mostly an optimization thing, which is explained in more detail in this excellent post by Pat Shaughnessy. In terms of memory allocation, the compiler reserves enough space for the largest member of a union, because only one member is in use at any given time.

Then we have ptr, which contains the memory address where the actual Array data is stored.
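The fields described so far can be added up in a few lines of Ruby. This is a simplified model, not something read from MRI itself: it assumes the field order given in the text, that a VALUE is the size of a pointer, and that no alignment padding is needed (true where a C long and a pointer have the same size, as on typical Linux and macOS builds). Newer MRI versions also have an embedded variant for small arrays, which this sketch ignores.

```ruby
require 'fiddle'

value_size = Fiddle::SIZEOF_VOIDP   # VALUE and pointers
long_size  = Fiddle::SIZEOF_LONG    # C long

# Field sizes in the order the text describes them (a simplified model).
fields = [
  ['basic.flags', value_size],
  ['basic.klass', value_size],
  ['len',         long_size],
  ['aux',         [long_size, value_size].max],  # union: as big as its largest member
  ['ptr',         value_size]
]

offset = 0
fields.each do |name, size|
  puts format('%-12s offset %2d, %d bytes', name, offset, size)
  offset += size
end
puts "total: #{offset} bytes"
```

With 4-byte longs and pointers this adds up to 20 bytes (five boxes of 4 bytes each), and with 8-byte longs and pointers it adds up to 40 bytes.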

Here’s a picture of what this looks like (every white/grey box is 4 bytes in a 32-bit system):

[Figure: array memory layout]

You can see the memory size of an object using the ObjectSpace module:
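A minimal sketch of that, using ObjectSpace.memsize_of from the objspace library. The exact numbers depend on your Ruby version and platform, so none are shown here; the variable names are only for illustration.

```ruby
require 'objspace'

empty = []
big   = Array.new(10_000, 0)

# memsize_of reports the bytes used by the object, including the
# separately allocated buffer that holds the elements of a large array.
puts ObjectSpace.memsize_of(empty)
puts ObjectSpace.memsize_of(big)
```

The second number should be much larger than the first, because the 10,000 element slots have to be stored somewhere.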

Now we are ready to have some fun!

A Fun Experiment

RBasic is exactly 8 bytes in a 32-bit system & 16 bytes in a 64-bit system. Knowing this we can use the Fiddle module to access the raw memory bytes for an object & change them for some fun experiments.
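As a quick read-only check, this sketch computes that size from the pointer width and dumps the header bytes of a string. It assumes MRI, where RBasic is made of two pointer-sized words (flags and klass) and Fiddle.dlwrap returns the address where the object starts.

```ruby
require 'fiddle'

rbasic_size = 2 * Fiddle::SIZEOF_VOIDP   # flags + klass, one pointer-sized word each

obj    = 'hello'.dup
header = Fiddle::Pointer.new(Fiddle.dlwrap(obj))
bytes  = header[0, rbasic_size]          # raw copy of the RBasic bytes

puts rbasic_size
puts bytes.unpack1('H*')                 # the same bytes as a hex string
```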

For example:

We can change the frozen status by toggling a single bit. This is in essence what the freeze method does, but notice how there is no unfreeze method.

Let’s implement it just for fun!

First, let’s require the Fiddle module (part of the Ruby Standard Library) & create a frozen string.
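A sketch of this first step. Using String.new instead of a frozen literal is my own choice here: it gives the experiment a private copy, since a frozen literal may be shared with other code.

```ruby
require 'fiddle'

str = String.new('water').freeze

puts str.frozen?

# A frozen string refuses to be modified.
begin
  str << '!'
rescue => e        # FrozenError on current Rubies (a RuntimeError subclass)
  puts e.class
end
```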

Next, we will need the memory address for our string, which can be obtained like this:
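One way to get the address that works on current MRI is Fiddle.dlwrap, sketched below. Tricks that derive the address from object_id stopped working in Ruby 2.7, when object_id was decoupled from the memory address. The variable name memory_address is only for illustration.

```ruby
require 'fiddle'

str = String.new('water').freeze

memory_address = Fiddle.dlwrap(str)   # address of the object as an Integer
puts format('0x%x', memory_address)

# Round trip: turning the address back into an object gives us the same string.
puts Fiddle.dlunwrap(memory_address).equal?(str)
```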

Finally, we flip the exact bit that Ruby checks to see if an object is frozen. We also check to see if this worked by calling the frozen? method.

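Notice that the index [1] refers to the 2nd byte of the flags value (which is 4 bytes in total on a 32-bit system & 8 bytes on a 64-bit system). This assumes a little-endian machine, where the lowest byte of the value comes first in memory.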

Then we use ^= which is the “XOR” (Exclusive OR) operator to flip that bit. We do this because different bits inside flags have different meanings & we don’t want to change something unrelated.
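Here is a hedged sketch of the whole experiment. Which bit holds the frozen status depends on the MRI version and the byte order, so instead of hard-coding a mask this version finds it by comparing the 2nd byte of flags before and after calling freeze. That discovery step is my own substitution. It assumes MRI on a little-endian machine and should only ever be run as an experiment.

```ruby
require 'fiddle'

str    = String.new('water')
header = Fiddle::Pointer.new(Fiddle.dlwrap(str))

before = header[1] & 0xff   # 2nd byte of flags while not frozen
str.freeze
after  = header[1] & 0xff   # same byte once frozen
mask   = before ^ after     # the bit(s) that freeze changed in this byte

# Assumption check: bail out if this build keeps the flag somewhere else.
raise 'freeze flag not found in byte 1 on this Ruby build' if mask.zero?

header[1] ^= mask           # flip it back: our "unfreeze"

puts str.frozen?
str << '!'                  # works again, because Ruby no longer sees the flag
puts str
```

XOR with the mask only touches the bits that freeze itself changed, so the other flags stored in that byte are left alone.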

If you have read my ruby tricks post you may have seen this before, but now you know how it works 🙂

Another thing you can try is to change the length of the array & print the array. You will see how the array becomes shorter! You can even change the class to make an Array think it’s a String.

Conclusion

You have learned a bit about how Ruby works under the hood: how memory for Ruby objects is laid out & how you can use the Fiddle module to play around with that.

You should probably not use Fiddle like this in a real app, but it’s fun to experiment with.

Don’t forget to share this post so more people can see it 🙂