Black Bytes
Share this post!

All posts by Jesus Castello

ruby magic

There Is No Magic in Ruby

You may have heard (or even said yourself) that Rails / Ruby has too much magic…

…but where does that idea come from & what can you do to dispel that magic?

When something feels like magic it’s because there is something you don’t know, there is some information missing. It’s just like actual magic tricks, if you know the trick there is no magic.

Magic wand and hat

In software development, knowing the tricks is equivalent to having an understanding of how things really work.

Let’s See An Example!

This example is about BCrypt, a hashing algorithm used to store passwords securely. It is implemented in Ruby via the bcrypt gem.

Here’s how to use BCrypt to hash the word “testing”:

Notice how you get a different hash every time, but if you use something like MD5 (don’t!) or SHA1 you will always get the same output for a given string.

Why is BCrypt behaving in a different way?

I will explain what’s going on in a second, but first let’s look at how you compare two bcrypt hashes, one coming from the database & one from user input (like a form or something like that).

The part on the left (BCrypt::Password.new) is a BCrypt object, which takes the hash stored in the database as a parameter.

The part on the right (params[:password]) is just the plain-text password that the user is trying to log in with.

In this case we assume that a right user / password combination is being used.

So why is this evaluating to true?

Well, to understand this you need to know two things:

  1. BCrypt uses something called a “salt”, which is a random value used to increase security against pre-computed hashes. The salt is stored in the hash itself.
  2. In Ruby many things that look like syntax are just methods. This is the case for the double equals operator (==).

Knowing that, then I can tell you that BCrypt defines its own == method, which knows how to extract that “salt” value so that it can take that into account when comparing the passwords.

In other words, BCrypt#== takes the “salt” value from the stored hash, then it hashes the plain-text password (the user input) using this salt so that both hashes will be identical if the password is valid.

If you were to look at the source code it would look something like this:

Remember that super will call the same method (in this case ==) on the parent class. The parent class of BCrypt::Password is String.

Its All About Methods

One important thing to understand is that other than a few keywords & a few syntax elements (like parenthesis), its all about classes & methods.

So if you know what class you are working with (which you can check using the class method) you will always be able to know what operations (methods) are available.

But sometimes we just have a method call without an object, this often means the method is defined in the current class, but that’s not always the case.

Example:

Where is puts defined? Let’s find out:

Another thing that may be confusing is metaprogramming, because with metaprogramming you can create, modify or even remove methods during program execution. These things make the code more opaque & obscure.

You can reduce the confusion by being aware of what metaprogramming methods are available & how they work.

Summary

You learned how a deep understanding of how things work can help you dispel any “magic”, write better code & make you a better developer.

If you want to improve your Ruby skills check out my book, Ruby Deep Dive, which is designed to bridge the gap between the fundamentals & more advanced concepts 🙂

ruby graph theory

Practical Graph Theory in Ruby

This is the next installment in the “Practical Computer Science” series, where you will learn how to apply classic computer science concepts to solve real problems using Ruby.

Today we are going to talk about Graph Theory.

You may have heard about binary trees, they look like this:

binary-tree

The thing is that a binary tree is just a specialized version of a graph, so that should give you an idea of how widespread graphs are.

Let’s start with an overview of graph theory fundamentals, then we are going to see what are some practical uses & how to implement this in Ruby!

Graph Fundamentals

A graph is composed of two elements:

  • Nodes (or vertices)
  • Edges

One node represents one element in the graph, like a city or a street, in a graph representing a map. While the edges represent the connections between the nodes.

If you look at a computer science or math book you will see a graph defined by this formula: G(V, E).

Where G means Graph, V is the set of vertices & E is the set of edges.

Graphs can be directed or undirected. Meaning that you can only go one direction (directed graph) or both directions (undirected graph).

The most popular type of graph is the Directed Acyclic Graph (DAG). Acyclic means that there are no loops, there is no way to backtrack.

Uses for Graphs

Now that we have seen an overview of the fundamentals, let’s see some common uses for graphs.

Using graphs you can do things like:

  • Find the shortest (or longest) path between two locations
  • Check if two things are related to each other
  • Build a recommendation engine
  • Analyze dependencies

Another example includes finding the best route to a destination (think GPS devices).

How to Implement & Use Graphs

You could write your own graph implementation, but for this article we are going to stick to the RGL gem which already implements one for us.

To create a basic graph using RGL:

This code produces the following graph:

graph-1234

You can get a graphical representation of your graph like this:

Then copy the output of that method on a site that can process the dot language. Like this one.

Alternatively, you can install Graphviz on your machine to produce the image locally.

Now that we have a graph, we may want to traverse it to find out information about it.

There are two basic algorithms for searching your graph:

  • Breadth-First Search (BFS)
  • Depth-First Search (DFS)

In BFS you get the closest nodes first & in DFS you go as deep as possible for every node. These algorithms can be implemented using the stack data structure.

The RGL gem already implements those algorithms for you:

Look at the graph again & follow the path these algorithms did using just your eyes (or you can use a finger too if you want). That will help you get a sense of what’s going on.

Weighted Graphs

We can add more information to a graph in the form of weights to make it more useful.

Weights are given to edges, which are the paths between two nodes (also known as “vertices”). These weights represent the cost of going from one point to another.

For example, if we have a map of a country in the form of a graph & we want to reach a certain destination in the shortest time possible, the weights would represent the distance between two cities.

usa-graph

Or if we have a computer network, the weights may represent how many hops it takes to reach a certain network.

“In computer networking, a hop is one portion of the path between source and destination. Data packets pass through bridges, routers and gateways as they travel between source and destination. Each time packets are passed to the next network device, a hop occurs.” – Wikipedia

Here’s a code example for a weighted graph:

We can now search for the shortest path from one point to another. And that’s exactly the topic of the next section!

Finding The Shortest Path

A popular algorithm for finding the shortest path inside a graph is “Dijkstra’s Shortest Path” algorithm.

Given a weighted graph, we can use Dijkstra’s algorithm to solve this question:

“What’s the fastest way to get from point A to point B?”

Here is a code example, using the RGL gem:

This tells us the shortest path from New York to Houston, using the information available in the graph.

Summary

You have learned what a graph data structure is & how to use it with the RGL gem.

You have also learned about common algorithms for working with graphs, like DFS, BFS & Dijkstra’s.

Don’t forget to share this post if you found it useful so more people can enjoy it 🙂

ruby stringio

An Object That Behaves Like a File?

If you are looking for an object that behaves like an IO object (files, sockets, etc), but that you can control like a string, then StringIO is for you.

Let me show you some examples & things to watch out for!

Basic Examples

To create a StringIO object you can do this:

Then you can read from this object using methods like gets, read & each_line.

I made a handy table for you with the most useful methods:

Method Description
gets Read one line of input
read Read a specific amount of bytes (all by default)
each_line Given a block, iterate over each line
each_char Given a block, iterate over each character
<< Append data
rewind Reset the internal position pointer

Notice that StringIO has a position pointer. This pointer keeps track of how many bytes you have read, just like a file object does.

So every time you call a method like gets or read it will give you that amount of data & advance the pointer.

Even Enumerable methods like map or each_line will advance the position pointer, so keep that in mind.

You can reset the position pointer to the beginning using the rewind method:

That’s it for the basics. I’m going to show you some practical uses for StringIO, but first let me show you something else.

What About StringScanner?

Now you have seen what StringIO can do, but the Ruby Standard Library includes another string-related class.

That class is StringScanner.

It can be confusing since they have similar names & methods, but let me help you see the difference.

The main thing is this:

A StringIO object can replace another IO object (like a File or a Socket), but StringScanner is meant for doing things like parsing (making sense of some text by breaking it into a set of tokens).

These two classes still share something, besides having “String” in their name, they both use an internal position pointer.

Replacing Standard Input & Output

Let’s say you are writing a command-line application that ask user for output using the Kernel#gets method…

…if you want to test this code you are going to have to input something by hand every time.

So is automated testing out of the picture?

No, it isn’t!

This is where StringIO comes to the rescue. You can initialize a StringIO object with your test input & then replace the standard input object pointed by $stdin (this where Ruby looks for user input when you call gets).

This technique can also be used to capture output from methods like puts.

Display methods print to the default output device, known as ‘standard output’. In Ruby this is represented by an IO object which you can replace with your StringIO object.

Example:

This is more involved than the input version, because we want to make sure to restore the original STDOUT object & to rewind our StringIO so we can read the output.

Notice that most testing frameworks include a method to do this for you (assert_output for Minitest & the output matcher in RSpec), but it’s always great to know what’s going on behind the scenes 🙂

Summary

You learned about the StringIO class, which emulates a real IO object so it can act as a replacement for that kind of object.

This can be useful for testing classes that write output to the screen or require user input via the terminal.

If you know about some interesting uses for StringIO let us know in the comments & don’t forget to share this article so more people can enjoy it!

ruby oop

Stop Using Case Statements in Ruby

Are you using the full power of OOP (Object-Oriented Programming) or are you missing out?

If you are taking decisions based on the type of an object then you are missing out on one important OOP feature: polymorphism.

Type decisions are usually done inside case statements (which are not OO friendly) & in this article you will learn how to write better code by removing them.

Checking For Types

Let me start by showing you an example where we don’t take advantage of polymorphism.

We want to implement the “Rock, Paper, Scissors” game & we have decided to have one Game class and one class for every possible move.

To check for a winner we are going to implement a play method on Game:

And here is one of the moves (the others follow the same pattern):

Now we can call the play method with two moves & we will know if the first move wins.

Ok, this is working, but can we do better? Can we get rid of that ugly case statement?

Polymorphism Instead of Type Checking

Yes! We can get rid of the type-checking case statement by using OOP fundamentals.

The idea is to use the fact that we know the current class to ask the other movement object if it can beat us.

And we are going to use a method name that is specific to our class (for Rock the method name could be: do_you_beat_rock?)

In Ruby, polymorphism is the ability to send any method calls (also know as messages, in OOP parlance) to any object without having to check the object’s class. This is also known as “duck typing”, but I don’t like that term 🙂

In Java (bear with me for a second…), you have something called “interfaces” which allows you to force a set of methods to be implemented by a class at the compiler level.

We don’t have that in Ruby (probably for the better), so you will have to rely on testing.

Let’s see a code example of what this new implementation looks like:

Notice how the case statement is gone. It has been replaced by a single method call & two method definitions.

Update: As some readers have commented, I didn’t notice that the logic is reversed when implementing this pattern. One proposed solution by Daniel P. Clark is to just flip the order in the play method to move2.wins_against?(move1).

Isn’t this a lot cleaner? What do you think?

One More Move!

Now let’s say you want to add a new move. What would you have to change?

Think about it for a minute…

With the case statement approach you have to add a new branch to every move. Notice that you are “forced” to change a perfectly working method to introduce new functionality.

But that’s not an issue with our more OOP oriented version. All we have to do is add new methods. This is the open/closed principle (the “O” in SOLID) in action.

Another option could be to use metaprogramming (with method_missing or define_method).

Would I actually use metaprogramming for this?

Probably not, unless the number of moves is changing constantly, or there is a big number of moves.

There is a cost to metaprogramming, it’s not a silver bullet like some people may believe. You would be trading performance & readability for a bit of extra flexibility.

Conclusion

In this article you learned that you should avoid using case statements to check for class types. Instead, you should take advantage of polymorphism.

This will help you create better code that can be extended by adding new code instead of changing existing code. Now it’s your turn to take action & do some refactoring 🙂

Btw this doesn’t mean I’m advocating to entirely get rid case statements from your toolbox, but you should be wary whenever you find yourself writing one. Make sure that’s the best solution to the problem.

Don’t forget to share this article if you want me to keep writing articles like these!

Ruby Under The Hood: Memory Layout of an Object

If you enjoy seeing how things work under the hood I think you are going to love this post…

…because we are going to explore together how a Ruby object is laid out in memory & how you can manipulate that to do some cool stuff.

So fasten your seatbelts because this is going to be quite a journey into the depths of the Ruby interpreter.

Memory Layout of Arrays

When you create an array, Ruby has to back that up with some system memory & a little bit of metadata (like the array size). Since the main Ruby interpreter (MRI) is written in C there are no objects, but there is something else: structs.

A struct in C helps you store related data together, and this is used a lot in MRI’s source code to represent things like Array, String‘s & other kinds of objects.

By looking at one of those structs we can infer the memory layout of an object.

So let’s look at the struct for Array, called RArray:

I know this can look a bit intimidating if you are not familiar with C, but don’t worry! I will help you break this down into easy to digest bits of info 🙂

The first thing we have is this RBasic thing, which is also a struct:

This is something that most Ruby objects have & it contains a few things like the class for this object & some binary flags that say if this object is frozen or not (and other things like the ‘tainted’ attribute).

In other words: RBasic contains the generic metadata for the object.

After that we have another struct, which contains the length of the array (len).

The union expression is saying that aux can be either capa (for capacity) or shared. This is mostly an optimization thing, which is explained in more detail in this excellent post by Pat Shaughnessy. In terms of memory allocation, the compiler will use the biggest type inside an union.

Then we have ptr, which contains the memory address where the actual Array data is stored.

Here’s a picture of what this looks like (every white/grey box is 4 bytes in a 32-bit system):

array memory layout

You can see the memory size of an object using the ObjectSpace module:

Now we are ready to have some fun!

A Fun Experiment

RBasic is exactly 8 bytes in a 32-bit system & 16 bytes in a 64-bit system. Knowing this we can use the Fiddle module to access the raw memory bytes for an object & change them for some fun experiments.

For example:

We can change the frozen status by toggling a single bit. This is in essence what the freeze method does, but notice how there is no unfreeze method.

Let’s implement it just for fun!

First, lets require the Fiddle module (part of the Ruby Standard Library) & create a frozen string.

Next, we will need the memory address for our string, which can be obtained like this:

Finally, we flip the exact bit that Ruby checks to see if an object is frozen. We also check to see if this worked by calling the frozen? method.

Notice that the index [1] refers to the 2nd byte of the flags value (which is composed of 4 bytes in total).

Then we use ^= which is the “XOR” (Exlusive OR) operator to flip that bit. We do this because different bits inside flags have different meanings & we don’t want to change something unrelated.

If you have read my ruby tricks post you may have seen this before, but now you know how it works 🙂

Another thing you can try is to change the length of the array & print the array. You will see how the array becomes shorter! You can even change the class to make an Array think it’s a String

Conclusion

You have learned a bit about how Ruby works under the hood. How memory for Ruby objects is laid out & how you can use the Fiddle module to play around with that.

You should probably not use Fiddle like this in a real app, but it’s fun to experiment with.

Don’t forget to share this post so more people can see it 🙂

1 2 3 20