Black Bytes
Share this post!

Category Archives for Programming

ruby graph theory

Pratical Graph Theory in Ruby

This is the next installment in the “Practical Computer Science” series, where you will learn how to apply classic computer science concepts to solve real problems using Ruby.

Today we are going to talk about Graph Theory.

You may have heard about binary trees, they look like this:

binary-tree

The thing is that a binary tree is just a specialized version of a graph, so that should give you an idea of how widespread graphs are.

Let’s start with an overview of graph theory fundamentals, then we are going to see what are some practical uses & how to implement this in Ruby!

Graph Fundamentals

A graph is composed of two elements:

  • Nodes (or vertices)
  • Edges

One node represents one element in the graph, like a city or a street, in a graph representing a map. While the edges represent the connections between the nodes.

If you look at a computer science or math book you will see a graph defined by this formula: G(V, E).

Where G means Graph, V is the set of vertices & E is the set of edges.

Graphs can be directed or undirected. Meaning that you can only go one direction (directed graph) or both directions (undirected graph).

The most popular type of graph is the Directed Acyclic Graph (DAG). Acyclic means that there are no loops, there is no way to backtrack.

Uses for Graphs

Now that we have seen an overview of the fundamentals, let’s see some common uses for graphs.

Using graphs you can do things like:

  • Find the shortest (or longest) path between two locations
  • Check if two things are related to each other
  • Build a recommendation engine
  • Analyze dependencies

Another example includes finding the best route to a destination (think GPS devices).

How to Implement & Use Graphs

You could write your own graph implementation, but for this article we are going to stick to the RGL gem which already implements one for us.

To create a basic graph using RGL:

This code produces the following graph:

graph-1234

You can get a graphical representation of your graph like this:

Then copy the output of that method on a site that can process the dot language. Like this one.

Alternatively, you can install Graphviz on your machine to produce the image locally.

Now that we have a graph, we may want to traverse it to find out information about it.

There are two basic algorithms for searching your graph:

  • Breadth-First Search (BFS)
  • Depth-First Search (DFS)

In BFS you get the closest nodes first & in DFS you go as deep as possible for every node. These algorithms can be implemented using the stack data structure.

The RGL gem already implements those algorithms for you:

Look at the graph again & follow the path these algorithms did using just your eyes (or you can use a finger too if you want). That will help you get a sense of what’s going on.

Weighted Graphs

We can add more information to a graph in the form of weights to make it more useful.

Weights are given to edges, which are the paths between two nodes (also known as “vertices”). These weights represent the cost of going from one point to another.

For example, if we have a map of a country in the form of a graph & we want to reach a certain destination in the shortest time possible, the weights would represent the distance between two cities.

usa-graph

Or if we have a computer network, the weights may represent how many hops it takes to reach a certain network.

“In computer networking, a hop is one portion of the path between source and destination. Data packets pass through bridges, routers and gateways as they travel between source and destination. Each time packets are passed to the next network device, a hop occurs.” – Wikipedia

Here’s a code example for a weighted graph:

We can now search for the shortest path from one point to another. And that’s exactly the topic of the next section!

Finding The Shortest Path

A popular algorithm for finding the shortest path inside a graph is the “Dijkstra’s Shortest Path” algorithm.

Given a weighted graph, we can use Dijkstra’s algorithm to solve this question:

“What’s the fastest way to get from point A to point B?”

Here is an code example, using the RGL gem:

This tells us the shortest path from New York to Houston, using the information available in the graph.

Summary

You have learned what a graph data structure is & how to use it with the RGL gem.

You have also learned about common algorithms for working with graphs, like DFS, BFS & Dijkstra’s.

Don’t forget to share this post if you found it useful so more people can enjoy it 🙂

ruby stringio

An Object That Behaves Like a File?

If you are looking for an object that behaves like an IO object (files, sockets, etc), but that you can control like a string, then StringIO is for you.

Let me show you some examples & things to watch out for!

Basic Examples

To create a StringIO object you can do this:

Then you can read from this object using methods like gets, read & each_line.

I made a handy table for you with the most useful methods:

Method Description
gets Read one line of input
read Read a specific amount of bytes (all by default)
each_line Given a block, iterate over each line
each_char Given a block, iterate over each character
<< Append data
rewind Reset the internal position pointer

Notice that StringIO has a position pointer. This pointer keeps track of how many bytes you have read, just like a file object does.

So every time you call a method like gets or read it will give you that amount of data & advance the pointer.

Even Enumerable methods like map or each_line will advance the position pointer, so keep that in mind.

You can reset the position pointer to the beginning using the rewind method:

That’s it for the basics. I’m going to show you some practical uses for StringIO, but first let me show you something else.

What About StringScanner?

Now you have seen what StringIO can do, but the Ruby Standard Library includes another string-related class.

That class is StringScanner.

It can be confusing since they have similar names & methods, but let me help you see the difference.

The main thing is this:

A StringIO object can replace another IO object (like a File or a Socket), but StringScanner is meant for doing things like parsing (making sense of some text by breaking it into a set of tokens).

These two classes still share something, besides having “String” in their name, they both use an internal position pointer.

Replacing Standard Input & Output

Let’s say you are writing a command-line application that ask user for output using the Kernel#gets method…

…if you want to test this code you are going to have to input something by hand every time.

So is automated testing out of the picture?

No, it isn’t!

This is where StringIO comes to the rescue. You can initialize a StringIO object with your test input & then replace the standard input object pointed by $stdin (this where Ruby looks for user input when you call gets).

This technique can also be used to capture output from methods like puts.

Display methods print to the default output device, known as ‘standard output’. In Ruby this is represented by an IO object which you can replace with your StringIO object.

Example:

This is more involved than the input version, because we want to make sure to restore the original STDOUT object & to rewind our StringIO so we can read the output.

Notice that most testing frameworks include a method to do this for you (assert_output for Minitest & the output matcher in RSpec), but it’s always great to know what’s going on behind the scenes 🙂

Summary

You learned about the StringIO class, which emulates a real IO object so it can act as a replacement for that kind of object.

This can be useful for testing classes that write output to the screen or require user input via the terminal.

If you know about some interesting uses for StringIO let us know in the comments & don’t forget to share this article so more people can enjoy it!

ruby oop

Stop Using Case Statements in Ruby

Are you using the full power of OOP (Object-Oriented Programming) or are you missing out?

If you are taking decisions based on the type of an object then you are missing out on one important OOP feature: polymorphism.

Type decisions are usually done inside case statements (which are not OO friendly) & in this article you will learn how to write better code by removing them.

Checking For Types

Let me start by showing you an example where we don’t take advantage of polymorphism.

We want to implement the “Rock, Paper, Scissors” game & we have decided to have one Game class and one class for every possible move.

To check for a winner we are going to implement a play method on Game:

And here is one of the moves (the others follow the same pattern):

Now we can call the play method with two moves & we will know if the first move wins.

Ok, this is working, but can we do better? Can we get rid of that ugly case statement?

Polymorphism Instead of Type Checking

Yes! We can get rid of the type-checking case statement by using OOP fundamentals.

The idea is to use the fact that we know the current class to ask the other movement object if it can beat us.

And we are going to use a method name that is specific to our class (for Rock the method name could be: do_you_beat_rock?)

In Ruby, polymorphism is the ability to send any method calls (also know as messages, in OOP parlance) to any object without having to check the object’s class. This is also known as “duck typing”, but I don’t like that term 🙂

In Java (bear with me for a second…), you have something called “interfaces” which allows you to force a set of methods to be implemented by a class at the compiler level.

We don’t have that in Ruby (probably for the better), so you will have to rely on testing.

Let’s see a code example of what this new implementation looks like:

Notice how the case statement is gone. It has been replaced by a single method call & two method definitions.

Update: As some readers have commented, I didn’t notice that the logic is reversed when implementing this pattern. One proposed solution by Daniel P. Clark is to just flip the order in the play method to move2.wins_against?(move1).

Isn’t this a lot cleaner? What do you think?

One More Move!

Now let’s say you want to add a new move. What would you have to change?

Think about it for a minute…

With the case statement approach you have to add a new branch to every move. Notice that you are “forced” to change a perfectly working method to introduce new functionality.

But that’s not an issue with our more OOP oriented version. All we have to do is add new methods. This is the open/closed principle (the “O” in SOLID) in action.

Another option could be to use metaprogramming (with method_missing or define_method).

Would I actually use metaprogramming for this?

Probably not, unless the number of moves is changing constantly, or there is a big number of moves.

There is a cost to metaprogramming, it’s not a silver bullet like some people may believe. You would be trading performance & readability for a bit of extra flexibility.

Conclusion

In this article you learned that you should avoid using case statements to check for class types. Instead, you should take advantage of polymorphism.

This will help you create better code that can be extended by adding new code instead of changing existing code. Now it’s your turn to take action & do some refactoring 🙂

Btw this doesn’t mean I’m advocating to entirely get rid case statements from your toolbox, but you should be wary whenever you find yourself writing one. Make sure that’s the best solution to the problem.

Don’t forget to share this article if you want me to keep writing articles like these!

Ruby Under The Hood: Memory Layout of an Object

If you enjoy seeing how things work under the hood I think you are going to love this post…

…because we are going to explore together how a Ruby object is laid out in memory & how you can manipulate that to do some cool stuff.

So fasten your seatbelts because this is going to be quite a journey into the depths of the Ruby interpreter.

Memory Layout of Arrays

When you create an array, Ruby has to back that up with some system memory & a little bit of metadata (like the array size). Since the main Ruby interpreter (MRI) is written in C there are no objects, but there is something else: structs.

A struct in C helps you store related data together, and this is used a lot in MRI’s source code to represent things like Array, String‘s & other kinds of objects.

By looking at one of those structs we can infer the memory layout of an object.

So let’s look at the struct for Array, called RArray:

I know this can look a bit intimidating if you are not familiar with C, but don’t worry! I will help you break this down into easy to digest bits of info 🙂

The first thing we have is this RBasic thing, which is also a struct:

This is something that most Ruby objects have & it contains a few things like the class for this object & some binary flags that say if this object is frozen or not (and other things like the ‘tainted’ attribute).

In other words: RBasic contains the generic metadata for the object.

After that we have another struct, which contains the length of the array (len).

The union expression is saying that aux can be either capa (for capacity) or shared. This is mostly an optimization thing, which is explained in more detail in this excellent post by Pat Shaughnessy. In terms of memory allocation, the compiler will use the biggest type inside an union.

Then we have ptr, which contains the memory address where the actual Array data is stored.

Here’s a picture of what this looks like (every white/grey box is 4 bytes in a 32-bit system):

array memory layout

You can see the memory size of an object using the ObjectSpace module:

Now we are ready to have some fun!

A Fun Experiment

RBasic is exactly 8 bytes in a 32-bit system & 16 bytes in a 64-bit system. Knowing this we can use the Fiddle module to access the raw memory bytes for an object & change them for some fun experiments.

For example:

We can change the frozen status by toggling a single bit. This is in essence what the freeze method does, but notice how there is no unfreeze method.

Let’s implement it just for fun!

First, lets require the Fiddle module (part of the Ruby Standard Library) & create a frozen string.

Next, we will need the memory address for our string, which can be obtained like this:

Finally, we flip the exact bit that Ruby checks to see if an object is frozen. We also check to see if this worked by calling the frozen? method.

Notice that the index [1] refers to the 2nd byte of the flags value (which is composed of 4 bytes in total).

Then we use ^= which is the “XOR” (Exlusive OR) operator to flip that bit. We do this because different bits inside flags have different meanings & we don’t want to change something unrelated.

If you have read my ruby tricks post you may have seen this before, but now you know how it works 🙂

Another thing you can try is to change the length of the array & print the array. You will see how the array becomes shorter! You can even change the class to make an Array think it’s a String

Conclusion

You have learned a bit about how Ruby works under the hood. How memory for Ruby objects is laid out & how you can use the Fiddle module to play around with that.

You should probably not use Fiddle like this in a real app, but it’s fun to experiment with.

Don’t forget to share this post so more people can see it 🙂

ruby stack

Practical Computer Science in Ruby: Using Stacks to Solve Problems

If you don’t have a CS (Computer Science) degree you might feel like you are missing out on something… Or you may feel like CS is too abstract to be useful… Or that Ruby already does all the hard work for you…

In this post I want to show you one practical computer science concept that you can start putting into practice right now!

The Stack

A stack is a data structure which you can use as a “to-do” list. You keep taking elements from the stack & processing them until the stack is empty.

Here’s is a visual representation.

Push 5 into an empty stack:

5

Push 3 into the stack:

3
5

Push 9 into the stack:

9
3
5

Take one item from the stack:

3
5

The big thing to notice here is that new items are added to the top of the stack. In a “Last-In First-Out” (LIFO) fashion. Meaning that when you take (pop) an item from the stack, it will be the last item that was pushed into it.

Another way to visualize how this works is by thinking about a stack of plates. If you put one plate on top of another you can’t take out the bottom plate without taking out the top plate first.

That’s all you are doing with a stack, pushing things to the top or poping things out. There is no indexing.

Let’s see some code examples on how this can be used.

Example 1 (flatten)

One classic example is to flatten an array. In other words, convert a multi-dimensional array into a one-dimensional array.

In Ruby we have the flatten method which does this for you. But what if you didn’t have this method? And how does this method work?

This is where the stack comes in!

Ruby implements the push & pop methods, so we can treat an array like a stack.

Note: Both push & << are the same method. I will be using << in my code examples.

The idea is to go over every element and check if it’s an array or something else. If it’s an array then we will push the elements inside this array back into the stack.

What happens is that we keep removing array layers (like an onion) until there are none left. This will leave us with a flattened array.

Here’s is the code:

You will notice there is no pop method call in this code. This is because I’m letting each do the hard work of taking the items from the stack & feeding them to my algorithm. Also notice how the order of the elements is not mantained when doing things this way.

Another version using until & empty?:

Since we are using pop now, instead of letting each do its thing, we are getting the flattened array in the correct order… But in reverse.

This reveals an interesting property of stacks:

Whatever list of things you put in, it will come back in the same order but in reverse.

Tip: The Array#flatten method takes an argument, which lets you define how many layers of nesting you would like to remove (by default all of them).

Example 2 (balanced parenthesis)

Here is another example, where there is no equivalent Ruby method to do everything for you. And it’s also another classic problem in computer science: matching balanced parenthesis.

The idea is that you are given a string & you need to validate if the parenthesis make sense.

For example, let’s say you are writing a math evaluation program. You want to make sure the input is valid before processing it.

Example (Valid Input):

Example (Invalid Input):

You can use a stack to keep track of any parenthesis you have found in the input & then when you find a new parenthesis you check the top of the stack to make sure it matches the closing parenthesis.

If there is no match that means that you have invalid input.

Example:

Another thing you will notice is that I ended this valid_parentheses method with a stack.empty?. That’s to make sure we don’t end with unclosed parenthesis.

If all the parenthesis are correctly closed then the stack should be empty 🙂

Example 3 (directions)

One more example to make sure you understand how to apply this.

In this case we are given a set of directions & we are told that we need to optimize it to save our traveler some time.

Here is an example set:

You will notice that if you go north then south you will end up in the same place (assuming it’s the same distance both directions). So that’s what we need to optimize & we can do that using a stack.

Example:

Conclusion

I hope you learned something new & that you start looking for ways to use stacks to solve programming challenges.

Don’t forget to share this post so more people can read this!

1 2 3 13