Black Bytes
Share this post!

Category Archives for Programming

ruby equality

How to Make Your Classes More Powerful by Implementing Equality

How do you compare two things in Ruby? Using == as you already know… but did you know that == is a method & not just syntax?

You can implement this method in your own classes to make them more powerful. And that’s what I want to talk about in this post.

Equality Basics

As you know you can compare two strings like this:

And if the content is equal then this will evaluate to true. This works because the String class implements a == method that knows how to compare strings.

But what if String didn’t implement ==?

Then Ruby would use Object‘s implementation of ==, which defaults to testing for object identity, instead of object contents.

Example:

Implementing Equality

Now let’s use what you just learned to make your own classes more powerful by being able to compare them.

Thanks to the == method you can define exactly what it means for two instances of your own class to be equal.

Example:

The == method says that both the name and the price must be the same for two Product objects to be considered equal.

Remember:

If you don’t implement this method (or use the Comparable module, which I explain in my Ruby book) the two objects will be compared using their object id’s, instead of their values.

Also I should mention that if you use a Struct it already implements == for you.

What About Triple Equals?

You may be wondering if == is a method, is === also a method? And the answer is yes 🙂

So what’s the difference between the two?

In Javascript there is a clear difference, where == will try to convert the object types to be the same if they aren’t (1 vs '1'). And === is for ‘strict’ equality.

But in Ruby there is not such thing. What === means depends on the class implementing it.

In many cases it is just an alias for ==.

Like in String and Object.

Here’s a table of built-in classes which give === a special meaning:

Class Meaning
Range Returns true if obj is an element of the range, false otherwise.
Regexp Match regexp against a string.
Module Returns true if obj is an instance of mod or and instance of one of mod’s descendants.
Proc Invokes the block with obj as the proc’s parameter like Proc#call. It is to allow a proc object to be a target of a when clause in a case statement.

Conclusion

In this post you learned how to make your classes more powerful by implementing the == method. You also learned the difference between == and ===.

Don’t forget to share this post so more people can see it 🙂

15 Weird Things About Ruby That You Should Know

15 Weird Things About Ruby That You Should Know

By Jesus Castello

Ruby is an amazing language with a lot of interesting details that you may not have seen before...

…in this post I compiled some of those details for your own enjoyment in a nice-looking list :)​

1

Heredoc + Method

If you have some data that you want to embed into your program you may want to use a “heredoc”.

Like this:

This will give you a string. But you may want to do some post-processing, like splitting this string into an array of strings.

Ruby lets you do this:

Bonus tip:
Ruby 2.3 introduced the "squiggly heredoc" <<~. This will remove all the extra spaces introduced by indentation, which is a common problem when using heredocs for text.

2

Call a Method Using Double Colon

Apparently this is a thing…

3

Puts with Multiple Arguments

Pretty simple, but could be useful in some situations I guess.

4

Infinite Indexing

Example:

This works because [] is just a method & it keeps returning the first character, which is also a string.

5

De-structuring Block Arguments (or whatever you want to call this)

Want to get rid of some local variables? You will love this trick!

This has the same effect as if we did this:

But it saves you one line of code 🙂

6

Special Global Variables

When you use a regular expression with capture groups it will set the global variable $1 to the first group, $2 for the second group, etc.

The thing about these is that they don't behave like normal variables. They are ‘method-local’ & ‘thread-local’, as described by the documentation.

Also they can’t be directly assigned to like regular global variables.

7

Shovel Method on Strings

There is this “shovel” method on string which doesn’t do what you would expect when you use it with a number

…it’s interpreting the number as an ASCII character.

Here’s another way to do that:

8

Character Literals

Not sure if there are any practical uses for this one…

Let me know in the comments what you think 🙂

9

The RbConfig Module

RbConfig is a module which is not documented & it contains some info about your Ruby installation.

There is some useful info under RbConfig::CONFIG (like compile flags, ruby version & operating system).

10

Spaces, Spaces Everywhere!

You can put as many spaces as you want between a method call & the receiver of that call.

Yes, this is valid Ruby syntax 🙂

11

Infinite Nesting of Constants

You can have an infinite amount of nested constants like this:

The reason this works is that all top-level constants (defined outside any class) are contained in the Object class & every class inherits from Object by default.

Try Object.constants to see what I mean.

12

Chaining the Shovel Operator

You can chain the shovel << operator multiple times:

13

BEGIN & END

Two keywords that you don’t see very often are BEGIN & END. I believe these come from the Perl / Unix world, where it’s common to write short scripts for processing output from other programs.

Let’s see an example of how this works:

This code will print "Program starting..." before it prints 123. It could be useful if you are writing the kind of short scripts that this is meant for, but probably not very useful in web applications.

Update:
Reader Ronald sent me some interesting uses for this trick. Here is what he said:

"It is very useful, for example for fiddling with the RUBYLIB path for the 'require' statements, because it is guaranteed to be executed before all the 'require'. I also use it to set $VERBOSE to true, or set some environment variables, etc."

14

Flip-Flop

I don’t even know why this is a thing, but I would advice to stay away from it because it can be confusing & most people are not familiar with this feature 🙂

But it could be useful to know in case you find this in other people’s code.

This is the syntax:

The idea is that once the first condition is true an invisible “switch” will turn on & everything from there will evaluate as true until the 2nd condition is true.

Example:

This prints all the numbers from 3 to 15, but if you skip 15 it will keep printing.

15

Redo Keyword

Another keyword that you don’t see very often is redo, this allows you to repeat the same iteration inside a loop…

…but unless you use next or break you will have an infinite loop. So I think you should not use this feature.

Conclusion

You learned about a few cool Ruby tricks & tips. If you want more see my other post here.

Don’t forget to share this post so more people can see it!

ruby hash tables

Hash Tables Explained

One of my favorite data structures is the hash table because it’s simple & powerful. You probably have used it before since it’s an efficient way to store key-value pairs.

There are some interesting computer science concepts behind the implementation of a hash table that are worth studying, and that’s exactly what we are going to do in this article!

Buckets & The Hash Function

The basic idea of a hash table is to allow us to efficiently (in O(1)) retrieve data that is indexed by a key.

As a quick refresher, this is what using a hash table looks like in Ruby:

To implement a hash table we need two components:

  • A place to store the table entries
  • A way to assign key/value pairs to a specific position (index) inside this data store

In other words, we need an array & a hash function.

Implementing a Simple Hash Function

The hash function is an important component of a hash table. This function transforms the key into an index that can be used to lookup or update its associated value.

ruby hash tables

This is the big difference between plain arrays & hash tables.

In an array, you can only access values via their index and the index can only be a number. In a hash table, you access values via their key & the key can be anything (string, symbol, integer…), as long as you can write a hash function for it.

You can write a simple hash function for strings by converting all the letters into their ASCII values then adding them up.

Here is an example:

In this method we use to_s to make sure we are working with a string. This will help us avoid ‘undefined method’ errors. Then a combination of chars (to convert the string into an Array of its characters) & inject for adding up the values.

Inside the block I used the ord method to convert the characters into their ordinal values.

Finally, I used the modulo operator % to make sure the resulting value fits into the array. We call each entry in this array a ‘bucket’.

Bucket Distribution

Ideally we want all our buckets to be filled evenly, this will give us the best performance when we need to retrieve a value.

Let’s look at what happens when we test our hash function using this code:

This produces the following results:

It looks like our keys are evenly distributed…

…but what happens if we ramp up the number of buckets?

In this example I used a bucket size of 128 (it was 32 before):

That doesn’t look like a great distribution anymore!

What’s going on?

The problem is that our hash function is not good enough because all the strings of the same size stay in a certain range. That’s why we have a lot of empty buckets in the middle.

A Better Hash Function

We need a better way to convert our string into an index. Let’s take a look at one possible implementation.

What’s happening here is bit shifting (with the >> & << operators). The values are combined using the “exclusive OR operator” (^).

This bit shifting will mix things up, which will give us a better key distribution. Not perfect, but it’s better than our simple ASCII-based function 🙂

If you want a proper hash function you would be looking at something like MurmurHash, which I believe is what Ruby uses.

Handling Collisions

We don’t have a useful hash table yet.

Why?

Well, you may have noticed that when two keys hash to the same index they will overwrite the old value, and that’s not good!

This is called a hash collision & there are a few strategies to deal with these.

For example:

  • Double Hashing
  • Linear probing
  • Separate chaining

Let’s take a look at separate chaining, where we use a linked-list to store the entries for a particular “bucket”.

So if we assume that :abc & :ccc hash to the same index, our hash table would look something like this:

Then we will need a linear search to find the correct key.

This will have an impact on our performance because our lookup time can slowly degrade towards O(n), instead of the expected O(1).

If you are not familiar with this O(something) notation that’s called “Big-O Notation“.

To avoid the linked list from growing too deep & therefore degrade the performance of the hash table, it’s necessary to recreate the hash table using a bigger number of buckets.

Ruby does this for you automatically, but still good to know.

Conclusion

The purpose of this article is not to have you writing a hash table implementation, but to help you understand how they actually work, so I hope that you found that interesting!

Don’t forget to share this post to keep the blog going 🙂

ruby pack unpack

Packing & Unpacking: A Guide to Reading Binary Data in Ruby

Working with text is a lot easier than working with binary data…

…with text you can use regular expressions & methods like scan, match & gsub.

But if you want to work with binary data there is some extra work to do. That’s where the Array#pack & String#unpack methods come into play.

Let me show you some examples, starting with just a plain string & then moving on to more interesting stuff.

String to ASCII Values

This will convert every character in the string into a decimal value:

Notice the "c*" argument for unpack.

This is a “format string” which tells unpack what to do with the data. In this case, c means take one character & convert it into an integer value (the String#ord method also does this).

The asterisk * just says “repeat this format for all the input data”.

Hex to Integer

This format string takes 4 bytes of data & returns an integer. One thing to notice is that these bytes are in “little-endian” format.

Examples:

I used first here because unpack returns an array.

Binary File Parsing

How do you read a binary file like an EXE, PNG or GZIP?

If you treat these like strings you will just see something that looks like random data…

ruby string pack

…but there is a documented structure for most of these file formats & the unpack method is what you would use to read that data and convert it into something useful.

Here is an example:

In this example, the binary data (represented in hexadecimal, which is way more compact than 1s & 0s) has a two-byte (16 bit) length field that contains the length of the following string. Then there is the string itself.

It is very common for binary files & binary network protocols to have a “length” field.

This tells the parser exactly how many bytes should be read (and yes, I know in this example I read both the length & the data in one step, that’s just to keep things simple).

There is also the bindata gem, which is built specifically to help you parse binary structures.

Here is an example:

Notice the read_length parameter. This will tell bindata to work out the length from the field, so this will save you a lot of work 🙂

So if you want to write a parser for any binary format, these are the steps:

  1. Find the specification for this format (if it’s not public you will have to reverse-engineer it, which is an entire topic on its own)
  2. Write a bindata class for every section of the file (you will usually find a header section first with metadata & then multiple data sections)
  3. Read the data & process it however you want (for example, in a PNG you could change the colors of the image)
  4. Profit!

If you want to see a full example of bindata in action take a look at my PNG parser on github.

Base64 Encoding

There is this type of encoding called “Base64”. You may have seen it before on a URL.

Looks something like this:

The double equals at the end is usually the tell-tale sign that you are dealing with Base64, although some inputs can result in the equals signs not being there (they are used as padding).

So why I’m telling you this, besides being a useful thing to know in itself?

Well it turns out that you can convert a string into Base64 using the pack method.

As you can see here:

In fact, this is the exact method used in the Base64 module from the standard library 🙂

Summary

In this post you learned about the pack & unpack methods, which help you work with binary data. It can be used to parse binary files, convert a string into ASCII values & Base64 encoding.

Don’t forget to share & subscribe so you can enjoy more blog post like this! 🙂

ruby tracepoint

How To Spy on Your Ruby Methods

Ruby has a built-in tracing system which you can access using the TracePoint class. Some of the things you can trace are method calls, new threads & exceptions.

Why would you want to use this?

Well, it could be useful if you want to trace the execution of a certain method. You will be able to see what other methods are being called & what are the return values.

Let’s see a few examples!

Tracing Method Calls

Most of the time you will want TracePoint to trace application code & not built-in methods (like puts, size, etc).

You can do this using the call event.

Example:

This prints the file path, the line number, the event name & the method name.

If you don’t specify any events Ruby will call your block for all of them, resulting in more output. So I would recommend that you focus on specific events to find what you want faster 🙂

Here’s is a table of TracePoint events:

Event name Description
call Application methods
c_call C-level methods (like puts)
return Method return (for tracing return values & call depth)
b_call Block call
b_return Block return
raise Exception raised
thread_begin New thread
thread_end Thread ending

TracePoint + Graphviz

Many methods will make more than just 3 methods calls, especially in framework code, so the output from Tracepoint can be hard to visualize.

So I made a gem that lets you create a visual call graph like this:

This generates a call_graph.png file with the results.

ruby call graph

Keep in mind that this is not static analysis, this will actually call the method!

Showing File Paths

Would you like to know where these methods are defined?

Don’t worry, I got you covered! I added an option you can enable to show the file path for each method call.

Which results in:

visual call graph

If you want to see some massive call graphs you just have to trace some Rails methods 😉

Return Values

In the intro I mentioned that you can also get return values. For this you will need to trace the return event and use the return_value method.

Example:

This will print:

Events First

Someone asked on reddit how it’s possible to avoid having the word “bar” printed when calling the foo method in the following code:

There are many ways to achieve this, like prepending a module, redirecting $stdout or redefining the bar method.

If you are feeling creative, comment on this post with your own idea!

But I found one of the answers particularly interesting because it used the TracePoint class.

Here it is:

This code will call exit when the method bar is called, which prevents the string from being printed by ending the program.

Probably not something you want to use in real code, but it proves one thing about TracePoint: Events are triggered before they happen.

Something to keep in mind if you are going to build some sort of tool around this 🙂

Summary

In this post you learned about the TracePoint class, which allows you to trace a few events like methods calls or new threads. This can be useful as a debugging tool or for code exploration.

Remember to share this post so more people can enjoy it 🙂

1 2 3 12