Black Bytes
Share this post!

All posts by Jesus Castello

15 Weird Things About Ruby That You Should Know

15 Weird Things About Ruby That You Should Know

By Jesus Castello

Ruby is an amazing language with a lot of interesting details that you may not have seen before...

…in this post I compiled some of those details for your own enjoyment in a nice-looking list :)​

1

Heredoc + Method

If you have some data that you want to embed into your program you may want to use a “heredoc”.

Like this:

This will give you a string. But you may want to do some post-processing, like splitting this string into an array of strings.

Ruby lets you do this:

Bonus tip:
Ruby 2.3 introduced the "squiggly heredoc" <<~. This will remove all the extra spaces introduced by indentation, which is a common problem when using heredocs for text.

2

Call a Method Using Double Colon

Apparently this is a thing…

3

Puts with Multiple Arguments

Pretty simple, but could be useful in some situations I guess.

4

Infinite Indexing

Example:

This works because [] is just a method & it keeps returning the first character, which is also a string.

5

De-structuring Block Arguments (or whatever you want to call this)

Want to get rid of some local variables? You will love this trick!

This has the same effect as if we did this:

But it saves you one line of code 🙂

6

Special Global Variables

When you use a regular expression with capture groups it will set the global variable $1 to the first group, $2 for the second group, etc.

The thing about these is that they don't behave like normal variables. They are ‘method-local’ & ‘thread-local’, as described by the documentation.

Also they can’t be directly assigned to like regular global variables.

7

Shovel Method on Strings

There is this “shovel” method on string which doesn’t do what you would expect when you use it with a number

…it’s interpreting the number as an ASCII character.

Here’s another way to do that:

8

Character Literals

Not sure if there are any practical uses for this one…

Let me know in the comments what you think 🙂

9

The RbConfig Module

RbConfig is a module which is not documented & it contains some info about your Ruby installation.

There is some useful info under RbConfig::CONFIG (like compile flags, ruby version & operating system).

10

Spaces, Spaces Everywhere!

You can put as many spaces as you want between a method call & the receiver of that call.

Yes, this is valid Ruby syntax 🙂

11

Infinite Nesting of Constants

You can have an infinite amount of nested constants like this:

The reason this works is that all top-level constants (defined outside any class) are contained in the Object class & every class inherits from Object by default.

Try Object.constants to see what I mean.

12

Chaining the Shovel Operator

You can chain the shovel << operator multiple times:

13

BEGIN & END

Two keywords that you don’t see very often are BEGIN & END. I believe these come from the Perl / Unix world, where it’s common to write short scripts for processing output from other programs.

Let’s see an example of how this works:

This code will print "Program starting..." before it prints 123. It could be useful if you are writing the kind of short scripts that this is meant for, but probably not very useful in web applications.

Update:
Reader Ronald sent me some interesting uses for this trick. Here is what he said:

"It is very useful, for example for fiddling with the RUBYLIB path for the 'require' statements, because it is guaranteed to be executed before all the 'require'. I also use it to set $VERBOSE to true, or set some environment variables, etc."

14

Flip-Flop

I don’t even know why this is a thing, but I would advice to stay away from it because it can be confusing & most people are not familiar with this feature 🙂

But it could be useful to know in case you find this in other people’s code.

This is the syntax:

The idea is that once the first condition is true an invisible “switch” will turn on & everything from there will evaluate as true until the 2nd condition is true.

Example:

This prints all the numbers from 3 to 15, but if you skip 15 it will keep printing.

15

Redo Keyword

Another keyword that you don’t see very often is redo, this allows you to repeat the same iteration inside a loop…

…but unless you use next or break you will have an infinite loop. So I think you should not use this feature.

Conclusion

You learned about a few cool Ruby tricks & tips. If you want more see my other post here.

Don’t forget to share this post so more people can see it!

ruby hash tables

Hash Tables Explained

One of my favorite data structures is the hash table because it’s simple & powerful. You probably have used it before since it’s an efficient way to store key-value pairs.

There are some interesting computer science concepts behind the implementation of a hash table that are worth studying, and that’s exactly what we are going to do in this article!

Buckets & The Hash Function

The basic idea of a hash table is to allow us to efficiently (in O(1)) retrieve data that is indexed by a key.

As a quick refresher, this is what using a hash table looks like in Ruby:

To implement a hash table we need two components:

  • A place to store the table entries
  • A way to assign key/value pairs to a specific position (index) inside this data store

In other words, we need an array & a hash function.

Implementing a Simple Hash Function

The hash function is an important component of a hash table. This function transforms the key into an index that can be used to lookup or update its associated value.

ruby hash tables

This is the big difference between plain arrays & hash tables.

In an array, you can only access values via their index and the index can only be a number. In a hash table, you access values via their key & the key can be anything (string, symbol, integer…), as long as you can write a hash function for it.

You can write a simple hash function for strings by converting all the letters into their ASCII values then adding them up.

Here is an example:

In this method we use to_s to make sure we are working with a string. This will help us avoid ‘undefined method’ errors. Then a combination of chars (to convert the string into an Array of its characters) & inject for adding up the values.

Inside the block I used the ord method to convert the characters into their ordinal values.

Finally, I used the modulo operator % to make sure the resulting value fits into the array. We call each entry in this array a ‘bucket’.

Bucket Distribution

Ideally we want all our buckets to be filled evenly, this will give us the best performance when we need to retrieve a value.

Let’s look at what happens when we test our hash function using this code:

This produces the following results:

It looks like our keys are evenly distributed…

…but what happens if we ramp up the number of buckets?

In this example I used a bucket size of 128 (it was 32 before):

That doesn’t look like a great distribution anymore!

What’s going on?

The problem is that our hash function is not good enough because all the strings of the same size stay in a certain range. That’s why we have a lot of empty buckets in the middle.

A Better Hash Function

We need a better way to convert our string into an index. Let’s take a look at one possible implementation.

What’s happening here is bit shifting (with the >> & << operators). The values are combined using the “exclusive OR operator” (^).

This bit shifting will mix things up, which will give us a better key distribution. Not perfect, but it’s better than our simple ASCII-based function 🙂

If you want a proper hash function you would be looking at something like MurmurHash, which I believe is what Ruby uses.

Handling Collisions

We don’t have a useful hash table yet.

Why?

Well, you may have noticed that when two keys hash to the same index they will overwrite the old value, and that’s not good!

This is called a hash collision & there are a few strategies to deal with these.

For example:

  • Double Hashing
  • Linear probing
  • Separate chaining

Let’s take a look at separate chaining, where we use a linked-list to store the entries for a particular “bucket”.

So if we assume that :abc & :ccc hash to the same index, our hash table would look something like this:

Then we will need a linear search to find the correct key.

This will have an impact on our performance because our lookup time can slowly degrade towards O(n), instead of the expected O(1).

If you are not familiar with this O(something) notation that’s called “Big-O Notation“.

To avoid the linked list from growing too deep & therefore degrade the performance of the hash table, it’s necessary to recreate the hash table using a bigger number of buckets.

Ruby does this for you automatically, but still good to know.

Conclusion

The purpose of this article is not to have you writing a hash table implementation, but to help you understand how they actually work, so I hope that you found that interesting!

Don’t forget to share this post to keep the blog going 🙂

ruby pack unpack

Packing & Unpacking: A Guide to Reading Binary Data in Ruby

Working with text is a lot easier than working with binary data…

…with text you can use regular expressions & methods like scan, match & gsub.

But if you want to work with binary data there is some extra work to do. That’s where the Array#pack & String#unpack methods come into play.

Let me show you some examples, starting with just a plain string & then moving on to more interesting stuff.

String to ASCII Values

This will convert every character in the string into a decimal value:

Notice the "c*" argument for unpack.

This is a “format string” which tells unpack what to do with the data. In this case, c means take one character & convert it into an integer value (the String#ord method also does this).

The asterisk * just says “repeat this format for all the input data”.

Hex to Integer

This format string takes 4 bytes of data & returns an integer. One thing to notice is that these bytes are in “little-endian” format.

Examples:

I used first here because unpack returns an array.

Binary File Parsing

How do you read a binary file like an EXE, PNG or GZIP?

If you treat these like strings you will just see something that looks like random data…

ruby string pack

…but there is a documented structure for most of these file formats & the unpack method is what you would use to read that data and convert it into something useful.

Here is an example:

In this example, the binary data (represented in hexadecimal, which is way more compact than 1s & 0s) has a two-byte (16 bit) length field that contains the length of the following string. Then there is the string itself.

It is very common for binary files & binary network protocols to have a “length” field.

This tells the parser exactly how many bytes should be read (and yes, I know in this example I read both the length & the data in one step, that’s just to keep things simple).

There is also the bindata gem, which is built specifically to help you parse binary structures.

Here is an example:

Notice the read_length parameter. This will tell bindata to work out the length from the field, so this will save you a lot of work 🙂

So if you want to write a parser for any binary format, these are the steps:

  1. Find the specification for this format (if it’s not public you will have to reverse-engineer it, which is an entire topic on its own)
  2. Write a bindata class for every section of the file (you will usually find a header section first with metadata & then multiple data sections)
  3. Read the data & process it however you want (for example, in a PNG you could change the colors of the image)
  4. Profit!

If you want to see a full example of bindata in action take a look at my PNG parser on github.

Base64 Encoding

There is this type of encoding called “Base64”. You may have seen it before on a URL.

Looks something like this:

The double equals at the end is usually the tell-tale sign that you are dealing with Base64, although some inputs can result in the equals signs not being there (they are used as padding).

So why I’m telling you this, besides being a useful thing to know in itself?

Well it turns out that you can convert a string into Base64 using the pack method.

As you can see here:

In fact, this is the exact method used in the Base64 module from the standard library 🙂

Summary

In this post you learned about the pack & unpack methods, which help you work with binary data. It can be used to parse binary files, convert a string into ASCII values & Base64 encoding.

Don’t forget to share & subscribe so you can enjoy more blog post like this! 🙂

ruby tracepoint

How To Spy on Your Ruby Methods

Ruby has a built-in tracing system which you can access using the TracePoint class. Some of the things you can trace are method calls, new threads & exceptions.

Why would you want to use this?

Well, it could be useful if you want to trace the execution of a certain method. You will be able to see what other methods are being called & what are the return values.

Let’s see a few examples!

Tracing Method Calls

Most of the time you will want TracePoint to trace application code & not built-in methods (like puts, size, etc).

You can do this using the call event.

Example:

This prints the file path, the line number, the event name & the method name.

If you don’t specify any events Ruby will call your block for all of them, resulting in more output. So I would recommend that you focus on specific events to find what you want faster 🙂

Here’s is a table of TracePoint events:

Event name Description
call Application methods
c_call C-level methods (like puts)
return Method return (for tracing return values & call depth)
b_call Block call
b_return Block return
raise Exception raised
thread_begin New thread
thread_end Thread ending

TracePoint + Graphviz

Many methods will make more than just 3 methods calls, especially in framework code, so the output from Tracepoint can be hard to visualize.

So I made a gem that lets you create a visual call graph like this:

This generates a call_graph.png file with the results.

ruby call graph

Keep in mind that this is not static analysis, this will actually call the method!

Showing File Paths

Would you like to know where these methods are defined?

Don’t worry, I got you covered! I added an option you can enable to show the file path for each method call.

Which results in:

visual call graph

If you want to see some massive call graphs you just have to trace some Rails methods 😉

Return Values

In the intro I mentioned that you can also get return values. For this you will need to trace the return event and use the return_value method.

Example:

This will print:

Events First

Someone asked on reddit how it’s possible to avoid having the word “bar” printed when calling the foo method in the following code:

There are many ways to achieve this, like prepending a module, redirecting $stdout or redefining the bar method.

If you are feeling creative, comment on this post with your own idea!

But I found one of the answers particularly interesting because it used the TracePoint class.

Here it is:

This code will call exit when the method bar is called, which prevents the string from being printed by ending the program.

Probably not something you want to use in real code, but it proves one thing about TracePoint: Events are triggered before they happen.

Something to keep in mind if you are going to build some sort of tool around this 🙂

Summary

In this post you learned about the TracePoint class, which allows you to trace a few events like methods calls or new threads. This can be useful as a debugging tool or for code exploration.

Remember to share this post so more people can enjoy it 🙂

ruby 2.4 features

9 New Features in Ruby 2.4

It has become a tradition to release new Ruby versions on Christmas.

And in this post I want to cover some of the most interesting changes in the next version so you can keep up with the news 🙂

Float#round with keyword

If you are using floats in your app I hope you use floor or ceil for rounding, because the Float#round method is changing its default behavior in Ruby 2.4.

Example:

The default behavior is now to “round-to-nearest-even”. In addition, Float#round now takes an argument which you can use to define the type of rounding you want.

The options are:

  • :even
  • :up
  • :down

Example:

Also Float#floor, Float#ceil & Float#truncate now take an optional argument that lets you set a precision.

Note: Apparently the default behaviour for Float#round is going back to rounding up for the final version of Ruby 2.4. This was a decision taken by Matz after this post was published.

Chomp flag for IO methods

If you ever used a method like gets or each_line I’m sure you remember having to deal with those pesky new line characters.

The ones that looks like this: \n.

This is something beginners always have trouble with, but this new feature might be able to help!

Here’s an example:

Now in Ruby 2.4 you will be able to set the chomp keyword argument & gets will remove the newline character for you.

Example:

This will work with any value that’s not false or nil (the only “falsy” values in Ruby).

So this also works (but not recommended because it can be confusing):

It’s not a super big deal, but it can save you a method call 🙂

Pathname#empty?

Ruby 2.4 implements Dir#empty? & File#empty?, these methods let you check if a directory or a file is empty (that was pretty obvious, right?).

But Pathname#empty? was also added recently.

In case you are not familiar with Pathname, it’s a class that merges together functionality from both the Dir class & the File class.

In addition, it’s also more “OO” (Object Oriented) in the sense that it returns Pathname objects, instead of strings.

Example:

Commit: https://github.com/ruby/ruby/commit/9373c5efb993dd8cae0526118805449b19af2c22

Set compare_by_identity

Sets are a data structure that is available as part of the standard library. It helps you keep a collection of unique items.

By default the objects are compared based on their value (or to be more accurate, based on their hash value).

But in Ruby 2.4 it’s possible to have a set of unique objects based on their object id.

Example:

If anyone knows of any interesting use for this feature leave a comment 🙂

Commit: https://github.com/ruby/ruby/commit/76977611dd68e384fdce8c546efda5e1931e67a6

Refinements with Kernel#send, BasicObject#send, Symbol#to_proc

I’m not going to cover refinements in detail here, but the basic idea is to be able to add methods to a class like String, while keeping this change “localized” to one file or class.

Since Ruby 2.4, methods defined via refinements will be available when called via methods like Kernel#send & Symbol#to_proc.

Example:

If you try this in 2.3 or below you will get an ‘undefined method’ error.

Commit: https://github.com/ruby/ruby/commit/35a29390197750abf97ef16fa0740e377764daef

Hash#transform_values

Here’s another method extracted from Rails & coming directly to Ruby. I’m talking about Hash#transform_values, which works in a similar way to Array#map.

Example:

There is also Hash#transform_values! if you need in-place mutation.

Kernel#clone now takes an optional keyword argument

As you may know, it’s possible to make a copy of a Ruby object. This is useful because most Ruby objects are mutable & you may want to avoid making changes to the original object.

We have two methods for making an object copy:

  • clone
  • dup

There are a few small differences between clone & dup, but for this post let’s just say that clone keeps the “frozen” status of the original object while dup does not.

New in 2.4 is the ability to call clone with a “freeze” flag.

Example:

I’m not sure to what extend this is useful, but who doesn’t want more options 🙂

Commit: https://github.com/ruby/ruby/commit/320ae01c5fb091eab0926c186f304a9caeda1ace

Thread.report_on_exception

Another feature coming with 2.4 is Thread.report_on_exception. This was proposed because Thread exceptions are silent by default, and this can hide problems with your code.

The default value is false, but if your app is using Threads you should try enabling this when upgrading to Ruby 2.4.

Example:

This will show the exception, but it will not crash your program. The alternative is Thread.abort_on_exception, which has always been available.

Notice that there is also an instance version of this method, so you can set this option per-thread instead of all threads.

Binding#irb

Are you a fan of using binding.pry for debugging? Well now we also have binding.irb which works in a similar way.

But since it’s irb you still don’t get all the nice things that pry gives you, like the syntax highlighting.

Better than nothing if for some reason you don’t have access to pry 🙂

Commit: https://github.com/ruby/ruby/commit/493e48897421d176a8faf0f0820323d79ecdf94a

Conclusion

This new version of Ruby brings in many interesting features, so make sure to check it out when it’s available.

Don’t forget to share this post so more people can learn about these new features in Ruby 2.4!

Also join my newsletter so you never miss a post, 1.900+ smart Ruby developers have already joined.

1 2 3 18