ruby Archives - Black Bytes


Writing a Shell in 25 Lines of Ruby Code

July 12, 2016 /
By Jesus Castello /
2 COMMENTS

If you use Linux or Mac, every time you open a terminal you are using a shell application. A shell is just an interface that helps you execute commands in your system.

In addition to that, the shell also hosts environment variables & has useful features like a command history and auto-completion.

If you are the kind of person that likes to learn how things work under the hood, this post will be perfect for you!

How Does a Shell Work?

To build our own shell application let’s think about what a shell really is: first, there is a prompt, usually with some extra information like your current user & current directory, then you type a command & when you press enter the results are displayed on your screen.

Yeah, that sounds pretty basic, but doesn’t this remind you of something?

If you are thinking of pry then you are right! A shell is basically a REPL (Read-Eval-Print Loop) for your operating system.

Knowing that, we can write the first version of our shell:

prompt = "> "

print prompt

while (input = gets.chomp)
  break if input == "exit"

  system(input)
  print prompt
end

This will give us a minimal, but functional shell. We can improve this by using a library that many other REPL-like applications use. That library is called Readline.
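One gap worth knowing about before moving on: gets returns nil at end-of-input (CTRL + D), so calling chomp on it directly raises an error. Here is a minimal sketch of the same loop that handles that case. The run_shell name and its keyword arguments are made up for this example, and handing each command to a block is just one way to keep the loop easy to test.

```ruby
# Hypothetical helper: reads commands until "exit" or end-of-input,
# and hands each non-empty command to the given block.
def run_shell(input: $stdin, output: $stdout, prompt: "> ")
  loop do
    output.print prompt

    line = input.gets        # nil at end-of-input (CTRL + D)
    break if line.nil?

    command = line.chomp
    break if command == "exit"
    next  if command.empty?  # nothing to run on a blank line

    yield command
  end
end
```

In a real terminal you would call it as run_shell { |command| system(command) }.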

Using The Readline Library

Readline is part of the Ruby Standard Library, so there is nothing to install, you just need to require it.

One of the advantages of using Readline is that it can keep a command history automatically for us. It can also take care of printing the command prompt & many other things.

Here is v2 of our shell, this time using Readline:

require 'readline'

while input = Readline.readline("> ", true)
  break if input == "exit"

  system(input)
end

This is great, we got rid of the two print calls for the prompt & now we have access to some powerful capabilities from Readline. For example, we can use keyboard shortcuts to delete a word (CTRL + W) or even search the history (CTRL + R)!

Let’s add a new command to print the full history:

require 'readline'

while input = Readline.readline("> ", true)
  break if input == "exit"

  # "hist" is our own command, so don't pass it on to system
  if input == "hist"
    puts Readline::HISTORY.to_a
    next
  end

  # Remove blank lines from history
  if input == ""
    Readline::HISTORY.pop
    next
  end

  system(input)
end

Fun fact: If you try this code in pry you will get pry’s command history! The reason is that pry is also using Readline, and Readline::HISTORY is shared state.

Now you can type hist to get your command history

Adding Auto-Completion

Thanks to the auto-completion feature of your favorite shell you will be able to save a lot of typing. Readline makes it really easy to integrate this feature into your shell.

Let’s start by auto-completing commands from our history.

Example:

comp = proc { |s| Readline::HISTORY.grep(/^#{Regexp.escape(s)}/) }

Readline.completion_append_character = " "
Readline.completion_proc = comp

## rest of the code goes here ##

With this code you should be able to auto-complete previously typed commands by pressing the <tab> key. Now let’s take this a step further & add directory auto-completion.

Example:

comp = proc do |s|
  directory_list = Dir.glob("#{s}*")

  if directory_list.size > 0
    directory_list
  else
    Readline::HISTORY.grep(/^#{Regexp.escape(s)}/)
  end
end

The completion_proc returns the list of possible candidates. In this case we just need to check whether the typed string is the start of a file or directory name by using Dir.glob. Readline will take care of the rest!
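If you would rather offer both kinds of candidates at once instead of falling back from one to the other, the lookup can be pulled into a small helper. This is only a sketch: completion_candidates is a made-up name, and the history is passed in as a plain array so the method can be tried without a terminal.

```ruby
# Hypothetical helper: merge file-system matches with history matches.
# Note that characters like * or ? in the prefix are treated as glob patterns.
def completion_candidates(prefix, history)
  paths        = Dir.glob("#{prefix}*")
  from_history = history.grep(/^#{Regexp.escape(prefix)}/)

  (paths + from_history).uniq
end
```

To wire it in you could set Readline.completion_proc = proc { |s| completion_candidates(s, Readline::HISTORY.to_a) }.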

Implementing The System Method

Now you should have a working shell, with history & auto-completion. Not too bad for 25 lines of code!

But there is something that I want to dig deeper into, so you can get some insights on what is going on behind the scenes of actually executing a command.

This is done by the system method. In C, this function just sends your command to /bin/sh, which is a shell application. Let’s see how you can implement a simplified version of it in Ruby.

Note: This will only work on Linux / Mac

The system method:

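def system(command)
  pid = fork {
    exec(command)
  }

  # Wait for the child, otherwise the prompt comes back
  # before the command has finished
  Process.wait(pid)
end

What happens here is that fork creates a new copy of the current process, then that copy is replaced by the command we want to run via the exec method. Meanwhile, the original process waits for the copy to finish. This is a very common pattern in Linux programming.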

If you don’t fork then the current process is replaced, which means that when the command you are running (ls, cd or anything else) is done then your Ruby program will terminate with it.

You can see that happening here:

def system(command)
  exec(command)
end

system('ls')

# This code will never run!
puts "after system"
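The fork + exec pattern can be taken one step further by reporting how the command ended. Here is a minimal sketch that waits for the child process and returns its exit status. The run_command name is made up (so it doesn’t shadow Ruby’s own system), and like the rest of this section it only works on Linux / Mac.

```ruby
# Hypothetical helper: run a command in a child process and
# return its exit status (0 usually means success).
def run_command(command)
  pid = fork { exec(command) }

  Process.wait(pid)  # block until the child is done; this also sets $?
  $?.exitstatus
end
```

For example, run_command("ls") prints the directory listing and then returns the status that ls exited with.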

Conclusion

In this post you learned that a shell is a REPL-like interface (think irb / pry) for interacting with your system. You also learned how to build your own shell by using the powerful Readline library, which provides many built-in features like history & auto-completion (but you have to define how that works).

And after that you learned about the fork + exec pattern commonly used in Linux programming projects.

If you enjoyed this post could you do me a favor & share it with all your Ruby friends? It will help the blog grow & more people will be able to learn

Building Your Own Linux Tools with Ruby: A Practical Guide

June 27, 2016 /
By Jesus Castello /
11 COMMENTS

Tools like ps, top & netstat are great. They give you a lot of information about what’s going on in your system.

But how do they work? Where do they get all their information from?

In this post we will recreate three popular Linux tools together. You are going to get a 2×1 meal, learn Ruby & Linux at the same time!

Finding Status Information

So let’s try answering the question of where all these tools find their info. The answer is in the /proc filesystem.

If you look inside the /proc directory you will see a bunch of directories & files, just like in any other directory on your computer. But these aren’t real files. They are just a way for the Linux kernel to expose data to users.

It’s very convenient because they can be treated like normal files, which means that you can read them without any special tools. In the Linux world a lot of things work like this, if you want to see another example take a look at the /dev directory.

Now that we understand what we are dealing with, let’s take a look at the contents of the /proc directory…

104

105

11015

11469

11474

11552

11655

This is just a small sample, but you can quickly notice a pattern. What are all those numbers? Well, it turns out these are PIDs (Process IDs). Every entry contains info about a specific process.

If you run ps you can see how every process has a PID associated with it:

PID TTY TIME CMD

15952 pts/5 00:00:00 ps

22698 pts/5 00:00:01 bash

From this we can deduce that what ps does is just iterate over the /proc directory & print the info it finds.

Let’s see what is inside one of those numbered directories:

attr

autogroup

auxv

cgroup

clear_refs

cmdline

comm

cpuset

cwd

environ

exe

That’s just a sample to save space, but I encourage you to take a look at the full list.

Here are some important / interesting entries:

Entry	Description
comm	Name of the program
cmdline	Command used to launch this process
environ	Environment variables that this process was started with
status	Process status (running, sleeping…) & memory usage
fd	Directory that contains file descriptors (open files, sockets…)

Now that we know this we should be able to start writing some tools!
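Two of these entries need a little parsing before they are useful: cmdline separates its arguments with NUL bytes, and status is a list of "Key:<tab>value" lines. Here is a small sketch of both. The helper names are made up, and they take the file contents as a string, so you would feed them something like File.read("/proc/self/status").

```ruby
# Hypothetical helper: split the raw contents of /proc/<pid>/cmdline
# into the program name & its arguments.
def parse_cmdline(raw)
  raw.split("\0")
end

# Hypothetical helper: turn the contents of /proc/<pid>/status
# into a Hash like { "Name" => "bash", "State" => "S (sleeping)" }.
def parse_status(text)
  text.each_line.each_with_object({}) do |line, fields|
    key, value = line.chomp.split(":", 2)
    fields[key] = value.strip if value
  end
end
```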

Process Listing

Let’s start by just getting a list of all the directories under /proc. We can do this using the Dir class.

Example:

Dir.glob("/proc/[0-9]*")

Notice how I used a number range. The reason is that there are other entries under /proc that we don’t care about right now; we only want the numbered directories.

Now we can iterate over this list and print two columns, one with the PID & another with the program name.

Example:

pids = Dir.glob("/proc/[0-9]*")

puts "PID\tCMD"
puts "-" * 15

pids.each do |pid|
  # comm ends with a newline, so strip it
  cmd = File.read(pid + "/comm").chomp
  pid = pid.scan(/\d+/).first

  puts "#{pid}\t#{cmd}"
end

And this is the output:

PID CMD

---------------

1 systemd

2 kthreadd

3 ksoftirqd/0

5 kworker/0

7 migration/0

8 rcu_preempt

9 rcu_bh

10 rcu_sched

Hey, it looks like we just made ps! Yeah, it doesn’t support all the fancy options from the original, but we made something work.

Who Is Listening?

Let’s try to replicate netstat now. This is what the output looks like (with -ant as flags):

Active Internet connections (servers and established)

Proto Recv-Q Send-Q Local Address Foreign Address State

tcp 0 0 127.0.0.1:5432 0.0.0.0:* LISTEN

tcp 0 0 192.168.1.82:39530 182.14.172.159:22 ESTABLISHED

Where can we find this information? If you said “inside /proc” you’re right! To be more specific you can find it in /proc/net/tcp.

But there is a little problem, this doesn’t look anything like the netstat output!

   0: 0100007F:1538 00000000:0000 0A 00000000:00000000 00:00000000 00000000 1001 0 9216
   1: 2E58A8C0:9A6A 9FBB0EB9:0016 01 00000000:00000000 00:00000000 00000000 1000 0 258603

What this means is that we need to do some parsing with regular expressions. For now let’s just worry about the local address & the status.

Here is the regex I came up with:

\s+\d+: (?<local_addr>\w+):(?<local_port>\w+) \w+:\w+ (?<status>\w+)

This will give us some hexadecimal values that we need to convert into decimal. Let’s create a class that will do this for us.

class TCPInfo
  LINE_REGEX = /\s+\d+: (?<local_addr>\w+):(?<local_port>\w+) \w+:\w+ (?<status>\w+)/

  def initialize(line)
    @data = parse(line)
  end

  def parse(line)
    line.match(LINE_REGEX)
  end

  def local_port
    @data["local_port"].to_i(16)
  end

  # Convert hex to regular IP notation
  def local_addr
    decimal_to_ip(@data["local_addr"].to_i(16))
  end

  STATUSES = {
    "0A" => "LISTENING",
    "01" => "ESTABLISHED",
    "06" => "TIME_WAIT",
    "08" => "CLOSE_WAIT"
  }

  def status
    code = @data["status"]

    STATUSES.fetch(code, "UNKNOWN")
  end

  # Don't worry too much about this :)
  # On little-endian machines (like x86) the address bytes are stored
  # in reverse order (0100007F is 127.0.0.1), so read the lowest byte first
  def decimal_to_ip(decimal)
    ip = []

    ip << (decimal & 0xFF)
    ip << (decimal >> 8 & 0xFF)
    ip << (decimal >> 16 & 0xFF)
    ip << (decimal >> 24 & 0xFF)

    ip.join(".")
  end
end

The only thing left is to print the results in a pretty table format.

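require 'table_print'

# Skip the header line & wrap every row in a TCPInfo object
connections = File.readlines("/proc/net/tcp").drop(1).map { |line| TCPInfo.new(line) }

tp connections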

Example output:

STATUS | LOCAL_PORT | LOCAL_ADDR

------------|------------|--------------

LISTENING | 5432 | 127.0.0.1

ESTABLISHED | 39530 | 192.168.88.46

Yes, this gem is awesome!

I just found out about it & it looks like I won’t have to fumble around with ljust / rjust again.
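If you want to check the parsing by hand, here is the regex from this section applied to the first sample row of /proc/net/tcp. It is only a sketch, and it assumes a little-endian machine (x86 & most ARM systems), where the address bytes show up in reverse order.

```ruby
line_regex = /\s+\d+: (?<local_addr>\w+):(?<local_port>\w+) \w+:\w+ (?<status>\w+)/

# Sample row copied from the /proc/net/tcp output shown earlier
line = "   0: 0100007F:1538 00000000:0000 0A 00000000:00000000 00:00000000 00000000 1001 0 9216"

match = line.match(line_regex)

# The port is plain hexadecimal: 1538 in hex is 5432 in decimal
port = match[:local_port].to_i(16)

# Assumption: little-endian byte order, so read the hex pairs back to front
addr = match[:local_addr].scan(/../).reverse.map { |byte| byte.to_i(16) }.join(".")

puts "#{addr}:#{port} (state code #{match[:status]})"
```

The state code 0A is the one netstat reports as LISTEN.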

Stop Using My Port!

Have you ever seen this message?

Address already in use - bind(2) for "localhost" port 5000

Umm… I wonder what is using that port…

fuser -n tcp -v 5000

PORT USER PID ACCESS CMD

5000/tcp: blackbytes 30893 F.... nc

Ah, so there is our culprit! Now we can stop this program if we don’t want it to be running & that will free our port. How did the fuser program find out who was using this port?

You guessed it! The /proc filesystem again.

In fact, it combines two things we have covered already: walking through the process list & reading active connections from /proc/net/tcp.

We just need one extra step:
Find a way to match the open port info with the PID.

If we look at the TCP data that we can get from /proc/net/tcp, the PID is not there. But we can use the inode number.

“An inode is a data structure used to represent a filesystem object.” – Wikipedia

How can we use the inode to find the matching process? If we look under the fd directory of a process that we know has an open port, we will find a line like this:

/proc/3295/fd/5 -> socket:[12345]

The number between brackets is the inode number. So now all we have to do is iterate over all the files & we will find the matching process.

Here is one way to do that:

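# socket_inode & hex_port come from the matching row in /proc/net/tcp
fd_path =
  Dir.glob("/proc/[0-9]*/fd/*").find do |fd|
    File.readlink(fd).include? "socket:[#{socket_inode}]" rescue nil
  end

abort "No process found for that socket" unless fd_path

pid  = fd_path.scan(/\d+/).first
name = File.readlink("/proc/#{pid}/exe")

puts "Port #{hex_port.to_i(16)} in use by #{name} (#{pid})"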

Example output:

Port 5432 in use by /usr/bin/postgres (474)

Please note that you will need to run this code as root or as the process owner. Otherwise you won’t be able to read the process details inside /proc.
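The code above assumes you already know the socket inode. Here is one way it might be looked up: in the sample rows from /proc/net/tcp, the inode is the 10th whitespace-separated field (9216 for port 5432). The socket_inode_for name is made up for this sketch, and the method takes the file contents as a string so it is easy to try out.

```ruby
# Hypothetical helper: find the socket inode for a local port.
# tcp_table is the text of /proc/net/tcp, header line included.
def socket_inode_for(tcp_table, port)
  hex_port = format("%04X", port)  # ports are stored as 4 hex digits

  tcp_table.each_line.drop(1).each do |line|
    fields     = line.split
    local_port = fields[1].to_s.split(":").last

    return fields[9] if local_port == hex_port
  end

  nil  # nothing is using this port
end
```

You could call it as socket_inode_for(File.read("/proc/net/tcp"), 5432).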

Conclusion

In this post you learned that Linux exposes a lot of data via the virtual /proc filesystem. You also learned how to recreate popular Linux tools like ps, netstat & fuser by using the data under /proc.


Ruby Ranges: How Do They Work?

June 14, 2016 /
By Jesus Castello /
5 COMMENTS

Have you ever wondered how ranges work in Ruby?

Even if you haven’t, isn’t it fun to discover how things work under the hood?

That’s exactly what I’m going to show you in this post.

Understanding Ranges

Just as a reminder, this is what a Ruby range looks like:

(1..20)

The parentheses are not necessary to define a Range, but if you want to call methods on your range you will need them (otherwise you are calling the method on the 2nd element of the range, instead of the range itself).

The Range class includes Enumerable, so you get all the powerful iteration methods without having to convert the range into an array.

Range has some useful methods, like the step method.

Example:

(10..20).step(2).to_a
# [10, 12, 14, 16, 18, 20]

Other Range methods to be aware of are cover? & include?. It would be a mistake to think that they do the same thing, because they don’t.

The include? method just does what you would expect: it checks for inclusion inside the range. For a range of strings this is equivalent to expanding the Range into an Array and checking if something is in there. (Numeric ranges are a special case, for those include? behaves like cover?.)

But cover? is different. All it does is check against the initial & ending values of the range (begin <= obj <= end), which can yield unexpected results.

Example:

('a'..'z').include? "cc" # false
('a'..'z').cover? "cc"   # true

The cover? example is equivalent to:

"a" <= "cc" && "cc" <= "z"

The reason this returns true is that strings are compared character by character. Since “a” comes before “c”, the characters that come after the “c” don’t matter.
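The endpoint-only check is not just a gotcha, it is also what makes cover? handy for things like dates, where you only care whether a value falls between two limits. Here is a short sketch with made-up values. It also shows that a numeric range only compares the endpoints, even when you call include?.

```ruby
require 'date'

numbers = (1..10)

# 5.5 is not one of the ten integers in the expanded array...
in_array = numbers.to_a.include?(5.5)  # false

# ...but a numeric range only compares against its endpoints
in_range = numbers.include?(5.5)       # true

# cover? answers "is this date inside the quarter?" without iterating
first_quarter  = Date.new(2016, 1, 1)..Date.new(2016, 3, 31)
in_quarter     = first_quarter.cover?(Date.new(2016, 2, 14))  # true
not_in_quarter = first_quarter.cover?(Date.new(2016, 4, 1))   # false
```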

Range Implementation

Ranges are not limited to numbers & letters. You can use any objects as long as they implement the following methods: <=> and succ.

For example, here is a time range:

require 'date'

# DateTime.new without arguments is not the current time, so use now
t1 = DateTime.now
t2 = t1 + 30

next_30_days = t1..t2

# Example use
next_30_days.select(&:friday?).map(&:day)

So how does this work? Let’s take a look at this implementation:

def range(a, b)
  # if the first element is bigger than the second
  # then this isn't a sequential range
  return [] if a > b

  out = []

  # advance until the first element is the same
  # as the second one
  while a != b
    out << a
    a = a.next
  end

  # add last element (inclusive range)
  # this also returns the results via implicit return
  out << a
end

p range 1, 10
p range 'a', 'z'

I added some comments to help you understand what is going on. The idea is that we keep calling the next method on the first object until it is equal to the second one. The assumption is that they will eventually meet.

Custom Class Ranges

Most of the time you will be using number & character ranges, but it’s still good to know how you can use ranges in a custom class.

Example:

class LetterMultiplier
  attr_reader :count

  include Comparable

  def initialize(letter, count)
    @letter = letter
    @count = count
  end

  def succ
    self.class.new(@letter, @count + 1)
  end

  def <=>(other)
    count <=> other.count
  end
end

a = LetterMultiplier.new('w', 2)
b = LetterMultiplier.new('w', 8)

# Print array with all the items in the range
p Array(a..b)

The key here is to make sure that you implement the <=> & succ methods correctly. If you want to use the include? method you need to include the Comparable module, which adds methods like ==, <, and > (all based on the results of the <=> method).
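To make the result easier to read, here is a self-contained sketch of the same idea with a to_s method added (that method and the letter reader are additions for this example, they are not required for ranges to work).

```ruby
class LetterMultiplier
  include Comparable

  attr_reader :letter, :count

  def initialize(letter, count)
    @letter = letter
    @count  = count
  end

  # Range uses succ to walk from the first element to the last
  def succ
    self.class.new(letter, count + 1)
  end

  # Range & Comparable use <=> to compare elements
  def <=>(other)
    count <=> other.count
  end

  # Added for this example: 'w' with a count of 3 becomes "www"
  def to_s
    letter * count
  end
end

range = LetterMultiplier.new('w', 2)..LetterMultiplier.new('w', 4)

words   = range.map(&:to_s)                         # ["ww", "www", "wwww"]
covered = range.cover?(LetterMultiplier.new('w', 3)) # true
```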

Conclusion

In this article you have learned how ranges work in Ruby so you can understand them better & implement your own objects that support range operations.


5 Useful Examples From The Ruby Standard Library

May 16, 2016 /
By Jesus Castello

The Ruby Standard Library is a series of modules & classes that come with Ruby but are not part of the language itself.

These classes offer a variety of utilities like Base64 encoding, prime number generation & DNS resolution.

In this article I’m going to show you 5 of these classes with useful examples.

Unique Items with Set

Set is a data structure that guarantees unique items.

Here is an example:

require 'set'

unique_words = Set.new

unique_words << 'abc'

unique_words << 'ccc'

unique_words << 'abc'

p unique_words.entries

# ["abc", "ccc"]

The implementation of this data structure is based on the Hash class, so there is no Array-like indexing.

But if you just need a bucket of data where all elements need to be unique all the time, then the Set might be what you are looking for.
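One place where this comes in handy is spotting duplicates as you go. Here is a minimal sketch using Set#add?, which returns nil when the item is already in the set (the word list is made up for the example).

```ruby
require 'set'

seen       = Set.new
duplicates = []

%w[abc ccc abc].each do |word|
  # add? returns nil if the word was already in the set
  duplicates << word unless seen.add?(word)
end

p duplicates  # ["abc"]
p seen.size   # 2
```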

Logging Messages

If you need to log some error / debug message Ruby has you covered with the Logger class. This class provides everything you need to start logging.

In fact, this class is what Rails uses by default.

To use the Logger class you can simply create a Logger object & give it an output stream (or a file name) as a parameter. Then you can use the different logging levels to register your message.

The logging levels are: debug, info, warn, error & fatal.

Example:

require 'logger'

logger = Logger.new(STDOUT)

logger.info 'testing...'

logger.warn 'fun with Ruby :)'

This produces the following output:

I, [2016-05-14T15:50:21.367590 #12148] INFO -- : testing...
W, [2016-05-14T15:50:21.846651 #12148] WARN -- : fun with Ruby :)

The first character is an abbreviated form of the logging level (I for Info, W for Warn)… Then you have a timestamp, and the current process id (which you can get in Ruby using Process.pid).

Finally, you have the full logging level & the actual message. You can change this format by providing a new formatter.
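Here is a minimal sketch of a custom formatter. A formatter is any object that responds to call with four arguments (severity, time, program name & message). The output format below is just an example, and a StringIO is used as the output stream so the result is easy to inspect.

```ruby
require 'logger'
require 'stringio'

output = StringIO.new
logger = Logger.new(output)

# Keep only the level & the message
logger.formatter = proc do |severity, time, progname, msg|
  "#{severity}: #{msg}\n"
end

logger.info 'testing...'
logger.warn 'fun with Ruby :)'

puts output.string
# INFO: testing...
# WARN: fun with Ruby :)
```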

Working with Prime Numbers

Maybe you don’t need to deal with prime numbers on a day-to-day basis, but it’s still good to know (especially if you like programming challenges) that Ruby has good support for them via the Prime class.

When you require prime it will add the prime? method to Integer.

require 'prime'

5.prime? # true

11.prime? # true

20.prime? # false

This class also includes a prime number generator.

Example:

Prime.take(10)
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
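Two more things from the same library that tend to come up in programming challenges are listing the primes up to a limit & splitting a number into its prime factors. A quick sketch, with numbers picked just for the example:

```ruby
require 'prime'

# All primes up to (and including) the given limit
small_primes = Prime.each(20).to_a
# [2, 3, 5, 7, 11, 13, 17, 19]

# Prime factors as [prime, exponent] pairs: 60 = 2**2 * 3 * 5
factors = Prime.prime_division(60)
# [[2, 2], [3, 1], [5, 1]]
```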

Using StringIO

The StringIO class allows you to create a string that behaves like an IO object. This means that you can work with this string as if you were reading from a file or STDIN (Standard Input).

Here is an example:

require 'stringio'

io = StringIO.new

io << 'test'

io << 'code'

puts io.string

# "testcode"

Notice a few things: when you add data into your StringIO object it will not add spaces or newlines for you, and to get the actual string you need to call the string method. Also you can’t use array indexing to access individual characters, like you can with a regular string.

When is this useful? Well, sometimes you may want to substitute a file or some other IO object for another object that you have more control over. For example, in a testing environment you can replace STDOUT with a StringIO object.

You can see a real world example from Rails here: https://github.com/rails/rails/blob/52ce6ece8c8f74064bb64e0a0b1ddd83092718e1/activesupport/test/logger_test.rb
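Here is a minimal sketch of that testing idea: temporarily point $stdout at a StringIO, run some code & hand back whatever it printed. The capture_stdout name is made up for this example.

```ruby
require 'stringio'

# Hypothetical helper: returns everything the block printed to $stdout
def capture_stdout
  original = $stdout
  $stdout  = StringIO.new

  yield

  $stdout.string
ensure
  # Always restore the real output stream, even if the block raises
  $stdout = original
end

captured = capture_stdout { puts "hello" }
# captured is now "hello\n"
```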

Working with Paths

The Pathname class wraps several file-system exploring utilities, like Dir & File, in a much more powerful class.

While the method names are the same, they return Pathname objects instead of strings or arrays. And what this means is that you can keep working with the results with all the file-related methods.

This is a great example:

require 'pathname'

Pathname.glob("*").count(&:directory?)

If you tried to do this with Dir, you would have to use this code, which is not as elegant.

Dir.glob("*").count { |d| File.directory?(d) }

Give this one a try the next time you need to do a file-system related task
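Pathname is also handy for building & taking apart paths without any string juggling. A short sketch (the path itself is made up, and nothing here touches the disk):

```ruby
require 'pathname'

# The + operator joins path segments for you
path = Pathname.new("/var/log") + "app" + "production.log"

full_path = path.to_s           # "/var/log/app/production.log"
file_name = path.basename.to_s  # "production.log"
extension = path.extname        # ".log"
directory = path.dirname.to_s   # "/var/log/app"
```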

Conclusion

I hope you found these examples useful! Make sure to explore the Standard Library a bit more so you can learn what it can do for you.

Don’t forget to share this post if you liked it

How to Generate Weighted Random Numbers

May 3, 2016 /
By Jesus Castello /
2 COMMENTS

Random numbers usually follow what we call a ‘uniform distribution’, meaning that there is the same chance that any of the numbers is picked.

But if you want some numbers to be picked more often than others you will need a different strategy: a weighted random number generator.

Some practical applications include:

the loot table in a video game, where enemies can drop different items with varying drop rates.
a raffle, where people with more tickets have more chances to win.

Simple Strategy

If you think about the raffle example you can come up with an obvious solution: generate an array which has one copy of the item for each ‘ticket’.

For example, if John buys 4 raffle tickets and David only buys 1, John will have 4 times more chances to win than David.

Here is a working implementation:

users = { john: 4, david: 1 }
raffle = []

# each (not map) because we only care about filling the raffle array
users.each do |name, tickets|
  tickets.times { raffle << name }
end

p raffle
# [:john, :john, :john, :john, :david]

p raffle.sample
# :john

I’m adding the person’s name once for every ticket they bought, and then I pick a random name from that list. Because a name is on the list more times, its chances of being picked increase.

I like this approach because it’s very simple and once you have your list it’s very fast to pick a winner.

Sum of Weights

There is another way you can do this that is more memory-efficient. The trade-off is that picking a random value is slower.

The idea is to pick a random number between 1 and the sum of all the weights, and then walk through the items, subtracting each weight from that number, until you reach an item whose weight is greater than or equal to what is left.

Here is the code:

def random_weighted(weighted)
  max    = sum_of_weights(weighted)
  target = rand(1..max)

  weighted.each do |item, weight|
    return item if target <= weight
    target -= weight
  end
end

def sum_of_weights(weighted)
  weighted.inject(0) { |sum, (item, weight)| sum + weight }
end

This code takes in a hash where the keys are the items and the values are the weights. You can call this method like this:

random_weighted(cats: 5, dogs: 1)
# :cats

You can test if this works as expected by looking at the distribution of the results after running it many times.

Here is an example:

counts = Hash.new(0)

def pick_number
  random_weighted(cats: 2, dogs: 1)
end

1000.times { counts[pick_number] += 1 }

p counts

Run this a few times and look at the output to see if the ratio is what it should be.
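If you want to see why the subtraction works, it can help to take the randomness out & walk through every possible target by hand. Here is a sketch of the same loop with the target passed in as an argument (pick_for_target is a made-up name for this example).

```ruby
# Hypothetical helper: same loop as before, but with a fixed target
def pick_for_target(weighted, target)
  weighted.each do |item, weight|
    return item if target <= weight
    target -= weight
  end

  nil  # the target was bigger than the sum of all the weights
end

weights = { cats: 2, dogs: 1 }

# Targets 1 & 2 land on :cats, target 3 lands on :dogs,
# which gives the expected 2 to 1 ratio
picks = (1..3).map { |target| pick_for_target(weights, target) }

p picks  # [:cats, :cats, :dogs]
```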

Conclusion

While there are more sophisticated algorithms, these two should serve you well. I hope you found this article useful, please share it with your friends so I can keep writing more!
