Updates to line.rb – Open to more suggestions

I was chatting with a friend the other day and he had some really good ideas concerning last week’s line.rb script. He wants to make an improved version, and I look forward to seeing what he comes up with. Well, I couldn’t help but implement his ideas immediately in my version. They also spurred a little creativity from me and I came up with a few new features as well. The usage now looks like:

  # Check Cmd Line Args and Print Usage if needed
  if ARGV.size <= 1
    puts "usage: line [options] filename numbers"
    puts " options:"
    puts "   --silent or -s   Just print the line, without [#]"
    puts " number formats:"
    puts "   1-3              Prints lines 1 through 3"
    puts "   5                Prints line 5"
    puts "   -1               Prints the last line, (negative)"
    puts " extra formats:"
    puts "   ~5               Prints 2 (default) lines before and after 5"
    puts "   4~10             Prints 4 lines before and after 10"
    puts "   *7 or 8*         Prints all lines before or after the number"
    puts "   5/1              Prints 5 lines, then skips 1 line..."
    puts "   2:5/1            Starts at line 2, prints and skips..."
    exit 1
  end

So there is some cool new syntax:

  • ~# – prints 2 lines before and after the given line number

  • #~# – prints the given number of lines before and after the given line number

  • *# – prints all of the lines before the given line number

  • #* – prints all of the lines after the given line number

  • #/# – is a print and skip option

  • #:#/# – allows you to provide an offset
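As a rough illustration of the `~` formats, here is one way the window of lines could be computed. This is a sketch and not the code in line.rb: the `context_range` helper, its parameter names, and the clamping to the file size are all assumptions made for the example.

```ruby
# Hypothetical helper: turn "~5" or "4~10" into a 1-based line range,
# clamped to the number of lines in the file.
def context_range(spec, total_lines, default_context = 2)
  match = spec.match(/\A(\d*)~(\d+)\z/)
  return nil unless match

  # An empty prefix ("~5") falls back to the default context size.
  context = match[1].empty? ? default_context : match[1].to_i
  center  = match[2].to_i
  first   = [center - context, 1].max
  last    = [center + context, total_lines].min
  first..last
end

p context_range("~5", 100)   # => 3..7
p context_range("4~10", 12)  # => 6..12
```

Returning a Range keeps the helper independent of how the lines are read, so the same result could drive either an Array slice or a streaming reader.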

Let’s say you want to print every even line in a text file. With this script that is not a problem. The solution is: starting on the 2nd line, print 1 line, then skip 1 line (line file 2:1/1):

Print only even lines using line.rb
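The print-and-skip idea can be sketched in a few lines of Ruby. This is only an illustration under assumed names (`print_and_skip`, `keep`, `skip`), not the script’s actual implementation.

```ruby
# Hypothetical helper for a "start:keep/skip" spec: beginning at line
# `start` (1-based), keep `keep` lines, drop `skip` lines, and repeat.
def print_and_skip(lines, start, keep, skip)
  cycle = keep + skip
  lines.each_with_index.select do |_line, index|
    offset = index - (start - 1)          # position relative to the start line
    offset >= 0 && offset % cycle < keep  # inside the "keep" part of the cycle
  end.map(&:first)
end

lines = %w[one two three four five six]
p print_and_skip(lines, 2, 1, 1)  # => ["two", "four", "six"]
```

With a start of 1, the same helper covers the plain `#/#` form.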

I think these new formats are pretty neat. You can always access the latest version of the script on Github or in my ~/bin. I’m willing to change the syntax and introduce new concepts. Send suggestions my way by dropping a comment! Cheers.

Quickly Output Lines in a File

The other day I wanted a shell command to have somebody print out the 671st line of /etc/services on their computer. So I gave it some thought, then some more thought, scratched my head, and figured out that I couldn’t really think of a shell command that does that. A google search came up with a few `sed` and `awk` examples but I honestly found those to be a bit awkward for something that should be super simple. So I wrote my own script.

After writing the script to print out a single line, I soon found it made a lot of sense to include ranges. Taking advantage of Ruby I can even print lines from the end with negative numbers. So I spent a few minutes cleaning up the script to make it a little more reusable and to add some formatting. Here was the result:

line sample usage

line is now the latest addition to my ~/bin.
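For readers without the screenshot, the single-line, range, and negative-number formats could be handled roughly like this. It is a sketch that leans on Ruby’s negative Array indexes; the `pick_lines` helper is a name invented for the example and is not taken from the script.

```ruby
# Hypothetical helper: fetch lines by 1-based number ("5"), inclusive
# range ("1-3"), or negative number counted from the end ("-1").
def pick_lines(lines, spec)
  case spec
  when /\A(\d+)-(\d+)\z/
    first, last = $1.to_i, $2.to_i
    return [] if first.zero? || last < first
    lines[(first - 1)..(last - 1)] || []
  when /\A-(\d+)\z/
    n = $1.to_i
    n.zero? ? [] : [lines[-n]].compact   # Ruby counts negative indexes from the end
  when /\A(\d+)\z/
    n = $1.to_i
    n.zero? ? [] : [lines[n - 1]].compact
  else
    []
  end
end

lines = %w[alpha beta gamma delta]
p pick_lines(lines, "2-3")  # => ["beta", "gamma"]
p pick_lines(lines, "-1")   # => ["delta"]
p pick_lines(lines, "9")    # => []
```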

Also, in case you wanted to use this in a shell script or via piping, there is a `--silent` or `-s` switch that removes all of the special formatting and prints only the specified lines. Much nicer for scripts. (See the highlighted line in the image below). Enjoy!

line source

So what exactly is line 671 of /etc/services? On my mac it is:

line 671 of /etc/services

Dynamic Web URLs with ExpanDrive

Often when I work with ExpanDrive the files I am working on correspond to some website that I own. When I’m mounted with ExpanDrive each file is accessible via “my hard drive” in the mounted volume and, more importantly, from a web URL! I found myself repeatedly opening up my browser and manually typing the URL for files that I just uploaded or edited. This was error prone, especially if some of the characters needed encoding. So, I spent some time to write up a Ruby script that can read ExpanDrive’s preferences, build the file’s “web URL,” and open it in your default browser.

expanurl usage

Simple to Use

I followed along with ExpanDrive’s previous command line tool named expan and named my script expanurl. Given no arguments it will open the current directory via its web URL, or you can give a list of files and each will be opened at their web URLs. Its usage is pretty straightforward but there is a single catch: the server setting in an ExpanDrive Drive may not be a true one-to-one mapping with the web server’s address.

For example: I provide holly.cs.rit.edu as the server value for one of my personal ExpanDrive drives. However, when I view files on that server (inside the public_html directory) they have a much different looking URL: http://www.cs.rit.edu/~jjp1820/. The result? The script simply keeps its own mapping of ExpanDrive server values to their associated web page prefixes. When you use expanurl on a Drive you have never used it on before the script will prompt you for that mapping, store the value, and never ask again.

Here it is in action. I have removed all stored mappings so I can demonstrate what it would be like using expanurl for the first time. Here I use it on my BogoJoker ExpanDrive drive:

expanurl first usage

Notice that in the prompt it tells you:

  • the server that the ExpanDrive volume is linked to and the one you will be providing a web url prefix for
  • an example of a web url prefix (useful)
  • where the mappings are stored in case you need to edit them later

The script is available on GitHub, so feel free to contribute and improve. Here is a link to the always current version, and here is a snapshot of the current version at the time of writing:


#!/usr/bin/env ruby
# Author: Joseph Pecoraro
# Date: Saturday December 13, 2008
# Description: When I'm using ExpanDrive and I'm
# remotely logged into a server, I can use this
# script to "open filename" and it will open using
# the server's associated URL.

# For URL Escaping and Stored Mappings
require 'uri'
require 'yaml'

# Exit with a msg
def err(msg)
  puts msg
  exit 1
end

# Simple Class to handle the mappings
class UrlMap
  MAP_PATH = File.expand_path('~/.expanurl')

  def initialize
    @hash = load
  end

  # Read the stored mappings, or start with none
  def load
    if File.exist?(MAP_PATH)
      YAML::load_file( File.expand_path(MAP_PATH) )
    else
      {}
    end
  end

  # Remember a mapping and write all of them to disk
  def add_mapping(server, mapto)
    @hash[server] = mapto
    File.open(MAP_PATH, 'w') do |file|
      file.puts @hash.to_yaml
    end
  end

  def is_mapping?(server)
    @hash.has_key?(server)
  end

  def get_mapping(server)
    @hash[server]
  end

  def path
    MAP_PATH
  end
end

# Local Variables
mapping = UrlMap.new
url_prefix = nil
server = nil
volume = nil

# Check if the current directory is an
# ExpanDrive Volume and a public_html folder
pwd = `pwd`.chomp
match = pwd.match(/^\/Volumes\/([^\/]+)/)
if match.nil?
  err("Not inside an ExpanDrive Volume")
elsif !pwd.match(/\/public_html\/?/)
  err("Not inside a public_html directory.")
else
  volume = match[1]
  defaults = `defaults read com.magnetk.ExpanDrive Drives`
  defaults.gsub!(/\n/, '')
  props = defaults.match(/\{[^\}]+driveName\s+=\s+#{volume}[^\}]+server\s+=\s+"([^"]+)"[^\}]+\}/)
  if props
    server = props[1]
  else
    err("This Volume (#{volume}) is not an ExpanDrive Volume")
  end
end

# Check if a mapping exists
# Otherwise create and store one
if mapping.is_mapping?(server)
  url_prefix = mapping.get_mapping(server)
else
  # Prompt
  puts "This is the first time you've used expanurl for #{volume}"
  puts "Please Provide us with a mapping for #{server}"
  puts "Mappings are stored in #{mapping.path}"
  puts "Example: http://bogojoker.com/"
  print ">> "
  # Store user input and proceed
  # (STDIN.gets, because a bare gets reads from the files named in ARGV)
  url_prefix = STDIN.gets.chomp
  url_prefix += '/' unless url_prefix.match(/\/$/)
  mapping.add_mapping(server, url_prefix)
  # Terminal Output
  puts "Server: #{server}"
  puts "Maps to: #{url_prefix}"
end

# Build the URL
subpath = pwd.match(/public_html\/?(.*)/)[1]
subpath += '/' unless subpath.length.zero? || subpath.match(/\/$/)
url_prefix += subpath

# If No Files, open the directory
# Otherwise,   open each provided file
if ARGV.size == 0
  `open #{url_prefix}`
else
  ARGV.each do |filename|
    `open #{url_prefix}#{URI.escape(filename)}`
  end
end

How it Works

The Ruby script grabs the current working directory using `pwd` and checks to make sure you’re in an ExpanDrive volume. ExpanDrive volumes are detected dynamically by parsing the ExpanDrive preferences, thanks to their foresight to make them accessible via the `defaults` command. So if you’re in an ExpanDrive volume and inside a public_html directory, expanurl will then use its mapping to open a URI-encoded web URL in your default browser with the `open` command.
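The directory check is easy to try out in isolation. The sketch below pulls the volume name and the path under public_html out of a directory string with a single regular expression; `volume_and_subpath` is a hypothetical helper written for this illustration, and it is a little stricter than the two separate matches in the script.

```ruby
# Hypothetical helper: split a working directory into the mounted
# volume name and the path below public_html.
# Returns nil when the directory is not under /Volumes/<name>/.../public_html.
def volume_and_subpath(dir)
  match = dir.match(%r{\A/Volumes/([^/]+).*?/public_html(?:/(.*))?\z})
  return nil unless match

  [match[1], match[2].to_s]
end

p volume_and_subpath("/Volumes/BogoJoker/public_html/blog/images")
# => ["BogoJoker", "blog/images"]
p volume_and_subpath("/Users/joe/public_html")
# => nil
```

Pass it `Dir.pwd` rather than the raw output of `pwd`, since the trailing newline from the shell command would keep the anchored pattern from matching.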

The mappings are stored in a hidden YAML file in your home directory (~/.expanurl). This style of storing preferences is just like dozens of other command line applications and scripts. YAML is just a lightweight textual data format popular with Ruby, similar to JSON and XML. It’s so simple that you could edit the file yourself if you wanted or needed to. For instance here is what is in mine, just two simple key/value pairs:

~/.expanurl yaml mapping
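To give a feel for the file format without the screenshot, here is a small round trip through YAML. The first pair comes from the holly.cs.rit.edu example earlier in this post; the second pair is made up for illustration, and a temporary file stands in for ~/.expanurl.

```ruby
require 'yaml'
require 'tempfile'

# Write a server => URL prefix hash as YAML, then read it back.
# A temp file stands in for ~/.expanurl so the real file is untouched.
def round_trip(mappings)
  Tempfile.create('expanurl') do |file|
    file.write(mappings.to_yaml)
    file.flush
    YAML.load_file(file.path)
  end
end

mappings = {
  'holly.cs.rit.edu' => 'http://www.cs.rit.edu/~jjp1820/',  # from the post
  'example.com'      => 'http://example.com/'               # made up
}

puts mappings.to_yaml               # the text that would land in the file
p round_trip(mappings) == mappings  # => true
```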

The Future

It’s just that simple. Being a Ruby script, you can call this from GUI applications, anything with built-in shell access, etc. It should play nicely with your usual Unix tools. I will likely make this script more and more robust if others find it useful, so I’d be happy to hear some feedback.


Ruby Process Controller – psgrep

Every once in a while a process will freeze and will be too stubborn to die when I try to “Quit” it. For those stubborn processes I tend to use the terminal to `kill` them. For a while I had been using a simple Perl script for searching through processes. The script would find me the process ids and I could then kill them, using whatever power I needed.

I found that it was taking far too long for me to do the search, and then carefully type out the process id, and hope I got the right one. I discovered killall, but the problem is that sometimes I don’t want to kill “all” of the processes with that name. So, I gave in and wrote up a Ruby script that did what I wanted. Here is psgrep: (Download)

#!/usr/bin/env ruby
# Start Date: Saturday December 6, 2008
# Current Version: 0.9
# Author: Joseph Pecoraro
# Contact: joepeck02@gmail.com
# Description: Quicker handling of process searching
# and killing.  Never type in a PID again, just regexes!

# -----------
#   Globals
# -----------
kill_mode = false
icase_mode = false
pattern = nil
targets = []
pids = []

# ---------
#   Usage
# ---------
def usage
  puts "usage: #{$0.split(/\//).last} [options] pattern"
  puts "  -k  or --kill   kills all processes"
  puts "  -k# or --kill#  kills the [#] process"
  exit 0
end

# -----------
#   Options
# -----------
if ARGV.size > 1
  ARGV.each do |arg|
    if arg.match(/^-(k|-kill)(\d*)$/)
      kill_mode = true
      targets << $2.to_i unless $2.empty?
    elsif arg.match(/^-(i|-ignore)$/)
      icase_mode = true
    end
  end
  ARGV.delete_if { |e| e.match(/^-/) }
end

# -------------------
#   Remaining Args
# -------------------
if ARGV.size != 1
  usage
end

if icase_mode
  pattern = Regexp.new( ARGV[0], Regexp::IGNORECASE )
else
  pattern = Regexp.new( ARGV[0] )
end

# ----------------------
#   Actual `ps` Output
# ----------------------
lines = %x{ ps -Au#{ENV['USER']} }.split(/\n/)
header = lines.shift

# ----------
#   psgrep
# ----------
puts "     #{header}"
count = 0
lines.each do |line|
  unless line =~ /psgrep/
    if line.match(pattern)
      count += 1
      puts "[#{count}]: #{line}"
      if targets.empty? || targets.member?(count)
        pids << line.strip.split[1]
      end
    end
  end
end

# -------------
#   Kill Mode
# -------------
if kill_mode
  puts "Killing Processes"
  puts "-----------------"
  pids.each_with_index do |pid, i|
    # The listing above is numbered from 1, so match it here
    print targets.empty? ? "[#{i + 1}]:" :  "[#{targets[i]}]:"
    print " Killing #{pid}... "
    res = %x{ kill #{pid} }
    puts "Dead" if $?.exitstatus.zero?
  end
end

# Always
exit 0

So there it is. Less than 100 lines of Ruby to get a pretty straightforward psgrep/kill program. Here is an example where I have three perl processes running on my machine. One of them is running as root (the userid is 0). I just type “!! --kill” or “[up-arrow] --kill” and it tries to kill them all. Note that the root perl process doesn’t terminate and there is an error message, but psgrep continues as best as it can and kills the two normal perl processes: [Note: I could have done `sudo psgrep perl -k` to kill the root process]

psgrep usage

Here is another good example. I have two python instances that I want to kill but there is another python instance running ExpanDrive in the background. I just ran psgrep and found the two I want to kill are [2] and [3]. Therefore, I can send -k2 and -k3 (or --kill2 and --kill3) to kill only those processes. Here is the result:

psgrep target

Note also that by default psgrep is case-sensitive. To ignore case just add the -i or --ignore switch. So there you have it. Usage is straightforward. Switches can go anywhere on the command line, which makes it easy to just use your history and tack a switch on the end of your previous command.
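The “switches can go anywhere” behaviour comes down to scanning every argument and setting aside the ones that look like options. Here is a standalone sketch of that idea; `parse_args` and the option hash keys are names assumed for the example. Unlike psgrep, which discards every argument starting with a dash, this version passes unrecognised arguments through untouched.

```ruby
# Hypothetical helper: separate switches from the pattern, wherever
# the switches appear on the command line.
def parse_args(args)
  options = { kill: false, ignore_case: false, targets: [] }
  rest = []
  args.each do |arg|
    case arg
    when /\A-(?:k|-kill)(\d*)\z/
      options[:kill] = true
      # "-k2" or "--kill2" names one numbered match; bare "-k" means all
      options[:targets] << $1.to_i unless $1.empty?
    when /\A-(?:i|-ignore)\z/
      options[:ignore_case] = true
    else
      rest << arg
    end
  end
  [options, rest]
end

options, rest = parse_args(%w[python -k2 --kill3 -i])
p options[:targets]      # => [2, 3]
p options[:ignore_case]  # => true
p rest                   # => ["python"]
```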

Feel free to improve it, it is on GitHub!

The ARGFy Experiment

I wrote an earlier article that talked about Ruby’s global ARGF variable. I mentioned that I took that a step further, to experiment and learn a number of aspects about Ruby development. Those included:

  1. RDoc – Ruby Autogenerated Documentation
  2. RSpec – Ruby Test Framework aiding Behavior Driven Development
  3. General Familiarity with Ruby Classes
  4. General Familiarity with GitHub

I have to say that I was really impressed with how strikingly natural, easy, and fun it was to work with these tools. I already wrote about RDoc, hopefully to cover a “void” that I saw in the online documentation for it. I may look into writing about RSpec, however the current RSpec documentation was quite good so I may focus elsewhere. Finally GitHub and Ruby are mostly things that you have to personally practice with to get good at, and there are already plenty of great resources for them. The Ruby community has done a very good job!

The ARGFy Results

So, here are the results of my experiment:

The GitHub README is very similar to the RDoc, but it goes into more depth by showing the output of the included sample.rb script. It’s not too exciting, but here is what ARGFy does.

What ARGFy Does

ARGFy is a class. In the constructor it takes an Array of filenames. It then treats those files as one continuous stream of lines. If no filenames are provided, or if “-” is provided as a filename, that input is treated as STDIN. Everything so far makes ARGFy look and act just like ARGF except you can specify your own files instead of only relying on the command line arguments.

Using ARGFy is mostly like ARGF. If you call the ARGFy#each method (note that this allows for any Enumerable method!) it will exhaust all the lines of input from all the files as a single stream. At each line you can check the states of the ARGFy object itself. The states include filename and lineno like the normal ARGF, but they also include filelineno. Because there is a filelineno there is a guaranteed way to know if under-the-hood the stream is now processing a different file. Since this might be a common thing to check there is an ARGFy#new_file? helper method that does just that.

Finally, because it’s an object you can add a file to the list at any time, although removing didn’t seem to make much sense considering what its purpose was. Just make use of ARGFy#add_file to add a file to the end of the sequence of input files to the stream.

In the background ARGFy is really just reading and buffering the files one at a time and returning the lines. It’s nothing too exciting, just a little fun working with Ruby. The example nicely displays how ARGFy works:

# sample.rb
require 'ARGFy'

argf = ARGFy.new(ARGV)
argf.each do |line|

  # Per File Header
  if argf.new_file?
    filename = argf.filename
    filename = "STDIN" if filename == "-"
    puts '', filename, "-"*filename.length
  end

  # Print out the line with line numbers
  puts "%3d: %s" % [argf.lineno, line]

end


Calling sample.rb with a few small input files creates some nicely formatted output:

shell> ruby sample.rb in1.txt in2.txt 

  1: one
  2: two

  1: alpha, beta, gamma
  2: 0987654321

Nothing complex. It works like you would expect it to. For more sample usage you can scan the RSpec test cases in the GitHub repository.
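For anyone curious how little code the core idea needs, here is a minimal stand-in. It is not ARGFy’s source: the `LineStream` class is invented for this sketch, it reads with `File.foreach` instead of buffering, and it leaves out the STDIN (“-”) handling. It only shows the two counters and the new_file? check described above.

```ruby
require 'tmpdir'

# Hypothetical stand-in for the idea behind ARGFy: walk several files
# as one stream while tracking a global and a per-file line number.
class LineStream
  include Enumerable

  attr_reader :filename, :lineno, :filelineno

  def initialize(filenames)
    @filenames = filenames.dup
    @lineno = 0
    @filelineno = 0
  end

  # Append another file to the end of the stream.
  def add_file(name)
    @filenames << name
  end

  # True while the current line is the first line of its file.
  def new_file?
    @filelineno == 1
  end

  def each
    return enum_for(:each) unless block_given?

    index = 0
    # A while loop, so files added during iteration are still visited.
    while index < @filenames.length
      @filename = @filenames[index]
      @filelineno = 0
      File.foreach(@filename) do |line|
        @lineno += 1
        @filelineno += 1
        yield line
      end
      index += 1
    end
  end
end

# Usage with two throwaway files
Dir.mktmpdir do |dir|
  first  = File.join(dir, 'in1.txt')
  second = File.join(dir, 'in2.txt')
  File.write(first, "one\ntwo\n")
  File.write(second, "alpha, beta, gamma\n")

  stream = LineStream.new([first, second])
  stream.each do |line|
    puts File.basename(stream.filename) if stream.new_file?
    puts "%3d (%d): %s" % [stream.lineno, stream.filelineno, line]
  end
end
```

Because filelineno resets for every file, new_file? stays correct even when the same file is listed twice in a row.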

RDoc Introduction

Automatically generating documentation from source code has been available as far back as 1993. It’s so common now that it’s expected to be available in any mainstream programming language. I’ve seen it most commonly in object-oriented languages, offering nicely formatted descriptions of classes and their public methods/attributes.

Consistency is Nice

The main advantage I see with automatically generated documentation is that it is consistent. Take Javadocs for instance. They are all the same. When a developer wants to work with a Java library, they expect Javadocs. Why? Because they are familiar with them. They can easily navigate them and quickly find whatever it is that they are looking for. Documentation in any other way would require wasting time learning how to use/navigate it searching for what you want to know.

RDoc is Ruby’s documentation generator. You see RDoc generated documentation all over the place in Ruby. See YAML, Hpricot, or even core classes like Array.

So, I felt that if I want to continue using Ruby I should at least learn how it’s handled. It turns out that it’s easier than I thought. I’m a huge fan of Markdown syntax and RDoc turns out to be pretty close to that. So, here is what I think is all you need to know to produce some simple, yet thorough, documentation for a class.

RDoc Resources

Start by updating your rdoc. The latest version at the time of writing is 2.2.1. The gem provides you with the rdoc and ri tools so that you can both generate and display documentation from the command line. Here is how you can install them:

shell> sudo gem install rdoc

The best online resources I found were not surprisingly:


Here is a basic example that shows the structure of the RDoc as it describes a File, Class, Attributes, and Methods. The placement of the comments is important. RDoc comments are always on top of what they are documenting:

# Documentation for the file itself
# There should be a blank line between this and any class
# definition to separate the documentation about the file
# and the class.  If there is no space then the entire text
# is used for both the file and the class, no different.

# Documentation for the class itself.
# This will appear at the top of the page specific to this
# class, before any other content.
class Dice
  # Documentation for an attribute
  # To document each attribute you must make individual
  # calls to attr_accessor, attr_reader, and attr_writer.
  # Appears next to the attribute name in the attrs section
  attr_accessor :sides

  # Documentation for the constructor
  # Corresponds to the `new` method
  def initialize(sides)
    @sides = sides
  end

  # Documentation for a method
  def roll(times)
    Array.new(times).map { 1+rand(@sides) }
  end

  # Documentation for a method
  def beat(num)
    roll(1).first > num
  end
end


Running `rdoc` on that file creates this documentation.


Rich documentation makes the important parts stand out. It makes use of HTML’s expressive power and enables lists, headers, links, bold/italics, code, and other presentation helpers. I’ll now document the Dice class and add some style and realistic content.

# == sample.rb
# This file contains the Dice class definition and it runs
# some simple test code on a 16 sided dice.  A 20 dice
# roll fight against the COMPUTER who always rolls 10s!

# Multi-sided dice class.  The number of sides is determined
# in the constructor, or later on by accessing the _sides_
# attribute.
# == Summary
# A #single_roll returns a single integer from 1 to the
# number of sides, _inclusive_.  However, if you want to
# roll multiple times you can use the #roll method,
# specifying the number of rolls you want, and you will
# get an Array with the values of all the rolls!
# == Example
#    dice = Dice.new(8)   # An eight sided dice
#    four = dice.roll(4)  # An Array containing 4 rolls
#    sum  = four.inject(0) { |mem,i| mem+i } # Sum of rolls
# == Contact
# Author::  Joseph Pecoraro (mailto:joepeck02@gmail.com)
# Website:: http://blog.bogojoker.com
# Date::    Saturday November 29, 2008
class Dice

  # Number of sides on the dice
  attr_accessor :sides

  # Create a dice with `sides` of dice.
  # Defaults to 6.
  def initialize(sides=6)
    @sides = sides
  end

  # Returns an array of size `times` containing
  # a group of dice rolls.
  def roll(times)
    Array.new(times).map { single_roll }
  end

  # Returns the value of a single dice roll.  The
  # values are from 1 to @sides _inclusive_.
  def single_roll
    1 + rand(@sides)
  end

  # A single roll challenge:
  # * makes a single_roll
  # * returns true if the roll was strictly greater
  #   than the given number
  # * returns false otherwise
  def beat(num)
    single_roll > num
  end

end

# Note that this is a constant, which is special
# and it is documented like a Class Attribute.
# This is in the RDoc generated documentation for
# the file.
COMPUTER = 10

# Note that these comments, for generic code
# are not in the RDoc generated documentation.
dice = Dice.new(16)
winCount = loseCount = 0
20.times do
  if dice.beat(COMPUTER)
    winCount += 1
  else
    loseCount += 1
  end
end

# Output
puts "You won #{winCount} times and lost #{loseCount} times!"
puts "Muhahah.  Try again later!!"           if winCount < loseCount
puts "Well Played.  I'll get you next time." if winCount > loseCount
puts "What a match!  Boy that was fun."      if winCount == loseCount

That generates this documentation.


There are some subtle points that make this documentation format nicely. I’ll point them out and explain them. Most of this is straight from the above resources, however some of it I could not find documented anywhere.

  • The file documentation links to the Dice class. Furthermore the Class documentation links down to the single_roll and roll methods. This is because:

    Names of classes, source files, and any method names containing an underscore or preceded by a hash character are automatically hyperlinked from comment text to their description.

    1. sample.rb was a filename and so it was automatically linked.
    2. Dice was the name of a class and so it was automatically linked.
    3. single_roll had an underscore and happened to be a method name so it was automatically linked in a few places.
    4. #roll had a hash character signifying that it should be linked.
  • Sections begin with a “=” or a “==”. I prefer to use double, because it stands out more in the source code. Technically a single “=” becomes a level 1 header, and a double becomes a level 2 header. However, they both display the same.
  • URIs like http://blog.bogojoker.com and mailto:email are automatically turned into links and formatted nicely.
  • Bold, Italics, and Typewriter Text can be quickly formatted much like Markdown:

    _italic_ or <em>italic</em>
    *bold* or <b>bold</b>
    +typewriter+ or <tt>typewriter</tt>

  • Code is displayed if each line is indented (see the point about formatting source code below)
  • Tabular Labeled List, like the Contact information, are formatted like:

    label:: description 1
    label2:: both descriptions will line up

  • Formatting source code is like Markdown. The code that you want formatted must be indented with a few spaces. As long as the indentation is maintained the text will display as source code in the HTML documentation.
  • Formatting lists is again like Markdown. Just use *’s or -’s and they will turn into bullet points. For numbered lists just use numbers followed by a dot and they will be formatted automatically.
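To see several of these rules side by side, here is a small made-up class whose comments use headers, inline styles, both list types, an indented code sample, and a labeled list. The `Greeter` class and everything in it are invented for illustration; running `rdoc` on a file containing it should show how each piece renders.

```ruby
# = Greeter
# A tiny class used only to show RDoc markup.  Mentions of
# Greeter and #greet_all are candidates for automatic links.
#
# == Markup
# _italic_, *bold*, and +typewriter+ text.
#
# * a bullet item
# * another bullet item
#
# 1. a numbered item
# 2. another numbered item
#
# == Example
#    greeter = Greeter.new("Hello")
#    greeter.greet_all(["Ann", "Bob"])
#
# == Contact
# Author::  A hypothetical author
# Website:: http://example.com
class Greeter
  # The greeting placed before each name
  attr_accessor :greeting

  # Create a Greeter.  The +greeting+ defaults to "Hello".
  def initialize(greeting = "Hello")
    @greeting = greeting
  end

  # Returns an Array with one greeting per entry in +names+.
  def greet_all(names)
    names.map { |name| "#{@greeting}, #{name}!" }
  end
end

p Greeter.new.greet_all(["Ann", "Bob"])  # => ["Hello, Ann!", "Hello, Bob!"]
```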

Final Notes on `rdoc` itself

When I created the final documentation above I used a few of rdoc’s command line switches to customize the output. What I actually used was:

shell> rdoc --title="Dice Documentation" --line-numbers --tab-width=2

The title switch changed the <title> for the documentation page, and the other two deal with formatting the htmlized source code that RDoc shows when you click on the function name to view the source in the documentation. There are plenty of command line switches. To view the full list do:

shell> rdoc --help

One useful switch is “--ri”, which creates ri documentation so you can access your classes from the command line. You can also output to several formats. For instance you can make a PDF using “--format=texinfo” and then running `texi2pdf` on the texinfo file. The PDF doesn’t look that bad; here is my example as a PDF.

NOTE: Finding the generators was tricky. I had to check out the rdoc source code and find the different generators. If anyone knows an easier way to check what generators are available, please let me know.

I hope this helps some people using RDoc for their classes. Enjoy.

DATA and ARGF in Ruby

Of all the “superglobals” in Ruby these seemed to be the least documented. It only takes a quick example to understand them. I had some fun and decided to play around with these variables and more.


Although this is mostly useless, it’s a neat trick. In any Ruby script, as soon as the interpreter reaches a line containing only __END__, the rest of the text in the file is no longer parsed. Whatever is after __END__ can be accessed via DATA, which is only defined for the main script being run. DATA acts like a File object, so it’s like you’re reading the current script as though you’re reading from a File.

# DATA is a global that is actually a File object
# containing the data after __END__ in the current
# script file.
puts DATA.read

__END__
I can put anything I want
after the __END__ symbol
and access it with the
DATA global.  Whoa!


ARGF takes each of the elements in ARGV, assumes they are filenames, and allows you to process these files as a single stream of input. This is common with shell programs. It’s a lot like cat: cat takes multiple files on the command line and outputs them as a single stream. If you want to force input to come from STDIN then just provide a hyphen “-”. Finally, if there is nothing in ARGV then ARGF defaults to STDIN.

Here is a simple example of ARGF mimicking cat.

shell> echo "inside a.txt" > a.txt
shell> echo "inside b.txt" > b.txt
shell> cat a.txt b.txt 
inside a.txt
inside b.txt

Here is a Ruby script that can do just that:

# cat.rb
ARGF.each do |line|
  puts line
end

Example usage:

shell> ruby cat.rb a.txt b.txt 
inside a.txt
inside b.txt

ARGF Confusion

What confused me when I first used ARGF was that it has no special class. It claims it is an Object. Take a look:

>> ARGF.class
# => Object

But at the same time it has so much more than a regular Object:

>> ARGF.methods - Object.methods
# => ["select", "lineno", "readline", "eof", "each_byte", "partition", "lineno=", "read", "fileno", "grep", "to_i", "filename", "reject", "readlines", "getc", "member?", "find", "to_io", "each_with_index", "eof?", "collect", "path", "all?", "close", "entries", "tell", "detect", "zip", "rewind", "map", "file", "any?", "sort", "min", "seek", "binmode", "find_all", "each_line", "gets", "each", "pos", "closed?", "skip", "inject", "readchar", "pos=", "sort_by", "max"]

The important things to note are accessors like lineno and filename. They can give you some information while you read the lines, such as whether you’re reading from a file or STDIN. You can easily give line numbers to everything being read, like so:

# linenum.rb
ARGF.each do |line|
  puts "%3d: %s" % [ARGF.lineno, line]
end


shell> ruby linenum.rb a.txt b.txt 
  1: inside a.txt
  2: inside b.txt


As an exercise I wrote a little class to emulate what ARGF does and to make it more useful to me. For instance ARGF can’t tell you when it changes files. You can try to catch when the filename changes, but what if the same file is repeated twice in a row? ARGFy has both a global lineno and a per-file filelineno. I’ll talk more about ARGFy later.