Building a Ruby Gem

With all of my recent interest in Ruby I have been overwhelmed by the number of really awesome rubygems. I have talked about some, and I will talk about even more in the future, but I felt it would be important to learn how to make my own ruby gem. After all, I’d been using GitHub to update a few people’s gems without really knowing the best way to go about such things. It took longer than I expected to find the right resources for what I wanted, but I came across three good articles which pointed me in the right direction.

I’m going to write about what I feel you most likely want to know when creating your first Ruby Gem.

1 File/Directory Structure

Gem File Structure
The most basic structure for a simple ruby gem would be like so:

+ gem_name/
   - History.txt
   - README.txt
   - Rakefile
   + bin/
      - gem_name
   + lib/
      - gem_name.rb
   + test/
      - test_gem_name.rb

This is pretty straightforward but let me explain a little. There is an outermost directory that holds everything. Source files go in lib/, executables go in bin/, and test files will go in test/.

The README.txt essentially provides the meta-data, installation instructions, and basic information that you would want any person downloading your gem to read. History.txt is the common name for the Changelog.

2 Utilities to Help You

Now don’t go making these directories by hand! As you probably expected there are a number of tools to get this job done for you. These tools are gems themselves and are therefore simple to install and start working with. The two that I would suggest looking at are Hoe and the slightly more advanced newgem which even uses Hoe.

Both tools come with command line utilities that automate building this directory structure. I’ll start with Hoe, which cleverly named their utility “sow”.

$ sow first_project
creating project first_project
... done, now go fix all occurrences of 'FIX'

  FirstProject/Rakefile:9:  # p.developer('FIX', 'FIX@example.com')
  FirstProject/README.txt:3:* FIX (url)
  FirstProject/README.txt:7:FIX (describe your package)
  FirstProject/README.txt:11:* FIX (list of features or problems)
  FirstProject/README.txt:15:  FIX (code sample of usage)
  FirstProject/README.txt:19:* FIX (list of requirements)
  FirstProject/README.txt:23:* FIX (sudo gem install, anything else)
  FirstProject/README.txt:29:Copyright (c) 2008 FIX

That was easy. It set up a structure exactly like the one above, but with an additional Manifest.txt file that lists all the files. Easy enough. Let’s take a look at newgem:

$ newgem new_proj
      create
      create  config
      create  doc
      create  lib
      create  script
      create  tasks
      create  test
      create  tmp
      create  lib/new_proj
      create  History.txt
      create  License.txt
      create  Rakefile
      create  README.txt
      create  PostInstall.txt
      create  setup.rb
      create  lib/new_proj.rb
      create  lib/new_proj/version.rb
      create  config/hoe.rb
      create  config/requirements.rb
      create  tasks/deployment.rake
      create  tasks/environment.rake
      create  tasks/website.rake
      create  test/test_helper.rb
      create  test/test_new_proj.rb
  dependency  install_website
      create    website/javascripts
      create    website/stylesheets
      exists    script
      exists    tasks
      create    website/index.txt
      create    website/index.html
      create    script/txt2html
       force    tasks/website.rake
  dependency    plain_theme
      exists      website/javascripts
      exists      website/stylesheets
      create      website/template.html.erb
      create      website/stylesheets/screen.css
      create      website/javascripts/rounded_corners_lite.inc.js
  dependency  install_rubigen_scripts
      exists    script
      create    script/generate
      create    script/destroy
      create  script/console
      create  Manifest.txt
      readme  readme
Important
=========

* Open config/hoe.rb
* Update missing details (gem description, dependent gems, etc.)

Okay, don’t get intimidated. There is a little more meat this time but I know you’re hungry. Now there is a License, some more automation with rake, configuration files for hoe, a directory for a website allowing you to describe your gem, and even some scripts to make working with your ruby gem a little easier. This seems like a great choice for a really large-scale gem requiring some major testing.

For the purpose of this blog post I’m going to stick to the basics. I’m not doing anything major here. I’ll stick to Hoe and begin reaping my newly sowed gem. The majority of this tutorial will still apply no matter which method you decide to choose.
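Both generators create a Manifest.txt that lists the project's files. To get a feel for what such a listing amounts to, here is a minimal Ruby sketch. The helper name build_manifest and the sorted, one-path-per-line format are my own assumptions for illustration, not Hoe's actual implementation.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper (not part of Hoe): list every file under +root+
# as sorted relative paths, one per line.
def build_manifest(root)
  files = Dir.glob("**/*", base: root).select do |path|
    File.file?(File.join(root, path))
  end
  files.sort.join("\n") + "\n"
end

# Build a throwaway project shaped like the basic structure shown earlier.
Dir.mktmpdir do |dir|
  root = File.join(dir, "gem_name")
  %w[bin lib test].each { |d| FileUtils.mkdir_p(File.join(root, d)) }
  %w[History.txt README.txt Rakefile bin/gem_name
     lib/gem_name.rb test/test_gem_name.rb].each do |f|
    FileUtils.touch(File.join(root, f))
  end
  puts build_manifest(root)
end
```

Directories are skipped on purpose, since only files need to be packaged.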

3 Pre-Development

A few quick items before you dive into development.

If you used Hoe and sow then you can start out by changing all the occurrences of “FIX”. Most are found in the README.txt file and one more in the Rakefile. Fill out these details as you think they should be filled out. Get these minor details out of the way before you start your work.

I would suggest setting up version control right now and making an initial checkin for your project. The choices here are huge but not necessarily critical for your project. I would suggest that you take a look at Git, Subversion, and Mercurial. I’ll be using git, and hosting my project on GitHub. It’s all free, easy, and most of all fun! Tutorials for GitHub are available online, and it’s the easiest version control system I have ever used!

Finally, since Hoe works so nicely with RubyForge you will want to set up an account and install the rubyforge gem. That takes only a few minutes. Setting up the rubyforge gem is pretty straightforward:

# Install and Setup
$ sudo gem install rubyforge
$ rubyforge setup

# (optional) FYI the location of the edit file
$ mate ~/.rubyforge/user-config.yml

# Configure and List
$ rubyforge config
$ rubyforge login
$ rubyforge names

You should be all set to start development on your gem.

4 Gem Basics

Now that you’re set up you can develop like normal. Put your Ruby modules, classes, etc. into the lib folder. One thing you will notice is that Hoe set up a class with your gem name and defined a single constant, VERSION. This value is important when the time comes to release and later update your gem. Each part of the version number has a well-defined meaning. Here is what Dr. Nic had to say:

VERSION = X.Y.Z
X = major release number (MAJOR) – not backwards compatible
Y = minor release number (MINOR) – backwards compatible, additional features
Z = patch/bug fix number (TINY) – small bug fixes

Try to keep your versioning conventions uniform with these values. Then when you want to update your gem all you would need to do would be update the VERSION, and follow through with a release.
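To make the convention concrete, here is a small Ruby sketch of a class carrying a VERSION constant, plus a helper that bumps one part of it. The names FirstProject and bump_version are hypothetical, and resetting the less significant parts to zero on a bump is my assumption about the usual convention, not something Hoe does for you.

```ruby
# A class with a VERSION constant, similar in spirit to what sow generates.
# FirstProject is an illustrative name.
class FirstProject
  VERSION = "1.0.0"
end

# Hypothetical helper: return a new "X.Y.Z" string with one part bumped.
# Bumping a part resets every less significant part to zero (an assumption).
def bump_version(version, part)
  major, minor, tiny = version.split(".").map { |n| Integer(n) }
  case part
  when :major then "#{major + 1}.0.0"
  when :minor then "#{major}.#{minor + 1}.0"
  when :tiny  then "#{major}.#{minor}.#{tiny + 1}"
  else raise ArgumentError, "unknown part: #{part.inspect}"
  end
end

puts bump_version(FirstProject::VERSION, :tiny)   # small bug fix
puts bump_version(FirstProject::VERSION, :minor)  # new, compatible features
puts bump_version(FirstProject::VERSION, :major)  # incompatible changes
```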

Make use of rake to automate testing, doc creation, and more. I am still just learning rake so I’ll leave most of the discussion of rake for another time. However, you should be aware of the following command, which lists all the tasks rake can perform:

$ rake -T

5 Create the RubyForge Package

Because you’ve used Hoe, all you need is a RubyForge account, the rubyforge gem configured correctly, and your source files all ready to go. Create a new package on RubyForge, which you can do like so:

$ rubyforge create_package bogojoker regex_replace
$ rubyforge config
$ rubyforge names

You would replace bogojoker with your group/username and regex_replace with your package name (which will be your gem name). This is also possible to do from the RubyForge website GUI if you can’t get this working.

6 Release (Deploy) the Gem

Once the package is created you can configure Hoe to publish right to RubyForge by editing your Rakefile:

Hoe.new('regex_replace', Rr::VERSION) do |p|
  p.rubyforge_name = 'bogojoker'
  p.developer('Your Name', 'Your Email')
end

Again, notice that ‘regex_replace’ would be the name of your package, Rr::VERSION would point to the VERSION constant that was auto-created when you made your directory with Hoe’s sow, and rubyforge_name would be your group/username for RubyForge (in this case ‘bogojoker’).

Uploading is now a breeze. Just do the following:

$ rake release VERSION=1.0.0

Where the version number is the same as the VERSION constant for your gem. Hoe will make use of rubyforge to package and upload directly to RubyForge with little to no problems. Follow this up with:

$ rake publish_docs

7 Problems / Troubleshooting

Remember when I said “little to no problems?” Well, what if you have problems? I’ll cover a few issues I had:

Where is my RubyForge Project?

Once you make an account you actually have to request to create a project and that request will get approved later in the day. You can go through with everything and host the gem on your own gem server or at least make your .gem available using ‘$ rake package’ and taking the .gem inside the newly created pkg/ folder in your gem’s directory. But once your RubyForge project is accepted you can proceed to upload using ‘$ rake release …’.

no <group_id> configured for <bogojoker>

Make sure you create the package on RubyForge and make sure it goes through. Run the rubyforge create_package command, then config, and names to see if the new package was created. Try a more unique name if it appears not to work.

no <processor_id> configured for <Any>

In this case something went wrong with your rubyforge config. I don’t know why this happens, but a few others have had the same problem. The solution is easy: edit the “processor_ids:” section in your ~/.rubyforge/auto-config.yml file to be the following:

processor_ids:
  IA64: 6000
  Any: 8000
  AMD-64: 1500
  PPC: 2000
  Sparc: 4000
  Other: 9999
  i386: 1000
  Alpha: 7000
  MIPS: 3000
  UltraSparc: 5000
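If you want to check a config before retrying the release, a few lines of Ruby can tell you whether the entry from the error message is present. This is only a sketch: missing_processor_ids is a hypothetical helper (not part of the rubyforge gem), and it assumes processor_ids sits at the top level of the YAML file, as in the listing above.

```ruby
require "yaml"

# Hypothetical check: return the names from +required+ that are missing
# under "processor_ids" in the given YAML text.
def missing_processor_ids(yaml_text, required = ["Any"])
  config = YAML.load(yaml_text) || {}
  ids = config["processor_ids"] || {}
  required.reject { |name| ids.key?(name) }
end

broken = "processor_ids: {}\n"
fixed = <<~CONFIG
  processor_ids:
    Any: 8000
    i386: 1000
CONFIG

p missing_processor_ids(broken)  # the entry from the error message is absent
p missing_processor_ids(fixed)   # nothing is missing
```

To run it against a real config, pass in the contents of ~/.rubyforge/auto-config.yml via File.read and File.expand_path.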

Finally:

$ gem install regex_replace
Bulk updating Gem source index for: http://gems.rubyforge.org/
ERROR:  could not find regex_replace locally or in a repository

I thought I had it! Well, actually I did! It just took about 5 minutes for my gem to be picked up and indexed. So just a few minutes later it was working! If you ever want to search the gem index you can use the following command:

$ gem search -r regex

*** REMOTE GEMS ***

Bulk updating Gem source index for: http://gems.rubyforge.org/
cnuregexp (1.0.0)
regex_replace (1.0.0)
regexbuilder (0.0.1, 0.0.0)
regexp-engine (0.9, 0.8)
RegexpBench (0.5.2, 0.5.1, 0.5.0)
TextualRegexp (1.8.6)

Where ‘-r’ stands for remote and ‘regex’ would be replaced with your search term.

8 Conclusions

Rubygems are part of the reason for Ruby’s massive appeal. I can download and make use of incredible ruby libraries or programs with simple ‘gem install’ commands. Now that I’ve produced my own gem I have a greater appreciation for the developers that have made this so streamlined, efficient, and easy.

I think all Ruby Developers should take the time to produce a gem. Why?

  1. Release open source code and contribute to the Ruby community
  2. It will help you when you want to work on someone else’s code
  3. Facilitates a test-driven mindset with rake
  4. Work with the automatic documentation to improve the code you release
  5. Spur creativity. Some Ruby gems are just brilliant. I wanna see more.

I hope this helps you and encourages you to spend the time to publish some of your useful Ruby code so that others like myself can start using it.

I almost forgot, you can grab my gem, which installs the rr command in your bin by running the following:

$ sudo gem install regex_replace

$ rr
usage: rr [options] find replace [filenames]
       rr [options] s/find/replace/ [filenames]

Cheers.

rr – 1.1 – In Place Edits and Multiple Files

Less than 48 hours after rr becomes 1.0 it gets a few very handy improvements!

In-place modification of files is activated via the --modify (or shorthand -m) option. This means that you can bypass any output redirection and just go straight to modifying the original file. This feature uses filename.tmp as a temp file, which it later renames to the original filename. Again, if no filenames are specified then input is expected to come from STDIN, and the new --modify option will be ignored in this special case.
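The temp-file-then-rename approach can be sketched in a few lines of Ruby. This is not rr's actual source. The name modify_in_place is hypothetical, and the sketch assumes the whole file fits comfortably in memory.

```ruby
require "tmpdir"

# Hypothetical sketch of an in-place edit: write the result to
# "<filename>.tmp", then rename that file over the original.
def modify_in_place(filename, find, replace)
  tmp_name = "#{filename}.tmp"
  result = File.read(filename).gsub(find, replace)
  File.write(tmp_name, result)
  File.rename(tmp_name, filename)
end

# Try it on a throwaway file.
Dir.mktmpdir do |dir|
  path = File.join(dir, "notes.txt")
  File.write(path, "hello world\n")
  modify_in_place(path, /world/, "rr")
  puts File.read(path)
end
```

Writing to a separate file first means the original is only replaced once the new contents have been written out in full.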

Another original goal of mine was adding support for multiple filenames, specifically so that useful shell tricks like *.txt file globbing would work nicely with rr. Well, support has been added and it works great with the new --modify option.

The usage message has been cleaned up a bit but here is the very basic usage for all new people.

usage: rr [options] find replace [filenames]
       rr [options] s/find/replace/ [filenames]

I wanted to point out a rather hidden feature. The way I implemented the options is that the ARGV array is first parsed for all options, and those options are then removed before the script goes on to parse the find, replace, and filename arguments. This means that your options can go anywhere on the command line so long as they start with a -.

This presents one problem, a workaround, and a question for users. In the form of usage where the find and replace portions are separate arguments, a regex or replacement text that starts with a “-” will be interpreted as an option. You can avoid this by using the s/find/replace/ usage (or by putting the regex in /regex/ format, which is allowed). Really, this boils down to deciding whether or not I am being too liberal with my command line arguments. Since this is a fringe condition with a workaround, I am going to allow options to be placed anywhere. That way you can bring up the last command in bash with the up arrow and add an option (like the new -m) to the end of your rr command, which makes repeating your last command with an option much easier.
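Here is a rough Ruby sketch of that option handling, including the fringe case. It is not rr's actual code: split_arguments is a hypothetical name, and the sketch only covers the form with separate find and replace arguments.

```ruby
# Hypothetical sketch: pull every argument that starts with "-" out of the
# list first, then treat what is left as find, replace, and filenames.
def split_arguments(argv)
  options, positional = argv.partition { |arg| arg.start_with?("-") }
  find, replace, *filenames = positional
  { options: options, find: find, replace: replace, filenames: filenames }
end

# Options may appear at the front or at the end.
p split_arguments(%w[-m foo bar a.txt b.txt])
p split_arguments(%w[foo bar a.txt -m])

# The fringe case: a replacement that starts with "-" is taken as an option,
# so the filename slides into the replace slot.
p split_arguments(%w[foo -bar a.txt])
```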

rr is always free, Try It Out:
rr – Current Version Download
rr – changelog.txt – Click Here

$ gem install regex_replace

rr – 1.0 – Now a Pipe Friendly Filter

rr has reached the 1.0 milestone! The obvious improvement over the last version is that input is allowed from standard input. It seemed silly to always require a filename and the option of having standard input was always on my to do list. Usage is now:

usage: rr [options] find replace [filename]
       rr [options] s/find/replace/ [filename]

Now you can use rr as a filter and happily make find replace changes by piping input into it or out of it! I already have a script that runs a file through 4 rr commands to produce much nicer and cleaner output. Wrap that up in a shell/ruby/perl script and you have a useful tool.
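The filter behaviour boils down to one decision: read the named file if there is one, otherwise read standard input. A minimal Ruby sketch, with hypothetical names (read_input, filter) and in-memory streams standing in for a real pipe:

```ruby
require "stringio"

# Hypothetical sketch: read from the named file when one is given,
# otherwise fall back to the supplied input stream.
def read_input(filename = nil, stdin = $stdin)
  filename ? File.read(filename) : stdin.read
end

# Run the find/replace and write the result to the output stream.
def filter(find, replace, filename = nil, stdin = $stdin, stdout = $stdout)
  stdout.write(read_input(filename, stdin).gsub(find, replace))
end

# Simulate piping "aba" into a find/replace of a -> o.
out = StringIO.new
filter(/a/, "o", nil, StringIO.new("aba\n"), out)
puts out.string
```

Passing the streams in as arguments keeps the sketch easy to test without touching the real STDIN.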

Enjoy. Again, it’s all free!
rr – Current Version Download
rr – changelog.txt – Click Here

$ gem install regex_replace

rr – Updated to 0.9.1

rr now has some improvements, including a new style of usage. Both this new style and the original style usage are available to you.

rr [options] s/find/replace/ filename

The new s/find/replace/ syntax is still weak with respect to the forward slash character in either the find or replace, but works for everything else so far. Of course if you want to include any whitespace then you should wrap the entire argument in quotes. Also, because the strings are coming from the command line, if you want to have literal backslashes then use single quotes around your string so the shell doesn’t escape them itself before sending it to Ruby.
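To show why forward slashes are the weak spot, here is a naive Ruby sketch of splitting an s/find/replace/ argument. parse_substitution is a hypothetical name and this is not rr's actual parser. It simply treats the first "/" after the leading "s/" as the divider.

```ruby
# Hypothetical parser for the s/find/replace/ form. It is deliberately
# naive about "/" inside find or replace, like the weakness described above.
# Returns [find, replace], or nil when the argument is not in that form.
def parse_substitution(arg)
  match = %r{\As/(.*?)/(.*)/\z}m.match(arg)
  match && [match[1], match[2]]
end

p parse_substitution("s/cat/dog/")
p parse_substitution("cat")

# A slash inside the find pattern splits in the wrong place.
p parse_substitution('s/a\/b/c/')
```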

Another highlight is that all escape sequences should now work. That means your typical \n and \t, and all the obscure ones, even \a (system bell). Check out this example. You will hear two system bells once it has been run:

$ echo "aba" > in.1; rr a "\a" in.1; rm in.1
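Since the shell hands the script a literal backslash followed by a letter, something has to turn that pair into the real control character. A minimal Ruby sketch of the idea follows. ESCAPES and unescape are hypothetical names, and the table only covers a handful of sequences rather than everything rr supports.

```ruby
# Hypothetical table mapping a literal backslash plus a letter to the
# control character it names. Only a few sequences are covered here.
ESCAPES = {
  "\\n" => "\n", "\\t" => "\t", "\\a" => "\a",
  "\\r" => "\r", "\\\\" => "\\"
}.freeze

# Replace each known two-character sequence; anything else (such as a
# back reference like \1) is left alone.
def unescape(text)
  text.gsub(/\\[ntar\\]/) { |seq| ESCAPES[seq] }
end

# Mirror the shell example: replace each "a" in "aba" with a bell.
bell = unescape('\a')
result = "aba".gsub("a") { bell }
puts result.count("\a")  # number of bell characters in the result
```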

Also, I was considering renaming it to fr for “Find/Replace”, but I am keeping rr. rr can be interpreted as “Run Regex” with the s/find/replace/ syntax, or “Regex Replace” for the normal 3 argument usage. If you like fr you can easily make an alias like so:

$ alias fr="rr"

Want to know how to always load the alias?

So have fun, of course everything is free and available right here:
rr – Current Version Download
rr – changelog.txt – Click Here

rr – Regex Replace on a File – SotD

I was frustrated with regular expression find/replace programs that only did line processing, because I often had find/replace needs that spanned multiple lines. Programs like grep, ack (which I recently found and is really, really very awesome for searching code), and sed were easy enough to use for basic needs. But again, when it came to multiple line pattern matching they all fell short of my needs.

My solution was to write my own script that parses an entire file as a single string and does my find/replace bidding. The cons, liberal use of memory and a few hundredths of a second longer than the usual find/replace algorithms, seemed insignificant next to the pros: a multi-line capable find/replace using a regular expression, with the capability of using back references (like \1) to incorporate captured groups from the regex into the replacement text.

So, without further ado I present rr.

I am hopeful for some public criticism to help me bring rr up from its current version of 0.9 to a landmark 1.0. The ruby script weighs in at 100 lines but really under 50 are code and the rest is comments, whitespace, or the usage string. Speaking of usage, here is what it currently [v0.9.0] looks like:

usage: rr [options] find replace filename
  find     - a regular expression to be run on the entire file as one string
  replace  - replacement text, \1-\9 and \n are allowed
  filename - name of the input file to be parsed

options:
  --line or -l    process line by line instead of all at once (not default)
  --case or -c    makes the regular expression case sensitive (not default)
  --global or -g  process all occurrences by default (this is already default)

negated options are done by adding 'not' or 'n' in switches like so:
  --notline or -nl

example usage:
  The following takes a file and doubles the last character on each line
  and turns the newlines into two newlines.
  rr "(.)\\n" "\\1\\1\\n\\n" file
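The three switches map onto a few simple choices in Ruby. The sketch below is my own reconstruction from the usage text, not the script itself. regex_replace and its keyword names are hypothetical, and it assumes the replacement text has already had its escape sequences converted.

```ruby
# Hypothetical sketch of the three switches described in the usage text.
# Defaults follow the usage: whole file, case insensitive, all occurrences.
def regex_replace(text, find, replace, line: false, case_sensitive: false, global: true)
  flags = case_sensitive ? 0 : Regexp::IGNORECASE
  regex = Regexp.new(find, flags)
  apply = ->(s) { global ? s.gsub(regex, replace) : s.sub(regex, replace) }
  # In line mode every line is processed on its own, so no match can span
  # a line break.
  line ? text.each_line.map(&apply).join : apply.call(text)
end

# The example from the usage text: double the last character on each line
# and turn each newline into two newlines.
print regex_replace("ab\ncd\n", '(.)\n', "\\1\\1\n\n")
```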

More than likely this will undergo a lot of changes. A quick list of my current ideas includes:

  1. If no filename is provided take input from STDIN. Multiple files can be handled by piping the `cat` of multiple files through rr.
  2. Better switch structure, although right now I don’t have any idea what that is

I’ll throw a test scenario at you. I had tabulated data in a file but each row was split across multiple lines. Now this wasn’t the only data in the file but I’ll present you with a simpler version here: [in.1]

Product A  12.99
           2001
----
Product B   1.99
           1997

Here you can see that I can’t just replace every other newline. What I want to actually do is replace newlines where there was a digit followed by a newline, some whitespace and another digit. I ran this through my script:

> rr "(\d)\n\s+(\d)" "\1  \2" in.1 > out.1

And I got the output I wanted: [out.1]

Product A  12.99  2001
----
Product B   1.99  1997

Even cleaner results can be seen by running a more advanced regex to remove the extra lines:

> rr "(\\d)\\n\\s+(\\d.*?\\n)(-+\\n)?" "\\1  \\2" in.1
Product A  12.99  2001
Product B   1.99  1997
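For the curious, both runs above can be reproduced with plain Ruby by treating the whole input as one string. This is a sketch of the idea rather than the script's source. The method names join_rows and join_rows_and_drop_rules are hypothetical.

```ruby
# The sample input from in.1, held as a single string.
INPUT = [
  "Product A  12.99",
  "           2001",
  "----",
  "Product B   1.99",
  "           1997"
].join("\n") + "\n"

# Join a row that ends in a digit with a next line that starts
# (after whitespace) with a digit.
def join_rows(text)
  text.gsub(/(\d)\n\s+(\d)/, '\1  \2')
end

# Same idea, but also swallow the dashed separator line when present.
def join_rows_and_drop_rules(text)
  text.gsub(/(\d)\n\s+(\d.*?\n)(-+\n)?/, '\1  \2')
end

puts join_rows(INPUT)
puts join_rows_and_drop_rules(INPUT)
```

Because the match runs over the entire string, \n inside the pattern can see across line boundaries, which is exactly what line-by-line tools cannot do.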

So what are you waiting for? Download the script, add it to your bin directory, give it a test run, and tell me how you want it improved!

rr – Most Recent Version – Download

Thanks!
