Debugging HTTP Headers

While writing the skreemr shell the other week I ran into an issue that required me to dig down to one of the lowest levels of web communication… HTTP Headers. I’m always shocked to learn that so many web developers don’t know much about headers. So I thought I would try to reinforce the point that HTTP Headers matter… and that knowing your stuff can help you debug and solve problems. Here is the story of my real-world example.

I’m not going to go over HTTP Headers; that’s already been done. Instead I’m going to focus on debugging and working with them a little bit. I’m going to assume you already have the general concepts down.

The Problem

I wanted to add pagination to the skreemr shell, so you could run a search, then get the next page of results, etc. A little experimentation with skreemr in my browser produced the following URLs:

http://skreemr.com/results.jsp?q=test

http://skreemr.com/results.jsp?q=test&l=10&s=10

http://skreemr.com/results.jsp?q=test&l=10&s=20

A simple pattern! “q” is the query string, “l” is the number per page, “s” is the number to start at, indexed from 0. So that last URL would produce results 21-30. This all worked well in my browsers, but it wasn’t working in Ruby:

require 'open-uri'

# Grab the pages and read the content
# (URI.open is the open-uri entry point; Kernel#open stopped accepting URLs in Ruby 3.0)
str1 = URI.open( 'http://skreemr.com/results.jsp?q=test'      ).read  # 1-10
str2 = URI.open( 'http://skreemr.com/results.jsp?q=test&s=10' ).read  # 11-20

#=> Should print false... but it's printing true!
puts str1==str2

This told me that the content returned from both of those URLs was EXACTLY the same. That couldn’t be… could it?
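As an aside, the q/l/s pattern described earlier is easy to capture in a small helper. This is only a sketch: `results_url` and its `page`/`per_page` arguments are names I made up for illustration, and it assumes the site keeps the three-parameter scheme observed in the browser.

```ruby
require 'uri'

# Hypothetical helper: builds a results URL from the observed parameters.
#   q = query string, l = results per page, s = zero-indexed starting offset
def results_url(query, page: 1, per_page: 10)
  params = { 'q' => query }
  if page > 1
    params['l'] = per_page
    params['s'] = (page - 1) * per_page
  end
  "http://skreemr.com/results.jsp?#{URI.encode_www_form(params)}"
end

puts results_url('test')           # first page of results
puts results_url('test', page: 3)  # third page (results 21-30)
```

Building the query with `URI.encode_www_form` also takes care of escaping search terms that contain spaces or ampersands.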

More Investigation

At this point I thought it was a problem in Ruby. I figured Ruby was doing some caching in the background that I was going to have to disable or work around. (I now know this is not true, but that was my first guess). To test that hypothesis I turned to my trusty friend curl and checked to see if that showed the proper behavior:

shell> curl 'http://skreemr.com/results.jsp?q=test'      -o 1.html
shell> curl 'http://skreemr.com/results.jsp?q=test&s=10' -o 2.html
shell> diff -q -s 1.html 2.html
Files 1.html and 2.html are identical

What!?! That stunned me. Curl was getting the exact same results as Ruby. I took a look at the HTML files, and indeed 2.html, which should have contained results 11-20, held 1-10. I opened both URLs in my browser… they showed the correct results. Something weird was happening!

Take A Step Back

At this point you’ve got to understand what is happening. Curl and Ruby (using Ruby’s Net::HTTP under the hood) are just making a simple GET request. Both my browsers, Safari and Firefox, are sending far more than just a GET request. They are sending a bunch of other headers. Let’s take a look at what Firebug says Firefox sent:

firebug

That’s a mighty long list of headers! It’s entirely possible that one of those might be influencing the server. On to the drama!

Time to Act – Literally

The only difference between what curl sent in its request and what Firefox was sending is that header information and possibly any information contained in those cookies (unlikely in this case). So let’s make curl act as if it is Firefox by sending the same headers with curl! I took a quick peek at my curl reference for the proper switches/formatting and I was ready:

shell> curl 'http://skreemr.com/results.jsp?q=test' -o 1.html
shell> curl 'http://skreemr.com/results.jsp?q=test&s=10'            \
            -H 'Host: skreemr.com'                                  \
            -H 'User-Agent: Mozilla/5.0 (...) Firefox/3.0.6'        \
            -H 'Accept: text/html,application/xml;q=0.9,*/*;q=0.8'  \
            -H 'Accept-Language: en-us,en;q=0.5'                    \
            -H 'Accept-Encoding: gzip,deflate'                      \
            -H 'Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7'     \
            -H 'Keep-Alive: 300'                                    \
            -H 'Connection: keep-alive'                             \
            -H 'Cache-Control: max-age=0'                           \
            -o 2.html
shell> diff -q -s 1.html 2.html
Files 1.html and 2.html differ

Jackpot! Some subset of those headers is indeed fixing my problem, because now it properly fetched the second page of results! Translating that back into Ruby works as well:

require 'open-uri'

# Headers
HEADERS = {
  'Host'            => 'skreemr.com',
  'User-Agent'      => 'Mozilla/5.0 (...) Firefox/3.0.6',
  'Accept'          => 'text/html,application/xml;q=0.9,*/*;q=0.8',
  'Accept-Language' => 'en-us,en;q=0.5',
  'Accept-Encoding' => 'gzip,deflate',
  'Accept-Charset'  => 'ISO-8859-1,utf-8;q=0.7,*;q=0.7',
  'Keep-Alive'      => '300',
  'Connection'      => 'keep-alive',
  'Cache-Control'   => 'max-age=0'
}

# Get the Pages (string keys in the hash are sent as request headers)
str1 = URI.open( 'http://skreemr.com/results.jsp?q=test', HEADERS ).read
str2 = URI.open( 'http://skreemr.com/results.jsp?q=test&s=10', HEADERS ).read

# Correctly Prints false
puts str1==str2

Problem Solved

Sure, we have a solution but it’s not pretty. How much is really needed? Finding that out is just a matter of eliminating the headers that do nothing. Remove one, test, remove another, test, remove another, test… As long as it still works after you’ve removed an individual header you know that header wasn’t needed. It took me a few minutes but I narrowed it down to just a single header:

MAGIC = { 'Accept-Language' => 'en-us' }
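The remove-one-and-test loop can also be automated. Here is a minimal sketch, not what I actually ran: `minimal_headers` is a hypothetical helper, and the block passed to it is a stand-in for the real check (fetch both pages with the trial headers and confirm that they differ).

```ruby
# Hypothetical helper: drops each header in turn and keeps the removal
# whenever the caller-supplied check still passes without that header.
def minimal_headers(headers)
  needed = headers.dup
  headers.each_key do |name|
    trial = needed.reject { |key, _| key == name }
    needed = trial if yield(trial)
  end
  needed
end

# Stand-in check for illustration only: pretend the request works
# whenever Accept-Language is present.
all_headers = {
  'User-Agent'      => 'Mozilla/5.0 (...) Firefox/3.0.6',
  'Accept-Language' => 'en-us,en;q=0.5',
  'Cache-Control'   => 'max-age=0'
}
result = minimal_headers(all_headers) { |h| h.key?('Accept-Language') }
p result
```

One pass like this finds a set where no single header can be removed; it does not promise the smallest possible set if headers interact with each other.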

This was a rather unique problem. It is rare for me to have to dip down to the HTTP Headers to see what is actually happening for such a high-level problem. However, as this problem shows, it’s important to know what is going on under the hood. If I didn’t know about headers, I would not have been able to solve it.

Mini MP3 Searching Shell – Skreemr

So it’s way too hard to download an mp3 in Safari. Right click the link and download? Pff, I want to ⌃S and be done with it. Well, this time I decided to avoid the problem altogether. I use Skreemr to search for a particular song when it interests me.

In the past I wrote a little bash script that makes use of curl to download an mp3 to my desktop, uniquely named so it wouldn’t have conflicts. This shell essentially wraps and drastically improves that to allow for searching, pagination, history, downloading, and opening mp3s off of Skreemr. It gives me just what I need: the functionality that I want without having to use torrents, etc. I’m thinking of turning this into a gem.

skreemr.png

This script requires the popular “escape.rb” script that gives some nice and safe shell escaping functions. You can download both from my GitHub scripts project.

Of course it’s available on my ~/bin and there will be another article later on that goes over a few aspects of this simple little script.

skreemr2.png

Instant Web Sharing on Your Mac

I came across a few articles recently that point out how to instantly share a folder on your computer. They basically ride on top of this elegant python script:

python -m SimpleHTTPServer

It works great but I wanted to improve on it in a number of ways:

  • Automatically copy a URI into my clipboard so I can easily paste it to others.

  • Make that URI nicer than just an IP address.

  • Use a non-standard port, for security.

  • Open in a new tab so I can keep working in the directory and yet still monitor the HTTP requests being made.

Here is what I produced. (It’s up in my ~/bin.)

#!/bin/bash
# Start Date: Sunday February 8, 2009
# Current Version: 0.9
# Author: Joseph Pecoraro
# Contact: joepeck02@gmail.com
#
# Description:  Immediately share the current directory
#   in a new tab so you can monitor the requests made
#   and keep your original tab to continue working in that
#   directory.  Meant for Mac OS X.
#
#   1. Echoes the URI
#   2. Puts the URI into your Clipboard
#   3. Opens a new tab in the terminal
#   4. Changes Directory to the other tab's directory
#   5. Echoes the URI
#   6. Runs the Web Server
#   7. Optionally Opens in Safari
#
# Sources that Helped:
#   New Tab Here: http://justinfrench.com/index.php?id=231
#   HTTPServer:   http://www.commandlinefu.com/commands/view/71/
#   Paul Berens:  http://zibundemo.blogspot.com/
#

# -----------------
#   Host and Port
# -----------------

# This gets your ip address and converts it to a nice string
es_host=$(curl --silent www.whatismyip.com/automation/n09230945.asp)
es_host=$(nslookup $es_host | awk '/name =/{print substr($4,1,length($4)-1)}')
es_port="8000"

# -----------------
#   Script Below
# -----------------

echo "http://$es_host:$es_port"
echo -n "http://$es_host:$es_port" | pbcopy
osascript -e "
Tell application \"Terminal\"
  activate
  tell application \"System Events\" to tell process \"Terminal\" to keystroke \"t\" using command down
  do script with command \"cd '$(pwd)'\" in selected tab of the front window
  do script with command \"clear; echo '$es_host:$es_port/'\" in selected tab of the front window
  do script with command \"python -m SimpleHTTPServer $es_port\" in selected tab of the front window
end tell" &> /dev/null

# Optional: Open Safari, Just Uncomment the next line
# open "http://$es_host:$es_port"

# Cleanup
unset es_host
unset es_port

Now that should work on any Mac. And it should give a nicer URL than an ugly IP address. You should see something like this:

easy_share

As soon as it starts you can paste the URL to anyone you’re chatting with. It couldn’t be simpler!

If you’re experienced enough with DNS servers and you’ve given your computer a Dynamic Name you can customize the script. Paul Berens gave me a great suggestion to determine if I’m on my local network at home. I can check the MAC address of my default gateway (my wireless router). That is a quick check to see if I’m at home. If I’m at home I use my bogojoker.is-a-geek.com URI automatically! Otherwise it falls back to generating the address dynamically. Check it out:

# -----------------
#   Host and Port
# -----------------

# Mac Address of my Router At Home
if [ -n "$(arp -a | grep 0:1e:2a:76:17:98)" ]; then
  es_host="bogojoker.is-a-geek.com"
  es_port="8000"

# Otherwise Dynamically Determine
else
  es_host=$(curl --silent www.whatismyip.com/automation/n09230945.asp)
  es_host=$(nslookup $es_host | awk '/name =/{print substr($4,1,length($4)-1)}')
  es_port="8000"
fi

So now when I run easy_share at my house it always throws out bogojoker.is-a-geek.com URIs. Much nicer on the eyes and easy to remember. I’ll write about dynamic names like this another time!
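The same home-or-away decision is small enough to sketch in Ruby. `share_host` is a hypothetical helper, the ARP line below is a made-up sample, and `dynamic.example.com` is a placeholder for whatever the dynamic lookup would return.

```ruby
# Hypothetical helper mirroring the if/else in the script: use the home
# host name when the router's MAC address appears in the ARP table output.
def share_host(arp_output, home_mac, home_host, fallback_host)
  arp_output.include?(home_mac) ? home_host : fallback_host
end

# Made-up ARP line for illustration; on a real machine pass in the
# output of `arp -a` instead.
sample_arp = '? (192.168.1.1) at 0:1e:2a:76:17:98 on en1'
puts share_host(sample_arp, '0:1e:2a:76:17:98',
                'bogojoker.is-a-geek.com', 'dynamic.example.com')
```

Like the `grep` in the bash version, this is a plain substring match, so it is a quick check rather than a strict one.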

“Back and Forth” Greasemonkey For The Whole Web

Recently I wrote a Greasemonkey script to add keyboard shortcuts to The Big Picture, to improve on some of their already existing shortcuts. Once I started using some of the shortcuts I made I ended up wanting to use them all over the place at other blogs. This functionality is so tiny, but so useful, that I bundled it into its own script that runs on all web pages!

Grab it here:

//
// ==UserScript==
// @name          Back and Forth
// @namespace     http://blog.bogojoker.com
// @description   Keyboard Shortcut to Jump back and forth on a page. (esc key).
// @include       *
// @version       1.0 - Initial Version - Sunday February 15, 2009
// ==/UserScript==

(function() {

  // Global States
  var x = null;
  var y = null;

  // Add a new Global Key Listener for `esc`
  document.addEventListener("keypress", function(e) {
    if(!e) e=window.event;
    var key = e.keyCode ? e.keyCode : e.which;
    if ( key == 27 ) {
      var tempx = x;
      var tempy = y;
      x = Math.max(document.documentElement.scrollLeft, document.body.scrollLeft);
      y = Math.max(document.documentElement.scrollTop, document.body.scrollTop);
      if ( tempx != null ) { // First time it should be null
        window.scrollTo(tempx, tempy);
      }
    }
  }, true);

})();

On any webpage the first time you push the `esc` key position A gets stored. The next time you push `esc` position B gets stored and the browser jumps to position A. The next time you push it, A gets stored and you jump to B. So you always jump back to wherever you pushed `esc` last. Hence the name “back and forth.”
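The swap is easier to see with the browser stripped away. Below is a tiny model of it in Ruby, purely for illustration: the `BackAndForth` class and its `press` method are my own names, and positions are plain `[x, y]` pairs instead of real scroll offsets.

```ruby
# Illustrative model of the swap; positions are [x, y] pairs.
class BackAndForth
  def initialize
    @saved = nil
  end

  # Stores the current position and returns the previously stored one
  # (nil on the first press, when there is nowhere to jump yet).
  def press(current)
    target = @saved
    @saved = current
    target
  end
end

toggle = BackAndForth.new
p toggle.press([0, 100])  # first press: nothing to jump to yet
p toggle.press([0, 900])  # jump target is the first position
p toggle.press([0, 100])  # jump target is the second position
```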

This is useful to me when I jump between comments and the content. When I’m reading a comment and I want to check back to the article, I just push `esc` to save my position, go back to the article, and when I’m all set I just jump back to my saved position (the comments) with `esc`.

Short, Sweet, Simple: The Back and Forth Greasemonkey Script.

More Big Picture Keyboard Functionality!

I’ve mentioned before how I’m a big fan of The Big Picture blog. One of the things that makes it so great is that it has keyboard navigation! You can use ‘j’ and ‘k’ to automatically jump between pictures. It’s so much nicer than scrolling because it jumps to the exact height to maximize the picture in the browser. Huge usability improvement!

Like before, the problem I had was that users were mentioning pictures in their comments. Jumping back to that picture was hard or annoying. So, I wrote a Greasemonkey script that allows you to type in a number and it will automatically jump to that picture! Click here to get the script!

Oh, and if you’re reading comments and you want to jump back and forth between images and comments that works too. Once you’ve jumped to the image, just hit ‘esc’ and you will be taken back to where you were before. Too cool!

//
// ==UserScript==
// @name          The Big Picture Keyboard Enhancements
// @namespace     http://blog.bogojoker.com
// @description   Keyboard Shortcut Enhancements
// @include       http://www.boston.com/bigpicture/*
// @version       1.1 - Added Back+Forth - Thursday February 12, 2009
//                1.0 - Initial Version - Monday February 9, 2009
//
//
//   This allows the user to type in numbers, and after about
//   a half second it will jump directly to that image.
//   For example:
//
//     Push '1'... '2'... User is taken directly to Image "12"
//     Push '9'... '9'... User is taken to the last picture.
//
//   Use 'esc' to jump back and forth between two positions.
//   For example if a comment mentions picture 4:
//
//     Push '4'...        User is taken directly to Image "4"
//     Push 'esc'         User is taken back to the comment!
//     Push 'esc'         User is taken back to Image "4"
//
// ==/UserScript==

(function() {

  // GreaseMonkey (Firefox - unsafeWindow) and GreaseKit (Safari - window)
  var w = ( /a/[-1]=='a') ? unsafeWindow : window;

  // Global States
  var x, y;
  var keypressnumber  = false;
  var builtupnumber   = '';
  var quicknumtimeout = null;
  var imgArr = document.getElementsByClassName("bpImage");

  // Keep the old and create a New Global Keypress listener on top of it
  document.addEventListener("keypress", function(e) {

    // Get the key
    if(!e) e=window.event;
    var key = e.keyCode ? e.keyCode : e.which;

    // Store the current x/y position
    function storePos() {
      x = Math.max(document.documentElement.scrollLeft, document.body.scrollLeft);
      y = Math.max(document.documentElement.scrollTop, document.body.scrollTop);
    }

    // # Character => Jump to that image, Store Position
    if ( key >= 48 && key <= 57 ) {
      if ( e.target.nodeName.match(/TEXTAREA|INPUT/) ) return;
      clearTimeout(quicknumtimeout);
      keypressnumber = true;
      builtupnumber += (key - 0x30);
      quicknumtimeout = setTimeout(function() {
        w['currImg'] = parseInt(builtupnumber,10)-1;
        if (w['currImg'] >= imgArr.length) { w['currImg'] = imgArr.length-1; }
        storePos();
        window.scrollTo(0,imgArr[ w['currImg'] ].offsetTop+174);
        keypressnumber = false;
        builtupnumber = '';
        quicknumtimeout = null;
      }, 300);
    }

    // Esc => Jump back to a Saved Position
    if ( key == 27 ) {
      var tempx = x; var tempy = y; storePos();
      window.scrollTo(tempx, tempy);
    }   

  }, true);

})();

All I do is register a new global keyboard listener to catch numeric keys and act accordingly. I tested thoroughly on Firefox and Safari to make sure it works correctly in all cases. It jumps to the correct image, it maintains the “current image” so j/k will still work, it won’t jump if you’re typing in a textfield/input, etc. It even properly handles situations that the current j/k functionality doesn’t. For instance mine allows the user to type in the comment box, click outside the box and reuse the keyboard shortcuts. When you type a j/k or even click inside the search box at the top the j/k functionality is gone. I didn’t feel like correcting that in this Greasemonkey script in case it gets fixed by the developers behind the Big Picture. (Note to those developers: reset isLoaded back to true or take a different approach.)

I’m open to New Ideas. I have some myself but I have to focus on schoolwork in these next few weeks. Let me know if you want anything.

Big Picture Keyboard Commands Greasemonkey Script!

Enjoy.

Always Have Correct Footer Dates

One thing that really bothers me is when the year changes (2008 to 2009) and I see a ton of websites that still sport “© 2008” in their footer. So, I wanted to share my PHP code to handle this case so that my sites are always “up to date.”

<?php

/*
  Given Start Year is 2007
  Pretend Current Year is 2007
    => Output is: "2007"
  Pretend Current Year is 2009
    => Output is: "2007-2009"
*/
function footerDate($startYear, $delim='-') {
  $currYear = date('Y');
  if ( intval($currYear) > intval($startYear) ) {
    return $startYear . $delim . $currYear;
  } else {
    return $startYear;
  }
}

?>

<!-- Example Usage -->
<p>&copy; <?php echo footerDate('2009'); ?> Joseph Pecoraro</p>
<p>&copy; <?php echo footerDate('2007'); ?> Joseph Pecoraro</p>
<p>&copy; <?php echo footerDate('2007', ' to '); ?> Joseph Pecoraro</p>

And the ugly one-liner if you only need a little clip of PHP:

<?php $c=date('Y');echo '2009'.((intval($c)>2009)?' - '.$c:''); ?>

So please, update your footers now so that you never have to update them again! Thanks.

My Perl and Ruby Story

I’m not old enough to have grown attached to Perl. When I learned Perl people were already looking at Ruby and Python and proclaiming glorious victories. At that same time I saw people gripping Perl, not willing to leave their witty, terse confidant for those prettier, risky “New Kids on the Block.” I had dabbled with PHP and wanted more power. Perl was the natural choice. Previous experience, I thought, told me that I wanted a programming language where symbols, not words, held power. I quickly saw that in Perl. There were scripts nearly 50% symbols performing enticing feats with fewer characters than a Java Hello World program. So I took a look at Perl.

Summer Dreams Ripped at the Seams

As is the case with most Perl programmers I felt right at home. Perl has a tendency to do whatever you, the programmer, want it to do. “Do you want to suddenly make this variable in the middle of an if statement’s condition? Sure, why not, you’re the boss!” I instantly fell in love with its regular expression support. Learning to exploit the s/// statement alone has probably had far more impact on my entire programming career than any college course I’ve taken. The idea of a default variable was a poison I was happy to swallow. It just made so much sense.

# A Perl Script I wrote... Usage is not important
while (<>) {
   print "\n$1: $2\n" if /(.*?)\s*\w+ \d+, (\d+:\d+)\s*$/;
   print if ! /\s*\d+:\d+\s*$/;
}

My interest in Perl stopped when I saw its Object Oriented behavior. However, by that time I was far more interested in its scripting power. I took a very close look at Perl’s syntax and grammar. I knew every optional space… every optional semicolon… how to reduce an entire program to a single line and then reduce it further and further with dirty tricks. I ate up as many of Perl’s special variables as I could, spitting them out into my scripts. Why write “\n” when $/ is two fewer characters… Essentially I learned to “Perl Golf.” It is the art of whittling a simple script down to as few characters as possible. This “black magic” side of Perl that got criticized by so many was a dangerous, but fun, sandbox that I spent my time in. Then, it all stopped.

$_=chr 123*rand,/[\da-z]/i?$a.=$_:redo for 1..pop;print$a,$/

Normally, whenever I had a few weeks I would spend them learning a new programming language for fun. When I learned Perl I had three weeks off from school before I was to start my first Co-op. Time flew and my “fling” with Perl had to end. Little did I know that Perl left a rash on me that needed to be scratched. This disease made other programming languages look disgusting. Why is it so hard to parse files in Java, why is there no built-in regular expression support in C++, what is wrong with these languages?!

A New Direction

Time went on. I was given another two weeks. Around that time I had made a decision that I would invest my personal time into web development. I spent hours and hours reading, learning, experimenting, grasping, producing, and all that jazz. It was reaching the end of those two weeks and one thing had been continually pounded into my head… Ruby on Rails was going to be big. I was young, I wanted in! With little time left in those two weeks I figured my time would be best spent focusing on Ruby and investing in Rails when I got more time. I honestly haven’t gotten around to learning Rails yet because I was so enamored with Ruby.

Learning Ruby was initially painful. I’ll admit it. I took a pass at it and was disgusted. To be honest, I can’t remember what was so painful. For one, I was intimidated by its block structure. Being as immature as I was, it felt too verbose… like a sore thumb in the language. But what the hell did I know? I still gave it a shot.

Why's Poignant Guide To Ruby

This book had a greater impact on me than just teaching me Ruby. It reaffirmed what I had seen from all over the web. Ruby users had a sense of humor, they tended toward creative expressiveness instead of technical explicitness, they invested in fun, they were not satisfied with the mundane. Also, the idea that a tutorial could be a story changed the way I thought about teaching in general. I made it far enough to drop my misapprehensions about Ruby’s syntax and embrace something Ruby did well: Objects.

Perl Replacement

The more I used Ruby the more I realized that I wouldn’t need Perl. All my special variables were there. Replacing s/// with gsub was painful, but undeniably cleaner. I had matured slightly as a programmer and saw the need for code clarity and I now enjoyed Ruby’s block structure. When I challenged myself to create a simple Object Oriented program I saw how Ruby’s classes removed a lot of the repetition I normally put into Java classes. I was pleased.
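For anyone who hasn’t made the jump, here is roughly what that s///-to-gsub translation looks like. It is a small sketch with a made-up sample string, not code from any of my scripts.

```ruby
line = 'Perl is fun, Perl is terse'

# Perl would modify in place:  $line =~ s/Perl/Ruby/g;
# Ruby's gsub returns a new string instead (gsub! modifies in place).
copy = line.gsub(/Perl/, 'Ruby')

# The block form still has the familiar capture variable $1.
shout = line.gsub(/(\w+) is/) { "#{$1.upcase} is" }

puts copy
puts shout
```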

I have to mention irb, Ruby’s interactive interpreter. This tool has proven invaluable to me. I almost always have an irb prompt open. It is my calculator, my hacking companion, my playground, a debugger, and a friend. Perl lacks such an environment. I’ve had to really work to get the perlconsole working and Perl’s debug REPL just isn’t nearly the same. I’ve been able to expand and customize irb into an indispensable tool that can help me on any project. I think that’s important, because with my Generation-X attention span, being able to copy.paste.sort.filter.reorder any text at a moment’s notice is “clutch.”

irb

Finally, I was impressed with Ruby gems. Need an html/xml parsing library and you’re unhappy with Ruby’s? Fine, `gem install hpricot`. It’s so easy to download a new library, or even a new tool like `gem install cheat`. I liked the idea of distributing Ruby scripts via gems so much that I even made my own script a Ruby gem (regex_replace). That way I can quickly download/install it on any machine where I need it. The ability to build off of existing Ruby libraries is exciting.

The Beginning

As clichéd as that sounds, I feel like it can only get better. I’ve gotten comfortable enough with Ruby that I now write all my day-to-day scripts in it. I’m truly glad that I learned Perl, because it introduced me to the scripting world, and it is a unix staple. But, like so many others, I feel there is a brighter path. I think that’s more than enough for today.

Cheers.
