Web Analytics Taken to the Next Level

I came up with a neat idea the other night. Using localStorage and sessionStorage you could theoretically monitor the number of tabs or windows a visitor has opened for your site. As far as I know, this capability has never before been possible. Well, now it is.

Check out this crude example.

Once you open the monitor, leave that tab/window put. Use the “Spawn Tab” link to create new tabs and windows. The monitor will be notified and display some simple debug. There are 10 second updates per tab/window so that when they close the monitor can detect it. The monitor will detect a close within 15 seconds of the tab/window closing and will display the total time the tab/window plus or minus 10 seconds. Correct values are maintained as the tab/windows browse across pages as long as they stay on the domain! Just about everything you’d want or need.

Again, I mentioned this is rather crude. The fact that the monitor tab remains open is only due to the fact that I wanted to prototype the idea. The majority of the state is stored in localStorage, and each tab/window maintains a single identifier in its sessionStorage to remind the tab/window what id it was while it navigates to multiple pages. Because everything is stored in the storage this system has the capability to become completely distributed. Meaning no “monitor” tab is necessary, and the scripts can determine, and monitor, on their own the existence of all other tabs. Thus, this would be a viable option for the next level of web analytics.

As cool as this is, I don’t think it will provide too much value to the analytics. For the first time webmasters will be able to know how many windows or tabs a visitor opens (and to what pages they open). The webmaster will know more about how its user’s use the website, but I don’t think this statistic will be a game changer. Who knows!

So, how does it work? Very simple. Each tab includes the client.js code to handle updating the localStorage and maintaining its own “tab_id” in sessionStorage. Data it maintains can be whatever you want, I went with some simple information such as its start time, current url, and latest keepalive:

localStorage values

The upkeep for a Tab Client is to restore their session information when you navigate to any new page:

// Create or Restore tab_id
var myTabId = sessionStorage.tab;
if ( myTabId === undefined ) {
  var tabs = localStorage.tabs;
  if ( tabs === undefined ) {
    myTabId = 0;
    localStorage.tabs = "0";
  } else {
    var largest = parseInt( tabs.split(/,/).pop(), 10 );
    myTabId = largest+1;
    localStorage.tabs += "," + myTabId;
  }
  sessionStorage.tab = myTabId;
}

And to perform its keepalives:

// Update the Latest Timestamp
function setLatest() {
  var key = 'tab'+myTabId+'_latest';
  localStorage.setItem(key, +(new Date())); 
}

// Update Status every 10 seconds
window.setInterval(setLatest,10000);
setLatest();

I put a few more convenience functions in there to help it update these localStorage keys, and communicate with the monitor which was crudely done through localStorage. That is explained next.

The Tab Monitor as it stands right now receives messages through localStorage’s “storage” event. It also checks all the tab’s “lastest” keepalives to make sure they didn’t pass their 10 second limit. In the case of a tab being closed, it will remove references to that tab and output an approximation of the time the tab was open:

// Listener - receive messages from tabs
window.addEventListener('storage', function(e) {
  if ( e.storageArea === localStorage ) {
    if ( e.key == "tab_msg" ) {
      console.log( e.newValue );
      addMsg( e.newValue ); // Appends to the page
    }
  }
});

// Purger - clean out tabs that died for 15 seconds
window.setInterval(function() {
  console.log("purging");
  var now = +(new Date()),
      tabs = localStorage.tabs;
  if ( tabs !== undefined ) {
    var toRemove = [], toKeep = [];
    tabs = tabs.split(/,/);
    for (var i=0, len=tabs.length; i<len; i++) {
      var tabId = tabs[i],
          tabLatest = parseInt( localStorage.getItem("tab"+tabId+"_latest") );
      ( (now-tabLatest)>=15000 ? toRemove : toKeep ).push(tabId);
    }
    localStorage.tabs = toKeep.join(',');
    for (var i=0, len=toRemove.length; i<len; i++) {
      var tabId = toRemove[i],
          tabLatest = parseInt( localStorage.getItem("tab"+tabId+"_latest") ),
          tabStart = parseInt( localStorage.getItem("tab"+tabId+"_start") ),
          time = (tabLatest-tabStart)/1000;
      addMsg( "Tab " + tabId + " Closed after " + time + " seconds!" );
    }
  }
}, 5000);

This took a little under an hour to get working. There are still minor issues that I didn’t attempt to resolve. However, if there is interest this could be developed into a completely distributed peer-to-peer communication between tabs/windows on a single domain. However, a little warning. Web Storage is not set in stone. Not all browsers have implemented it and the specification is subject to change at any minute. There has been some rather heated debate on the subject of Web Storage recently, with good reason. All I know is that when its settled, this functionality will continue to exist!

Let me know what you think.

Handling the tab key in a <textarea>

Traversing through input elements with the tab key is important for accessibility reasons. However, every once in a while you come across a situation where traversal isn’t really important. Instead, you want the tab key to actually do something for you. Even still, you may want to do something fancy with the tab key. Wether its replacing it with spaces or something else.

I found an interesting website today that had an interesting idea. You could run some test code on the page to test their library. Their instructions said, “push tab to evaluate the code.” Sure enough you could tell it was working “onblur” for the textarea. The problem with this was that when you pushed tab you lost focus.

I thought about it, and figured you could do a rather simple trick to run some code and refocus on the textarea. It goes a little like this:

window.addEventListener('load', function() {
  var textarea = document.getElementById('txt');
  textarea.addEventListener('keydown', function(e) {
    if (e.keyCode === 9) {
      e.preventDefault();
      e.stopPropagation();
      // operation goes here
    }
  });
});

Note that to get the actual character you have to get the character from the event. There are many ways to do it, keyCode, charCode, which, even keyIdentifier. You’ll have to mix things up to work across all browsers. Basically 9 is the code for the tab key. So when you get the tab key, it prevents the default behavior and allows you to execute whatever code you want: run some functions, eval some code, display something, ajax request, whatever you want. Simple. I think it would improve a few interfaces. Neat idea to make use of the tab key to perform a function.

You can check out this example of what I mean.

Visualize the Directory Tree

Ever have to work with a new directory and you have no idea what its structure is? Or maybe you have a few files laying around but you’re not sure which sub-directory they are in? Or maybe you’re showing someone else a project and you want to show them the directory hierarchy. The bare bones solutions of `ls -R` or `find .` are just too archaic and offer no visualization of the structure. To solve this problem, people have built their own “tree” scripts.

There are a few tree scripts available online. Some as simple as find | sed and others are slightly more advanced like a python script. I wasn’t pleased with the existing solutions, so I wrote my own. To get an idea of what I’m talking about take a look at this screenshot showing the listing of a Rails project:

tree

Its a simple, clean listing of the directory tree. I will admit, the style is based off of another tree script that I’ve seen that I liked. Also, this isn’t really production quality code. I take the lazy way out and first get a directory listing and then work from there. This means that for large directories there may be an initial pause before it starts outputting. I wouldn’t suggest running this on your home directory. Although I can think of better algorithms its unlikely that I would want to run this on huge directories so I’m more then happy right now.

The usage is pretty bare bones:

tree usage

This is one more script I’ve added to my ~/bin and its completely open source on GitHub. Its just straight Ruby, no extra packages, works with 1.8 and 1.9. Oh and did I mention its customizable?

I hope you like it.

Markdown => Tutorial in 1 Step

When I wrote my Ruby Readline tutorial I felt I came up with a cool concept. I started with a Markdown file, translated it to html, and I used the headers to generate a Table of Contents on the fly.

table of contents

It didn’t take me long to realize that I could turn this into a framework where I could turn any Markdown file into a tutorial exactly like this one. So with surprisingly little work I modified the scripts to work with any markdown file and the html automatically generated from the standard Markdown.pl script.

I called this markdownorial. Laugh all you want at the name, but I still think the concept is very cool. I’ll probably be using this more and more to automatically generate and format a pretty cool looking tutorial from a single markdown file. The start to finish time for a project like this has instantly dropped to just the raw content part, no design or coding needed!

Advantages include:

  • Writing Markdown is very fast and efficient.

  • Time is spent writing the content. Not messing with design,

  • Table of Contents is automatically built for you.

  • Useful permalinks are automatically generated. Very useful when passing around links.

  • Clean user interface that focuses entirely on the content but the Table of Contents is always available!

  • Git Repository means if I update the design its just a `git pull` away to get the update.

Right now the tutorials shows up elegantly in all standards compliant browsers. Safari/Webkit and Chrome display it perfectly. Opera has some very minor Unicode issues but displays everything perfectly. Firefox has some separate Unicode issues and if you don’t have the latest version it has some working but slow animation. Overall, its entirely usable for people using decent browsers.

Let me know what you think. Feel free to use it and improve it. Its all up on Github.

Check if Your Ruby Script is in a Pipe

Short and simple this week. When you’re writing a Ruby script you might want to know if you’re in a pipe or not. One reason might be changing how you buffer your input/output. A number of basic Unix commands do this, they act differently when piped.

I actually wrote a script where I wanted to change the command line arguments if I was in a pipe. This is probably confusing but its really useful! So there is actually an easy way to do this in Ruby:

#!/usr/bin/env ruby
# Prints if the input or output is regular or piped
puts "INPUT:  #{STDIN.tty?  ? "regular" : "pipe"}"
puts "OUTPUT: #{STDOUT.tty? ? "regular" : "pipe"}"

Proof that it works as advertised:

IO#tty?

The documentation on IO#tty? says the following:

Returns true if [the IO stream] is associated with a terminal device (tty), false otherwise.

For those that don’t know, tty is short for teletype writer. This is an old Unix term for an interactive process that takes in user input. Nowadays it almost always means a shell/console/terminal.

Data URIs

I’m guessing that a lot of web developers don’t know about data URIs. Most probably won’t even have to know about them. However, I still want to go over them because I think they are clever and could prove useful. The main advantage I see is the possibility to avoid an HTTP Request, especially for a small image.

Wikipedia’s overview is quite thorough. The article also links to a number of places where you can quickly throw together some data URI’s for any mime type. The most well known is the data URI kitchen.

Quickly, the format of a data URI is as follows:

data:[<MIME-type>][;charset="<encoding>"][;base64],<data>

Note that almost every section is optional! The mime type option means you can do some very interesting things. The normal usage might be to to embed a small image, as a datauri, inside the html document. However, another usage might be to embed an entire HTML document inside a link!

I was looking at the code for some command line utilities that make datauri’s and I learned an easy way to find out the mime type of a file is to use the `file` command:

shell> file sym.png
sym.png: PNG image data, 76 x 78, 8-bit/color RGBA, non-interlaced

shell> file -i sym.png
sym.png: regular file

shell> file --mime sym.png
sym.png: image/png

So using `file –mime` and reading whats after the ‘: ‘ you can easily get a mime-type for any file. Then reading and base64 encoding the content you have a quick hack to quickly create datauris! Here goes:

#!/usr/bin/env ruby
# Quick & Dirty data URI for a file
#
require 'base64'
require 'uri'

# Filename and Contents
filename = ARGV[0]
contents = File.read(filename)
base64 = Base64.encode64(contents).gsub("\n",'')

# Mime Type for the Filename
output = `file --mime #{filename}`
mime = output.match( /: (.*)$/ )[1].downcase.gsub(/\s/,'')

# Make Data URI
datauri = "data:#{mime};base64,#{base64}"

# Output
puts datauri

Some better scripts are available. Here is a Perl implementation and a pretty neat Ruby Implementation.

Freenode JSBot Command Line Script

So over my week break from college I spent a bunch of time in ##javascript learning and helping others with Javascript problems. This was to help me prepare for one of the projects that I’m working on (still to be announced).

One of the things I really liked in ##javascript was the freenode jsbot that could do all sorts of things. It was so useful in fact that I felt I had to have it for when I’m not using IRC. The website mentioned an API, so I dug in.

I wrote a command line jsbot script and added it to my ~/bin:

jsbot

A clip of the source code (yes its horrible… but its so compact!) shows how easy it is to work with JSON in Ruby. Just a few includes and its just as easy as Javascript, without the cross-site request issues:

#!/usr/bin/env ruby
# Author: Joseph Pecoraro
# Date: Friday March 6, 2009
# Description: Simple Interface for the
# really neat jsbot!

require 'rubygems'
require 'open-uri'
require 'json'
require 'cgi'
require File.dirname(__FILE__) + '/escape'


class JSBot

  JSON_PREFIX = 'http://fn-js.info/jsbot.xhr?'
  SITE_PREFIX = 'http://js.isite.net.au/jsbot?'

  def search(str)
    uri = url(str, JSON_PREFIX, "search=")
    JSON.parse( open( uri ).read )
  end

  def show(str)
    uri = url(str, JSON_PREFIX, "show=")
    JSON.parse( open( uri ).read )
  end

  def url(str, u=SITE_PREFIX, q='q=')
    u + q + CGI::escape(str)
  end

end

...

search