Articles with tag ‘web’

 
 

Why CDATA Matters in XML

You’ve seen it before, but you may not know what it means. Wikipedia describes CDATA as meaning “Character Data”, which makes sense. W3Schools goes one step further and points out that this text should not be parsed by the XML parser. The general idea is that when you want to display straight textual data, without encoding characters or having them interpreted by the parser, you can just wrap that data inside a CDATA tag.

Needless to say, this is clearly the ugliest tag currently in existence (let’s leave room for the future though):

<![CDATA[ ... ]]>

I promised to tell you why it mattered

Yes, words in the title become a promise. You can hold me to that in the future. Why does CDATA matter? Well, I’ll actually side-step the question for a minute and show you what looks like a perfectly fine XHTML document (keep in mind that XHTML is an application of XML, so XML’s parsing rules apply to it):

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <meta http-equiv="Content-type" content="text/html; charset=utf-8" />
  <title>Simple Example</title>
</head>
<body>
  <h1>Welcome</h1>
  <script type="text/javascript">
    document.write('<p>Hello, World</p>');
  </script>
</body>
</html>

Looks simple enough. We can ignore the fact that it’s not really the best way of doing things, but who cares… it’s a Hello World example, right? Well, technically this is not valid XHTML. The validator shows we’ve got a single error:

Line 11, Column 20: document type does not allow element "p" here.
document.write('<p>Hello, World</p>');

The element named above was found in a context where it is not allowed. This could mean that you have incorrectly nested elements — such as a “style” element in the “body” section instead of inside “head” — or two elements that overlap (which is not allowed).

Well, the validator hints at the cause of the error, but it is hard to understand unless you really know your XML! We are inside a “<script>” tag, we’ve got some text, and all of a sudden a “<p>” pops up! An XML parser doesn’t care that it’s in the middle of a JavaScript string; it sees another tag, and that tag doesn’t make sense there.

So, here is why CDATA matters. You want your Javascript to be left alone by the XML Parser. In HTML, Javascript is interpreted as text, so it can just be left alone as plain text by wrapping it in a CDATA tag:

<script type="text/javascript">
  <![CDATA[
    document.write('<p>Hello, World</p>');
  ]]>
</script>

Don’t get excited yet. Yes, that passes the W3C validator, but the JavaScript fails to run. Why? Well, I actually haven’t got a clue. My guess would be that the CDATA markers are not stripped out of the script, so they invalidate the JavaScript when it tries to run. In any event, let’s check our steps… Is it valid XML? Check. Is it being served as XML? Well, actually I don’t think so.

Here is a nice resource that talks about Understanding HTML, XML, and XHTML. If you haven’t read that article, either read it now or once you’re done here; it’s important, no matter how old it is. To pull a quote:

to really send xhtml, an xhtml page must be served as xml and therefore have one of the following Content-Type’s (text/xml, application/xml, application/xhtml+xml) to a browser.

This is a simple one-liner in PHP. I added the following code to the top of the page and resent it to my browser:

<?php header('Content-type: application/xhtml+xml'); ?>

Doh. Well, we’ve covered all the bases and it still doesn’t work. It’s valid, it’s sent as XHTML, now everything is left up to the browser and it doesn’t seem to work. If you know why, drop a comment. Again, my suspicion is that the browsers don’t handle XHTML and CDATA completely. However, there is a pretty nice trick that we can use to get this to work and validate (even sent as text/html):

<script type="text/javascript">
  // <![CDATA[
    document.write('<p>Hello, World</p>');
  // ]]>
</script>

Well there you go. A 100% valid page, that runs in all browsers, that properly tells the XML Parser “hey, leave these characters alone” and it works. The problem is identifying when this is necessary. For most people, having the original page, which rendered correctly but didn’t validate would be enough. Browser developers are watching out for you and working around mistakes in HTML and XHTML. However, that isn’t always the case.
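To see why the comment trick helps, here is a rough sketch of what each kind of parser hands to the JavaScript engine. The three helper functions are hypothetical names of mine and only a simplified model: the assumption is that an HTML parser passes the script text through untouched, while an XML parser drops the CDATA markers and keeps what is inside. The text is only compiled with new Function, never run, so no browser is needed.

```javascript
// Simplified model: an HTML parser hands <script> text to the engine untouched.
function asSeenByHtmlParser(scriptBody) {
  return scriptBody;
}

// Simplified model: an XML parser removes the CDATA markers and keeps the text inside.
function asSeenByXmlParser(scriptBody) {
  return scriptBody.replace(/<!\[CDATA\[([\s\S]*?)\]\]>/g, "$1");
}

// True when the text compiles as JavaScript (it is compiled, never executed).
function parsesAsJavaScript(text) {
  try {
    new Function(text);
    return true;
  } catch (e) {
    return false;
  }
}

var bare = "<![CDATA[\n  document.write('<p>Hello, World</p>');\n]]>";
var guarded = "// <![CDATA[\n  document.write('<p>Hello, World</p>');\n// ]]>";

console.log(parsesAsJavaScript(asSeenByHtmlParser(bare)));    // false
console.log(parsesAsJavaScript(asSeenByXmlParser(bare)));     // true
console.log(parsesAsJavaScript(asSeenByHtmlParser(guarded))); // true
console.log(parsesAsJavaScript(asSeenByXmlParser(guarded)));  // true
```

Under this model the bare CDATA version is a syntax error whenever the page is parsed as HTML, and the commented version is fine either way. That still doesn’t explain the failure when the page was served as application/xhtml+xml; one possible cause, which I have not verified here, is that browsers generally refuse document.write() in documents parsed as XML.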

Real World Example

Here I’ll pull a real world example. Some XML specifications allow you to send XML under a different namespace as content inside an existing XML tag. Some do so in a “pseudo” way. Take a look at the Atom Publishing Protocol (commonly referred to as Atompub or APP for short). Here is a snippet from the RFC describing the Atom Syndication Format, specifically the structure for an atom:title tag with type=”html” within an atom:entry:

...
<title type="html">
  Less: &lt;em> &amp;lt; &lt;/em>
</title>
...

If the value of "type" is "html", the content of the Text construct MUST NOT contain child elements and SHOULD be suitable for handling as HTML [HTML]. Any markup within MUST be escaped; for example, "<br>" as "&lt;br>". HTML markup within SHOULD be such that it could validly appear directly within an HTML <DIV> element, after unescaping. Atom Processors that display such content MAY use that markup to aid in its display.

Okay, sorry for the long setup, but we have finally arrived at the point of this post. That type=”html” element cannot have child elements. The XML parser will identify child elements based on a “<” character. Assuming whatever project you are working on takes that input from the user, you would have to pass it through a filter that encodes HTML characters like ampersands, less-than and greater-than signs, and so on. That operation is expensive and may even cause problems in itself. I ran into a situation just the other day where the ampersand of an already encoded character (like the & inside of an &amp;) was causing errors by itself.

The solution is to make the XML parser ignore the data by wrapping it in a CDATA tag. Let’s take the above example and show how it could be done much more easily:

...
<title type="html">
  <![CDATA[ Less: <em> &lt; </em> ]]>
</title>
...

Easier to understand? You betcha. Less costly for developers? Of course. So CDATA is there to help, not hurt. Don’t look at its ugly face and think of it as a hack, look deeper and you will see its purpose and power. Okay, I admit that sounds a little corny, but it could have been worse.
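Here is a minimal sketch of the two approaches side by side. The function names escapeXml and wrapInCData are my own for illustration, not part of any library. The one catch with CDATA is that the content itself must not contain “]]>”, so the sketch splits any such sequence across two CDATA sections.

```javascript
// Escape the characters an XML parser would otherwise treat as markup.
// The ampersand must be replaced first so the entities added afterwards
// are not escaped a second time.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Wrap text in a CDATA section instead of escaping it.
// A literal "]]>" would close the section early, so it is split
// across two adjacent sections.
function wrapInCData(text) {
  return "<![CDATA[" + String(text).split("]]>").join("]]]]><![CDATA[>") + "]]>";
}

var html = "Less: <em> &lt; </em>";

console.log(escapeXml(html));   // Less: &lt;em&gt; &amp;lt; &lt;/em&gt;
console.log(wrapInCData(html)); // <![CDATA[Less: <em> &lt; </em>]]>
```

Either output can be dropped inside the title element; an XML parser hands the same text back to the consumer in both cases.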

Side note, Javascript

As managers everywhere throw out buzz words like AJAX and encourage you to participate in new web 2.0 project ideas you’re going to end up sending and receiving XML requests with a server using the good old XMLHttpRequest object. Well if encoding isn’t enough of a problem (and I’m still wrapping my head around it) you might get struck with a problem like the above case and want to make use of your knowledge with CDATA.

Well, you’re in luck. xmlDocument.createCDATASection(…) is part of the Level 2 DOM ECMAScript Spec. Use it just like a createTextNode():

//
// Create an atom:title element with html content
// assume xmlDocument is already an XML Document object
// and entry is an atom:entry element in that document
//
// <entry>
//   <title type="html"><![CDATA[<em> &lt; </em>]]></title>
// </entry>
//
var ATOM_NS = "http://www.w3.org/2005/Atom";
var node = xmlDocument.createElementNS(ATOM_NS, "title");
node.setAttribute("type", "html");
var cdata = xmlDocument.createCDATASection("<em> &lt; </em>");
node.appendChild(cdata);
entry.appendChild(node);

Now all I have to learn is encoding, and how each browser deals with it differently. That is an entirely new realm that I don’t expect to cover in a single week, but I’ll report back with my findings. Until then, don’t get caught up on the little things!

JavaScript Sort an Array of Objects

I ran into an interesting problem today. I had an array of objects that I wanted sorted on a certain property. My obvious thought didn’t work! (Update: I got a comment below from Peter Michaux who points out a nicer solution, it is included here:)

// Array of Objects
var obj_arr = [ { age: 21, name: "Larry" },
                { age: 34, name: "Curly" },
                { age: 10, name: "Moe" } ];

// This doesn't work!
obj_arr.sort( function(a,b) { return a.name < b.name; });

// This does work! (Peter's update, very fast)
obj_arr.sort(function(a,b) { return a.name < b.name ? -1 :
                                    a.name > b.name ?  1 : 0; });

That kind of frustrated me. Sorting is one of those things I expect to be available in all languages. I don’t want to have to write a sorting algorithm every time I need to sort. So I looked into things, pulled up a Javascript Quicksort Algorithm and manipulated it to support any compare function.

Now I have the freedom to write a compare function that truly works for objects! I also changed certain parts of the code I found online so that it extends the Array class and keeps the helper functions hidden. Take a look at the sample usage:

// Defaults to (a<=b) sorting.  Great for numbers.
var arr = [1234, 2346, 21234, 3456, 32134, 3456, 1234, 2345, 23, 42523, 1234, 345];

// Object Array
var obj_arr = [ { age: 21, name: "Larry" },
                { age: 34, name: "Curly" },
                { age: 10, name: "Moe" } ];

arr.quick_sort();
// => [23, 345, 1234, 1234, 1234, 2345, 2346, 3456, 3456, 21234, 32134, 42523]

obj_arr.quick_sort(function(a,b) { return a.name < b.name });
// => Curly, Larry, Moe

obj_arr.quick_sort(function(a,b) { return a.age < b.age });
// => Moe (10), Larry (21), Curly (34)

For those who want to see the code, be glad: it’s free. I carried the copyright with it, but it’s rather loose. Grab the JavaScript Source Here! Enjoy:

Array.prototype.swap=function(a, b) {
  var tmp=this[a];
  this[a]=this[b];
  this[b]=tmp;
}

Array.prototype.quick_sort = function(compareFunction) {

  function partition(array, compareFunction, begin, end, pivot) {
    var piv = array[pivot];
    array.swap(pivot, end-1);
    var store = begin;
    for (var ix = begin; ix < end-1; ++ix) {
      if ( compareFunction(array[ix], piv) ) {
        array.swap(store, ix);
        ++store;
      }
    }
    array.swap(end-1, store);
    return store;
  }

  function qsort(array, compareFunction, begin, end) {
    if ( end-1 > begin ) {
      var pivot = begin + Math.floor(Math.random() * (end-begin));
      pivot = partition(array, compareFunction, begin, end, pivot);
      qsort(array, compareFunction, begin, pivot);
      qsort(array, compareFunction, pivot+1, end);
    }
  }

  if ( compareFunction == null ) {
    compareFunction = function(a,b) { return a<=b; };
  }
  qsort(this, compareFunction, 0, this.length);

}

Update

Peter Michaux pointed out something very important. The sort() function can be made to work if it returns numeric output (-1,0,1). His approach is far superior. Here was a benchmark I took:

var obj_arr1 = [];
var obj_arr2 = [];
var filler = [ { age: 21, name: "Larry" },
               { age: 34, name: "Curly" },
               { age: 10, name: "Moe" } ];
for (var i=0; i<5000; i++) {
  var rand = Math.floor( Math.random() * 3 );
  obj_arr1.push( filler[rand] );
  obj_arr2.push( filler[rand] );
}

var s = new Date();
obj_arr1.sort(function(a,b) { return a.name < b.name ? -1 : a.name > b.name ? 1 : 0; });
var e = new Date();
console.log(e.getTime()-s.getTime()); // => 75 ms

s = new Date();
obj_arr2.quick_sort(function(a,b) { return a.name < b.name });
e = new Date();
console.log(e.getTime()-s.getTime()); //  => 4444 ms

That shows drastic differences for arrays as large as 5000 elements (with not too random data). 75 ms versus 4444 ms (over 4 seconds). Doing the math: (4444/75) => 59.253 times better! Moral of the story, don’t rush into thinking something doesn’t exist!
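For the record, the reason my first attempt failed: sort() expects the compare function to return a negative number, zero, or a positive number. A boolean turns into 1 or 0, so it can never say “a comes first”. Here is a small sketch of an adapter that turns a less-than test into a proper comparator; toComparator is a hypothetical helper name of mine, not a built-in.

```javascript
// Hypothetical adapter: turns a "less than" test into a sort() comparator
// that returns -1, 0, or 1 as Array.prototype.sort expects.
function toComparator(lessThan) {
  return function (a, b) {
    if (lessThan(a, b)) { return -1; }
    if (lessThan(b, a)) { return 1; }
    return 0;
  };
}

var stooges = [ { age: 21, name: "Larry" },
                { age: 34, name: "Curly" },
                { age: 10, name: "Moe" } ];

var byName = toComparator(function (a, b) { return a.name < b.name; });
var byAge  = toComparator(function (a, b) { return a.age < b.age; });

// slice() copies the array so the original order is left alone.
var names = stooges.slice().sort(byName).map(function (s) { return s.name; });
var ages  = stooges.slice().sort(byAge).map(function (s) { return s.age; });

console.log(names.join(", ")); // Curly, Larry, Moe
console.log(ages.join(", "));  // 10, 21, 34
```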

So if that’s the way to do it, then I want to make it easier on me. My arrays are generally going to be under 100 in size, and at such a size building a function dynamically instead of writing a custom function works just about as well (although if you were using objects, polymorphism and a compare function would be the best way to go). Here is a simple function I can use to quickly build compare functions that sort an array in ascending order on multiple properties!

function buildCompareFunction(arr) {
  if (arr && arr.length > 0) {
    return function(a,b) {
      var asub, bsub, prop;
      for (var i=0; i<arr.length; i++) {
        prop = arr[i];
        asub = a[prop];
        bsub = b[prop];
        if ( asub < bsub )
          return -1;
        if ( asub > bsub )
          return 1;
      }
      return 0;
    }
  } else {
    return function(a,b) { return a < b ? -1 : a > b ? 1 : 0; };
  }
}

Sample usage would be:

var obj_arr = [
  { name: 'Joe',   age: 20 },
  { name: 'Joe',   age: 10 },
  { name: 'Joe',   age: 30 },
  { name: 'Joe',   age: 40 },
  { name: 'Joe',   age: 20 },
  { name: 'Joe',   age: 15 },
  { name: 'Joe',   age: 35 },
  { name: 'Joe',   age: 25 },
  { name: 'Bill',  age: 5 },
  { name: 'Barry', age: 20 },
  { name: 'Paul',  age: 20 },
  { name: 'Peter', age: 1 },
  { name: 'Smith', age: 25 },
  { name: 'Kary',  age: 30 }
];

obj_arr.sort( buildCompareFunction(['name','age']) );
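If some properties need to sort in descending order, the same idea stretches a little further. This is only a sketch of one possible variant: the function name and the convention that a leading “-” on a property name means descending are both my own assumptions, not part of the function above.

```javascript
// Hypothetical variant: a leading "-" on a property name means descending.
function buildCompareFunctionWithDirection(props) {
  return function (a, b) {
    for (var i = 0; i < props.length; i++) {
      var descending = props[i].charAt(0) === "-";
      var prop = descending ? props[i].substring(1) : props[i];
      var result = a[prop] < b[prop] ? -1 : a[prop] > b[prop] ? 1 : 0;
      if (result !== 0) {
        // Flip the sign for descending properties.
        return descending ? -result : result;
      }
    }
    return 0; // equal on every listed property
  };
}

var people = [
  { name: "Joe",  age: 20 },
  { name: "Bill", age: 5 },
  { name: "Joe",  age: 40 },
  { name: "Joe",  age: 10 }
];

// Name ascending, then age descending within the same name.
people.sort(buildCompareFunctionWithDirection(["name", "-age"]));

var summary = people.map(function (p) { return p.name + " " + p.age; }).join(", ");
console.log(summary); // Bill 5, Joe 40, Joe 20, Joe 10
```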

Firebug Feature - Open With Editor

This one was news to me, but it just made my day (and not a minute too late)! I used to have so much trouble copying and pasting code from the Firebug (now Firefox 3 compatible) console. The paste used to have no formatting or indentation, and sometimes there were line numbers… Well, one more problem has been solved. Check this out:

[Image: Firebug’s “Open With Editor” feature]

Checking the changelogs in the repository for Firebug shows that it was included in ReleaseNotes_1.1.txt. I can’t believe I missed it!

Now all that’s left is copying and pasting from the HTML tab’s Style textarea on the right. Still, I think this is a nice small step forward.

Clicky Greasemonkey Menu Script Updated

I noticed that I can’t really live without this script. Clicking those menus is just too difficult. The updates made to Clicky on January 19th changed the way they handle their dropdowns. Clicky wrapped them into objects and added a bunch of functionality to make their content updateable via AJAX requests for their new filters. Some spiffy updates, but code that I could not use from within Greasemonkey!

So have at it. The user script is available right here.

If you look at the script you will see that I handle opening and closing of the menus much like Clicky’s object code does. There is a variable that holds the currently open menu, if there is one, so that opening a different menu closes any other open menu. I only add an onmouseover event and jimmy in my own close code when you click the close menu button. This works around Clicky’s object code without breaking anything that I can see.

Greasemonkey - Clicky Mouseover Menus

I have been a proponent of the Get Clicky web statistics since they first came out. They have been constantly improving them since day one, and I have enjoyed every update. One improvement that I especially liked was the pair of menus in the top right: one that lists each of your websites being monitored by GetClicky so you can quickly jump between them, and one to change the date of the stats you’re currently viewing (if you’re only looking at a single day).

Clicky Menus

These menus currently only activate when you click them. Well, the arrow is there and it so nicely indicates to me that it should be a dropdown. So I wrote up a script to copy the onclick event to the onmouseover. This may be outdated pretty soon, but it is one of my first scripts and it gives me a little joy each time I use it.

You’re going to need Firefox with the Greasemonkey extension and of course Site Statistics (available free) at Get Clicky.

So without further ado:
Greasemonkey Script to turn Clicky Menus into Mouseovers

Some recent fun with JavaScript at work and a co-worker’s really neat Greasemonkey script for Revealing Experts Exchange comments propelled me to write this very simple script. Mine pales in comparison to CoderJoe’s EE script, and his knowledge of JavaScript and the DOM is quite vast. I consider this a first step into some individual JavaScript coding.

Renderize Launched

As a personal project I started designing a website this April. Until then I had been using open source designs as the basis for all of my websites. Don’t get me wrong, I am fluent with XHTML/CSS/JS; it’s just that I lack the image editing skills and creative juices to design websites that are up to my caliber and liking.

That is up until this week, when I released ->Renderize<-. Some may remember that I designed the layout and style one weekend a few months ago and that I posted about it. Well, I spent a number of weekends developing a custom backend so that I can manipulate nearly every portion of the site... your basic content management system. So Renderize is fully capable of being a "blog" and I have decided to commit myself to weekly postings on it.

Oh, and I don't bother with supporting IE on a website that I am going to use almost exclusively. The site should work in IE7, but I know there will be problems in IE6. I should point out that Safari and Firefox render the site perfectly, and that makes me more than happy.

However, the posts at Renderize are not going to be web development content. Any suggestions of that will soon be removed from Renderize's content. I have decided to keep my programming and developer posts on this blog: I have already built up a number of links and content here, and it would be a shame to duplicate or move the blog elsewhere. Not to mention the completeness of WordPress is pretty nice. Renderize will host my more personal blogging articles, for instance my weekly activities and personal events, which are less likely to interest the techies that hopefully subscribe to or visit this site.

Renderize does give me a chance to practice implementing many website features, which in turn gives me an opportunity to blog about them. For instance, I developed the RSS generator for Renderize, and I may very well discuss that in one of my programming posts here. I have a sandbox to play and experiment with, and some pride in having actually sat up straight and made something happen.

Some final comments:

  • Developing on a Mac (in TextMate) is awesome
  • Building a framework for your own personal website is super cool
  • Turning a Concept into a Reality is way more impressive than just suggesting the Concept

So thanks for reading, check out the blog and leave a comment here if you like it!

Awesome Site Stats - Get Clicky

If you have a website it is only natural that you would like to know the people who visit your site. Where are