
New and improved method for extracting JavaScript from HTML

Jun 17, 2011 | Software Development, Webapps

A few weeks back, I wrote about using HTML5 custom data attributes as an enabling mechanism for extracting JavaScript from HTML pages.

Turns out that approach has one significant drawback: HTML attributes are a poor fit for storing arbitrary data. Specifically, we need to store user-entered data and stringified JSON objects, both of which may contain apostrophes and quotes. Unless those characters are escaped, the attribute value ends at the first apostrophe or quote that matches the attribute’s delimiter, and the truncated value can no longer be parsed as JSON:

<body data-titles='why isn't storing arbitrary data always the right thing to do?'>

The culprit is the apostrophe in “isn’t”: it closes the attribute value, so the browser reads the value as “why isn” and everything after the apostrophe is lost.
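To make the truncation concrete, here is a minimal sketch of how a quoted attribute value gets read. It is a deliberately simplified model, not a real HTML parser, and readQuotedAttribute is a hypothetical helper name invented for this illustration.

```javascript
// Simplified model of reading a quoted attribute value: the value runs from
// the opening quote to the next occurrence of the SAME quote character.
// readQuotedAttribute is a hypothetical helper, not part of any library.
function readQuotedAttribute(markup, name) {
  var start = markup.indexOf(name + '=');
  if (start === -1) {
    return null; // attribute not present
  }
  var quoteIndex = start + name.length + 1;
  var quote = markup.charAt(quoteIndex);
  if (quote !== '"' && quote !== "'") {
    return null; // unquoted values are out of scope for this sketch
  }
  var end = markup.indexOf(quote, quoteIndex + 1);
  if (end === -1) {
    return null; // no closing quote found
  }
  return markup.substring(quoteIndex + 1, end);
}

var markup = "<body data-titles='why isn't storing arbitrary data always the right thing to do?'>";
var value = readQuotedAttribute(markup, 'data-titles');
// value === "why isn"
```

Switching the delimiter to double quotes would rescue this particular string, but it only moves the problem: user-entered text or JSON containing a double quote would then be cut short instead.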

Yesterday I found out that HTML5 introduces another new data format, one that is more general and much better suited to our needs: HTML5 microdata. The microdata specification is still in flux; I am writing this post on June 17, 2011, and the latest working draft is dated May 25, only a few weeks ago! Anyway, microdata offers a much richer data model and also supports standard microdata types; schema.org defines microdata formats for books, movies, people, places, events, products, and many others. (It also has a good writeup on microdata in general.)

But I’m getting a little ahead of myself.  I don’t need to define any microdata types (not yet, anyways).  I just need a new way to store some data on my HTML page, so my JavaScript can read it.

Using microdata, my example above looks more like this:

<div id="data" itemscope style="display:none">
<span itemprop="titles">why isn't storing arbitrary data always the right thing to do?</span>
</div>

The itemscope attribute on the div identifies the div as an item; items have properties (including nested items, but I haven’t used those yet).  The itemprop attribute in the span is the property key, and the text content of the span is the property value.

The awesome thing from my perspective is this: I can put almost anything in a span. I’m no longer restricted to the attribute data model; I can put JSON or user-entered data or whatever in there, as long as any < and & characters are escaped the way they would be in any other HTML text.
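As an illustration of writing such values into the page, here is a rough sketch of generating the hidden item markup from a plain object. The helper names escapeHtmlText and buildMicrodataItem are hypothetical, as is the sample settings property; the sketch assumes the id and property names are trusted and only escapes the property values.

```javascript
// Escape the characters that are significant in HTML text content.
function escapeHtmlText(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Build the hidden item markup for a set of properties. Non-string values are
// stored as stringified JSON. The id and property names are assumed to be
// trusted and are not escaped. itemscope is a boolean attribute, so it is
// written without a value here.
function buildMicrodataItem(id, properties) {
  var lines = ['<div id="' + id + '" itemscope style="display:none">'];
  Object.keys(properties).forEach(function (key) {
    var value = properties[key];
    var text = typeof value === 'string' ? value : JSON.stringify(value);
    lines.push('<span itemprop="' + key + '">' + escapeHtmlText(text) + '</span>');
  });
  lines.push('</div>');
  return lines.join('\n');
}

var html = buildMicrodataItem('data', {
  titles: "why isn't storing arbitrary data always the right thing to do?",
  settings: { label: 'say "hi" & <wave>' }
});
```

Apostrophes and quotes pass through untouched because they have no special meaning in element text; only the markup-significant characters are replaced, and the browser turns them back into the original characters when the text is read.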

So I have these values in HTML; how do I get them from my JavaScript? The microdata specification includes a DOM API, but no browser currently implements it! That is how new and in-flux microdata is!

In my short career as a web developer, I have already found that jQuery is always the answer. The jquery.microdata.js plugin extends jQuery with functions very close to the microdata spec. (The only difference I see is that the plugin provides a properties function where the spec defines a properties array.) Now I can read microdata in my JavaScript files like so:

// Requires jQuery and jquery.microdata.js to be loaded on the page
var myMicrodataItems = $(this).items();
var myEncodedTitle = myMicrodataItems.properties("titles").itemValue();

The items function gets me a collection of all the items on the page, the properties function gives me access to a specific property, and the itemValue function gets me that property’s value.
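For comparison, the same values can be read with plain DOM calls. The sketch below is an alternative illustration, not how the plugin works internally; readItemProperties and parseJsonProperty are hypothetical helper names, and the code deliberately ignores nested items and the spec's special value rules for elements such as links, images, and meta tags.

```javascript
// Collect itemprop values under a root element using only standard DOM calls
// (querySelectorAll, getAttribute, textContent). Hypothetical helper: it
// ignores nested items and treats every value as the element's text.
function readItemProperties(root) {
  var values = {};
  var nodes = root.querySelectorAll('[itemprop]');
  for (var i = 0; i < nodes.length; i++) {
    values[nodes[i].getAttribute('itemprop')] = nodes[i].textContent;
  }
  return values;
}

// Parse a property that may hold stringified JSON; fall back to the raw text
// when the value is not valid JSON.
function parseJsonProperty(text) {
  try {
    return JSON.parse(text);
  } catch (e) {
    return text;
  }
}
```

In a browser, calling readItemProperties(document.getElementById("data")) on the example above would return an object whose titles property holds the full sentence, apostrophe included.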

With all this in place, I can encode arbitrary data in my HTML basically without restriction and easily read it from JavaScript, with no JavaScript whatsoever on the HTML page. This all uses standard HTML5. I’m also ready to use microdata more extensively in our application; we may get a lot of mileage from the schema.org predefined types.

The only fly in the ointment (and it’s a pretty small fly) is that the user agent does not ignore microdata: the contents of the span are rendered like any other text. So I have to add the “display:none” style to my divs; otherwise the encoded data is visible to the user.
