Optimizing the js1k site

2010-11-29

If you mention framesets to a random web developer he will probably laugh in your face. Well, he should anyways. Framesets are so Geocities era, and even they used iframes. However, for js1k I backed myself into a corner I could only get out of using framesets. Whilst an acceptable solution, there is no doubt in my mind that tripling the requests for each demo on an already badly performing backend was likely not in the best interest of the site. The (true) slashdot effect knocked me down for about a week, although that was probably due to other things.

This is a long article. TL;DR? I've added a note on how to get the frames of a frameset inline in the frameset document, reducing the number of requests for such a page from three to one. This article will go into detail on what steps I've taken to optimize the js1k.com website.

So when I redesigned (technically) the js1k site I wanted to take its history into account and go for optimization and caching. There were a few easy wins to make here.

First of all, every request was talking to my database to gather stats. Whilst this gave me great insight into traffic to the site, it put a large burden on the database. Being shared hosting, it's already not something to be considered very stable anyways, and tripling the requests for each demo was probably the last straw on more than one occasion. First win: hand the stats over to an analytics service.

Secondly, and I think this brought the server to its knees in the final showdown, the demos page had about 400 individual images (one thumb for each demo) totaling about 4 MB. That's per page view. So looking back, it was actually quite surprising that the server lasted as long as it did. And although you might be tempted to say 400 images is an obviously bad design decision, I'd like to remind you that the whole thing was a quick hack, not designed to support the overwhelming popularity it eventually received. After (or during) the real slashdot effect I checked out that pain point. Of course, by then it was far too late. Thousands of people were causing millions of requests. I basically just DDoSed myself. Ouch.

The straightforward fix was to use spriting. But even the sprite image took up about 2 MB. So I changed the format to GIF, reduced the number of colors and tweaked a few other things. Now it's a GIF of about 350 KB. Still a lot, but since it represents the thumbs for about 400 or 500 demos, I think it's acceptable. Especially since it takes just one request now ;)
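For illustration, this is roughly how a sprite like that gets used: every thumb becomes a background offset into the one big image. It's a minimal sketch only. The thumb size, the number of columns and the function name are made up for the example and are not taken from the actual js1k stylesheet.

```javascript
// Assumed layout for this sketch: thumbs are placed left to right,
// top to bottom, in a grid that is COLS thumbs wide.
var THUMB_W = 100; // hypothetical thumb width in pixels
var THUMB_H = 75;  // hypothetical thumb height in pixels
var COLS = 20;     // hypothetical number of thumbs per sprite row

// Returns the CSS background-position value that shows thumb number `index`.
function spritePosition(index) {
  var col = index % COLS;
  var row = Math.floor(index / COLS);
  // Offsets are negative: the image is shifted so the wanted tile lands at 0,0.
  return (-col * THUMB_W) + 'px ' + (-row * THUMB_H) + 'px';
}
```

In a page you would assign the result to something like element.style.backgroundPosition, with the sprite set as the background image of every thumb element.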

Thirdly, the page needs no database at all. Even though all the demos and submitted (meta) content are saved in a database, none of it changes unless I tell it to change. No public request to the site will make changes. So a static, precached website was the obvious next choice. Before the contest I hadn't put a lot of effort into the site (just a simple submit). During the contest I simply didn't have the time to make the necessary changes. After the contest there was no point (until the delayed slashdot effect hit, I guess, and by then it was far too late).

But getting ready for the xmas special I wanted to take the above knowledge and change the setup for the site. Everything should be precached and the stats should be handled by Google Analytics. I was very hesitant to use GA, because it means giving up control and depending on another service (even if it's Google) without any say over what's going on. In the end I decided it was probably the best choice, as I couldn't handle the database load required otherwise. It still bothers me, but whatever. Now two days in, I still don't have any stats and am frustrated with the limited support you can get on GA. So there.

Precaching was quite easy to set up in the end. I won't go into too many details, but it's basically just creating about 2500 plaintext html files.
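The idea behind precaching can be sketched in a few lines: render every page once, up front, and store the result as a file. The snippet below is an assumption-laden illustration, not the actual generator. The record fields (id, title, author), the file naming and the page markup are all hypothetical, and it is written in JavaScript only to match the other code in this post.

```javascript
// Escape text so it can be placed safely inside html.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Render one demo record (hypothetical shape) to a complete static page.
function renderDemoPage(demo) {
  return '<!doctype html><html><head><title>' + escapeHtml(demo.title) +
    '</title></head><body><h1>' + escapeHtml(demo.title) + '</h1>' +
    '<p>by ' + escapeHtml(demo.author) + '</p></body></html>';
}

// Returns a map of file name -> file contents. Writing the files to disk
// is left out; that part depends on the hosting setup.
function buildStaticSite(demos) {
  var files = {};
  for (var i = 0; i < demos.length; i++) {
    files['demo/' + demos[i].id + '.html'] = renderDemoPage(demos[i]);
  }
  return files;
}
```

Once the files exist, the web server can hand them out without touching the database at all; the generator only has to run again when the content changes.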

Then there's the frameset. I can't help the frameset. I had to use it because I needed to introduce some kind of navigation bar into the demo pages without altering the environment for the demos. On a page where hacks are relying on various edge case conditions of the web page, you really do need a sandboxed page. At first I wanted to use an iframe, but I couldn't get it to reliably fill the remainder of the screen (below the nav header). It was simply a no-go.

So the only (viable) alternative I could think of was to use one of those ancient framesets, putting the nav in the top frame and the rest in the bottom. In retrospect, framesets are actually one of the very few stable ways of reaching my goal. An obvious drawback is that a frameset in fact takes three page requests. I'd rather not. So for the new version I wanted to at least try to reduce this number.

My first attempt was using data URIs. Using data:text/html,<html...> for the src attribute of a frame would effectively allow you to reduce the page to a single resource. Nice. So what's the support on this? Not bad actually. All current browsers support it, except for IE, which was almost too predictable. It turns out that IE8 explicitly forbids using data URIs as the source of a frame. And as I suspected, they didn't change this policy for IE9.
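To make the data URI idea concrete, here is a minimal sketch of turning an html document into a value for the src attribute. The function name is made up for the example, and it uses percent-encoding rather than base64 to keep the sketch free of encoding helpers.

```javascript
// Build a data: URI for an html document, usable as <frame src="...">.
function toDataUri(html) {
  // encodeURIComponent percent-encodes everything that is unsafe in a url,
  // including double quotes, so the result can sit inside a
  // double-quoted attribute without further escaping.
  return 'data:text/html;charset=utf-8,' + encodeURIComponent(html);
}
```

A frame would then look like <frame src="data:text/html;charset=utf-8,..." />, in browsers that accept data URIs for frames.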

Now js1k does not target IE. The first edition obviously didn't because IE8 lacks canvas. But it still doesn't, because IE9 is still a beta (and js1k doesn't require beta versions of the other browsers either). Maybe the next version will require IE9 as well though, depending on how I feel about its final support.

But regardless of that line of thought, I know people are going to try demos in IE9 just for the sake of "will it work". I did not want to force them to jump through hoops just because of this. So I wanted a different approach. I wasn't sure there was an alternative though. You can't literally inline html in the src attribute, and using document.write for the content would cause dire problems with getting the existing demos to work properly.

Then I received a tweet from Christopher. He told me about a method using src="javascript:'<html....>'" that ought to work. I had never heard of it but tried it anyways. By George, it worked! And cross browser too! And even under IE! I was impressed, and a bit baffled. What the hell kind of standard did this originate from? Well, I guess I should not be complaining, let's put this to good use.

Now the data uri would have allowed me to encode everything in base64 and be done with everything else. Easypeasy. For this new approach I couldn't do that, because it obviously doesn't support base64. In the end, there are about 5 steps I had to take to get things right.

The first one is obvious. A frame element looks like <frame ... src="javascript:'....'" ... />. So the single and double quotes were both taken. No way to hide one in the other. Even though the single quoted string was javascripty, double quotes inside it were still seen by the html parser as part of the html instead of part of a js string. Not surprisingly, escaping them with a backslash didn't help. It turns out you need to use html entities to escape them (using &quot;).

I figured, okay cool. So I can just convert all silly chars to html entities and be done with it? Wrong. Only the double quote appears to be required to appear as an html entity. Everything else needs a different approach. Take the single quote for instance. Since the double quotes actually wrapped the js part, I hoped its contents would in fact be under js rules. They were. Escaping the quote with a single backslash worked wonders. The obvious next candidates were newlines, as they are pretty much unacceptable in js strings. They were quickly disposed of with \n's. The html was accepted now, albeit very inconsistently. Some pages would be okay while a lot of others were not.

Usually the only hint I got from browsers was that a semi-colon was missing from a statement. Big help that is. Debugging is virtually impossible because the inspectors are incapable of displaying the source of an inlined html page. For them it's clearly an edge case they don't know or care about. They probably have different mechanisms of fetching the source and this isn't one of them. For debugging problems that turned out to be a bitch though.

Luckily the dom inspector did properly show the contents of a script tag after decoding the many layers of encoding. So that was in fact a help. However, my test cases all consisted of extremely minified 1k demos. Good luck finding your way through that. *sigh*

So after debugging a few demos with trial and error, removing parts of them, I ended up with the % being the cause of problems. Luckily this is not my first encounter with url encoding, so I quickly resolved that by replacing all %'s with %25's. As you might expect, that's the url escaped version of %. This seemed to fix pretty much all the demos for me on firefox. Hurray! But I certainly wasn't finished yet.
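Here is a small illustration of why a bare % breaks things, assuming the browser percent-decodes the javascript: url before running it. decodeURIComponent only stands in for that decoding step here (a browser may be more lenient about malformed sequences), and the demo fragment is made up.

```javascript
// A made-up fragment of demo code that uses the modulo operator.
var demoCode = 'if(i%25==0)draw()';

// What is left after percent-decoding: "%25" is read as an escaped "%",
// so the 25 disappears and the statement is no longer valid js.
var seenByBrowser = decodeURIComponent(demoCode); // 'if(i%==0)draw()'

// Escaping every % as %25 first makes the decoding step restore the original.
var escapedCode = demoCode.replace(/%/g, '%25');  // 'if(i%2525==0)draw()'
var restoredCode = decodeURIComponent(escapedCode); // 'if(i%25==0)draw()'
```

That would also explain the unhelpful "missing semi-colon" style errors: the browser is complaining about the decoded code, not about what was written in the src attribute.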

Even though firefox and opera stopped complaining, webkit still refused quite a few demos and completely b0rked on setting the width/height of the canvas element properly. At least compared to firefox and opera. So I had to fix that first. Luckily I started with a broken demo which exposed the issue quite fast, even though it confused me a little more than it probably should have. It started with a construct like with(document)with(getContext('2d'))width=innerWidth,height=innerHeight,.... It turned out innerWidth and innerHeight were zero on webkit where they'd be properly computed on other browsers. Wtf?

So at first I was confused. I thought those were properties of the document and/or the context objects. Later I realized that they were implied globals through window. Kind of silly, but I was thinking in context of with. So then the reduction game began: getting a minimal test case to pinpoint the problem. The minimal testcase was reached soon enough. src="javascript:'<script>console.log(innerWidth,innerHeight);'". This still printed 0,0 for webkit and the correct value for the others. Meh? Was I screwed here?

I didn't give up just yet though. This in fact wasn't the most minimal test case. The frameset itself also contained a script to load GA. Removing that suddenly fixed the problem for webkit! Seriously, wtf. This put me in a tight spot. I tried moving the script inline in one of the frames, but the problems persisted. Within the nav frame the same problem occurred, and within the demo frame the html would no longer have the same structure and certain demos would trip over that. The dynamic (async!) script was causing webkit to screw up and I simply didn't have a remedy for it!

Since the problem seemed to be that the document wasn't quite "ready" yet, I thought it might be a good idea to postpone running the demo. About 50ms seemed to be a workable solution. Many demos suddenly worked properly again! I do say many because every now and then there was still one that would screw up. This is actually where the hackish fragile nature of the demos showed through. After a bit of fiddling I noticed one failing demo had a document.write. Oh fuck, those can't be run after the document has been closed. Putting a timer on the demo effectively did exactly that, which is notably different from the initial setup (because the original shim would execute the demo before closing the document). So even though the timer was reasonably reliable (the time was still an arbitrary works-for-me choice) I couldn't use it across the board, because it's virtually impossible to detect whether a script uses document.write. And even if I could detect it, that wouldn't help, because those same demos would still have the original problem. Back to the drawing board for me.

At some point I discovered that merely querying document.body.clientWidth would in fact set window.innerWidth (and height) properly. However, it suffered from another drawback I've forgotten. I didn't end up using that hack.

I did end up reversing the process; setting a timer to load GA. That actually led me to figure out a much cleaner way of loading GA. The default snippet google gives you requires you to expose a global array which would be read by the script and processed accordingly. Of course the globals were a pain (even if it was a js1k compo) but more importantly, I would not be able to use this in future contests because scripts might alter the exposed global and interfere with GA. I know, it would probably not have been such a big deal, but I didn't want to even go there in the first place.
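To show what that global array is for: in Google's default async snippet it is called _gaq, and pages push commands onto it before ga.js has arrived. The replay function below is a simplified guess at the idea, not ga.js's actual code, and the tracker object it talks to is whatever you pass in.

```javascript
// Stand-in for the exposed global array: each entry is a method name
// followed by the arguments for that method.
var commandQueue = [];
commandQueue.push(['_setAccount', 'UA-xxxxxxxx-x']);
commandQueue.push(['_trackPageview']);

// Simplified sketch of what the loaded script does with the queue:
// call every queued method on the tracker, in order.
function replayQueue(queue, tracker) {
  for (var i = 0; i < queue.length; i++) {
    var command = queue[i];
    var method = command[0];
    var args = command.slice(1);
    tracker[method].apply(tracker, args);
  }
}
```

Since the queue is an ordinary global array, any other script on the page can push to it, empty it or replace it, which is exactly the interference risk described above.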

Luckily Kyle pointed me to a few github examples using the LABjs queue loader. And although I wasn't planning on including a script loader just to include friggin' GA (talk about overkill), it did show me the obvious; the globals were caching the functions to execute onload, with their arguments. So using that I quickly formed the following GA loading snippet...

Code: (js)
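(function(){
  // Create the script element for ga.js by hand, so no global queue is needed.
  var ga = document.createElement('script');
  ga.async = true;
  ga.defer = true;
  ga.src = 'http://www.google-analytics.com/ga.js';
  // ga.js defines _gat; once the script has loaded, track the page view directly.
  ga.onload = function(){
    try {
      _gat._getTracker('UA-xxxxxxxx-x')._trackPageview();
    } catch(e) {
      window.console && console.log("ga fail :'( ");
    }
  };
  // Insert the new element before the first script element in the document.
  var s = document.getElementsByTagName('script')[0];
  s.parentNode.insertBefore(ga, s);
})();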

No globals, no nothing. A clean closure with a callback. Works for me, at least I think. Since I still can't access my stats it's obvious something is wrong. But when I debug I clearly see the ga.js script and the token being requested from GA. So I don't know why it wouldn't register a hit, nor know why GA won't show me stats. Grrrr. On with the story.

So wrapping that snippet with a timer didn't screw up the dom for webkit before the demo could load. And it didn't mess with anything else. GA still loads and I should still get my stats. With that, I thought my last problem was fixed. Flipping through a couple of random demos didn't show me any broken demos that weren't already broken before. I was relieved. I finally got this hack to work. Time was against me as I had (I guess kind of stupidly, time-wise) already said I'd be doing another js1k compo.

So I put the whole thing live, confident things were okay. Well, they weren't. For some reason, certain demos would suddenly cease to function, again. Investigation led me to believe it had something to do with encoding. Back to that now? I thought I had covered my bases. After spending a few hours with php and utf8 (always fun...) I gave up on the matter. At least I fixed the utf8 characters on all the other pages.

Later, Sjoerd thought it might have something to do with not having encoded (enough) special chars. Fully uri-encoding the src contents hadn't worked when I tried it, but I also hadn't bothered to encode any individual characters besides %. So I encoded ? and # as well. And everything worked! I was so happy right then. I wasn't sure whether the hack was actually going to work, but it had.
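Putting the escaping steps from this article together gives something like the sketch below. It is a reconstruction under assumptions, not the actual js1k code: the function names, the frame names and the rows value are made up, the order of the steps is a guess that follows the layers from the inside out (js string first, then url, then html attribute), and the backslash step is not mentioned above but is assumed to be needed for the same reason as the single quote.

```javascript
// Turn an html document into a value for <frame src="...">.
function toInlineFrameSrc(html) {
  // Layer 1: make the html a valid single-quoted js string.
  var jsString = html
    .replace(/\\/g, '\\\\')       // assumption: keep backslashes in demos intact
    .replace(/'/g, "\\'")         // single quotes delimit the js string
    .replace(/\r\n?|\n/g, '\\n'); // raw newlines are not allowed in js strings

  // Layer 2: protect characters that have a meaning in a url.
  var url = ("javascript:'" + jsString + "'")
    .replace(/%/g, '%25')         // must go first, the next steps add % themselves
    .replace(/\?/g, '%3F')
    .replace(/#/g, '%23');

  // Layer 3: the value sits in a double-quoted html attribute.
  return url.replace(/"/g, '&quot;');
}

// Hypothetical frameset page with both frames inlined: one request instead of three.
function buildFrameset(navHtml, demoHtml) {
  return '<frameset rows="50,*">' +
    '<frame name="nav" src="' + toInlineFrameSrc(navHtml) + '" />' +
    '<frame name="demo" src="' + toInlineFrameSrc(demoHtml) + '" />' +
    '</frameset>';
}
```

The order matters: each layer is undone by the browser in reverse (html entities first, then url decoding, then the js string is evaluated), so escaping % after adding %3F and %23 would mangle those again.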

So all the demos from the first compo seem to be working now, using the inline frameset hack. The only thing that's failing now is setting focus to the bottom frame. I believe that's because each frame gets its own origin, and the two don't match. Will check that out some time, but it's not a high priority for me.

Everything seems to work on the non-IE browsers, and IE9 seems to load stuff but not actually run everything. Haven't taken the time to investigate yet.

So that's how I set up the new JS1K site :)