The whole shebang

My colleagues have been amicably bickering, online and offline, about the use of hashbangs.

Ben Cherry explains that hashbangs are necessary, though not pretty, temporary workarounds for the lack of pushState support, which is part of the HTML5 spec.

Dan Webb explains that this temporary workaround goes against the fundamental rule that "cool URIs don't change", and that they are bad for the web in general.

When we discuss this sort of thing, we're talking about the web in general, not about any Twitter pages, URLs, sites or strategy. Our personal sites are for personal opinions.

Dan and Ben have strong points, but I think the Internet is large enough to accommodate a variety of architectures. While one way of building may simply be better than another, for excellent reasons, one is often forced to choose the lesser route due to external constraints.

New Twitter is a good example: a clientside application built using JavaScript. Besides the obvious UI upgrade, there was also a desire to reduce the load on our servers, and leverage our own API. This enabled us to concentrate resources on scaling and bugfixing the API in the backend (good for everyone), while letting frontend developers iterate quickly against an agreed specification. It was a win-win architecture.

The result is an application which needs to allow navigation without server calls. Thus the hashbang in the URL. It's a necessary consequence of the requirements of the project.

Dan does raise some excellent points about the problems hashbangs cause. Even so, returning to our hypothetical example, we might still have reasons not to rearchitect back to the classical page-request model. So we should address the problems Dan raises directly.

1 Crawlers should execute JavaScript

There's essentially no reason why a crawler shouldn't be executing some form of JavaScript. It's mature enough. We're using JavaScript to write out our content, and so the crawler should be able to execute our JavaScript before it evaluates our DOM for content. It merely needs to know when the page is ready for this to happen.

2 Ajax pages need a 'Page Loaded' event

Crawlers and search indexers can and do sometimes implement JavaScript runtimes. However, even in this case there's no well-recognised way to say 'this is a redirect' or 'this content is not found' in a way that non-humans will understand.

We just need such a specification. Let us have a way to signal HTTP response code equivalents like 404, 500 and "page complete" through manipulation of the DOM with JavaScript.
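As a minimal sketch of what that signal could look like — note that the 'page-status' meta name and the 200-means-complete convention are hypothetical here, not an existing standard:

```javascript
// Hypothetical convention: once the app has finished rendering, it writes
// an HTTP-equivalent status into the DOM for crawlers to read back.
function statusMetaMarkup(status) {
  // status: 404, 500, or 200 for "page complete"
  return '<meta name="page-status" content="' + status + '">';
}

// In a browser, the app would insert this after rendering, e.g.:
// document.head.insertAdjacentHTML('beforeend', statusMetaMarkup(404));
```

A crawler that knew this convention could run the page's JavaScript, wait for the meta element to appear, and treat its content exactly as it would an HTTP status code.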

3 Hashbangs are a fallback

The original hashbang proposal from Google aliases hashbanged URLs to _escaped_fragment_ querystring parameters, allowing it to read the content of hashbanged URLs as a server-side rendered page. I don't believe this is a good system.

For a start, our linked URLs should be 'cool'. It is only JavaScript that should add the hashbang, in browsers that need it. For example:

<a href="/kpk">Kenneth</a>
<script>
  // Only hashbang links in browsers that lack pushState support
  if (!window.history || !history.pushState) {
    $('a').click(function() {
      location.hash = '!' + this.pathname;
      return false;
    });
  }
</script>

When Google reads this link, it should request '/kpk'. As our page is a clientside app, the server will respond with a redirect to '/', then a pushState back to '/kpk', before rendering the content to the DOM. If we can then trigger a 'page loaded' event, Google can start reading our content.

4 Hashbangs should be invisible

A hashbang is a temporary workaround for the lack of pushState. While our links should only be hashbanged for the non-pushState browsers, Google may still find a hashbang in a URL. As such, Google should regard it as invisible. The URL it saves should be the URL without a hashbang.

For example:
Crawler finds link to /#!/kpk
Crawler saves link as /kpk
Crawler finds link to /#!/kpk/mentions?page=10
Crawler saves link as /kpk/mentions?page=10
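The rewriting rule above can be sketched as a small function (the URLs are illustrative):

```javascript
// "Invisible hashbang" rule: the URL a crawler saves is the URL it found,
// with the '#!' (and the slash immediately before it) removed.
function canonicalUrl(url) {
  var i = url.indexOf('#!');
  if (i === -1) return url; // no hashbang: already canonical
  return url.slice(0, i).replace(/\/$/, '') + url.slice(i + 2);
}
```

For example, canonicalUrl('http://example.com/#!/kpk') gives 'http://example.com/kpk'.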

This lets us switch to Cool URLs with pushState when and where it becomes available. It lets us use canonical URLs. And should we choose to revert to a client/server model later on, we are able to do so.

5 Hashbangs should be eternally supported

Once you hashbang, you can't go back.

Once a URL is made visible, you should be committed to maintaining that link forever. Fortunately, hashbanged URLs are easy to replace, if they've been designed invisibly. If you don't use hashbangs anymore, just render them invisible with this script on your homepage:

<script>if (location.hash.charAt(1)=='!') location.replace(location.hash.substr(2));</script>
(you will need something more complicated to avoid XSS, but this is the gist)
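One slightly more careful version, as a sketch: it only follows fragments that look like plain same-site paths, so a crafted '#!javascript:…' or protocol-relative '#!//other-site' fragment is ignored.

```javascript
// Stricter version of the homepage redirect: only follow the fragment if
// it is an absolute path on this site — no scheme, no '//other-host'.
function safeHashbangPath(hash) {
  if (hash.charAt(1) !== '!') return null;
  var path = hash.substring(2);
  if (/^\/(?!\/)[\w\-\/.?&=%]*$/.test(path)) return path;
  return null; // anything suspicious: don't redirect
}

// In the browser (guarded so the sketch is also valid outside one):
if (typeof location !== 'undefined') {
  var target = safeHashbangPath(location.hash);
  if (target) location.replace(target);
}
```

So '#!/kpk' redirects to '/kpk', while '#!//evil.example' and '#!javascript:alert(1)' are rejected.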

So, in summary, I suggest we (or rather, Google, as the authority over content reading) should accept the clientside application on its own terms, rather than as an alias to a server-side application. For while the client-server model has certain inherent advantages, there will still be occasions when a clientside application is more appropriate.

We can support that by changing the rules on interpreting hashbangs, and by introducing a DOM element controlled by JavaScript that describes the page status.

Thanks for reading! I guess you could now share this post on Twitter or something. That'd be cool.
Or if you had any comments, you could find me on Twitter.