(down)Loading JavaScript as strings
The Gmail mobile team and Charles Jolley from SproutCore have recently published some interesting techniques for loading JavaScript in a deferred manner. Anyone building performant web apps is familiar with the pain inflicted when loading JavaScript. These new techniques are great patterns. Let me expand on how they work and the context for using them. FYI – Charles is presenting this technique at tomorrow’s Velocity Online Conference. Check that out if you’re interested in finding out more and asking him questions.
When to defer JavaScript loading
I’ve spent much of the last two years researching and evangelizing techniques for loading scripts without blocking. These techniques address the situation where you need to load external scripts to render the initial page. But not all JavaScript is necessary for loading the initial page. Most Web 2.0 apps include JavaScript that’s only used later in the session, depending on what the user clicks on (dropdown menus, popup DIVs, Ajax actions, etc.). In fact, the Alexa top ten only use 25% of the downloaded JavaScript to load the initial page (see Splitting the Initial Payload).
The performance optimization resulting from this observation is clear – defer the loading of JavaScript that’s not part of initial page rendering. But how?
Deferred loading is certainly achievable using the non-blocking techniques I’ve researched – but my techniques might not be the best choice for this yet-to-be-used JavaScript code. Here’s why: Suppose you have 300K of JavaScript that can be deferred (it’s not used to render the initial page). When you load this script later using my techniques, the UI locks up while the browser parses and executes that 300K of code. We’ve all experienced this in certain web apps. After the web app initially loads, clicking on a link doesn’t do anything. In extreme situations, the browser’s tab icon stops animating. Not a great user experience.
If you’re certain that code is going to be used, then so be it – parse and execute the code when it’s downloaded using my techniques. But in many situations, the user many never exercise all of this deferred code. She might not click on any of the optional features, or she might only use a subset of them.
Is there a way to download this code in a deferred way, without locking up the browser UI?
Deferred loading without locking up the UI
I recently blogged about a great optimization used in mobile Gmail for loading JavaScript in a deferred manner: Mobile Gmail and async script loading. That team was acutely aware of how loading JavaScript in the background locked up mobile browsers. The technique they came up with was to wrap the JavaScript in comments. This allows the code to be downloaded, but avoids the CPU lockup for parsing and execution. Later, when the user clicks on a feature that needs code, a cool dynamic technique is used to extract the code from the comments and eval it.
This technique has many benefits. It gets the download delays out of the way, so the code is already in the client if and when the user needs it. This technique avoids the CPU load for parsing and executing the code – this can be significant given the size of JavaScript payloads in today’s web apps. One downside of this technique results from cross-site scripting restrictions – the commented out code must be in the main page or in an iframe.
This is where Charles Jolley (from the SproutCore team) started his investigation. He wanted a technique that was more flexible and worked across domains. He presents his new technique (along with results from experiments) in two blog posts: Faster Loading Through Eval() and Cut Your JavaScript Load Time 90% with Deferred Evaluation. This new technique is to capture the deferred JavaScript as strings which can be downloaded with negligible parsing time. Later, when the user triggers a feature, the relevant code strings are eval’ed.
His experiment includes three scenarios for loading jQuery:
- Baseline – load jQuery like normal via script tag. jQuery is parsed and executed immediately on load.
- Closure – load jQuery in a closure but don’t actually execute the closure until after the onload event fires. This essentially means the jQuery code will be parsed but not executed until later.
- String – load jQuery as a giant string. After the onload event fires, eval() the string to actually make jQuery ready for use.
The results are promising and somewhat surprising – in a good way. (Note: results for IE are TBD.)
Charles reports two time measurements.
- The load time (blue) is how long it takes for the onload event to fire. No surprise – avoiding execution (“Closure”) results in a faster load time than normal script loading, and avoiding parsing and execution (“String”) allows the page to load even faster.
- The interesting and promising stat is the setup time (green) – how long it takes for the deferred code to be fully parsed and executed. The importance of this measurement is to see if using eval has penalties compared to the normal way of loading scripts. It turns out that in WebKit, Firefox, and iPhone there isn’t a significant cost for doing eval. Chrome is a different story and needs further investigation.
These techniques for deferred loading of JavaScript are great additions to have for optimizing web site performance. The results for IE are still to come from Charles, and will be the most important for gauging the applicability of this technique. Charles is presenting this technique at tomorrow’s Velocity Online Conference. I’m hoping he’ll have the IE results to give us the full picture on how this technique performs.
How browsers work
My initial work on the Web was on the backend – C++, Java, databases, Apache, etc. In 2005, I started focusing on web performance. To get a better idea of what made them slow, I surfed numerous web sites with a packet sniffer open. That’s when I discovered that a bulk of the time spent loading a web site occurs on the frontend, after the HTML document arrives at the browser.
Not knowing much about how the frontend worked, I spent a week searching for anything that could explain what was going on in the browser. The gem that I found was David Hyatt’s blog post entitled Testing Page Load Speed. His article opened my eyes to the complexity of what the browser does, and launched my foray into finding ways to optimize page load times resulting in things like YSlow and High Performance Web Sites.
Today’s post on the Chromium Blog (Technically speaking, what makes Google Chrome fast?), contains a similar gem. Mike Belshe, Chrome developer and co-creator of SPDY, talks about the performance optimizations inside of Chrome. But in so doing, he also reveals insights into how all browsers work and the challenges they face. For example, until I saw this, I didn’t have a real appreciation for the performance impact of DOM bindings – the connections between the JavaScript that modifies web pages and the C++ that implements the browser. He also talks about garbage collection, concurrent connections, lookahead parsing and downloading, domain sharding, and multiple processes.
Take 16.5 minutes and watch Mike’s video. It’s well worth it.
Google Analytics goes async
Today’s announcement that Google Analytics Launches Asynchronous Tracking is music to my ears. Not only does it make web sites faster, switching over to this async pattern improves uptime and increases the amount of analytics data gathered. I’ll touch on each of these three benefits, and wrap-up with an overview of the new code snippet.
The pain of loading JavaScript files is that they block the page from rendering and block other resources from downloading. There are workarounds to these problems. Chapter 4 of Even Faster Web Sites describes six techniques for Loading Scripts Without Blocking. One of those, the Script DOM Element approach, is the technique used in the new Google Analytics async pattern. Google Analytics’ ga.js file is a perfect example of a script that should be loaded asynchronously – it doesn’t add any content to the page, so we want to load it without blocking the images and stylesheets that give users what they really came to see.
What happens if a script takes a long time to load, or fails to load? Because scripts block rendering, users are left staring at an empty page. Google Analytics has an amazing infrastructure behind it, but any resource, especially from third parties, should be added cautiously. It’s great that the GA team is evangelizing a pattern that allows the web site to render while ga.js is being downloaded.
One workaround to the blocking problem is to move scripts to the bottom of the page. In fact, this is exactly what’s suggested in the old ga.js snippet. But this means users who leave a page quickly won’t generate any analytics data (they leave before the script at the bottom finishes loading). Moving to the async pattern and loading it at the bottom of the page’s head, as suggested, means more of these quick page views get measured. This is too good to believe – not only do you get a faster, more resilient page, but you actually get better insights into your traffic.
Just to be clear, ga.js will continue to work even if web site owners don’t make any changes. But, if you want a faster site, greater uptime, and more data, here’s what the new async snippet looks like:
var _gaq = _gaq || []; _gaq.push(['_setAccount', 'UA-XXXXX-X']); _gaq.push(['_trackPageview']); (function() { var ga = document.createElement('script'); ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js'; ga.setAttribute('async', 'true'); document.documentElement.firstChild.appendChild(ga); })();
It’s extremely cool to see this pattern being evangelized for such a major piece of the Internet. A few items of note:
- Obviously, you have to replace “UA-XXXXX-X” with your ID.
- Since ga.js is being loaded asynchronously, there has to be a way for web site owners to couple their desired GA functions with the code when it finishes loading. This is done by pushing commands onto the Google Analytics queue object, _gaq.
- Once all your callback commands are queued up, the ga.js script gets loaded. This is wrapped inside an anonymous function to avoid any namespace conflicts.
- Inside the anonymous function is where we see the Script DOM Element approach being used – with two nice improvements. A ‘script’ element is created and its SRC is set to the appropriate ga.js URL. Looking ahead to support of asynchronous scripts in HTML5, the ‘async’ attribute is set to ‘true’. Very nice! The main benefit of this is it tells the browser that subsequent scripts can be executed immediately – they don’t have to wait for ga.js. The last line adds the script element to the DOM. This is what triggers the actual download of ga.js. In most of my code I do document.getElementsByTagName(“head”)[0].appendChild, but that fails if the document doesn’t have a head element. This is a more robust implementation.
It’s always hard to find the right spot on the complexibility curve. This async snippet hits it just right. It’s slightly more complex than the old pattern, but not by much. Besides the benefits highlighted here, this new pattern is able to support more advanced usage patterns, including pushing an array of commands and pushing functions.
The theme driving much of my work this year is fast by default. I want high performance to be baked into the major components of the Web, so things are just fast. Seeing Google Analytics adopt this high performance async pattern is a huge win. But the proof is in the pudding. If you switch over to the new async pattern, measure how it affects your page load times and the amount of data gathered, and add a comment below. My prediction: 200ms faster and 10% more data. What do you see?
Chrome dev tools
I just finished reading the latest post on the Chromium Blog: An Update for Google Chrome’s Developer Tools. Dynatrace’s Ajax Edition was impressive – just take a look at what John Resig had to say. I’m also impressed by what’s been added to WebKit’s Web Inspector and Chrome’s dev tools. You should definitely take them for a spin, but I’ll give you a preview here.
A key part to any tool’s success is the assurance that there’s support behind it in the way of documentation, tutorials, issue tracking, etc. This blog post links to the new chrome dev tools site that has been put together, including several video tutorials. I spent most of my time walking through the full tutorial. To see these new tools, make sure to get on the Chrome dev channel. Finally, any issues can be seen and added via the chromium issues page.
Once you’ve got the latest Chrome dev release, you can access these tools by clicking on the Page menu () and select Developer -> Developer Tools. There are six panels to choose from. The Elements panel shows the DOM tree. A nice feature here is the ability to see the event listeners, including anonymous functions, attached to any element.
Much of my time analyzing web sites is spent looking at HTTP waterfall charts. This is captured in the Resources panel. Since this slows down web sites, it’s off by default. You can make it enabled permanently or for the current session. Doing so reloads the current web page automatically so you can see the HTTP waterfall chart. The DOMContentLoaded and Onload events are shown (blue and red vertical lines respectively). This is incredibly helpful for developers who are tuning their page for faster performance, so they can confirm deferred actions are happening at the right time. The only other tool I know that does this is Firebug’s Net panel.
JavaScript debugging has gotten some major enhancements including conditional breakpoints and watch expressions.
Developers can finally get insight into where their page’s load time is being spent by looking at the Timeline panel. In order to get timeline stats, you have to start and stop profiling by clicking on the round dot in the status bar at the bottom. The overall page load sequence is broken up into time segments spent on loading, JavaScript execution, and rendering.
The most differentiating features show up in the Profiles panel. Here, you can track CPU and memory (heap) stats. A couple other tools track CPU, but this is the only tool I’m aware of that tracks heap for each type of constructor.
Most of these features are part of WebKit’s Web Inspector. The new features added by the Chrome team are the timeline and heap panels. All of these improvements have arrived in the last month, and result in a tool that any web developer will find useful, especially for building even faster web sites.
dynaTrace Ajax Edition: tracing JS performance
DynaTrace has been around for several years focusing on the performance analysis of backend applications. They entered the frontend performance arena last week with the release of dynaTrace Ajax Edition. It’s a free tool that runs in IE as a browser helper object (BHO). I tried it out and was pleased. It’s important to have development tools that work in IE. I love Firefox and all its add-ons, but I also know how important it is to test on IE and more importantly to be able to debug on IE.
Once you’ve downloaded and installed DAE (my shorthand name for dynaTrace Ajax Edition), don’t be fooled if you don’t see an easy way to start it from within IE. You have to go under Programs in the Start Menu and find the dynaTrace program group. Entering a URL for the first test run is obvious. For subsequent runs, click on the “play” icon’s menu and pick “New Run Configuration…” and enter a different URL. One of the nice features of DAE is that it can run during a multi-page workflow. You can enter the starting URL, and then navigate to other pages or launch Ajax features while DAE monitors everything in the background. When you close IE you can dig into all the info DAE has gathered.
DAE gathers a ton of information on your web app. My main issue is that there’s so much information you really need to play with it for awhile to discover what it can do. This isn’t a bad thing – slightly challenging at the beginning, but once you know how to navigate through the UI you’ll find the answer for almost any performance or profiling question you have. I kept finding new windows that revealed different views of the information DAE had collected.
The main differentiator of DAE is all the JavaScript profiling it does. Time is broken out in various ways including by event triggers and by JavaScript API (libraries). It has an HTTP waterfall chart. A major feature is the ability to save the DAE analysis, so you can examine it later and share it with colleagues. There are other features that are more subtle, but pleasant to run into. For example, DAE automatically UNminifies JavaScript code, so you can look at a prettified version when debugging a live site.
When it comes to analyzing your JavaScript code to find what’s causing performance issues, dynaTrace Ajax Edition has the information to pinpoint the high-level area all the way down to the actual line of code that needs to be improved. I recommend you give it a test run and add it to your performance tool kit.
Mobile Gmail and async script loading
Mobile is the current web frontier where we’ll see the greatest growth in users and companies over the next few years. In case you didn’t see the announcement, Dion Almaer and Ben Galbraith just joined Palm to head up their developer relations program. At this week’s Velocity kick-off meeting, mobile performance was highlighted as a primary focus for next year’s conference.
With mobile on my mind, I was blown away by the awesome performance tips in this blog post: Gmail for Mobile HTML5 Series: Reducing Startup Latency. This post hits on the main point of my recent book – the impact of loading JavaScript. This is the #1 performance issue for today’s web apps. The problem is even worse for mobile devices where download times can be significantly worse than on the desktop.
One of the best practices I evangelize is splitting the initial payload. The Google Mobile team echoes this advice, recommending that fast mobile web apps must separate their code into modules that are critical to page startup versus modules that can be lazy-loaded. It’s important to carefully consider when to lazy-load this additional JavaScript.
One strategy is to lazy load the modules in the background once the home page has been loaded. This approach has some drawbacks. First, JavaScript execution in the browser is single threaded. So while you are loading the modules in the background, the rest of your app becomes non-responsive to user actions while the modules load. Second, it’s very difficult to decide when, and in what order, to load the modules. What if a user tries to access a feature/page you have yet to lazy load in the background? A better strategy is to associate the loading of a module with a user’s action.
Loading JavaScript in the background does indeed freeze the UI and lockout the user. Even worse, they don’t know why this is happening. They didn’t invoke any action – the lazy-load was kicked off in the background. But, I’ve seen web apps adopt this recommendation of loading modules when the user requests the additional functionality, and that’s not pretty either. Waiting for a script to download, especially over mobile connections, injects too much of a delay. But the Google Mobile team found an ingenious workaround:
…we wrote each module into a separate script tag and hid the code inside a comment block (/* */). When the resource first loads, none of the code is parsed since it is commented out. To load a module, find the DOM element for the corresponding script tag, strip out the comment block, and eval() the code.
Your parents were wrong – you can have your cake and eat it, too! Commenting out the JavaScript code avoids locking up the browser, so the actual download can happen in the background without affecting the user experience. Once the code arrives, modules are eval’ed on an as-needed basis, without the delay of actually downloading the script. The eval does take time, but this is minimal and is tied to the user’s actions, so it makes sense from the user’s perspective.
I’ve been working on ways to download scripts asynchronously for two years. I’ve never seen this technique before. It’s very promising, and I hope to explore a few enhancements. It sounds like Google Mobile did the download as an HTML document in an iframe (“each module in a separate script tag”). This means it would have to be downloaded from the same domain as the main page – something common at Google but less typical on other web sites using CDNs or Google AJAX Libraries API, or sharding resources across multiple domains. It would be nice if each module could be a standalone script with the code commented out. This would avoid the same domain restrictions, and be faster since the scripts could be downloaded in parallel.
Hats off to the Google Mobile team. And thanks for sharing!
Doloto: JavaScript download optimizer
One of the speakers at Velocity 2008 was Ben Livshits from Microsoft Research. He spoke about Doloto, a system for splitting up huge JavaScript payloads for better performance. I talk about Doloto in Even Faster Web Sites. When I wrote the book, Doloto was an internal project, but that all changed last week when Microsoft Research released Doloto to the public.
The project web site describes Doloto as:
…an AJAX application optimization tool, especially useful for large and complex Web 2.0 applications that contain a lot of code, such as Bing Maps, Hotmail, etc. Doloto analyzes AJAX application workloads and automatically performs code splitting of existing large Web 2.0 applications. After being processed by Doloto, an application will initially transfer only the portion of code necessary for application initialization.
Anyone who has tried to do code analysis on JavaScript (a la Caja) knows this is a complex problem. But it’s worth the effort:
In our experiments across a number of AJAX applications and network conditions, Doloto reduced the amount of initial downloaded JavaScript code by over 40%, or hundreds of kilobytes resulting in startup often faster by 30-40%, depending on network conditions.
Hats off to Ben and the rest of the Doloto team at Microsoft Research on the release of Doloto. This and other tools are sorely needed to help with some of the heavy lifting that now sits solely on the shoulders of web developers.
Positioning Inline Scripts
This post is based on a chapter from Even Faster Web Sites, the follow-up to High Performance Web Sites. Posts in this series include: chapters and contributing authors, Splitting the Initial Payload, Loading Scripts Without Blocking, Coupling Asynchronous Scripts, Positioning Inline Scripts, Sharding Dominant Domains, Flushing the Document Early, Using Iframes Sparingly, and Simplifying CSS Selectors.
My first three chapters in Even Faster Web Sites (Splitting the Initial Payload, Loading Scripts Without Blocking, and Coupling Asynchronous Scripts), focus on external scripts. But inline scripts block downloads and rendering just like external scripts do. The Inline Scripts Block example contains two images with an inline script between them. The inline script takes five seconds to execute. Looking at the HTTP waterfall chart, we see that the second image (which occurs after the inline script) is blocked from downloading until the inline script finishes executing.
Inline scripts block rendering, just like external scripts, but are worse. External scripts only block the rendering of elements below them in the page. Inline scripts block the rendering of everything in the page. You can see this by loading the Inline Scripts Block example in a new window and count off the five seconds before anything is displayed.
Here are three strategies for avoiding, or at least mitigating, the blocking behavior of inline scripts.
- move inline scripts to the bottom of the page – Although this still blocks rendering, resources in the page aren’t blocked from downloading.
- use setTimeout to kick off long executing code – If you have a function that takes a long time to execute, launching it via setTimeout allows the page to render and resources to download without being blocked.
- use DEFER – In Internet Explorer and Firefox 3.1 or greater, adding the DEFER attribute to the inline SCRIPT tag avoids the blocking behavior of inline scripts.
It’s especially important to avoid placing inline scripts between a stylesheet and any other resources in the page. This will make it as if the stylesheet was blocking the resources that follow from downloading. The reason for this behavior is that all major browsers preserve the order of CSS and JavaScript. The stylesheet has to be fully downloaded, parsed, and applied before the inline script is executed. And the inline script must be executed before the remaining resources can be downloaded. Therefore, resources that follow a stylesheet and inline script are blocked from downloading.
It’s better to explain this with an example. Below is the HTTP waterfall chart for eBay.com. Normally, the stylesheets and the first script would be downloaded in parallel.1 But after the stylesheet, there’s an inline script (with just one line of code). This causes the external script to be blocked from downloading until the stylesheet is received.

Stylesheet followed by inline script blocks downloads on eBay
This unnecessary blocking also happens on MSN, MySpace, and Wikipedia. The workaround is to move inline scripts above stylesheets or below other resources. This will increase parallel downloads resulting in a faster page.
Loading Scripts Without Blocking
This post is based on a chapter from Even Faster Web Sites, the follow-up to High Performance Web Sites. Posts in this series include: chapters and contributing authors, Splitting the Initial Payload, Loading Scripts Without Blocking, Coupling Asynchronous Scripts, Positioning Inline Scripts, Sharding Dominant Domains, Flushing the Document Early, Using Iframes Sparingly, and Simplifying CSS Selectors.
As more and more sites evolve into “Web 2.0” apps, the amount of JavaScript increases. This is a performance concern because scripts have a negative impact on page performance. Mainstream browsers (i.e., IE 6 and 7)Â block in two ways:
- Resources in the page are blocked from downloading if they are below the script.
- Elements are blocked from rendering if they are below the script.
The Scripts Block Downloads example demonstrates this. It contains two external scripts followed by an image, a stylesheet, and an iframe. The HTTP waterfall chart from loading this example in IE7 shows that the first script blocks all downloads, then the second script blocks all downloads, and finally the image, stylesheet, and iframe all download in parallel. Watching the page render, you’ll notice that the paragraph of text above the script renders immediately. However, the rest of the text in the HTML document is blocked from rendering until all the scripts are done loading.

Scripts block downloads in IE6&7, Firefox 2&3.0, Safari 3, Chrome 1, and Opera
Browsers are single threaded, so it’s understandable that while a script is executing the browser is unable to start other downloads. But there’s no reason that while the script is downloading the browser can’t start downloading other resources. And that’s exactly what newer browsers, including Internet Explorer 8, Safari 4, and Chrome 2, have done. The HTTP waterfall chart for the Scripts Block Downloads example in IE8 shows the scripts do indeed download in parallel, and the stylesheet is included in that parallel download. But the image and iframe are still blocked. Safari 4 and Chrome 2 behave in a similar way. Parallel downloading improves, but is still not as much as it could be.

Scripts still block, even in IE8, Safari 4, and Chrome 2
Fortunately, there are ways to get scripts to download without blocking any other resources in the page, even in older browsers. Unfortunately, it’s up to the web developer to do the heavy lifting.
There are six main techniques for downloading scripts without blocking:
- XHR Eval – Download the script via XHR and
eval()
the responseText. - XHR Injection – Download the script via XHR and inject it into the page by creating a script element and setting its
text
property to the responseText. - Script in Iframe – Wrap your script in an HTML page and download it as an iframe.
- Script DOM Element – Create a script element and set its
src
property to the script’s URL. - Script Defer – Add the script tag’s
defer
attribute. This used to only work in IE, but is now in Firefox 3.1. document.write
Script Tag – Write the<script src="">
HTML into the page usingdocument.write
. This only loads script without blocking in IE.
You can see an example of each technique using Cuzillion. It turns out that these techniques have several important differences, as shown in the following table. Most of them provide parallel downloads, although Script Defer and document.write
Script Tag are mixed. Some of the techniques can’t be used on cross-site scripts, and some require slight modifications to your existing scripts to get them to work. An area of differentiation that’s not widely discussed is whether the technique triggers the browser’s busy indicators (status bar, progress bar, tab icon, and cursor). If you’re loading multiple scripts that depend on each other, you’ll need a technique that preserves execution order.
Technique | Parallel Downloads | Domains can Differ | Existing Scripts | Busy Indicators | Ensures Order | Size (bytes) |
---|---|---|---|---|---|---|
XHR Eval | IE, FF, Saf, Chr, Op | no | no | Saf, Chr | – | ~500 |
XHR Injection | IE, FF, Saf, Chr, Op | no | yes | Saf, Chr | – | ~500 |
Script in Iframe | IE, FF, Saf, Chr, Op | no | no | IE, FF, Saf, Chr | – | ~50 |
Script DOM Element | IE, FF, Saf, Chr, Op | yes | yes | FF, Saf, Chr | FF, Op | ~200 |
Script Defer | IE, Saf4, Chr2, FF3.1 | yes | yes | IE, FF, Saf, Chr, Op | IE, FF, Saf, Chr, Op | ~50 |
document.write Script Tag | IE, Saf4, Chr2, Op | yes | yes | IE, FF, Saf, Chr, Op | IE, FF, Saf, Chr, Op | ~100 |
The question is: Which is the best technique? The optimal technique depends on your situation. This decision tree should be used as a guide. It’s not as complex as it looks. Only three variables determine the outcome: is the script on the same domain as the main page, is it necessary to preserve execution order, and should the busy indicators be triggered.
Ideally, the logic in this decision tree would be encapsulated in popular HTML templating languages (PHP, Python, Perl, etc.) so that the web developer could just call a function and be assured that their script gets loaded using the optimal technique.
In many situations, the Script DOM Element is a good choice. It works in all browsers, doesn’t have any cross-site scripting restrictions, is fairly simple to implement, and is well understood. The one catch is that it doesn’t preserve execution order across all browsers. If you have multiple scripts that depend on each other, you’ll need to concatenate them or use a different technique. If you have an inline script that depends on the external script, you’ll need to synchronize them. I call this “coupling” and present several ways to do this in Coupling Asynchronous Scripts.
Coupling asynchronous scripts
This post is based on a chapter from Even Faster Web Sites, the follow-up to High Performance Web Sites. Posts in this series include: chapters and contributing authors, Splitting the Initial Payload, Loading Scripts Without Blocking, Coupling Asynchronous Scripts, Positioning Inline Scripts, Sharding Dominant Domains, Flushing the Document Early, Using Iframes Sparingly, and Simplifying CSS Selectors.
Much of my recent work has been around loading external scripts asynchronously. When scripts are loaded the normal way (<script src="...">
) they block all other downloads in the page and any elements below the script are blocked from rendering. This can be seen in the Put Scripts at the Bottom example. Loading scripts asynchronously avoids this blocking behavior resulting in a faster loading page.
One issue with async script loading is dealing with inline scripts that use symbols defined in the external script. If the external script is loading asynchronously without thought to the inlined code, race conditions may result in undefined symbol errors. It’s necessary to ensure that the async external script and the inline script are coupled in such a way that the inlined code isn’t executed until after the async script finishes loading.
There are a few ways to couple async scripts with inline scripts.
- window’s onload – The inlined code can be tied to the window’s onload event. This is pretty simple to implement, but the inlined code won’t execute as early as it could.
- script’s onreadystatechange – The inlined code can be tied to the script’s onreadystatechange and onload events. (You need to implement both to cover all popular browsers.) This code is lengthier and more complex, but ensures that the inlined code is called as soon as the script finishes loading.
- hardcoded callback – The external script can be modified to explicitly kickoff the inlined script through a callback function. This is fine if the external script and inline script are being developed by the same team, but doesn’t provide the flexibility needed to couple 3rd party scripts with inlined code.
In this blog post I talk about two parallel (no pun intended) issues: how async scripts make the page load faster, and how async scripts and inline scripts can be coupled using a variation of John Resig’s Degrading Script Tags pattern. I illustrate these through a recent project I completed to make the UA Profiler results sortable. I did this using Stuart Langridge’s sorttable script. It took me ~5 minutes to add his script to my page and make the results table sortable. With a little more work I was able to speed up the page by more than 30% by using the techniques of async script loading and coupling async scripts.
Normal Script Tags
I initially added Stuart Langridge’s sorttable script to UA Profiler in the typical way (<script src="...">
), as seen in the Normal Script Tag implementation. The HTTP waterfall chart is shown in Figure 1.
Figure 1: Normal Script Tags HTTP waterfall chart
The table sorting worked, but I wasn’t happy with how it made the page slower. In Figure 1 we see how my version of the script (called “sorttable-async.js”) blocks the only other HTTP request in the page (“arrow-right-20×9.gif”), which makes the page load more slowly. These waterfall charts were captured using Firebug 1.3 beta. This new version of Firebug draws a vertical red line where the onload event occurs. (The vertical blue line is the domcontentloaded event.) For this Normal Script Tag version, onload fires at 487 ms.
Asynchronous Script Loading
The “sorttable-async.js” script isn’t necessary for the initial rendering of the page – sorting columns is only possible after the table has been rendered. This situation (external scripts that aren’t used for initial rendering) is a prime candidate for asynchronous script loading. The Asynchronous Script Loading implementation loads the script asynchronously using the Script DOM Element approach:
var script = document.createElement('script'); script.src = "sorttable-async.js"; script.text = "sorttable.init()"; // this is explained in the next section document.getElementsByTagName('head')[0].appendChild(script);
The HTTP waterfall chart for this Asynchronous Script Loading implementation is shown in Figure 2. Notice how using an asynchronous loading technique avoids the blocking behavior – “sorttable-async.js” and “arrow-right-20×9.gif” are loaded in parallel. This pulls in the onload time to 429 ms.
Figure 2: Asynchronous Script Loading HTTP waterfall chart
John Resig’s Degrading Script Tags Pattern
The Asynchronous Script Loading implementation makes the page load faster, but there is still one area for improvement. The default sorttable implementation bootstraps itself by attaching “sorttable.init()” to the onload handler. A performance improvement would be to call “sorttable.init()” in an inline script to bootstrap the code as soon as the external script was done loading. In this case, the “API” I’m using is just one function, but I wanted to try a pattern that would be flexible enough to support a more complex situation where the module couldn’t assume what API was going to be used.
I previously listed various ways that an inline script can be coupled with an asynchronously loaded external script: window’s onload, script’s onreadystatechange, and hardcoded callback. Instead, I used a technique derived from John Resig’s Degrading Script Tags pattern. John describes how to couple an inline script with an external script, like this:
<script src="jquery.js"> jQuery("p").addClass("pretty"); </script>
The way his implementation works is that the inlined code is only executed after the external script is done loading. There are several benefits to coupling inline and external scripts this way:
- simpler – one script tag instead of two
- clearer – the inlined code’s dependency on the external script is more obvious
- safer – if the external script fails to load, the inlined code is not executed, avoiding undefined symbol errors
It’s also a great pattern to use when the external script is loaded asynchronously. To use this technique, I had to change both the inlined code and the external script. For the inlined code, I added the third line shown above that sets the script.text
property. To complete the coupling, I added this code to the end of “sorttable-async.js”:
var scripts = document.getElementsByTagName("script"); var cntr = scripts.length; while ( cntr ) { Â Â Â var curScript = scripts[cntr-1]; Â Â Â if ( -1 != curScript.src.indexOf('sorttable-async.js') ) { Â Â Â Â Â Â Â eval( curScript.innerHTML ); Â Â Â Â Â Â Â break; Â Â Â } Â Â Â cntr--; }
This code iterates over all scripts in the page until it finds the script block that loaded itself (in this case, the script with src
containing “sorttable-async.js”). It then evals the code that was added to the script (in this case, “sorttable.init()”) and thus bootstraps itself. (A side note: although the line of code was added using the script’s text property, here it’s referenced using the innerHTML property. This is necessary to make it work across browsers.) With this optimization, the external script loads without blocking other resources, and the inlined code is executed as soon as possible.
Lazy Loading
The load time of the page can be improved even more by lazyloading this script (loading it dynamically as part of the onload handler). The code behind this Lazyload version just wraps the previous code within the onload handler:
window.onload = function() { Â Â Â var script = document.createElement('script'); Â Â Â script.src = "sorttable-async.js"; Â Â Â script.text = "sorttable.init()"; Â Â Â document.getElementsByTagName('head')[0].appendChild(script); }
This situation absolutely requires this script coupling technique. The previous bootstrapping code that called “sorttable.init()” in the onload handler won’t be called here because the onload event has already passed. The benefit of lazyloading the code is that the onload time occurs even sooner, as shown in Figure 3. The onload event, indicated by the vertical red line, occurs at ~320 ms.
Figure 3: Lazyloading HTTP waterfall chart
Conclusion
Loading scripts asynchronously and lazyloading scripts improve page load times by avoiding the blocking behavior that scripts typically cause. This is shown in the different versions of adding sorttable to UA Profiler:
- Normal Script Tags – 487 ms
- Asynchronous Script Loading – 429 ms
- Lazyloading – ~320 ms
The times above indicate when the onload event occurred. For other web apps, improving when the asynchronously loaded functionality is attached might be a higher priority. In that case, the Asynchronous Script Loading version is slightly better (~400 ms versus 417 ms). In both cases, being able to couple inline scripts with the external script is a necessity. The technique shown here is a way to do that while also improving page load times.