Page Speed 1.6 Beta – new rules, native library
Page Speed 1.6 Beta was released today. There are a few big changes, but the most important fix is compatibility with Firefox 3.6. If you’re running the latest version of Firefox, visit the download page to get Page Speed 1.6. Phew!
I wanted to highlight some of the new features mentioned in the 1.6 release notes: new rules and the native library.
Three new rules were added as part of Page Speed 1.6:
- Specify a character set early – If you don’t specify a character set for your web pages, or specify it too far down in the page, the browser may parse the content incorrectly. You can specify a character set using the META tag or in the Content-Type response header. Returning the charset in the Content-Type header ensures the browser sees it as early as possible. (See this Zoompf post for more information, and the example just after this list.)
- Minify HTML – Top performing web sites are already on top of this, right? Analyzing the Alexa U.S. top 10 shows an average savings of 8% if they minified their HTML. You can easily check your site with this new rule, and even save the optimized version.
- Minimize Request Size – Okay, this is cool and shows how Google tries to squeeze out every last drop of performance. This rule checks whether the total size of the request headers exceeds one packet (~1500 bytes). Requiring an extra round trip just to submit the request hurts performance, especially for users on high-latency connections.
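To make the character set rule concrete, here’s a minimal sketch of the two places a charset can be declared (UTF-8 is just an example value):

    <!-- Preferred: in the HTTP response header, where the browser sees it earliest:
           Content-Type: text/html; charset=UTF-8
    -->

    <!-- Alternative: in the document itself, as early as possible inside <head>: -->
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">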
The other big feature I wanted to highlight first came out in Page Speed 1.5 but didn’t get much attention – the Page Speed C++ Native Library. It probably didn’t get much attention because it’s one of those changes that, if done correctly, no one notices. The work behind the native library involves porting the rules from JavaScript to C++. Why bother? According to the release notes, running the rules natively makes Page Speed faster, and packaging them as a C++ library lets other programs reuse them.
Making Page Speed run faster is great, but the idea of implementing the performance logic in a C++ library so the rules can be run in other programs is very cool. And where have we seen this recently? In the Site Performance section recently added to Webmaster Tools. Now we have a server-side tool that produces the same recommendations found from running the Page Speed add-on. Here are the rules that have been ported to the native library:
(The release notes list which rules were ported to the native library in 1.5 and which were added in 1.6.)
Webmaster Tools Site Performance today shows recommendations based on the rules in the 1.5 native library. Now that more rules have been added in 1.6, webmasters can expect to see those recommendations in the near future. But this integration shouldn’t stop with Webmaster Tools. I’d love to see other tools and services integrate the native library. If you’re interested in using it, check out the page-speed project on Google Code and contact the page-speed-discuss Google Group.
Zen and the Art of Web Performance
The Zen Wall
We moved into a teardown 2BR/1BA house in 1993. After two years of infrastructure repairs (heating, plumbing, electrical, doors, windows) we started work on the yard. One of the jobs that I enjoyed most was building a dry stone retaining wall in the front yard. Here are the before and after pictures:
During this multi-month project my neighbor, Les Kaye, came by to survey the progress. Les is the Zen monk at the Kannon Do Zen Meditation Center and author of Zen at Work. He made a comment about my rock wall that I’ve always remembered – “very Zen-like”.
I’m not a Zen student, but I have read a few Zen books (Les’ Zen at Work, Tao Te Ching by Lao Tsu, the Tao of Leadership by John Heider, and Zen and the Art of Motorcycle Maintenance by Robert Pirsig). I think Les’ comment had to do with the way I was building the wall. I wasn’t using any tools, and didn’t have any drawings or plans. I had never built a dry stone wall before. I read a few articles on dry stone construction, got advice from the folks where I bought the stone, and then just started.
I had a sense of where the wall should go in the yard, and where the steps should go in the wall. I laid the base, and then worked up from there. I would try a few stones for a given spot until I found the right one. I could recognize if a stone worked, and if it didn’t. Sometimes I’d have to backtrack a bit to undo some bad stones and start over where the good stones left off. In the end, the wall turned out great. It’s now covered with moss, overhung by rosemary, and filled with lizards that come out to sunbathe in the summer.
Les’ comment that my rock wall project was Zen-like had to do with this marriage of an artistic sense of what was right with the technical process of building a wall. The separation of art and technology, and the need to rejoin them, is a major theme in Zen and the Art of Motorcycle Maintenance that’s been on my mind lately.
Art & Technology
The way I approached this rock wall project seems a bit ill-conceived. You certainly wouldn’t run a software development project this way. A well-planned project would make use of good tools. It would have a detailed plan. It would include people that had experience, at least in a consulting role. And it would have regular milestones that could be objectively measured and quantified. But these (technical, mechanical, objective) process parts aren’t sufficient to ensure success. A sense of craftsmanship is also needed.
I’ve had the experience where I’ve hired a home contractor who had all the right tools and experience, and was equipped with an agreed-upon plan, but at the end of the job I wasn’t satisfied with the outcome. The end result didn’t fit. It was out of proportion – something we couldn’t have noticed looking at the plans. Or it didn’t fit the flow of the people living in the house – the light switches were placed in an awkward location in a counterintuitive order, or the door swung out too far and blocked traffic.
I’m not saying you shouldn’t have plans and tools, and metrics and experts. And I’m certainly not running down home contractors – many great contractors have worked on our house. I’m saying that the high quality results I’ve experienced resulted when technical skills were combined with a sense of what was “right” in the work. Robert Pirsig says it this way in describing a wall built in Korea:
[The wall] was beautiful, but not because of any masterful intellectual planning or any scientific supervision of the job, or any added expenditures to “stylize” it. It was beautiful because the people who worked on it had a way of looking at things that made them do it right unselfconsciously. They didn’t separate themselves from the work in such a way as to do it wrong. […] In each case there’s a beautiful way of doing it and an ugly way of doing it, and in arriving at the high-quality, beautiful way of doing it, both an ability to see what “looks good” and an ability to understand the underlying methods to arrive at that “good” are needed.1
Beautiful Web Performance
My work on web performance has helped identify the “underlying methods” for arriving at a quality result in terms of a fast web site. To make it easy to find out which methods to use, I created YSlow and encouraged the team that launched Page Speed. But these tools don’t preclude the need to understand the techniques behind web performance. I was reminded of this today by an email I received through my web site. The emailer said he had installed YSlow and saw what needed to be fixed, and asked for the step-by-step instructions on how to accomplish these changes.
Without any details of his web site, HTML & JavaScript frameworks, etc., it was impossible to generate step-by-step instructions. But even if it were possible, following someone else’s list of instructions would separate him from the work and deprive him of the experience of the “beautiful way of doing it”. Optimizing a web site isn’t something to undertake without understanding what you’re changing and why; doing it blindly inevitably introduces complexity, confusion, and loss of quality. It would be better for him to develop that understanding, or, if that’s not possible, to use a framework that incorporates performance best practices.
In addition to an understanding of the underlying methods, the ability to see what “looks good” is also needed, as Pirsig says. When it comes to web performance, it’s pretty easy to distinguish the good from the bad. How does your site feel when you use it? How about when you use it from home or from a mobile device? WebPagetest.org helps by offering multiple ways to visualize the performance of your web site, and to compare that to other sites.
Over time, combining an understanding of the underlying methods for web performance with an ability to see what “looks good” leads to doing quality development “unselfconsciously”. It becomes second nature, and permeates every site you build. As I read Zen and the Art of Motorcycle Maintenance I was amazed at how well it spoke to the work I do, even more amazing given that it was published in 1974. The search for Quality is timeless. I recommend you read ZMM, or read it again as the case may be. Raising the level of quality in your work will benefit your users and your employer, but it’ll benefit you most of all.
1Robert M. Pirsig, Zen and the Art of Motorcycle Maintenance An Inquiry Into Values (P.S.) (New York, Harper Perennial Modern Classics, 2005) p. 298.
jQuery 1.4 performance
jQuery 1.4 was released yesterday. I lifted the text from the release announcement, removed stop words, converted it to lowercase, and found the ten most used words:
- jquery (71)
- function (27)
- performance (23)
- object (20)
- events (19)
- element (15)
- ajax (15)
- dom (13)
- json (12)
- request (10)
That’s right, “performance” comes in third ahead of “object”, “element”, and even “dom”. Anyone think jQuery 1.4 had a focus on performance? Here’s what John Resig says.
Performance Overhaul of Popular Methods
Many of the most popular and commonly used jQuery methods have seen a significant rewrite in jQuery 1.4. When analyzing the code base we found that we were able to make some significant performance gains by comparing jQuery against itself: Seeing how many internal function calls were being made and to work to reduce the complexity of the code base.
He includes this chart that shows the reduction of complexity for some popular functions.
Of course, all of this is music to my ears. There was one other specific note that caught my eye in this commit comment:
Switched from using YUI Compressor to Google Compiler. Minified and Gzipped filesize reduced to 22,839 bytes from 26,169 bytes (13% decrease in filesize).
Minifying JavaScript is one of the rules I wrote about in High Performance Web Sites. Back then (2006-2007), the best tool for minifying was JSMin from Doug Crockford. It still might be the best tool today for minifying in realtime (e.g., dynamic Ajax and JSON responses). For minifying static files, YUI Compressor (released in late 2007) does a better job. It also works on CSS. So this move from YUI Compressor to the Google Closure Compiler by John Resig, someone who obviously cares about performance, is a big deal.
For jQuery 1.4, the savings from switching to Compiler was 13%. If you have done comparisons with your code, please add your stats via a comment below.
My last blog post (Stuck inside Classic Rock) got pretty esoteric at the end when I started talking about Quality, and I promised a follow-up post on how that related to web performance. I’m still working on that post, but am happy to take this digression. But is it a digression? I’ve been talking to folks over the past week about how they strive for and compromise on quality in their jobs. We all compromise on quality to a certain degree. But occasionally, a person is afforded the opportunity to dedicate a significant portion of their life to a single-minded purpose, and can reach levels of Quality that stand out in comparison. John Resig has achieved that. Congratulations to John and the jQuery team. Keep up the good (high performance) work!
Stuck inside Classic Rock
Help! I’m trapped inside Classic Rock and can’t get out!
I grew up in the 60s and 70s listening to what is now called “classic rock”. My first album was Creedence Clearwater Revival’s Cosmo’s Factory. Ramble Tamble is still one of my favorite rock songs. I bought that album when I was 10 years old. I wish I was that cool but I was (am) not – my next album was The Partridge Family Album.
Through my teenage and college years I listened to the bands that make up every classic rock playlist: Allman Brothers, Grateful Dead, Rolling Stones, The Who, Steve Miller, Lynyrd Skynyrd, Bruce Springsteen, Led Zeppelin, Tom Petty, etc. I don’t listen to Partridge Family any more, but I listen to these other bands every day. My iTunes library is full of this music. My default station on both Pandora and Slacker is “Classic Rock”. And my main radio station is KFOG, home of “world class rock”.
Even though I love classic rock, I do like to mix in some new music now and then. New music that’s cut from the same mold, that is. Some new bands I’ve added over the years include Counting Crows, U2, and Red Hot Chili Peppers. That shows how old I am. I think of U2 as “new” – a band that’s been together for 30 years. I desperately needed some truly new music, so I grabbed the copy of Rolling Stone featuring the decade’s best songs & albums and went searching.
I found some great new (10 years old or less) music, and wanted to share what I found. What’s this have to do with web performance? I’ll get to that at the end. If you don’t want to wade through my music recommendations, skip to the bottom and find out what the connection is.
“New” Classic Rock
Here are singles I added via iTunes:
- Wake Up by Arcade Fire – Theme song from the Where the Wild Things Are movie. You should listen to this song first thing every morning when you get up.
- Take Me Out and No You Girls by Franz Ferdinand – Great lyrics to a good euro beat. Maybe danceable, but also capable of generating some headbanging.
- Hurt by Johnny Cash – Johnny Cash covers Nine Inch Nails?! I had to listen to this and was hooked. With lyrics like “what have I become” and “empire of dirt” it makes you think.
- One More Time by Daft Punk – Electro-disco dance mix. Play this in the car when the sun’s out and the windows are down, or the next time a dance party breaks out in your living room (whichever comes first).
- Paper Planes by M.I.A. – I’m not huge on hip-hop, so I fell short of buying the album. But this catchy song featured in Slumdog Millionaire is a great addition to the playlist. My daughters and wife liked it – bonus points!
- Float On by Modest Mouse – Singer Isaac Brock sounds like David Byrne in this song that mixes driving choruses with melodic lyrics.
- Last Nite by The Strokes – A rocking song that’ll get you moving.
I buy my CDs on Amazon. Just this week a buddy predicted the music industry would soon stop making CDs. I hope not. I have so much music it’s overwhelming. I like holding a collection of songs in my hand, taking it with me in the car or on a trip, looking at the cover art, and reading the liner notes. All of this helps me better capture a mental picture of the music. I also believe artists arrange songs together and in a particular order to achieve additional impact. True or not, I like physical CDs. Here’s what I bought:
- Fleet Foxes by Fleet Foxes – This was an easy, almost backward, transition from classic rock to new music. These guys sound like CSNY – harmonies and easy lyrics. Good listening.
- Funeral by Arcade Fire – I bought the single of Wake Up on iTunes and then got it again when I bought the CD – not smart. But it was worth the extra $0.69 to have that one song for the few days it took for the album to arrive.
- A Rush of Blood to the Head by Coldplay – Coldplay churns out hits, which is a turnoff for me. This album has hits, or at least songs you’ll recognize, like In My Place, The Scientist, and Clocks. But the lesser known songs on this album are what’s intriguing – Politik, God Put a Smile upon Your Face, Green Eyes – let’s face it, all the songs on this album are good or great.
- Yoshimi Battles the Pink Robots by The Flaming Lips – Wow. Incredible and hard to describe. It feels like someone used a Pink Floyd machine to translate a comic book to music. I listened to Flight Test and Yoshimi Battles the Pink Robots, Pt. 1 multiple times each day when I first got this. You can hear the influences from Cat Stevens (Flight Test) and Neil Young (In the Morning of the Magicians).
- Only by the Night by Kings of Leon – You’ll recognize the track Use Somebody. I like the singer’s voice – it reminds me of Bob Seger.
- Z by My Morning Jacket – Southern rock meets Hothouse Flowers, with some early Who influence.
What’s the Connection?
It has been a few weeks since my last blog post. I’m having a hard time getting going again after the long holiday break. This blog post on classic rock was an easy first step back into the world of blogging.
But there’s more to it than that.
I’m slow getting back into blogging because I’m having trouble imagining putting down words that have value. Over the break I read Zen and the Art of Motorcycle Maintenance by Robert Pirsig (for the third time). My mind is still swimming from the visions and thoughts stirred up by that book. Having experienced what an author is capable of with words, how can I attempt to do anything even remotely similar? I am not worthy!
It gets worse, or at least more complicated.
As I drive around listening to my “new” classic rock, I get the same overwhelming, swimming feeling. Many of these songs move me. They’re beautiful. Even the headbanging songs are beautiful. They’re beautiful in the way the artist has reached through the car stereo speakers and changed the way I feel. They’re beautiful in the way they connect and convey. Just like ZMM is beautiful and connects and conveys. And I stop and ask myself – Is my blog beautiful? Really what I mean is, can my blog connect and convey?
I’ve spent a lot of time over the last year thinking about and observing how we connect with each other through writing, music, art, movies, personal interactions, and the Web. I’ve been thinking about what makes those connections more successful, more enjoyable, and ultimately more moving. For my world I’ve been concerned with what makes a blog post, tech presentation, book, or piece of code more beautiful. It came down to asking and defining – What is Beauty? And that’s when I hit a brick wall. I couldn’t define it. I know it when I see it. I can experience it. But I can’t define it. How can I improve something I can’t define?
As I read ZMM, I realized this was similar to Phaedrus’ struggle to define Quality. And the connection Phaedrus found between Quality and what the Greeks called aretê, or Excellence, is similarly connected to what I have been calling Beauty, and is probably a better name for what I’ve been searching to improve. Quality, Excellence, and Beauty – they’re all the same, or at least closely related.
Quality, excellence, and beauty are, or at least should be, in our work. And that’s the connection. This music is excellent and beautiful, and I want to find that quality in the world of web performance, and find a way to express it and communicate it, and have all of us carry it with us as we do our work. As I read ZMM, I saw many similarities between web performance and motorcycle maintenance. I feel like there’s at least one more blog post on the topic, if not an entire book. Hmmm, Zen and the Art of Web Performance. I like it. But let’s see how the blog post goes first. Stay tuned…
Crockford, webhosting, online dating, JSON, alert
This is a fun story that has a security and performance point to it.
Earlier today, Dion Almaer tweeted:
Wow, Doug added an alert() to http://www.json.org/json2.js which just alerted a LOT of people on the Internet. Proving a point on hotlinks?
I was talking to Doug about his keynote at Add-on-Con tomorrow, and asked him what the motivation was for this alert message. It turns out his webhosting service had contacted him about the unusually high amount of traffic on json.org. Doug investigated and discovered that OnlineBootyCall was linking directly to http://json.org/json.js, in spite of this statement in the file:
USE YOUR OWN COPY. IT IS EXTREMELY UNWISE TO LOAD CODE FROM SERVERS YOU DO NOT CONTROL.
Linking directly to http://json.org/json.js is bad. Certainly, it puts a load on Doug’s webhosting company that shouldn’t be there. But more importantly, it exposes the content site to security and performance vulnerabilities. Loading third party scripts into the parent window gives that third party access to cookies and other potentially confidential information in the page. Accessing that script from a third party domain requires an additional DNS lookup (which can be costly). Also, if the script is at the top of the page (which it is in this case) and the third party site is slow or not responding, the entire page is left blank for thirty seconds or more.
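To make that concrete, the fix is as simple as serving your own copy of the file (the local path below is just an example):

    <!-- Risky: hotlinking code from a server you don't control -->
    <script src="http://json.org/json.js"></script>

    <!-- Better: host your own copy on your own domain -->
    <script src="/js/json.js"></script>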
It’s best to reduce the number of third party scripts on your site. That was the reason Doug added the alert message to the top of json.js. If you haven’t gone to OnlineBootyCall yet today, here’s what you would’ve seen:
In Doug’s words,
I’m helping onlinebootycall improve its performance.
Speed Tracer – visibility into the browser
Is it just me, or does anyone else think Google’s on fire lately, lighting up the world of web performance? Quick review of news from the past two weeks:
- timeline and heap profiler added to Chrome Dev Tools
- Google Analytics publishes async script loading pattern
- latency and Page Speed recommendations added to Webmaster Tools
- deep dive into what makes Chrome (and browsers in general) fast
- Google Public DNS launched
- and now… the release of Speed Tracer
Speed Tracer was my highlight from last night’s Google Campfire One. The event celebrated the release of GWT 2.0. Performance and “faster” were emphasized again and again throughout the evening’s presentations (I love that). GWT’s new code splitting capabilities are great for performance, but Speed Tracer easily wowed the audience – including me. In this post, I’ll describe what I like about Speed Tracer, what I hope to see added next, and then I’ll step back and talk about the state of performance profilers.
Getting started with Speed Tracer
Some quick notes about Speed Tracer:
- It’s a Chrome extension, so it only runs in Chrome. (Chrome extension support is yet another of this week’s announcements.)
- It’s written in GWT 2.0.
- It works on all web sites, even sites that don’t use GWT.
The Speed Tracer getting started page provides the details for installation. You have to be on the Chrome dev channel. Installing Speed Tracer adds a green stopwatch to the toolbar. Clicking on the icon starts Speed Tracer in a separate Chrome window. As you surf sites in the original window, the performance information is shown in the Speed Tracer window.
Beautiful visibility
When it comes to optimizing performance, developers have long been working in the dark. Without the ability to measure JavaScript execution, page layout, reflows, and HTML parsing, it’s not possible to optimize the pain points of today’s web apps. Speed Tracer gives developers visibility into these parts of page loading via the Sluggishness view, as shown here. (Click on the figure to see a full screen view.) Not only is this kind of visibility great, but the display is just, well, beautiful. Good UI and dev tools don’t often intersect, but when they do it makes development that much easier and more enjoyable.
Speed Tracer also has a Network view, with the requisite waterfall chart of HTTP requests. Performance hints are built into the tool, flagging issues such as bad cache headers, exceedingly long responses, Mozilla cache hash collisions, too many reflows, and uncompressed responses. Speed Tracer also supports saving and reloading the profiled information. This is extremely useful when working on bugs or analyzing performance with other team members.
Feature requests
I’m definitely going to be using Speed Tracer. For a first version, it’s extremely feature rich and robust. There are a few enhancements that will make it even stronger:
- overall pie chart – The “breakdown by time” for phases like script evaluation and layout is available for segments within a page load. As a starting point, I’d like to see the breakdown for the entire page. When drilling down on a specific load segment, this detail is great. But having overall stats will give developers a clue about where they should focus most of their attention.
- network timing – Similar to the issues I discovered in Firebug Net Panel, long-executing JavaScript in the main page blocks the network monitor from accurately measuring the duration of HTTP requests. This will likely require changes to WebKit to record event times in the events themselves, as was done in the fix for Firefox.
- .HAR support – Being able to save Speed Tracer’s data to file and share it is great. Recently, Firebug, HttpWatch, and DebugBar have all launched support for the HTTP Archive file format I helped create. The format is extensible, so I hope to see Speed Tracer support the .HAR file format soon. Being able to share performance information across tools and browsers is a necessary next step. That’s a good segue…
Developers need more
Three years ago, there was only one tool for profiling web pages: Firebug. Developers love working in Firefox, but sometimes you just have to profile in Internet Explorer. Luckily, over the last year we’ve seen some good profilers come out for IE, including MSFast, AOL Pagetest, WebPagetest.org, and dynaTrace Ajax Edition. dynaTrace’s tool is the most recent addition, and has great visibility similar to Speed Tracer, as well as JavaScript debugging capabilities. There have been great enhancements to Web Inspector, and the Chrome team has built on top of that, adding timeline and memory profiling to Chrome. And now Speed Tracer is out and bubbling to the top of the heap.
The obvious question is:
Which tool should a developer choose?
But the more important question is:
Why should a developer have to choose?
There are eight performance profilers listed here. None of them work in more than a single browser. I realize web developers are exceedingly intelligent and hardworking, but no one enjoys having to use two different tools for the same task. Yet that’s exactly what developers are being asked to do. To be a good developer, you have to be profiling your web site in multiple browsers. By definition, that means you have to install, learn, and update multiple tools. In addition, there are numerous quirks to keep in mind when going from one tool to another. And the features offered are not consistent across tools. It’s a real challenge to verify that your web app performs well across the major browsers. When pressed, even the rock star web developers I ask admit they only use one or two profilers – it’s just too hard to stay on top of a separate tool for each browser.
This week at Add-on-Con, Doug Crockford’s closing keynote is about the Future of the Web Browser. He’s assembled a panel of representatives from Chrome, Opera, Firefox, and IE. (Safari declined to attend.) My hope is they’ll discuss the need for a cross-browser extension model. There’s been progress in building protocols to support remote debugging: WebDebugProtocol and Crossfire in Firefox, Scope in Opera, and ChromeDevTools in Chrome. My hope for 2010 is that we see cross-browser convergence on standards for extensions and remote debugging, so that developers will have a slightly easier path for ensuring their apps are high performance on all browsers.
(down)Loading JavaScript as strings
The Gmail mobile team and Charles Jolley from SproutCore have recently published some interesting techniques for loading JavaScript in a deferred manner. Anyone building performant web apps is familiar with the pain inflicted when loading JavaScript. These new techniques are great patterns. Let me expand on how they work and the context for using them. FYI – Charles is presenting this technique at tomorrow’s Velocity Online Conference. Check that out if you’re interested in finding out more and asking him questions.
When to defer JavaScript loading
I’ve spent much of the last two years researching and evangelizing techniques for loading scripts without blocking. These techniques address the situation where you need to load external scripts to render the initial page. But not all JavaScript is necessary for loading the initial page. Most Web 2.0 apps include JavaScript that’s only used later in the session, depending on what the user clicks on (dropdown menus, popup DIVs, Ajax actions, etc.). In fact, the Alexa top ten only use 25% of the downloaded JavaScript to load the initial page (see Splitting the Initial Payload).
The performance optimization resulting from this observation is clear – defer the loading of JavaScript that’s not part of initial page rendering. But how?
Deferred loading is certainly achievable using the non-blocking techniques I’ve researched – but my techniques might not be the best choice for this yet-to-be-used JavaScript code. Here’s why: Suppose you have 300K of JavaScript that can be deferred (it’s not used to render the initial page). When you load this script later using my techniques, the UI locks up while the browser parses and executes that 300K of code. We’ve all experienced this in certain web apps. After the web app initially loads, clicking on a link doesn’t do anything. In extreme situations, the browser’s tab icon stops animating. Not a great user experience.
If you’re certain that code is going to be used, then so be it – parse and execute the code when it’s downloaded using my techniques. But in many situations, the user may never exercise all of this deferred code. She might not click on any of the optional features, or she might only use a subset of them.
Is there a way to download this code in a deferred way, without locking up the browser UI?
Deferred loading without locking up the UI
I recently blogged about a great optimization used in mobile Gmail for loading JavaScript in a deferred manner: Mobile Gmail and async script loading. That team was acutely aware of how loading JavaScript in the background locked up mobile browsers. The technique they came up with was to wrap the JavaScript in comments. This allows the code to be downloaded, but avoids the CPU lockup for parsing and execution. Later, when the user clicks on a feature that needs code, a cool dynamic technique is used to extract the code from the comments and eval it.
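Here’s a minimal sketch of the comment-wrapping idea – this is not Gmail’s actual code, and the element id and function name are made up:

    <!-- The deferred code is downloaded with the page but hidden inside a comment,
         so the browser doesn't spend CPU parsing or executing it yet. -->
    <script id="deferred-module" type="text/javascript">
    /*
    window.showFancyMenu = function () {
      // ...code that's only needed if the user opens the menu...
      alert("menu code is now live");
    };
    */
    </script>

    <script type="text/javascript">
    // When the user first needs the feature, strip the comment markers and eval.
    function activateDeferredModule(id) {
      var el = document.getElementById(id);
      var src = el.text || el.innerHTML;                      // the commented-out source
      var code = src.replace(/^\s*\/\*/, "").replace(/\*\/\s*$/, "");
      eval(code);                                             // defines window.showFancyMenu
    }

    // e.g. on the menu's first click:
    //   activateDeferredModule("deferred-module");
    //   window.showFancyMenu();
    </script>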
This technique has many benefits. It gets the download delays out of the way, so the code is already on the client if and when the user needs it. It also avoids the CPU load of parsing and executing the code – which can be significant given the size of JavaScript payloads in today’s web apps. One downside comes from same-origin restrictions: the commented-out code must be inline in the main page or in an iframe, since the page’s JavaScript has to be able to read it as text.
This is where Charles Jolley (from the SproutCore team) started his investigation. He wanted a technique that was more flexible and worked across domains. He presents his new technique (along with results from experiments) in two blog posts: Faster Loading Through Eval() and Cut Your JavaScript Load Time 90% with Deferred Evaluation. This new technique is to capture the deferred JavaScript as strings which can be downloaded with negligible parsing time. Later, when the user triggers a feature, the relevant code strings are eval’ed.
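A rough sketch of the string approach, using a made-up module name to illustrate (Charles’s posts have the real implementation and measurements):

    // Deferred code arrives as a plain string (e.g. in a JS file that only assigns
    // string variables, or in a JSON response). Assigning a string is cheap; the
    // expensive parse/execute step is postponed until eval().
    var deferredModules = {
      fancyMenu: "window.showFancyMenu = function () {" +
                 "  alert('menu code is now live');" +
                 "};"
    };

    // Called the first time the user triggers the corresponding feature.
    function requireModule(name) {
      if (deferredModules[name]) {
        eval(deferredModules[name]);   // parse + execute happens here, on demand
        deferredModules[name] = null;  // don't eval the same module twice
      }
    }

    // Usage: nothing is parsed or executed until the user actually clicks.
    //   requireModule("fancyMenu");
    //   window.showFancyMenu();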
His experiment includes three scenarios for loading jQuery:
- Baseline – load jQuery like normal via script tag. jQuery is parsed and executed immediately on load.
- Closure – load jQuery in a closure but don’t actually execute the closure until after the onload event fires. This essentially means the jQuery code will be parsed but not executed until later.
- String – load jQuery as a giant string. After the onload event fires, eval() the string to actually make jQuery ready for use.
The results are promising and somewhat surprising – in a good way. (Note: results for IE are TBD.)
Charles reports two time measurements.
- The load time (blue) is how long it takes for the onload event to fire. No surprise – avoiding execution (“Closure”) results in a faster load time than normal script loading, and avoiding parsing and execution (“String”) allows the page to load even faster.
- The interesting and promising stat is the setup time (green) – how long it takes for the deferred code to be fully parsed and executed. The importance of this measurement is to see if using eval has penalties compared to the normal way of loading scripts. It turns out that in WebKit, Firefox, and iPhone there isn’t a significant cost for doing eval. Chrome is a different story and needs further investigation.
These techniques for deferred loading of JavaScript are great additions to the toolbox for optimizing web site performance. The results for IE are still to come from Charles, and will be the most important for gauging the applicability of this technique. Charles is presenting it at tomorrow’s Velocity Online Conference; I’m hoping he’ll have the IE results by then to give us the full picture of how the technique performs.
How browsers work
My initial work on the Web was on the backend – C++, Java, databases, Apache, etc. In 2005, I started focusing on web performance. To get a better idea of what made web sites slow, I surfed numerous sites with a packet sniffer open. That’s when I discovered that the bulk of the time spent loading a web site occurs on the frontend, after the HTML document arrives at the browser.
Not knowing much about how the frontend worked, I spent a week searching for anything that could explain what was going on in the browser. The gem that I found was David Hyatt’s blog post entitled Testing Page Load Speed. His article opened my eyes to the complexity of what the browser does, and launched my foray into finding ways to optimize page load times resulting in things like YSlow and High Performance Web Sites.
Today’s post on the Chromium Blog (Technically speaking, what makes Google Chrome fast?) contains a similar gem. Mike Belshe, Chrome developer and co-creator of SPDY, talks about the performance optimizations inside Chrome. But in so doing, he also reveals insights into how all browsers work and the challenges they face. For example, until I saw this, I didn’t have a real appreciation for the performance impact of DOM bindings – the connections between the JavaScript that modifies web pages and the C++ that implements the browser. He also talks about garbage collection, concurrent connections, lookahead parsing and downloading, domain sharding, and multiple processes.
Take 16.5 minutes and watch Mike’s video. It’s well worth it.
Google Analytics goes async
Today’s announcement that Google Analytics Launches Asynchronous Tracking is music to my ears. Not only does it make web sites faster, but switching to this async pattern also improves uptime and increases the amount of analytics data gathered. I’ll touch on each of these three benefits, and wrap up with an overview of the new code snippet.
The pain of loading JavaScript files is that they block the page from rendering and block other resources from downloading. There are workarounds to these problems. Chapter 4 of Even Faster Web Sites describes six techniques for Loading Scripts Without Blocking. One of those, the Script DOM Element approach, is the technique used in the new Google Analytics async pattern. Google Analytics’ ga.js file is a perfect example of a script that should be loaded asynchronously – it doesn’t add any content to the page, so we want to load it without blocking the images and stylesheets that give users what they really came to see.
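For readers unfamiliar with it, the bare-bones Script DOM Element pattern looks roughly like this (the URL is a placeholder); the actual Google Analytics snippet shown below adds a few refinements:

    // Create a script element and append it to the document. The script downloads
    // in the background without blocking rendering or other downloads.
    var script = document.createElement('script');
    script.src = 'http://www.example.com/deferred.js';   // placeholder URL
    document.getElementsByTagName('head')[0].appendChild(script);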
What happens if a script takes a long time to load, or fails to load? Because scripts block rendering, users are left staring at an empty page. Google Analytics has an amazing infrastructure behind it, but any resource, especially from third parties, should be added cautiously. It’s great that the GA team is evangelizing a pattern that allows the web site to render while ga.js is being downloaded.
One workaround to the blocking problem is to move scripts to the bottom of the page. In fact, this is exactly what’s suggested in the old ga.js snippet. But this means users who leave a page quickly won’t generate any analytics data (they leave before the script at the bottom finishes loading). Moving to the async pattern and loading it at the bottom of the page’s head, as suggested, means more of these quick page views get measured. This is too good to believe – not only do you get a faster, more resilient page, but you actually get better insights into your traffic.
Just to be clear, ga.js will continue to work even if web site owners don’t make any changes. But, if you want a faster site, greater uptime, and more data, here’s what the new async snippet looks like:
    var _gaq = _gaq || [];
    _gaq.push(['_setAccount', 'UA-XXXXX-X']);
    _gaq.push(['_trackPageview']);

    (function() {
      var ga = document.createElement('script');
      ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
      ga.setAttribute('async', 'true');
      document.documentElement.firstChild.appendChild(ga);
    })();
It’s extremely cool to see this pattern being evangelized for such a major piece of the Internet. A few items of note:
- Obviously, you have to replace “UA-XXXXX-X” with your ID.
- Since ga.js is being loaded asynchronously, there has to be a way for web site owners to couple their desired GA functions with the code when it finishes loading. This is done by pushing commands onto the Google Analytics queue object, _gaq.
- Once all your callback commands are queued up, the ga.js script gets loaded. This is wrapped inside an anonymous function to avoid any namespace conflicts.
- Inside the anonymous function is where we see the Script DOM Element approach being used – with two nice improvements. A script element is created and its src is set to the appropriate ga.js URL. Looking ahead to support for asynchronous scripts in HTML5, the async attribute is set to 'true'. Very nice! The main benefit of this is it tells the browser that subsequent scripts can be executed immediately – they don’t have to wait for ga.js. The last line adds the script element to the DOM. This is what triggers the actual download of ga.js. In most of my code I do document.getElementsByTagName('head')[0].appendChild, but that fails if the document doesn’t have a head element. This is a more robust implementation.
It’s always hard to find the right spot on the complexibility curve. This async snippet hits it just right. It’s slightly more complex than the old pattern, but not by much. Besides the benefits highlighted here, this new pattern is able to support more advanced usage patterns, including pushing an array of commands and pushing functions.
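For example, besides command arrays, you can push a function onto _gaq and it will be executed once ga.js has loaded – a small sketch (the console line is just for illustration):

    // Command arrays queue up named Google Analytics methods...
    _gaq.push(['_setAccount', 'UA-XXXXX-X']);
    _gaq.push(['_trackPageview']);

    // ...and a pushed function runs once ga.js has finished loading, so anything
    // inside it can safely assume the library is ready.
    _gaq.push(function () {
      if (window.console) console.log('ga.js is loaded and the queue has drained');
    });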
The theme driving much of my work this year is fast by default. I want high performance to be baked into the major components of the Web, so things are just fast. Seeing Google Analytics adopt this high performance async pattern is a huge win. But the proof is in the pudding. If you switch over to the new async pattern, measure how it affects your page load times and the amount of data gathered, and add a comment below. My prediction: 200ms faster and 10% more data. What do you see?
Chrome dev tools
I just finished reading the latest post on the Chromium Blog: An Update for Google Chrome’s Developer Tools. dynaTrace’s Ajax Edition was impressive – just take a look at what John Resig had to say. I’m also impressed by what’s been added to WebKit’s Web Inspector and Chrome’s dev tools. You should definitely take them for a spin, but I’ll give you a preview here.
A key part of any tool’s success is the assurance that there’s support behind it in the way of documentation, tutorials, issue tracking, etc. This blog post links to the new Chrome dev tools site that has been put together, including several video tutorials. I spent most of my time walking through the full tutorial. To see these new tools, make sure to get on the Chrome dev channel. Finally, any issues can be seen and added via the Chromium issues page.
Once you’ve got the latest Chrome dev release, you can access these tools by clicking on the Page menu and selecting Developer -> Developer Tools. There are six panels to choose from. The Elements panel shows the DOM tree. A nice feature here is the ability to see the event listeners, including anonymous functions, attached to any element.
Much of my time analyzing web sites is spent looking at HTTP waterfall charts. This is captured in the Resources panel. Since this slows down web sites, it’s off by default; you can enable it permanently or just for the current session. Doing so reloads the current web page automatically so you can see the HTTP waterfall chart. The DOMContentLoaded and Onload events are shown (blue and red vertical lines respectively). This is incredibly helpful for developers who are tuning their page for faster performance, so they can confirm deferred actions are happening at the right time. The only other tool I know of that does this is Firebug’s Net panel.
JavaScript debugging has gotten some major enhancements including conditional breakpoints and watch expressions.
Developers can finally get insight into where their page’s load time is being spent by looking at the Timeline panel. In order to get timeline stats, you have to start and stop profiling by clicking on the round dot in the status bar at the bottom. The overall page load sequence is broken up into time segments spent on loading, JavaScript execution, and rendering.
The most differentiating features show up in the Profiles panel. Here, you can track CPU and memory (heap) stats. A couple other tools track CPU, but this is the only tool I’m aware of that tracks heap for each type of constructor.
Most of these features are part of WebKit’s Web Inspector. The new features added by the Chrome team are the timeline and heap panels. All of these improvements have arrived in the last month, and result in a tool that any web developer will find useful, especially for building even faster web sites.