Web performance for the future
I started working on web performance around 2003. My first major discovery was the Performance Golden Rule:
80-90% of the end-user response time is spent on the frontend. Start there.
Up until that point all of my web development experience had been on the backend – Apache, MySQL, Perl, Java, C & C++. When I saw how much time was being spent on the frontend, I knew my performance research had to focus there.
My first discussion about web performance was with Nate Koechley when we both worked at Yahoo!. (Now we’re both at Google!) I hadn’t met Nate before, but someone told me he was the person to talk to about clientside development. I don’t think YUI existed yet, but Nate and other future YUI team members were present, leading pockets of web development throughout the company.
God bless Nate and those other folks for helping me out. I was so ignorant. I was good at finding performance inefficiencies, but I hadn’t done much frontend development. They helped me translate those inefficiencies into best practices. The other thing was – this was still early days in terms of frontend development. In fact, when I was writing my first book I didn’t know what words to use to refer to my target reader. I asked Nate and he said “F2E – frontend engineer”.
Today it might seem funny to ask that question, but frontend engineering was a new discipline back then. This was before YUI, before Firebug, before jQuery – it was a long time ago! Back then, most companies asked their backend (Java & C) developers to take a swag at the frontend code. (Presumably you had a head start if you knew Java because JavaScript was probably pretty similar.)
Fast forward to today when most medium-to-large web companies have dedicated frontend engineers, and many have dedicated frontend engineering teams. (I just saw this at Chegg last week.) Frontend engineering has come a long way. It’s a recognized and respected discipline, acknowledged as critical by anyone with a meaningful presence on the Web.
I like to think that web performance has helped frontend engineering grow into the role that it has today. Quantifying and evangelizing how web performance is critical to creating a good user experience and improves business metrics focuses attention on the frontend. People who know the Web know that quality doesn’t stop when the bytes leave the web server. The code running in the browser has to be highly optimized. To accomplish this requires skilled engineers with established best practices, and the willingness and curiosity to adapt to a constantly changing platform. Thank goodness for frontend engineers!
This reminiscing is the result of my reflecting on the state of web performance and how it needs to grow. I’ve recently written and spoken about the overall state of the Web in terms of performance. While page load times have gotten faster overall, this is primarily due to faster connection speeds and faster browsers. The performance quality of sites seems to be getting worse: pages are heavier, fewer resources are cacheable, the size of the DOM is growing, etc.
How can we improve web performance going forward?
The state of web performance today reminds me of frontend engineering back in the early days. Most companies don’t have dedicated performance engineers, let alone performance teams. Instead, the job of improving performance is tacked on to existing teams. And because web performance spans frontend, backend, ops, and QA it’s not clear which team should ride herd. I shake my head every time a new performance best practice is found. There’s so much to know already, and the body of knowledge is growing.
Asking backend developers to do frontend engineering is a mistake. Frontend engineering is an established discipline. Similarly, asking frontend|backend|ops|QA engineers to take on performance engineering is a mistake. Performance engineering is its own discipline. The problem is, not many people have realized that yet. Our performance quality degrades as we ask teams to focus on performance “just for this quarter” or for “25% of your time”. Progress is made, and then erodes when attention focuses elsewhere. Best practices are adopted, but new best practices are missed when we cycle off performance.
What’s needed are dedicated web performance engineers and dedicated performance teams. Just like frontend engineering, these teams will start small – just one person at first. But everyone will quickly see how the benefits are bigger, are reached sooner, and don’t regress. The best practices will become more widely known. And the performance quality of the Web will steadily grow.
Vid Luther | 27-Aug-13 at 10:02 pm | Permalink |
Steve,
In your opinion, should these performance engineers and teams focus on page load time or on render time? In your other post about moving beyond window.onload, and your other talks about perception, you seem to be suggesting that the focus should shift to user perception rather than just mechanical measurements. So, what tools would these engineers need to develop or use that will help them make better decisions about the effectiveness of their work?
Charlie | 28-Aug-13 at 12:34 am | Permalink |
If performance engineers are additional people what will they actually do? I think of performance as something similar to testing – it should be in from the beginning and more or less part of everybody’s job, at least everyone should understand its importance. You might have specialists to help with integration and functional testing. Maybe it will be the same with performance.
Regarding performance as a discipline, I’m increasingly convinced of edge-style improvements such as mod_pagespeed where indeed specialists are required. This does not entirely remove the requirement for improvements in the source code but can prevent well-intentioned but ultimately misguided premature optimisations.
Resources like httparchive.org are great for exposing website performance in such a comparative way. After a relaunch of a client’s site earlier this year we’ve been looking at ways to gradually improve performance. Being able to benchmark the site against comparable ones is often crucial to get the permission to work on improvements on sites which don’t have a conversion model. Weirdly, however, recent improvements have led to a lower pagespeed score. Maybe it’s time to revisit the metrics behind it?
Drew | 28-Aug-13 at 1:23 am | Permalink |
I will also chime in to say that “performance as a discipline” seems contentious. Yes, it’s a cross-cutting concern. Yes, it matters. But if performance, then why not also security or accessibility or even things like interoperability (or standards conformance) or stability (what most businesses actually want from their engineering teams)? There are arbitrarily many “disciplines” we can create for one reason or another.
I strongly suspect that you (as usual) have things to say that would open my eyes a lot but even the definition of “performance” seems to be something contentious. What does that include? Page load? Round trip time? Customer satisfaction metrics? Are there also team metrics in a given iteration that should be used? Who owns all of that? The “performance engineer”?
The “best practices” thing seems like a canard but I don’t want to dwell on that.
Sorry about all of the quotes above. They honestly weren’t all meant as sarcasm quotes. Well . . . so some were. Sorry. I won’t take them back now. I love the work you do, Steve, and I’d like to see how you flesh this thing out.
Markus Leptien | 28-Aug-13 at 8:03 am | Permalink |
We (actually *I*) introduced a performance team within my company. The reason was that, after almost two years of cyclic performance sprints, it became clear to all of us that if we didn’t establish a “standing” organizational unit, we would burn money fixing, over and over again, things that had gotten worse during the times when performance was not our Number 1 Priority.
Why did we have to do this over and over again? Because websites of a decent size are never static over time. You get new agencies delivering content, new external contractors delivering code to your site, and new 3rd-party snippets being added to your sites. Then throw in relaunches, changed customer flows, a changed audience, etc.
The skillset is pretty different compared to what you would expect from frontend engineers or QA people. In our case it is a mixture of Frontend-Dev (analyzing root causes), Operations (maintaining RUM and WPT backends), Statistics (getting meaningful insights out of all that data) and “Business” (calculating business cases, discussing customer flows with the business side). So this team meets once a week with the Business Side (Sales & Service) and the Frontend Devs to discuss the insights of last week’s analysis and the focus areas / way forward for the next week.
To reply to some of the questions here:
We are focussing on Speed-Index, which means progressive rendering. We are also collecting all the other data that you can get out of WPT and NavTiming, but use these more in a sense of signals for issues/problems and correlations.
Security IS a dedicated unit within our company, because it is backed by a business case. Interop and Accessibility are not, because there is no business case to back a dedicated team for them. Simple as that.
Regarding edge-style optimization: I would love to have one, but currently I am not convinced that there is a solution that would fit into our architecture / (caching) infrastructure and give us the same level of optimization. It would need to optimize across multi-page customer flows rather than page by page, and it would still be blind to business-level discussions along the lines of “Do we need this 4th Analytics-Provider-Snippet on these pages?”.
Steve Souders | 28-Aug-13 at 8:42 am | Permalink |
Vid: What matters most is the user experience. That’s hard to measure – there’s no event handler for it. Measuring window.onload was a good proxy 5-10 years ago, but much less so today. So we need to evolve. I think we’ll see new metrics in three areas.
1. content-based metrics – When does the content appear in the page? Speed Index from WebPagetest is the best measure of this (as corroborated in Markus’ comment).
2. user-interaction metrics – When does the user engage with the page? It’s possible with today’s RUM solutions to measure when the user first moves their mouse, clicks, scrolls, etc. It’s likely that these times reflect the users’ perception of when the page is ready for many websites (but not all). I’m surprised we don’t see more of these metrics today.
3. business metrics – Ultimately the website is intended to support the business along some dimension (ecommerce, downloads, eyeballs, etc.). If a performance optimization improves these metrics, then that’s a good change.
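As a rough illustration of the content-based metric in item 1: Speed Index can be thought of as the area above the visual-progress curve, so a page that stays visually incomplete for longer gets a higher (worse) score. The sketch below assumes visual completeness has already been sampled as (time in ms, percent complete) pairs and that completeness holds steady between samples. The function name and sample data are hypothetical, and this is not WebPagetest's own code.

```python
def speed_index(samples):
    """Approximate Speed Index from visual-progress samples.

    samples: list of (time_ms, percent_complete) pairs, sorted by time.
    Rendering is assumed to start at t=0 with 0% complete, and the
    completeness is assumed to stay constant until the next sample
    (a step-function approximation of the progress curve).
    """
    total = 0.0
    prev_time, prev_complete = 0, 0.0
    for time_ms, percent_complete in samples:
        # Each interval contributes its length weighted by how much
        # of the page was still missing during that interval.
        total += (time_ms - prev_time) * (1.0 - prev_complete / 100.0)
        prev_time, prev_complete = time_ms, percent_complete
    return total


# Two hypothetical pages that both finish rendering at 2000 ms.
slow_start = [(1000, 50), (2000, 100)]   # half the content at 1 s
fast_start = [(1000, 80), (2000, 100)]   # most of the content at 1 s

print(speed_index(slow_start))  # 1500.0
print(round(speed_index(fast_start)))
```

Both pages would report the same window.onload-style completion time, yet the page that shows most of its content early gets the lower score, which is why a content-based metric can track perception more closely.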
Steve Souders | 28-Aug-13 at 9:13 am | Permalink |
Charlie: There are many things that are important when building & running a website. I definitely agree that to be a good web dev you need to understand how the whole stack works. Theo Schlossnagle addressed this topic well in his keynote at Velocity Berlin. That doesn’t mean that everyone on the team is responsible for every aspect of the website. We have teams that focus on specific areas. But, as Theo describes, there needs to be someone who is looking across the entire stack, someone who has a holistic perspective. Theo talks about the need for this perspective for the Ops Team to maximize availability and scalability. The Performance Team has this perspective to maximize website speed and deliver a fast user experience.
The tasks for the Performance Team include selling performance internally, establishing metrics and targets, integrating performance into the build process, tracking these metrics, diagnosing performance issues, adopting new performance best practices (like async widgets), creating new performance optimizations (specific to the website itself as well as more general patterns), and making performance part of the company culture (training, tech talks, code reviews, etc.). Markus’ comment provides a good case study of such a team.
Steve Souders | 28-Aug-13 at 9:25 am | Permalink |
Drew: You present an interesting hypothetical situation where every aspect of web development should result in a separate team. The reality is this isn’t practical, nor optimal. Markus touches on the business motivations behind where a company should focus. Other factors include the amount of work, specialized skillset, innovation, etc. For example, security and accessibility are separate teams in larger companies. The rest of the dev team still have to address security and accessibility as they code, but they can follow guidelines established by the core teams and rely on centralized services and technology for building, maintaining, and validating that code.
“Performance” is an overloaded term. (I get a lot of email about improving the click-through performance of ads and maximizing the performance of employees.) In terms of websites, I break it into two parts: efficiency and speed. “Efficiency” is addressed by the Ops side of Velocity chaired by John Allspaw. When I use the word “performance” I really mean speed from the user’s perspective. As mentioned in my response to Vid, we need to evolve our performance (speed) metrics to include content-based metrics, user interaction metrics, and business metrics.
Steve Souders | 28-Aug-13 at 9:40 am | Permalink |
Markus: Thanks for adding your story of a real-world Performance Team. You validate much of what I’m saying. It’s great to hear about the specific tasks the team is responsible for. The cross-discipline meetings are a good reminder for other performance engineers. I think “edge-style” optimizers (mod_PageSpeed and PageSpeed Service, CloudFlare, Akamai FEO, Radware, Yottaa, Riverbed, etc.) are a great solution for many optimizations (image optimization, fingerprinting URLs for caching, etc.). Optimizations that are specific to a single website require performance-minded engineers. Going forward, companies will need both.
Andy Still | 28-Aug-13 at 10:03 am | Permalink |
Steve, I think you have hit the nail on the head with your idea of “performance as a discipline” and I am frequently surprised at the lack of traction that this has gathered in the industry. Having said that, I may be slightly biased, as I set up a company several years ago aimed at delivering this as a service to businesses.
For me the issue is that performance is such a different skillset from standard development that it is unreasonable to expect any developer to just be able to pick it up when working on a system where performance is a real concern. Just as it would be unreasonable for a company to expect a developer to be able to understand the concepts necessary to ensure that a banking application is sufficiently secure – that would be the task of a dedicated security team. Security is something that you can’t just play at, it is too important and too high risk. For me performance is the same.
Performance involves entirely different ways of identifying problems, acceptance criteria, root cause analysis, environmental understanding, understanding behaviour under load, understanding environmental issues and relationships between devices and services etc.
We identify this as a separate skillset and maintain this as a separate discipline within the company. We embed this either as a member within the development scrum team or as a dedicated scrum team that just focuses on performance optimization of existing code.
Chris | 30-Aug-13 at 7:00 am | Permalink |
Great post. I currently fill this role at my company. My title is “Web UI Architect” but I’ve been focusing maybe 80% of my time on webperf for the last few months, working with DevOps, Web UI and QA teams. Setting up private WPT instances, introducing mod_pagespeed, optimizing asset pipeline w grunt and imageoptim… and educating leads across various teams incl outside engineering. Long way to go but making strides. (FYI ImageOptim shaved 32% of useless bytes from our image assets, highly recommended!)
Please keep doing what you do, helping move this discipline forward. Thanks!
Dean J | 31-Aug-13 at 5:16 pm | Permalink |
Measure the amount of time until the user sees useful content.
Require changes – backend, frontend, or wherever – that significantly slow that down to justify themselves, the same way you’d have to justify if you used significantly more RAM in a large environment.
Reward changes that significantly speed up that number; call out changes that make this better.
Instead of performance as a discipline, it’s performance as a requirement.
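One way to picture “performance as a requirement” is a simple gate that compares a change against a baseline. This is only a minimal sketch: the function name, the 5% threshold, and the timings are all hypothetical, and it assumes the time until useful content is already being measured before and after each change.

```python
def classify_change(baseline_ms, candidate_ms, threshold=0.05):
    """Classify a change by its effect on time-to-useful-content.

    baseline_ms:  measurement before the change (must be > 0)
    candidate_ms: measurement with the change applied
    threshold:    relative difference treated as significant
                  (0.05 = 5%, an arbitrary illustrative value)
    """
    if baseline_ms <= 0:
        raise ValueError("baseline_ms must be positive")
    relative_delta = (candidate_ms - baseline_ms) / baseline_ms
    if relative_delta > threshold:
        return "needs justification"   # significantly slower
    if relative_delta < -threshold:
        return "improvement"           # significantly faster, call it out
    return "neutral"                   # within the noise band


print(classify_change(2000, 2300))  # needs justification
print(classify_change(2000, 1700))  # improvement
print(classify_change(2000, 2050))  # neutral
```

The threshold exists so that ordinary measurement noise does not trigger a review; what counts as significant is a judgment call for each site.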
Alex Podelko | 04-Sep-13 at 7:37 am | Permalink |
I believe that the scope of performance engineering teams should be end-to-end performance – see my recent post Breaking Performance Silos http://www.speedawarenessmonth.com/breaking-performance-silos/
Old articles, such as Five Steps to Establish Software Performance Engineering in Your Organization by Connie Smith and Lloyd Williams http://www.perfeng.com/papers/establish5.pdf, still look pretty relevant to the topic.
Jason Buksh | 11-Nov-13 at 7:04 am | Permalink |
The problem with the current state of the market is that roles tend to be siloed. There is a ‘we’ve done our part’ over-the-fence mentality. In order for a Performance Engineering function to work correctly as you describe, it needs to be driven from senior management. Where I’ve seen this work best is when Key Performance Indicators are stated and visible. SLAs tend to be a mistake (!?). I actually prefer to have an SLA without a hard objective: “Let’s measure the site load time at the 95th percentile… [don’t insert metric objective]”. It tends to be embraced more easily and allows collective accountability when all can see it. When executive management see and take responsibility for KPIs, it allows a more collective response to be enacted, as this requires a strategic response, not a tactical one. The Perf Eng’ing function then needs to become more of a Perf Engineering Management function.
Jason Buksh
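For readers unfamiliar with percentile-based KPIs like the one mentioned in the comment above, here is a minimal sketch of computing the 95th percentile of page load times. It assumes the nearest-rank definition of a percentile (other definitions interpolate between values) and uses made-up load times; the function name is hypothetical.

```python
import math


def percentile(values, pct):
    """Return the pct-th percentile of values using the nearest-rank method.

    values: non-empty list of numbers (e.g. load times in ms)
    pct:    integer percentile between 1 and 100
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(values)
    # Nearest rank: the smallest value such that at least pct% of
    # the samples are less than or equal to it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Twenty hypothetical page loads: 100 ms, 200 ms, ..., 2000 ms.
load_times_ms = [100 * i for i in range(1, 21)]

print(percentile(load_times_ms, 50))  # 1000
print(percentile(load_times_ms, 95))  # 1900
```

Tracking the 95th percentile rather than the average keeps the slowest experiences visible, since an average can look healthy while one user in twenty waits far longer.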