April 12, 2017 8:01 am | Comments Off on HTTP Archive New Leadership
I announced the HTTP Archive six years ago. Six years ago! It has exceeded my expectations and its value continues to grow. In order to expand the vision, I’ve asked Ilya Grigorik, Rick Viscomi, and Pat Meenan to take over leadership of the project.
The HTTP Archive is part of the Internet Archive. The code and data are open source. The project is funded by our generous sponsors: Google, Mozilla, New Relic, O’Reilly Media, Etsy, dynaTrace, Instart Logic, Catchpoint Systems, Fastly, SOASTA mPulse, and Hosting Facts.
From the beginning, Pat and WebPageTest made the HTTP Archive possible. Ilya and Rick will join Pat to make the HTTP Archive even better. A few of the current items on the agenda:
- Enrich the collected data during the crawl: detect JavaScript libraries in use on the page, integrate and capture Lighthouse audits, feature counters, and so on.
- Build new analysis pipelines to extract more information from the past crawls
- Provide better visualizations and ways to explore the gathered data
- Improve code health and overall operation of the full pipeline
- … and lots more – please chime in with your suggestions!
Since its inception, the HTTP Archive has become the go-to source for objective, documented data about how the Web is built. Thanks to Ilya, that data was brought to BigQuery so the community can run its own queries and follow-on research. It’s a joy to see the data and graphs from the HTTP Archive used on a daily basis in tech articles, blog posts, tweets, etc.
I’m excited about this next phase for the HTTP Archive. Thank you to everyone who helped get the HTTP Archive to where it is today. (Especially Stephen Hay for our awesome logo!) Now let’s make the HTTP Archive even better!
November 9, 2016 5:45 am | 5 Comments
Velocity Origin Story
In 2007, my first book, High Performance Web Sites, was selling very well. That, plus the launch of YSlow, brought more attention to web performance. As a result, Jon Jenkins invited me to give a tech talk at Amazon. Afterward, he, John Rauser, and I, plus a few other performance-minded folk, had a […]
May 23, 2016 9:08 am | 8 Comments
The HTTP Archive crawls the world’s top URLs twice each month and records detailed information like the number of HTTP requests, the most popular image formats, and the use of gzip compression. In addition to aggregate stats, the HTTP Archive has the same data for individual websites plus images and video of the site loading. […]
February 10, 2016 7:37 pm | 6 Comments
A big change in the World of Performance for 2015 [this post is being cross-posted from the 2015 Performance Calendar] is the shift to metrics that do a better job of measuring the user experience. The performance industry grew up focusing on page load time, but teams with more advanced websites have started replacing PLT […]