If you’ve been following the chatter in the SEO blogosphere, you are probably aware of the recent study that tested the impact of page speed metrics on SEO performance. The study was first published by Zoompf’s CEO Mark Isham on the Moz community blog. Its goal was to demystify Google’s definition of “page speed” by pinpointing the specific factors that impact SEO performance.
To accomplish this, the researchers examined 100,000 webpages and 2,000 search queries, looking for a correlation between Google rankings and more than 40 performance-related metrics. They came back with a somewhat surprising conclusion: none of the front-end metrics correlated with Google rankings. In fact, the only performance factor that seemed to matter was the page’s Time to First Byte (TTFB):
This isn’t the first time the benefits of TTFB have come up. Before this report, the value of TTFB had already been noted by several industry experts, including Ilya Grigorik, a developer advocate on Google’s ‘Make the Web Fast’ team, who wrote:
So what does this all mean for CDN users? Mark Isham’s study concludes with this suggestion:
Unfortunately, this isn’t the correct answer, or at least not the whole one. The fact is, 9 out of 10 CDNs will actually contribute to the delay rather than solve it. To understand why this is true and how (if at all) CDNs can be used to improve your TTFB, I want to introduce you to Time to First Byte’s biggest delay factor - the processing time of dynamic HTML.
How Dynamic HTML Affects TTFB
Simply put, TTFB measures the time it takes for the first response to arrive from the server. And so, to understand what impacts this response, one should ask: What resources will my server send first?
Obviously the response header is first in line. However, as Ilya points out, receiving it means nothing in terms of browser parsing - making it nothing more than a “show-off” statistic.
Looking beyond that, the first meaningful response is always the webpage’s HTML component, whose rendering time is composed of:
- Processing (Waiting) time – The time it takes to generate the dynamic HTML on the webserver.
- Network (Receiving) time – The time it takes for the compiled HTML to reach the browser.
The network time for HTML is inconsequential, as these are usually very lightweight resources. However, the processing time for dynamically generated HTML can be extremely high - to the point that it can significantly impact TTFB as well as overall performance.
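The split between waiting and receiving time can be illustrated with a small calculation. The following is a minimal Python sketch, not a measurement tool: the `PageTiming` class and its sample numbers are hypothetical, and the field names are only loosely modeled on the `requestStart`, `responseStart` and `responseEnd` timestamps that browsers expose through the Navigation Timing API.

```python
from dataclasses import dataclass


@dataclass
class PageTiming:
    """Timestamps for one HTML request, in milliseconds (illustrative values only)."""
    request_start_ms: int    # browser sends the request
    response_start_ms: int   # first byte of the response arrives
    response_end_ms: int     # last byte of the HTML arrives

    @property
    def waiting_ms(self) -> int:
        # Mostly server-side processing, plus one network round trip
        return self.response_start_ms - self.request_start_ms

    @property
    def receiving_ms(self) -> int:
        # Time spent transferring the already generated HTML
        return self.response_end_ms - self.response_start_ms

    @property
    def waiting_share(self) -> float:
        total = self.waiting_ms + self.receiving_ms
        return self.waiting_ms / total if total else 0.0


# Assumed example: 480 ms until the first byte, 20 ms to download the HTML
timing = PageTiming(request_start_ms=0, response_start_ms=480, response_end_ms=500)
print(timing.waiting_ms, timing.receiving_ms, round(timing.waiting_share, 2))
# prints: 480 20 0.96
```

With numbers like these, shaving the transfer time barely moves the total, while removing the processing step would eliminate most of the wait.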
This actually makes a lot of sense. With today’s widely accessible high-speed networking infrastructure, it isn’t the transfer time that slows down our browsers but rather the compute time spent generating the content. In other words, it is the time it takes for the webserver to compile the HTML before it's even ready to be sent.
From a user’s point-of-view this is the “empty white screen” delay - the single event that impacts user experience more than any other performance related factor.
It’s no wonder that Google used this for its UX-focused algorithms. TTFB is just another term for “How fast can you have your HTML ready for rendering?” In an era of speed-of-light network communication, there is no better place to look for bottlenecks in your content delivery.
So How Can CDNs Improve SEO?
Dynamically generated content and CDNs were never the best of friends. After all, with processing time being 99% of the overall “problem”, CDN-based proxy delivery can’t really provide an effective solution.
In fact, bare-bones CDNs, which rely heavily on caching header directives, are likely to cause more harm than good, as they actually increase the overall load time by adding an extra hop between the origin server and the browser.
Various compression methods, which are now being introduced by modern CDNs, are also ineffective. While these can further reduce networking time by delivering a tightly packaged version of your content, they too will have no impact on the time it takes to generate the HTML.
By far, the best way a CDN can be used to improve delivery of dynamic HTML is by intelligently caching it, thus removing the need for processing on the origin webserver.
Think about it - if you are using any kind of database-driven CMS, your homepage is dynamically regenerated time and time again for each user. The same goes for all of your blog posts, your contact pages, your “About us” page and so on. Yet how many of these are actually updated, and how often? Wouldn’t it be much easier - and much more TTFB-friendly - to classify the HTML components of these pages as static and have them delivered directly from the CDN, with no processing and from the nearest possible location?
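To make the idea concrete, here is a minimal Python sketch of a cache that sits in front of a page generator. The `EdgeCache` class and the `render_page` function are hypothetical stand-ins for a CDN node and a CMS; they only illustrate that a cached copy lets repeat requests skip the origin's processing step.

```python
class EdgeCache:
    """Keeps rendered HTML so that repeat requests skip origin processing."""

    def __init__(self, render_at_origin):
        self._render = render_at_origin  # callable: url -> HTML string
        self._store = {}
        self.origin_hits = 0             # how many times the origin had to render

    def get(self, url):
        if url not in self._store:
            # Cache miss: pay the processing cost once and keep the result
            self.origin_hits += 1
            self._store[url] = self._render(url)
        return self._store[url]


def render_page(url):
    # Stand-in for database queries and templating on the origin server
    return f"<html><body>Content for {url}</body></html>"


cache = EdgeCache(render_page)
first = cache.get("/about-us")
second = cache.get("/about-us")
print(cache.origin_hits)  # prints: 1
```

The second request is answered from memory, so its waiting time no longer depends on how long the origin takes to build the page.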
This is exactly what we are doing with Incapsula’s Advanced Caching feature, which uses proprietary learning algorithms to cache the "typically uncacheable" dynamically rendered content.
By monitoring usage patterns and changes to the dynamically generated content, Incapsula’s algorithms automatically recognize under-utilized caching opportunities, including “stale” dynamically generated HTML. Once such an object is identified, its cached copy is retained on our caching network. As a result, even if dynamic in nature, the content will now be served with zero processing time, directly from the nearest data center.
Needless to say, this approach significantly improves TTFB, as well as other speed and bandwidth related metrics. Its only downside - a possible impact on the content's “freshness” - is resolved by a continuous validation algorithm that works behind the scenes, repeatedly comparing cached copies to their original versions to ensure freshness.
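Incapsula's learning algorithms are proprietary and are not described here, so the following Python sketch is only a simplified illustration of the general idea: watch the responses for a URL and treat it as a caching candidate once its HTML has stayed identical across several observations. The `StabilityTracker` class and its `required_unchanged` threshold are assumptions made for this example.

```python
import hashlib


class StabilityTracker:
    """Flags URLs whose generated HTML has not changed for several observations."""

    def __init__(self, required_unchanged=3):
        self.required_unchanged = required_unchanged  # assumed threshold
        self._last_digest = {}
        self._unchanged_runs = {}

    def observe(self, url, html):
        digest = hashlib.sha256(html.encode("utf-8")).hexdigest()
        if self._last_digest.get(url) == digest:
            # Same content as last time: extend the streak
            self._unchanged_runs[url] = self._unchanged_runs.get(url, 0) + 1
        else:
            # New or changed content: restart the streak
            self._unchanged_runs[url] = 0
            self._last_digest[url] = digest

    def is_cacheable(self, url):
        return self._unchanged_runs.get(url, 0) >= self.required_unchanged


tracker = StabilityTracker()
for _ in range(4):
    tracker.observe("/about-us", "<html>About us</html>")
for item_count in range(4):
    tracker.observe("/cart", f"<html>{item_count} items</html>")

print(tracker.is_cacheable("/about-us"), tracker.is_cacheable("/cart"))
# prints: True False
```

A page that looks dynamic to the server but never actually changes, such as an "About us" page, passes the check, while a page whose output differs on every request does not.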
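A validation pass of this kind can be sketched in a few lines of Python. This is an assumption-laden illustration rather than Incapsula's implementation: the `validate` function, the dictionary-based cache and the `fetch_from_origin` callable are all hypothetical, and a production system would more likely compare lightweight validators such as ETags than full page bodies.

```python
def validate(cache, fetch_from_origin):
    """Refresh cached entries whose origin copy has changed.

    cache: dict mapping url -> cached HTML
    fetch_from_origin: callable returning the current HTML for a url
    Returns the list of URLs that were refreshed.
    """
    refreshed = []
    for url, cached_html in list(cache.items()):
        current_html = fetch_from_origin(url)
        if current_html != cached_html:
            cache[url] = current_html  # replace the stale copy
            refreshed.append(url)
    return refreshed


# Assumed example data: the homepage changed at the origin, the about page did not
origin = {"/": "<html>New post</html>", "/about-us": "<html>About us</html>"}
cache = {"/": "<html>Old post</html>", "/about-us": "<html>About us</html>"}

print(validate(cache, origin.__getitem__))  # prints: ['/']
```

Running such a pass on a schedule keeps cached pages close to their originals without putting the check on the visitor's request path.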
Using Incapsula CDN to Manage TTFB and Freshness
Until recently Incapsula's process of dynamic content caching was run on auto-pilot. However, our newly introduced CDN settings enable users to fine-tune their caching policies, providing some added benefits which are definitely worth mentioning:
- Purge Cache – This newly added ability to manually purge all cache objects complements our automated validation mechanisms by removing dependency on our pre-determined validation cycles. With a click of a button your cached copy can be removed, assuring instant freshness for all newly updated content – dynamically generated or otherwise.
- Cache Everything – This is a more aggressive version of our learning-based algorithms. Turning this on will cache all available objects, regardless of their type or function. The obvious downside is a significant impact on freshness. Still, if you are running a static blog, and can develop a routine of manual cache purging after each update, this may be the option for you.
- Always Cache (by resource) – A more selective version of the above-mentioned “Cache Everything” mode, this is a great tool for manual fine-tuning of your caching directives. If you only want to cache the HTML component of a specific page to improve TTFB without affecting the rest of the objects on page, this is a setting you should use.
- Async Validation – This option controls our automated validation cycles. Enabling it ensures that all users are always served from cache, improving TTFB across the board. However, once per caching period, a single user may be served a “stale” copy, as the cache will update asynchronously during their visit.
- Override "Vary:User-Agent" headers – "Vary:User-Agent" headers will prevent caching. However, these headers are often misused, appearing on pages where they serve no real purpose. Enabling this option will allow Incapsula to override these directives, improving your TTFB, user-experience and search engine rankings.
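To see why a "Vary:User-Agent" header works against caching, consider how a cache might build its lookup key. The Python sketch below is a simplified illustration; the `cache_key` function and its `ignore_user_agent` flag are hypothetical and do not represent Incapsula's actual configuration interface.

```python
def cache_key(url, request_headers, vary_header, ignore_user_agent=False):
    """Build a cache key from the URL plus every request header named in Vary."""
    varied = [name.strip().lower() for name in vary_header.split(",") if name.strip()]
    if ignore_user_agent:
        # Assumed override: drop User-Agent from the headers that split the cache
        varied = [name for name in varied if name != "user-agent"]
    lowered = {name.lower(): value for name, value in request_headers.items()}
    return (url,) + tuple((name, lowered.get(name, "")) for name in sorted(varied))


# Two visitors with different (made-up) browser strings request the same page
visitor_a = {"User-Agent": "ExampleBrowser/1.0"}
visitor_b = {"User-Agent": "OtherBrowser/2.0"}

same_without_override = (
    cache_key("/", visitor_a, "User-Agent") == cache_key("/", visitor_b, "User-Agent")
)
same_with_override = (
    cache_key("/", visitor_a, "User-Agent", ignore_user_agent=True)
    == cache_key("/", visitor_b, "User-Agent", ignore_user_agent=True)
)
print(same_without_override, same_with_override)  # prints: False True
```

Because browser strings differ between visitors, honoring the header means nearly every visitor gets a separate cache entry; ignoring it lets them share one copy, which is only safe when the page really is identical for all browsers.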