Earth Notes: On Website Technicals (2017-11)

Updated 2024-04-18.
Tech updates: Googlebot warp space, image re-optimisation, even liter, defer, inlining, video.
This month had two apparently opposing goals: making pages lighter and faster, and adding video. Inlining images is a dilemma all by itself.

2017-11-28: Video

Two of the pages on the site now have working Twitter video 'player' cards. Both use YouTube-hosted videos, but one also has a local highly-compressed .mp4 as a test. They still have an og:image in place too, and that image is still used as the hero source.

In the footer of 'full' pages I have now embedded links to the hero image (etc) for each page, including any embedded Twitter player as a minimal HTML page in an inline data URL!

(It took about a day for Twitter to approve/whitelist the first player card; the next one just worked. I have never needed approval for image cards.)

2017-11-27: Less Defer

For desktops, which should have more network and CPU bandwidth, and whose final rendering is delayed until the share buttons appear at the top of the page (ie ATF), I have decided to switch the Share42 script back to async. But I will keep defer for mobile/lite to keep JS load priority low and to nominally defer parsing as long as possible even though there is no following HTML.
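A minimal sketch of that per-variant choice, assuming the pages are generated by a script that knows which variant it is building. The function name, the variant labels and the script path are hypothetical, not taken from the site's actual build.

```python
import html

def share_script_tag(variant: str, src: str = "/js/share42.js") -> str:
    """Return the share-button <script> tag for a site variant.

    'full' (desktop) pages get async so the buttons can appear sooner;
    any other variant (eg 'lite') gets defer to keep the script low priority.
    The default src is a hypothetical path, used only for illustration.
    """
    attr = "async" if variant == "full" else "defer"
    return f'<script {attr} src="{html.escape(src, quote=True)}"></script>'

print(share_script_tag("full"))
print(share_script_tag("lite"))
```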

Also I am changing the header link to the 'desktop' site to 'full'.

2017-11-26: Moar Inlining

I am now allowing very small images (no more than 750 bytes) to be inlined in places beyond mobile ATF hero images. This now allows a handful of desktop pages to inline their very-compressible simple PNG ATF hero images, and the home page(s) opportunistically also to inline the corresponding simple column hero image versions.

The 750-byte break point for non-ATF content is chosen for a number of reasons, mainly the assumed cost of the HTTP/1.0 headers (outbound and inbound) avoided by inlining, plus some other minor savings such as not needing an actual URL and not needing the width and height attributes. When pages get to be served over HTTP/2 that break point might bear adjusting a little. Note that for non-ATF content saving a round trip is generally unimportant.
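The break point could be applied along these lines. This is only a sketch: the 750-byte limit comes from the text, but the function name, parameters and tag layout are assumptions made for illustration.

```python
import base64
import html

# Break point for non-ATF images, as given in the text.
MAX_INLINE_BYTES = 750

def img_tag(data: bytes, url: str, width: int, height: int,
            alt: str = "", mime: str = "image/png") -> str:
    """Return an <img> tag, inlining the image when it is small enough."""
    alt_attr = html.escape(alt, quote=True)
    if len(data) <= MAX_INLINE_BYTES:
        # Inlined: no separate request, so no URL, width or height is emitted.
        b64 = base64.b64encode(data).decode("ascii")
        return f'<img alt="{alt_attr}" src="data:{mime};base64,{b64}">'
    # Too big to inline: reference by URL and reserve layout space.
    src = html.escape(url, quote=True)
    return f'<img alt="{alt_attr}" src="{src}" width="{width}" height="{height}">'
```

If the break point is revisited for HTTP/2, only MAX_INLINE_BYTES would need to change.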

2017-11-16: Image Inlining

For the mobile/lite site, to try to overcome a potentially-lengthy RTT (Round-Trip Time, ie latency) to fetch the ATF (Above The Fold) hero image (~300ms on 2G/3G), if the image (or its lo-fi variant as used for Save-Data) is small enough, then it will be inlined directly into the HTML as a base-64 data URL. (Some other very small hero images could also be inlined where that would probably save overall bandwidth.)

The 'lite' site is also intended for bandwidth-limited connections such as dial-up. On modern mobile, latency is the key constraint; on dial-up it is bandwidth. But the overall effects can be very similar!

This has the effect of preventing caching of the inlined images, and of getting less text in the initial inbound TCP packet(s), but the trade-off is hopefully a better overall UX.
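To give a feel for how much room an inlined image takes in the HTML, here is a small sketch that builds a base-64 data URL and measures the growth over the raw bytes. The helper names and the stand-in image bytes are invented for illustration. Base-64 turns every 3 bytes into 4 characters; gzip on the HTML typically recovers some of that expansion, which this sketch does not model.

```python
import base64

def data_url(data: bytes, mime: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base-64 data URL."""
    return f"data:{mime};base64," + base64.b64encode(data).decode("ascii")

def inlined_overhead(data: bytes, mime: str = "image/jpeg") -> int:
    """Extra characters added to the HTML compared with the raw image size."""
    return len(data_url(data, mime)) - len(data)

raw = bytes(range(256)) * 3  # 768 bytes of stand-in data, not a real image
print(len(raw), len(data_url(raw)), inlined_overhead(raw))
```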

With this hero inlining in place the entire mobile home-page ATF content is rendered from the HTML on the first round trip in ~1.7s according to WebPageTest's 'simple' test: Dulles, VA - Moto G4 - Chrome - 3GSlow. The (gzipped) home page HTML is still well within the initcwnd, at 8.4kB.
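As a rough check of the headroom, the sketch below compares the page size with the first flight of TCP data. The window of 10 segments (as in RFC 6928) and the 1460-byte MSS are assumptions, not values stated for this server, and response headers and any TLS overhead are ignored.

```python
# Assumed values, not from the article: 10-segment initial congestion
# window (as in RFC 6928) and a typical 1460-byte maximum segment size.
INITCWND_SEGMENTS = 10
MSS_BYTES = 1460

def fits_first_flight(response_bytes: int,
                      segments: int = INITCWND_SEGMENTS,
                      mss: int = MSS_BYTES) -> bool:
    """True if the response body fits in the initial congestion window."""
    return response_bytes <= segments * mss

home_page_bytes = 8400  # ~8.4kB gzipped HTML, from the text
print(fits_first_flight(home_page_bytes))
print(INITCWND_SEGMENTS * MSS_BYTES - home_page_bytes, "bytes spare")
```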

With a 'simple' test on 3G 'fast' (~150ms, ~200ms with trans-Atlantic RTT), Dulles, VA - Moto G4 - Chrome - 3GFast, I can just about hit the 1s target on the front page. On other lighter pages (eg with no hero image at all) I am more nearly hitting 1s. (Disabling JavaScript so that parsing it does not compete for CPU has WebPageTest claiming visually complete faster than actually possible in ~300ms, but the charts suggest ~750ms! Even the home page seems to come in under 1s ATF without JavaScript.)

Interestingly the sizes of the mobile/lite and main gzipped pages are now fairly similar, the inlined image filling the gap taken by minimising the HTML boilerplate.

2017-11-14: Defer

Given the advice in "Prefer DEFER Over ASYNC" that "DEFER scripts don't execute until the HTML document is done being parsed", I am testing switching the already-at-the-end Share42 script from async to defer to possibly marginally improve behaviour on mobiles in particular (CPU starved, share icons not shown ATF).

A preliminary WebPageTest suggests at least that the loading of the script stays late (low priority), and that there is no obvious harm from doing this.

Note that defer seems to be slightly less widely supported than async, eg Opera was reported as not supporting it.

2017-11-12: Even Liter

As an experiment I am dropping most of the ads from the mobile/lite site as the nominal revenue (loss) is tiny but the effect on page weight is huge! Partly inspired by "Banner Ads Considered Harmful (Here)".

2017-11-11: Image Re-optimisation

Always poking the hornets' nest out of sheer devilment, I went and took another run at PageSpeed Insights.

By folding in some of Google's specific ImageMagick convert suggestions from "Optimize Images", in particular -sampling-factor 4:2:0 for JPEGs, I was able to hit 100/100 for the mobile home page in a mobile browser, with all other site/browser combinations in the high 90s.
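A sketch of how such a re-optimisation pass might be scripted. Only -sampling-factor 4:2:0 is taken from the text; the -strip and -quality options, the default quality of 85 and the function names are assumptions for illustration, and the real command line used for this site may well differ.

```python
import subprocess
from typing import List

def convert_jpeg_command(src: str, dst: str, quality: int = 85) -> List[str]:
    """Build an ImageMagick convert command line for a JPEG.

    -sampling-factor 4:2:0 is the option named in the text; -strip and
    -quality (and the default of 85) are assumed extras.
    """
    return [
        "convert", src,
        "-sampling-factor", "4:2:0",  # chroma subsampling
        "-strip",                     # drop metadata such as EXIF
        "-quality", str(quality),
        dst,
    ]

def reoptimise(src: str, dst: str, quality: int = 85) -> None:
    """Run the conversion; needs ImageMagick's convert on the PATH."""
    subprocess.run(convert_jpeg_command(src, dst, quality), check=True)

print(" ".join(convert_jpeg_command("in.jpg", "out.jpg")))
```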

2017-11-05: Firework Routing

Googlebot owns an interesting warp in the fabric of space-time!

See GSC Crawl Stats up to a couple of days ago. (All the record highs are in that snapshot, plus the record low download time.) Up until the last day the usual rule of thumb of ~200ms + ~1ms/kB was holding true, but for the very last point the download time drops ~50ms below the usual baseline. That was due to nothing at my end that I know of, and usually only happens with a very small number of fetches in a day.
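The rule of thumb can be written as a one-line estimate. The constants are the approximate figures from the text; the function name and the 10kB example page size are made up for illustration.

```python
def expected_download_ms(size_kb: float,
                         base_ms: float = 200.0,
                         per_kb_ms: float = 1.0) -> float:
    """Rule-of-thumb download time: ~200ms baseline plus ~1ms per kB."""
    return base_ms + per_kb_ms * size_kb

usual = expected_download_ms(10)  # hypothetical 10kB page
print(usual)       # usual estimate for that page
print(usual - 50)  # roughly what the anomalous last point would look like
```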

I looked through my logs for signs of being crawled locally (eg from within the EU), which could knock down the RTT significantly (vs ~150ms London to California).

Other than a few obvious fakes all the (non-Images) Googlebot IPs appear to be registered in Mountain View, but a couple of traceroutes are interesting:

% traceroute 66.249.76.95
traceroute to 66.249.76.95 (66.249.76.95), 30 hops max, 60 byte packets
 1  192.168.0.254 (192.168.0.254)  0.766 ms  0.997 ms  1.160 ms
...
18  crawl-66-249-76-95.googlebot.com (66.249.76.95)  20.784 ms  20.212 ms *
% traceroute 66.249.70.31
traceroute to 66.249.70.31 (66.249.70.31), 30 hops max, 60 byte packets
 1  192.168.0.254 (192.168.0.254)  0.825 ms  1.116 ms  1.407 ms
...
19  crawl-66-249-70-31.googlebot.com (66.249.70.31)  103.652 ms  103.580 ms  95.009 ms

Note that it can take ~10--20ms RTT across my local FTTC link to my ISP's infrastructure. StatusCake's London server sees a minimum of ~27ms and a mean of ~70ms to pull down the mobile home page.

One Googlebot is 80ms RTT closer than the other, and both seem closer than Mountain View (for normal traffic)!

(Just after writing the above, the next day's data became visible, with more normal download time but still fairly vigorous spidering.)