Earth Notes: On Website Technicals (2017/10)

Tech updates: rounded corners, mobile usability, HTTP/2 vs mobile, bad bot, uncss tweaks, latency, unit tests...

2017/10/15: Latency

Although I am able to serve precompressed HTML pages locally to my laptop in as little as 4ms or so, even with a simple tool such as curl, I'm occasionally getting huge spikes up to 200ms+. Eeek.

% sh script/
% sh script/
% sh script/
% sh script/
% sh script/
% sh script/
% sh script/
% sh script/

I don't believe that this is anything particularly to do with my Apache configuration now, since (when I am testing) there should be threads free and waiting.

I have installed NGINX to have a little play, and to understand how it works. It is reportedly faster and more memory efficient than Apache.

One aspect that I noticed while looking at the configuration is that NGINX allows log files to be written buffered (and even gzipped) to reduce filesystem activity.

I suspect that my latency spikes arise from contending with other filesystem traffic, so I could aim (for example) to cache key critical path HTML content in memory with Apache and/or tweak how it writes logs.

If log file activity is any part of the problem then I can enable server-wide BufferedLogs for Apache, though for 2.2 it is still 'experimental' and would (for example) prevent me from viewing activity in real time.
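A minimal sketch of what that would look like in the global Apache config (BufferedLogs is a mod_log_config directive; setting it On here is my assumption about how it would be enabled, not something I have committed to yet):

```apache
# Buffer log writes in memory and flush them in larger chunks,
# reducing filesystem activity; still 'experimental' in Apache 2.2,
# and it delays log entries appearing, so no real-time tailing.
BufferedLogs On
```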

If contending with other filesystem traffic is an issue then there is mod_mem_cache, which could be set up to hold (say) up to 100kB of objects of up to 14600 bytes each (ie that could be sent in the initcwnd). Some limitations are that this cache is per worker process (so may be held in multiple copies and continually discarded, thus with a low hit rate too), it shadows and likely duplicates any filesystem cacheing done by the OS, and there is no obvious way to restrict this cache to just the key HTML pages, let alone just the precompressed ones. For the mobile site (this directive appears to be per-site) almost all objects are reasonable candidates for caching, however.

To enable cacheing generally (though the default setup parameters are system-wide and not really appropriate):

a2enmod cache
a2enmod mem_cache
/etc/init.d/apache2 restart

With that global setting there is no visible improvement, nor even with a specific config block for the mobile site instead:

# Set up a small in-memory cache for objects that will fit initcwnd.
<IfModule mod_mem_cache.c>
CacheEnable mem /
MCacheMaxObjectSize 14600
MCacheSize 100
</IfModule>

So I have disabled mem_cache again for now.

The latency seems much more consistent (and at ~14ms) when tested locally (on the server), possibly because this way the CPU will likely have been spun up to max by running the script before Apache is invoked, or possibly because my router (and WiFi) introduces jitter into the laptop tests.

% sh script/
% sh script/
% sh script/
% sh script/
% sh script/
% sh script/
% sh script/
% sh script/

Unit Tests

I've introduced cleancss to further minify CSS after the uncss (etc) step, to trim some remaining redundancy, from newlines to adjacent residual rules for the same selector. I have thus squeezed out a few more bytes from the header.

However, there is quite a narrow path to tread: minify further, but do not remove the few pieces that need to remain, such as fallbacks for old browsers. To this end I have wrapped cleancss in a script with the right properties, and created a makefile of unit testcases to ensure that the minification does strike the required balance.

Tests for other delicate/fragile functionality can be added to the new makefile in due course.
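One such guard test might look like the following sketch (the file name and the PASS convention are hypothetical; the real makefile target would run the cleancss wrapper script, so a pre-minified sample stands in for its output here):

```shell
# A fallback declaration for old browsers must survive minification:
# here a plain colour before a var() that old browsers cannot parse.
printf 'a{color:#00f;color:var(--link)}' > /tmp/min-sample.css
# The test fails (non-zero exit) if minification stripped the fallback.
grep -q 'color:#00f' /tmp/min-sample.css && echo PASS
```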


Amongst all this fun I made some minor design changes, such as having full-width hero banners narrower than a full container width (ie up to desktop 800px width) stretch to the container width to look consistent with other elements. This includes all mobile hero banners, so I bumped up the ceiling size of auto-generated mobile images from 5kB to 7.5kB to look better when they are stretched. (And 2.5kB on an image already being streamed, and still within the initcwnd, is not a big deal.)

Column/carousel headers and header floats are now selected for a nearer 1:1 aspect ratio, and for not being hugely over-width, for a more consistent appearance and to avoid inefficient use of bandwidth.

2017/10/13: uncss Tweaks

Discussing "With -n avoid injecting blank lines (and unnecessary newlines)" with the uncss maintainers, I have resolved to avoid leaving any trailing newline(s) in my minified CSS files (easy now that I have cleancss available on the command line, so no more manual cut-n-paste), and they may be willing to tweak uncss not to inject a newline after processing each CSS fragment when -n is used.

The first of those (my bit) means that I have been able to remove an extra pipeline process that I was using to remove blank lines.
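For the record, the trailing-newline stripping can be done with plain shell command substitution, which discards trailing newlines (file names here are placeholders, not my actual build paths):

```shell
# Minified CSS with unwanted trailing newlines, standing in for real output.
printf 'body{margin:0}\n\n' > /tmp/min.css
# $(...) strips all trailing newlines; write the result back out byte-exact.
printf '%s' "$(cat /tmp/min.css)" > /tmp/min-stripped.css
# The 16-byte input shrinks to 14 bytes with no trailing newline.
wc -c < /tmp/min-stripped.css
```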

(The uncss maintainers want to avoid scope creep and starting to actively strip out whitespace, which is reasonable.)

The second will save a few more bytes in the header (and thus the critical rendering path) without extra minification work, if it happens.

2017/10/10: Bad Bot

I wasn't quite believing my logs-derived stats, and in particular they claimed that I was seeing much less mobile traffic from real humans than (say) Google claims to send my way from searches alone.

There's one inexplicable type of activity where a varying IP address, usually somewhere in China though occasionally elsewhere in Asia, downloads the same legitimate page over and over at a moderate rate. The User-Agent is not recognisably a bot, ie it looks like a human's browser. The stats' computed mean page download size seemed suspiciously close to that page's size (with gzip Content-Encoding), as if it might be dominating the calculation.

I wrote a simple filter in the stats to ignore most of the effect of that traffic, and count it as 'bot' traffic, and my bot traffic percentage jumped from about 50% to about 75%, ie that one bot is responsible for 25% of all my traffic! Bah! What a waste. And presumably trying to break in. I may be able to add other defences in due course.

Now mean www and m page transfer sizes (including HTTP header overhead) show as 10745 and 13913 bytes respectively, both within the TCP initcwnd, so able to be delivered to the browser in the first volley without waiting for an ACK, which is good.
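As a quick sanity check on those numbers (assuming the now-common IW10 initial window of 10 segments, and a 1460-byte MSS):

```shell
# 10 segments * 1460-byte MSS = bytes deliverable in the first volley.
echo $((10 * 1460))    # prints 14600
# Both measured mean transfer sizes fit under that ceiling.
[ 10745 -le 14600 ] && [ 13913 -le 14600 ] && echo 'within initcwnd'
```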

Also, mobile page downloads now seem to be about 13% of www, which is lower than I'd like, but closer to what other data sources suggest.

2017/10/09: Mobile vs HTTP/2 Dilemma

It's going to be a while before I can easily update this site to HTTP/2, since basically it requires a complete OS upgrade to support the underlying TLS that HTTP/2 in practice requires.

But articles such as "Mobile Networks May Put HTTP2 On Hold" and "HTTP/2 Performance in Cellular Networks [PDF]" strongly suggest that HTTP/2's placing of all eggs in one TCP connection basket, coupled with the lossy nature of cellular/mobile, the loss of HTTP/1.1's multiple connections' initial connection windows (initcwnd), and maybe how TCP deals with loss (still largely assuming it to indicate congestion), makes it perform worse than venerable HTTP/1.1 for mobile (especially unencrypted). TLS adds lots of latency that HTTP/2 doesn't fully undo, and that's what really matters for mobile performance it seems: "More Bandwidth Doesn’t Matter (much)".

My current thinking: when I introduce HTTPS I will make it the canonical form for desktop, but have plain old plain text HTTP for mobile as the preferred alternate form. (I'll support both for both, without redirects or pinning, to let users choose; search engines can exercise some policy too.)
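In page-head terms that might be sketched roughly as follows (URLs are placeholders, and the media query breakpoint is an assumption, not my real markup):

```html
<!-- On the mobile (plain-HTTP) page: point at the HTTPS desktop
     page as canonical, so search engines treat it as primary. -->
<link rel="canonical" href="https://www.example.org/page.html">
<!-- On the desktop page: advertise the mobile page as an alternate. -->
<link rel="alternate" media="only screen and (max-width: 640px)"
      href="http://m.example.org/page.html">
```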

In the future, improved TCP congestion and loss management algorithms such as "BBR, the new kid on the TCP block" may help ease the difference in favour of HTTP/2.

2017/10/08: Mobile Usability

I just had a couple more pages show up in the GSC "Mobile Usability" report, dated 2017/10/09 (well, "10/09/17" in what I consider a particularly perverse display format that I am unable to adjust) even though I pre-emptively 'fixed' them a week or more ago and verified them to be 'mobile friendly' with Google's tool then and again now. Very odd that it should take 4 weeks to propagate a stale notification to the GSC UI.

Incidentally, on the mobile site, there seems to be an effective floor of 200ms on "Time spent downloading a page" in GSC with occasional dips below that possibly from G then crawling from (say) Europe rather than the US west coast. Very low download times only seem to be reported when there are very few items downloaded that day. (The average is currently running at 239ms, minimum 86ms.)

I'm now marking the (already-end-of-body) Share42 JavaScript async to try to lower its priority (from medium to low in Chrome) and to ensure that nothing at all blocks waiting for it. With async the script does seem to get loaded later, and some images sooner, but nothing totally clear, and I fear that this may break something subtle for some browsers, eg by introducing a timing race.
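The change itself is tiny (the script path here is a placeholder for the real Share42 file):

```html
<!-- End-of-body share-buttons script, now async: fetched at low
     priority and executed whenever ready, blocking nothing. -->
<script src="/js/share42.js" async></script>
```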

I've also significantly pushed up the number of threads that Apache (worker) has waiting to service requests (MinSpareThreads, MaxSpareThreads, plus MaxClients), for when browsers open many concurrent connections, possibly amplified by sharding. This seems to deal with cases where one or two connections were blocked for hundreds of milliseconds, delaying final rendering. For some reason this effect seemed more pronounced on slow/mobile connections, which had been puzzling me. I suspect that although the connections were quickly accepted by Apache, it was not able to send content on some of them until others had completed. For example for Pingdom, testing from Australia, the site moves from the top half of all sites to the top 20%. This may also kill some of the spikes in download time seen by GSC. There is more to life than just TTFB to optimise for! I will have to keep an eye on memory use on my little RPi though.
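The shape of the change, as a sketch (the numbers here are illustrative assumptions, not my actual tuned values, which depend on the RPi's memory headroom):

```apache
# Worker MPM: keep more spare threads warm so bursts of concurrent
# (possibly sharded) browser connections get served immediately,
# rather than queueing behind connections already in progress.
<IfModule mpm_worker_module>
    MinSpareThreads   32
    MaxSpareThreads   96
    MaxClients       128
</IfModule>
```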

2017/10/01: Rounded Corners

Time to visit the dark side again. Long after the fashion has been and gone, no doubt, I feel that it is time to knock the sharp corners off the hero images in the column display on the front page at least. (Indeed, after a bit of fiddling, I've applied this as a house image style to 'soften' all floats too.)

So I have applied a small amount of border-radius magic, which involved a new CSS class in the base CSS support, and some tweaking of the wrapper scripts. Yes, it's a little smoother, at very little increased page weight, maybe ~10s of bytes after compression, given the removal of most unused CSS by static analysis per-page.
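The new class is essentially a one-liner (the class name and radius here are illustrative, not my actual values):

```css
/* House style: soften the corners of hero images and floats. */
.rounded { border-radius: 8px; }
```

Because uncss strips unused rules per-page, the class only costs bytes on pages that actually use it.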