Earth Notes: On Website Technicals (2020-11)
Updated 2024-12-18.
2020-11-20: Ads Subtracted
I am tweaking to reduce blank spaces and pointless page weight for pages where Google won't show ads because of low traffic.
I have simplified the logic for desktop to be the same as AMP, ie to insert ad code only on pages with at least a specified popularity by page hits.
My expectation is that only a small fraction of (desktop and AMP) pages will now carry ad code, but effective site-wide RPM and earnings should be only slightly reduced.
AMP and desktop pages with ads are periodically rebuilt to purge ad weight from those that are no longer popular enough.
After a full rebuild, the count of (eg) desktop pages carrying ads fell to under 70 from more than 210, out of nearly 300 candidate pages.
Very early indications are that views of pages with ads have dropped about 20%, against the 3-fold reduction in pages carrying ads. I can move the popularity threshold for pages to carry ads to adjust this balance.
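As a minimal sketch of the popularity rule described above (not the site's actual build code), the decision can be expressed as a simple threshold on page hits. The names `should_carry_ads` and `pages_with_ads`, and the threshold value, are hypothetical; the real threshold is not given in this note.

```python
# Hypothetical threshold: the real value is not stated in this note.
MIN_HITS_FOR_ADS = 100


def should_carry_ads(page_hits: int, min_hits: int = MIN_HITS_FOR_ADS) -> bool:
    """Return True if a page is popular enough to have ad code inserted."""
    return page_hits >= min_hits


def pages_with_ads(hits_by_page: dict, min_hits: int = MIN_HITS_FOR_ADS) -> list:
    """Return the sorted names of candidate pages that would carry ad code."""
    return sorted(
        page for page, hits in hits_by_page.items()
        if should_carry_ads(hits, min_hits)
    )
```

Raising or lowering `min_hits` trades the number of pages carrying ad weight against the share of page views that can show ads.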
2020-11-17: Old Apache Stop
I have turned off the Apache2 instance running on the old RPi2, since it is not now running any material static site.
# /etc/init.d/apache2 stop
# update-rc.d apache2 disable
A quick attempt to contact one of the residual services now hangs/fails, correctly.
After a reboot Apache is still not responding, correctly.
Note that netstat does still show the servlet-based listeners.
2020-11-16: Apple Touch Icons
I get the occasional blast of requests from an Apple device like so:
www.earth.org.uk:80 "GET /apple-touch-icon-120x120-precomposed.png HTTP/1.1"
www.earth.org.uk:80 "GET /apple-touch-icon-120x120.png HTTP/1.1"
www.earth.org.uk:80 "GET /apple-touch-icon.png HTTP/1.1"
Sometimes a 152x152 icon is requested.
This happens (I think) when EOU is added to an i-device's homescreen.
So, using ManyTools' Apple-touch-icon generator, I created a set of icons. I then svn cp-ed down to the desktop root the 60x60, 120x120 and 152x152 versions (generated as apple-touch-icon-iphone-60x60.png, apple-touch-icon-iphone-retina-120x120.png, apple-touch-icon-ipad-retina-152x152.png) under the names below, where they are usable by default, while avoiding multiple copies of the pixels in the repo.
apple-touch-icon.png
apple-touch-icon-120x120.png
apple-touch-icon-152x152.png
Before putting them in the repo I reduced their weight with zopflipng -m -m. tinypng.com first would have been even better! I may copy them to the AMP root too, though probably not to the lite/m-dot to avoid incurring extra bandwidth (and storage) costs for users, for just a little bit of eye-candy.
2020-11-15: WWW Soft Canonical
Without adding any overhead (eg extra headers) to normal connections, but to gently redirect spiders to the WWW https versions of most files, I have inserted the following early config for www.earth.org.uk.
# Redirect most Referer-less http accesses to https.
# Aim to gently redirect spiders to canonical https for most content.
# Avoid redirecting (top-level) HTML files that contain own rel=canonical,
# so users directly choosing http can stay on http.
# Avoid breaking the LE ACME challenge.
# Use a 302 (temporary) redirect, for now.
RewriteEngine on
RewriteCond %{HTTPS} off
RewriteCond %{HTTP_REFERER} ^$
RewriteCond %{REQUEST_URI} !^/\.well-known/
RewriteCond %{REQUEST_URI} !^/[^/]*\.html$
RewriteCond %{REQUEST_URI} !^$
RewriteCond %{REQUEST_URI} !^/$
RewriteRule ^/(.*)$ https://www.earth.org.uk/$1 [L,R=302]
Currently there are slightly more http than https WWW requests, by ~10%.
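To make the effect of those conditions easier to check by eye, here is a rough Python model of the same decision. It is only an illustration of the rewrite logic, assuming the rule runs in virtual-host context where the request path starts with a slash; the function name `soft_canonical_redirect` is hypothetical and nothing like it runs on the server.

```python
import re
from typing import Optional

CANONICAL_PREFIX = "https://www.earth.org.uk/"


def soft_canonical_redirect(https: bool, referer: str, uri: str) -> Optional[str]:
    """Return the 302 target for a request, or None if it is left alone."""
    if https:
        return None  # Already on https.
    if referer != "":
        return None  # Has a Referer: probably a real user following a link.
    if re.match(r"^/\.well-known/", uri):
        return None  # Do not break the Let's Encrypt ACME challenge.
    if re.match(r"^/[^/]*\.html$", uri):
        return None  # Top-level HTML pages carry their own rel=canonical.
    if uri in ("", "/"):
        return None  # Leave the home page alone.
    match = re.match(r"^/(.*)$", uri)
    if match is None:
        return None  # The RewriteRule pattern does not match.
    return CANONICAL_PREFIX + match.group(1)
```

For example, a Referer-less http request for an image is redirected, while the same request for a top-level HTML page or for anything under /.well-known/ is served as-is.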
2020-11-12: AMP HTTPS Only
Today I am making amp.EOU https only, eg by redirecting http to https.
Since AMP users are already paying the overhead of the AMP JavaScript, etc, the bandwidth- and latency-saving aspects of http are likely less important to them. They have likely already been pushed to the AMP page via https-based search and AMP cache.
Making the http option disappear should save a little crawl bandwidth by spiders. About one quarter of AMP page hits are currently http.
Then I will be down from serving 6 variants of each page — www, m, amp for each of http and https — to 5. It's a start.
There's a little wrinkle to avoid interfering with Let's Encrypt auto-renew.
RewriteEngine on
RewriteCond %{HTTPS} off
RewriteCond %{REQUEST_URI} !^/\.well-known/
RewriteRule ^/(.*)$ https://amp.earth.org.uk/$1 [L,R=301]
I also aim to add a couple of other tweaks, eg that make my security rating in WebPageTest better than the current "F"!
The biggest complaint from WebPageTest is fixed by adding the header Strict-Transport-Security: max-age=31536000 which should force a browser to use https for amp.EOU for a year. This raises the WPT security rating for the home page from "F" to "E".
Adding the header X-Frame-Options: DENY, which I already use for the desktop site, improves the security score to "D", but seems to stop images loading in Firefox (though not Chrome Canary). The header is apparently effectively obsoleted by the Content-Security-Policy header though. Given all that, it's not staying!
I note that https://www.theguardian.com/uk includes these headers:
x-frame-options: SAMEORIGIN
content-security-policy: default-src https:; script-src https: 'unsafe-inline' 'unsafe-eval' blob: 'unsafe-inline'; frame-src https: data:; style-src https: 'unsafe-inline'; img-src https: data: blob:; media-src https: data: blob:; font-src https: data:; connect-src https: wss:; child-src https: blob:; object-src 'none'; base-uri 'none'
referrer-policy: no-referrer-when-downgrade
strict-transport-security: max-age=31536000; includeSubDomains; preload
x-content-type-options: nosniff
x-xss-protection: 1; mode=block
It seems as if x-xss-protection is also only for older browsers, and I should concentrate my efforts on crafting content-security-policy.
Part may be Referrer-Policy: origin-when-cross-origin, or the equivalent via content-security-policy.
Another part may be script-src https://cdn.ampproject.org:* to let the AMP scripts run, though that may not let Google ads run.
I do use a little inline CSS to keep the header and CRP (Critical Rendering Path) small, which implies something like style-src 'unsafe-inline' which weakens the whole mechanism. Maybe I should wean myself off local CSS in all critical cases instead.
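As a rough sketch of how those pieces might combine, the snippet below assembles a candidate header set in Python. The helper `build_csp` and the `candidate_headers` dictionary are hypothetical illustrations, and the policy shown is only the fragments mentioned above, not a tested or deployed configuration.

```python
def build_csp(directives: dict) -> str:
    """Join CSP directives as 'name source ...' entries separated by '; '."""
    return "; ".join(
        f"{name} {' '.join(sources)}" for name, sources in directives.items()
    )


# Candidate directives taken from the notes above; untested against ads.
candidate_csp = {
    "script-src": ["https://cdn.ampproject.org:*"],
    "style-src": ["'unsafe-inline'"],
}

candidate_headers = {
    "Strict-Transport-Security": "max-age=31536000",
    "Referrer-Policy": "origin-when-cross-origin",
    "Content-Security-Policy": build_csp(candidate_csp),
}
```

Keeping the directives in a dictionary makes it easy to drop `'unsafe-inline'` later if the inline CSS goes away.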
2020-11-11: Slow Switchover
I made the https desktop pages canonical (rather than http) 2020-09-21.
As well as the slow switchover of entries in the https-canon sitemap, there have been interesting glitches such as doubling-up in some of the items (http and https) for GSC enhancements such as breadcrumbs, and bizarre double entries (ie the same item listed twice) in AMP. The latter may be because I am now redirecting http AMP to https...
(I removed from the sitemap an XHTML page that is not canonical.)
2020-11-05: Lazy Wins
Lazy loading seems to win in two ways. Reducing bandwidth is the obvious one, but it also reduces initial visible page rendering time even when no bandwidth is saved.
So, for example, on the home page, Chrome does not avoid loading any images because even the ones below the fold are not far enough below. But it seems that by letting Chrome concentrate on the important bits above the fold, initial layout is faster. Firefox manages to save bandwidth too by avoiding loading several images for the initial view. Those images will never be loaded if the visitor does not scroll down; even if they do, load on the server is spread out.
Here are three simple scenarios, all from WebPageTest instances in London, all over HTTP/2 (ie one TCP connection) to https://www.earth.org.uk.
Note that the bandwidth limits were not identical across all runs, but higher bandwidth does not beat the advantages of lazy loading!
Chrome without Lazy Loading
For this run all loading=lazy attributes were manually removed from the HTML.
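The removal for this run was done by hand. For repeat tests the same edit could be scripted, for example with a small regular-expression pass like the sketch below. The function name `strip_lazy_loading` is hypothetical, and a regex is only adequate for simple, well-formed markup.

```python
import re

# Matches a loading=lazy attribute, quoted or unquoted, with its leading space.
_LAZY_ATTR = re.compile(r'\s+loading=(["\']?)lazy\1', re.IGNORECASE)


def strip_lazy_loading(html: str) -> str:
    """Return the HTML with all loading=lazy attributes removed."""
    return _LAZY_ATTR.sub("", html)
```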
Chrome with Lazy Loading
Firefox with Lazy Loading
2020-11-02: Let's Encrypt Auto-renew Un-snagging
Amongst ignorable email complaints from Let's Encrypt I received a worrying one that implied that the actual EOU TLS certs were going to expire.
Looking in the Apache logs I could see redirects and errors during the http-01 challenge for the amp. and m. sites. The two sites fairly aggressively redirect to www. anything that does not look like a top-level HTML page (or script or favicon, etc). That breaks the GET /.well-known/acme-challenge/... request. So I put in special-case fixes to not rewrite/redirect any such requests.
Having done that, renewal succeeded by manually running:
% sudo certbot renew
With a fair wind behind, auto-renewal should "just work"!
(Note from future me: it does!)
2020-11-01: More Work Storage
(See previous work storage note and next.)
I have made a few more tweaks, eg to stop almost all DAILY and WEEKLY periodic updates when battery is LOW or below.
Further tweaks mean that almost no periodic page rebuilds will happen unless the sun is out. Nor will changing the build scripts force a rebuild in the absence of sunshine and a decent state of battery charge.
(Pleasingly, with the battery VLOW from several dark foggy days in a row, no pages were rebuilt at all, not even the stats page. Good dynamic conservation response from work storage!)
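The gating described above could be sketched roughly as follows. This is an illustration only, not the site's build scripts: only the VLOW and LOW states are named in these notes, so the OK and HIGH levels, the ordering, and the function name `may_run_periodic` are all assumptions.

```python
from enum import IntEnum


class Battery(IntEnum):
    """Battery states, lowest first. OK and HIGH are hypothetical names."""
    VLOW = 0
    LOW = 1
    OK = 2
    HIGH = 3


def may_run_periodic(battery: Battery, sun_is_out: bool) -> bool:
    """Allow a DAILY/WEEKLY rebuild only above LOW and with the sun out."""
    return battery > Battery.LOW and sun_is_out
```

With the battery at VLOW, as in the foggy spell above, this returns False whatever the weather, so no periodic rebuild would be attempted.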