Earth Notes: On Website Technicals (2025-08)

Updated 2025-08-30.
Tech updates: side quests, RSS badness, ansible trouble, HTML minifier upgrade aborted...
Taking a holiday from the fevered consideration of the optimal method to choose the distribution for randomising the scope for consulting on considering a grass-roots movement to form a citizens' assembly to set terms for a Royal Commission to guide the forming of a ministerial task force to consider the scoping and creation of a focus group to sketch terms of reference for a study group to outline the agenda for a pre-meeting to form a steering group to get ready to think about upgrading the RPi server, easy does it...

2025-08-30: html-minifier-next

The HTML5 minifier that I have been using (html-minifier) has not been maintained for a long time.

I would like to switch to the html-minifier-next fork.

I would like to enable the new no-newlines-before-tag-close option.

But the server's JavaScript runtime is too old to parse the fork's ES module import syntax:

% html-minifier --help
Usage: html-minifier [options] [files...]
...

% html-minifier-next --help
/usr/lib/node_modules/html-minifier-next/cli.js:28
import fs from 'fs';
^^^^^^

SyntaxError: Unexpected token import
    at createScript (vm.js:80:10)
    at Object.runInThisContext (vm.js:139:10)
    at Module._compile (module.js:616:28)
    at Object.Module._extensions..js (module.js:663:10)
    at Module.load (module.js:565:32)
    at tryModuleLoad (module.js:505:12)
    at Function.Module._load (module.js:497:3)
    at Function.Module.runMain (module.js:693:10)
    at startup (bootstrap_node.js:188:16)
    at bootstrap_node.js:609:3
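
A pre-flight check before trying the fork again could look something like the minimal sketch below. The minimum version used here is an assumption for illustration only (the real requirement should be taken from the package's own metadata), and the function names are hypothetical.

```python
import re
import subprocess

# ASSUMPTION: illustrative minimum only; check the package's "engines" field.
ASSUMED_MIN_NODE = (18, 0, 0)


def parse_node_version(text):
    """Parse `node --version` output such as 'v8.17.0' into (8, 17, 0)."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised Node.js version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def node_is_new_enough(version_text, minimum=ASSUMED_MIN_NODE):
    """True if the reported version is at least the (assumed) minimum."""
    return parse_node_version(version_text) >= minimum


def installed_node_version():
    """Ask the local node binary for its version (needs node on the PATH)."""
    result = subprocess.run(
        ["node", "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling node_is_new_enough(installed_node_version()) in a build script would allow a fall-back to the old minifier rather than a stack trace.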

2025-08-29: Tuning

I sent polite emails a few days ago to two of the pollers in the 'greedy' list, asking them to follow the good practice in the Feed Reader Behavior Project. Both poll far too often, though one manages to do conditional polls and the other allows compression.

One has not reacted at all and the other seems to have stopped polling.

I have lowered the maximum polls per day threshold to 7, which should be plenty if I am sending the right signals and any attention is being paid to even a subset of them. The default Cache-Control max-age is a little over 4h, though Retry-After can be shorter when attempting to steer clients toward the preferred noon-ish poll time.
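
As a rough illustration of that steering (not the actual server logic), a Retry-After value could be chosen as the smaller of the default wait and the time remaining until the next noon. The exact default, the use of UTC, and the function names are all assumptions made for this sketch.

```python
from datetime import datetime, timedelta, timezone

# ASSUMPTION: stand-in for "a little over 4h"; the real value is not given.
DEFAULT_MAX_AGE_S = 4 * 3600 + 60


def seconds_until_next_noon(now):
    """Seconds from `now` until the next 12:00 in the same timezone."""
    noon = now.replace(hour=12, minute=0, second=0, microsecond=0)
    if noon <= now:
        noon += timedelta(days=1)
    return int((noon - now).total_seconds())


def retry_after_seconds(now, default_s=DEFAULT_MAX_AGE_S):
    """Shorten the wait only when noon arrives before the default expires."""
    return min(default_s, seconds_until_next_noon(now))


if __name__ == "__main__":
    print(retry_after_seconds(datetime.now(timezone.utc)))
```

A client that waits a little over 4h between polls makes about six polls a day, which stays under the threshold of 7.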

# Greedy podcast feed pullers: keys are MD5 hashed User-Agent.
# A non-empty txt map lookup of the %{md5:%{HTTP:User-Agent}} means bad!
# Built: 2025-08-29T12:47+00:00Z
# MAXHITSPERUAPERDAY: 7
# MAXUAS: 25
#----------------
# request-count User-Agent
# MD5hash approx-hits-per-day
#----------------
# 257 Podbean/FeedUpdat
54e0e9df937b06cc83fab29f44c02b7f 257
# 203 Spotify/1.0
4582d9bdbcef42af27d89da91c6eb804 203
# 138 Google-Podcast
8dea568b39db0451edd6b30f29238eaf 138
# 98 Gofeed/1.0
4a9d728c458902d6ff716779ff72841d 98
# 63 Amazon Music Podc
d69be2563c9f1929edf2906d41809aea 63
# 63 -
d41d8cd98f00b204e9800998ecf8427e 63
# 60 iTMS
97f76eb7e02c5ff923e1198ff1c288cd 60
# 35 MuckRackFeedParse
62b46fff1cf5f8af7b4b37a2f783b57a 35
# 21 PocketCasts/1.0 (
5caee5a0a53fcbcae25244d8770516ed 21
# 19 itms
2e7f714a929b3f52f3c094710819a99a 19
# 16 Mozilla/5.0 (Wind
6b9a00393fb1607b0ada13520f814ab5 16
# 16 fyyd-poll-1/0.5
0222c1d79b7a96a5bc998a674717f750 16
# 13 axios/1.6.8
b534882134248c9a5957e0c011a37037 13
# 12 Podcasts/1555.2.1
1155c008fa426b4f89c4a1e832eacaee 12
# 12 Overcast/1.0 Podc
c8bf931c39e0b216181afc441001e58b 12
# 12 Mozilla/5.0 (Maci
9bb586dfc3329d2b522978a294f5c138 12
# 11 deezer/curl-3.0
b99188f8b12adffe0f92ae9c03f03c7c 11
# 10 axios/1.11.0
5c8ae194a6f98a725c992886f3da6e04 10
# 9 FeedBurner/1.0 (h
7678d16f662d98189ceb769c5dd002bc 9
# 8 TPA/1.0.0
86e71fcf5ba78d28f18270f7f83256bb 8
# 7 PodchaserParser/2
599ce17a1a1a11800a9905e39fa49f10 7
# 7 iVoox Global Podc
5651428ffa1960ddd9c4e5ff5ec4c270 7
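
A minimal sketch of how a map like the one above might be built from a list of logged User-Agent values. This is not the real generator script: the function names, the UTF-8 encoding, and the "at or above the threshold" test (inferred from the listings, which include entries equal to the threshold) are assumptions. One thing that can be checked directly is that the "-" entry's key is the MD5 of the empty string, ie a missing User-Agent.

```python
import hashlib
from collections import Counter


def ua_key(user_agent):
    """MD5 hex digest of the User-Agent, as used for the map keys."""
    return hashlib.md5(user_agent.encode("utf-8")).hexdigest()


def greedy_map(user_agents, days, max_hits_per_ua_per_day=7, max_uas=25):
    """Return (key, approx hits per day) for greedy UAs, busiest first.

    `user_agents` holds one entry per logged feed request over `days` days.
    """
    rows = []
    for user_agent, count in Counter(user_agents).most_common():
        per_day = round(count / days)
        # ASSUMPTION: entries at the threshold are listed too.
        if per_day >= max_hits_per_ua_per_day:
            rows.append((ua_key(user_agent), per_day))
    return rows[:max_uas]


if __name__ == "__main__":
    sample = ["ExampleBot/1.0"] * 40 + [""] * 14 + ["PoliteReader/2.0"] * 3
    for key, per_day in greedy_map(sample, days=2):
        print(key, per_day)
```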

Ignore Sec-CH-UA-Mobile

I was forcing clients with a Sec-CH-UA-Mobile: ?1 HTTP header to receive the 'lite' EOU page versions. That can be confusing, since it takes away any choice to see the full page, for example on a capable device without expensive data.

I do still force clients sending Save-Data: on, and those (likely badly-written bots) that clearly will not accept compression, into 'lite' mode, to conserve bandwidth and to keep some of the richer data away from bad actors.

Sample from the updated Apache EOU configuration:

  # No precompression usable, or client does not support gzip or br compression.
  RewriteCond %{HTTP:Accept-Encoding} !((gzip)|(br)) [OR]
  #RewriteCond %{HTTP:Sec-CH-UA-Mobile} "[?]1" [OR]
  RewriteCond %{HTTP:Save-Data} on [NC]
  RewriteCond %{DOCUMENT_ROOT}/m%{REQUEST_FILENAME} -s
  RewriteRule ^/([^_][^/]+)\.(html)$ /m/$1.$2 [L]
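
The decision made by those rules can be paraphrased in a few lines of Python, purely as a reading aid. The function names are hypothetical, header lookup is simplified to a plain dict with exact-case keys, and the file-existence test is passed in as a flag rather than checked on disk.

```python
import re

# Same patterns as the RewriteCond lines; both are substring matches.
_COMPRESSION = re.compile(r"(gzip)|(br)")
_SAVE_DATA_ON = re.compile(r"on", re.IGNORECASE)
# Same pattern as the RewriteRule: top-level .html pages not starting with '_'.
_PAGE = re.compile(r"^/([^_][^/]+)\.(html)$")


def lite_path(path):
    """Map /page.html to /m/page.html, or return None if the rule does not apply."""
    match = _PAGE.match(path)
    if match is None:
        return None
    return f"/m/{match.group(1)}.{match.group(2)}"


def serve_lite(headers, lite_file_exists):
    """(no usable compression OR Save-Data on) AND a non-empty lite file exists."""
    no_compression = _COMPRESSION.search(headers.get("Accept-Encoding", "")) is None
    save_data = _SAVE_DATA_ON.search(headers.get("Save-Data", "")) is not None
    return (no_compression or save_data) and lite_file_exists
```

Note that Sec-CH-UA-Mobile no longer appears anywhere in the decision, matching the commented-out condition.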

2025-08-25: Post-429 Change in RSS Badness Snapshot

Because of the data sampling period, and maybe other factors, this does not fully represent any change in response to the 429 change.

# Greedy podcast feed pullers: keys are MD5 hashed User-Agent.
# A non-empty txt map lookup of the %{md5:%{HTTP:User-Agent}} means bad!
# Built: 2025-08-25T10:05+00:00Z
# MAXHITSPERUAPERDAY: 12
# MAXUAS: 25
#----------------
# request-count User-Agent
# MD5hash approx-hits-per-day
#----------------
# 256 Podbean/FeedUpdat
54e0e9df937b06cc83fab29f44c02b7f 256
# 202 Spotify/1.0
4582d9bdbcef42af27d89da91c6eb804 202
# 137 Google-Podcast
8dea568b39db0451edd6b30f29238eaf 137
# 97 Gofeed/1.0
4a9d728c458902d6ff716779ff72841d 97
# 62 Amazon Music Podc
d69be2563c9f1929edf2906d41809aea 62
# 59 iTMS
97f76eb7e02c5ff923e1198ff1c288cd 59
# 59 -
d41d8cd98f00b204e9800998ecf8427e 59
# 35 MuckRackFeedParse
62b46fff1cf5f8af7b4b37a2f783b57a 35
# 22 PocketCasts/1.0 (
5caee5a0a53fcbcae25244d8770516ed 22
# 20 fyyd-poll-1/0.5
0222c1d79b7a96a5bc998a674717f750 20
# 19 itms
2e7f714a929b3f52f3c094710819a99a 19
# 15 Mozilla/5.0 (Wind
6b9a00393fb1607b0ada13520f814ab5 15
# 13 axios/1.6.8
b534882134248c9a5957e0c011a37037 13
# 12 Podcasts/1555.2.1
1155c008fa426b4f89c4a1e832eacaee 12

2025-08-18: RSS Feeds and Ansible Trouble

I wanted to update how I handle bad RSS feed consumers (more later) but ansible was throwing a strange error: SyntaxError: invalid syntax\n", "module_stdout": "", "msg": "MODULE FAILURE: No start of json char found.

It seems that this may be a new macOS-related issue, eg see MODULE FAILURE: No start of json char found on macOS and aarch64.

Ansible and Python mismatch suggests pinning ansible to V2.9. I tried a slightly older version installed via brew install ansible@10 but the error persisted.

I went further back with brew install ansible@9 and my ansible script runs! But during install I get the somewhat alarming warning:

ansible@9 has been deprecated because it is not maintained upstream! It will be disabled on 2025-11-30

For now I have to remember to run ansible like this:

% setenv PATH /usr/local/opt/ansible@9/bin:$PATH
% ansible-playbook whatever.yml

RSS feed fiddle

I have changed all feed 503 responses to 429 as I believe that to be more logically correct, eg in terms of the Feed Reader Behavior Project (FRB), though I expect (bad) traffic to increase somewhat.
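
For comparison, this is roughly what a well-behaved feed client could do on receiving a 429 (or 503) with a Retry-After header. It is a minimal sketch with hypothetical function names and an assumed one-hour fallback. The two value forms handled (delay in seconds, or an HTTP date) are the ones defined for Retry-After.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_delay_seconds(retry_after, now):
    """Convert a Retry-After value (delta-seconds or HTTP-date) to seconds."""
    value = retry_after.strip()
    if value.isdigit():
        return int(value)
    when = parsedate_to_datetime(value)
    return max(0, int((when - now).total_seconds()))


def next_poll_delay(status, headers, now, default_s=3600):
    """Seconds to wait before polling again; default_s is an assumed fallback."""
    if status in (429, 503) and "Retry-After" in headers:
        try:
            return retry_delay_seconds(headers["Retry-After"], now)
        except (TypeError, ValueError):
            pass  # Unparseable header: fall back to the default.
    return default_s


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(next_poll_delay(429, {"Retry-After": "7200"}, now))
```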

2025-08-22: YouTube less bad with 429

YouTube has now stopped sending me a nightly complaint (from receiving a 503 in response to its bad behaviour):

Failed to retrieve the RSS feed for your podcast 'Earth Notes - LITE'

We were blocked from retrieving the RSS feed for your podcast 'Earth Notes - LITE'. For YouTube to be able to ingest your RSS feed, Googlebot must be allowed to crawl your feed in your robots.txt file. If you host your own RSS feed, please follow these instructions to allow Googlebot to crawl your feed in your robots.txt file. Otherwise, please contact your hosting company for more information.

Until this problem is fixed, some episodes from your RSS feed will not be uploaded to YouTube.

The YouTube team

Which is several levels of wrong, as previously noted.

2025-08-09: RSS Badness Snapshot

Because I can:

# Greedy podcast feed pullers: keys are MD5 hashed User-Agent.
# A non-empty txt map lookup of the %{md5:%{HTTP:User-Agent}} means bad!
# Built: 2025-08-09T11:14+00:00Z
# MAXHITSPERUAPERDAY: 12
# MAXUAS: 25
#----------------
# request-count User-Agent
# MD5hash approx-hits-per-day
#----------------
# 296 Podbean/FeedUpdat
54e0e9df937b06cc83fab29f44c02b7f 296
# 203 Spotify/1.0
4582d9bdbcef42af27d89da91c6eb804 203
# 138 Google-Podcast
8dea568b39db0451edd6b30f29238eaf 138
# 97 Gofeed/1.0
4a9d728c458902d6ff716779ff72841d 97
# 68 Amazon Music Podc
d69be2563c9f1929edf2906d41809aea 68
# 67 iVoox Global Podc
5651428ffa1960ddd9c4e5ff5ec4c270 67
# 64 -
d41d8cd98f00b204e9800998ecf8427e 64
# 61 iTMS
97f76eb7e02c5ff923e1198ff1c288cd 61
# 33 MuckRackFeedParse
62b46fff1cf5f8af7b4b37a2f783b57a 33
# 33 Mozilla/5.0 (Wind
6b9a00393fb1607b0ada13520f814ab5 33
# 23 Overcast/1.0 Podc
c8bf931c39e0b216181afc441001e58b 23
# 23 itms
2e7f714a929b3f52f3c094710819a99a 23
# 23 fyyd-poll-1/0.5
0222c1d79b7a96a5bc998a674717f750 23
# 22 PocketCasts/1.0 (
5caee5a0a53fcbcae25244d8770516ed 22
# 16 Mozilla/5.0
0e3c1d553071f45ae73c51aa46fc11d8 16
# 15 axios/1.6.8
b534882134248c9a5957e0c011a37037 15
# 13 TPA/1.0.0
86e71fcf5ba78d28f18270f7f83256bb 13

2025-08-06: Side Quests

Not really a tech thing, but some of my not-secret not-main tasks (ie not planned PhD research and not volunteering such as TTK, RCK, KEF), also not in order:

  • "How (and why) to install a GB heatpump in 2 days" open letter to DESNZ and Octopus.
  • Letter to ICO and FCA about absolutely bonkers misapplied security at my bank.
  • RSS Efficiency paper for arXiv / NLUUG, including collaboration on CO2 estimates.
  • DoI for LOCO24 lightning talk (and/or FOSDEM).
  • Put KEHS talks up.
  • Put up more ambient field recordings.
  • Do the RPi server upgrade!