Earth Notes: On Website Technicals (2025-08)
Updated 2025-08-30.
2025-08-30: html-minifier-next
The HTML5 minifier that I have been using (html-minifier) has not been maintained for a long time.
I would like to switch to the html-minifier-next fork.
I would like to enable the new no-newlines-before-tag-close option.
But the server's JavaScript (Node.js) is too old to handle its ES module import:
% html-minifier --help
Usage: html-minifier [options] [files...]
...
% html-minifier-next --help
/usr/lib/node_modules/html-minifier-next/cli.js:28
import fs from 'fs';
^^^^^^
SyntaxError: Unexpected token import
    at createScript (vm.js:80:10)
    at Object.runInThisContext (vm.js:139:10)
    at Module._compile (module.js:616:28)
    at Object.Module._extensions..js (module.js:663:10)
    at Module.load (module.js:565:32)
    at tryModuleLoad (module.js:505:12)
    at Function.Module._load (module.js:497:3)
    at Function.Module.runMain (module.js:693:10)
    at startup (bootstrap_node.js:188:16)
    at bootstrap_node.js:609:3
2025-08-29: Tuning
I sent polite emails a few days ago to two of the pollers in the 'greedy' list, asking them to follow the good practice in the Feed Reader Behavior Project. Both poll far too often, though one manages to do conditional polls and the other allows compression.
One has not reacted at all and the other seems to have stopped polling.
I have lowered the maximum polls per day threshold to 7, which should be plenty if I am sending the right signals and any attention is being paid to even a subset of them. The default Cache-Control max-age is a little over 4h, though Retry-After can be shorter when attempting to steer clients toward the preferred noon-ish poll time.
# Greedy podcast feed pullers: keys are MD5 hashed User-Agent.
# A non-empty txt map lookup of the %{md5:%{HTTP:User-Agent}} means bad!
# Built: 2025-08-29T12:47+00:00Z
# MAXHITSPERUAPERDAY: 7
# MAXUAS: 25
#----------------
# request-count User-Agent
# MD5hash approx-hits-per-day
#----------------
# 257 Podbean/FeedUpdat
54e0e9df937b06cc83fab29f44c02b7f 257
# 203 Spotify/1.0
4582d9bdbcef42af27d89da91c6eb804 203
# 138 Google-Podcast
8dea568b39db0451edd6b30f29238eaf 138
# 98 Gofeed/1.0
4a9d728c458902d6ff716779ff72841d 98
# 63 Amazon Music Podc
d69be2563c9f1929edf2906d41809aea 63
# 63 -
d41d8cd98f00b204e9800998ecf8427e 63
# 60 iTMS
97f76eb7e02c5ff923e1198ff1c288cd 60
# 35 MuckRackFeedParse
62b46fff1cf5f8af7b4b37a2f783b57a 35
# 21 PocketCasts/1.0 (
5caee5a0a53fcbcae25244d8770516ed 21
# 19 itms
2e7f714a929b3f52f3c094710819a99a 19
# 16 Mozilla/5.0 (Wind
6b9a00393fb1607b0ada13520f814ab5 16
# 16 fyyd-poll-1/0.5
0222c1d79b7a96a5bc998a674717f750 16
# 13 axios/1.6.8
b534882134248c9a5957e0c011a37037 13
# 12 Podcasts/1555.2.1
1155c008fa426b4f89c4a1e832eacaee 12
# 12 Overcast/1.0 Podc
c8bf931c39e0b216181afc441001e58b 12
# 12 Mozilla/5.0 (Maci
9bb586dfc3329d2b522978a294f5c138 12
# 11 deezer/curl-3.0
b99188f8b12adffe0f92ae9c03f03c7c 11
# 10 axios/1.11.0
5c8ae194a6f98a725c992886f3da6e04 10
# 9 FeedBurner/1.0 (h
7678d16f662d98189ceb769c5dd002bc 9
# 8 TPA/1.0.0
86e71fcf5ba78d28f18270f7f83256bb 8
# 7 PodchaserParser/2
599ce17a1a1a11800a9905e39fa49f10 7
# 7 iVoox Global Podc
5651428ffa1960ddd9c4e5ff5ec4c270 7
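As a rough illustration of how a map like the one above could be produced, here is a minimal Python sketch. It is not the script actually used for this site: the function names and the input (one User-Agent string per feed request) are hypothetical, and the "at or above the threshold" rule is only inferred from the listings, where the lowest entries equal the stated maximum. The two constants are taken from the map header.

```python
import hashlib
from collections import Counter

MAX_HITS_PER_UA_PER_DAY = 7  # MAXHITSPERUAPERDAY from the map header.
MAX_UAS = 25                 # MAXUAS from the map header.


def ua_key(user_agent: str) -> str:
    """Return the lowercase hex MD5 of a User-Agent string (the map key)."""
    return hashlib.md5(user_agent.encode("utf-8")).hexdigest()


def greedy_map(user_agents, days=1.0):
    """Build {md5-of-UA: approx hits per day} for the greediest User-Agents.

    user_agents: iterable with one User-Agent string per feed request
    (hypothetical input; the real log format is not shown here).
    days: length of the sampling period in days (assumed parameter).
    """
    counts = Counter(user_agents)
    result = {}
    # Consider at most MAX_UAS of the most frequent User-Agents.
    for ua, hits in counts.most_common(MAX_UAS):
        per_day = round(hits / days)
        if per_day >= MAX_HITS_PER_UA_PER_DAY:
            result[ua_key(ua)] = per_day
    return result


def format_map(mapping):
    """Render the mapping as 'md5 hits' lines, greediest first."""
    ordered = sorted(mapping.items(), key=lambda kv: -kv[1])
    return ["%s %d" % (key, hits) for key, hits in ordered]
```

Note that `ua_key("")` gives `d41d8cd98f00b204e9800998ecf8427e`, the MD5 of the empty string, which is the key shown against the `-` (no User-Agent) row.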
Ignore Sec-CH-UA-Mobile
I was forcing clients sending a Sec-CH-UA-Mobile: ?1 HTTP header to receive the 'lite' EOU page versions. That can be confusing, since it takes away any choice to see the full page, for example on a capable device without expensive data.
I do still force Save-Data: on clients, and those (likely badly-written bots) that will clearly not accept compression, to 'lite' mode, to conserve bandwidth and to keep a little rich data from bad actors.
Sample from the updated Apache EOU configuration:
# No precompression usable, or client does not support gzip or br compression.
RewriteCond %{HTTP:Accept-Encoding} !((gzip)|(br)) [OR]
#RewriteCond %{HTTP:Sec-CH-UA-Mobile} "[?]1" [OR]
RewriteCond %{HTTP:Save-Data} on [NC]
RewriteCond %{DOCUMENT_ROOT}/m%{REQUEST_FILENAME} -s
RewriteRule ^/([^_][^/]+)\.(html)$ /m/$1.$2 [L]
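To make the decision logic of that rule set easier to follow, here is a minimal Python sketch of the same conditions. It is an approximation for illustration only, not how Apache evaluates them internally: the function name, the lowercase-keyed headers dict, and the `lite_exists` callable (standing in for the `-s` "file exists and is non-empty" test) are all assumptions of the sketch. In mod_rewrite the two header conditions joined by `[OR]` are grouped together, and that group is then ANDed with the file test.

```python
import re

# Same pattern as the RewriteRule: top-level *.html pages not starting with '_'.
LITE_URL = re.compile(r"^/([^_][^/]+)\.(html)$")


def lite_path(url_path, headers, lite_exists):
    """Return the '/m/...' lite path to serve, or None to leave the URL alone.

    headers: dict of request headers keyed by lowercase name (assumption).
    lite_exists: callable taking a lite path and returning True if that file
    exists with non-zero size (hypothetical stand-in for the '-s' test).
    """
    accept_encoding = headers.get("accept-encoding", "")
    save_data = headers.get("save-data", "")

    # Client offers neither gzip nor br (unanchored match, as in the config).
    no_compression = re.search(r"(gzip)|(br)", accept_encoding) is None
    # Save-Data contains 'on', case-insensitively ([NC]).
    wants_save_data = re.search(r"on", save_data, re.IGNORECASE) is not None

    # Sec-CH-UA-Mobile is deliberately ignored, matching the commented-out line.
    if not (no_compression or wants_save_data):
        return None

    m = LITE_URL.match(url_path)
    if m is None:
        return None

    candidate = "/m/%s.%s" % (m.group(1), m.group(2))
    return candidate if lite_exists(candidate) else None
```

For example, a request for `/index.html` with `Accept-Encoding: gzip, br` and `Sec-CH-UA-Mobile: ?1` but no `Save-Data` header is left alone under this sketch, whereas the same request with `Save-Data: on` is sent to `/m/index.html` if that file exists.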
2025-08-25: Post-429 Change in RSS Badness Snapshot
Because of the data sampling period, and maybe other factors, this does not fully represent any change in behaviour in response to the switch to 429.
# Greedy podcast feed pullers: keys are MD5 hashed User-Agent.
# A non-empty txt map lookup of the %{md5:%{HTTP:User-Agent}} means bad!
# Built: 2025-08-25T10:05+00:00Z
# MAXHITSPERUAPERDAY: 12
# MAXUAS: 25
#----------------
# request-count User-Agent
# MD5hash approx-hits-per-day
#----------------
# 256 Podbean/FeedUpdat
54e0e9df937b06cc83fab29f44c02b7f 256
# 202 Spotify/1.0
4582d9bdbcef42af27d89da91c6eb804 202
# 137 Google-Podcast
8dea568b39db0451edd6b30f29238eaf 137
# 97 Gofeed/1.0
4a9d728c458902d6ff716779ff72841d 97
# 62 Amazon Music Podc
d69be2563c9f1929edf2906d41809aea 62
# 59 iTMS
97f76eb7e02c5ff923e1198ff1c288cd 59
# 59 -
d41d8cd98f00b204e9800998ecf8427e 59
# 35 MuckRackFeedParse
62b46fff1cf5f8af7b4b37a2f783b57a 35
# 22 PocketCasts/1.0 (
5caee5a0a53fcbcae25244d8770516ed 22
# 20 fyyd-poll-1/0.5
0222c1d79b7a96a5bc998a674717f750 20
# 19 itms
2e7f714a929b3f52f3c094710819a99a 19
# 15 Mozilla/5.0 (Wind
6b9a00393fb1607b0ada13520f814ab5 15
# 13 axios/1.6.8
b534882134248c9a5957e0c011a37037 13
# 12 Podcasts/1555.2.1
1155c008fa426b4f89c4a1e832eacaee 12
2025-08-18: RSS Feeds and Ansible Trouble
I wanted to update how I handle bad RSS feed consumers (more later) but ansible was throwing a strange error: SyntaxError: invalid syntax\n", "module_stdout": "", "msg": "MODULE FAILURE: No start of json char found.
It seems that this may be a new macOS-related issue, eg see MODULE FAILURE: No start of json char found on macOS and aarch64.
Ansible and Python mismatch suggests pinning ansible to V2.9. I tried a slightly older version installed via brew install ansible@10 but the error persisted.
I went further back with brew install ansible@9 and my ansible script runs! But during install I get the somewhat alarming warning:
ansible@9 has been deprecated because it is not maintained upstream! It will be disabled on 2025-11-30
For now I have to remember to run ansible like this:
% setenv PATH /usr/local/opt/ansible@9/bin:$PATH
% ansible-playbook whatever.yml
RSS feed fiddle
I have changed all feed 503 responses to 429 as I believe that to be more logically correct, eg in terms of FRB, though I expect (bad) traffic to increase somewhat.
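A 429 (Too Many Requests) reply can carry a Retry-After header, and the 2025-08-29 entry mentions using a shorter Retry-After to steer clients toward a noon-ish poll. A minimal Python sketch of one way such a value might be chosen follows. It is purely illustrative and not the site's actual logic: the exact default, the floor, the use of UTC for "noon", and all of the names are assumptions.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_MAX_AGE_S = 4 * 3600 + 300  # "A little over 4h"; exact value assumed.
MIN_RETRY_S = 600                    # Hypothetical floor to avoid instant retries.
PREFERRED_HOUR_UTC = 12              # "Noon-ish"; the time zone is an assumption.


def retry_after_seconds(now):
    """Seconds to wait: until the next preferred poll time, capped at the default."""
    target = now.replace(hour=PREFERRED_HOUR_UTC, minute=0, second=0, microsecond=0)
    if target <= now:
        # Today's slot has passed, so aim for tomorrow's.
        target += timedelta(days=1)
    until_target = int((target - now).total_seconds())
    return max(MIN_RETRY_S, min(DEFAULT_MAX_AGE_S, until_target))


def too_many_requests(now):
    """Return (status, headers) for rejecting a greedy feed poller."""
    return 429, {"Retry-After": str(retry_after_seconds(now))}
```

Under these assumptions, a poller rejected at 10:00 UTC is asked to wait two hours, landing on noon, while one rejected just after noon simply gets the default delay.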
2025-08-22: YouTube less bad with 429
YouTube has now stopped sending me a nightly complaint (from receiving a 503 in response to its bad behaviour):
Failed to retrieve the RSS feed for your podcast 'Earth Notes - LITE'
We were blocked from retrieving the RSS feed for your podcast 'Earth Notes - LITE'. For YouTube to be able to ingest your RSS feed, Googlebot must be allowed to crawl your feed in your robots.txt file. If you host your own RSS feed, please follow these instructions to allow Googlebot to crawl your feed in your robots.txt file. Otherwise, please contact your hosting company for more information.
Until this problem is fixed, some episodes from your RSS feed will not be uploaded to YouTube.
The YouTube team
Which is several levels of wrong, as previously noted.
2025-08-09: RSS Badness Snapshot
Because I can:
# Greedy podcast feed pullers: keys are MD5 hashed User-Agent.
# A non-empty txt map lookup of the %{md5:%{HTTP:User-Agent}} means bad!
# Built: 2025-08-09T11:14+00:00Z
# MAXHITSPERUAPERDAY: 12
# MAXUAS: 25
#----------------
# request-count User-Agent
# MD5hash approx-hits-per-day
#----------------
# 296 Podbean/FeedUpdat
54e0e9df937b06cc83fab29f44c02b7f 296
# 203 Spotify/1.0
4582d9bdbcef42af27d89da91c6eb804 203
# 138 Google-Podcast
8dea568b39db0451edd6b30f29238eaf 138
# 97 Gofeed/1.0
4a9d728c458902d6ff716779ff72841d 97
# 68 Amazon Music Podc
d69be2563c9f1929edf2906d41809aea 68
# 67 iVoox Global Podc
5651428ffa1960ddd9c4e5ff5ec4c270 67
# 64 -
d41d8cd98f00b204e9800998ecf8427e 64
# 61 iTMS
97f76eb7e02c5ff923e1198ff1c288cd 61
# 33 MuckRackFeedParse
62b46fff1cf5f8af7b4b37a2f783b57a 33
# 33 Mozilla/5.0 (Wind
6b9a00393fb1607b0ada13520f814ab5 33
# 23 Overcast/1.0 Podc
c8bf931c39e0b216181afc441001e58b 23
# 23 itms
2e7f714a929b3f52f3c094710819a99a 23
# 23 fyyd-poll-1/0.5
0222c1d79b7a96a5bc998a674717f750 23
# 22 PocketCasts/1.0 (
5caee5a0a53fcbcae25244d8770516ed 22
# 16 Mozilla/5.0
0e3c1d553071f45ae73c51aa46fc11d8 16
# 15 axios/1.6.8
b534882134248c9a5957e0c011a37037 15
# 13 TPA/1.0.0
86e71fcf5ba78d28f18270f7f83256bb 13
2025-08-06: Side Quests
Not really a tech thing, but some of my not-secret not-main tasks (ie not planned PhD research and not volunteering such as TTK, RCK, KEF), also not in order:
- "How (and why) to install a GB heatpump in 2 days" open letter to DESNZ and Octopus.
- Letter to ICO and FCA about absolutely bonkers misapplied security at my bank.
- RSS Efficiency paper for arXiv / NLUUG, including collaboration on CO2 estimates.
- DoI for LOCO24 lightning talk (and/or FOSDEM).
- Put KEHS talks up.
- Put up more ambient field recordings.
- Do the RPi server upgrade!