Earth Notes: On Website Technicals (2025-07)

Updated 2025-08-06.
Tech updates: heating schedule thoughts, FLAC bot good behaviour but RSS badness continues, 4kW clamp.
Considering the optimal method to choose the distribution for randomising the scope for consulting on considering a grass-roots movement to form a citizens' assembly to set terms for a Royal Commission to guide the forming of a ministerial task force to consider the scoping and creation of a focus group to sketch terms of reference for a study group to outline the agenda for a pre-meeting to form a steering group to get ready to think about upgrading the RPi server, easy does it...

2025-07-22: Suppressing 4kW Export

Nominally 16WW's DNO could have restricted our PV generation to 16A per phase, or ~3.9kW at the typical 245V seen here.
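
The ~3.9kW figure is simply current times voltage. A minimal check of that arithmetic, assuming a single phase and unity power factor (both assumptions of this sketch, and `amps_to_watts` is a made-up helper name):

```shell
# Power in watts from current (A) and voltage (V).
# Assumes single phase and unity power factor.
amps_to_watts() {
    awk -v amps="$1" -v volts="$2" 'BEGIN { printf "%.0f\n", amps * volts }'
}

amps_to_watts 16 245    # 16A at the typical 245V seen here: prints 3920
```

At a nominal 230V the same 16A would be 3680W, so the local mains voltage matters a little to where the line falls.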

What if PV diversion was permitted (if otherwise forbidden) when generating or exporting over ~4kW? Sampling recent PV generation it is evident that there are few such over-16A generation minutes, and they are rarely adjacent (unlike low-frequency events), so attempting to squash peaks this way would probably just add noise to the grid.

% cat /var/log/SunnyBeam/202507{1,2}?.log | awk '$2>=4000'
20250715T10:41Z 4180
20250715T11:12Z 4301
20250715T11:20Z 4339
20250715T11:22Z 4120
20250715T11:33Z 4182
20250715T12:10Z 4173
20250715T12:11Z 4005
20250715T12:36Z 4004
20250716T12:01Z 4123
20250716T12:06Z 4099
20250716T12:07Z 4006
20250717T10:52Z 4052
20250717T10:53Z 4052
20250717T10:54Z 4247
20250717T10:55Z 4241
20250717T10:56Z 4240
20250720T13:02Z 4099
20250720T13:05Z 4000
20250720T13:06Z 4230
20250720T13:15Z 4200
20250720T13:40Z 4067
20250721T12:06Z 4383
20250721T12:07Z 4465
20250721T12:31Z 4370
20250721T12:48Z 4128
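
To put rough numbers on "few" and "rarely adjacent", the same log lines can be summarised per day. This is a sketch only, assuming the two-column `timestamp watts` format shown above with one sample per minute; `over_limit_summary` is a hypothetical helper, not part of the existing tooling.

```shell
# Summarise minutes at or above a power limit from "timestamp watts" lines on stdin.
# Output per day: YYYYMMDD  minutes-over-limit  minutes-directly-following-another.
over_limit_summary() {
    awk -v limit="${1:-4000}" '
        ($2 + 0) >= (limit + 0) {
            day = substr($1, 1, 8)
            # Minute of day from the HH:MM part of a stamp such as 20250715T10:41Z.
            minute = substr($1, 10, 2) * 60 + substr($1, 13, 2)
            if (!(day in count)) order[++ndays] = day
            count[day]++
            if (day == prevday && minute == prevminute + 1) adjacent[day]++
            prevday = day
            prevminute = minute
        }
        END {
            for (i = 1; i <= ndays; i++) {
                d = order[i]
                printf "%s %d %d\n", d, count[d], adjacent[d]
            }
        }'
}

# Usage (not run here):
#   cat /var/log/SunnyBeam/202507{1,2}?.log | over_limit_summary 4000
```

Fed the listing above, 20250715 comes out as eight minutes with only one adjacent pair, while 20250717 is a single run of five consecutive minutes.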

2025-07-13: Taking FLAC

A silly amount of bandwidth was being used by bots repeatedly pulling down FLAC files. Without much hope, I added this to robots.txt to forbid spidering of any URL ending in .flac:

# DHD20250615: avoid spiders etc pulling down huge FLAC files; MP3 will do.
User-agent: *
Disallow: *.flac$
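
For reference, a rough sketch of how a well-behaved crawler might interpret such a rule: `*` matches any run of characters and a trailing `$` anchors the end of the URL path, otherwise a rule is a prefix match. This uses shell glob matching as a stand-in, so `?` and `[` in a rule would be misread, and `robots_rule_matches` is a hypothetical helper. I believe the RFC 9309 grammar has rule paths start with `/`, so `/*.flac$` may be the more portable spelling, but that is an assumption and not something tested here.

```shell
# Return success if a robots.txt Disallow rule would match the given URL path.
# Sketch only: handles the * wildcard and a trailing $ end-anchor.
robots_rule_matches() {
    rule=$1
    urlpath=$2
    case $rule in
        *'$') glob=${rule%?} ;;     # trailing $ means the path must end here
        *)    glob="$rule*" ;;      # otherwise the rule is a prefix match
    esac
    case $urlpath in
        $glob) return 0 ;;
        *)     return 1 ;;
    esac
}

# Usage:
#   robots_rule_matches '*.flac$' '/img/audio/example.flac' && echo blocked
```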

Remarkably, today the highest ranked FLAC bandwidth hog is at position 26. A mixture of .mp4 video and .mp3 audio files has taken most of the top spots.

That weekly bandwidth hog list (path, count, bytes) currently starts:

/img/video/20201112/20201112-EcoHomeLab-talk-on-smart-thermostatic-radiator-valves-TRVs.mp4 33 723168188
/img/a/v/20201112-EcoHomeLab-talk-on-smart-thermostatic-radiator-valves-TRVs.800x408.mp4 8 165929600
/img/school/BWEA_School_Pack.pdf 40 67323600
/img/audio/ambient/Herne-Bay/202107-Herne-Bay-ambient-sea-amusements-people.mp3 3 64409614
/img/audio/soundwalk/soundwalk-20200912.mp3L 6 64285640
/img/video/OpenTRV/OpenTRV-mashup-1.mp4 1 63580623
/img/audio/20210730-GTKN/20210730-GTKN-hardware-designer.mp3 7 57280816
/img/audio/ambient/travel/202101-parcel-en-route.mp3 2 55065945
/ 3926 50895430
/img/audio/202502-Hackathon17-Surrey-University/20250216-Hackathon17-why-start-up.mp3 4 45221337
/OpenTRV/talks/BCS/20151215-IoT-Avert-Disaster.odp 2 43891972
/_gridCarbonIntensityGB.html 3708 32697504
/img/audio/statscast/statscast-202005.mp3 4 26980399
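
A list in this (path, count, bytes) shape can be approximated straight from an access log. The sketch below assumes Apache common/combined log format, where whitespace-separated field 7 is the request path and field 10 is the response size in bytes; the real script behind the list above is not shown here and may differ, and `bandwidth_hogs` is a made-up name.

```shell
# Rank request paths by total bytes served, reading access-log lines on stdin.
# Assumes common/combined log format: $7 is the path, $10 the byte count.
# Optional first argument: how many rows to show (default 10).
bandwidth_hogs() {
    awk '$10 ~ /^[0-9]+$/ {
             hits[$7]++
             bytes[$7] += $10
         }
         END {
             for (path in hits) printf "%s %d %.0f\n", path, hits[path], bytes[path]
         }' |
    sort -k3,3nr -k1,1 |
    head -n "${1:-10}"
}

# Usage (log path is illustrative only):
#   bandwidth_hogs 13 < /var/log/apache2/access.log
```

Responses with no body (logged with `-` as the size) are skipped, so 304s from conditional requests do not inflate the hit counts.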

And nothing is pulling down /out/monthly/archive/* either. Good.

Update: oh dear, last week's top bandwidth-consuming files were almost entirely .flac, with many distinct bots (by UA) ignoring the new robots.txt directive...

Greedy RSS bots

As built this morning the greediest RSS feed-puller list was:

# Greedy podcast feed pullers: keys are MD5 hashed User-Agent.
# A non-empty txt map lookup of the %{md5:%{HTTP:User-Agent}} means bad!
# Built: 2025-07-13T10:00+00:00Z
# MAXHITSPERUAPERDAY: 25
# MAXUAS: 15
#----------------
# request-count User-Agent
# MD5hash approx-hits-per-day
#----------------
# 291 Podbean/FeedUpdat
54e0e9df937b06cc83fab29f44c02b7f 291
# 204 Spotify/1.0
4582d9bdbcef42af27d89da91c6eb804 204
# 137 Google-Podcast
8dea568b39db0451edd6b30f29238eaf 137
# 98 Amazon Music Podc
d69be2563c9f1929edf2906d41809aea 98
# 97 Gofeed/1.0
4a9d728c458902d6ff716779ff72841d 97
# 64 iVoox Global Podc
5651428ffa1960ddd9c4e5ff5ec4c270 64
# 61 iTMS
97f76eb7e02c5ff923e1198ff1c288cd 61
# 57 -
d41d8cd98f00b204e9800998ecf8427e 57
# 37 Mozilla/5.0 (Wind
6b9a00393fb1607b0ada13520f814ab5 37

Note that I had to block the lite feed in robots.txt from ordinary Googlebot to get rid of its (top) entry of >1000 last week.

Also note that probably none of the top five in this greedy list send me any listeners: it is all lazy, inefficient and ineffectual.

Taking some courage from [kroll2025behavior] I reduced the threshold from MAXHITSPERUAPERDAY: 25, which allowed a common but lazy pattern of one unconditional hit per hour, to 12. If any attention was being paid to eg FRB022 or FRB023 or FRB024, then nothing would be polling hourly. I also raised the number of distinct UAs throttled (MAXUAS) from 15 to 25 so as to send pushback more widely.

Now:

# Greedy podcast feed pullers: keys are MD5 hashed User-Agent.
# A non-empty txt map lookup of the %{md5:%{HTTP:User-Agent}} means bad!
# Built: 2025-07-13T10:32+00:00Z
# MAXHITSPERUAPERDAY: 12
# MAXUAS: 25
#----------------
# request-count User-Agent
# MD5hash approx-hits-per-day
#----------------
# 292 Podbean/FeedUpdat
54e0e9df937b06cc83fab29f44c02b7f 292
# 204 Spotify/1.0
4582d9bdbcef42af27d89da91c6eb804 204
# 138 Google-Podcast
8dea568b39db0451edd6b30f29238eaf 138
# 98 Gofeed/1.0
4a9d728c458902d6ff716779ff72841d 98
# 98 Amazon Music Podc
d69be2563c9f1929edf2906d41809aea 98
# 64 iVoox Global Podc
5651428ffa1960ddd9c4e5ff5ec4c270 64
# 61 iTMS
97f76eb7e02c5ff923e1198ff1c288cd 61
# 58 -
d41d8cd98f00b204e9800998ecf8427e 58
# 37 Mozilla/5.0 (Wind
6b9a00393fb1607b0ada13520f814ab5 37
# 24 Overcast/1.0 Podc
c8bf931c39e0b216181afc441001e58b 24
# 24 axios/1.6.8
b534882134248c9a5957e0c011a37037 24
# 23 fyyd-poll-1/0.5
0222c1d79b7a96a5bc998a674717f750 23
# 22 itms
2e7f714a929b3f52f3c094710819a99a 22
# 21 axios/1.8.1
d19b92b5b8c33eb88ecaa61a9ad6c0a0 21
# 18 PocketCasts/1.0 (
5caee5a0a53fcbcae25244d8770516ed 18
# 14 axios/1.9.0
ec26fa2ac1b5f15e39ed983bf34f32bc 14
# 13 Mozilla/5.0 (Maci
ca86b008fc3e2c3d301606d50a2f120e 13
# 12 Podcasts/1555.2.1
1155c008fa426b4f89c4a1e832eacaee 12
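
The shape of these lists can be reproduced with a short pipeline. This is a sketch under assumptions, not the actual build script: it takes one day's feed requests as one User-Agent string per line on stdin, relies on GNU `md5sum` being available, and the helper names `ua_md5` and `greedy_ua_map` are invented here. The inclusive threshold matches the last entry above (12 hits listed with MAXHITSPERUAPERDAY: 12), and an empty User-Agent is shown as `-` but hashed as the empty string, which is why its key is d41d8cd98f00b204e9800998ecf8427e.

```shell
# MD5 hex digest of a User-Agent string (no trailing newline hashed).
ua_md5() {
    printf '%s' "$1" | md5sum | cut -d' ' -f1
}

# Build map lines for the greediest User-Agents.
# stdin: one User-Agent per request; $1: min hits to list; $2: max UAs to list.
greedy_ua_map() {
    maxhits=${1:-12}
    maxuas=${2:-25}
    sort | uniq -c | sort -k1,1nr | head -n "$maxuas" |
    while read -r count ua; do
        [ "$count" -ge "$maxhits" ] || continue
        # Comment line: count and the UA truncated to 17 characters.
        printf '# %d %.17s\n' "$count" "${ua:--}"
        # Map line: hashed UA as key, hit count as the (non-empty) value.
        printf '%s %d\n' "$(ua_md5 "$ua")" "$count"
    done
}
```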

I have added a link to [kroll2025behavior] to the 429 error page, and duplicated that to a new 503 error page, though I have not yet hooked the latter into Apache's error response.

Noon export notch still going well

Figure: 13-day diversion-shifting snapshot of eheat (electricity for heat) and net grid flows, bucketed. Times UTC. Data and other views are available.

2025-07-10: Heating and SSES

Mid-summer and temperatures in the 30s (°C) is obviously an optimal time to tweak the heat-pump weather compensation adjustment schedule.

(Do not look at me in that tone of voice!)

Thus I am now looking at when there was actually a call for heat to the boiler ("b":1) seen in the OpenTRV data from January to March this year. (I am ignoring the effect of the clocks going forward at the end of March, as we turned the heating and the boiler controller off. So UTC times are good.)

(Maybe I should have looked at the data before devising the new SSES-friendly schedule ... stop scowling at me!)

% cat data/OpenTRV/pubarchive/remote/20250{1,2,3}.json.gz | gzip -d | egrep '"b":0' | wc -l
   11993
% cat data/OpenTRV/pubarchive/remote/20250{1,2,3}.json.gz | gzip -d | egrep '"b":0' | egrep 'T00:' | wc -l
     781
% cat data/OpenTRV/pubarchive/remote/20250{1,2,3}.json.gz | gzip -d | egrep '"b":1' | egrep 'T00:' | wc -l
      81
% cat data/OpenTRV/pubarchive/remote/20250{1,2,3}.json.gz | gzip -d | egrep '"b":1' | egrep 'T01:' | wc -l
      76
% cat data/OpenTRV/pubarchive/remote/20250{1,2,3}.json.gz | gzip -d | egrep '"b":1' | egrep 'T02:' | wc -l
      59
% cat data/OpenTRV/pubarchive/remote/20250{1,2,3}.json.gz | gzip -d | egrep '"b":1' | egrep 'T03:' | wc -l
      72
% cat data/OpenTRV/pubarchive/remote/20250{1,2,3}.json.gz | gzip -d | egrep '"b":1' | egrep 'T04:' | wc -l
      64
% cat data/OpenTRV/pubarchive/remote/20250{1,2,3}.json.gz | gzip -d | egrep '"b":1' | egrep 'T05:' | wc -l
     123
% cat data/OpenTRV/pubarchive/remote/20250{1,2,3}.json.gz | gzip -d | egrep '"b":1' | egrep 'T06:' | wc -l
     447
% cat data/OpenTRV/pubarchive/remote/20250{1,2,3}.json.gz | gzip -d | egrep '"b":1' | egrep 'T07:' | wc -l
     642
% cat data/OpenTRV/pubarchive/remote/20250{1,2,3}.json.gz | gzip -d | egrep '"b":1' | egrep 'T08:' | wc -l
     530
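
The per-hour counts above can also be gathered in a single pass rather than one command per hour. A sketch, assuming as the greps above do that each record is one line containing an ISO timestamp with `THH:` in it and the literal text `"b":1` when the boiler is being called; `calls_for_heat_by_hour` is a hypothetical helper.

```shell
# Count call-for-heat records ("b":1) by UTC hour, reading JSON lines on stdin.
# The hour is taken from the first THH: timestamp fragment on each line.
calls_for_heat_by_hour() {
    awk '/"b":1/ && match($0, /T[0-9][0-9]:/) {
             hour[substr($0, RSTART + 1, 2)]++
         }
         END {
             for (h in hour) printf "%s %d\n", h, hour[h]
         }' |
    sort
}

# Usage (not run here):
#   gzip -dc data/OpenTRV/pubarchive/remote/20250{1,2,3}.json.gz | calls_for_heat_by_hour
```

Hours with no calls for heat are simply absent from the output rather than shown as zero.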

Raising the flow temperature in the hour from rather than as previously in the hour from may make a noticeable difference.

Plans for the GB grid seem to be placing significant weight on load shifting (demand response) by pre-heating to reduce peak demands and other grid stress.

References

(Count: 3)