Earth Notes: On Website Technicals (2025-04)

Updated 2025-04-24.
Tech updates: attributes sorted, data repair, micro-opt, sub-pages, energy stats insert...
Consulting on considering a grass-roots movement to form a citizens' assembly to set terms for a Royal Commission to guide the forming of a ministerial task force to consider the scoping and creation of a focus group to sketch terms of reference for a study group to outline the agenda for a pre-meeting to form a steering group to get ready to think about upgrading the RPi server, easy does it...

2025-04-24: Energy Stats Page Insert Trim

To save some build time and page weight across multiple pages, for a feature that I suspect no reader is using, I have just adjusted the script that inserts energy stats in the footer of relevant pages (such as the microgeneration-related ones) so that it inserts only a link to the full energy series page.

That reduces an uncompressed example page by more than 20kB:

% ls -al measuring-appliance-consumption.html
113526 measuring-appliance-consumption.html
% make measuring-appliance-consumption.html
...
% ls -al measuring-appliance-consumption.html
 90079 measuring-appliance-consumption.html

That is ~3kB saved from the compressed live page too:

% ls -alS measuring-appliance-consumption.html*
113505 measuring-appliance-consumption.html
 27890 measuring-appliance-consumption.htmlgz
 23499 measuring-appliance-consumption.htmlbr
% make measuring-appliance-consumption.html*
% ls -alS measuring-appliance-consumption.html*
 90135 measuring-appliance-consumption.html
 24558 measuring-appliance-consumption.htmlgz
 20624 measuring-appliance-consumption.htmlbr
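For reference, the savings quoted above can be worked out from the listed sizes. This is a small sketch only; the saving helper is hypothetical and not part of the site's build scripts.

```shell
# Hypothetical helper: print bytes saved and percentage saved,
# given the before and after sizes in bytes.
saving() {
    awk -v before="$1" -v after="$2" 'BEGIN {
        printf "%d bytes (%.1f%%)\n", before - after, 100 * (before - after) / before
    }'
}

saving 113526 90079   # uncompressed HTML
saving 27890 24558    # gzip
saving 23499 20624    # brotli
```

With the sizes from the listings this reports 23447 bytes (20.7%) uncompressed, 3332 bytes (11.9%) for gzip and 2875 bytes (12.2%) for brotli.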

2025-04-18: Sub-pages

Some of the 'aggregate' and auto-generated pages on EOU are far too large, well beyond 100kBytes HTML before compression.

Prime candidates are the bibliography (~800kB) and the energy series dataset (~500kB). Other potential candidates for similar treatment are LED lighting, and the glossary, maybe even the site technicals, where each day could be broken out.

Each of these might benefit from a similar template solution. For each main page XXX.html there could be an optional flat XXX/ directory containing any number of named (or numbered) HTML pages that are 'sub-pages' of the main one.

They may get the same extra CSS styling as their main page, maybe also the preview and header image(s), but would otherwise be relatively light-weight and clearly 'satellite' in style, and link up to their main page and the site home only by default. (Lighter-weight versions for lite/offline pages should also be possible.)

Design assumptions

  • Most sub-pages will be small (and infrequently visited) so on-the-fly gzip compression will provide most of the available bandwidth savings.
  • Sub-pages will not need most of the complex machinery such as SECTION, and will be quicker to build and easier to manage.
  • Sub-pages will be as unchanging as possible, eg not containing timestamps, so that they can be selectively updated and get new timestamps as infrequently as reasonably possible.
  • Images (IMG) are likely to be needed in sub-pages, and that may force a lot of refactoring.

And we are off...

Having started updating the key build scripts, the site is going to be fully rebuilt over and over until I finish, so I should get on with it! Maybe fix a few bugs on the side.

I am starting to refactor some elements out of the monolithic wrap_art.sh, such as CSS file collation for a page, and moving the versions of which base/desktop/etc CSS files to use to the softparams.txt system.

Features

  • Each sub-pages directory gets an index.html page (marked noindex) that prevents the directory itself being served by Apache, and contains a link upwards to the main page; the file has no content that might change such as the main-page title.
  • Each sub-pages directory gets a MANIFEST.txt file containing one sub-page ID per line, preferably sorted, and where possible updated atomically and only if its content changes; the presence of a line/ID XXX indicates that there should be a XXX.html sub-page present.
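The MANIFEST.txt behaviour described above could be sketched roughly as follows. This is an illustration under assumptions, not the actual build script: the function names update_manifest and missing_subpages, and their argument conventions, are invented here.

```shell
# Sketch only (hypothetical names): write a sorted manifest for a
# sub-pages directory, replacing the old one atomically and only
# when the content differs.
# Usage: update_manifest DIR ID...   (at least one ID expected)
update_manifest() {
    dir=$1; shift
    # Temporary file in the same directory so that mv is a rename.
    tmp=$(mktemp "$dir/MANIFEST.txt.XXXXXX") || return 1
    # One ID per line, sorted and de-duplicated.
    printf '%s\n' "$@" | LC_ALL=C sort -u > "$tmp"
    # mktemp creates a private file; make it readable (assumption:
    # the manifest should be readable like other site files).
    chmod a+r "$tmp"
    if [ -f "$dir/MANIFEST.txt" ] && cmp -s "$tmp" "$dir/MANIFEST.txt"; then
        rm -f "$tmp"                       # Unchanged: keep old file and timestamp.
    else
        mv -f "$tmp" "$dir/MANIFEST.txt"   # Rename within one directory.
    fi
}

# List manifest IDs that have no corresponding XXX.html sub-page.
# Usage: missing_subpages DIR
missing_subpages() {
    while IFS= read -r id; do
        [ -f "$1/$id.html" ] || printf '%s\n' "$id"
    done < "$1/MANIFEST.txt"
}
```

Leaving an unchanged manifest untouched keeps its timestamp stable, which fits the aim of sub-pages changing as infrequently as reasonably possible.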

Bibliography down

There is now a skeletal set of glossary sub-pages, not yet exposed, because sub-pages do not yet support IMG (or references).

The bibliography diet is ready though (as an initial pass, subject to bugs and unexpected features):

779244 20 Apr 17:12 20250420-bibliography.html
177845 21 Apr 16:49 20250421-bibliography.html
 42405 21 Apr 16:53 bibliography.htmlgz
 35229 21 Apr 16:53 bibliography.htmlbr

This remains larger than the ~128kB target maximum page size. More tweaking:

% ls -al bibliography.html{,gz,br}
143683 21 Apr 17:02 bibliography.html
 41090 21 Apr 17:04 bibliography.htmlgz
 34421 21 Apr 17:04 bibliography.htmlbr

The 'lite' version is not far off, and offline squeaks in:

128251 21 Apr 18:47 m/bibliography.html
127717 21 Apr 19:22 .offline/bibliography.html

Not quite there, but good enough for now for this problem page at least!
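A check against the page-size target could be automated along these lines. This is a sketch only: the over_budget function and the MAX_PAGE_BYTES variable are hypothetical, and reading "~128kB" as 128000 bytes is an assumption.

```shell
# Hypothetical check: report any page above a size budget.
# Budget defaults to 128000 bytes (an assumed reading of ~128kB);
# override with MAX_PAGE_BYTES.
# Usage: over_budget FILE...
over_budget() {
    budget=${MAX_PAGE_BYTES:-128000}
    for f in "$@"; do
        # Some wc implementations pad the count with spaces.
        size=$(wc -c < "$f" | tr -d ' ')
        if [ "$size" -gt "$budget" ]; then
            printf '%s: %d bytes, %d over\n' "$f" "$size" "$((size - budget))"
        fi
    done
}
```

Under that assumed budget, the 128251-byte lite page above would be reported as 251 bytes over, while the 127717-byte offline page would pass silently.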

2025-04-15: Energy Series Micro-optimisation

In the energy series dataset display, each datum has a meter whose body contains a variable-length bar of + symbols as a fall-back for browsers that do not implement meter, such as lynx. That case applies to very few visitors. Currently the bar is 0 to 10 +s long, but many equivalents these days, eg star ratings, run only up to 5.

So in the script that generates the tables I have trimmed this to be 0 to 5. This will save a tiny bit of bandwidth, most people will not see the change, and for those that do the result may be more countable and salient.
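As an illustration of the idea, a fall-back bar scaled to at most five + symbols might be generated like this. It is a sketch under assumptions: the names plus_bar and meter_html, the round-to-nearest scaling and the exact meter markup are all invented here and may differ from the real table-generating script.

```shell
# Sketch (hypothetical names): emit a bar of '+' symbols scaled so
# that VALUE out of FULL maps onto 0..MAX symbols (MAX defaults to 5).
# Usage: plus_bar VALUE FULL [MAX]
plus_bar() {
    awk -v v="$1" -v full="$2" -v max="${3:-5}" 'BEGIN {
        n = int(v / full * max + 0.5)   # Round to nearest whole symbol.
        if (n < 0) n = 0
        if (n > max) n = max
        s = ""
        for (i = 0; i < n; i++) s = s "+"
        print s
    }'
}

# Wrap the bar in a meter element; the attributes used are an assumption.
# Usage: meter_html VALUE FULL
meter_html() {
    printf '<meter max="%s" value="%s">%s</meter>\n' "$2" "$1" "$(plus_bar "$1" "$2")"
}
```

Browsers that implement meter ignore the element body, so only text-mode browsers such as lynx would show the shorter bar.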

% ls -alS energy-series-dataset.html{,gz,br}
-r--r--r--  1 dhd  staff  548189 15 Apr 07:58 energy-series-dataset.html
-r--r--r--  1 dhd  staff   42149 15 Apr 15:58 energy-series-dataset.htmlgz
-r--r--r--  1 dhd  staff   29744 15 Apr 15:58 energy-series-dataset.htmlbr
% make energy-series-dataset.html{,gz,br}
% ls -alS energy-series-dataset.html{,gz,br}
-r--r--r--  1 dhd  staff  541786 15 Apr 16:07 energy-series-dataset.html
-r--r--r--  1 dhd  staff   41728 15 Apr 16:07 energy-series-dataset.htmlgz
-r--r--r--  1 dhd  staff   29235 15 Apr 16:07 energy-series-dataset.htmlbr

Yes, I am procrastinating: how could you tell?

2025-04-14: Enphase Data Repair

It seems like many of the Enphase monthly net energy report files have been full of errors. I noticed this when seeing suspect battery monthly discharge summaries. Note the missing final columns.

% gzip -d < data/16WWHiRes/Enphase/adhoc/net_energy_202501.csv.gz | head
Date/Time,Energy Produced (Wh),Energy Consumed (Wh),Exported to Grid (Wh),Imported from Grid (Wh)
2025-01-01 00:00:00 +0000,0,14,0,18
2025-01-01 00:15:00 +0000,0,30,0,34
2025-01-01 00:30:00 +0000,0,28,0,32
2025-01-01 00:45:00 +0000,0,62,0,66
2025-01-01 01:00:00 +0000,0,45,0,49
2025-01-01 01:15:00 +0000,0,28,0,32
2025-01-01 01:30:00 +0000,0,32,0,36
2025-01-01 01:45:00 +0000,0,31,0,35
2025-01-01 02:00:00 +0000,0,30,0,34
% head ~/Downloads/14XXXXX_monthly_energy_report.csv
Date/Time,Energy Produced (Wh),Energy Consumed (Wh),Exported to Grid (Wh),Imported from Grid (Wh),Stored in batteries (Wh),Discharged from batteries (Wh)
2025-01-01 00:00:00 +0000,0,14,0,18,4,0
2025-01-01 00:15:00 +0000,0,30,0,34,4,0
2025-01-01 00:30:00 +0000,0,28,0,32,4,0
2025-01-01 00:45:00 +0000,0,62,0,66,4,0
2025-01-01 01:00:00 +0000,0,45,0,49,4,0
2025-01-01 01:15:00 +0000,0,28,0,32,4,0
2025-01-01 01:30:00 +0000,0,32,0,36,4,0
2025-01-01 01:45:00 +0000,0,31,0,35,4,0
2025-01-01 02:00:00 +0000,0,30,0,34,4,0

Inspection of the captured files shows that all such captured reports from August last year to January this year were/are missing the final two columns. I shall have to eyeball for this in future, or maybe set up an automated check.
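One possible shape for such an automated check is sketched below. The function name check_enphase_csv is hypothetical, and the check assumes that a complete report always ends its header with the "Discharged from batteries (Wh)" column, as in the good example above.

```shell
# Sketch (hypothetical name): succeed only if a gzipped Enphase report
# has the battery columns, ie its header ends with the expected final
# column and every row has as many fields as the header.
# Usage: check_enphase_csv FILE.csv.gz
check_enphase_csv() {
    gzip -dc < "$1" | awk -F, '
        { sub(/\r$/, "") }              # Tolerate CRLF line endings.
        NR == 1 { cols = NF; ok = ($NF == "Discharged from batteries (Wh)") }
        NR > 1 && NF != cols { ok = 0 } # Ragged row: treat as damaged.
        END { exit !ok }                # Empty input also fails.
    '
}
```

It could then be run over the captured files, for example `for f in data/16WWHiRes/Enphase/adhoc/net_energy_*.csv.gz; do check_enphase_csv "$f" || echo "SUSPECT: $f"; done`.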

Re-fetching January's data returned an apparently complete set. I have manually re-fetched and replaced the other five damaged months' files. I also rebuilt the net_energy_2024.csv.xz full-year file. The charge/discharge monthly and yearly graphs look much more plausible now (not just zeroes)!

2025-04-13: HTML Sorted Attributes

One method to try to minimise the size of compressed HTML on the wire is to sort the attributes in each element in a consistent (eg lexical) order.

When I cared about posting articles to Twitter I could not safely sort, because Twitter would fail to read important meta tags in the head unless the attributes were ordered a certain way.

I still use some Twitter tags in the head to help (eg) Mastodon provide previews, but I should no longer need to work round buggy attribute ordering issues for other users of those tags.

So prompted by a Fediverse thread I reinstated attribute ordering in the HTML5 minification script.

I expect this in general to save ~1% of gzip- and br-compressed HTML, though sometimes it might undo careful manual ordering and cause some expansion:

% ls -lS 4{04,29}.html{,gz,br}
   881 429.html
   834 404.html
   505 429.htmlgz
   483 404.htmlgz
   392 429.htmlbr
   337 404.htmlbr
% make 4{04,29}.html{gz,br}
% ls -lS 4{04,29}.html{,gz,br}
   881 429.html
   834 404.html
   503 429.htmlgz
   482 404.htmlgz
   353 429.htmlbr
   334 404.htmlbr

% ls -lS note-on-site-technicals-94.html{,gz,br}
 21808 note-on-site-technicals-94.html
  7081 note-on-site-technicals-94.htmlgz
  5968 note-on-site-technicals-94.htmlbr
% make note-on-site-technicals-94.html{,gz,br}
% ls -lS note-on-site-technicals-94.html{,gz,br}
 21919 note-on-site-technicals-94.html
  7083 note-on-site-technicals-94.htmlgz
  5988 note-on-site-technicals-94.htmlbr

% ls -lS energy-series-dataset.html{,gz,br}
547826 energy-series-dataset.html
 42273 energy-series-dataset.htmlgz
 29381 energy-series-dataset.htmlbr
% make energy-series-dataset.html{,gz,br}
% ls -lS energy-series-dataset.html{,gz,br}
547831 energy-series-dataset.html
 42186 energy-series-dataset.htmlgz
 29531 energy-series-dataset.htmlbr

% ls -lS bibliography.html{,gz,br}
775365 bibliography.html
 98856 bibliography.htmlgz
 74714 bibliography.htmlbr
% make bibliography.html{,gz,br}
% ls -lS bibliography.html{,gz,br}
775424 bibliography.html
 98867 bibliography.htmlgz
 74762 bibliography.htmlbr