Earth Notes: On Website Technicals (2020-10)
Updated 2020-10-28 22:14 GMT.
2020-10-17: Ancient History
Doesn't really belong here, but c/o TheOldNet.com, ExNet's home page circa 1995/1996!
Swirly background ahoy, and a Java applet that most browsers will decline to show these days...
I have included my ORCID link in author metadata in each page.
I think that the main benefit will accrue to the ~13 Datasets that refer to that metadata as the
When computing readability of articles, I use
unfluff. The latter sometimes discards most of the content, resulting in whacky scores.
I have given textract a little spin. It's a little slow but spits out decent plain text given my HTML core source.
(I'm having some difficulty getting it to install properly on the Mac, as I did with the Google AMP validator, for example.)
The code change for checking would be from reado --unfluff to textract | reado, though there would be some complications...
I haven't yet found a compelling improvement, but I may reassess that!
2020-10-08: HTTPS Dataset Page Canonical
For all pages containing a schema.org/Dataset, I have hardwired the https://www. version to be canonical. This avoids the current confusion, with both the http and https pages claiming to be the canonical copy.
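A minimal sketch of what such hardwiring might look like, in Python. The host name www.example.org and both function names are hypothetical, and dropping any query string and fragment is an assumption, not necessarily what the real page-build script does.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical values for illustration only.
CANONICAL_SCHEME = "https"
CANONICAL_HOST = "www.example.org"


def canonical_url(url: str) -> str:
    """Map any http/https (or alternate-host) variant of a page to one canonical URL.

    Assumption: only the path matters; query string and fragment are dropped.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    return urlunsplit((CANONICAL_SCHEME, CANONICAL_HOST, path, "", ""))


def canonical_link_tag(url: str) -> str:
    """Render the <link rel="canonical"> element for the page head."""
    return f'<link rel="canonical" href="{canonical_url(url)}">'


if __name__ == "__main__":
    # Both variants of the same page yield the same canonical tag.
    print(canonical_link_tag("http://example.org/data/index.html"))
    print(canonical_link_tag("https://www.example.org/data/index.html"))
```

With this, the http and https copies of a page both name the same https://www. URL as canonical, so neither claims the role for itself.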
I also made the survey results page a Dataset in its own right!
Also, for those datasets that are under data/, I have flagged them as isPartOf the main 16WW dataset. The reverse hasPart relationship appears to be rejected by Google at the moment, eg by the Structured Data Testing tool with:
CreativeWork is not a known valid target type for the hasPart property.
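As a rough illustration of the child-to-parent linkage, here is a sketch in Python that builds JSON-LD for a sub-dataset. The function name, the example URLs and the choice of fields are all hypothetical; the only points taken from the notes above are that the child carries isPartOf and that no hasPart is emitted on the parent.

```python
import json
from typing import Optional


def dataset_jsonld(name: str, description: str, url: str,
                   parent_url: Optional[str] = None) -> dict:
    """Build a schema.org Dataset description as a JSON-LD dict.

    If parent_url is given, the dataset is flagged as isPartOf that parent.
    The reverse hasPart link is deliberately never generated.
    """
    doc = {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
    }
    if parent_url is not None:
        doc["isPartOf"] = {"@type": "Dataset", "url": parent_url}
    return doc


if __name__ == "__main__":
    # Hypothetical child dataset living under data/.
    child = dataset_jsonld(
        "Example sub-dataset",
        "Illustrative child dataset.",
        "https://www.example.org/data/child/",
        parent_url="https://www.example.org/data/",
    )
    print(json.dumps(child, indent=2))
```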
2020-10-06: Is HTTPS Fast Yet?
It's clear amongst the volatility in "Interactive" and "Speed Index" values that https is consistently slower for visitors, by ~150ms, even as here where the client and server are both in London.
This is basically the https negotiation time from cold.
Maybe gains from HTTP/2 (h2) mean that some metrics are a bit less volatile, and page complete is only up by ~100ms (~510ms vs ~410ms).
These numbers are a mixture of mobile and desktop renderings of the same desktop home page, over 'cable', eg for users coming in over WiFi from home. Desktop numbers are showing as ~560ms, mobile ~460ms, to page complete.
2020-10-05: Image Size Smaller Than Recommended
I received a warning from GSC for one AMP page:
Image size smaller than recommended size.
It doesn't say which image, and as I have multiple ImageObject alternates for that page now, I think what it really means is "none of the images that you list with schema.org ImageObject markup is large enough" at the point that the page was last checked, many days ago.
According to the Google developer guidelines for Article, there should be one or more schema.org ImageObjects (or URLs) with images representative of the article.
The guidelines suggest minimum width and area, and the warning I got was with reference to those, so GSC claims.
The relevant dimensions are different for normal and AMP (non-story) pages.
For normal (non-AMP) pages
Images should be at least 696 pixels wide and they should be a minimum of 300,000 pixels when multiplying width and height, with aspect ratios 16x9, 4x3, and 1x1.
For AMP non-story pages
Images should be at least 1200 pixels wide and be a minimum of 800,000 pixels when multiplying width and height, with the same aspect ratios.
The images that I mark up at the various aspect ratios as ImageObjects are drawn from the same set that I select the top-of-page hero banner from. As I want (at least) 800px for the banner, the non-AMP minimum width is not much of a restriction in practice.
I have these limits (or higher) already coded into the page-build script. I am tweaking this to warn about not meeting the higher width just for the AMP page builds, and will continue to provide info for others. (The minimum area suggestions seem to be weaker.) That will generate some more noise, but at least it should be in sync with GSC complaints.
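The check described above might be sketched roughly as follows, in Python. The function and constant names are hypothetical and this is not the actual page-build script; the thresholds are just the width and area figures quoted above.

```python
# (minimum width in px, minimum area in px^2), keyed by whether the build is AMP.
# Figures are those quoted from the guidelines above.
LIMITS = {
    False: (696, 300_000),   # normal (non-AMP) pages
    True: (1200, 800_000),   # AMP non-story pages
}


def image_warnings(width: int, height: int, amp: bool) -> list:
    """Return human-readable warnings for an image below the suggested limits."""
    min_width, min_area = LIMITS[amp]
    warnings = []
    if width < min_width:
        warnings.append(f"width {width}px is below the suggested {min_width}px")
    area = width * height
    if area < min_area:
        warnings.append(f"area {area}px is below the suggested {min_area}px")
    return warnings


if __name__ == "__main__":
    # An 800x450 (16x9) banner passes for non-AMP but not for AMP.
    for amp in (False, True):
        kind = "AMP" if amp else "non-AMP"
        for w in image_warnings(800, 450, amp) or ["OK"]:
            print(f"{kind}: {w}")
```

Note that a 1200px wide image at 16x9 is 1200x675, ie 810,000 pixels, so meeting the AMP width already meets the AMP area at all three aspect ratios, which fits the area limit being the weaker of the two.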