Earth Notes: On IoT Communications Backhaul (2015)

Updated 2023-09-22.
How small Internet-of-Things (IoT) devices should talk to one another. #IoT #comms

Overview

(Part of WP1 Research, D17 Communications. Which radio technologies to use, what concentrators to use. One box (in a bus shelter) with sensors, microcontroller and radio module. This box transmits via radio to the concentrator or gateway (one of our boxes, on top of a building / indoors, probably next to a router) that connects to the Internet. Output is a prioritised list of RF backhaul solutions. ED - Research what data models and wire formats would be suitable to receive data on a central platform, taking into account the security requirements in WS11.)

In the world of the Internet of Things, and particularly as relates to OpenTRV's IoT Launchpad project running from April 2015 to March 2016, we are validating alternative pluggable comms solutions (ie 'bearers') to get data from sensor nodes (much traffic might be one-way) to some sort of concentrator/relay at the edge of the Internet, for fan-out and onward routing to real-time consumers, databases, etc, etc.

This flavour of data flow is also known as "M2M" (machine-to-machine).

We want it to be easy for IoT deployers and developers simply to slot in an appropriate radio module for their use case and not worry about coverage, reliability, security, etc. Applications range from deployment in homes (eg OpenTRV's intelligent radiator valves to save money and carbon, and the IoT Launchpad project's building health monitoring) through "smart city" urban (eg real-time tracking of bus-stop footfall) to rural applications that may have sparse coverage by conventional comms networks...

In many cases we assume that a fit-and-forget solution not needing mains power or regular maintenance or physical plumbing into other infrastructure (eg comms) will keep costs down and expand the range of deployments that become practical for IoT.

The simplest uniform scalable version of the network design is as below:

[Diagram: the simplest uniform scalable network design, from leaf nodes over RF to a concentrator/distributor and on to Internet data sinks.]

On the left are various distance scales, from home heating control and automation over typically at most 10s of metres, up through building health monitoring for offices over maybe 100s of metres, urban-scale deployments such as footfall monitoring at bus stops (~kilometres), and sparse rural deployments (transport, environmental, etc), right up to emergency deployments in disaster areas dispersed rapidly over a large area and controlled from far away.

In general all of these leaf nodes (eg sensors) are likely to talk wirelessly, at least for simple, quick, cheap deployments, to some kind of receiver and concentrator.

Immediately behind or part of that concentrator is an IP-enabled distributor, typically something of the power and complexity of a Raspberry Pi, with the possibility of remote management. It is connected to the Internet and securely fans out data to a number of listeners, with different routes for mainly-private data and mainly-public data, and further variations depending on how much massaging and integration the data needs to be useful. If the data can be used as-is or with simple low-state transformation, eg from a bus-shelter sensor to a tweet warning people away from a packed one, then the work can be done at the redistributor itself. The distributor is largely stateless in terms of the data flowing through it, though may, for example, merge data over a small time window, or look for unexpected outages to help with estate management. Otherwise, and for long-term storage and analytics, the data will be routed to/via a provider of analytic and other services.

(Java/Kura/OSGi is one way to deploy modular customised edge logic, eg for format conversion from highly-compact custom over-the-air binary to JSON/SenML, though nothing says that the source language need be Java!)
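
As a concrete (hypothetical) illustration of that edge conversion, here is a minimal sketch, in Python rather than Java purely for brevity, that unpacks an invented 8-byte binary frame and re-emits it as SenML-style JSON. The frame layout, field names and node ID are illustrative assumptions, not the real OpenTRV over-the-air format.

    import json
    import struct

    def frame_to_senml(frame: bytes) -> str:
        # Hypothetical 8-byte layout: node ID (uint16), temperature in 1/16ths C (int16),
        # RH% (uint8), battery in tenths of a volt (uint8), sequence number (uint16).
        node_id, temp16, rh, batt_dv, seq = struct.unpack(">HhBBH", frame)
        records = [
            {"bn": f"node-{node_id:04x}/", "n": "temperature", "u": "Cel", "v": temp16 / 16.0},
            {"n": "humidity", "u": "%RH", "v": rh},
            {"n": "battery", "u": "V", "v": batt_dv / 10.0},
        ]
        return json.dumps(records)

    # eg node 0x819c reporting 18.5 C, 55 %RH, 2.6 V, frame sequence 42:
    example = struct.pack(">HhBBH", 0x819C, int(18.5 * 16), 55, 26, 42)
    print(frame_to_senml(example))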

For reverse data flows back up to the leaf nodes, eg to actuators or to poll or for minor reconfiguration (sub-TR-069), it may be possible to push data directly to the redistributors, and this may be the best route where latency needs to be minimised. Or it may be preferable to reduce Internet-exposed attack surface by having a smaller number of machines on the scale of TR-069 config servers (possibly even the same machines) which mediate all reverse data flows; thus redistributors would only have to accept inbound connections from them and could even be firewalled against all others.

This forms a very flat scalable/distributed network with a number of leaves (up to hundreds) directly connected over some appropriate flavour of RF for the distance scale, managed by a relatively small (say) Linux concentrator box with full IPv4/IPv6 smarts on the other side, which can itself be managed from upstream if needed, in parallel with many other similar clusters. That concentrator is capable of talking to the main private and public data sinks itself, eg maybe 10s of services from analytics to Twitter/Bluemix and other data brokers with high fan-out to end users and (say) a transport operator's inbound feeds.

For (say) huge offices or transport networks many of those concentrators can be run in parallel, independently of one another.

Key management/sharing and partitioning so that parallel concentrators' fiefdoms don't clash is another story.

Note that though a basic concentrator/distributor mechanism has been created by OpenTRV, tools such as Apache Storm, Apache Camel, and Eclipse Kura could probably provide matching or superior functionality.

See Bruno's Kura review (also here).

See Bruno's Apache Camel review (also here).

See Bruno's Eclipse Krikkit review (also here).

See Bruno's Apache Storm review (also here).

See Bruno's Eclipse SmartHome review (also here).

Further, Bruno suggests (2015-07-31) re SmartHome: not for immediate use, but it got me thinking about the overall architecture, and I think I'd want to do it this way:

  1. Re-build our current code using Kura, making sure we decouple:
    • Device handling
    • Backhaul communication
    • Routing between device and backhaul
    • Configuration management
  2. Create a simple default routing component (Apache Camel is a good candidate) that would suit the use cases we have for the IoT Launchpad project.
  3. Create a configuration management component that implements the TR-069 protocol.
  4. At a future date, implement glue code with SmartHome (and possibly other frameworks) to provide more advanced routing and device management.

Use Cases

This section briefly enumerates some of the possible Launchpad and other use cases that this architecture has to service.

These use cases cover escalating physical scales, and often also increased estate complexity (number of units managed), and possibly also rising reliability requirements (though that is less clear).

Home (~10m): Heating / Automation

Example: OpenTRV's intelligent radiator valves calling for heat from the central heating boiler from around a house. FS20, OOK, 868MHz as at 2014.

Scope/size: within a home or small office for heating control (eg OpenTRV) or home automation or security (eg monitoring vulnerable elderly). Some sensors/actuators with easy access to mains power, others not; batteries and energy harvesting plausible alternatives. System complexity of the order of 10s of devices, and management including (say) physical pairing is plausible.

Inbound: ISM radio bands (eg 434MHz, 868MHz, 2.4GHz, 5GHz all plausible), meshing may not be important but some relaying may be (eg in houses with thick walls or internal foil vapour control or other bulk metallic elements). TinyHAN is a candidate non-meshed bearer/MAC (Medium Access Control). ZigBee and Z-Wave are examples of well-known mesh protocols. Current OpenTRV home heating application and tech uses 868.35MHz 0.1% duty cycle with FS20 carrier/encoding, and two-byte unique unit ID. (Also think about EnOcean ASK + protocol.)

Concentrator/distributor: often can be on-site and (eg) plugged into home broadband router or connected by WiFi if a suitable UI is available. May also be able to service some systems out-of-building eg over SIGFOX or LoRa network for lower-bandwidth roles such as activity monitoring.

Security: with a small number of devices on the market, open/unsecured protocols such as FS20 are not a disaster, but if installed in the millions they would at least invite pranks or worse (eg disabling heating in sheltered accommodation) from 'drive-by' RF-based attacks. The Internet side must be fully secured from the outside, and the concentrator/distributor must be powerful enough to support that and be (remotely, automatically) upgraded to fix evolving security threats.

Outbound: some uses will not require any outbound distribution at all, some may require limited logging and control eg via smartphone/WiFi within the building, some may want to expose that control outside the home, and some may wish to feed/accept data/control to/from third party systems for monitoring/control/billing and other purposes. Publishing microgeneration stats (non-sensitive) or calling a mobile phone when activity is detected (sensitive) are existing understood cases, and can be brokered via (say) Twitter for human-readable messages or opensensors.io for raw data, or maybe by EnergyDeck for energy management for social housing. Some of these may be slightly stateful for local use, with any complex operation or data storage (within a user's privacy comfort zone) handed off to the cloud.

Other: there may also be numerically-small but intellectually-interesting geeky applications such as contributing real-time weather/environmental data and controlling more esoteric devices remotely and/or interfacing with general home automation. The redistributor may be combined with or hosted by OpenHAB.

Note: this is not officially one of the Launchpad verticals, but should be reasonably serviced by the Launchpad technology outputs.

Home Data Traffic Estimate

Here is a simple worked estimate of radio bandwidth (in-home and outbound) for an OpenTRV-enabled home with some valves and some pure sensors, possibly with stats being relayed off-site and little or no reverse traffic, based on typical OpenTRV leaf radio behaviour as of the end of 2015 (a short script checking this arithmetic follows the list):

  • Assume a typical target home has a total of 5 (five) OpenTRV radiator valves and pure sensors plus the boiler hub reporting boiler status for example.
  • Assume each transmits frames at a fixed/maximum rate, and packs what it can into those outgoing frames (so if there are more stats then it takes longer to rotate through them for transmission for example).
  • Assume that OpenTRV secure frames have a content of about 64 bytes (including the few bytes of the secureable frame's header and trailer), and in particular that frame size does not vary (much) with content.
  • Suppose that each valve sends a valve position or (no) heat-call every 2 (two) minutes; this may have stats piggybacked on it.
  • Suppose that each sensor sends on average 1.5 stats messages each 4 minutes (basic 4 minute cycle plus extra optional message at randomised time to help avoid collisions).
  • That implies roughly one message each 2 (two) minutes from each device, or 3 messages per minute across all valves/sensors.
  • To avoid breaking the 1% duty cycle in ISM band 48 (eg 868.35MHz), total TX time cannot exceed 60s/100 = 0.6s per minute (ie 0.6s/msg at one message per minute), including overheads, double TXes, etc, and assuming no reverse traffic.
  • At 3 messages per minute that allows at most 200ms of TX per message, so with 64 bytes (512 bits) per frame the minimum TX bit rate is nearly 3kbps.
  • At 3 messages per minute and 64 bytes per frame, that implies ~4300 frames/day or ~280kBytes/day, or a monthly bandwidth (excluding overheads) of ~8.5MByte.
  • Assume that a stats hub or lightweight relay may forward all valve and sensor messages if permitted to send any, and is not able to filter, so that entire 8.5MB/month represents external bandwidth usage.
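
The arithmetic above can be reproduced with a few lines of Python; every constant below is simply one of the assumptions listed, not a measured value.

    # Sketch reproducing the worked home-traffic estimate above.
    FRAME_BYTES = 64                          # secure frame size, header/trailer included
    DEVICES = 6                               # 5 valves/pure sensors plus the boiler hub
    MSGS_PER_MIN = DEVICES * (1 / 2.0)        # ~1 message per device per 2 minutes = 3/min

    duty_cycle = 0.01                         # 1% duty-cycle limit in this sub-band
    tx_budget_s_per_min = 60 * duty_cycle     # 0.6s of TX allowed per minute
    max_tx_s_per_msg = tx_budget_s_per_min / MSGS_PER_MIN   # 0.2s per message
    min_bit_rate = FRAME_BYTES * 8 / max_tx_s_per_msg       # ~2560bps, "nearly 3kbps"

    frames_per_day = MSGS_PER_MIN * 60 * 24   # ~4320 frames/day
    bytes_per_day = frames_per_day * FRAME_BYTES             # ~276kB/day
    mb_per_month = bytes_per_day * 30 / 1e6   # ~8.3MB/month (text rounds to ~8.5MB)

    print(f"{MSGS_PER_MIN:.0f} msgs/min, min bit rate ~{min_bit_rate:.0f} bps, "
          f"~{frames_per_day:.0f} frames/day, ~{mb_per_month:.1f} MB/month")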

Office (100m+): Building Health

Example: per-room or finer-grain footfall and light/temperature/CO2/RH% monitoring in an office block.

Scope/size: Within a single large office building or a group of buildings, or similar such as a small university campus or dorms. Potentially thousands of units including energy (sub)meters and environmental sensors.

Inbound: nominally within the reach of (say) 100mW ISM 433/868MHz for example, but probably too many walls and too many units online for a naïve implementation with a single concentrator. 2-byte on-air random node IDs may have unacceptably-high collision rates when deploying thousands of nodes (eg think birthday paradox). OpenTRV units currently have an 8-byte (64-bit) internal ID of which ~56 bits are random, so more of those bits should probably be exposed in this environment. Also, with blind transmissions (no listen before/during transmit and no 'slots') the maximum utilisation will be less than the 1/(2e) (~18%) of pure ALOHA, which, with the maximum 0.1% duty cycle that OpenTRV currently uses in its ISM band, implies 18% × 1000 or ~180 live nodes in the system without some sort of management such as listen-before, slotting, time division, power control for cellular-like behaviour and multiple concentrators, and extreme parsimony of transmission bandwidth. Some limited two-way working, at least of metadata, may be helpful; TinyHAN-style beaconing may be part of the solution.
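
A minimal sketch of those two capacity concerns, using only the figures quoted above (a 2-byte/65536-value ID space, a 0.1% per-node duty cycle, and the textbook 1/(2e) pure-ALOHA bound):

    import math

    def id_collision_probability(nodes, id_space=2 ** 16):
        # Birthday-paradox approximation: P(at least one clash among 'nodes' random IDs).
        return 1 - math.exp(-nodes * (nodes - 1) / (2 * id_space))

    for n in (100, 500, 1000, 2000):
        print(f"{n:5d} nodes: ~{id_collision_probability(n):.0%} chance of a 2-byte ID clash")

    aloha_peak_utilisation = 1 / (2 * math.e)   # ~18% for pure (unslotted) ALOHA
    per_node_duty_cycle = 0.001                 # 0.1% TX duty cycle per node
    max_live_nodes = aloha_peak_utilisation / per_node_duty_cycle
    print(f"~{max_live_nodes:.0f} blindly-transmitting nodes per channel before saturation")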

Concentrator/distributor: multiple concentrators managing sub-groups of nodes, preferably as radio 'cells', may be useful.

Outbound: connectivity likely to be to private LAN not public Internet, and the data is likely to be private and thus not directly routed to public data brokers such as opensensors.io or Twitter, but retained on the organisational LAN and/or forwarded to agents such as EnergyDeck and facilities management consoles for analysis and storage. Digested/anonymised/time-delayed data may get pushed to public sinks, with possibly very small amounts of non-sensitive public-interest real-time data such as weather/environmental info. (One potential issue here is how the data gets offsite: dealing with IT departments can be tricky, so this sometimes requires a dedicated internet connection for the metering data, or a SIM-based solution to extract data. It also raises issues about placement of gateways throughout a building, relays, etc.)

Other: this may be a fairly heterogeneous set of users and data sets, and at the larger end, eg a large corporate or university campus, may shade up to the 'urban' in terms of appropriate backhaul though possibly with higher data rates.

Bruno's (EnergyDeck CTO) note on D17, 2015-06-07...

# D17: Communication #

Communication between devices and data platform.

## Transport ##
- HTTP
- MQTT

## Payload ##
- JSON (SenML or derivative?)

Another possible application: monitoring of power flows in an off-grid PV/wind system; this needs to be low (parasitic?) power and probably low density, but distances may be more like an office campus.

This is a vertical officially within the remit of the Launchpad project.

COHEAT Radio Performance Notes

The COHEAT site has many characteristics in common with a small office building for radio purposes. This deployment has ~150 radiators each controlled by an ELV/FHT8V valve managed by an OpenTRV REV9 relay with the REV9s controlled from nominally a REV2+RPi per household. That makes for ~300 transmitters on the same (FHT8V/FS20) carrier which was expected to be pushing things, so please note feedback and analysis as of 2016-01-12:

Urban (1km+): Bus Shelter Footfall

Example: footfall and temperature monitoring in a central London bus shelter, with the primary/central antenna up to ~4km away.

Scope/size: sensors and/or actuators deployed across a town or city, often without line-of-sight, in the "smart city" arena. Might be a local authority or transport operator or an arm of a large dispersed corporation tracking dispersed (or mobile) assets. Generally no sensor very far from network connectivity such as GSM/cellular or even broadband.

Inbound: candidates GSM/cellular, LoRa, Telensa/EnTalk/similar eg based on streetlight meshed networks, SIGFOX or other carefully-crafted ISM 169/433/868MHz use, WiFi to nearest buildings in selected locations.

Concentrator/distributor: for simple deployments with a single project-placed antenna, co-location with the antenna/transceiver may make most sense. In other cases the distributor may need to connect over the public Internet to a smart-ish virtual concentrator (such as the synthesised output from multiple antennae/masts in a 2- or 3-antenna LoRa deployment, or from (say) GSM or SIGFOX), in which case the distributor could also be virtual and in the cloud, for example.

Outbound: almost always directly across the public Internet for public and private consumption, archiving and analytics, though possibly filtered in some simple way to some destinations (eg not tweeting to the world a lone bus shelter user in the early hours of the morning). Quite possibly some more 'industrial scale' data sinks such as Bluemix, and specialist ones such as a transport operator's system or TransportAPI.

Other: it is likely that more of the data from this case than any other, both real-time and not (eg archived or with rolling analysis), will end up open/public. The expectation is that all this data is essentially telemetry, eg higher-bandwidth data streams (such as CCTV images of bus shelters from traffic systems) will where available take different routes, and be coordinated downstream as required.

This is a vertical officially within the remit of the Launchpad project.

Rural (10km+): Environmental and Transport

Example: footfall and environmental monitoring and simple information signage (ie data flow from the centre) in rural bus shelters, with the primary/central antenna up to ~20km away.

Scope/size: dispersed over a wide area such as rural sensor networks, longer-distance transport such as non-urban buses and trains and taxis, patchy coverage by existing radio networks and possible high maintenance costs once units are installed. Probably mainly-public data.

Inbound: candidates include LoRa, Telensa, GSM/cellular, and possibly custom HF (licensed) channels for example. In some cases GSM will not be reliable, and in others WiFi (or an ISM-based hop to WiFi) to nearby Internet connections may be available.

Concentrator/distributor: possibly one at each local connection to public Internet (eg over WiFi, eg ISM to concentrator then over WiFi to broadband), and one at (say) LoRa antennae, then the rest in cloud-based (or racked) concentrators pulling data in from GSM/etc telcos and filtering/transforming and pushing out to analytics and data sinks.

Outbound: probably mainly-open data, available in real-time and cumulatively, sometimes with analytics, via variety of routes such as Twitter/opensensors/EDX/Bluemix plus (say) transport operator and local authority data feeds. Most of this will flow across the public Internet.

Other: expected to be largely public/open data, though may need some screening of sensitive items for privacy or security reasons, eg time-based screening to protect individuals using transport networks in the early hours. Reverse flows for (eg) information signage and/or environmental control, such as helping with river management, may be valuable.

Note: this is not officially one of the Launchpad verticals, but should be reasonably serviced by the Launchpad technology outputs.

Global (100km+): Disaster Relief

Example: radiation monitors air-dropped in after a leak and evacuation, monitored remotely via satellite.

Scope/size: in areas where local networks are not to be relied on, eg after a natural disaster or widespread/cascading utility failure, and where there may be unusual tolerance of costs or reliability parameters, such as in parachuting in short-life battery-powered 'help me' buttons or longer-term self-powered monitors (eg of radiation). A mixture of satellite and adaptive use of whatever comms is up and running may be necessary.

Inbound: satellite and/or adaptive/facultative use of any open GSM and WiFi systems that happen to be working, paid or free, by preference, for example.

Concentrator/distributor: could be run up in the cloud, eg AWS instances, for emergency response.

Outbound: a mixture of direct and private/analytics channels may be appropriate, eg to provide crowdsourced "mechanical turk" style help, or routed to emergency services privately for the 'help me' button. Much of this is likely to be ad hoc, but elements of the data could be routed directly into public data channels such as Twitter/opensensors/Bluemix etc.

Other: likely ad hoc and expensive, but must be quick and easy to assemble.

Note: this is not officially one of the Launchpad verticals, but should be reasonably serviced by the Launchpad technology outputs.

RF: Inbound

This section deals with the in-bound side, from sensor to concentrator (and possibly reverse flows), usually over RF (radio) links.

We want to improve on existing backhauls especially around commissioning, maintenance and security. In some cases that may mean developing or adapting protocols.

(Note that there are so many protocols/communications/etc in IoT, all vying to be top dog, that an exhaustive search is quickly out of date as new announcements are made frequently, maybe daily. This research is on a focussed sample, and we stay on top of announced and planned developments continually and will revise this page if necessary during the project.)

Bluetooth Smart

A lot of smartphones from ~2015 onwards, especially Android, will come equipped with Bluetooth Smart, which, while probably not directly suitable as a permanent backhaul, may be useful for other 'intermittent', discovery and sensing purposes:

  • wireless connection of close-by sensors to a leaf sensor node
  • temporary interaction with end-users' smartphones to provide services to the user or temporarily join the phone to the sensor network or count people or even piggyback some data to send upstream
  • bulk upload of non-real-time data to, for example, buses passing bus-shelters for forwarding upstream
  • providing tracking beacons where users consent to be briefly tracked for service quality or for location in (say) an office environment

LoRa: First Pass

Amongst the options for backhaul that we'd like to try for small sensors outside buildings but in urban areas, at 'telemetry' data speeds and low power drain, is 'LoRa'(TM), which is explicitly designed for this 'IoT' world with relatively Long-Range links from leaf nodes back to concentrators (hundreds of metres to kilometres).

In a conference call with IBM and Microchip 2015-04-29 we confirmed that LoRa is basically in line with our expectations such as:

  • Frame sizes of ~64 bytes supported (minimum ~51 payload guaranteed).
  • Bit rates upwards of 300bps, and frame rates ~1 per 10 minutes (see the quick arithmetic check after this list).
  • A single concentrator/gateway should be able to manage ~4M messages/day, or as many as 100,000 leaf nodes each sending ~40 messages per day.
  • At low packet rates battery life from 2xAA can be 10 years.
  • Physically small, potentially integrateable into OpenTRV's nominal 50mm x 50mm standard footprint.
  • Operating in 868MHz (or 434MHz or others if necessary), so OpenTRV common assumptions about antennae OK, eg for PCB integration.
  • Most of the heavy lifting happens at the gateway.
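
A quick arithmetic check of those figures, treating airtime as simply payload bits divided by bit rate (so ignoring preamble and coding overhead, which makes the real numbers somewhat worse):

    FRAME_BITS = 64 * 8                 # ~64-byte frames
    MIN_BIT_RATE_BPS = 300              # slowest quoted data rate
    DUTY_CYCLE = 0.01                   # 1% ISM duty-cycle limit

    airtime_s = FRAME_BITS / MIN_BIT_RATE_BPS      # ~1.7s per frame at 300bps
    min_gap_s = airtime_s / DUTY_CYCLE             # so at least ~171s between frames
    print(f"~{airtime_s:.1f}s airtime per frame => >= {min_gap_s/60:.1f} min between frames "
          f"(so '~1 per 10 minutes' is comfortable)")

    gateway_msgs_per_day = 100_000 * 40            # 100k nodes x ~40 msgs/day = ~4M/day
    print(f"gateway load ~{gateway_msgs_per_day:,} msgs/day, "
          f"~{gateway_msgs_per_day/86400:.0f} msgs/second on average")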

Other nice features from our point of view:

  • Open protocol.
  • Encryption/auth apparently available built-in.
  • Microchip RN2483 module small and 868MHz and 2.1V to 3.6V operation.

(Thanks to both IBM and Microchip for our rapid introduction to the tech, and I hope that the Launchpad can make an interesting public case study and test bed.)

We may be able to operate two concentrators in tandem at different sites to ensure better reliability and provide public data about robustness with distance (for example) in a real city environment. We hope to have our main concentrator near the Shoreditch bus routes that OpenTRV hopes to focus on, but have also been told that we can probably put an antenna on the roof of the Digital Catapult on the Euston Road ~4.5km away which may be at the practical limit of reception roughly from London's West End to East End!

We intend to compare LoRa or something like it against more conventional GSM (cellular) comms which is clearly available without difficulty in the heart of London, but which is quite heavyweight (eg in terms of power) for our application.

LoRa appears to provide security over the air, meaning that leaf/sensor units using LoRa possibly do not need to layer additional crypto on top, saving code space and CPU cycles and energy at the MCU.

We have more research to do, obviously, but this looks good.

See the LoRaWAN Things Network in Amsterdam.

Telensa

A discussion with Jon Lewis, Telensa Director of Strategy, 2015-05-06, suggests that although its technology is mature and deployed, (a) because Telensa's business strategy is not just about providing a commodity telemetry network, and (b) because Telensa's deployments tend to number in the thousands to hundreds of thousands, the Launchpad project is probably too small to engage with them at this stage, though it should be back in the ring for any scaled deployment.

(Typical deployment size is 5k nodes for parking and 50k for lighting.)

The Telensa technology certainly seems to be able to offer the micropower and other features that would be needed for (for example) the bus shelter deployment, and is bidirectional where some of the other bearers are not.

The Telensa technology stack is also mature (many years deployed).

Telensa will not be considered further for the Launchpad project, given its size as mentioned above.

SIGFOX

Initiated a couple of conversations with SIGFOX via a couple of different routes but did not get any engagement.

No further response from SIGFOX by 2015-06-07.

Neul

(Made call 2015-05-21 to Julien G of Huawei. Tech is no longer whitespace, uses licensed spectrum, meant to work in conjunction with LTE.)

No further response from Neul by 2015-06-07.

Telematics

(2015-06-02 met Ray Wescott at Vision London after IoT talk and agreed to explore possibilities for their technology (CMS).)

Wireless Things

Wireless Things has mentioned on a number of occasions that it has a simple low-power solution that will cover good distances and has security built in. Even if not necessarily for the final roll out this may be good for initial warm-up trial stages.

RDN

Various discussions with Brian Back, MD of RDN (initially venturing from our stand at Interop to his!), indicate a small deployment in Shoreditch to (say) 10 leaf nodes including a concentrator (if we can find somewhere to site it) should be possible quickly and for a few hundred GBP per leaf all-in with some subsidy from RDN for this public research project. Their equipment is on a licensed band (~150MHz?) and has been used in similar ways in France to our aims (now ~2 million nodes). Their equipment can be tweaked to send our JSON-format messages directly.

Satellite

In cases without dependable extant communications networks such as GSM/cellular, such as in rural areas in radio shadows from hills, or disaster areas, LoRa and GSM and the like may be impossible, and for this something like satellite may be a useful backstop, though usually more expensive for hardware, per-byte charges and power consumption.

As part of the Launchpad project OpenTRV wants to establish if it is reasonable to simply treat such a satellite device as just another pluggable backhaul.

To that end OpenTRV has borrowed a naked RockBLOCK Iridium satellite transponder which arrived 2015-04-29. Many thanks to Rock Seven!

2015-06-29: Jonathan S is investigating use of the IridiumSBD Arduino library to speak to the RockBLOCK.

Internet: Outbound

This section deals with the out-bound side, from concentrator/distributor fanout to data sinks (and possibly reverse flows), to existing public APIs, usually across fixed IP-based links over the wired Internet.

Since concentrators/redistributors will probably be (a) relatively low-power and (b) often physically difficult to reach and (c) often exposed to the Internet without much firewalling, etc, then it seems wise from a resourcing and security point of view to limit as far as possible connections across the Internet to be outbound only. This is harder for configuration management and two-way data flows, but probably remains a reasonable guiding principle.

Some of the data brokers that the distributor may wish to talk to may not like inbound connections for similar security reasons, so data may need to be routed to them via other (real-time, reliable) brokers maybe in the mould of Xively or opensensors.io. Or it may be necessary to run up simple logically-transparent storage-limited adapters/couplers such as Stomp to allow both sides to be making outbound-only connections. (Stomp provides reliable message passing over unreliable HTTP (long lived) connections.)

2015-06-10: at IBM Hursley park today, IBM and Microchip participants suggested that given where shortages lie in developers amongst other issues, the concentrators should be even simpler than depicted here as of today, doing essentially no more than forwarding frames elsewhere for storage and non-trivial processing.

We want to improve on existing backhauls especially around commissioning, maintenance and security. In some cases that may mean developing or adapting protocols.

Twitter

Since about 2015-04-12 I have been tweeting, initially from unit tests, via OpenTRV Sensor b39a, with what I think is a fairly old (OAuth) authentication that is certainly a monumental pain to set up each time for me, relying on buggy 'demo'/deprecated library features. I have been using the same mechanism for a while to make regular informational tweets (which regular humans seem to appreciate) on a 'green' topic, based on data gathered from the GB electricity grid.

The messages have been the raw textual form of the underlying incoming stats messages, eg:

RAW_29%64@0601> {"@":"b39a"}

and if those are sent through (say) every 15 minutes, they start being rejected by Twitter with messages such as:

code 226: This request looks like it might be automated. To protect our users from spam and other malicious activity, we can't complete this action right now. Please try again later.

So the low-level transport works, but I need to raise my presentation game!

As a start I changed the message body to something of the form:

Temperature at 15:52 18°C

to see if the anti-SPAM imps in the Twitter machina feel better disposed, but the first attempts at least were also rejected.

As of 2015-05-25 (and 26) I forced a longer interval between messages (in part helped by removing a rounding error that was causing jitter and extra messages) and provided a more variable/chatty/human message style, and the messages are being allowed through.
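
A minimal sketch of that approach (not the production code): enforce a minimum gap between tweets and vary the wording so that identical texts are not posted repeatedly. Here post_tweet is a stand-in for whatever Twitter client is actually in use, and the 30-minute gap and message templates are purely illustrative.

    import random
    import time

    MIN_GAP_S = 30 * 60                    # illustrative: at most one tweet per 30 minutes
    TEMPLATES = (
        "Temperature at {hhmm} {temp}°C",
        "{temp}°C here as of {hhmm}",
        "Reading at {hhmm}: a balmy {temp}°C",
    )
    _last_sent = 0.0

    def maybe_tweet(temp_c, post_tweet):
        """Send a humanised status via post_tweet() (a stand-in for the real client),
        dropping the message if the last tweet was too recent."""
        global _last_sent
        now = time.time()
        if now - _last_sent < MIN_GAP_S:
            return False                   # too soon: skip, to avoid looking automated
        text = random.choice(TEMPLATES).format(
            hhmm=time.strftime("%H:%M"), temp=round(temp_c))
        post_tweet(text)
        _last_sent = now
        return True

    # eg maybe_tweet(18.2, print) just prints the message instead of posting it.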

Opensensors.io and Generic MQTT

Established test MQTT feed of data from a local data source (grid-tied PV generation from my roof), cron/script-driven.

2015-05-18: spoke to Yodit, currently waiting for secure data push (from Java) to become available. May be able to do the insecure thing first, then switch to secure-only.

2015-06-06: a prod from Aideen suggests to me that the Opensensors.io driver should also be generic (secure) MQTT as far as possible to enable pushing data to other MQTT consumers. The redistributor should behave like a proxy for the leaf nodes rather than letting subscribers bind to it (for security and capacity reasons). "A local MQTT broker can forward data to a remote broker. jpmens.net has lots of helpful stuff and created mqttwarn which may help. (already used by some openenergymonitor users)."
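
A sketch of what such a generic secure MQTT push from the redistributor might look like, using the third-party paho-mqtt client (1.x API); the broker host, topic and credentials below are placeholders, not real opensensors.io values.

    import json
    import paho.mqtt.client as mqtt         # third-party: pip install paho-mqtt (1.x API shown)

    client = mqtt.Client()
    client.username_pw_set("DEVICE_USER", "DEVICE_PASSWORD")   # placeholder credentials
    client.tls_set()                         # TLS with the system CA bundle: secure by default
    client.connect("mqtt.example.org", 8883)
    client.loop_start()                      # background network loop

    payload = json.dumps({"@": "b39a", "Temp16": 308})          # stats message as JSON
    info = client.publish("users/example/sensors/b39a", payload, qos=1)
    info.wait_for_publish()                  # block until the broker has acknowledged it

    client.loop_stop()
    client.disconnect()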

Open Energy Monitor

Open Energy Monitor is a fairly widely-used open-source software and hardware system for energy monitoring and management. The ability to interoperate with (eg at least sensibly push real-time data to) OEM is a useful test of the data architecture. OEM systems are typically deployed at home, but might also be useful for SME businesses, ie simpler/more informal IoT deployments.

2015-05-18: spoke to Bo, would like (http) push into a local OEM node for graphing, etc, and will attempt to make some progress over the next week or two. When connections only traverse the local LAN security may not be an issue, and https may be meaningless complexity/cost in money and CPU cycles (and thus energy).

Bo notes:

Started to google for those links and examples...

http://www.imaginaryindustries.com/blog/?p=378

# Prepare the URL (Python 2: urllib2.quote URL-encodes the JSON payload)
        url = "http://" + emonHostname + "/" + "input/post.json?node=1&apikey=" + emonApiKey + "&json=" + urllib2.quote(json)

Above 2 lines taken from here:
http://we-io.net/hardware/log-and-visualize-your-sensors-with-weio-and-emoncms/

Here is a good one on how to send multiple data at same time:
http://harizanov.com/2012/04/software-only-solution-for-feeding-weather-data-to-emoncms/

Will actually implement the last one as we have a weather station within visual range, and it's even an official one.

2015-06-05: I now have most of the machinery working to push data into emoncms (V8.4 or higher) eg generating (encoded) URLs of the form:

http://127.0.0.1/emoncms/api/post?apikey=ABC&node=555&json=%7B%22Temp16%22%3A308%7D

This is not great semantically (using a GET to perform a non-idempotent action) or from a security point of view (sending the authenticator in plain text), but is probably OK where the OEM server is on the same (private) LAN as the redistributor.

Note that a node value such as "819c" does seem to be acceptable, so other than clashes with reserved IDs such as 0, maybe sensor node hex IDs can be used directly as emoncms node IDs, as Bo suggested, providing that only (say) one household's nodes are being captured to avoid unexpected collisions.

This is nowhere near complete code for a production quality feed, but serves as an initial proof of principle.
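
For the record, here is a Python 3 sketch of that proof-of-principle push (not the actual machinery); the host, API key and node ID are placeholders, and the same GET-with-plaintext-key caveats as above apply.

    import json
    import urllib.parse
    import urllib.request

    def push_to_emoncms(node_id, values, host="127.0.0.1", apikey="ABC"):
        # Build the same GET URL shape as shown above, URL-encoding the JSON payload.
        query = urllib.parse.urlencode({
            "apikey": apikey,
            "node": node_id,
            "json": json.dumps(values, separators=(",", ":")),
        })
        url = f"http://{host}/emoncms/api/post?{query}"
        with urllib.request.urlopen(url) as resp:
            return resp.status               # HTTP status code (200 expected on success)

    # eg push a 1/16th-degree temperature reading for leaf node 819c:
    # push_to_emoncms("819c", {"Temp16": 308})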

TransportAPI

Still waiting for some feedback from TransportAPI to set up an initial dummy feed to it as of 2015-06-07.

Bluemix

Currently on hold 2015-06-07 as a backup outbound route if TransportAPI and/or opensensors.io do not work out or if there is extra time available.

IFTTT

(Sent enquiry 2015-05-22 to IFTTT asking if shared public triggers were possible, with state pushable from our nominal redistributor.)

No further response from IFTTT by 2015-06-07.

Xively

Investigating opensensors.io in preference to Xively given the time constraints of this project, as opensensors is the newer platform and may benefit from some public critique.

Configuration Scheme

In order to make concentrators/distributors easy to work with during this project, serve as a proof of concept, and also get feedback from other/external users ASAP, I am suggesting (2015-06-06) a first-pass configuration mechanism that has the following characteristics:

1) allows static configuration at process start-up, and occasional
   reconfiguration eg to add and remove downstream sinks; not fully dynamic

2) this should possibly also manage the upstream leafs explicitly for
   deployment

3a) should be do-able from a small number of plain-text, vi-editable files,
   probably in JSON (or YAML?) or XML

    (Bruno ED CTO:) JSON would be my preference. This way, you can use the same
    (or similar) format later when we implement remote provisioning
    driven by the data platform.

3b) should be sensibly and safely manageable remotely

    My feeling is one of two routes: either allow the config file to
    be replaced remotely in its entirety (or specify an https URL to
    pick config up from, cache locally, and poll for updates), or allow
    “includes” for other file/URL components which may also allow
    more modularity.  Any thoughts?

3c) *possibly* a blend of 3a) and 3b) should be possible, but nothing too
   complex

4) should be able to have sensitive material such as passwords and shared keys
   in the main config or delegated to a separate credentials store

    If the node is remotely managed anyway it may not make sense to force
    breaking out the material into a separate place which also has to
    be managed remotely.  Also, the sensitivity of some items varies
    by deployment, eg even the location of a server may be slightly
    sensitive in some cases so you'll see for the EMONCMS stuff that
    I currently have the base URL in the credentials.

    So I think a mix-n-match approach may be best if not too complex,
    eg if the credential is not inline in the config then look for it
    in a secure (local?) credential store.

5) allows set-up of a simple filter/transform/distribute forest on the output
   side and a gather/authenticate/manage forest on the input side

    On the output side: I expect to have a set of pipelines (processed
    concurrently to minimise latency/blocking) that do some filtering
    eg to select only data of interest to particular downstream nodes
    and to filter for, say, authenticated data only, some transformation
    such as scaling or combining or eliminating dud values or batching to
    a complete map of all live values, and then the output drivers that
    push the values in a suitable form for the data sinks such as Twitter
    or opensensors.io or the EnergyDeck supercomputing analytics array.

    On the input side I expect (concurrent) drivers for input sources
    such as OpenTRV serial inputs, but also local sources such as files
    and sensors at the concentrator such as temperature and battery
    levels, and any metadata handling for deployment and liveness and
    authentication/encryption as needed.  This can be rudimentary to
    start with but I'd like us to at least understand having more than
    one input source and type to each concentrator.

6) doesn't get in the way of low-latency and some concurrency in data handling

Attributes potentially in scope of this configuration are:

  • The set of leaf node IDs being managed and any security credentials and (say) warning thresholds such as expected maximum transmission gaps and minimum safe battery/power levels.
  • The output data fanout channels: for each channel, the input leaf IDs of interest (filtering), any basic transformation (eg scaling, filtering, or grouping of values over a time window for median filtering, simple averaging, or building a full map), then mapping to the sink API such as Bluemix or Twitter or Xively, including (sufficiently secure) transport across the Internet.

Note: along with config text and keys could be (Java) implementation classes.
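
To make the above concrete, here is a sketch, in Python for convenience, of one possible shape for such a JSON configuration, plus the 'mix-n-match' credential lookup of point 4. All field names, IDs and file paths are illustrative assumptions rather than a settled format (see also the proposed configuration file format linked below).

    import json
    import os

    EXAMPLE_CONFIG = """
    {
      "leaves": {
        "819c": {"maxTxGapMinutes": 10, "minBatteryV": 2.4},
        "b39a": {"maxTxGapMinutes": 5}
      },
      "outputs": [
        {"type": "emoncms", "leafIDs": ["819c"], "credential": "emoncmsLocal"},
        {"type": "twitter", "leafIDs": ["b39a"], "minIntervalMinutes": 30,
         "credential": {"apiKey": "inline-example-key"}}
      ]
    }
    """

    def resolve_credential(ref, store_path="credentials.json"):
        # Mix-n-match per point 4: inline credentials are used as-is; a bare name is
        # looked up in a separate local credential store if one exists.
        if not isinstance(ref, str):
            return ref
        if not os.path.exists(store_path):
            return None                      # no local store in this sketch
        with open(store_path) as f:
            return json.load(f)[ref]

    config = json.loads(EXAMPLE_CONFIG)
    for output in config["outputs"]:
        cred = resolve_credential(output.get("credential"))
        print(output["type"], "->", "inline credential" if isinstance(cred, dict) else cred)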

Note from Jeremy P 2015-06-06:

It may be a good idea to consider how integration with TR-069 (http://en.wikipedia.org/wiki/TR-069) might work. This is extremely common when managing large numbers of devices; eg if you have an ISP-provided router in your home, that will be managed using TR-069.

Not that I am suggesting that this should be on the leaf nodes but it would be a good idea to make it easy for some intermediate device to translate, eg a similar data model.

One other point: you may want to try to avoid polling a URL for config updates, as once a system is set up config changes tend to happen infrequently, so it can be a huge waste of resources both in terms of the leaf node's radio, power, etc, and also the serving endpoint, especially when you are dealing with thousands of devices. That being said, it is also a lot simpler to implement on the leaf node and can be very useful as a kind of heartbeat.

TR-069 principles look good for remote management of concentrators, and I take the relevant thrust to be that concentrators (CPE) pull config down from central configuration servers (ACS/SCM), and can be prompted to do so by the ACS to eliminate the need for the CPE to poll if need be. There are various protections eg to avoid DDoS attacks. (The idea is to at least follow the same structure, ie client/concentrator controls/initiates all updates from central server, but can be prompted to look by that central server, a bit like newer DNS.)
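
A minimal sketch of that pull-with-prompt pattern (not TR-069 itself): the concentrator fetches its config over HTTPS with a conditional GET, normally on a slow timer, and a lightweight 'kick' from the central server simply triggers an immediate extra call. The URL is a placeholder.

    import urllib.error
    import urllib.request

    CONFIG_URL = "https://config.example.org/concentrators/819c.json"   # placeholder ACS-style URL
    _etag = None

    def fetch_config_if_changed():
        """Conditional HTTPS GET of this concentrator's config; returns the new config
        text, or None if the server says it is unchanged (HTTP 304)."""
        global _etag
        req = urllib.request.Request(CONFIG_URL)
        if _etag:
            req.add_header("If-None-Match", _etag)   # only transfer the file if it changed
        try:
            with urllib.request.urlopen(req, timeout=10) as resp:
                _etag = resp.headers.get("ETag")
                return resp.read().decode("utf-8")
        except urllib.error.HTTPError as err:
            if err.code == 304:
                return None
            raise

    # Normally called on a slow timer; a lightweight 'kick' from the central server
    # (the TR-069-style prompt) just triggers an immediate extra call.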

It remains the case that we should think about pre-existing tools such as Apache Camel, possibly offloading work to them to reduce development time.

(2015-06-19: see proposed configuration file format.)

2015-07-09: See TR-069 review (also here) from OpenTRV perspective.

2015-07-31: See Bruno's Proximity Beacon API review (also here).

Prioritised Backhaul List

The deliverable from task D17 is a prioritised list of preferred backhaul mechanisms to evaluate during the Launchpad project if possible. This includes inbound and outbound routes.

Lower numbers within each list indicate higher priority.

Deployment/testing will usually be a blend of the alternatives for comparison.

Inbound (RF) for indoors (building health) applications:

  1. [core] Current 'OpenTRV' FS20/OOK/868.35MHz modified bearer as baseline
  2. TinyHAN (two-way comms)
  3. Improved 'OpenTRV' GMSK/868.35MHz/listen-before-talk/TX-power-management and other incremental improvements in spectrum use (eg such as with COHEAT test deployment)

All of these would require crypto on top for security.

Outbound (building health data) across the Internet to data sinks:

  1. [core] New protocol/channel to be developed that supports:
    • easy commissioning and maintenance
    • security by default
    • a sensible and standard data format
  2. RKDAP/AMR over HTTPS to EnergyDeck data servers

Inbound (RF) for outdoors (bus shelter) applications (at 2015-06-07):

  1. [core] LoRa
  2. [core] GSM/cellular as baseline and for early pathfinder deployments
  3. Satellite (1-off) for proof of principle

All of these can probably be assumed for now to provide sufficient security as-is.

Outbound (bus shelter data) across the Internet to data sinks:

  1. [core] opensensors.io for general open-source real-time brokerage
  2. Twitter for direct-to-public alerts/status
  3. TransportAPI for specialised transport data usage

Plus permanent bulk archival to public data stores at the project end. Plus some further proof-of-principle work with OEM, EDX, etc, if time permits.

Additional output:

  • For central management of a large number of (eg widely distributed) concentrators, following the same pattern as TR-069, and/or making it possible to upgrade to full TR-069, seems like an excellent idea. (Any initial version can probably avoid much of the complexity of NAT navigation, etc.)

Appendix 1: IBM Comment

2015-07-01: comment received by email from Anthony O'Dowd, STSM, Cloud Technologies, CTO Team, IBM. (Lightly edited for HTML.)

[IBM diagram: left-to-right sense / integrate / distribute / consume architecture.]

... thought I'd share some concrete feedback - about how we'd recommend thinking about the overall architecture for TfL/OpenTRV solution ... I've tried to capture [it] in a simple chart, with some basic notes in it.

  1. It's a left to right diagram, with data collected by sensors on the left eventually being consumed by "applications" on the right. This represents the bridging of two worlds - the physical one of sensors, and the more ephemeral one of applications. As you and Mark have described nicely, it's really these 2 communities we're trying to bring together with the OpenTRV solution.
  2. The diagram is pretty similar to (exactly the same as?) your architecture diagram on the left hand side - up to the server, but after that point it's a little more prescriptive. Really it gets into the nuts and bolts of how data gets to applications, and if you look at point 8 below, I think you can view the two slightly different diagrams as variations on a theme.
  3. The four phases of the data gathering process are sense, integrate, distribute and consume.
  4. I think we all understand the sense phase quite well, though of course, there's a lot of engineering here to make the solution really work.
  5. The integrate phase is where data is gathered from all the sensors and stored in a data store. We're recommending doing this as soon as possible to simplify the later stages of data distribution and consumption. However, I'm very sensitive to your needs to have an early routing capability for prototyping, debug, and general flexibility, so it's totally possible to do distribution at the server. However, in production, it makes it difficult to maintain the server if there's lot of distribution processing going on there as servers tend to be harder to monitor and administer, certainly when compared to the cloud. The major objective of this phase is to get the OpenTRV data into a single logical area where it can then be made available to a wider population - the data store. I think it also makes it clear that the data is OpenTRV's data. The OpenTRV data store is subsequently made available through a set of APIs. It's been the working assumption that this is with TransportAPI, but the key thing is that these APIs make the data store available to a wide variety of distribution platforms. APIs should support both push and pull models for the data. Systems like Twitter are intrinsically push based, but some users will prefer pull APIs for periodic monitoring. For bulk/historic, then pull is also a more typical choice.
  6. The third phase is distribution, and this is very similar to your diagram. I've listed a few, but of course there are many, and these can change over time. I think the key point is that the distributors are responsible for their ecosystem of consumers (i.e. applications), and they access OpenTRV data uniformly via the OpenTRV APIs (push and pull). They handle "scale" processing to applications whether that's technical items like compute capacity or network connectivity, or more non-technical items like marketing and ecosystem. Some distributors have quite proprietary ways of distributing data (e.g. Twitter API), whereas others, like Bluemix and other cloud platforms, will just consume the native OpenTRV API. This really is the business of the distributors.
  7. Finally, we have the applications/consumers who are tied to these distributors. All fairly straightforward I think - they extend the value chain of the distributors' APIs.
  8. A final point on the overall diagram is that you can float just about any of these components, apart from sensors, into the cloud. I think however, for your architecture at the moment, the cloud should really start with data gathering. Indeed, I think one can actually view the difference between your initial architecture diagram and these proposals as really being where that slider sits - I think you have it further to the right, with distribution happening on the server, whereas there are real operational benefits to storing the data in the cloud from the server and doing the distribution later, from a centralized point.

So I think the major item is the refinement of the integration tier, but a refinement rather than change. As an implementation point -- which we can discuss more later -- there is some very nice open source technology which you can use on the server or the cloud to do distribution (called Node Red: nodered.org, a bit like IFTTT) so that the only real point of discussion is using a cloud data store.

2015-07-03: DHD comment: for smaller (eg private/hobby/poc systems) and/or where confidentiality or fees are an issue, it may still make sense to do the work in the concentrator/redistributor and omit the cloud datastore. For larger deployments the IBM comment above is likely right.

2015-07-06: BG comment:

The IBM view is interesting and reflects the fact that a lot of their offering thrives on a cloud infrastructure. I agree that a cloud store provides operational benefits in terms of data processing. However that doesn't mean that the solution should depend on a cloud store. Here are my reasons why a data platform shouldn't be essential:

  1. Internet connections can fail, both in the domestic and commercial worlds (as aptly demonstrated by my ADSL dropping dead on Friday night due to a massive storm) which would result in missing data if the concentrator is unable to operate in disconnected mode.
  2. There are already dozens of IoT platform offerings on the market: at the moment, every single piece of IoT hardware you can buy comes with its own platform that requires custom integration to make it talk to other platforms (HyperCat notwithstanding), and an iOS or Android app to configure the device, which results in a lot of silos, so it would be nice to change that. The current approach allows OpenTRV to let customers use the data platform of their choice and move away from silos.
  3. Designing and operating a cloud data platform is complex and expensive so even if that was something OpenTRV wanted to do, it would probably require a larger team than the current one.

That said, a data platform can be a nice addition for large commercial deployments where one doesn't exist yet. The way for OpenTRV to offer that could be an open source implementation that people would deploy on their own infrastructure or cloud.

Now for a competing view on this: if you were to ask Cisco, they are currently pushing the concept of smart devices at the edge of the network, ie devices that fit in a network router, have the power of a Raspberry Pi 2, and can do some of the processing on premises before data leaves the network. The reasoning behind this is that you can then implement an efficient feedback loop to control on-premises devices without a round trip to the internet (eg change the room's target temperature as a result of local changes).