Providing appropriate security for small Internet-of-Things devices
(Part of WP1 Research, D16 Security. How to transmit data securely so that it can't be decoded by unauthorised people. Dummy info / no injection of unauthorised data. ED - Research standard IP based security mechanisms to ensure security in transit.)
Starting from the assumption for our work that the Internet will not reach right to the leaf sensor/actuator nodes because they are not powerful enough to handle IP, or else will consume too much energy and silicon and cost more than necessary, the security model looks something like this:
(Not in scope is anything beyond the handover to the data sinks.)
Security here is protecting privacy and safety and property against attack, eg:
Security does not always imply encryption; sometimes the important aspects are data integrity eg to minimise noise and errors in the system, or authentication eg to avoid allowing spoofing of data destined for public records.
(Note that there are plug-in security devices that can do or assist with authentication and encryption, and these may be especially valuable with leaf nodes if they (a) reduce code size and CPU load and (b) handle secure key management eg the keys are preshared in the hardware components. We may get to look at one of these devices later in the project.)
In general this is likely governed by common-sense principles, eg:
Bruno's (EnergyDeck CTO) note D16 2015/06/07...
# D16: Security

# Things to consider:

- authentication (identifying the sensor and who has access to it)
- encryption
- privacy policy (e.g. which data streams should be sent, which shouldn't)
- tamper detection, such as:
  - sensor moved
  - battery removed
The tamper protection movement sensor and battery/restart counts are forwarded to D14 Sensor Set for further consideration.
As at 2015/06/15 a summary of the security framework to be followed is:
This is a snapshot of the security thoughts/abstraction as of 2015/05/24 in Messaging.h with some minor reformatting:
The design aim is to allow transmission of (optionally secure) telemetry from low-power sensor nodes over a number of alternate backhaul media such as one-way packet-based ISM radios.

Assume that the leaf end is a low-powered CPU and so the code interface and implementation has to be simple, and with minimal support from some hardware for features such as encryption.

Assume that the messaging maximum possible frame size will generally be 64 bytes or less, and may vary significantly with the options chosen below, especially if encryption is added.

Assume that some of the data carried may be sensitive, eg privacy related or for driving actuators.

Assume that some implementations can/will not run below a specific integrity level, eg with data checksums/CRCs.

Assume that the raw messaging transport is by default:
* one way
* lossy
* noisy
* bandwidth limited (low bit rate and/or (say) frames/day capped) and/or expensive per bit or frame
* real-time but possibly with significant latency
* overhearable, eg over ISM radio or similar.

(Some variants like TinyHAN allow two-way flows, and others may be radically different such as tunneled in HTTPS over a LAN.)

Have one or more backhaul layers available at run-time leaf (with superset at concentrator) with some constant capabilities, ie that can be checked/selected at pref at compile time, such as:
* Frame formats that can be carried on this channel (1 or more):
    * JSON object {...} (compact ASCII7 subset, only printable chars 32--126 ie with no linebreaks or other control).
    * Whitened binary (with no 0x00 or 0xff bytes), so limited-length runs of either bit, and both values possible as delimiters.
    * Structured binary (as interpreted by underlying channel eg with TinyHAN).
    * Pure binary.
* Ability to mark some frames as 'important' (bool), eg containing critical or changed values, with extra delivery effort (eg double TX or FEC).
* Maximum data integrity protection available from the channel (enum / small int):
    * CHECK: (required) simple frame check value applied and verified, eg typically 7--16 bit check sum or CRC, or in the underlying medium.
    * SEQ: (optional) above plus small frame sequence number.
    * AUTH: (optional) above plus crypto-based authentication.
    * ENC: (optional) above plus encryption (eg AES-GCM or EAX).
    * ENCHIGH: (optional) above with enhanced security (eg longer keys and/or IVs etc) at cost of frame size and CPU.
  (Data receiver should usually check data for semantic/syntactic integrity etc also, especially if a low level is used here.)
  [DHD20150409: note that all current OpenTRV traffic is effectively sent at level CHECK.]
  [DHD20150409: dropped NONE at Jeremy P suggestion to reduce complexity.]

All systems should support at least JSON object and whitened binary formats with a simple (CHECK) integrity check. (Note that JSON formats are assumed NOT optimal in bandwidth terms, and should generally not be used for prolonged production deployments (use a binary format), but the underlying medium may be able to make some optimisations such as simple compression on the wire.)

All systems with privacy-related data must support encryption (ENC), and/or have the ability selectively not to send sensitive data, and/or the underlying backhaul must be able to guarantee ENC-level integrity itself (eg tunnelling over HTTPS or VPN).

At run time (and possibly at compile time) it must be possible to discover the maximum data frame size possible with the selected transmission parameters.

Note that for higher integrity levels suitably-sized keys may have to have been pre-shared for example, and any modes not supported by the concentrator may have to be removed from the 'available' list.

At run time it should be possible to specify above parameters with each frame to send from leaf, and those parameters plus some associated values (eg sequence numbers/range) should be recoverable.
(Data that fails integrity checks is in normal circumstances not available, nor are the crypto keys used, though parameters such as algorithm and strength may be.)

Note that key, IV, etc lengths that are acceptable in 2015 may prove inadequate in future; to some extent that is implicitly dealt with outside this definition by the key-sharing mechanism, but frame size limits may ultimately limit available security.

See also:
http://blog.cryptographyengineering.com/2011/11/how-not-to-use-symmetric-encryption.html
http://crypto.stackexchange.com/questions/7951/aesctrhmac-encryption-and-authentication-on-an-arduino
http://www.cs.berkeley.edu/~jaein/papers/cs294_9_paper_fec.pdf
http://packetpushers.net/ipsec-bandwidth-overhead-using-aes/
http://nordsecmob.aalto.fi/en/publications/theses_2008/thesis_gabrielalimon_tkk.pdf
http://www.iacr.org/workshops/fse2010/content/slide/Fast%20Software%20AES%20Encryption.pdf
http://tools.ietf.org/html/rfc4106
Public domain uNaCl crypto for AtMega: http://munacl.cryptojedi.org/ and https://cryptojedi.org/papers/avrnacl-20130514.pdf
https://github.com/kokke/tiny-AES128-C (public domain)
http://csrc.nist.gov/publications/nistpubs/800-38a/addendum-to-nist_sp800-38A.pdf
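To make the abstraction above a little more concrete, here is a minimal Java sketch of the integrity-level ordering and of the two mandatory frame-format constraints (compact JSON and whitened binary). The class and method names (`FrameChecks`, `Integrity`, `isCompactJsonFrame`, `isWhitenedBinary`, `maySend`) are hypothetical and are not taken from Messaging.h. The `maySend` rule is a deliberate simplification that considers only the ENC requirement and ignores the "and/or" alternatives listed in the text.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of the integrity levels and frame-format checks described in the text. */
final class FrameChecks {
    /** Integrity levels in increasing order of protection; each includes those before it. */
    enum Integrity {
        CHECK, SEQ, AUTH, ENC, ENCHIGH;

        /** True if this level provides at least the protection of the required level. */
        boolean atLeast(Integrity required) { return compareTo(required) >= 0; }
    }

    /** Compact JSON frame: starts with '{', ends with '}', only printable ASCII 32..126. */
    static boolean isCompactJsonFrame(byte[] frame) {
        if (frame.length < 2 || frame[0] != '{' || frame[frame.length - 1] != '}') { return false; }
        for (byte b : frame) {
            // Java bytes are signed, so values 128..255 appear negative and are rejected here too.
            if (b < 32 || b > 126) { return false; }
        }
        return true;
    }

    /** Whitened binary frame: no 0x00 or 0xff bytes anywhere. */
    static boolean isWhitenedBinary(byte[] frame) {
        for (byte b : frame) {
            if (b == 0 || b == (byte) 0xff) { return false; }
        }
        return true;
    }

    /** Simplified rule (assumption): privacy-related data needs a channel offering ENC or better. */
    static boolean maySend(boolean privacyRelated, Integrity channelMax) {
        return !privacyRelated || channelMax.atLeast(Integrity.ENC);
    }

    public static void main(String[] args) {
        byte[] json = "{\"T\":21}".getBytes(StandardCharsets.US_ASCII);
        System.out.println("JSON frame ok: " + isCompactJsonFrame(json));
        System.out.println("Whitened ok: " + isWhitenedBinary(new byte[] {1, 2, 3}));
        System.out.println("Private data over AUTH-only channel: " + maySend(true, Integrity.AUTH));
    }
}
```

Because the levels are cumulative ("above plus ..."), a plain enum ordering is enough to compare them.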
The assumption is that for small frames and with auth/enc done by the MCU, symmetric encryption with pre-shared keys is the most practical solution for protecting comms between leaf nodes and the concentrator. (In some cases the bearer provides the security, or it isn't needed.)
GSM provides security over the air that is probably good enough for all but the most sensitive data for now, but the carrier has to be trusted if you don't provide auth/enc of the data inside the channel.
LoRa provides security over the air and additionally/separately security from end-node to final application, so the carrier does not have to be trusted.
Tony Brookes kindly did an initial security assessment of OpenTRV itself, which may be one consumer of the Launchpad security outputs. Here is his note to me (2015/02/08), more or less verbatim (some reformatting):
The first time I used the OCTAVE approach and found it a waste of time. I hope this attempt isn't a total waste of time.

Open TRV initial security assessment.

Approach suggested in A Framework for Assessing and Improving the Security Posture of Industrial Control Systems (ICS), Systems Network and Analysis Centre, NSA, Pub: Aug 20, 2010, version 1.1

As the Open TRV project is an open source project, it is therefore assumed that any and all of the technical details are freely available to anyone who wants to download them or buy a unit.

This assessment concentrates on the attacks that can be carried out on the radio communications and what information can be derived from intercepting, jamming or spoofing them.

Unknown: the range of the radio transmitter units, how the “family” of devices in a house are considered unique or keyed to one boiler control unit.

Assumption: The OpenTRV units and the boiler controller are not connected to the Internet

Q1 - is the radio transmission encrypted, if so how is it implemented and the key(s) managed?

Attack 1: Alter process status in transit: i.e. the unit transmits a heat request to the central controller, which is somehow intercepted and replaced by stay off instruction.
Likelihood: low
Impact: Over heated or under heated room(s), boiler firing more or less than normal.

Attack 2: randomly request the boiler to fire or not.
Likelihood: Low unless there is a mechanism by which the boiler can authenticate or know which requests are valid. It’s unknown how complex it would be to implement this, nor the overhead.

Attack 3: Denial of service attack on the system
Continually tell the boiler to either stay off (cold house) or run (hot house).
Aim: disrupt the system and hence make the occupants uncomfortable.

Attack 4: Disable the device
It is thought more likely this would happen by battery failure or human error than deliberate attack.
Likelihood: low
Mitigation: clear instructions. widespread operational testing in households with children and pets to increase the number of potential different for

Attack 5: install malicious software
Whilst this is a possible exploit it would require considerable skill and expertise
Likelihood: Low
Mitigation: pot the processor chip.

Attack 6: Installation of the devices incorrectly
Likelihood: High (has already happened).
Mitigation: clear instructions and public liability insurance.

Question - does this system process any (sensitive) personal data?
The data concerned (boiler on or off) is not sensitive personal data as defined in the Data Protection Act 1998. Additionally the data would need to be combined with the dwelling address before anyone could be identified. Even then, it is unclear what malicious uses the data could be put to that simple observation of the house would also not yield. (i.e. the data is considered equivalent to that in the public domain).
Dr Paul Galwas provided much useful advice 2014/10/06: see notes and email.
Along with many links and references the central point, which I've separately come to again, is that AES with Galois/Counter Mode (GCM) seems the way to go for encryption, and maybe Galois Message Authentication Code (GMAC) for authentication only, where the underlying comms channel does not itself provide adequate security.
Some further informal remarks 2015/06/04 (lightly edited):
- General: looks good to me: some detailed comments below.
- General: I think there needs to be a section on key management that includes: a) long-lived and transient keys, b) creation, c) destruction, d) 'revoking' when compromised, e) lifetime, f) re-keying. This might help: https://developer.bluetooth.org/TechnologyOverview/Pages/LE-Security.aspx
- Overview: last para: I think that 'or authentication' is strictly 'or authenticity of origin'.
- Appendix 1: I think that 'must support (ENC)' needs to be 'must support (ENC & AUTH)', since ENC without mutual AUTH is fundamentally broken
- Overview: https://tools.ietf.org/html/rfc5084 recommends IV of 12Bytes: not sure whether this would work since I've not seen the data frame structure. Uniqueness of IV is important to security.
- Overview: Note that 'IV' requires a plausible random source to be effective, which is likely to be challenging on some limited platform.
- Max data integrity: is the comms mode set at installation, or negotiated? If the latter, then take care that the protocol cannot be fooled into downgrading the security level.
- Appendix 2: it might be worth also considering these threats
  - Misuse of unprotected meta data (e.g. Address info)
  - Accessing long-lived key(s), e.g. through theft or loss of device, possibly leading to other attacks, e.g. Remote spoofing?
  - Replay attack, a) causing mistaken understanding of device reading(s); b) telling the device to do something wrong (including attacks 2,3,4).
  - UI weaknesses, e.g. Relative to mistakenly using the wrong mode [probably covered in 6]

[...]

I think it's worth putting some lower bounds on bandwidth and packet sizes, since below a certain threshold, the mechanisms are likely to have to be somewhat different.

Not sure that the order of decrypt and error detection/correction is defined: there's no point in decrypting data with errors.
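As an illustration of one possible countermeasure to the replay threat raised in these remarks, the sketch below has the receiver remember the highest frame counter accepted from each device and reject anything not strictly greater. This is an assumption about how a SEQ-style counter might be used, not a description of the OpenTRV implementation; `ReplayGuard` and `accept` are hypothetical names. The check is only meaningful if it is applied after the frame has passed authentication.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical receiver-side replay check keyed by device ID. */
final class ReplayGuard {
    /** Highest counter value accepted so far for each device. */
    private final Map<String, Long> highestSeen = new HashMap<>();

    /**
     * Accepts a frame only if its counter is strictly greater than any previously
     * accepted from the same device. Gaps are allowed because the channel is lossy.
     * Call this only for frames that have already been authenticated.
     */
    boolean accept(String deviceId, long counter) {
        Long last = highestSeen.get(deviceId);
        if (last != null && counter <= last) {
            return false; // Replayed or stale frame.
        }
        highestSeen.put(deviceId, counter);
        return true;
    }

    public static void main(String[] args) {
        ReplayGuard guard = new ReplayGuard();
        System.out.println(guard.accept("leaf-1", 1)); // First frame from this device.
        System.out.println(guard.accept("leaf-1", 1)); // Same counter again: a replay.
        System.out.println(guard.accept("leaf-1", 5)); // Later frame after some losses.
    }
}
```

A real concentrator would also need to persist this state across restarts, which is not shown here.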
2015/06/23: I asked: "Do you see any obvious pitfalls in the scheme that I am proposing, ie using part of the device ID in the nonce and sending it in the ADATA section, which clearly leaks data about which devices may be transmitting and thus allows traffic analysis?"
To which Paul responded:
This is tricky: clearly wireless leaks knowledge of the transmissions, and in a hub and spoke topology it's not hard to guess where the traffic is going (and probably not that interesting), especially when it's 'acked'. Re traffic: you might find some inspiration from IPsec, which has a mode that hides IP addresses. Personally, I'd consider the threat in more depth: e.g. what information exactly could be leaked? and who cares? And then seek other countermeasures than encryption. One idea would be to keep the packets as near the same as possible, irrespective of the content. If power were not a challenge, I'd consider sending 'random' packets. However, low cost devices deployed in uncontrolled environments are easy to obtain, for in-depth analysis: which potentially gives 'class-break' information - to those who may have an interest. So, it's important not to seek to put the bar too high; and mitigations may come in unexpected ways and places.
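One reading of "keep the packets as near the same as possible" is to pad every payload to the same length before encryption, so that frame size reveals nothing about the content. The sketch below shows that idea only; the one-byte length prefix and zero fill are assumptions made for illustration, and `FixedSizePayload`, `pad` and `unpad` are hypothetical names.

```java
import java.util.Arrays;

/** Hypothetical fixed-length payload padding, applied before encryption. */
final class FixedSizePayload {
    /** Pads content to exactly size bytes: one length byte, the content, then zero fill. */
    static byte[] pad(byte[] content, int size) {
        if (size < 1 || size > 256 || content.length > size - 1) {
            throw new IllegalArgumentException("content does not fit in fixed size");
        }
        byte[] out = new byte[size]; // Zero-filled by default.
        out[0] = (byte) content.length;
        System.arraycopy(content, 0, out, 1, content.length);
        return out;
    }

    /** Recovers the original content from a padded payload. */
    static byte[] unpad(byte[] padded) {
        if (padded.length < 1) {
            throw new IllegalArgumentException("empty payload");
        }
        int len = padded[0] & 0xff; // Treat the length byte as unsigned.
        if (len > padded.length - 1) {
            throw new IllegalArgumentException("bad length byte");
        }
        return Arrays.copyOfRange(padded, 1, 1 + len);
    }

    public static void main(String[] args) {
        byte[] shortMsg = pad(new byte[] {1, 2, 3}, 32);
        byte[] longMsg = pad(new byte[20], 32);
        System.out.println("same size: " + (shortMsg.length == longMsg.length));
        System.out.println("recovered: " + Arrays.toString(unpad(shortMsg)));
    }
}
```

The cost is extra bytes on the air for short messages, which matters on a bandwidth-limited and battery-powered link.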
2015/06/11 thoughts: for an AES-GCM-protected 64-byte radio data frame minus a few bytes of overhead, and with preshared 128-bit keys (thus making this AES-128), have the 12-byte nonce consist of:
Adding 16 bytes of tag/authenticator gets to 26 bytes of raw overhead. (It may be possible to trim the (transmitted) nonce a little further, or even to trim the tag, without significant security compromise.) So half the raw frame is still available.
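The overhead arithmetic can be written out as a small sketch. The 10-byte non-tag overhead is inferred here as 26 minus the 16-byte tag; it is not stated directly in the text. The payload figures ignore the "few bytes of overhead" in the frame itself, and `FrameBudget` and its members are hypothetical names.

```java
/** Sketch of the frame overhead arithmetic for different GCM tag lengths. */
final class FrameBudget {
    /** Maximum raw frame size assumed in the text. */
    static final int MAX_FRAME_BYTES = 64;
    /** Inferred, not stated: 26 bytes of total overhead minus the 16-byte tag. */
    static final int NON_TAG_OVERHEAD_BYTES = 10;

    /** Total security overhead in bytes for a given tag length in bytes. */
    static int overhead(int tagBytes) {
        return NON_TAG_OVERHEAD_BYTES + tagBytes;
    }

    /** Bytes left in a maximum-size frame after the security overhead. */
    static int payloadBytes(int tagBytes) {
        return MAX_FRAME_BYTES - overhead(tagBytes);
    }

    public static void main(String[] args) {
        for (int tagBytes : new int[] {16, 12, 8}) {
            System.out.println(tagBytes + "-byte tag: overhead " + overhead(tagBytes)
                + " bytes, remaining " + payloadBytes(tagBytes) + " bytes");
        }
    }
}
```

With a 16-byte tag this leaves 38 of 64 bytes, consistent with the "half the raw frame" estimate.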
Note that since the pre-shared keys for leaf nodes are likely to be very long-lived, and avoiding reuse of nonce/IV is critical to AES-GCM security, these details are critical.
The current AVR-based OpenTRV hardware seems able to gather entropy reasonably well for the purposes of generating a random ID. It would not need to for key generation if secret keys are supplied to it. No further randomness is actually needed in the nonce/IV scheme above.
2015/06/11: it appears that an 8-byte tag is possible with AES-128 GCM, which would get overhead down to 18 bytes, or about 1/3rd of the maximum frame size. JDK complains with java.security.InvalidAlgorithmParameterException: Unsupported TLen value; must be one of {128, 120, 112, 104, 96} when asked for an 8-byte tag with Cipher.getInstance("AES/GCM/NoPadding", "SunJCE") but works with a 12-byte tag (96 bits), corresponding to 22 bytes total overhead on a protected frame. Note that security (effective key length) is likely reduced to 96 bits in this case however.
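For reference, a minimal sketch of AES-128-GCM with a 96-bit (12-byte) tag using the standard JDK API is shown below. The class and method names (`GcmFrameSketch`, `seal`, `open`) are hypothetical, the all-zero key and nonce in `main` are for demonstration only, and the default provider is used rather than naming "SunJCE" explicitly. The caller is assumed to supply a 12-byte nonce that never repeats under the same key.

```java
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.AEADBadTagException;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical AES-128-GCM frame protection with a truncated 96-bit tag. */
final class GcmFrameSketch {
    /** Tag length in bits: 96 is the shortest value the JDK accepts. */
    static final int TAG_BITS = 96;

    private static Cipher cipher(int mode, byte[] key16, byte[] nonce12, byte[] adata)
            throws GeneralSecurityException {
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(mode, new SecretKeySpec(key16, "AES"), new GCMParameterSpec(TAG_BITS, nonce12));
        c.updateAAD(adata); // ADATA is authenticated but sent in the clear.
        return c;
    }

    /** Encrypts the payload; the result is the ciphertext with the 12-byte tag appended. */
    static byte[] seal(byte[] key16, byte[] nonce12, byte[] adata, byte[] payload) {
        try {
            return cipher(Cipher.ENCRYPT_MODE, key16, nonce12, adata).doFinal(payload);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Decrypts and verifies; returns null if authentication of the frame fails. */
    static byte[] open(byte[] key16, byte[] nonce12, byte[] adata, byte[] sealed) {
        try {
            return cipher(Cipher.DECRYPT_MODE, key16, nonce12, adata).doFinal(sealed);
        } catch (AEADBadTagException e) {
            return null; // Tampered ciphertext, tag or ADATA, or wrong key/nonce.
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = new byte[16];   // Demonstration key only; never use a fixed key in practice.
        byte[] nonce = new byte[12]; // Demonstration nonce only; must be unique per frame.
        byte[] adata = {0x12, 0x34};
        byte[] payload = new byte[32];
        byte[] sealed = seal(key, nonce, adata, payload);
        System.out.println("sealed length: " + sealed.length);
        System.out.println("round trip ok: " + Arrays.equals(payload, open(key, nonce, adata, sealed)));
    }
}
```

The sealed output is the payload length plus 12 bytes of tag; any nonce or ADATA bytes sent alongside it add to the per-frame overhead.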
2015/06/14: a couple of days' work based on an available liberally-licensed (BSD) AES-GCM implementation produced a 14kB (down from 16kB originally) Arduino UNO version that agrees (for a simple test case) with a Java-based unit test. The existence of very small AES implementations such as tiny-AES128-C suggests that implementations with lower code/data (Flash/RAM) requirements are plausible. Something ≤4kB code and ≤512B RAM would probably be possible and usable.
Assuming that the overall scheme is satisfactory, side-channel attacks such as some variant of timing, RF/EMI leakage, power analysis, etc, will need to be checked for as code approaches production status.
(Note that the leaf MCU will probably need to be set up with 'conservative' 'fuse' setting to protect the keys from being extracted from the device, and that those keys may be safer hidden in Flash than EEPROM on some MCUs whose EEPROM can be read out without being cleared by loading new code, and there will be many other subtle foibles of particular devices.)
2015/06/12 I suggested to the dev list:
... where we need to verify at some future point that sensor data points have not been tampered with, that we do the following:
- Use the authentication mechanism between leaf/sensor and concentrator to confirm that data got to the concentrator intact.
- Have the concentrator by default sign with a private key (for which the public part is retained) any incoming data that it has authenticated as above.
So the leaf does not have to support keys good enough for long-term use for example.
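The concentrator-side signing step could look something like the following sketch, using the standard JDK `Signature` API. The choice of ECDSA over a 256-bit curve with SHA-256 is an assumption made for illustration, not something specified in the proposal, and `ConcentratorSigner` and its methods are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

/** Hypothetical concentrator-side signing of data already authenticated from a leaf. */
final class ConcentratorSigner {
    /** Assumed algorithm choice for this sketch only. */
    static final String ALGORITHM = "SHA256withECDSA";

    /** Generates a fresh EC key pair; the public part is retained for later verification. */
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("EC");
            generator.initialize(256);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Signs data that has already passed leaf-to-concentrator authentication. */
    static byte[] sign(PrivateKey key, byte[] authenticatedData) {
        try {
            Signature signer = Signature.getInstance(ALGORITHM);
            signer.initSign(key);
            signer.update(authenticatedData);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Checks at some later point that the data has not been altered since signing. */
    static boolean verify(PublicKey key, byte[] data, byte[] signature) {
        try {
            Signature verifier = Signature.getInstance(ALGORITHM);
            verifier.initVerify(key);
            verifier.update(data);
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            return false; // Malformed signature or unusable key.
        }
    }

    public static void main(String[] args) {
        KeyPair keys = newKeyPair();
        byte[] reading = "{\"T\":21}".getBytes(StandardCharsets.US_ASCII);
        byte[] signature = sign(keys.getPrivate(), reading);
        System.out.println("verified: " + verify(keys.getPublic(), reading, signature));
    }
}
```

The asymmetric work stays on the concentrator, which is the point of the proposal: the leaf only needs the symmetric authentication it already has.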
To which Bruno (EnergyDeck CTO) responded:
That's a good start. I would suggest the following for specialised use cases (where people are possibly happy to pay more):
- Have some sort of movement detection so that the sensor can send an event if it is moved. This can be essential for sensors in industrial environments where it is essential to ensure that once the sensor is installed it is not moved. The typical example we've seen is industrial cold storage where sensors are installed to prove that the temperature in the fridge is always within legal limits. One typical tampering in such cases is to move the sensor from one "problematic" fridge to a less problematic one.
- Have a way to keep the device powered for a few seconds when batteries are removed so that a "battery removed" event can be sent before dying altogether. The concentrator itself could auto-detect sensors that stop sending data but when that happens it can be for a number of reasons, some of them completely legit.
In addition, the following metrics could help a concentrator / data platform identify sensors that need attention before something bad happens:
- battery voltage: can then be used to predict when replacement is needed,
- data link quality: can then be used to identify sensors with weak connections that are at risk of disconnection.
(mCube Introduces Accelerometers Optimized for the ‘Internet of Moving Things’... which may also be good for detecting things that get moved that shouldn't.)