Thoughts on the IPv6 Transition

I’ve been discussing the IPv6 transition with our customers more recently; for over 3 years we’ve been dual-stack IPv4 and IPv6 for public-facing AWS-Cloud-based solutions and services for our customers.

“So what?”, you’re thinking.

It’s worth noting that from Google’s numbers, global IPv6 adoption is now approaching 36%, while at home in Australia it is at 27%, helped by TelCo carriers like Telstra enabling IPv6 for their mobile phone subscribers, and advanced ISPs like Aussie Broadband and Internode making IPv6 trivial to enable.

Google IPv6 Adoption, as of 12/Oct/2021

I first had an IPv6 tunnel established to Hurricane Electric in 1999 when I worked for The University of Western Australia. I championed the adoption of IPv6 as a first-class citizen in the cloud when I worked at Amazon Web Services as a Solution Architect, and these days, a large majority of AWS public-facing services already support dual-stack approaches, and more are on the way.

As the next billion people come online, the lack of available IPv4 address space is a limiting factor. The temporary value of IPv4 address space being reallocated (“sold”) between assignees will presumably peak when a majority of clients (people) and the services they are accessing are all on IPv6.

I have been advising a government body that had two “Class B” sized IPv4 subnets allocated to it. Each of these subnets is a “/16” netblock (65,536 addresses); they had only ever used a handful of /24 ranges from within their first allocation.
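
As a quick sanity check on the size of a /16, Python’s standard ipaddress module can do the arithmetic. The 172.16.0.0/16 block below is private address space used purely as a stand-in; it is not the agency’s real allocation.

```python
import ipaddress

# Stand-in block for illustration only (private RFC 1918 space).
block = ipaddress.ip_network("172.16.0.0/16")

# Total addresses in a /16 (2 ** 16).
total_addresses = block.num_addresses

# Number of /24 ranges that fit inside the /16.
slash_24_count = sum(1 for _ in block.subnets(new_prefix=24))

print(total_addresses, slash_24_count)  # prints: 65536 256
```

So “a handful of /24 ranges” is a handful out of 256 available in just the first block.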

Most services they use, both for staff and for public-facing services, now run on the cloud, from cloud-provider address space. They’re unlikely to need all of the address blocks they currently have from the first /16 block, let alone the second.

This netblock has a current value of a couple of million dollars (AUD).

It’s likely that many public sector agencies have IPv4 address netblocks that they’re unlikely to ever use, and could also benefit from reallocating to service providers desperate for their own address space to host solutions from.

Well, desperate until most clients are using IPv6.

I’d urge any public sector organisation to review their plans for using their address space, and if they have large, unused, contiguous address space, consider reallocating that. The funds raised can then help with further modernisation of workloads, including moving those workloads to IPv6 addressing.

For any managed service providers, I would urge you to “dual-stack” all public-facing Internet services. You should continue to use strong encryption in flight, modern TLS protocols, and strong authentication, regardless of the network transport protocol version.

If you are using AWS CloudFront as a CDN in front of your origin service, then enable IPv6 in the CloudFront configuration, and then publish the corresponding AAAA DNS record just as you have for the A DNS record. A similar approach works if using Cloudflare, Akamai, Fastly or others.
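
One way to confirm that a hostname is published on both address families is to look at what the resolver hands back. This is a minimal sketch, not a monitoring tool: the function names are made up for illustration, it relies only on the standard library, and it reports what your local resolver returns rather than what the authoritative DNS holds.

```python
import ipaddress
import socket


def address_families(addresses):
    """Return the set of IP versions (4 and/or 6) found in a list of address strings."""
    return {ipaddress.ip_address(a).version for a in addresses}


def is_dual_stack(addresses):
    """True when the list holds at least one IPv4 and at least one IPv6 address."""
    return address_families(addresses) == {4, 6}


def resolve(hostname, port=443):
    """Look up a hostname with the system resolver and return its unique addresses."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


# Offline demonstration using documentation-only addresses.
print(is_dual_stack(["192.0.2.10", "2001:db8::10"]))  # prints: True
print(is_dual_stack(["192.0.2.10"]))                  # prints: False
```

Against a live name you would call is_dual_stack(resolve("www.example.com")), substituting your own hostname for the placeholder.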

For those who use managed service providers for their corporate business networking, ask why your work Internet connection is not dual-stacked already. It’s typically a configuration question, and rarely has any actual cost associated with it. If you have a corporate proxy service, then if it is dual-stacked, the clients (on your internal corporate network) already get some benefit of being able to talk to IPv6 services.

If you have DNS services, check they not only can serve IPv6 records (AAAA), but they are reachable using IPv6. Services like AWS Route53 have done this for years (see my earlier point about getting IPv6 as a first-class citizen within AWS).

While you’re looking at DNS, have a look at creating a simple CAA record, to list the Certificate Authorities you obtain certs from.
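
For reference, a CAA record in zone-file form is a flags value, the “issue” tag, and the CA’s identifier in quotes. The helper below is a small sketch that prints such lines; the function name, the domain and the two CA identifiers are only examples, so substitute the identifiers your own CAs document.

```python
def caa_records(domain, issuers, flags=0):
    """Build zone-file style CAA lines, one per permitted Certificate Authority."""
    return [f'{domain}. IN CAA {flags} issue "{issuer}"' for issuer in issuers]


# Example only: a domain that obtains certificates from two CAs.
for line in caa_records("example.com", ["amazon.com", "letsencrypt.org"]):
    print(line)
# prints:
# example.com. IN CAA 0 issue "amazon.com"
# example.com. IN CAA 0 issue "letsencrypt.org"
```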

IoT and AWS IoT Core for LoRaWAN: Getting Started

Oskar loves sailing. He’s been doing it for a little over a year, and it’s the first time that he’s really taken to a sport. We’ve found a very inclusive mob of people around East Fremantle who are encouraging children to get into sailing, coupled with some awesome, massively overqualified coaches (eg, State, National, and Olympic sailors) who are keen to see their little fleets of junior sailors take up the sport.

I’ve done my bit; I learnt to sail in a Mirror Dinghy, many many years ago, and in my late teens and early twenties, learnt to sail a much larger, locally famous three-masted barquentine, Sail Training Ship Leeuwin. My formative late school years were spent around the B Sheds on the Fremantle wharf; I managed to sneak out of home to the ship as I had family with me: a second cousin who at the time was the permanent 1st mate (later captain). I used to crew, navigate, rig, and refit; during the summer period in port we’d sleep onboard, taking shifts on the boat to keep it safe.

When you sail on the ship on a voyage, the young sailors are split into four Watches (teams): Red, Blue, Green, Yellow. When we were in port, working the day down in the bilges and guarding the ship at night, the small team would call ourselves the Black Watch. On New Year’s Eve we would even get the (small) cannon out, for the strike of midnight. Yo ho ho!

Aaany ways… Oskar’s taken to sailing a small boat originally designed by the Bic pen company. This small skiff is basically a piece of hollowed plastic, a small sail, centreboard and rudder. It flips about as readily as a politician faced with the truth and facts, but luckily, as a flat piece of plastic, there’s no bailing and it rights easily. The design is now open and in a great riff on the fact that Bic started it, it’s now called the O’pen skiff. Heh – pen, get it?

Some of these kids can see when they are about to capsize, and calmly step out over the side just as the boat keels over, and have been known to seamlessly step onto the exposed centreboard, and counter the capsize!

I’ve done my bit, helping the coaches in the support boats (rigid inflatable boats, or RIBs, or a classic tinny), which has mostly been about doing running repairs, helping tow stricken vessels, or swapping kids in and out of boats (I’m not a soon-to-be Paris-2024 Olympian; I’ll let the pros do the instruction). But to become a little more useful, I sat the Skipper’s Ticket licence (Dept Transport WA) so I can now drive the powerboats and not just be a passenger.

The Fleet of 6 to 9 boats also race in the Swan River by the East Fremantle Yacht Club. And thus, as parents stood around the shoreline at Hillarys Marina a few weeks back, our children taking out their 2-metre plastic boats, one parent said to me, “could we get real-time positioning and a map of their boats?”

And herein starts the rabbit hole of my first foray into IoT.


I first saw LoRaWAN at the AWS Summit in Sydney around 2016 or so. Back in the early 2000s, I was playing with long-distance 802.11b, with cantennas (antennas made from large cans, an old commercial-sized coffee tin if I recall), and at one stage had a 17-metre antenna on my father’s Osborne Park factory roof, with an Apple Access Point, powered via PoE, rigged at the top. I’ve done a bit of networking over the years (I hold the AWS Certified Network – Specialty certification, and have contributed Items (questions) to it).

So now was the time to look at how to do long-distance, low-power, low-throughput data today.

Requirements

We want to get frequent (second or two) GPS location of between 5 and 20 boats. They’ll be travelling along the Swan River, mostly (occasionally a few coastal regattas). We want to have a map showing the location of all boats, and a tail of their last few moments, and last known speed and direction. We’ll then display this map in the public spaces around the venue.
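
The “last known speed and direction” part can be derived from two consecutive GPS fixes. The following is a minimal sketch using the standard haversine and initial-bearing formulas; the function names and the spherical-Earth radius are choices made for illustration, not anything the tracker or AWS provides.

```python
import math

EARTH_RADIUS_M = 6371000  # mean Earth radius in metres (spherical approximation)


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in metres between two fixes."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial compass bearing in degrees (0 = north, 90 = east)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dl = math.radians(lon2 - lon1)
    y = math.sin(dl) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dl)
    return (math.degrees(math.atan2(y, x)) + 360) % 360


def speed_knots(metres, seconds):
    """Convert a distance covered over a time interval into knots."""
    return (metres / seconds) / (1852 / 3600)
```

Given two fixes a couple of seconds apart, speed_knots(distance_m(...), 2) and bearing_deg(...) give the numbers to draw next to each boat on the map. Over such short distances GPS jitter will dominate, so some smoothing over the last few fixes is likely to be needed.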

Hardware

Pretty quickly we zoomed in on the Dragino LGT-92, LoRaWAN GPS Tracker. It’s around AUD$100, and has a good battery life. It recharges by a micro USB port. It can be adjusted via a TTL serial interface (for which I don’t yet have a device to chat with it).

Noticing that I was not covered by any The Things Network (TTN) public gateways in my area, I also purchased a RAK7246 LoRaWAN Developer Gateway at $225 delivered (IoT Store Perth). And having seen the data rates I’d like to do, I’m glad I have my own gateway.

Cloud

So how does the cloud come into this? Well, the gateway device is just one part of it; it’s effectively a data forwarder. There may be multiple gateways in my network to extend coverage; yes, they could be a mesh of devices, or they could be separately homed to the Internet. Each Gateway registers against a LoRaWAN Network Server (LNS). It is the LNS that has the central configuration of gateways and end devices, and processes the data coming from them all.

I could deploy my own LNS, or I can use the AWS Managed version of it, and then trundle the data out to the application that I want to have consume it. At this point, that application is probably just DynamoDB, with items containing the device unique identifier, timestamp, latitude, longitude, battery level, and firmware revision. And thus, AWS IoT Core for LoRaWAN.
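
To make that item shape concrete, here is a hedged sketch of one reading as a plain Python dict. The attribute names and the choice of device identifier plus timestamp as the key are assumptions for illustration, not a settled table design. Floats are converted to Decimal because boto3’s DynamoDB resource interface expects Decimal rather than float for numbers.

```python
from decimal import Decimal


def to_item(device_id, timestamp, lat, lon, battery_v, firmware):
    """Shape one tracker reading as a DynamoDB-style item (a plain dict)."""
    return {
        "device_id": device_id,                # assumed partition key
        "timestamp": timestamp,                # assumed sort key, ISO 8601 string
        "latitude": Decimal(str(lat)),
        "longitude": Decimal(str(lon)),
        "battery_v": Decimal(str(battery_v)),
        "firmware": firmware,
    }


# "tracker-01" is a made-up identifier for the example.
item = to_item("tracker-01", "2021-10-05T12:41:44Z", 0.0, 0.0, 3.959, 154)
print(item["battery_v"])  # prints: 3.959
```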

Getting started

As an initial overview, thanks to Greg Breen from AWS, is this YouTube video in which Ali Benfattoum describes putting these together. This video from December 2020 is now slightly out of date with the AWS Console (things move pretty quickly), but you can follow along easily enough.

The first thing I did was update the installed Raspbian. A new major release has come out, so an apt-get update && apt-get dist-upgrade is in order. Some CA certificates in the chain of one of the repositories listed in /etc/apt/sources.list.d/ had expired, so it took a little bit of work to get this amenable. A quick reboot (having updated the Raspbian OS) and I dutifully pulled in git as described in the above video, cloned LoRa Basics Station (basicstation), and built it (make).

I found that the Gateway device registered exactly as shown in the video, and showed up with no problems. However, my radio devices weren’t attaching. Well, turns out there was a process running on the gateway for The Things Network, which had exclusive access to the local LoRa radio. So I stopped that process, repeated, and data flowed through. Knowing I didn’t want that TTN process to restart, I found its systemd config file in /etc/systemd/, and removed it (well, copied it away to my home directory).

The first hurdle

I rebooted the device overnight, and the next day went to restart the basicstation service from the command line. But no matter what, it couldn’t turn on the local LoRa radio.

I lucked upon a post that suggested the radio needs a GPIO pin reset, and that it was either pin 25 or 17 that would do the trick. Hence, I had this small script that I called reset_gw.sh:

#!/bin/sh
# Reset the LoRa radio: drive GPIO 17 high for 500 microseconds, then low.
gpioset --mode=time --usec=500 pinctrl-bcm2835 17=1
gpioset --mode=time --usec=500 pinctrl-bcm2835 17=0

I ran this, and then the radio reset! Browsing through posts it appears that basicstation doesn’t initialise or reset the radio; I can only presume that the TTN daemon did, and when I initially killed it and fired up basicstation, the radio was good to go. So the rule now is to reset the radio as part of the initialisation of basicstation; I found basicstation has support for a command line argument to call the above script.

Given I want basicstation to start and connect on boot, it needed its own unit file in /etc/systemd/system/:

[Unit]
Description=basicStation

[Service]
WorkingDirectory=/home/pi
Environment=RADIODEV=/dev/spidev0.0
ExecStart=/home/pi/basicstation/build-rpi-std/bin/station --radio-init=/home/pi/reset_gw.sh
SyslogIdentifier=basicstation
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target

Note I also put a symlink to this in /etc/systemd/system/multi-user.target.wants/.

The other optimisation I did was to go into the WiFi settings for this little device, in /etc/wpa_supplicant/. I want to list a few networks (and preshared keys/passwords) that I want the device to just connect to. Hence my /etc/wpa_supplicant/wpa_supplicant.conf file now looks like:

ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev
update_config=1
country=AU

network={
        ssid="22A Home"
        psk="Password"
        priority=1
}
network={
        ssid="JEB Phone"
        psk="AnotherPasswoed"
        priority=2
}

No, that’s not the real password or ssid. But the JEB Phone one means that if I take the gateway on the road, I can power it up (USB) and then have it tethered to my mobile phone to backhaul the data.

The Data Flows

Following the above demo, I now have data showing up. I long pressed the one and only button on the tracker for 5 seconds, and this is what ends up on the IoT Topic:

{
  "WirelessDeviceId": "46d524e5-88f6-8852-8886-81c3b8f38888",
  "PayloadData": "AAAAAAAAAABPd2Q=",
  "WirelessMetadata": {
    "LoRaWAN": {
      "ADR": true,
      "Bandwidth": 125,
      "ClassB": false,
      "CodeRate": "4/5",
      "DataRate": "0",
      "DevAddr": "01838d9f",
      "DevEui": "a8408881a182fb39",
      "FCnt": 5,
      "FOptLen": 1,
      "FPort": 2,
      "Frequency": "917800000",
      "Gateways": [
        {
          "GatewayEui": "b888ebfffe88f958",
          "Rssi": -48,
          "Snr": 11
        }
      ],
      "MIC": "600d5102",
      "MType": "UnconfirmedDataUp",
      "Major": "LoRaWANR1",
      "Modulation": "LORA",
      "PolarizationInversion": false,
      "SpreadingFactor": 12,
      "Timestamp": "2021-10-05T12:41:44Z"
    }
  }
}

That’s a lot of metadata for the payload of “AAAAAAAAAABPd2Q=”. That’s 11 bytes, and behold, it has data embedded in it. I used the following Python to decode it:

#!/usr/bin/python3
# Decode a base64 LGT-92 uplink payload passed as the first command line argument.
import base64
import struct
import sys

v = base64.b64decode(sys.argv[1])

# Bytes 0-3 and 4-7: latitude and longitude as big-endian signed 32-bit
# integers, in millionths of a degree.
lat_raw, long_raw = struct.unpack(">ii", v[0:8])
lat_parsed = lat_raw / 1000000
long_parsed = long_raw / 1000000

# Byte 8: bit 6 is the alarm flag; the low 6 bits plus byte 9 are the
# battery reading in millivolts.
alarm = (v[8] & 0x40) > 0
batV = ((v[8] & 0x3f) << 8 | v[9]) / 1000
# Byte 10: the top 2 bits are the motion mode, the low 5 bits the firmware offset.
motion = v[10] >> 6
fw = 150 + (v[10] & 0x1f)
print("Lat: {}, Long: {}".format(lat_parsed, long_parsed))
print("Alarm: {}, Battery: {}, Motion mode:{}, Fw: {}".format(alarm, batV, motion, fw))

Running this with the payload as a parameter on the command line gives:

Lat: 0.0, Long: 0.0
Alarm: True, Battery: 3.959, Motion mode:1, Fw: 154

And that’s what we expected: the alarm button was pressed, and the documentation says that when this is the case, lat and long are set to zero on the initial packet sent in alarm state.

And so

Now that I have a gateway I can move around, reboot, and have it uplink on home wifi or mobile phone tether, I can wander around and then put the token out there. What I am up to next is to pull out that payload and push it to Dynamo. Stay tuned for the next update….
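
As a starting point for that next step, this sketch pulls the interesting fields out of a message shaped like the one shown earlier. It is only an illustration: the function name and the returned field names are made up, and the sample is a trimmed copy of the message above rather than fresh data.

```python
import base64
import json


def extract_uplink(message_json):
    """Pull the device EUI, timestamp and raw payload bytes out of one uplink message."""
    message = json.loads(message_json)
    lorawan = message["WirelessMetadata"]["LoRaWAN"]
    # Pick the gateway that heard the device best (highest RSSI).
    best_gateway = max(lorawan["Gateways"], key=lambda g: g["Rssi"])
    return {
        "dev_eui": lorawan["DevEui"],
        "timestamp": lorawan["Timestamp"],
        "payload": base64.b64decode(message["PayloadData"]),
        "gateway_eui": best_gateway["GatewayEui"],
        "rssi": best_gateway["Rssi"],
    }


# Trimmed copy of the sample message, keeping only the fields used here.
sample = json.dumps({
    "PayloadData": "AAAAAAAAAABPd2Q=",
    "WirelessMetadata": {"LoRaWAN": {
        "DevEui": "a8408881a182fb39",
        "Timestamp": "2021-10-05T12:41:44Z",
        "Gateways": [{"GatewayEui": "b888ebfffe88f958", "Rssi": -48, "Snr": 11}],
    }},
})

uplink = extract_uplink(sample)
print(uplink["dev_eui"], len(uplink["payload"]))  # prints: a8408881a182fb39 11
```

The payload bytes can then be fed to the decoder, and the result written out as an item per reading.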

Blocking outbound HTTP from the Home Network, 2021

With the move to HTTPS as a default, I took a chance and recently blocked outbound (EGRESS) HTTP (TCP 80) traffic from my home network. I’ve got around 30 – 40 devices on the network, and I was intrigued to see what we (my family and I) would experience.

With my UniFi Dream Machine Pro, this was a reasonably easy update: Settings -> Traffic & Security -> Global Threat Management -> Firewall. I added a rule for Internet Out that dropped anything going to port 80:

HTTP reject rule for Internet Out from UniFi Dream Machine Pro

This is a rule I had in place for two days. I checked my own laptop access for HTTPS, SSH, IMAPS and SMTPS egress, and all was fine.

What transpired over the following two days helped me identify the devices and vendors whose products still depend on unencrypted HTTP over the Internet to operate.

Logitech Smart Radio

We have had a streaming radio for some time; we still like to listen to London Capital Radio despite the 7-8 hour timezone offset. Within a few minutes of blocking HTTP, the audio stream stopped.

We purchased this device around 2012. On my network, it identifies as a Squeezebox running RedHat. The manufacturer discontinued it years ago and there have been no firmware updates for a long time. It only supports 802.11g wifi in the 2.4 GHz spectrum (is this WiFi 3?).

I wasn’t prepared to replace the device, so for the moment, a work-around rule to permit HTTP (by MAC address) fixes this for the short term. We’re unlikely to see any updates from Logitech anyway.

GoToWebinar

This one surprised me; I was signing in to a webinar, and the obligatory download tried to execute and stalled. It turned out the installer was doing an HTTP based OCSP check.

Now, for web browsers, OCSP has been mostly relegated to the annals of history, replaced with OCSP Stapling.

OCSP is a network efficient query that a client can do against a Certificate Provider’s endpoint to get a signed confirmation that the certificate in question has not been revoked recently. However, in doing so, it tells the certificate authority which site you (your source IP address) just visited; this is called an Information Disclosure vulnerability. Instead, the website in question fetches these signed validations at a regular interval and passes this to the clients that it’s already communicating with – stapling the validation to the certificate during TLS negotiation: “Hi client, here’s my certificate, and here’s a recent verification that my certificate is not revoked”.

Using HTTP for OCSP isn’t too bad, as the response that is being downloaded is itself cryptographically signed. But it’s still visible in the clear for all to see.

Update 27/9: No response from GoToWebinar.

Enphase Envoy

12 hours later, my solar panel data aggregation service, Enphase Enlighten, alerted me that it was no longer receiving data from my solar panel inverters. I added another rule to permit the Envoy controller to make HTTP outbound requests, and the data started flowing.

This is a real issue. The submission of the generation and consumption of power in my home should not be trundling over the Internet unencrypted.

I raised a support request with Enphase (21/September 2021 at 15:04 AWST UTC +8), asking them to contact me; I received a ticket auto-response (03164xxx) but no other contact.

Later that night I tried reaching out over Twitter to any security folk at Enphase, but after 24 hours, no response.

Update 27/9: After a few days of no reply, Enphase asked (via automated email) how my support experience was. Unbelievable!

I then tried calling them on their Australian support number as shown on their website. I ended up in a call queue which was quite amusing in itself; every 5 seconds (no, literally, every 5,000 ms) it would announce that all operators were busy, and then it would restart the same audio music clip, only to then interrupt itself… I gave up after 20 minutes in the queue.

Lastly, I have DMed the Enphase Twitter account, and await a reply.

Enphase does not have a security.txt file on their website!

Any customer data should be transmitted to an HTTPS endpoint. The firmware of the Envoy device should have the Certificate Authority’s Root Certificate, used to issue that Endpoint certificate, in its trust store. The device should receive updates as that Root CA expires and is replaced (this happens every 10 – 20 years per CA). The embedded firmware would also need to keep step with the improving TLS protocols over time; right now TLS 1.3 would be ideal, but in future, who knows.
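
What the Envoy firmware actually runs is not visible from the outside, so the following is only a Python illustration of the client behaviour described above: verify the server certificate against a trust store, check the hostname, and refuse old protocol versions. The function name and the TLS 1.2 floor are assumptions made for the example; the floor could be raised to ssl.TLSVersion.TLSv1_3 where every endpoint supports it.

```python
import ssl


def strict_client_context():
    """Build a client-side TLS context that verifies certificates and refuses old protocols."""
    # create_default_context() loads the system trust store and turns on
    # certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Refuse anything older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = strict_client_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED, ctx.check_hostname)  # prints: True True
```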

What also struck me was the lack of IPv6 being picked up by this device; not only should it have picked up a new IPv6 address locally, the Endpoint it submits its data to should also be dual-stack IPv4 and IPv6.

Apple iPad 14.8 -> 15.0 upgrade

This one was very unusual. Apple had released iOS 15, and our iPads were about to make the jump from 14.8. However, despite full WiFi signal, the devices kept announcing they couldn’t verify the downloaded image because they couldn’t connect to the WiFi!

I’m hoping Apple can address this dependency before the next iOS update.

Update 27/9: Apple responded on Twitter concerned only that I could install the update, not that the security issue was being investigated, resolved or understood.

Nintendo Switch

It turns out the Nintendo Switch won’t join a WiFi network that doesn’t have outbound HTTP access; it must do a call home or validation using HTTP.

Summary

I’ve paused the experiment for the moment, but next month I’ll resume it and find more edge cases where devices we rely upon still use unencrypted channels, exposing our data, without us even knowing…

CloudFront Functions and Security Headers

For a long time, I’ve been using Lambda@Edge to inject various HTTP security-related headers to help browsers improve the security model of the content that they fetch and render.

I’ve been doing this as I have been using S3 as the origin (accessed via a CloudFront Origin Access Identity). S3 itself cannot add/inject many of the common security headers when it passes content back to CloudFront.

These Lambda@Edge functions execute on Origin Response, when the origin returns the content to the CloudFront regional edge; the returned content then gets cached with the injected headers included.

The end result is getting a good rating on securityheaders.com, hardenize.com, and other public security evaluation services.

An alternative in the Lambda@Edge execution lifecycle is to trigger on Viewer Response, in which case the cached version doesn’t have the headers injected, and every viewer request triggers the code execution. Clearly, if every viewer gets the same set of headers, there’s no need to execute on each viewer response and pay for the additional Lambda@Edge executions.

Now there’s a new option – CloudFront Functions (AWS blog post). Written entirely in JavaScript, it executes only at Viewer Request, or Viewer Response. There is no Origin Request or Origin Response option. It also executes at the CloudFront Edge, not the Regional Edge.

This example injects a number of headers, and would need only minor customisation of the Content Security Policy (and possibly Permissions Policy) to work for most sites:

function handler(event) {
    // Viewer Response trigger: add security headers to the response before it reaches the viewer.
    var response = event.response;
    response.headers['strict-transport-security'] = { value: 'max-age=31536000' };
    response.headers['x-xss-protection'] = { value: '1' };
    response.headers['x-content-type-options'] = { value: 'nosniff' };
    response.headers['x-frame-options'] = { value: 'DENY' };
    response.headers['referrer-policy'] = { value: 'strict-origin-when-cross-origin' };
    response.headers['expect-ct'] = { value: 'enforce, max-age=86400' };
    response.headers['permissions-policy'] = { value: 'geolocation=(self), midi=(), sync-xhr=(self), microphone=(), camera=(), magnetometer=(), gyroscope=(), fullscreen=(), payment=(), autoplay=(self)' };
    // CSP directive names take no colon: "default-src 'self'", not "default-src: 'self'".
    response.headers['content-security-policy'] = { value: "default-src 'self'; img-src 'self' data:; style-src 'self' 'unsafe-inline'; frame-ancestors 'none'; form-action 'none'; base-uri 'self';" };
    return response;
}

You may want to evaluate the cost of both Lambda@Edge and CloudFront Functions. After the first year, Functions is charged at US$0.10 per million invocations. As an equivalent, Lambda@Edge for a similar Node.JS function that executes in one millisecond with 128 MB of memory would be US$0.2021 per million requests.

However, given a busy website, you may want to look at the efficiency differences between Viewer Response execution for CloudFront Functions, and Origin Response and the caching for Lambda@Edge (multiplied by the number of Edge Cache locations (13), and the cache retention rate).

If you have only a few unique URLs, content that can be cached for a long period, and large volumes of requests, then Lambda@Edge may result in near-free execution.

| | Lambda@Edge | CloudFront Functions |
|---|---|---|
| Unique URLs | 100 | 100 |
| HTTP viewer requests | 10M/month | 10M/month |
| Execution duration | 1 ms | N/A |
| Number of Regional Edges | 13 | N/A |
| Memory/execution | 128 MB | N/A |
| Execution trigger | Origin Response | Viewer Response |
| Number of code invocations | 1300 (once per Regional Edge, per unique URL, and possibly cached for a month, depending on Edge cache expiry) | 10M |
| Possible costs (as at 28/Aug/2021) | Duration: US$0.0000000021 * 1300 = US$0.00000273; Requests: US$0.2 * 0.0013 = US$0.00026; Total: US$0.00026273 | US$0.1 * 10; Total: US$1 |
| CloudFront Functions cost uplift compared to Lambda@Edge | | 3,806 times more expensive |

CloudFront Functions would be cheaper if we were using Lambda@Edge on Viewer Response and not caching the object with headers injected, or if the content being sent from the origin was dynamic and not suitable to be cached; in either case we wouldn’t get the efficiency savings of fewer executions.

Even if we are using Origin Response with Lambda@Edge, we can’t determine the cache expiry of the cached responses (though we can influence it); the cached objects could expire and re-execute every day, so the Lambda@Edge costs could go up 30x (which would only make CloudFront Functions roughly 127 times more expensive). YMMV. TIMTOWTDI.
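
The arithmetic behind those ratios is easy to replay. This sketch uses only the figures quoted in the comparison above (as at 28/Aug/2021); the function and parameter names are made up, and refreshes_per_month is a simplifying assumption standing in for how often the cached objects expire.

```python
# Figures taken from the comparison above (as at 28/Aug/2021).
UNIQUE_URLS = 100
REGIONAL_EDGES = 13
VIEWER_REQUESTS = 10_000_000

LAMBDA_DURATION_PER_INVOCATION = 0.0000000021  # US$ for 1 ms at 128 MB
LAMBDA_PER_MILLION_REQUESTS = 0.20             # US$
FUNCTIONS_PER_MILLION = 0.10                   # US$


def lambda_edge_cost(refreshes_per_month=1):
    """Monthly Lambda@Edge cost on Origin Response, given how often the cache refreshes."""
    invocations = UNIQUE_URLS * REGIONAL_EDGES * refreshes_per_month
    duration = invocations * LAMBDA_DURATION_PER_INVOCATION
    requests = invocations / 1_000_000 * LAMBDA_PER_MILLION_REQUESTS
    return duration + requests


def cloudfront_functions_cost():
    """Monthly CloudFront Functions cost when every viewer response runs the function."""
    return VIEWER_REQUESTS / 1_000_000 * FUNCTIONS_PER_MILLION


monthly = cloudfront_functions_cost() / lambda_edge_cost()
daily = cloudfront_functions_cost() / lambda_edge_cost(refreshes_per_month=30)
print(round(monthly), round(daily))  # prints: 3806 127
```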

Browser support for FTP (another sunset)

As Hanno Böck noted in the recent Bulletproof TLS Newsletter, FTP Support in Firefox 90 has been removed. We’ve seen similar messaging from most major browser vendors over the last few years.

I’m going to make a bold prediction, and say in 10 years’ time we’ll be seeing the removal of (plain text) HTTP support as well. Regardless of internal or external networks (an out-dated concept aligned to the Crunchy Shell model of network security), the move to stronger security for all communications, backed by free TLS Certificate Authorities (such as Let’s Encrypt), means we should be doing end-to-end encryption for everything the common web browser fetches.

For some time, Firefox has had an HTTPS-only mode, with warnings when services try and dip back to unencrypted access. I’ve typically found this warning pops up when various link-shortening services are chained together, and I’m grateful for the awareness that a jump in that chain is poorly implemented.

In the meantime, the distribution of files using FTP needs to stop. If you run an FTP service then you need to think about transitioning to something that permits access using HTTPS as the transport protocol.

Another sunset in the circle of life of a protocol.