Frustrating IoT Devices!

I’ve been continuing my IoT journey, finding that IoT devices are a little fickle.

My first LGT92 GPS Tracker device failed back in 2021, and I tried contacting both the retailer (IoT Store Perth) and the manufacturer. The manufacturer instructed me to open the clamshell case and take numerous photos to send to them. They suggested a fault, and that they would organise a replacement, but after 6 months, nothing has happened.

During that period, I ordered a second LGT92, and it failed on first use. I contacted IoT Store again – by webform, email, and phone – and after many weeks spoke to “Sam”, who from the sound of it was on the phone in his car. While he said he would look into this (and the original), nothing came of it, and I tried following up several times.

I then tried to get an IP67-rated, solar-powered device; however, what IoT Store sent me had no solar panel or GPS tracker device, just a box with some wires and screws. Again I spoke with “Sam” (in his car again), having tried webforms, email and his mobile number multiple times, and again he said he’d follow up on it; that was three months ago, with no success since.

So I’m never buying anything from IoT Store again, and I strongly advise against anyone else doing so. The customer service is terrible. Not one of the emails I’ve sent has been replied to. Not one of the contact forms has been responded to. And when I have managed to speak with Sam, he is evasive, and does not follow up on the actions he says he’ll take.

Next up is the RAKwireless RAK10700, a new GPS tracker device, again IP67 rated, with solar power. Released in 2022, these devices shipped from China after about 3 months, but without a battery for the solar panel to charge. I ordered a LiPo battery from Amazon.com.au, but naturally it had a different connector, so I found myself soldering again after 15 years.

But they do power up, with device firmware 1.0.4 installed. I connected over a serial port and entered the AT commands to dump the config: Dev EUI, App EUI and App Key.
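For reference, each value can be queried individually over the serial console; a minimal session might look like this (command names assume the RAK WisBlock AT command set, with values masked as elsewhere in this post):

   AT+DEVEUI=?
   AC1F09FFFE______
   OK
   AT+APPEUI=?
   AC1F09__________
   OK
   AT+APPKEY=?
   AC1F09__________________________
   OK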

I entered these into the AWS IoT Core device registration, and ensured things like the frequency were correct, but the device refuses to join the LoRaWAN network via the local gateway running basicstation (current build at this time). The best log output from the basicstation gateway shows:

Mar 28 14:15:19 rak-gateway basicstation[12538]: 2022-03-28 13:15:19.629 [S2E:VERB] RX 917.0MHz DR2 SF10/BW125 snr=-14.8 rssi=-89 xtime=0x6900001BA46F94 - jreq MHdr=00 JoinEUI=ac1f:9ff:f915:4631 DevEUI=ac1f:9ff:fe06:7117 DevNonce=35258 MIC=1390227384

Mar 28 14:15:20 rak-gateway basicstation[12538]: 2022-03-28 13:15:20.093 [S2E:WARN] Unknown field in dnmsg - ignored: regionid

And the output on the tracker device shows:

+EVT:JOIN FAILED

Out of interest, the output of AT+STATUS shows (with some of the keys and addresses hidden with underscores):

Device status:
   Auto join enabled
   Mode LPWAN
   Network not joined
LPWAN status:
   Dev EUI AC1F09FFFE______
   App EUI AC1F09__________
   App Key AC1F09__________________________
   Dev Addr 26021F__
   NWS Key 323D155A000DF335307A16DA0C______
   Apps Key 3F6A66459D5EDCA63CBC4619CD______
   OTAA enabled
   ADR enabled
   Public Network
   Dutycycle disabled
   Send Frequency 2
   Join trials 2
   TX Power 0
   DR 3
   Class 0
   Subband 1
   Fport 2
   Unconfirmed Message
   Region AU915
LoRa P2P status:
   P2P frequency 916000000
   P2P TX Power 22
   P2P BW 125
   P2P SF 7
   P2P CR 1
   P2P Preamble length 8
   P2P Symbol Timeout 0

I did notice the documentation from RAKwireless says that firmware 1.0.1 supports LoRaWAN MAC version 1.0.2 (not the 1.0.3 that the LGT92 supported); this version difference is defined in a device profile in AWS IoT Core for LoRaWAN.
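For those scripting the AWS side, the device profile is managed via the iotwireless API; a minimal sketch (the profile name and parameter values here are illustrative, not the exact settings for this device):

   aws iotwireless create-device-profile \
      --name rak10700-au915 \
      --lorawan "MacVersion=1.0.2,RfRegion=AU915,SupportsJoin=true"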

What I also noticed was that the documentation for the RAK10700 at https://docs.rakwireless.com/Product-Categories/WisBlock/RAK10700/Datasheet/#software says the firmware version available is 1.0.1 – older than what shipped to me on the device:

+VER:1.0.4 Jan 14 2022 14:17:02

But on that same documentation page there is a link to download the firmware, which is unfortunately a 404!

So, my journey continues, but I’ve learnt a few lessons. The IoT device landscape seems… littered with failures. The quality of commodity devices is low, the compatibility is bewildering, and the standards are still evolving.

Transitioning to IPv6 in AWS

There are a large number of workloads that operate in the AWS Cloud using traditional virtual machines (Instances) on traditional IPv4 networking. For the last few years, we’ve seen steady growth in IPv6 adoption globally. For those who haven’t started this journey yet, here are some notes on what you may want to look at as you start to embrace the future of the Internet.

It should be noted that this transition is a two-way street:

  1. you need to get ready to offer your digital services to your clients over both IPv4 and IPv6 (Dual Stack)
  2. you need the dependent services you use to offer (listen on) an IPv6 address, probably via a gradual transition offering both IPv4 and IPv6 for a (long) period of time

Within your internal (to your VPC) network architecture you can use either network protocol: the initial focus needs to be on enabling your incoming traffic to use either IPv4 or IPv6.

Your transport layer security (TLS) should be identical on either network protocol. The IP protocol is just a transport protocol.

Here are the steps:

  1. VPC Changes
  2. Subnet Changes
  3. Routing Changes
  4. Load Balancer Changes
  5. Security Group Changes
  6. DNS Changes

VPC Configuration

Adding an IPv6 address block to a VPC is reasonably simple. While you can allocate from your own assigned pool, it’s far easier to use the AWS pool; it’s ready to go and doesn’t need any other preparation.

There are three ways to add an IPv6 address allocation:

  • In the console, via ClickOps
  • Via the API (including the CLI)
  • Via the CloudFormation template that defines your VPC – highly recommended

Assigning the address block to the VPC does not actually use it, and should have zero impact on already running workloads. You should be safe to apply this at any time.
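If you do want to script it, a minimal CLI sketch (the VPC ID is illustrative):

   aws ec2 associate-vpc-cidr-block \
      --vpc-id vpc-0123456789abcdef0 \
      --amazon-provided-ipv6-cidr-block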

Subnet Configuration

Once the VPC has an allocation, we can then update existing subnets to also include an allocation from within the VPC’s range. The key difference here is that while in IPv4 we can choose the size of the subnet, in IPv6 we cannot: every IPv6 allocation to a subnet is a /64, which is about 18 billion billion IP addresses.

You can undo an allocation if no network interfaces (ENIs) in the subnet are using those addresses.

The configuration is relatively simple: you get to choose which slice of the VPC IPv6 address block will be used for which subnet. I follow a pretty simple rule: I anticipate that my VPCs may one day spread across 4 Availability Zones, so I allocate subnets sequentially across Availability Zones in order to be able to reference the range via a supernet.

The reason for this is:

  • subnetting is done in powers of two: so for contiguous addressing (supernetting) we’re looking at using two AZs, four AZs, or eight AZs, etc.
  • two Availability Zones is insufficient. If one fails, then you are running on a single Availability Zone during the incident (which may last several hours). That AZ may be constrained in capacity, while other AZs may be underutilised. Hence we want at least three AZs, so that fault tolerance can be restored DURING a single-AZ outage

Most Regions have between three and five AZs. Preparing for eight in most Regions would reserve address space we’ll likely never allocate.

Hence, starting with the public subnets, we want to sequentially allocate them with space to accommodate four AZs. Each subnet’s allocation is identified by a hexadecimal number between 00 and FF – hence a limit of 256 subnets in total. If we stick to the four-AZ allocation, that’s 64 sets of subnets across all AZs.
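As a worked example, using an illustrative AWS-assigned /56: allocating hex values 00 through 03 to the public subnets keeps them in one contiguous block that can be referenced as a single /62 supernet:

   VPC allocation:          2406:da1c:abcd:ef00::/56
   Public subnet, AZ-a:     2406:da1c:abcd:ef00::/64
   Public subnet, AZ-b:     2406:da1c:abcd:ef01::/64
   Public subnet, AZ-c:     2406:da1c:abcd:ef02::/64
   Reserved for a 4th AZ:   2406:da1c:abcd:ef03::/64
   Public supernet:         2406:da1c:abcd:ef00::/62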

Again, you can allocate these by:

  • ClickOps in the console on each existing subnet (or when creating new subnets)
  • API call (including the CLI)
  • CloudFormation template – recommended – in which case, look at the Fn::Cidr function to calculate the allocation. Check out my post from March 2018 on this.
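For the CLI route, the equivalent per-subnet call is sketched below (the subnet ID and the /64 slice are illustrative, continuing the worked example above):

   aws ec2 associate-subnet-cidr-block \
      --subnet-id subnet-0123456789abcdef0 \
      --ipv6-cidr-block 2406:da1c:abcd:ef00::/64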

If your focus is to start with your services being dual-stack available, then the only subnets you need to allocate initially are the Public Subnets: the subnets where your client facing (internet facing) load balancers are.

Once again, there’s no interruption to existing traffic during this change; indeed, you’re less than halfway through the required changes.

You may also allocate the rest of your private subnets at this time if you wish.

Routing Changes

For public subnets to function, they need a route for the IPv6 default route, “::/0”, via the existing Internet Gateway (IGW); when pointing to the IGW, this permits two-way traffic just like IPv4. Your set of public subnets will need this route, and this can be done at any time: permitting IPv6 routing won’t start clients using it.

If you have private subnets with IPv6 allocations, and you want them to be able to make outbound requests to the Internet over IPv6, then you may want to consider an Egress Only IGW as the destination for “::/0” in those private subnets’ route tables. Note your public subnets will still use the standard IGW.

The Egress Only IGW resource does what it says, and supplants the need for the NAT Gateway used in IPv4 (more on the NAT GW later).

Again, you can add the Egress Only IGW and the Routing changes in several ways:

  • ClickOps in the console
  • Via the API (including the CLI)
  • In your CloudFormation template for your VPC – recommended
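A CLI sketch of both routing changes, assuming example resource IDs:

   # Public subnets: two-way IPv6 via the existing Internet Gateway
   aws ec2 create-route \
      --route-table-id rtb-0123456789abcdef0 \
      --destination-ipv6-cidr-block ::/0 \
      --gateway-id igw-0123456789abcdef0

   # Private subnets: outbound-only IPv6 via an Egress Only IGW
   aws ec2 create-egress-only-internet-gateway \
      --vpc-id vpc-0123456789abcdef0
   aws ec2 create-route \
      --route-table-id rtb-0fedcba9876543210 \
      --destination-ipv6-cidr-block ::/0 \
      --egress-only-internet-gateway-id eigw-0123456789abcdef0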

Load Balancer Changes

Now that you have public load balancers in public subnets with IPv6 available, you can modify each load balancer to also get an IPv6 address, by setting its IP address type to dualstack. This is yet another action that will have no impact on current traffic.

You can modify the existing load balancers by:

  • ClickOps in the console
  • An API call (including the CLI)
  • In your CloudFormation template for your Workload – recommended
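Via the CLI, this is a single call per load balancer (the ARN is illustrative):

   aws elbv2 set-ip-address-type \
      --load-balancer-arn arn:aws:elasticloadbalancing:ap-southeast-2:123456789012:loadbalancer/app/my-alb/0123456789abcdef \
      --ip-address-type dualstack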

Security Group Changes

Now we’re down to the last two items. By default, your security group is closed unless you have made changes. Your typical load balancer will be listening on TCP 80 and/or 443 for web traffic, and be open to the entire [IPv4] Internet with a source of 0.0.0.0/0.

To enable this security group for IPv6, we add a set of rules with a source of ::/0 for the same ports you have open for IPv4 (typically 80 and 443 for web traffic; different for other protocols).
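A sketch of adding the IPv6 equivalents of typical web rules via the CLI (the security group ID is illustrative):

   aws ec2 authorize-security-group-ingress \
      --group-id sg-0123456789abcdef0 \
      --ip-permissions \
         'IpProtocol=tcp,FromPort=80,ToPort=80,Ipv6Ranges=[{CidrIpv6=::/0}]' \
         'IpProtocol=tcp,FromPort=443,ToPort=443,Ipv6Ranges=[{CidrIpv6=::/0}]'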

It’s at this time you can test connectivity to your load balancer using IPv6 end-to-end – assuming you have another end on the IPv6 Internet somewhere.

If your workstation/cellphone is using IPv6, then you could browse to the IPv6 address directly – but you’ll probably get a certificate warning, as the name in the certificate doesn’t match the raw IP address.

By now you’ll see the pattern: ideally, this too should be a CloudFormation template update.

DNS Changes

This is when we announce to the world that your service can be accessed over IPv6. You want to make sure you have done the above testing to ensure you can connect, as this is the final piece of the puzzle.

Typically a custom DNS name for a load balancer is a Route53 ALIAS record of type A (Address). The custom DNS name is what also appears in any TLS certificates.

To finally flick the switch on IPv6, you add an additional Route53 ALIAS record of type AAAA (four As), with the destination being the same as for the existing ALIAS A record (one A).
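A sketch of that change via the CLI (the zone IDs and names are placeholders; note the AliasTarget HostedZoneId is the load balancer’s canonical zone, not your own zone):

   aws route53 change-resource-record-sets \
      --hosted-zone-id Z0123456789EXAMPLE \
      --change-batch '{
        "Changes": [{
          "Action": "UPSERT",
          "ResourceRecordSet": {
            "Name": "my.custom.load.balancer.name",
            "Type": "AAAA",
            "AliasTarget": {
              "HostedZoneId": "ZLBCANONICALZONE",
              "DNSName": "dualstack.my-alb-0123456789.ap-southeast-2.elb.amazonaws.com",
              "EvaluateTargetHealth": false
            }
          }
        }]
      }'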

You should now be able to check that you can resolve your service using the nslookup utility. From a command prompt or PowerShell, type:

  • nslookup -type=AAAA my.custom.load.balancer.name
  • nslookup -type=A my.custom.load.balancer.name

Your Dependencies

Now you’re up and running, you need to think about the services you depend upon. Services within your VPC, such as RDS, require AWS to make them dual stack. Some already are, such as the link-local MetaData service, the Time Sync Service and the VPC DNS resolver (note: always use the VPC DNS resolver).

Some services will be outside of your VPC but still AWS-run, like SQS, and S3: in which case, look to use VPC Endpoints to communicate with them.

But other third-party resources across the Internet may be stuck back on IPv4. If you have an EC2 Linux Instance, then it’s sometimes worth running tcpdump to inspect the traffic still using IPv4. A command like tcpdump ip and port not 22 may be useful. You can extend that to also exclude HTTP/HTTPS traffic with tcpdump ip and port not 22 and port not 80 and port not 443. Remember, your service port on your instance may be a different number on the inside of your network.
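For convenience, those commands in runnable form (the -n flag just suppresses DNS lookups in the output):

   sudo tcpdump -n 'ip and port not 22'
   sudo tcpdump -n 'ip and port not 22 and port not 80 and port not 443'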

You’ll need to ask your dependencies to include dual-stack support on their services. In the meantime, you’ll have to fall back to using IPv4 from your network to communicate with these dependencies. There are two ways this can happen:

  1. If the subnet with your EC2 instance in it is dual-stack, then the host can use an IPv4 connection itself, possibly via a NAT Gateway, to communicate with the external IPv4 dependency
  2. If the subnet with your EC2 instance is IPv6 only (which is rather new), then the subnet can be configured to use DNS64 addressing (a subnet-level configuration), and can route its traffic via the NAT GW, which will translate from IPv6 on the VPC-internal network to IPv4 across the Internet (and back), as sketched below. See this.
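A sketch of that DNS64/NAT64 option, assuming example IDs; 64:ff9b::/96 is the well-known NAT64 prefix in which DNS64 synthesises addresses for IPv4-only destinations:

   # Turn on DNS64 for the IPv6-only subnet
   aws ec2 modify-subnet-attribute \
      --subnet-id subnet-0123456789abcdef0 \
      --enable-dns64

   # Route the NAT64 prefix via the NAT Gateway
   aws ec2 create-route \
      --route-table-id rtb-0123456789abcdef0 \
      --destination-ipv6-cidr-block 64:ff9b::/96 \
      --nat-gateway-id nat-0123456789abcdef0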

Moving to IPv6-only internal networks is a long-term goal, probably on the order of half a decade or so away. A number of additional AWS updates will be needed before this becomes a default.

Additional IPv6 Notes in AWS

In this transition period (which has been going on for nearly 25 years), you’re going to find stuff that silently falls back to IPv4. With hosts able to simultaneously have two addresses (IPv4 A, and IPv6 AAAA), things that look them up have a choice. For most things this is the newer AAAA, with a fall-back to A if needed (see the Happy Eyeballs RFC).

However, at this time (Mar 2022), CloudFront still prefers IPv4 when an origin is dual-stack. CloudFront also still uses TLS 1.2 when connecting to origins, instead of the newer and faster TLS 1.3, and HTTP/1.1 instead of the slightly more efficient HTTP/2 request protocol.

AWS IoT Core exposes IPv4-only endpoints, which is unusual, as a key element of IoT is having millions of devices connected – a situation best served by IPv6.

Similar considerations exist for Route53 Health Checks, and others.

Summary

If you’re thinking this is all very new in cloud, you’d be mistaken. I was transitioning customer environments (including production) in AWS to dual stack in 2018 – four years ago. I’ve had dual stack on my home Internet connection since I swapped to Aussie Broadband (churning away from iiNet, who once had an IPv6 blog and strong implementation plans).

For several years, Australia’s dominant telco, Telstra, has had IPv6 dual stack for its consumer mobile broadband, something that the other players like Optus are yet to enable.

But these changes are inevitable.

The future is here, it’s just not evenly distributed.

AWS Local Zones expansion 2022

AWS recently made a bold announcement: at re:Invent it specified a few countries it planned to open Local Zones in, but last week it revealed some 32 locations, including Perth, Brisbane, and Auckland.

Perth is isolated by the vast distance between the east and west coasts of Australia – 2,044 miles, similar to the distance across the continental United States between DC and LA (2,200 miles), or from London to Moscow (2,500 miles). The Round Trip Time (RTT) of packets across this distance is around 50ms, which for many applications is not immediately noticeable.

But for some time-critical workloads, it’s a deal breaker.

Local Zones offer a very cut-down version of an AWS Region, targeting compute workloads that use virtual machine Instances. First available in Los Angeles, there are currently 16 in service; this recent announcement of 32 more will make 48 Local Zones.

While many have become familiar with AWS, the minimum viable product that is a Local Zone may leave some confused: the options at your disposal are listed here.

Local Zone attachments

Local Zones are attached to a host Region. In the case of the announced Perth Local Zone, the API designation indicates it will be linked to the yet-to-launch Melbourne Region.
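Local Zones are opt-in; a sketch of enabling one and listing what is available (using the existing Los Angeles zone group as the example, since Perth isn’t live yet):

   aws ec2 modify-availability-zone-group \
      --group-name us-west-2-lax-1 \
      --opt-in-status opted-in

   aws ec2 describe-availability-zones \
      --all-availability-zones \
      --filters Name=zone-type,Values=local-zone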

When it comes to load balancing within a Local Zone, typically only the Application Load Balancer (ALB) is available. That’s perfect for HTTP-based workloads with multiple local application servers, but if you’re looking to then add a managed RDS database behind that, you’ll be reaching back to the host Region. Same for SQS, SNS, and almost everything else.

Instance types will also be limited, typically focusing on a subset of the latest general-purpose families; this is likely to be true of Elastic Block Store (EBS) volumes too, where until now GP2 (General Purpose SSD) has been the primary option.

When it comes to networking, it appears that Local Zones do not yet support IPv6 dual-stack addressing, as shown in the Console option for defining a subnet in the current Oregon/Los Angeles Local Zone:

IPv4 only subnet creation in Oregon/LA

So, what would benefit from Local Zones? Well, architectures with local access direct to instances – ones that perhaps transform and validate requests at the edge, or cache responses at the edge before forwarding more efficient queries across the “VPC-internal” connectivity to the host Region. Another use case may be local EC2 Windows Instances, where the reduced latency may make RDP access a seamless desktop experience.

Perhaps some Local Zones will supplant the need for on-premises Outposts deployments.

Perhaps over time more architectural patterns will come about, and more services will start to make their way into the common Local Zone implementation. Some Local Zones may even grow to become full Regions, as happened with the original Osaka (Japan) Local Region.

Regardless of the way it ends up being used, the expansion is a massive step up in the globally deployed infrastructure.

Stronger SSH Keys for EC2

For those not familiar, SSH is the Secure Shell, an encrypted login system that has been in use for over 25 years. It replaced unencrypted Telnet for remote (text) terminal connections used to access (and administer) systems over remote networks.

Authentication for SSH can be done in multiple ways: simple passwords (not recommended), SSH Keys, and even MFA.

SSH keys are perhaps the most common method; they’re simple, free, and relatively easy to understand. The scheme uses asymmetric key pairs, consisting of a Private key and a Public key.

Understandably, the Private key is kept private – perhaps only on your local system – while the Public key is openly distributed to any system that wishes to give you access.

For a long time, the key algorithm used here was RSA, and keys had a particular size (length) measured in bits. In the 1990s, 512 bits was considered enough, but more recently, 2048 bits and beyond has been used. The length of the key is one factor in the complexity of guessing the correct combination: fewer bits means smaller numbers. However, the RSA algorithm becomes quite slow when key sizes get large, and people (and systems) start to notice a few seconds of very busy CPU when trying to connect across the network.

Luckily, a replacement key algorithm has been around for some time, leveraging Elliptic Curves. This article gives some overview of the Edwards-curve algorithm (Ed25519) used for creating the public and private keys.

What we see are keys that are smaller than RSA keys of similar cryptographic strength, but more importantly, the CPU load is not as high.

OpenSSH and PuTTY have supported Edwards curves for some time (as at 2022), and several years ago I requested that AWS support them in the EC2 environment. Today, that suggestion/wish-list item has come to fruition with this:

Amazon EC2 customers can now use ED25519 keys for authentication with EC2 Instance Connect

AWS has been one of the last places where I was still using RSA-based keys, so now I can start planning their total removal.

  • Clearly, generating a new ED25519 key is the first step (a sketch of the commands follows this list). PuTTYgen can do this, as can ssh-keygen. Save the key, and make sure you grab a copy of the OpenSSH format of the public key (a single line that starts with ssh-ed25519, followed by a string representing the key, and optionally a space and comment at the end). I would recommend having the Comment include the person’s name, the year, and possibly even the key type, so that you can identify which key belongs to which individual.
  • You can publish the Public Key to systems that will accept it – and this can be done in parallel with the existing key still being in place. The public key has no problem with being shared and advertised publicly – it’s in the name. The worst thing someone can do with your public key is give you access to their system. On Linux systems, this typically means adding a line to the ~/.ssh/authorized_keys file (note: US spelling); just add a new line starting with “ssh-ed25519”. From this point, these systems will trust the key.
  • Next, test access using this key for the people (or systems) that will need access. Ensure you only give the key to those systems or people that should use it – e.g., yourself. When you sign in, look for evidence that shows the new key was used; for example, the Comment on the key (see point 1 above) may be displayed.
  • Lastly, you can remove the older key from being trusted for remote access on those systems. For your first system, you may want to leave one SSH session connected, remove the older SSH key from the authorized_keys file, and then initiate a second new connection to ensure you still have access.
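A sketch of the first two steps from a Linux/macOS workstation (the key file name and comment are illustrative):

   # Generate the new key pair, with an identifying comment
   ssh-keygen -t ed25519 -C "jsmith-2022-ed25519" -f ~/.ssh/id_ed25519

   # Publish the public half to a host that should trust it
   ssh-copy-id -i ~/.ssh/id_ed25519.pub user@remote.example.com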

Now that we have familiarity with this, we need to look at places where the older key may be used.

In the AWS environment, SSH Public Keys are stored in the Amazon EC2 environment for provisioning to new EC2 instances (hosts). A key may be referenced and deployed at instance start time, but it can also be referenced as part of a Launch Configuration (LC) or Launch Template (LT). These LCs and LTs will need to be updated so that any subsequent EC2 launches are provisioned with the new key. Ideally you have these defined in a CloudFormation template, in which case adjusting the template and updating the stack is the way to go; this will likely trigger a replacement of the current instances, so schedule this operation accordingly (and test in lower environments first).
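Registering the new public key with EC2 can also be done via the CLI (the key name is illustrative; the fileb:// prefix makes the CLI pass the file’s raw bytes):

   aws ec2 import-key-pair \
      --key-name jsmith-2022-ed25519 \
      --public-key-material fileb://~/.ssh/id_ed25519.pub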

There’s no sudden emergency for this switch; it is part of the continual sunrise and sunset of technologies, addressing technical debt in a systematic and continual way – just as you would migrate in AWS from GP2 to GP3 SSD EBS volumes, from one EC2 instance family to the next, from Instance MetaData v1 to v2, or from IPv4 to dual-stack IPv6.

Gartner Magic Quadrant for Cloud Database: 2021 v 2020

In December, Gartner produced another of their Magic Quadrants, comparing the offerings from various cloud service providers with a focus on their database offerings. While it’s like reading tea leaves, it’s interesting to see the jostling of the players, the new entrants who are excited (funded) enough to run an analyst relations team, and those who are dropping out.

You can get a copy of the current report from Gartner, AWS, or the 2020 version from Google.

Here’s a mash-up comparing the two years; the darker navy blue dots are 2021, and the lighter blue dots are 2020.

New to this in 2021 are:

  • InterSystems
  • MariaDB
  • SingleStore
  • Exasol
  • Cockroach Labs

Leaving the Magic Quadrant in 2021 is Tencent.

Much improved are AWS and Microsoft, who continue to lead – these two are now ranked neck and neck, with Oracle sitting behind them (but also improved). IMHO, those improving in position include SAP, Teradata, Snowflake, Databricks and Cloudera, and even Huawei.

At the same time, relative to the others in this list, two are dropping in comparison: Redis and MarkLogic – but only slightly.