JEB's Blog – Page 4 – Scribblings of a Techie

More TLS 1.3 on AWS

Earlier this week, AWS posted about their expanded support for TLS 1.3, clearly jumping on the reduced handshake as a speed improvement in their blog post entitled: Faster AWS cloud connections with TLS 1.3.

Back in 2017, (yes, 6 years ago) we started raising Product Feature Requests for AWS products to enable this support, and at the same time, customer control to be able to limit the acceptable TLS versions. This makes perfect sense in customer applications (the data plane). Not only do we not want our applications supporting every possible historic version of cryptography, various compliance programs require us to disable them.

Most notable in this was PCI DSS 3.1, the Payment Card (credit card) Industry Association’s Data Security Standard, which drove the nail in to the coffin of TLS 1.1 and everything before it.

Over time, TLS versions (and SSL before it) have fallen from grace. Indeed, SSL 1.0 was so bad it never saw the light of day outside of Netscape.

And it stands to reason that, in future, newer versions of TLS will come to life, and older versions will, eventually, have to be retired; and between those two, is another transition. However, this transition requires deep upgrades from cryptography libraries, and sometimes to client code to support the lower level library’s new capability..

On the server side, we often see a more proactive implementation of what currently supported TLS versions are permitted. Great services like SSLLabs.com, Hardenize.com, and testssl.sh have guided many people to what today’s current state of “acceptable” and “good” would generally look like. And the key item of those services, is their continual uplift as the state of “acceptable” and “good” changes over time.

On the client side, its not always been as useful. I may have a process that establishes outbound connections to a server, but as a client, I amy wan tto specify some minimum version for my compliance, and not just rely upon the remote party to do this for me. Not many software packages do this – the closest control you get is an integration possibly using HTTPS (or TLS), and not the next level down of “yeah, so which versions are OK to use when I connect outbound”. Of course, having specified HTTPS (or TLS) and doing server certificate validation against our local trust store, we then have a degree of confidence hat its probably the right provider, given that one of my 500 trusted CAs signed that certificate. we got given back during the handshake

This sunrise/sunset is even more important to understand in the case of managed services from hyperscaler cloud providers. AWS speaks of the deprecation of TLS 1.1 and prior in this article (June 2022).

If you have solutions that use AWS APIs, such as applications talking to DynamoDB, then this is part of your technical debt you should be actively, regularly addressing. If you haven’t been including updated AWS SDKs in your application, and updating your installed SSL libraries, updating your OS, then you may not be prepared for this. Sure, it may be “working” fine right now.

One option you have is to look at your application connection logs, and see if the TLS version for connections is being logged. If not, you probably want to get that level of visibility. Sure, you could Wireshark (packet dump) a few sample connections, but it would probably be better not to have to resort to that. Having the right data logged is all part of Observability.

June 28 is the (current) deadline for AWS to raise the minimum supported TLS version. That’s a month away from today. Let’s see who hasn’t been listening…

Cloud Optimisation all the rage in 2023

I have been tinkering with AWS since 2008, and delivering AWS Cloud solutions since 2010, and in this time, I’ve seen many cloud trends come along, and messaging subtly change from all of the hyperscale IaaS and PaaS providers.

This year, we’re seeing more talk about Optimisation. In the 2023 earnings calls, we heard:

Microsoft: “Customers continued to exercise some caution as optimization … trends … continued”
Google: “slower growth of consumption as customers optimized GCP costs reflecting the macro backdrop”
AWS: “Customers continue to evaluate ways to optimize their cloud spending in response to these tough economic conditions.”

We have also seen the messaging around Migration to cloud evolve to “Migrate & Modernise”. This is putting pressure on the laziest, simplest, and least effective of the “Seven R’s of cloud migration”, being “Rehost” and “Relocate”.

Why is this?

Rehost takes the existing spaghetti of installed software, and runs it in exactly the same way on a hyperscale cloud providers concept of a virtual machine. And in true least effort approach, if you had a virtual machine with 64 GB of RAM previously (even if only 20% utilised), then you would select the closest match in a simple rehost/reinstall.

If you had 10 application servers on 24×7, then you still get 10 cloud virtual machines, 24×7, even if you only need that peak capacity for one day of the year.

And even more interesting is paying licensing fees for a virtualisation layer you don’t need to be paying for (and get a different experience when you do).

Not very efficient. Not very smart. But least effort.

So why do organisations do this?

That is easy: It’s complicated to do well. It takes time, experience, and wisdom.

This technology industry is full of individuals who think their job is to make nothing change. Historically, many IT service frameworks were designed around slowing the rate of change, diametrically opposed to DevOps. And organisations that see the cost of technology operations, not the value of it.

Cost has been driven down so much that the talented individuals have left, and only those who remain (who perhaps couldn’t geta job elsewhere) are keeping the lights on. They don’t have the experience, knowledge or wisdom to know what good looks like, they just know what has, for the last 30 years, been “stable”.

And while some engineers are keen to learn new things, and use that, many senior stakeholders have in their mind “what’s the least we can do right now”.

So when it comes to a cloud migration, they see “least change” as “least risky”, not “more costly”.

In 2023 at the AWS Partner summit, AWS representatives quoted that “only some 15% of workloads that will move to the cloud, have now done so”. They furthered that that 15% now requires some level of modernisation.

Again, 10 years ago, many organisation moved large workloads in lift-and-shift method to the same number of instances of the day. Perhaps the m3.xlarge in AWS or similar elsewhere. And then the people who knew what Cloud was, departed the scene, leaving an under experienced set of individuals to keep the lights on and change nothing.

I once did a review of an 3 party Analytics package running on 4 m3.xlarge instances; they had acquired a 3 year Reservation, and kept the instances running 24×7. They didn’t use them all the time. They had never done any changes (not even OS patching). They were running a RedHat 7.x OS.

Two cycles (yes, 2) of newer instance families had been released in that period: the m4, and then m5. Because of the Linux kernel in that version of RedHat, they were prevented from moving these virtual machines to a newer AWS EC2 instance family due to lack of kernel support.

In reality, they were fortunate, as they had run a version of RedHat that finally (after 20+ years) supported in-place upgrades. The easiest path ahead was to snapshot (or make an AMI of each individual host), then in-place upgrade to the latest RedHat release, and then during a shutdown of the instance (stop), adjust the instance family to the m5 equivalent. This alone would save them something like 20% of cost. They could then do a Reservation (my recommendation was for one year, as things change…) and they would have ended up at over 80% cost reduction – and faster performance.

Of course, far cleaner is to understand the installation of the 3^rd party analytics software, the way it clusters, and replace each node with a clean OS, instead of keeping the craft from bygone installations.

And beyond that would have been the option to not Reserve any instances, but just turn them off when not in use.

But they hadn’t.

Poor practices in basic maintenance and continual patching and upgrades meant they had a lot of work to do now, rather than already be up to date.

I’ve seen a number of Partners in the Cloud ecosystem get recognised and rewarded for the sheer volume of migrations they have done. And despite these being naive lift-and-shift, and looking good for the first 6 months, the bill shock starts to set in. And over time, the lack of maintenance becomes an issue.

Even with the “Re-platform” pattern of migration (adopting some cloud managed services in your tech stack), and adopting perhaps something like a Managed Database service, can have its catches. Selecting a managed version of perhaps Postgres 9.5 back 5 years ago, would have put you in trouble, when 9.6, 10, 11, 12, 13, 14 came out, because as newer versions came out, older versions were deprecated, and eventually unavailable. If you weren’t adequately addressing technical debt and maintenance tasks as part of your standard operations, then you’re looking at possible operational trouble.

One clever tactic to address this is to have engineers who are familiar with more contemporary configurations, experts in the field (whom you can spot by the 6+ concurrent AWS Certification they hold) who can perform a Well-Architected Framework Review. This is typically short 1 week engagement for a professional consulting company, borrowing experienced talent into your business that can give you clear addressable mitigations covering many facets, including cost.

If you previously did a Lift & Shift, and found it expensive, perhaps your organisations ability to bring expertise to bear on the work is missing. If you don’t know where to start, then start asking for expertise.

So why are all the cloud providers talking about Optimisation? Because they know their customers can do better, on the same cloud provider. And they would rather have a customer optimise their spend and stay, rather than now migrate & modernise to a competitor, and start the training and upskilling process again.

AWS Sydney Summit 2023

Last week I was fortunate enough to attend the AWS Sydney Summit 2023, along with my colleague and friend, Elliot Segler, on behalf of Akkodis.

A Partner Day was arranged that ran on Monday 3rd April – I’ll avoid details on that here at this time to focus on the main Customer Summit – the Big Event.

This was really the first major in-country Summit since the “end” of the Covid-19 pandemic – the 2022 version was much smaller, and that was only announced as a small in-person event just a few weeks before it was held. Thus in 2022 this event only had a small crowd; this year, attendance tickets were capped at 5,000.

In 2019, some 19,000 people attended over two days.

I’ve been fortunate to never miss an AWS Summit in Australia, and it gives me insight into the state of the Cloud when comparing year-on-year.

The floorplan of the AWS Sydney Summit 2023

The exhibitor floor at the ASW Sydney Summit 2023

AWS Managing Director A/NZ, Rianne van Veldhuizen, opens the Summit

Cameron Adams, founder of Canva, shared some of their AWS Cloud architecture.

Cameron also spoke about the move to sustainable computing, Canva signing The Climate Pledge, and Canva racing to integrate AI models into their service offering, such as magic erase.

We also saw Nicole Sheffield from Wesfarmers “One Digital” service who spoke about the implementation of their digital platform that offers customer-facing digital services across the Wesfarmers portfolio. This is significant, as Wesfarmers had, in 2014, been quite alarmed at Amazon entering Australia.

The exhibitors were split between Consulting Services partners, and ISV partners. Those ISVs had a distinct developer tooling flavour to them, but we also saw Canva talking to end customers about their product, and both Bespoke and Lumnify (formerly DDLS) training providers discussing their educational schedules and offerings.

Early in the morning I took a photo of the Builder lab, set up for individuals to undertake self-paced digital training – which was full for the rest of the day:

The Builder Lab (taken first thing in the morning before it opened)

In Financial Services, Australian bank (and one of the Big Four) Westpac provided their take on some of the cloud security approaches. This is notable to me as one of their staff in 2013 said they would “never use Public Cloud”; they seem to be doing well putting Public Cloud to good use.

Suncorp (finance, insurance and banking) also spoke about their exit strategy from their existing data centers using AWs as a target platform.

This year there were no major service announcements or releases at this Summit, but then these days the major announcements happen almost every few days anyway! In previous years, the Summit has always had a sort of “revolution” (when major new concepts or services were released, such as AI services) or “evolution” (incremental updates announced for existing services). This year the theme was more “steady-as-she goes” and stable.

What’s clear is that commentary around Lift & Shift migrations are now evolving to Migrate & Modernise (which is what I have always focused on – deeper expertise and a better short and long term outcome). This isn’t surprising, as often naïve Lift & Shift has left customers with workloads costing more and unable to take advantage of key cloud attributes, such as scalability or cloud-platform-managed-services to reduce TCO.

Of course, those who implemented cloud-native solutions, and paid close attention to Cloud Operating Models (and tech team org structures) with an eye to the Well-Architected Framework have enjoyed optimised and reliable operation.

AWS claimed that perhaps only 15% of all workloads that will eventually run in the cloud, is now doing so. Of course, that 15% requires the maintenance, care and attention to ensure it remains operational, and optimal.

So where are we in the Cloud evolution timeline? I suspect we’re in the middle of furious catch-up by software providers who now are focusing on adopting IaaS and PaaS to take their legacy solutions, and reimplement them as cloud-native SaaS. More vertical-specific SaaS products are coming to market.

The individual services within the cloud are maturing. ARM-based Graviton chips continue to uphold Moore’s Law (RIP Gordon Moore, 2023). IPv6 is progressing, but as noted by one of my fellow Partner Ambassadors and long term friend, Greg Cockburn, the rate of change in modern IPv6 networking appears to be slowing (with notable exception, VPC Network Firewall now supporting IPv6 only subnets). I suspect the major requests are now satisfied; workloads that want to be dual-stacked to the outside world are fully supported. Of course, many ISPs, telcos and carriers are continuing to slowly adopt IPv6 for their consumers; some advanced end user providers in Australia have been Telstra, Internode (ironically part of iiNet who dropped the IPv6 ball in 2013, and thus part of Vodafone), and Aussie Broadband.

From speaking to people, it did appear that a 5,000 person cap had limited many people from attending, particualy those who live in the same Australian state of New South Wales who may have left booking a ticket too late, and missed out. Perhaps in 2024 we’ll see another increase (and who knows).

Meanwhile, outside around Darling Harbour, much construction happens, watched over by warships and tall ships.

Warships and Tall Ships in Darling Harbour, 2023

And in case you’re wondering, its a long way from Perth (where I live) to Sydney, and the Great Circle line directly between the two cities takes us far over the Southern Ocean:

AWS Local Zone: Perth Launch

In the last few days, the AWS Local Zone service launched in Perth.

Articles in the media are claiming this is a significant uplift:

AWS has officially opened a new cloud infrastructure region in Perth.
Kate Weber, IT News

Only problem, this is not a Region (big R) in the AWS vernacular, but a single Availability Zone, attached as an adjunct to the AWS Sydney Region (ap-southeast-2).

Media Statements and press releases are designed to excite the media, but the reality is a little more circumspect. The Local Zone does add local compute (EC2) and block storage (in the form of EBS) within Perth. However, its what many would expect that leads a local zone to be perhaps a little limiting.

First up, the compute is only certain, limited types. The same with the EBS volume types. You may be forced to use more expensive compute and storage, because the cheaper instance sizes or EBS types are not available.

Second, the costs for the Compute and Storage is more expensive than in the full Region, where economies of scale and usage patterns drive down costs. You may see up to a 50% overhead on the same 4xLarge instance type compared to Sydney.

Third, some basics like IPv6 support is not (yet?) supported. That may upset your VPC conventions.

Fourth, you may have wanted close access to managed services, like RDS (databases), or even Workspaces (desktop) and AppStream (application virtualisation). Well, that’s back in the main Region, not in the Local Zone, and at 50ms across Australia, that may be a bit too far, which means you’re back to the stone age of running databases on Instances yourself.

Fifth, being a single AZ, you wont get any multi-AZ resilience that you’re comfortable with an a full Region.

Sixth: the Local zone’s operation is tied to the designated Region: issues in ap-southeast-2 (Sydney) may impact management (control plane) or availability (data plane) of that AZ. Your CloudFormation has to execute in Sydney to stand up a stack.

In essence, a Local Zone needs to be looked at from the Well-Architected Framework perspective, and utilised accordingly.

Once you have reconciled those issues, there is the bright side: compute in cloud, managed in a uniform way, without having to deploy an Outpost and have a datacentre for the Outpost to run in.

Many organisations do want Cloud services locally, and for some high volume, low latency, idempotent, loosely coupled, edge processing, this may be perfect.

An AWS Local Zone is also a small stepping stone to perhaps being a full Region one day, as was seen in Osaka, Japan. It just takes demand to validate the point.

Its 10 years since AWS opened first its point of presence (CloudFront, Route53) in Australia, and then the Sydney Region. We’re on the cusp of a second Local zone up in Brisbane, and a second Australian Region – yes, a full Region – in Melbourne.

It’s the new Region that suddenly opens up Multi-region application architecture, which, for public sector in particular, has not been permissible for data jurisdiction purposes (even if that is just a desire and not a mandated legal requirement).

IoT Trackers and the AWS Cloud

I continued my IoT project over the recent end-of-year/Christmas break period, picking up from where I was 6 months ago.

Since then, a new firmware version had become available for the RAK Wireless RAK10700 GNSS (Global Navigation Satellite System) solar powered device. These devices shipped without battery (due to postal limitations), and came with firmware 1.0.4.

I failed completely last time to get these to associate with my local IoT gateway (rak7246, basically a raspberry PI in a box that bridges LoRaWAN and WiFi).

Since then, a new firmware 1.0.6 has been released.

Documentation for the RAK10700 was OK, until you get to the page that says the latest firmware is version 1.0.1; given this device already shipped with 1.0.4, I dived in deeper; the link to the firmware is https://downloads.rakwireless.com/LoRa/WisBlock/Solutions/LPWAN-Tracker-Latest.zip, and the contents of this zip file, at the time of writing, are three files:

Manifest
WisBlock_SENS_V1.0.6_2022.04.07.13.36.27.bin
WisBlock_SENS_V1.0.6_2022.04.07.13.36.27.dat

Caveat Elit (developer beware): this appears to be firmware version 1.0.6.

Flashing this was interesting: the device, connected via its USB cable to my laptop, had to be reset into DFU mode, which required double-tapping the reset pin in quick succession (its located next to the USB port). Once done, the device presented as USB storage, and the adafruit-nrfutil tool could update it (check the COM port from Device Manager).

adafruit-nrfutil dfu serial --package LPWAN-Tracker-Latest.zip -p COM19 -b 115200

When in DFU mode, the device turned up as a different COM port compared to when it was in its normal mode. It took me two attempts for this to be successful, and then pressing the rest button to have the device return to normal mode.

Next came the interface to AWS IoT Core for LoRaWAN. I’d previously been using the LGT-92 (now not available), but had to abandon these as no amount of protection in waterproof bags had made them durable enough to last the distance of my use case; tracking a small sail boat.

The configuration that eventually worked for me was to define a profile with MAC version 1.0.3, Regional Parameters RP002-1.01, Max EIRP 15, for AU915 frequencies (I am in Australia):

AWS IoT Core for LPWAN: Profile Configuration for RAK10700

Now with the profile defined, I can add the two Devices in, using the AppKey, DevKey, etc:

With data coming through it was now time to decide the Payload. These devices use a format called CayenneLPP to stuff as much data into as small a payload as possible. One of the first things you’ll want to do is decide the data to check it looks legitimate. Using a small Python script, I can unpack it – after doing a pip install cayennelpp:

#!/usr/bin/python3
import base64
import sys
from cayennelpp import LppFrame
d=base64.standard_b64decode(sys.argv[1])
f=LppFrame().from_bytes(d)

for i in f.data:
  if len(i.value) == 1:
    print("Ch {}, {}: {}".format(i.channel, i.type, i.value[0]))
  else:
    print("Ch {}, {}: {}".format(i.channel, i.type, i.value))

By routing the incoming IoT messages to a Lambda function, I can now pick out the PayloadData from the event and see the string being send. Here’s what I am seeing in CloudWatch logs when I just print(event):

{'WirelessDeviceId': 'b15xxxx-xxxx-47df-8a5d-f57800c170b5', 'PayloadData': ' AXQBqwZoRwdnARQIcydWCQIG6Q==', 'WirelessMetadata': {'LoRaWAN': {'ADR': False, 'Bandwidth': 125, 'ClassB': False, 'CodeRate': '4/5', 'DataRate': '3', 'DevAddr': 'xxxxx', 'DevEui': 'xxxxx', 'FCnt': 73, 'FOptLen': 0, 'FPort': 2, 'Frequency': '917200000', 'Gateways': [{'GatewayEui': 'xxxxx930e93', 'Rssi': -31, 'Snr': 9.5}], 'MIC': 'xxxxx95b', 'MType': 'UnconfirmedDataUp', 'Major': 'LoRaWANR1', 'Modulation': 'LORA', 'PolarizationInversion': False, 'SpreadingFactor': 9, 'Timestamp': '2023-01-04T13:46:42Z'}}}

While inside, with no satellite lock, that PayloadData translates out to:

Ch 1, Voltage: 4.27
Ch 6, Humidity: 35.5
Ch 7, Temperature: 27.6
Ch 8, Barometer: 1007.0
Ch 9, Analog Input: 17.69

Now I have the two sensors, its time to get them outside, with a bit of soldering of the LiPo battery on to the right connector…