This year’s Sydney to Hobart was a stunning race. Broadcast TV coverage in Australia got off to a good start on Seven West Media’s 7+ service, and included a view of the four simultaneous start lines for the different classes:
Sadly, the broadcast TV coverage ended just after the start. With 7+ on the sail of one of the boats, I was expecting a bit more coverage.
Luckily the CYC had an intermittent live stream on YouTube, with Peter Shipway (nominative determinism at work there), Gordon Bray and Peter Gee hosting.
The primary website for the race this year was https://www.rolexsydneyhobart.com/, and this year this appeared to be running via AWS CloudFront.
Time for a quick health check, with SSL Labs:
After noting this is CloudFront, I notice that it’s resolved as IPv4 only. Shame, as IPv6 is just two steps away: tick a box in the CloudFront config, and publish an AAAA record in DNS. It’s also interesting that some of the sub-resources being loaded on their page from alternate origins are available over IPv6 (as well as good old IPv4).
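To repeat the IPv4/IPv6 check from a script rather than a browser, here is a minimal Python sketch using only the standard library. The helper names (address_families, check_host) are my own for illustration, and the result depends on what your local resolver returns, so treat it as a quick indication rather than a definitive test.

```python
import socket


def address_families(addrinfo):
    """Summarise which IP families appear in a socket.getaddrinfo() result."""
    families = {entry[0] for entry in addrinfo}
    return {
        "ipv4": socket.AF_INET in families,   # A records
        "ipv6": socket.AF_INET6 in families,  # AAAA records
    }


def check_host(hostname, port=443):
    """Resolve hostname (needs network access) and report its address families."""
    results = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return address_families(results)
```

Calling check_host("www.rolexsydneyhobart.com") on a networked machine should then report whether an AAAA record was returned alongside the A records.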
Talking of DNS, a quick nslookup shows Route53 in use.
Back to the output, a B. Here are a few items observed on the SSL Labs report:
TLS 1.1 is enabled – it’s well past time to retire this. Luckily, TLS 1.2 and 1.3 are both enabled.
Even with TLS 1.2, there are some weak ciphers, but (luckily?) the first one in the list is reasonably strong.
HTTP/2 is not enabled (falling back to HTTP/1.1).
HTTP/3 is not enabled either; it offers even more performance than HTTP/2.
AWS Certificate Manager (ACM) is in use for the TLS certificate on CloudFront.
It also says that there is no DNS CAA record, a simple way to lock out any other CA provider being duped into mis-issuance of a certificate. A low risk, but a (free) way to prevent this.
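As a small illustration of what that CAA record could look like, the sketch below renders "issue" records in zone-file presentation format. The function name and the example.com domain are placeholders of mine. The issuer value "amazon.com" is one of the CA domains AWS documents for ACM, but check the current ACM documentation before publishing anything.

```python
def caa_records(domain, issuers, flags=0):
    """Render CAA 'issue' records in zone-file presentation format."""
    name = domain.rstrip(".") + "."  # fully qualified owner name
    return [f'{name} IN CAA {flags} issue "{issuer}"' for issuer in issuers]


# Placeholder domain; "amazon.com" authorises Amazon's CA (used by ACM).
for line in caa_records("example.com", ["amazon.com"]):
    print(line)
```

Any CA not named in an issue record is expected to refuse issuance for that domain, which is the cheap protection described above.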
Turning to SecurityHeaders.com, we get this:
Unfortunately, it looks like no security-related headers are sent.
Strict Transport Security (HSTS) is a no-brainer these days. We (as a world) have all gone TLS for online security, and we’re not heading back to unencrypted HTTP.
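A rough way to run the same kind of check yourself is to compare a response's headers against a list of the usual suspects. This is a sketch only: the header list is my own selection of commonly recommended headers (not necessarily the exact set SecurityHeaders.com scores), and the HSTS value shown is a typical example rather than a recommendation for this particular site.

```python
EXPECTED_HEADERS = (
    "Strict-Transport-Security",
    "Content-Security-Policy",
    "X-Content-Type-Options",
    "X-Frame-Options",
    "Referrer-Policy",
    "Permissions-Policy",
)

# A typical HSTS header: one year, applied to subdomains as well.
HSTS_EXAMPLE = {"Strict-Transport-Security": "max-age=31536000; includeSubDomains"}


def missing_security_headers(response_headers):
    """Return the expected headers absent from a response (names are case-insensitive)."""
    present = {name.lower() for name in response_headers}
    return [name for name in EXPECTED_HEADERS if name.lower() not in present]


print(missing_security_headers(HSTS_EXAMPLE))
```

With CloudFront in front, headers like these can be injected at the edge without touching the origin.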
The service stayed up and responsive: well done to the team who put this together, and good luck with looking through the event and finding improvements (like above) for next year.
The Chief Executive of insurance company Zurich, Mario Greco, recently said:
“What will become uninsurable is going to be cyber,” he said. “What if someone takes control of vital parts of our infrastructure, the consequences of that?”
Mario Greco, Zurich
The same article notes that Lloyd’s is looking for exceptions in cyber insurance for attacks by state-based actors, which is a difficult thing to prove with certainty.
All in all, one reason cyber insurance exists is that, from a risk perspective, it offers the opportunity to spend less on insurance premiums (and to receive financial recompense to cover operational costs) than on having competent processes around software maintenance: coding securely to start with, detecting threats quickly, and maintaining (patching/updating) rapidly over time.
The structure of most organisations is to have a “support team” who are responsible for an ever-growing list of digital solutions, goaled on cost minimisation, and not measured against the number of maintenance actions per solution operated.
It’s one of the reasons I like the siloed approach of DevOps and Service Teams. Scope is contained to one (or a small number of similar) solution(s). Same tech base, same skill set. With a remit to have observability, metrics and focus on one solution, the team can go deep on full-stack maintenance, focusing on a job well done, rather than a system that is just turned on.
It’s the difference between a grand painter, and a photocopier. Both make images; and for some low-value solutions, perhaps a photocopier is all they are worth investing in from a risk-reward perspective. But for those solutions that are the digital-life-blood of an organisation, the differentiator to competitors, and those that have the biggest end-customer impact, then perhaps they need a more appropriate level of operational investment — as part of the digital solution, not as a separate cost centre that can be seen to be minimised or eradicated.
If cyber insurance goes end-of-life as a product in the insurance industry, then the war for talent, the focus on finding those artisans who can adequately provide that, increases. All companies want the smartest people, as one smarter person may be more cost-effective than three average engineers.
I recently wrote about the change of Amazon CloudFront’s support for accessing content from S3 privately.
It’s bad practice to leave an origin server open to the world; if an attacker can overwhelm your origin server then your CDN can’t help to insulate you from that, and the CDN cannot serve any legitimate traffic. There are tricks to this, such as having a secret header value injected into origin requests and then having the origin check it, but that’s kind of a static credential. Origin Access Identity was the first approach to move this authentication into the AWS domain, and Origin Access Control is the newer way, supporting the v4 Signature algorithm (at this time).
(If you like web security, read up on the v4 Signature, look at why we don’t use v1/2/3, and think about a time if/when this gets bumped – we’ve already seen v4a)
CloudFormation Support
When Origin Access Control launched last month, it was announced with CloudFormation support! Unfortunately, that CloudFormation support was “in documentation only” by the time I saw & tried it, and thus didn’t actually work for a while (the resource type was not recognised). CloudFormation OAC documentation was rolled back, and has now been published again, along with the actual implementation in the CloudFormation service.
It’s interesting to note that the original documentation for AWS::CloudFront::OriginAccessControl had some changes between the two releases: DisplayName became Name, for example.
Why use CloudFormation for these changes?
CloudFormation is an Infrastructure as Code (IaC) way of deploying resources on the cloud. It’s not the only IaC approach; others include Terraform and the AWS CDK. All of these approaches give the operator an artefact (document/code) that can itself be checked in to revision control, giving us the ability to easily track differences over time and compare the current deployment to what is in revision control.
Using IaC also gives us the ability to deploy to multiple environments (Dev, Test, … Prod) with repeatability, consistency, and as little manual effort as possible.
IaC itself can also be automated, further reducing the human effort. With CloudFormation as our IaC, we also have the concept of Drift Detection within the deployed Stack resources as part of the CloudFormation service, so we can validate if any local (e.g., console) changes have been introduced as a deviation from the prescribed template configuration.
Migrating from Origin ID to OAC with CloudFormation
In using CloudFormation changes to migrate between the old and the new ways of securely accessing content in S3, you need to do a few steps to implement and then tidy up.
1. Create the new Origin Access Control resource.
If you had a template that created the old OriginAccessId, then you could put this new resource alongside that (and later, come back and remove the OID resource).
2. Update your S3 Bucket to trust both the old Origin Access ID, and the new Origin Access Control.
If you wish, you can split that new Principal (cloudfront.amazonaws.com) into a separate statement, and be more specific as to which CloudFront distribution Id is permitted to this S3 bucket/prefix.
In my case, I am using one Origin Access Control for all my distributions to access different prefixes in the same S3 bucket, but if I wanted to raise the bar I’d split that with one OAC per distribution, and a unique mapping of Distribution Id to S3 bucket/prefix.
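To make step 2 concrete, here is a hedged Python sketch that builds a bucket policy trusting both the legacy identity and the new service principal, as separate statements. The function name, bucket name, identity ID and distribution ARN are all hypothetical placeholders. The optional AWS:SourceArn condition is the "raise the bar" variant that pins access to specific distributions.

```python
import json


def migration_bucket_policy(bucket, prefix="", oai_id=None, allow_oac=False, distribution_arns=()):
    """Build an S3 bucket policy for the OID-to-OAC migration window."""
    resource = f"arn:aws:s3:::{bucket}/{prefix}*"
    statements = []
    if oai_id:  # legacy Origin Access Identity principal
        statements.append({
            "Sid": "LegacyOriginAccessIdentity",
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity {oai_id}"},
            "Action": "s3:GetObject",
            "Resource": resource,
        })
    if allow_oac:  # new Origin Access Control: the CloudFront service principal
        statement = {
            "Sid": "OriginAccessControl",
            "Effect": "Allow",
            "Principal": {"Service": "cloudfront.amazonaws.com"},
            "Action": "s3:GetObject",
            "Resource": resource,
        }
        if distribution_arns:  # optionally restrict to named distributions
            statement["Condition"] = {"StringEquals": {"AWS:SourceArn": list(distribution_arns)}}
        statements.append(statement)
    return {"Version": "2012-10-17", "Statement": statements}


# During migration: trust both. All identifiers below are placeholders.
print(json.dumps(migration_bucket_policy("example-bucket", oai_id="E2EXAMPLEOAI", allow_oac=True), indent=2))
```

Once every distribution has moved over, calling the same function without oai_id gives the tidied-up policy for the final step.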
3. Update the Distribution to use OAC, per Origin:
You’ll note above we still have the S3OriginConfig defined, with an OriginAccessIdentity that is empty. It took me a few hours to figure out that the empty string is required; without it, the S3OriginConfig element is invalid, and a CustomOriginConfig is not for accessing S3. At least at this time.
If you’re adopting this, be sure to also look at your CloudFront distributions’ HttpVersion setting; you may want to adopt http2and3 to turn on HTTP/3.
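As a sketch of the template pieces involved in steps 1 and 3, the Python below assembles them as dictionaries and prints them as JSON (CloudFormation accepts JSON templates as well as YAML). The logical ID ExampleOAC, the names and the bucket domain are placeholders of mine, and this is a fragment, not a complete distribution definition.

```python
import json

# Step 1: the Origin Access Control resource (signing S3 origin requests with SigV4).
oac_resource = {
    "Type": "AWS::CloudFront::OriginAccessControl",
    "Properties": {
        "OriginAccessControlConfig": {
            "Name": "example-oac",  # placeholder name
            "OriginAccessControlOriginType": "s3",
            "SigningBehavior": "always",
            "SigningProtocol": "sigv4",
        }
    },
}


def s3_origin(origin_id, bucket_domain, oac_logical_id):
    """Step 3: an origin entry that uses OAC but keeps an empty S3OriginConfig identity."""
    return {
        "Id": origin_id,
        "DomainName": bucket_domain,
        "OriginAccessControlId": {"Fn::GetAtt": [oac_logical_id, "Id"]},
        "S3OriginConfig": {"OriginAccessIdentity": ""},  # the empty string is deliberate
    }


# Fragment of a DistributionConfig; other required properties are omitted here.
distribution_config_fragment = {
    "HttpVersion": "http2and3",
    "Origins": [s3_origin("s3-origin", "example-bucket.s3.amazonaws.com", "ExampleOAC")],
}

print(json.dumps({"ExampleOAC": oac_resource}, indent=2))
print(json.dumps(distribution_config_fragment, indent=2))
```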
4. Remove the existing S3 Bucket Policy line that permitted the old OID
"AWS": !Sub 'arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity ${OriginAccessIdentity}' is no longer needed:
Amazon CloudFront, the AWS Content Delivery Network (CDN) service, has come a long way since I first saw it launch; I recall a slight chortle when it had 53 points of presence (PoPs) around the world, as CloudFront often (normally?) shares edge location facilities with the Amazon Route53 (Hosted DNS) service.
Today it’s over 400 PoPs, and is used for large and small web acceleration workloads.
One common pattern is having CloudFront serve static objects (files) that are stored in AWS’s Simple Storage Service, S3. Those static objects are often HTML files, images, Cascading Style Sheet documents, and more. And while S3 has a native Website serving function, it has long been my strong recommendation to my friends and colleagues to not use it, but use CloudFront in front of S3. There are many reasons for this, one of which is that you can configure the TLS certificate handed out, set the minimum permitted TLS version, and inject the various HTTP Security Headers we’ve come to see as minimal requirements for asking web browsers to help secure workloads.
Indeed, having any CDN sit in front of an origin server is an architecture that’s as old as web 2.0 (or more). One consideration here is that you don’t want end users circumventing the CDN and going direct to your origin server; if that origin gets overloaded, then the CDN (which caches) may not be able to fetch content for its viewers.
It’s not uncommon for CDNs to exceed 99.99% caching of objects (files), greatly reducing the load on the origin server(s) that host the content. CDNs can also do conditional GET requests against an origin, to check that a cached version of an object (file) has not changed, which helps ensure the cached object can still be served out to visitors.
Ensuring that origin doesn’t get overloaded then becomes a question of blocking all other requests to the origin except those from the CDN. Amazon CloudFront has evolved its pattern over the years, starting with each edge operating independently. As the number of PoPs grew, this became an issue, so a mid-tier cache, called the CloudFront Regional Edge, was introduced to help absorb some of that traffic. It’s a pattern that Akamai was using in the 2000s when it had hundreds/thousands of PoPs.
For S3, the initial approach was to use a CloudFront Origin Access Identity (which I abbreviate as OID), which would cause a CloudFront origin request (from the edge, to the origin) to be authenticated against the S3 endpoint. An S3 Bucket Policy could then be applied that would permit access for this identity, and thus protect the origin from denial of service.
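The conditional GET idea can be illustrated with a toy origin in Python. This is a simplified sketch, not how S3 or CloudFront implement it: the ETag here is just a truncated SHA-256 of the body, and only the If-None-Match header is handled (real origins also support If-Modified-Since).

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a quoted entity tag from the object's content (illustrative scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def origin_response(body: bytes, request_headers: dict):
    """Answer a GET: 304 with no body if the cache's ETag still matches, else 200."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


# First fetch fills the cache; the revalidation sends the stored ETag back.
status, headers, payload = origin_response(b"<html>hello</html>", {})
revalidated = origin_response(b"<html>hello</html>", {"If-None-Match": headers["ETag"]})
print(status, revalidated[0])
```

The 304 carries no body, so the origin does far less work per revalidation than it would serving the object again.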
The S3 documentation to restrict access to S3 for this is useful.
Here’s an example S3 Bucket policy from where I serve my web content (from various prefixes therein):
This has now been revised, and in one release post the old approach is labelled as legacy and deprecated. The new approach is called an Origin Access Control (OAC), and will give finer-grained control.
One question I look at is the migration from one to another, trying to reach this with minimal (or no) downtime.
In my case, I am not concerned with restricting access to the S3 objects to a specific CloudFront distribution ID; I am happy to have one identity that all my CloudFront distributions share against the same S3 Bucket (with different prefixes). As such, my update is straightforward, in that I am going to start by updating the above Bucket policy:
With this additional Service line, any CloudFront distribution can now grab objects from my account (possibly across accounts as well). I can add conditions to this policy as well, such as checking the Distribution IDs, but as part of the migration from OID to OAC we’ll come back to that.
Next up, in the CloudFront Console (or in a CloudFormation update) we create a new OAC entry, with v4 signing enabled for origin requests. Here’s the CloudFormation snippet:
Now we have an Origin Access Control, which in the console looks like this:
With this in place, then we need to update the CloudFront distributions to use this for each behaviour’s origin.
Give it a few minutes, check the content is still being delivered, and then it’s time to back out the old CloudFront Origin Access Identity from the S3 Bucket Policy:
Then pop back to the CloudFront world and remove the old Origin Access Id (again, by either a CloudFormation update if that’s how you created it, or via the console or API).
This is also a good time to look at the Condition options in that policy, and see if you want to place further restrictions on access to your S3 Bucket, possibly like:
(where the 1111… number is your AWS account number).
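One possible shape for such a condition is an account-wide wildcard on the source ARN, so that only distributions in your own account are accepted. The sketch below uses a made-up twelve-digit account number and approximates IAM's StringLike matching with Python's fnmatch (close enough for simple * patterns, but not an exact reimplementation of IAM's evaluation).

```python
from fnmatch import fnmatchcase

ACCOUNT_ID = "111111111111"  # placeholder account number

# Condition block that could be added to the cloudfront.amazonaws.com statement.
condition = {
    "StringLike": {
        "AWS:SourceArn": f"arn:aws:cloudfront::{ACCOUNT_ID}:distribution/*"
    }
}


def source_arn_allowed(condition, source_arn):
    """Approximate how the StringLike condition would treat a given distribution ARN."""
    pattern = condition["StringLike"]["AWS:SourceArn"]
    return fnmatchcase(source_arn, pattern)


print(source_arn_allowed(condition, f"arn:aws:cloudfront::{ACCOUNT_ID}:distribution/EDFDVBD6EXAMPLE"))
```

Swapping the wildcard for explicit distribution IDs tightens this further, at the cost of touching the policy each time a distribution is added.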
AWS has been keen to say that:
Any distributions using Origin Access Identity will continue to work and you can continue to use Origin Access Identity for new distributions.
AWS
However, that position may change in future, and given AWS has already marked the existing OID approach as “legacy”, it’s time to start evaluating your configuration changes.
I was reading a post the other day advising AWS customers to consider why they aren’t reaching 100% RI coverage. This triggered me, as 100% coverage is often not a good thing. And yes, now we have Savings Plans for Some Things in AWS, but some places remain with Reservations as the way to get consumption discounts by trading on flexibility.
And it’s that trade on flexibility that is critical.
1 year versus 3 years
First off, 3 years versus 1 year; the difference in percentage discount is often negligible, sometimes as low as 1% – 2%. Whereas, over the difference (years 2 and 3), there is the distinct possibility that a new instance type may come out, offering better power, performance, or price. That price improvement point has historically been seen as around 15%, which makes for an ideal time to “roll forward”, if you can. Reservations don’t technically STOP you from doing this, but if you’re not using the capacity you reserved then you may find you’re still paying for what you no longer use.
Rolling forward on services like RDS is not a problem; as the customer, you’re not managing the OS in the Virtual Machine or Container that it’s running in.
But in the EC2 world, you may find that your Linux or Windows OS needs an update to support the newer instance family. This was the case once with Red Hat 7.x and the change from m3 to m4; an updated Linux kernel was required. You were fortunate if you were on Red Hat Enterprise Linux 7 or later, as this was when in-place upgrades were introduced (not that this is the recommended DevOps path; rip and replace the instance is my preference).
In-place upgrades could get you out of a lot of re-engineering if a workload was not already designed with rolling updates and instance replacement in mind. Revolutionary as this was for Red Hat 7 GA in 2014, Debian (and I say this as a Debian Developer) has been doing that since 1996.
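To put rough numbers on that trade, here is a small worked example in Python. Every figure is hypothetical: an on-demand cost of 1,000 per year, a 38% discount for 1-year terms versus 40% for 3-year terms, and a new instance family arriving in year 2 that is 15% cheaper. It also assumes the same 1-year discount applies to the newer family.

```python
def three_year_cost(on_demand_annual, discount_3yr):
    """Total cost of one 3-year reservation on the original instance family."""
    return 3 * on_demand_annual * (1 - discount_3yr)


def rolling_one_year_cost(on_demand_annual, discount_1yr, price_drop, roll_year):
    """Total cost of three consecutive 1-year terms, rolling forward from roll_year."""
    total = 0.0
    for year in (1, 2, 3):
        base = on_demand_annual
        if year >= roll_year:  # newer, cheaper family from this year onwards
            base = on_demand_annual * (1 - price_drop)
        total += base * (1 - discount_1yr)
    return total


locked_in = three_year_cost(1000, 0.40)
rolled = rolling_one_year_cost(1000, 0.38, 0.15, roll_year=2)
print(round(locked_in, 2), round(rolled, 2))
```

Under these assumptions the 3-year commitment costs about 1,800 over the term and the rolling 1-year approach about 1,674, so the slightly smaller discount is more than repaid by the ability to roll forward.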
Reservations in Waves
The next thing to look at is slicing your reservations into waves, to give you future flexibility.
Typically a partial or full up front payment for your reservation is going to give you the biggest discount, but at the cost of hitting your cash flow now. If you had 20 Reservations required, then you’d be tempted to acquire all of them immediately.
But wait, what happens if you change your mind for some of that workload now, or in 3 months’ time? And sinking all that capital now may be undesirable.
I’d strongly suggest slicing this into quarterly reservations (each at one year’s duration, as above), picking up (at most) a quarter of your fleet each time. This will, in future, give you a quarterly opportunity to adjust your coverage mix.
And while I say at most a quarter of the fleet, you may still want some flexibility to scale down a little, so perhaps your target is not continual 100% coverage, but continual 80% coverage.
This discussion is then a risk conversation, of making commitments you may want to adjust. And knowing the way you may want to adjust is something that is learnt through experience.
At each quarter, there is a smaller bump for the upfront or partial up front payment, but each of those bumps is now (and in future) a decision point.
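The wave idea reduces to simple arithmetic, sketched below in Python. The function name and its parameters are my own; the 20-instance fleet and the 80% target come from the discussion above, and the even split across four quarters is an assumption you may well want to vary.

```python
def quarterly_waves(fleet_size, target_coverage_pct=80, waves=4):
    """Split the reservations needed for a coverage target into evenly sized waves."""
    total = fleet_size * target_coverage_pct // 100  # whole reservations only
    base, extra = divmod(total, waves)
    # Earlier waves absorb any remainder so the sizes differ by at most one.
    return [base + (1 if i < extra else 0) for i in range(waves)]


# 20 instances at 80% coverage: 16 one-year reservations, bought 4 per quarter.
print(quarterly_waves(20))
```

Each entry is then a purchase that comes up for renewal, and reconsideration, in the same quarter of the following year.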
This financial operating model may not fit your risk/reward requirements, but it’s worth considering your approach to long-term discounts, and the flexibility you may want in future.