AWS: Save up to 19.2% on t* instances

Despite what AWS may say, the burstable CPUs are a workhorse for so many smaller workloads – the long tail of deployments in the cloud.

Yesterday saw the announcement of the AMD based T3a instance family as generally available in many regions. Memory and core-count matches the previous T3 and T2 instance families of the same size, which makes comparisons rather easy.

Below are prices as shown today (25/Apr/2019) for Sydney ap-southeast-2:

Size     t2 US$  t3 US$  t3a US$  Diff t3a-t3  %     Diff t3a-t2  %
nano     .0073   .0066   .0059    .0007        10.6  .0014        19.2
micro    .0146   .0132   .0118    .0014        10.6  .0028        19.2
small    .0292   .0264   .0236    .0028        10.6  .0056        19.2
medium   .0584   .0528   .0472    .0056        10.6  .0112        19.2
large    .1168   .1056   .0944    .0112        10.6  .0224        19.2
xlarge   .2336   .2112   .1888    .0224        10.6  .0448        19.2
2xlarge  .4672   .4224   .3776    .0448        10.6  .0896        19.2

As you can see, the saving from moving from an older family to a newer one is consistent across the sizes: 10.6% for the minor step from t3 to the equivalent t3a, but a larger 19.2% if you’re still back on t2.

It’s worth looking at any pending Reservations you currently have for older families, and not jumping to this prematurely – you may end up paying twice.

Talking of which, Reservations are available for t3a as well. Looking at the Sydney price for a nano, it drops from 5.6c/hr to 4c/hr; across the fleet, discounts on reserved versus on-demand for the t3a are up to 63%.

For those who don’t reserve – because you’re not ready to commit, perhaps? – then the simple change of family is an easy and low-risk way of reaping some savings. For example, a fleet of 100 small instances for a month on t2 swapped to t3a would reap a saving of US$2,172.48 – US$1,755.84 = US$416.64/month, or just shy of US$5,000 a year (AU$7,000).
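The arithmetic behind that example can be sketched in a few lines of Python. This is only an illustration: the prices are the "small" row from the table above, the 744-hour (31-day) month is my assumption to match the figures quoted, and the function and variable names are made up for this sketch.

```python
# Hourly on-demand prices (US$) for the "small" size in ap-southeast-2,
# taken from the table above (25/Apr/2019).
PRICE_T2_SMALL = 0.0292
PRICE_T3A_SMALL = 0.0236

HOURS_PER_MONTH = 24 * 31  # assumption: a 31-day month (744 hours)


def monthly_cost(hourly_price, instance_count, hours=HOURS_PER_MONTH):
    """Return the on-demand cost of running a fleet for one month."""
    return round(hourly_price * instance_count * hours, 2)


t2_cost = monthly_cost(PRICE_T2_SMALL, 100)
t3a_cost = monthly_cost(PRICE_T3A_SMALL, 100)
monthly_saving = round(t2_cost - t3a_cost, 2)
yearly_saving = round(monthly_saving * 12, 2)
percent_saving = round(100 * monthly_saving / t2_cost, 1)

print(t2_cost, t3a_cost, monthly_saving, yearly_saving, percent_saving)
```

Swap in the prices for your own size and region to get your own numbers.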

YMMV, test your workload – and Availability Zones – for support of the t3a.

AWS Certification: Pearson VUE and PSI

With the announcement a few weeks back I thought I’d look back on where I can send my team to get certified. For the last few years, AWS Certification has only had their testing via PSI, and in Perth, that meant one venue, with two kiosks. Prior to that, there were more test centres (with Kryterion as the test provider, as per previous blog post in 2017).

But now Pearson VUE are in the mix alongside PSI, and the expansion is great.

There are now an additional 6 locations to get certified in Western Australia, including the first one outside of Perth by some 300+ kms:

  • North Metropolitan TAFE, 30 Aberdeen St, Northbridge
  • DDLS Perth, 553 Hay Street
  • ATI-Mirage, Cloisters 863 Hay Street
  • Edith Cowan, Joondalup
  • North Metro TAFE, 35 Kendrew Crescent, Joondalup
  • Market Creations, 7 Chapman Road, Geraldton

Geraldton is several hours’ drive north of Perth, at around 420 km (260 mi), with a population of around 40,000. The rest of Western Australia north of that is probably only another 60,000 people in total across Karratha (16k), Carnarvon, Exmouth, Port Hedland, and Dampier.

Let’s get some perspective on these distances, for my foreign friends:

For comparisons, check out this. Suffice to say, it’s a bloody long way. My wife lived for a while in Carnarvon, halfway up the coast; that was around 10 hours of driving to get there.

It would be interesting to see Busselton (pop 74k) and Albany, both to the south, have some availability here, to help people get services without having to trek for days, or not bother at all.

S3 Public Access: Preventable SNAFUs

It’s happened again.

This time it is Facebook who left an Amazon S3 Bucket with publicly (anonymously) accessible data. 540 million breached records.

Previously: Verizon, Pocket iNet, GoDaddy, Booz Allen Hamilton, Dow Jones, WWE, Time Warner, Pentagon, Accenture, and more. Large, presumably trusted names.

Let’s start with the truth: objects (files, data) uploaded to S3, with no options set on the bucket or object, are private by default.
Someone has to either set a Bucket Policy to make objects anonymously accessible, or set each object as Public ACL for objects to be shared.

Let’s be clear.

These breaches are the result of someone uploading data and setting the acl:public-read, or editing a Bucket’s overriding resource policy to facilitate anonymous public access.

Having S3 accessible via authenticated http(s) is great. Having it available directly via anonymous http(s) is not, but historically that was a valid use case.

This week I have updated a client’s account, that serves a static web site hosted in S3, to have the master “Block Public Access” enabled on their entire AWS account. And I sleep easier. Their service experienced no downtime in the swap, no significant increase in cost, and the CloudFront caching CDN cannot be randomly side-stepped with requests to the S3 bucket.

Serving from S3 is terrible

So when you set an object public it can be fetched from S3 with no authentication. It can also be served over unencrypted HTTP (which is a terrible idea).

When hitting the S3 endpoint, the TLS certificate used matches the S3 endpoint hostname, which is something like s3.ap-southeast-2.amazonaws.com. Now that hostname probably has nothing to do with your business brand name, and something like files.mycompany.com may at least give some indication of affiliation of the data with your brand. But with the S3 endpoint, you have no choice.

Ignoring the unencrypted HTTP, the S3 endpoint TLS configuration for HTTPS is also rather loosely curated, as it is a public, shared endpoint with over a decade of backwards compatibility to deal with. TLS 1.0 is still enabled, which would be a breach of PCI DSS 3.2 (and TLS 1.1 is there too, which IMHO is next to useless).

It’s worth noting that there are dual-stack IPv4 and IPv6 endpoints, such as s3.dualstack.ap-southeast-2.amazonaws.com.

So how can we fix this?

CloudFront + Origin Access Identity

CloudFront allows us to select a TLS policy, pre-defined by AWS, but permitting us to restrict available protocols and ciphers. This lets us remove “early crypto” and be TLS 1.2 only.

CloudFront also permits us to use a customer-specific hostname, served via SNI at no additional cost, or via a dedicated IP address (not worth it, IMHO).

Origin Access Identities give CloudFront a rolling API keypair that the service can use to access S3. Your S3 bucket then has a policy permitting this Identity access to the objects in the bucket.
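As a rough sketch of what such a bucket policy can look like, here is a small Python helper that builds one. The bucket name and Origin Access Identity ID are hypothetical placeholders, and granting only s3:GetObject is my assumption of the minimum a static site needs; check it against your own setup before using it.

```python
import json


def oai_bucket_policy(bucket_name, oai_id):
    """Build a bucket policy that lets one CloudFront OAI read objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCloudFrontOAIRead",
                "Effect": "Allow",
                "Principal": {
                    "AWS": "arn:aws:iam::cloudfront:user/"
                           "CloudFront Origin Access Identity " + oai_id
                },
                # Read-only: the OAI can fetch objects, nothing else.
                "Action": "s3:GetObject",
                "Resource": "arn:aws:s3:::" + bucket_name + "/*",
            }
        ],
    }


# Hypothetical bucket name and OAI ID, for illustration only.
policy = oai_bucket_policy("example-static-site", "E2EXAMPLE1OAI")
print(json.dumps(policy, indent=2))
```

The printed JSON is what would be pasted in (or applied via your tooling) as the bucket policy; nothing in it makes the bucket public.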

With this access in place, you can then flick the “Block Public Access” setting account-wide, possibly on the bucket first, then the account-wide settings last.

One thing to work out is your use of URLs ending in “/”. Using Lambda@Edge, we convert these to a request for “/index.html”. Similarly, URL paths that end in “/foo” with no typical suffix get mapped to “/foo/index.html”.
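A minimal sketch of that rewrite as a Python Lambda@Edge function follows. It is not the exact code we run: it assumes the function is attached as an origin-request trigger, and it treats "no typical suffix" as "no dot in the last path segment", which is a simplification of my own.

```python
import posixpath


def rewrite_uri(uri):
    """Map directory-style paths to their index.html object key."""
    if uri.endswith("/"):
        return uri + "index.html"
    last_segment = posixpath.basename(uri)
    if "." not in last_segment:
        # Assumption: no dot in the last segment means "not a file".
        return uri + "/index.html"
    return uri


def handler(event, context):
    """Origin-request handler: rewrite the URI before CloudFront calls S3."""
    request = event["Records"][0]["cf"]["request"]
    request["uri"] = rewrite_uri(request["uri"])
    return request
```

Rewriting at the origin request keeps the viewer's URL unchanged; if you would rather the browser see the canonical path, a redirect response is the alternative.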

Governance FTW?

So, have you checked if Block Public Access is enabled in your account(s)? How about a sweep through right now?
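If you want to script that sweep, something like the following could be a starting point. It is a sketch under assumptions: it needs the third-party boto3 library and working credentials to talk to AWS, the helper names are made up, and it only looks at the account-level settings, not at individual buckets.

```python
REQUIRED_SETTINGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def missing_block_settings(config):
    """Return the Block Public Access settings that are not switched on."""
    return [name for name in REQUIRED_SETTINGS if not config.get(name, False)]


def account_config(account_id):
    """Fetch the account-level settings; needs boto3 and AWS credentials."""
    import boto3  # third-party; imported here so the helper above runs without it
    from botocore.exceptions import ClientError

    client = boto3.client("s3control")
    try:
        response = client.get_public_access_block(AccountId=account_id)
    except ClientError as error:
        # No configuration at all means nothing is blocked.
        if error.response["Error"]["Code"] == "NoSuchPublicAccessBlockConfiguration":
            return {}
        raise
    return response["PublicAccessBlockConfiguration"]
```

Calling missing_block_settings(account_config("123456789012")) with your own account ID (the one shown is a placeholder) returns an empty list when all four settings are on.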

If you’re not sure about this, contact me.

5 AWS Trends and Wishes for early 2019

AWS is the largest public Cloud provider in the world. It is evolving at a rapid clip, and it uses the scale of its service to reap the economies that can only be brought to bear at that scale.

The IT industry is itself evolving, with new patterns, protocols, and approaches being created in and out of the cloud. AWS is well placed to embrace many of these trends: things like WebSockets, IPv6, and more. Not everything is “done” in AWS; it’s all a continuous work-in-progress. But AWS’s approach (independent Service Teams, loose coupling, well-documented API interfaces) and track record put it far ahead of the competition in the race to stay current.

I’ve been using AWS for >10 years now, hold 8 AWS Certifications at this point in time, served nearly 3 years as the only Solution Architect with a “depth” in Security for Australia & New Zealand, have been a Cloud Warrior for 2 years, and am now an AWS Ambassador. I’ve developed and delivered critical government solutions in Australia that the entire population depends upon every day, so I have a reasonably deep understanding of the requirements that organisations have around their digital systems. With nearly 20 years as a Debian Linux developer, and >20 years delivering online services, my experience puts me in a reasonable position to understand the ecosystem.

Here’s a list of things I foresee becoming commonplace in early 2019:

  • Organisation CloudTrail: enforcing company wide API logging standards, leading to better analysis of CloudTrail logs and the activity they expose
  • Enforced patterns around serving static content via S3: blocked public access by default, enabled only by CloudFront and Origin Access Identity to serve content stored in S3. Side effect: appropriate TLS Certificates, and TLS Protocol and Cipher enforcement.
  • Virtual Private Cloud: enforced company-wide standards on routing: Transit Gateway from a corporate “production services” account, once DirectConnect is supported by Transit Gateway
  • CloudFront and ALB set to HTTPS only (possibly with HTTP-> HTTPS redirect), with TLS 1.2 only!

5 Things I’d still like to see in AWS:

  • Improved health checks for Network and Application Load Balancers, similar to the existing ELB (Classic).
  • ECDSA certificates from AWS Certificate Manager
  • TLS 1.3 on ALB, CloudFront, and the ability to restrict TLS Protocols to TLS 1.2+, or TLS 1.3+.
  • VPC: IPv6-only comms for intra-VPC services (RDS, ElastiCache, ALB/ELB, RedShift, etc.), IPv6-only subnets leading to IPv6-only VPCs, helped by service discounts for adopting IPv6-only
  • In Australia: AWS finally added to the ASD Protected Cloud list, without a Consumer Guide!

None of these are surprises to those who have extensively used AWS and hold those valuable AWS certifications. These items don’t preclude your immediate extensive usage of the Cloud; they present visibility of the continuing evolution that is required in IT.

AWS Re:Invent Day 1 thoughts

This is going to be a long week of learning how the world has changed. I’m already tired, and I’m not even there. My brain hurts (you’d not believe how many typos I am correcting here).

While (once again) I am not at Re:Invent in Las Vegas, Nevada, I’m tuned in to as many news sources as possible to try and catch what parts of the undifferentiated heavy lifting have changed. I’ve been one of the AWS Cloud Warriors for the last two years (2017-2018), which meant I was lucky enough to be given a conference ticket, but unfortunately I’ve not been able to get there.

While I may not be physically there, I am in spirit, having been nominated as one of the AWS Ambassadors.

However the live stream video (which has improved dramatically since 2014), the Tweets from various people, the updates on LinkedIn, RSS feeds, Release Notes, What’s New page, AWS Blog (hi Jeff), and indeed, the Recent Changes/Release History sections of lots of the documentation pages (such as this Release History page for CloudFormation) have given me more information to trawl through.

It’s now Tuesday night in Perth, Western Australia, and day two of Re:Invent, but it’s only 7am Tuesday morning in Las Vegas (yes, I’m 16 hours in the future). Here are my thoughts on the releases thus far:

100 Gb/s networking in VPC

The ENA network interface was previously limited to 25 Gb/sec per instance on the largest instance types. Indeed, it’s worth noting that most network resources are limited to some degree by the instance size within an instance family. But now a new family – the C5n instances – has interfaces capable of up to 100 Gb/sec (that’s 12.5 GB/s – little b is bits, big B is bytes).

Much has been said about network throughput, and the comparison between ENA and SR-IOV in the AWS Cloud, and comparisons to other Cloud environments. 100 Gb/s now sets a new high bar that other vendors are yet to reach.

While it’s wonderful to have that level of throughput, it’s also worth noting that scale-out is still sometimes a good idea. 100 instances at 1 Gb/s each may provide a better solution sometimes, but then again sometimes a problem doesn’t split nicely between multiple server instances. YMMV.

Transit gateway

Managing an enterprise within AWS is usually a case of managing multiple AWS accounts. The ultimate in separation from a console/account level sometimes reverts to integration questions around network, governance and other considerations.

In March 2014 (yes, 4 and a half years ago), the VPC service team introduced VPC Peering, a non-transitive peering arrangement between VPCs – a non-throttling, no-single-point-of-failure way of meshing two separate VPCs together (including in separate accounts).

This announcement now gives a transitive way (hence the name) of meshing a spread enterprise deployment. There are multiple reasons for doing so:

  • Compliance: all outbound (to Internet) traffic is deemed by your corporate policy to funnel via specific centralised gateways.
  • Management overhead: organising N-VPCs to mesh together means creating (N-1)*N/2 peering arrangements, and double that number for routing table entries. If we have 4 environments (dev, test, UAT and Production), and 10 applications in their own environments, then that’s 40 VPCs, and 780 peering relationships and 1560 routing table updates.
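The peering numbers in that second point are easy to reproduce. Here is a small sketch; the function names are made up, and the "one attachment per VPC" figure for the hub model is my own simplified comparison rather than anything from the announcement.

```python
def peering_connections(vpc_count):
    """Full mesh: every pair of VPCs needs one peering, (N-1)*N/2."""
    return vpc_count * (vpc_count - 1) // 2


def route_table_updates(vpc_count):
    """Each peering needs a route on both sides, so double the peerings."""
    return 2 * peering_connections(vpc_count)


environments = 4   # dev, test, UAT and Production
applications = 10
vpcs = environments * applications

print(vpcs, peering_connections(vpcs), route_table_updates(vpcs))

# Simplified comparison: with a central hub, each VPC attaches once.
hub_attachments = vpcs
print(hub_attachments)
```

The mesh grows with the square of the VPC count while the hub grows linearly, which is the management-overhead argument in a nutshell.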

It’s worth noting that in some organisations, an account’s administrative users may themselves not have access to create an IGW for access to the internet; a Transit gateway may be the only way permitted for connectivity, so it can be centrally managed.

But in taking central management, you now have a few considerations:

  1. Blast radius: if you stuff up the Transit gateway configuration, you take down the organisation. With separation and peering, each VPC is its own blast radius.
  2. Cost: Transit gateway isn’t free. You probably want to permit S3 Endpoints for large volume object storage.
  3. Throughput: 50 Gb/s may seem a lot, but now there are 100 Gb/s instances.

ARM based A1 Instances

In 2013, when I worked at AWS, I spoke with friends at ARM and AWS Service teams about the possibility of this happening. The attractiveness of the reduced power envelope, and the cost comparison of the chip itself, made it already look compelling then. This was before Windows was compiled to ARM – and that support is only strengthening. It’s heartening now to see this coming out the door, giving customers choice.

Earlier this month we saw an announcement about AMD CPUs. Now we have three CPU manufacturers to choose from in the cloud when looking to run Virtual Machines. Customers can now vote with their workloads as to what they want to use. The CPU manufacturers now have more reason to innovate and make better, faster or cheaper CPUs available. When you can switch platforms easily (you do DevOps, right? All scripted installs?) then it’s perhaps down to the cost question now.

Now, recently it was announced there will be a t3a. Wonder if there will be a t3a1?

Compute in the cloud just got even more commodity. Simon Wardley, fire up your maps.

S3 improvements (lots here)

Gosh, so much here already.

Firstly, an admission that AWS Glacier is no longer its own service, but folded under S3 and renamed as S3 Glacier. There’s a new API for Glacier to make it easier to work with, and the ability to put objects to S3 and have them stored immediately as Glacier objects without having to have zero day archive Lifecycle policies.

SFTP transfers – finally, a commodity protocol for file uploads that simple integrators can use, without having to deploy your own maintained, patched, fault-tolerant, scalable ingestion fleet of servers. This right here is the definition of undifferentiated heavy lifting being simplified, but with a price of 30c/hour, you’re looking at US$216 a month (30 days) before you include any data transfer charges.

Object Lock: the ability to put files and not be able to delete them for a period. For when you have strict compliance requirements. Currently can only be defined on a Bucket during Bucket creation.

S3 events seem to have got a lot more detailed as well, with more trigger types that can be sent to SQS, SNS, or straight to a Lambda function.

KMS with dedicated HSM storage

KMS has simplified the way that key management is done, but some organisations require a dedicated HSM for compliance reasons. Now you can tell KMS to use your custom key store (single-tenanted CloudHSM devices in your VPC) as the storage for these keys, but still use KMS APIs for your own key interaction, and use those keys for your services.

A dedicated Security Conference

Boston, End of June. Two days.

Not so new (but really recent)

CLI Version 2

Something so critical – the CLI – used by so many poor-man (poor-person) integrations and CI/CD pipelines, now with a version 2 in the works. It’s breaking changes time – but in the meantime, the v1 CLI continues to get updates.

Predictive AutoScaling

Having EC2 AutoScaling reactively scale when thresholds are breached has been great, but combining that with machine learning based upon previous scaling events to make predictive scaling is next-level.

Lambda Support for Python 3.7

You may initially think this is trivial, stepping up from Python 3.6 to Lambda with Python 3.7, but it means that Python Lambda code can now make TLS 1.3 requests. Updating from Python 3.6 to 3.7 is mostly trivial; from 2.7 to 3.x normally means re-factoring urllib/requests client libraries and liberal use of parentheses where previously they weren’t required (eg, for print()).
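For a feel of what that looks like in code, here is a hedged sketch using only the standard library. ssl.TLSVersion and the minimum_version attribute arrived in Python 3.7; whether TLS 1.3 is actually on offer depends on the OpenSSL build underneath, which is why the sketch checks ssl.HAS_TLSv1_3 first. The function name is made up for this example.

```python
import ssl


def client_context(require_tls13=False):
    """Create a client TLS context with a floor of TLS 1.2 or TLS 1.3."""
    context = ssl.create_default_context()
    if require_tls13:
        if not ssl.HAS_TLSv1_3:
            # The linked OpenSSL is too old to speak TLS 1.3 at all.
            raise RuntimeError("this Python/OpenSSL build has no TLS 1.3 support")
        context.minimum_version = ssl.TLSVersion.TLSv1_3
    else:
        context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = client_context()
print(context.minimum_version)
```

The resulting context can be handed to urllib.request.urlopen(url, context=context); a connection to a server that cannot meet the floor fails the handshake rather than quietly downgrading.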

S3: Public Access Blocking

Block Public Access finally removes the need for custom Bucket policies to prevent accidental uploads with acl:public (which, when you’re using a 3rd party s3 client for which you can’t see or control the ACL used may be scary). The downfall of the previous policies that rejected uploads if ACL:public (or not acl:private) was used is that it interfered with the ability to do multi-part puts (different API).

There have been way too many cases of customers leaving objects publicly accessible. This will become a critical control in future. Most organisations don’t want public access to S3: those that do want public, anonymous access probably should be using CloudFront to do so (and a CloudFront Origin Access Identity for this as well, with Lambda@Edge to handle auto indexing and trailing slash redirects).

DynamoDB: Encrypted by default

A big step up. In reality, the ‘encryption at rest’ scenario within AWS is a formality: as one of the few people in Australia who has actually been inside a US-East-1 facility (hey QuinnyPig, I recall that from your slide two weeks ago at Latency Conf), I can say the physical security is superb, and responsibility is split: the logical allocation of data and the knowledge of its physical location sit with separate teams.

So given that someone in the facility doesn’t know where your data is, and someone who knows where it is doesn’t have physical access (and those with physical access can’t smuggle storage devices in or out), we’re at a high bar (physical devices only leave facilities when crushed into a very fine powder, particularly for SSD based storage).

So the Encrypted At Rest capability is more a nice to have – an extra protection should the standard storage wiping techniques (already very robust) have an issue. But given the bulk of the AES algorithm has been in CPU extensions for years, the overhead of processing encryption has essentially no impact.

Summary

I’ve tried my best to stay aware of so much, but the last 24 months have stretched the definition of what Cloud is so very wide. IoT, Robotics, Machine Learning, Vision Processing, Connect, Alexa, Analytics, DeepLens: this list seems so wide before you dive deep into the details. And the existing stalwarts: EC2, S3, SQS, and even VPC keep getting richer, and richer.

The above are the services I’ve been interested in – there is definitely a hell of a lot more in the last 24 hours as well.

What’s today (US time) going to bring? I need to get some sleep, because this is exhausting just trying to keep the brain up to date.