How did you get started in AWS?

Someone posed the question recently: how did you get started in using AWS?

Once upon a time… I was working in London (2003-2010), and during my time at Vibrant Media, running the IT operations team for their contextual advertising platform, I was looking for ways to serve content and process requests efficiently.

Vibrant had thousands of customers, and comScore reporting indicated our advertising services were seen by some 49% of the US population each month (the platform was worldwide, but the comScore report was for the US market). It was fairly busy.

In 2008 I stumbled across AWS, which had launched in 2006. At that time the controls were rudimentary, and the architectural patterns for VPC did not suit our requirements (all traffic from the VPC had to egress to the customer VPN – no IGW!). So I parked the idea, and moved on.

In 2010 I returned to Australia, and was approached by the team at Netshelter to implement a crawler for forum sites to identify the influencers in the network. Unlike my previous role at Vibrant, Netshelter had no data centres, no infrastructure, just AWS.

It was Richard Brindley who said: “we just have AWS; don’t worry about the bill, because anything you do in AWS is going to be vastly cheaper than what we would have done on premises”.

With only myself to architect, implement and operate the solution, I had to find ways to make myself scale. Platform as a service – managed components – was key. Any increase in price was worth it, because it meant I didn’t have to deal with the details of operations.

As a Linux developer and system admin for the 15 years prior, I started with the EC2 platform: finding images, launching them, and configuring them. Then came automating the installation: scripting the deployment of the packages required for the code I was writing (back then, in Perl).
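As an aside, here is a minimal modern sketch of that pattern using Python and boto3 (not the Perl tooling I actually used at the time); the AMI ID and the user-data contents are placeholders, not values from the original setup:

import boto3  # assumes AWS credentials are already configured in the environment

# Hypothetical user data: install the packages the worker code needs at first boot.
USER_DATA = """#!/bin/bash
yum install -y perl git
"""

ec2 = boto3.client("ec2", region_name="us-east-1")
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # hypothetical AMI ID
    InstanceType="t3.micro",
    MinCount=1,
    MaxCount=1,
    UserData=USER_DATA,               # runs at first boot, automating the install
)
print(response["Instances"][0]["InstanceId"])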

Pretty quickly, I realised I needed to scale horizontally to get through the work, and I would need some capability to distribute the work out. I turned to SQS, and within a day had the epiphany that a reliable queue system was more important than a fleet of processing nodes. Individual nodes could fail, but a good approach to queuing and message processing could overcome many obstacles.
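A minimal sketch of that queue-centric pattern with today’s tooling (Python and boto3; the queue name and the process() function are hypothetical): the queue, not any individual worker, holds the state of the work, and a message is only deleted once a worker has finished with it.

import boto3

sqs = boto3.resource("sqs", region_name="us-east-1")
queue = sqs.get_queue_by_name(QueueName="crawler-work")  # hypothetical queue name

# Producer: enqueue a unit of work (for example, a URL to crawl).
queue.send_message(MessageBody="https://example.com/forum/thread/42")

# Worker loop: if this node dies mid-task, the message becomes visible again
# after the visibility timeout, and another node picks it up.
while True:
    for message in queue.receive_messages(MaxNumberOfMessages=10, WaitTimeSeconds=20):
        process(message.body)  # hypothetical processing function
        message.delete()       # only acknowledge once the work is done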

In storing my results, I needed a database. I had been MySQL certified for years, writing stored procedures, creating schemas, and managing server updates. All of which was fascinating, but time consuming. RDS MySQL was the obvious choice to save me time.

As VPC capability evolved, additional layers of security became easier to implement without introducing Single Points of Failure (SPOFs), or pinch-points and bottlenecks.

From an Australian perspective, this was an interesting era: it was pre-Region in Australia. That meant that, at that time, most organisations dismissed cloud as not being applicable to them. True, some organisations addressing European and US markets were all-in, but latency and fears around the then-relevant Patriot Act kept usage low (this obviously changed in 2012!).

But in essence, the getting-started advice of not worrying about the bill – compared with what the equivalent all-in cost would have been for co-location fees, bandwidth commitments, compute and storage hardware, rack-and-stack time and costs, and the overhead of managing all of these activities – held true: the immediacy and control of an AWS environment was far more effective.

I didn’t go wild on cost. Keeping an eye on the individual components meant the total charges remained sensible. As they say, look after the pennies, and the pounds look after themselves.

What was key was the approach of continuously learning: relearning something when it changed slightly, and unlearning past behaviours that no longer made sense.

It was also useful to always push the boundaries; reach out and ask service teams to add new capabilities, be they technical, compliance, policy, etc.

How would I start today…well, that’s another article for another day….

Writing (some of) the questions for the AWS Solution Architect Professional Certification

Writing the SA Professional questions in San Francisco.


During my time with AWS, I also helped contribute to an early set of questions for the then-in-development Solution Architect Professional certification. My contributions drew upon my many years of involvement in Linux and Open Source, as well as my time then as AWS Security Solution Architect for Australia and New Zealand.

As time (and I) moved on, I continued to sit more AWS certifications – at this time, I hold 8 AWS Certifications, and am awaiting the results of the new Database Specialty certification. I’ve written many times about sitting these certifications, and given guidance to friends and colleagues on sitting them. I’ve watched as the value of these certifications to an individual has increased, making them amongst the most respected, and best paid, certifications in the technology field.

The attention to detail in running the certifications is high. The whole point of a certification is to discriminate fairly between those who have the required capability to perform a task and those who do not. If the certification were too easy, it would undermine its value to those who are more adept in the topic.

Of course, the certification itself is not based on the same static set of questions. Some questions get invalidated over time as features get released and updated. Some services fall out of fashion, and new services are born that become critical (could you imagine running today without CloudTrail enabled?).

The questions for these certifications are in a pool; each time a candidate sits a certification, a subset of the currently active questions is presented to them. The order of the questions is not fixed. The likelihood of two people getting the same questions, in the same order, is extremely low.

However, over time, the pool runs low. Questions expire. New questions are needed.

Transamerica Building, San Francisco

In January 2020, I received a request to attend a question-writing workshop as a Subject Matter Expert (SME) for the Solution Architect Professional certification.

These workshops bring together some of the most capable, experienced AWS Cloud engineers on the planet. The goal is not to write questions that none of us could pass, but questions that all of us could pass, and that would bring more people up into this tier.

Travel there

Arriving on Sunday, I managed to make it to my hotel, and then ran to dinner with some dear friends and former colleagues from a decade ago who live in and around San Francisco.

Monday was a work day, so I was in the Modis office in San Francisco, talking to our team there about our cloud practice in Australia.

Corey Quinn, @QuinnyPig, Cloud Economist, and James Bromberger having a coffee catch-up in Union Square

I was also lucky enough to cross paths with Corey Quinn, whom I had met when he came to Perth for the Latency conference in 2018. A quick coffee, and we realised we knew a fair number of people in common, across AWS, and the UK and Australia.

Consul-General Nick Nichles speaks at the 111 Minna Gallery during the Austrade Cybersecurity event

As timing was still working well, there was an AISA- and NAB-sponsored trade delegation, with the Australian Consul-General hosting an event in town on Monday evening. Many people were in town for the popular RSA Conference, so I popped along.

Small world it is, running into Andrew Woodward of ECU, and Graeme Speak of BankVault, both from Perth. I was also recognised from my AISA presentations over the last few years…

The Bay Bridge

The exam workshop

14 Subject Matter Experts (SMEs) from around the world gathered in San Francisco for the question-writing workshop. The backgrounds were varied: end customers from massive national broadcasters, finance workloads, government, and more.

Much time was spent trying to strike a fair balance of what should be passable, and trying to ensure the expression of the problems, and the answers, were as clear and unequivocal as possible.

The 2020-02-25 to 2020-02-27 SA Professional workshop team (minus Cassandra Hope)

Three days of this was mentally draining, but the team contributed and reviewed over 100 items. These items now go through further review, and may eventually turn up in an exam that those aspiring to the professional-level AWS certification will sit.

Ding ding! A cable car, the easiest way from Van Ness to Sansome Sts (along California, past the Top of the Mark, and more)

Thanks to the AWS team for organising and paying for my travel, and thanks to my team for letting me participate.


AWS Certified Database — Specialty

Today, Monday 25th of November 2019, is the dawn of a new AWS Certification, the “Certified Database — Specialty”, taking the number of currently active AWS certifications to 12:

  • Cloud Practitioner — Foundational
  • Solution Architect — Associate
  • SysOps — Associate
  • Developer — Associate
  • Solution Architect — Professional
  • DevOps Engineer — Professional
  • Networking — Specialty
  • Security — Specialty
  • Big Data — Specialty
  • Alexa Skills Builder — Specialty
  • Machine Learning — Specialty
  • Database — Specialty

I sat my first AWS Certification, the Solution Architect Associate, back in January of 2013 with the initial cohort of AWS staff while in Seattle, and thus am the equal longest AWS-certified person in the world; I have continued sitting many of these certifications since.

I’ve been using databases – primarily open source databases such as MySQL and Postgres – since the mid-1990s. I was certified by MySQL AB back in 2005 in London. Indeed, in 2004 I wrote (and open sourced) an exhaustive MySQL replication check for Nagios, so I have some in-depth knowledge here.

So today, on this first day of the new certification, I went and sat it. Since this is a new beta, there are no immediate pass/fail scores made available — that will be some time in 2020, when enough people have sat this, and grading can be done to determine a fair passing score (as well as to review the questions).

Services Covered

As always, there’s an NDA so I can’t go into detail about questions, but I can confirm some of the services covered:

  • RDS — of course — with Postgres, MySQL, Oracle and SQL Server
  • DynamoDB – regional and global tables
  • Aurora – both Postgres and MySQL interfaces
  • ElastiCache Redis
  • DocumentDB
  • DMS
  • Glue

Sadly for Corey Quinn, no Route53 as a database-storage-engine, but DNS as a topic did come up. As did a fair amount of security, of course.

What was interesting was a constant focus on high availability, automated recovery, and minimal downtime when doing certain operations. This plays squarely into the Well-Architected Framework.

Who is this Certification for?

In my opinion, this certification is playing straight into the hands of the existing Database Administrator, who has perhaps long felt threatened by the automation that has replaced much of the undifferentiated heavy lifting of basic database operation (patching, replication and snapshots) with Managed RDS instances.

This gives the humble DBA of yore a pathway to regain legitimacy; those that don’t take it will be left behind. It will probably spur many DBAs to undertake architectures and approaches they may have often felt were too hard, or too complicated, when indeed these are quite easy with managed services.

Conclusion

A good outing for a new certification, but the odd typo (the likes of which I produce) was seen (e.g. “cloud” where it should have been “could”, if you can believe me).

For anyone with a Pro SA and Pro DevOps certification, this one shouldn’t be too hard a stretch. Of course, come March I may eat my words.

I know how much work goes into creating these question pools – reviewing the blueprints and questions, and the work yet to be done: grading, and then confirming and rejecting. Well done, Cert team, on another one hitting customers’ hands!

Death of the Data Centre; long live the Data Centre

The last time I worked inside a data centre co-lo was in 2009. From the US East Coast to the West, from the UK to Europe, and here in Australia, I spent many long hours in these windowless hubs of electronic existence.

It’s been 10 years.

I started building a data centre at my father’s manufacturing organisation in the early 1990s. As a small business it had a number of computer systems, a small 100 Mbit/sec LAN, and a room with air-conditioning that we sealed off, in which we deployed dedicated physical servers and UPS units. I recall naming every host on the network with a name starting with the letter P:

  • Parrot
  • Pigeon
  • Pardaloot
  • Pootoo
  • Peacock
  • Pacific Black Duck

You get the idea. The company was called Pelican.

By the time I attended The University of Western Australia, I was of course gravitating to the University Computer Club, a student association I would end up being Vice President and then President of. During my time there, with friends, we furnished a small data centre out of recycled materials in order to contain the cooling for our server farm within the expanse of the vast Cameron Hall building; this structure still stands today (webcams).

In 1997 my interest in networking and digital rights led me to help found The Western Australian Internet Association, now known as Internet.asn.au – an association, not a network, despite the name.

Despite not creating or working at an ISP in those earlier years of the Internet, I was reasonably proficient in physical IT infrastructure deployment. My professional career saw me spend 20 years within the data centres of banks, education, government, and financial services. I used to order millions of dollars of server blade enclosures, remote-control power distribution units and dual-power transfer units for reliability, switches, load balancers, and remote KVM units; and upon notification of delivery at a data centre in Manhattan (111 8th Avenue, or 6th Ave), Seattle, China Basin in San Francisco, Andover MA, Amsterdam or elsewhere, I would organise for myself or my team to parachute in: un-box, unwrap, stack, crimp Ethernet leads, power on, and deploy clusters of servers, then kick off the initial server install, before retreating home to finish the software installation remotely and bring the servers online and into service.

It was all about dependencies; have the right equipment in the right place at the right time to minimise the time spent in the co-lo.

The last one that I worked in was 2009. The last one that I visited was in 2013 – and that was one of the massive halls within the sprawling Amazon Web Services (AWS) US-East-1 complex; a facility that few people ever get to see (no photos).

All that effort, the logistics and physical work of installing equipment, is now largely redundant. I can create virtual data centres on cloud providers from templates – with more fault tolerance, scalability, and privacy – in literally five minutes, anywhere across the planet, without having to spend days (to weeks) hidden away crimping Ethernet cables, balancing redundant power usage, and architecting spanning-tree-powered, reliable layer 2 networks.

While some write of the death of the data centre, I think the data centre has changed who its direct customers are. I’m not interested in touring facilities and planning cabinet layouts. I have better things to do. The hyper-scale cloud providers have automated and abstracted so much, that it is not cost effective for me to do any of that manual work any more.

Vive le Data Centre. You don’t need to market to me any more – just to those cloud providers. Cut your costs: you’re a commodity, and have been for a decade.

Put your CAA in DNS!

There are hundreds of public, trusted* certificate authorities (CAs) in the world. These CAs have had their root CA certificate published into the Trust Store of many solutions that the world uses. These Trust Stores are found in widely used web browsers (like the one you’re using now), in the various programming language runtimes, and in individual operating systems.

A trust store is literally a store of certificates which are deemed trusted. While users can edit their trust store, or make their own, they come with a set that has been selected by your software vendor. Sometimes these are manipulated in the corporate environment to include a company Certificate Authority, or to remove specific distrusted authorities.
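As a quick sketch (using Python’s standard ssl module; nothing here is specific to any particular vendor’s store), you can list the root CAs your own runtime currently trusts:

import ssl

# Load the platform's default trust store and list the trusted root CAs.
ctx = ssl.create_default_context()
for ca in ctx.get_ca_certs():
    subject = dict(entry[0] for entry in ca["subject"])
    print(subject.get("organizationName", "?"), "- expires", ca["notAfter"])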

Over time, some CAs fall into disrepute, and eventually software distributors will issue updates that remove a rogue CA. Of course, issuing an update for systems that the public never apply doesn’t change much in the short term (tip: patch your environments, including the trust store).

Like all X.509 certificates, CA root certificates have an expiry, typically over a very long 20+ year period, and before expiry, much effort is put into creating a new root certificate and having it issued, distributed, and updated in deployed applications.

Legitimate public certificate authorities are required to undertake some mandatory checks when they issue certificates to their customers. These checks are called the Baseline Requirements, and are governed by the CA/Browser Forum industry body. CAs that are found to be flouting the Baseline Requirements are expelled from the CA/Browser Forum, and subsequently most software distributions remove them from their products (sometimes retrospectively via patches, as mentioned above).

Being a Certificate Authority has been a lucrative business over the years. In the early days, it was enough to make Mark Shuttleworth a tidy packet with Thawte – enough for him to become a very early Space Tourist, and then start Canonical. With a trusted CA Root certificate widely adopted, a CA can then issue certificates for whatever they wish to charge.

What’s important to note, though, is that any certificate in use has no bearing on the strength of encryption or the negotiation protocol being used when a client connects to an HTTPS service. The only thing a CA-issued certificate gives you is a reasonably strong validation that the controller of the DNS name you’re connecting to has validated themselves to the CA’s vetting process.

It doesn’t tell you that the other end of your connection is someone you can TRUST, but you can reasonably TRUST that a given Certificate Authority thinks the entity at the other end of your connection may be the controller of their DNS (in Domain Validated (DV) certificates). Why reasonably? Well, what if the controller of the web site you’re trying to talk to accidentally published their PRIVATE key somewhere; a scammer could then set up a site that may look legitimate, poison some DNS, or control a network segment your traffic routes over…

When a CA issues a certificate, it adds a digital signature (typically RSA based) around the originating certificate request. Within the certificate data are the various fields about the subject of the certificate, as well as information about who the issuer is, including a fingerprint (hash) of the issuer’s public certificate.

Previously, CAs would issue certificates signed with an MD5-based signature. MD5 was replaced with SHA1, and around 2014, SHA1 was replaced with SHA2-256.

This signature algorithm is effectively the strength of the trust between the issuing CA and the subject’s certificate that you see on a web site. RSA gets very slow as key sizes get larger; today’s services typically use RSA at 2048 bits, which is currently strong enough to be deemed secure, and fast enough not to be a major performance overhead; make that 4096 bits and it’s another story.
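To make that concrete, here is a small sketch (using Python’s ssl module plus the third-party cryptography package – my tooling choice for this illustration; any HTTPS host will do) that fetches a site’s certificate and reports the issuer, signature hash and public key size discussed above:

import socket
import ssl

from cryptography import x509  # pip install cryptography

HOST = "advara.com"

# Fetch the leaf certificate presented by the server.
context = ssl.create_default_context()
with socket.create_connection((HOST, 443)) as sock:
    with context.wrap_socket(sock, server_hostname=HOST) as tls:
        der = tls.getpeercert(binary_form=True)

cert = x509.load_der_x509_certificate(der)
print("Issuer:        ", cert.issuer.rfc4514_string())
print("Signature hash:", cert.signature_hash_algorithm.name)  # e.g. sha256
print("Key size:      ", cert.public_key().key_size, "bits")  # e.g. 2048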

Not only is the RSA algorithm being replaced, but eventually SHA2-256 will be as well. The replacement for RSA is likely to be Elliptic Curve based, and SHA2-256 will either grow longer (SHA2-384), move to a new algorithm (SHA3-256), or be replaced by a completely new method.

But back to the hundreds of CAs: you probably only use a small number in your organisation – Let’s Encrypt, Amazon, Google, Verisign, GlobalTrust, etc. However, all CAs are seen as equally trusted when a valid certificate they have signed is presented. What can you do to prevent other CAs from issuing certificates in your (DNS) name?

The answer is simple: the DNS CAA record – Certificate Authority Authorisation. It’s a list that says which CA(s) are allowed to issue certificates for your domain. It’s a record in DNS that is looked up by CAs just before they’re about to issue a certificate: if their indicator flag is not found, they don’t issue.

As it is so rarely looked up, you can set this DNS record up with an extremely low TTL (say, 60 seconds). If you get the record wrong, or you forget to whitelist a new CA you’re moving to, just update the record.
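For illustration, a minimal sketch of publishing such a record in Route 53 with Python and boto3 (the hosted zone ID below is a placeholder; the record itself matches the dig output shown further down):

import boto3

route53 = boto3.client("route53")

route53.change_resource_record_sets(
    HostedZoneId="Z0000000000000000000",  # placeholder hosted zone ID
    ChangeBatch={
        "Comment": "Only allow Amazon's CA to issue for this domain",
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "advara.com.",
                "Type": "CAA",
                "TTL": 60,  # low TTL: cheap to change if you move to a new CA
                "ResourceRecords": [{"Value": '0 issue "amazon.com"'}],
            },
        }],
    },
)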

DNS isn’t perfect, but this slight incremental step may help restrict public certificate issuance to only the CAs you’ve made a decision to trust, and for your customers to trust as well.

DNS CAA was defined in 2010, and became an IETF RFC in 2014. I worked with the AWS Route53 team to have the record type supported in 2015. You can inspect CAA records using the dig command:

dig caa advara.com
; <<>> DiG 9.10.6 <<>> caa advara.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 5546
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;advara.com. IN CAA
;; ANSWER SECTION:
advara.com. 60 IN CAA 0 issue "amazon.com"

Here you can see that advara.com has permitted AWS’s Certificate Manager, with its well-known flag of “amazon.com” (and it’s a 60-second TTL).

You’ll also see that various online services will let you inspect this, including SSLLabs.com, Hardenize.com, and more.

Putting a CAA record in DNS typically costs nothing; it’s rarely looked up and can easily be changed. It protects you from someone tricking another CA into issuing certificates they think are legitimate – and this has been seen several times (think how valuable a google.com certificate would be to intercept (MITM) mobile phones, searches, Gmail, etc). While mis-issuance like this may lead to CA/Browser Forum expulsion, and eventual client updates to distrust the CA, it’s far easier to prevent issuance with this simple record.

Of course, DNSSEC would be nice too…