Death of the Data Centre; long live the Data Centre

The last time I worked inside a data centre co-lo was in 2009. From the East Coast of the US to the West, from the UK to Europe, and here in Australia, I spent many long hours in these windowless hubs of electronic existence.

It’s been 10 years.

I started making a data centre at my father's manufacturing organisation in the early 1990s. As a small business it had a number of computer systems, a small 100 Mbit/sec LAN, and an air-conditioned room that we sealed off, in which we deployed dedicated physical servers and UPS units. I recall naming every host on the network with a name starting with the letter P:

  • Parrot
  • Pigeon
  • Pardalote
  • Potoo
  • Peacock
  • Pacific Black Duck

You get the idea. The company was called Pelican.

By the time I attended The University of Western Australia I was, of course, gravitating to the University Computer Club, a student association I would end up being Vice President and then President of. During my time there, with friends, we furnished a small data centre out of recycled materials in order to contain the cooling for our server farm within the expanse of the vast Cameron Hall building; this structure still stands today (webcams).

In 1997 my interest in networking and digital rights led me to help found The Western Australian Internet Association, now known as Internet.asn.au – an association, not a network.

Despite not creating or working at an ISP in those earlier years of the Internet, I was reasonably proficient in deploying physical IT infrastructure. My professional career saw me spend 20 years within the data centres of banks, education, government, and financial services. I used to order millions of dollars of server blade enclosures, remote-control power distribution units, dual-power transfer units for reliability, switches, load balancers, and remote KVM units. Upon notification of delivery at a data centre in Manhattan (111 8th Avenue, or 6th Avenue), Seattle, China Basin in San Francisco, Andover MA, Amsterdam, or elsewhere, I would organise for myself or my team to parachute in, un-box, unwrap, stack, and crimp Ethernet leads, power on and deploy clusters of servers, then kick off the initial server install, retreating home to finish the software install remotely and bring the servers online and into service.

It was all about dependencies: have the right equipment in the right place at the right time to minimise the time spent in the co-lo.

The last one that I worked in was in 2009. The last one that I visited was in 2013 – and that was one of the massive halls within the sprawling Amazon Web Services (AWS) us-east-1 complex; a facility that few people ever get to see (no photos).

All that effort, the logistics and physical work of installing equipment, is now largely redundant. I create virtual data centres on cloud providers from templates, with more fault tolerance, scalability, and privacy, in literally five minutes, anywhere on the planet, without having to spend days (or weeks) hidden away crimping Ethernet cables, balancing redundant power loads, and architecting spanning-tree-powered reliable layer 2 networks.
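As an illustrative sketch only (the stack name and template file here are hypothetical, and assume an AWS CloudFormation-style template), standing one up can be a single CLI call:

# Launch a hypothetical "virtual data centre" (VPC, subnets, routing) from a template
aws cloudformation create-stack \
  --stack-name virtual-dc-sydney \
  --template-body file://virtual-dc.yaml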

While some write of the death of the data centre, I think the data centre has changed who its direct customers are. I'm not interested in touring facilities and planning cabinet layouts; I have better things to do. The hyper-scale cloud providers have automated and abstracted so much that it is not cost-effective for me to do any of that manual work any more.

Vive le Data Centre. You don't need to market to me any more – just to those cloud providers; cut your costs, you're a commodity, and have been for a decade.

Put your CAA in DNS!

There are hundreds of public, trusted* certificate authorities (CAs) in the world. These CAs have had their root CA certificate published into the trust store of many of the solutions the world uses. These trust stores range from widely used web browsers (like the one you're using now), to the various programming language runtimes, to individual operating systems.

A trust store is literally a store of certificates which are deemed trusted. While users can edit their trust store, or make their own, they come with a set that has been selected by your software vendor. Sometimes these are manipulated in the corporate environment to include a company Certificate Authority, or to remove specific distrusted authorities.
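As a rough illustration of what that set looks like (the paths here assume a Debian/Ubuntu-style layout; your operating system will differ), you can count and list the roots your system currently trusts:

# Count the root certificates in the system CA bundle (Debian/Ubuntu path shown)
grep -c 'BEGIN CERTIFICATE' /etc/ssl/certs/ca-certificates.crt

# Print the subject of each trusted root certificate
for f in /etc/ssl/certs/*.pem; do openssl x509 -in "$f" -noout -subject; done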

Over time, some CAs fall into disrepute, and eventually software distributors will issue updates that remove a rogue CA. Of course, issuing an update for systems that the public never patches doesn't change much in the short term (tip: patch your environments, including the trust store).

Like all X.509 certificates, CA root certificates have an expiry, typically over a very long 20+ year period, and before expiry, much effort is put into creating a new root certificate and having it issued, distributed, and updated in deployed applications.
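You can check that validity window on any root certificate you have on disk with openssl (the file name here is just a placeholder):

# Show the subject and the notBefore/notAfter dates of a root certificate
openssl x509 -in some-root-ca.pem -noout -subject -startdate -enddate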

Legitimate public certificate authorities are required to undertake some mandatory checks when they issue their certificates to their customers. These checks are called the Baseline Requirements, and are governed by the CA/Browser Forum industry body. CAs that are found to be flouting the Baseline Requirements are expelled from the CA/Browser Forum, and subsequently most software distributions remove them from their products (sometimes retrospectively via patches, as mentioned above).

Being a Certificate Authority has been a lucrative business over the years. In the early days, it was enough to make Mark Shuttleworth a tidy packet with Thawte – enough for him to become a very early space tourist, and then start Canonical. With a trusted CA root certificate widely adopted, a CA can then issue certificates at whatever price they wish to charge.

What's important to note, though, is that any certificate in use has no bearing on the strength of the encryption or negotiation protocol being used when a client connects to an HTTPS service. The only thing a CA-issued certificate gives you is a reasonably strong assurance that the controller of the DNS name you're connecting to has validated themselves through the CA's vetting process.

It doesn't tell you that the other end of your connection is someone you can TRUST, but you can reasonably TRUST that a given Certificate Authority thinks the entity at the other end of your connection may be the controller of that DNS name (in the case of Domain Validated (DV) certificates). Why reasonably? Well, what if the controller of the web site you're trying to talk to accidentally published their PRIVATE key somewhere? A scammer could then set up a site that looks legitimate, poison some DNS, or control a network segment your traffic routes over…

When a CA issues a certificate, it adds a digital signature (typically RSA based) around the originating certificate request. Within the certificate data are the various fields about the subject of the certificate, as well as information about who the issuer is, including a fingerprint (hash) of the issuer's public certificate.

Previously, CAs would issue certificates signed using an MD5-based signature. MD5 was replaced with SHA1, and around 2014, SHA1 was replaced with SHA2-256.
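You can see which signature algorithm a CA used on the certificate a site presents with openssl (example.com is a placeholder; expect output such as sha256WithRSAEncryption):

# Fetch the certificate presented by a site and show the signature algorithm(s)
echo | openssl s_client -connect example.com:443 -servername example.com 2>/dev/null \
  | openssl x509 -noout -text | grep 'Signature Algorithm'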

This signature algorithm is effectively the strength of the trust between the issuing CA and the subject's certificate that you see on a web site. RSA gets very slow as key sizes get larger; today's services typically use RSA at 2048 bits, which is currently strong enough to be deemed secure, and fast enough not to be a major performance overhead; make that 4096 bits and it's another story.
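If you want to see that difference on your own hardware, openssl's built-in benchmark gives a rough comparison of 2048-bit versus 4096-bit RSA speed:

# Benchmark RSA sign/verify operations per second at both key sizes
openssl speed rsa2048 rsa4096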

Not only is the RSA algorithm being replaced, but eventually SHA2-256 will be as well. The replacement for RSA is likely to be Elliptic Curve based, and SHA2-256 will either grow longer (SHA2-384), move to a new algorithm (SHA3-256), or be replaced by a completely new method.

But back to the hundreds of CAs: you probably only use a small number in your organisation – Let's Encrypt, Amazon, Google, Verisign, GlobalTrust, etc. However, all CAs are seen as equally trusted when a client is presented with a valid signed certificate. What can you do to prevent other CAs from issuing certificates in your (DNS) name?

The answer is simple: the DNS CAA record – Certificate Authority Authorisation. It's a list that says which CA(s) are allowed to issue certificates for your domain. It's a record in DNS that CAs look up just before they're about to issue a certificate: if their indicator flag is not found, they don't issue.

As it is so rarely queried, you can set this DNS record up with an extremely low TTL (say, 60 seconds). If you get the record wrong, or you forget to whitelist a new CA you're moving to, simply update the record.
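In BIND-style zone file syntax, a CAA policy might look like the sketch below (example.com and the permitted CA are placeholders; the issuewild and iodef lines are optional extras):

example.com.  60  IN  CAA  0 issue "letsencrypt.org"               ; only this CA may issue
example.com.  60  IN  CAA  0 issuewild ";"                         ; no wildcard issuance by anyone
example.com.  60  IN  CAA  0 iodef "mailto:security@example.com"   ; where CAs report violations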

DNS isn't perfect, but this slight incremental step may help restrict public certificate issuance to only the CAs you've made a decision to trust, and that your customers can trust as well.

DNS CAA was defined in 2010, and became an IETF RFC (RFC 6844) in 2013. I worked with the AWS Route 53 team to have the record type supported in 2015. You can inspect CAA records using the dig command:

dig caa advara.com
; <<>> DiG 9.10.6 <<>> caa advara.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 5546
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;advara.com. IN CAA
;; ANSWER SECTION:
advara.com. 60 IN CAA 0 issue "amazon.com"

Here you can see that advara.com has permitted AWS's Certificate Manager, with its well-known issuer value of "amazon.com" (and it's a 60-second TTL).

You’ll also see that various online services will let you inspect this, including SSLLabs.com, Hardenize.com, and more.

Putting a CAA record in DNS typically costs nothing; it's rarely looked up and can easily be changed. It protects you from someone tricking another CA into issuing certificates they think are legitimate – and this has been seen several times (think how valuable a google.com certificate would be to intercept (MITM) mobile phones, searches, Gmail, etc). While mis-issuance like this may lead to CA/Browser Forum expulsion, and eventual client updates to distrust the offending CA, it's far easier to prevent issuance with this simple record.

Of course, DNSSEC would be nice too…

Project & Support versus DevOps and Service teams

The funding model for the majority of the world's IT projects is fundamentally flawed, and the fallout, over time, is broken systems, lacking security, and legacy systems.

It's pretty easy to see that digital systems are the lifeblood of most organisations today: from banking, to stock inventory and tracking, to HR systems. And the majority of these critical operations have been deployed as "projects", and then "migrated to support". And it's that "migrate to support" that is the problem.

Support roles are typically over-subscribed and under-empowered. It's a cost-saving exercise to minimise the overhead by taking the more expensive development resources and moving them to a fresh project, while more commodity problem-solving labour comes along to triage operational run-time issues. However, that support function has no history in the design and architecture, and often either has no access to the development and test environments to continue doing managed change, or is not empowered to do so. The end result is that Support teams use the deployed production features (e.g. manually add a user to a standalone system) instead of driving incremental improvements (e.g. automatically add a user based on the HR system being updated).

Contrast this with a DevOps team of dynamic size over time. The team that builds & tests & deploys & automates this more complete lifecycle, and stays with the critical line-of-business system, becomes a Service Team. Any changes they need to perform are not applied locally in production, as is often the case with "Support teams", but in the Development environment. These changes should then pass automated testing and feedback loops before being promoted to a higher environment. Sounds great, yeah?
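To make that concrete, a minimal sketch of such a promotion step (the test script, stack name, and CloudFormation-based deployment are all assumptions for illustration):

# Only promote to the staging environment if the automated test suite passes
./run-tests.sh && aws cloudformation deploy \
  --stack-name line-of-business-staging \
  --template-file service.yaml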

Unfortunately, economic realities are the constraint here. Both the customer and the consultancy are trying to minimise cost, not maximise capability. And navigating procurement and legal teams is something organisations want to do as rarely as possible, not on a continuous basis.

Contrast this with a Service team focus: variable in size, containing different capabilities over time. The cost for this team varies based upon the required skill set. The team objective is to make the best Service they can, and they need to be driven by metrics – Availability, Latency, Accuracy – while meeting strict security requirements.

From the Service team's perspective, they obviously need remuneration for their time, but they also want to take pride in their work and feel a sense of achievement.

A Support Team is not a Service Team, as they don't have the full Software Lifecycle Management and/or Data Lifecycle Management capability. A Service Team should never be one person; that's one step away from being zero people. A Service Team may look after more than one service, but not so many that they lose crystal-clear focus on any one service.

AWS Partner Ambassador Meetup #1, Seattle, August 2019

The inaugural global meetup of the top partner engineers from around the world.

Another long overdue post from three weeks ago…

On the heels of the AWS Canberra Public Sector Summit 2019, and after some 24 hours at home with my family, I joined my fellow AWS Partner Ambassador at Modis – Steve Kinsman – and we started to wend our way across three flights to get to Seattle, departing a few minutes after midnight on Friday night/Saturday morning.

That guy behind me better not kick my seat! 😉