Project & Support versus DevOps and Service teams

The funding model for the majority of the world’s IT projects is fundamentally flawed, and the fallout, over time, is broken systems, gaps in security, and legacy systems.

It’s pretty easy to see that digital systems are the lifeblood of most organisations today: banking, stock inventory and tracking, HR systems. The majority of these critical operations have been deployed as “projects” and then “migrated to support”, and it’s that “migrate to support” step that is the problem.

Support roles are typically oversubscribed and under-empowered. It’s a cost-saving exercise to minimise the overhead: the more expensive development resources move on to a fresh project, while more commodity problem-solving labour comes along to triage operational run-time issues. However, that support function has no history with the design and architecture, and often either has no access to the development and test environments to continue doing managed change, or is not empowered to do so. The end result is that Support teams use the deployed production features (e.g. manually add a user to a standalone system) instead of driving incremental improvements (e.g. automatically add a user based on the HR system being updated).

Contrast this with a DevOps team of dynamic size over time. The team that builds, tests, deploys and automates this more complete lifecycle, and stays with the critical line-of-business system, becomes a Service Team. Any changes they need to perform are not applied locally in production, as is often the case with “Support teams”, but in the Development environment; they should then pass automated testing and feedback loops before being promoted to a higher environment. Sounds great, yeah?

Unfortunately, economic realities are the constraint here. Both the customer and the consultancy are trying to minimise cost, not maximise capability. And navigating procurement and legal teams is something organisations want to do as rarely as possible, not on a continuous basis.

Contrast this with a Service Team: variable in size over time, containing different capabilities over time, with a cost that varies based upon the required skill set. The team’s objective is to build the best Service it can, driven by metrics (Availability, Latency, Accuracy) while meeting strict security requirements.

From the Service Team’s perspective, they obviously need remuneration for their time, but they also want to take pride in their work and feel a sense of achievement.

A Support Team is not a Service Team, as it doesn’t have the full Software Lifecycle Management capability and/or Data Lifecycle Management capability. A Service Team should never be one person; that’s one step away from being zero people. A Service Team may look after more than one service, but not so many that it loses crystal-clear focus on any of them.

S3 Public Access: Preventable SNAFUs

It’s happened again.

This time it is Facebook, which left an Amazon S3 bucket with publicly (anonymously) accessible data: 540 million breached records.

Previously, Verizon, PicketiNet, GoDaddy, Booz Allen Hamilton, Dow Jones, WWE, Time Warner, Pentagon, Accenture, and more. Large, presumably trusted names.

Let’s start with the truth: objects (files, data) uploaded to S3, with no options set on the bucket or object, are private by default.
Someone has to either set a Bucket Policy to make objects anonymously accessible, or set a public ACL on each object, for that data to be shared.

Let’s be clear.

These breaches are the result of someone uploading data and setting the ACL to public-read, or editing a bucket’s overriding resource policy to facilitate anonymous public access.
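
To make this concrete, here is a minimal sketch (boto3, with hypothetical bucket and key names) of what those two mistakes look like in code. Neither is something S3 does by itself; a person or a deployment script has to ask for it.

```python
import json
import boto3

s3 = boto3.client("s3")

# Mistake 1: uploading an object with a public-read canned ACL.
s3.put_object(
    Bucket="example-corp-data",    # hypothetical bucket
    Key="exports/users.csv",
    Body=b"...",
    ACL="public-read",             # this one argument makes the object anonymously readable
)

# Mistake 2: attaching a bucket policy that grants s3:GetObject to everyone ("*").
public_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::example-corp-data/*",
    }],
}
s3.put_bucket_policy(Bucket="example-corp-data", Policy=json.dumps(public_policy))
```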

Having S3 accessible via authenticated http(s) is great. Having it available directly via anonymous http(s) is not, but historically that was a valid use case.

This week I updated a client’s account, which serves a static web site hosted in S3, to have the master “Block Public Access” setting enabled across their entire AWS account. And I sleep easier. Their service experienced no downtime in the swap, no significant increase in cost, and the CloudFront caching CDN can no longer be randomly side-stepped with requests direct to the S3 bucket.

Serving from S3 is terrible

So when you set an object public, it can be fetched from S3 with no authentication. It can also be served over unencrypted HTTP, which is a terrible idea.

When hitting the S3 endpoint, the TLS certificate used matches the S3 endpoint hostname, which is something like s3.ap-southeast-2.amazonaws.com. Now that hostname probably has nothing to do with your business brand name, and something like files.mycompany.com may at least give some indication of affiliation of the data with your brand. But with the S3 endpoint, you have no choice.

Ignoring the unencrypted HTTP, the S3 endpoint TLS configuration for HTTPS is also rather loosely curated, as it is a public, shared endpoint with over a decade of backwards compatibility to deal with. TLS 1.0 is still enabled, which would be a breach of PCI DSS 3.2 (and TLS 1.1 is there too, which IMHO is next to useless).

It’s worth noting that there are dual-stack IPv4 and IPv6 endpoints, such as s3.dualstack.ap-southeast-2.amazonaws.com.

So how can we fix this?

CloudFront + Origin Access Identity

CloudFront allows us to select a TLS security policy, pre-defined by AWS, that restricts the available protocols and ciphers. This lets us remove “early crypto” and be TLS 1.2 only.

CloudFront also permits us to use a custom hostname, served to SNI-enabled clients at no additional cost, or via a dedicated IP address (not worth it, IMHO).
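
As a sketch, the relevant fragment of a CloudFront DistributionConfig (as passed to create_distribution or update_distribution in boto3) looks something like the following; the ACM certificate ARN is hypothetical, and note that CloudFront expects the certificate to live in us-east-1.

```python
viewer_certificate = {
    "ACMCertificateArn": "arn:aws:acm:us-east-1:111122223333:certificate/example-id",
    "SSLSupportMethod": "sni-only",            # SNI is free; "vip" dedicated IPs cost extra
    "MinimumProtocolVersion": "TLSv1.2_2018",  # drops TLS 1.0/1.1 and early ciphers
}
```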

Origin Access Identities give CloudFront a rolling API key pair that the service can use to access S3. Your S3 bucket then has a policy permitting this Identity access to the bucket’s objects.
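
A minimal sketch of such a bucket policy (hypothetical bucket name and OAI id), granting the Origin Access Identity read access so that the bucket itself never needs to be public:

```python
import json
import boto3

oai_id = "E2EXAMPLE123456"   # as returned by create_cloud_front_origin_access_identity
bucket = "example-corp-static-site"

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "AllowCloudFrontOAIRead",
        "Effect": "Allow",
        "Principal": {
            "AWS": f"arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity {oai_id}"
        },
        "Action": "s3:GetObject",
        "Resource": f"arn:aws:s3:::{bucket}/*",
    }],
}
boto3.client("s3").put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
```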

With this access in place, you can then flick on the “Block Public Access” setting, possibly on the bucket first, then account-wide last.
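
A sketch of that sequence with boto3 (bucket name and account id are hypothetical): the same four flags are applied at the bucket level first, then account-wide via the S3 Control API.

```python
import boto3

block_all = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}

# Bucket level first...
boto3.client("s3").put_public_access_block(
    Bucket="example-corp-static-site",
    PublicAccessBlockConfiguration=block_all,
)

# ...then the entire account.
boto3.client("s3control").put_public_access_block(
    AccountId="111122223333",
    PublicAccessBlockConfiguration=block_all,
)
```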

One thing to work out is your use of URLs ending in “/”. Using Lambda@Edge, we convert these to a request for “/index.html”. Similarly, URL paths that end in “/foo” with no typical suffix get mapped to “/foo/index.html”.
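
A minimal sketch of that Lambda@Edge function, attached as an origin-request handler (the rewrite logic is the interesting part; error handling is omitted):

```python
def handler(event, context):
    # CloudFront origin-request event: rewrite the URI before it reaches S3.
    request = event["Records"][0]["cf"]["request"]
    uri = request["uri"]
    if uri.endswith("/"):
        request["uri"] = uri + "index.html"       # "/blog/" -> "/blog/index.html"
    elif "." not in uri.rsplit("/", 1)[-1]:
        request["uri"] = uri + "/index.html"      # "/blog"  -> "/blog/index.html"
    return request
```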

Governance FTW?

So, have you checked whether Block Public Access is enabled in your account(s)? How about a sweep through right now?
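
A sketch of such a sweep, checking the account-level setting for each account in a hypothetical list (in practice you would assume a role into each account first):

```python
import boto3
from botocore.exceptions import ClientError

for account_id in ["111122223333", "444455556666"]:
    s3control = boto3.client("s3control")   # use per-account credentials in practice
    try:
        conf = s3control.get_public_access_block(AccountId=account_id)
        print(account_id, conf["PublicAccessBlockConfiguration"])
    except ClientError as err:
        if err.response["Error"]["Code"] == "NoSuchPublicAccessBlockConfiguration":
            print(account_id, "Block Public Access is NOT configured")
        else:
            raise
```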

If you’re not sure about this, contact me.

AWS re:Invent: the rest of the releases

Well, that was a busy week. It was almost impossible to keep up with the announcements; an overwhelming feeling of something akin to playing Tetris, as announcements poured down faster than I could read, understand and appreciate them.

So, having got past day 1, here’s what I think of what happened next:

DynamoDB Transactions and on-demand

ACID compliance (atomicity, consistency, isolation, and durability) was always one of the constraints that those new to NoSQL were trying to understand. For some workloads it was OK to move this validation into user space (the app server); for others, not so much.

On-demand DynamoDB removes the need to set capacity and sharding requirements up front, and lets DynamoDB scale (and charge) as required, based on usage patterns.
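
A sketch of both features with hypothetical table and attribute names: a table created in on-demand mode, and an atomic write across two tables where either every item lands or none do.

```python
import boto3

ddb = boto3.client("dynamodb")

# On-demand: no provisioned capacity to plan or manage.
ddb.create_table(
    TableName="orders",
    AttributeDefinitions=[{"AttributeName": "pk", "AttributeType": "S"}],
    KeySchema=[{"AttributeName": "pk", "KeyType": "HASH"}],
    BillingMode="PAY_PER_REQUEST",
)

# A transaction: create the order only if it doesn't already exist, and decrement
# stock only if there is stock left; both writes succeed or both fail.
ddb.transact_write_items(TransactItems=[
    {"Put": {
        "TableName": "orders",
        "Item": {"pk": {"S": "order#1001"}, "qty": {"N": "1"}},
        "ConditionExpression": "attribute_not_exists(pk)",
    }},
    {"Update": {
        "TableName": "inventory",
        "Key": {"pk": {"S": "sku#42"}},
        "UpdateExpression": "SET stock = stock - :n",
        "ConditionExpression": "stock >= :n",
        "ExpressionAttributeValues": {":n": {"N": "1"}},
    }},
])
```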

CloudWatch Logs Insights

When I first saw this console, it just yelled “This is Sumo Logic” at me.

Outposts

For many years, infrastructure delivered to a Region has been a pre-configured rack with all equipment ready to run. This release effectively shifts the delivery address from “AWS Region” to a customer’s data centre. It means there is a new channel for delivery of the equipment, which produces more scale and, ultimately, drives down cost further.

But who still wants to run data centres? The compliance, maintenance and physical security burdens are all very compelling reasons not to. Plus, an on-premises deployment has maintenance and capacity limits that are way lower than the Region’s.

S3 Glacier Deep Archive

From Glacier with infrequent retrieval to even deeper retention: Deep Archive requires a minimum data retention of half a year, and restores take around 12 hours. But the benefit is a huge price saving: US$1/TB/month (yes, per terabyte). That’s US$0.0009765/GB/month, so it’s about time we changed the unit of measurement to the TB. Compare with Azure Blob storage at US$0.002/GB/month (US$2.048/TB/month): Deep Archive is less than half the cost.

When combined with some sensible data workflows for backups, you’ll save a ton of money. But the biggest win will be when third-party backup solutions can instrument this themselves automatically. For example, the last 7 days of backups may sit in S3 Standard, then get migrated to Glacier for 3 months, and then to Glacier Deep Archive after that.
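
Until the backup vendors do that automatically, an S3 lifecycle rule can implement the tiering described above. A sketch with a hypothetical bucket name, roughly matching those timings (7 days in Standard, about 3 months in Glacier, then Deep Archive, expiring after a year):

```python
import boto3

boto3.client("s3").put_bucket_lifecycle_configuration(
    Bucket="example-corp-backups",
    LifecycleConfiguration={"Rules": [{
        "ID": "tier-backups",
        "Status": "Enabled",
        "Filter": {"Prefix": ""},   # apply to the whole bucket
        "Transitions": [
            {"Days": 7, "StorageClass": "GLACIER"},
            {"Days": 97, "StorageClass": "DEEP_ARCHIVE"},   # ~7 days + 3 months
        ],
        "Expiration": {"Days": 365},
    }]},
)
```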

Using the above tiering, let’s do a 2 TB full backup once per week, and a 100 GB daily incremental. We’ll have 2.6 TB in S3 Standard, then 11 weeks of S3 Glacier at 26 TB, and then 9 months of S3 Glacier Deep Archive at 93.6 TB. The total monthly cost is 2.6 * 1024 * 0.025 + 26 * 1024 * 0.005 + 93.6 * 1 = 66.56 + 133.12 + 93.6 = US$293.28/month, or US$3,519.36/year, assuming one year of retention.

If we had kept this all on S3 Standard, we would have been looking at US$37,539.84/year.

So, who’s going to make the first move? CommVault? Synology? StoreSimple? Storage Gateway VTL?

Managed Blockchain

Previously AWS had said it didn’t want to run a managed Blockchain service, saying no one company should sit at the centre of this, but customer demand won out, and there are now two services filling the space: Managed Blockchain, and the Quantum Ledger Database (QLDB).

Both of these are interesting to me, and I’ll be speaking with customers to see if they want us to integrate them into their solutions. Neither will replace using a relational database for temporal processing, state, etc. But for point-in-time, authoritative, signed data, they look interesting.

Textract

This one requires some testing. I’ve previously looked at Mechanical Turk for doing human-intelligence-level OCR, but as a service this may be better. Any process that does text extraction should have a multi-pronged approach to ensure accuracy: perhaps a pass of Textract, followed by a pass of Mechanical Turk (or other humans), and then, if there is a conflict/mismatch, flag for management inspection.
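
For the first (machine) pass, a sketch of the synchronous Textract call with hypothetical bucket and object names; the extracted lines would then be compared against the human pass.

```python
import boto3

textract = boto3.client("textract")
resp = textract.detect_document_text(
    Document={"S3Object": {"Bucket": "example-corp-scans", "Name": "invoice-0001.png"}}
)
# Pull out just the LINE blocks as plain text.
lines = [block["Text"] for block in resp["Blocks"] if block["BlockType"] == "LINE"]
print("\n".join(lines))
```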

Security Hub

This is huge for me, and one I am actively getting my head around before recommending it into customer environments. It’s also enthused me to get back to AWS Config, which I’d previously discounted on cost.

Security Hub unites several AWS security services. Each of these has had its own interface, cross-account capabilities, etc. Of course, for me and my Public Sector customers, the lack of Macie in Australia is still a consideration here.

AWS Organisational CloudTrails

I’ve been a fan of CloudTrail since I first heard of it. The fact that it could always deliver API logs cross-account, to a dedicated security account, without any fear or possibility of them being filtered or edited by the source account, was a key enabler for enterprise workloads.

It’s developed well since its initial launch, with multi-region support, digest files to detect tampering, and more. But with all these options came the possibility of inconsistent deployments across a large fleet of accounts.

And while my perception has always been one of consistency, it’s only after circling back that you realise not everything is consistent, with new AWS accounts having been added at different times.

It is only with me starting to play with Config and Security Hub (see above) that these inconsistencies have come to light; and the new solution to this is just in time: Organisation Trails, which apply from the Billing/Organisation account down to all member accounts.

An Organisation trail in a member account cannot be deleted or modified. It can log cross-account in almost the same way as the previous implementation, with the exception of a few new permissions required on the destination S3 bucket policy.
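
A sketch of creating such a trail from the Organisation master account (trail and bucket names are hypothetical); the destination bucket policy also needs to allow cloudtrail.amazonaws.com to write objects under the organisation’s path.

```python
import boto3

ct = boto3.client("cloudtrail")
ct.create_trail(
    Name="org-trail",
    S3BucketName="example-corp-cloudtrail-logs",
    IsMultiRegionTrail=True,
    IsOrganizationTrail=True,         # applies to every account in the Organisation
    EnableLogFileValidation=True,     # digest files to detect tampering
)
ct.start_logging(Name="org-trail")
```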

Lambda Ruby, BYO Runtime, and Firecracker

Firecracker is a strong story, but in the end, having a managed environment for it is worth it if I can use one (i.e. if latency, sovereignty, etc. requirements can be met). What will be interesting is the opportunity for more eyes to review its source code.

FSx (Lustre & Windows Fileshare)

Managed file shares sound great, but now there’s confusion between EFS and FSx (and to some degree, Storage Gateway as an NFS and CIFS file share).

And much more

I won’t go into detail on the large list of other services; my interest is in the vast majority of web, security and DevOps-enabling services that continue to incrementally improve. But what happens next is interesting.

Config revisited

When it first launched, I got bill shock from turning Config on with just a few rules. But now it’s much richer, and easier to understand. As it is one of the security tools that feeds across into Security Hub, it’s forced me to circle back to Config and start re-evaluating some of its rules. It’s come a long way, and much of the tooling I have written myself in the past to do cross-account checks, which Config also does, can now feed via Security Hub back to a central (organisation-wide) interface for alerting and actioning.
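
As an example of the kind of rule worth re-evaluating, here is a sketch of enabling one of the AWS managed Config rules (it flags S3 buckets that allow public reads, and its findings can then surface in Security Hub); it assumes the Config recorder is already running in the account.

```python
import boto3

boto3.client("config").put_config_rule(ConfigRule={
    "ConfigRuleName": "s3-bucket-public-read-prohibited",
    "Source": {
        "Owner": "AWS",
        "SourceIdentifier": "S3_BUCKET_PUBLIC_READ_PROHIBITED",
    },
})
```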

Summary

With some 50,000 people at re:Invent this year, the pace of innovation continues to put AWS far ahead of its competitors.

AWS CloudFront launches in Perth

I moved back to Perth in 2010, having grown up here, gone to school and university, and started my career here. It’s a lovely city, with the metropolitan area sprawling north and south along the blue Indian Ocean for some 50+ km. They say it’s a bit of a Mediterranean climate, normally never going below 0°C, with the heat of summer hitting the mid-40s °C, but with a fresh westerly coastal breeze appearing most afternoons to cool the place down.

But it is rather remote from other major population centres. The next nearest capital city, Adelaide, is 2,600 km (1,600 miles) by road. Melbourne is 3,400 km (2,100 miles) on the road, and Sydney is 3,900 km (2,400 miles). It’s a large state, some 2.5 million square kilometres of land, the size of the US states of Alaska and Texas combined.

So one thing those in technology are well aware of is latency. Even with fibre to the premises (NBN in Australia), the round trip time to Sydney is around 55 ms, which is similar to the time to Singapore. Melbourne comes in around 45 ms.

[Image: latency from Perth to Singapore, Sydney and Melbourne, and from New Zealand to Sydney]

In 2013 I met with the AWS CloudFront team in Seattle, and talked through the distances and population size (circa 2 million) of Perth. There are a lot of metrics that go into selecting roll-out locations (Points of Presence) for caching services, with latency, population size, economic prosperity, cost of doing business, customer demand from a direct customer model, and customer demand from an end-consumer model all being weighed up.

This week (1st week of January 2018) AWS CloudFront launched in Perth.

The impact of this is that all web sites using CloudFront will now appear faster for cacheable content to people in Perth. The latency has dropped from around 45 ms (to Melbourne) to around 3 ms to 5 ms (from a residential NBN FTTP connection @ 50 Mbit/sec).

[Image: test at 9:30pm from Perth (iiNet NBN).]

In addition, the ability to upload/send data to applications in-Region via the Edge (Transfer Acceleration, or Edge Upload) may now also make a difference; with 45 ms to Melbourne, it has been a largely unused feature, as the acceleration hadn’t made much of a difference. There is a Transfer Acceleration test tool that shows what effect this will give you; right now, while it shows an advantage to Singapore, it shows just a 7% increase in performance to the AWS Sydney Region. It’s not clear if Transfer Acceleration via the Perth PoP is enabled at this point, so perhaps this result will change over time.

And so, after several years, and with other improvements like the ability to restrict HTTPS traffic to TLS 1.2, it now makes sense to me to use CloudFront for my personal blog. In an hour, I had applied a new (additional) hostname against my origin server (a Linux box running WordPress) by editing the Apache config, symlinking the WordPress config file, and adding a Route53 CNAME for the host. I had certbot on Linux add the new name to the Let’s Encrypt certificate on the origin. Next I applied for an AWS Certificate Manager SSL certificate, with the hostname blog, and (if you inspect it) blog-cloudfront.james.rcpt.to. I then created a CloudFront distribution with one origin but two behaviours: one for the WordPress admin path, and one for the default paths, so that I could apply additional rules to protect the administration interface.

With this in place I could then update the DNS CNAME to move traffic to CloudFront, without any downtime. Not that downtime matters on my personal blog, but exercises like this need practice.

Welcome to Perth, CloudFront.

PS: It’s worth noting that IPv4 DNS resolution for my CloudFront distribution is giving me 4 ms RTT from Perth, but IPv6 RTT is 52 ms, which indicates that IPv6 CloudFront has not yet arrived here.

Staring at tomorrow: Technology Impacts on society

This started as musings on where broadband connections enter buildings. It went further.


There are a few technical specifications that we’ve become very used to in our homes and offices over the last few decades, but they’ve not always been there. Their introduction to our lives has spurred changes in the construction of our dwellings, and in our expectations of what we do in our lives. And another one is unfolding now.

Plumbing (scheme water, sewerage), electricity, and broadcast TV have all had their impacts. Water in and out has remained largely unchanged in the last 50 years, having moved from outhouses to servicing dedicated ‘wet’ rooms in our buildings (kitchens, bathrooms, WCs). Electricity has crept from being provided for lighting purposes only, to additionally providing a single power socket per home, to multiple sockets per room at convenient places for us to leave devices semi-permanently plugged in. We’ve become accustomed to 110V/60Hz and 230V/50Hz, and the sockets and plugs are generally down to several well-known arrangements of pins and locking mechanisms.

With the dawn of broadcast TV (UK: 1936, US: 1948, Australia: 1956), and excluding set-top “rabbit ears”, we’ve had antennas on roofs with coaxial cables leading to specific rooms: family rooms, bedrooms, etc. Using the superior reception of the roof-top antenna has meant that the placement of TVs was pretty much determined by the location of coaxial sockets in walls. Indeed, not just which rooms, but which corners and walls of specific rooms, close to these antenna sockets. While we’ve happily added multiple power sockets to rooms to future-proof our desire to rearrange our lives, coaxial cable suffered more loss with more joins to other sockets, so it was generally avoided.

Thus the layout of our homes has been throttled by the tether to the roof-top antenna.

We are well into the next revolution: delivery of our video feeds over a new carrier, IP and the Internet. The age of radio-frequency broadcast is declining, and with it some of our cultural norms.

One impact is building design: a time will come when we’ll no longer request TV broadcast antennas on buildings, or coaxial sockets on walls. These are replaced with either short-range, high-speed WiFi such as 802.11ac, or wired Ethernet ports at gigabit speed (no less). Wireless has continued to increase in speed, but as with all broadcast spectrum, it is subject to interference from other wireless signals, which can impact throughput. A continual battle between ever-faster wireless speeds, improved signal processing and interference handling, and cost considerations will play out to determine whether our TVs have a wired or wireless IP connection within the home.

Wireless has been through several generations, some of which are now well and truly dead. WEP encryption, once the critical protection for wireless networks, is known to be compromisable, and thus is abandoned today. Hardware devices that only implemented WEP are effectively junk. Devices with a physical Ethernet socket have not suffered so; the trusty RJ45 100BASE-TX CSMA/CD works as well today as it did 20 years ago.

We’ll see people opt to use wireless for video delivery within homes for short-term savings, but the aggressive rate of change in encryption and signalling standards for wireless networks will see many wireless-only devices become redundant before their time. We expect to get 10 to 20 years from a TV (it’s often an expensive item), but should WPA2 be compromised, those hosts become vulnerable. We’ve already seen this with the KRACK vulnerability, and devices from vendors that did not patch their firmware are probably still vulnerable (both the access point and the clients need patches to address KRACK for WPA2).

Of course, the interactions that your TV makes over whatever connection should indeed be over an encrypted transport, but even the encryption that uses (the key lengths, the signature algorithms) will be refreshed and strengthened over time. In the last 5 years we’ve seen web site certificates change in key length (1024 to 2048 bit), signatures asserting a chain of trust from a certificate authority move from MD5 to SHA1 to SHA256, the bulk encryption algorithms move from RC4 to AES128 to AES256 to Elliptic Curve algorithms, and message digest checksums move from MD5 to SHA1, to SHA256, and now SHA384. Each of these changes needs to be applied to the software in your appliances for them to maintain the security you expect to be there.

Many manufacturers won’t bother updating this on their already-shipped units; the disposable consumer sales cycle continues unabated, is effectively helped by these changes, and customers barely mention the trouble of having to replace these items.

But as we move to IP delivery of video content, we also move to more of a video-on-demand world, and away from a time-of-broadcast (playout) world where all consumers – the audience in a broadcast area – would all get the same content at the same time.

30 years ago, when I was at school, the talk in the playground was of the latest episode of some show: MacGyver, Airwolf, The A-Team, comedies like ‘Allo ‘Allo, The D-Generation, The Comedy Company, etc. With only a few select channels to choose from (3 where I was), your peers likely saw the same content at the same time, making for a shared experience that you wove into the daily fabric of society.

Shows with catchphrases became commonplace, and everyone knew them. Mark Mitchell gave Australia “Couple o’ days, Beautiful”, and everyone knew the setting and connotation.

As we move to VOD, subscription-based scenarios, this disappears. Shows on one subscription service are not on others, or appear at different times. Subscription services cross local boundaries in a way that broadcast spectrum, with its limitation of transmitter power, could never do. People watch content at different times, in different months, from different countries (despite the continued attempts at geo-blocking). The social background noise of modern society is changed, disjointed.

I suspect that in the next decade, broadcast TV will disintegrate. Advertising spend will continue to shift to product placement in original content, and to pre- and mid-roll ad insertion, customised to the viewer, tracked and targeted.

As broadcast TV declines, it heavily modifies the stalwart of most TV stations: TV news. While news has been available from multiple media sources, local broadcast TV news was always curated to somewhat balance local, national and global current affairs. If all journalism production costs the same, then stories that can be replayed in multiple territories have more value per second, and local stories decline.

I’ve watched local news for decades, paying attention to its mix of slow-news-day stories (car through a fence, cat up a tree) and more interesting ones (state general or by-election results, perhaps investigative journalism, though that’s on the decline). It’s even interesting to sometimes see the format of content being padded out with in-content extended adverts for news clips that are scheduled later, the tweaks to the lower-thirds straps, even the background animations that engage the viewer in subtle ways. These things often annoy me at how much they con the audience’s emotional engagement with a story. One local broadcast TV channel has a habit of applying a cinematic dust/sparkle video filter to most human-interest stories (cat up tree). They’ve played games with putting weather forecasts showing MIN (overnight) and MAX (daytime) temperatures on top of each other, and then sometimes MAX (daytime) on top of MIN.

(While I’m ranting, the “cross to the local hospital to a reporter standing out front” seems a waste of time, as is a “cross to the next room” scenario, which an anchor could have just read out and moved on from.)

One medium that has withstood much change until now is broadcast radio. I suspect that’s because radio has had one place where it continued its existence without much challenge: vehicles. Vehicle manufacturers have been pretty slow at putting DAB radios into vehicles by default; AM and FM radios are still universal. I wrote in 1995, in the UWA student magazine Pelican, about the dawn of MP3 as an audio format, and the start of Digital Audio Broadcasting (DAB). The MP3 player was born; Apple seized on it with the iPod, and the Sony Walkman receded to the annals of history. And while DAB has been around for some time, the licensing and hardware have come at an expense that did not generally warrant the improvement in delivery, and it was clearly not supported by vehicle manufacturers.

But while broadcast radio has been compatible with what the listener is doing in vehicles, a threat is coming to radio’s last bastion. It’s the same threat that is coming to organisations that live off driving licence fees: self-driving cars. Drivers can then do other tasks: they can look at screens. The radio will stop being an in-vehicle companion to millions of single-occupant cars, as those occupants start viewing content instead of just listening to it. Eyes will come off the road.

(And if eyes are off the road, do we have the demise of billboards and placards as an advertising medium – as no one is looking out the window any more?)

By extension, the music industry relies on radio for socialising its content, encouraging people to either purchase that content, or become paying customers when performers tour. With broadcast music no longer curated to seed the introduction of new content over time, artists will find it difficult to get established.

Coupled with the self-driving car is the move to electric vehicles, the eventual drop in petroleum fuel use, and the fall in the tax (excise) collected from it. Governments often use this as a source of funding for road building. This model will have to change, probably to a tax per kilometre travelled, levied on the owner of these vehicles.

And with self-driving cars, we can finally have some backward parts of the world switch to the metric system for units of distance and speed, without the risk of the human population getting it wrong and going too fast/slow/far.

Eventually, as my friend Paul Fenwick (PJF) has spoken of, the population will move to not owning vehicles, but calling them completely on demand: a la Uber/taxi, but without the human driver. These vehicles will be corporately owned, and will all have live mobile broadband data links. They’ll all have built-in dash cams, and logging of all activities in and out of the vehicles at all times. New advertising opportunities will rise up: screens in these vehicles will know the route you’re about to take, to advertise products and services on or near your route, so you can choose to stop off. They’ll know your preferences and the environmental factors (warm, cold, sunny, rainy), and advertisers will bid to produce targeted advertising at you while you travel.

The next knock-on effect is on the car insurance industry. If self-driving vehicles have fewer accidents, and the risk of death or major damage is lower, then disruption will arrive here too. With only a few major self-driving-taxi companies requiring insurance, the number of insurance companies will consolidate.

The self-driving vehicle probably needs fewer street signs. It may require less street lighting. It may require fewer lane markings, fewer cat’s eyes, and fewer lanes.

This probably sounds like a diatribe version of a Wardley map (hat tip to Simon). All these things are connected, generally by revenue streams or shared interests, and always by data and technology. As always, it’s all change, and resistance is futile.