Wikileaks Vault 7: Tech’s dirty laundry

Wikileaks have dumped another huge cache of data exfiltrated from behind the closed doors of three-letter acronym agencies (coverage: BBC, ABC, Independent).

Apple’s comment was wonderful, according to the BBC link above:

Apple’s statement was the most detailed, saying it had already addressed some of the vulnerabilities.

This is the crux of good security posture. Vulnerabilities exist in so much of what we use; the point is to be continuously addressing the issues and applying security fixes before they become a problem.

I see patch cycles in organisations that can be measured in tectonic plate movement intervals. There are security updates available every few hours, yet organisations sometimes wait years to apply them.

It’s simple:

  • Do you know more than the software vendor about security?
    1. Probably not; therefore, take their advice and apply all pending security updates.
    2. Yes, I do!!; no, you probably don’t. See 1 above.
  • Do you want to have an exploit situation caused by a KNOWN vulnerability with a KNOWN patch?
    1. No, because I’d look pretty foolish if this happened. Apply security patches.
    2. Yes, because that’s the corporate policy and I don’t care about my job!

There’s not much we can do about UNKNOWN vulnerabilities, except that over time, some of the UNKNOWN become KNOWN, and they then become the PATCHED.

Now take this approach to your entire operating environment: production servers, monitoring servers, CI systems, bastion hosts, VPN servers, proxy servers, Wikis, revision control systems, routers, switches, printers. The list goes on, but they all require maintenance, because writing good software is hard, and what looks like good practice today may be deprecated tomorrow.

CloudPets security fail is not a Cloud failure

I spent several years at Amazon Web Services as the Solution Architect with a depth in Security in A/NZ. I created and presented the Security keynotes at the AWS Summits in Australia and New Zealand. I teach Advanced Security and Operations on AWS. I have run online share-trading systems for many of the banks in Australia. I help create the official Debian EC2 AMIs. I am the National Cloud Lead for AWS Partner Ajilon, and via Ajilon, I also secure the State Government Land Registry in EC2 with Advara.

So I am reasonably familiar with configuring AWS resources to secure workloads.

Last week saw a serious security failure: the compromise of CloudPets, a company that makes Internet-connected plush toys for children that let users record and play back audio via the toys. Coverage from Troy Hunt, The Register, ArsTechnica.

As details emerged, a few things became obvious. Here are the highlights (low-lights, really) of what apparently occurred:

  • A production database (MongoDB) was exposed directly to the Internet with no authentication required to query it
  • Audio files in S3 were publicly, anonymously retrievable. However, they were not listable directly (no worries, the object URLs were in that open MongoDB database)
  • Non-production and production systems were co-tenanted

There are a number of steps that should have been taken technically to secure this:

  1. Each device should have had its own unique certificate or credential
  2. This certificate/credential should have been used to authenticate to an API Endpoint
  3. Each certificate/credential could then be uniquely invalidated if someone stole the keys from it
  4. Each certificate/credential should only have been permitted access to fetch/retrieve its own recordings, not any recording from any customer
  5. The Endpoint that authenticates the certificate should have generated Presigned URLs for the referenced recordings. Presigned URLs contain an expiry timestamp set in the future, after which the URL is no longer valid. Each time the device (pet) wants a file, it could ask the Endpoint to generate the Presigned URL, and then fetch it from S3
  6. The Endpoint could rate limit the number of requests per certificate per minute/hour/day. E.g., 60 per minute (for burst fetches), 200 per hour, 400 per day?
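To make steps 3, 4 and 6 concrete, here is a minimal sketch in Python of the checks such an Endpoint might make before handing anything back. Everything here is an assumption for illustration: the DeviceGate class, the in-memory ownership map and the credential ids are hypothetical, and the limits are simply the example figures from the list above.

```python
import time
from collections import defaultdict, deque

# (window in seconds, max requests) -- the example figures from the list above.
LIMITS = [(60, 60), (3600, 200), (86400, 400)]


class DeviceGate:
    """Hypothetical per-credential authorisation and sliding-window rate limiter."""

    def __init__(self, owners, limits=LIMITS, clock=time.time):
        self.owners = owners                # recording key -> owning credential id
        self.limits = limits
        self.clock = clock                  # injectable so the logic can be tested
        self.revoked = set()
        self.history = defaultdict(deque)   # credential id -> accepted request times

    def revoke(self, credential_id):
        """Invalidate a single stolen credential without touching any other device."""
        self.revoked.add(credential_id)

    def allow(self, credential_id, recording_key):
        if credential_id in self.revoked:
            return False
        # A credential may only fetch its own recordings.
        if self.owners.get(recording_key) != credential_id:
            return False
        now = self.clock()
        seen = self.history[credential_id]
        longest = max(window for window, _ in self.limits)
        while seen and seen[0] <= now - longest:
            seen.popleft()                  # forget requests older than a day
        for window, maximum in self.limits:
            recent = sum(1 for stamp in seen if stamp > now - window)
            if recent >= maximum:
                return False                # over the limit for this window
        seen.append(now)
        return True
```

A real service would keep this state somewhere shared and durable rather than in process memory, but the order of the checks (revoked, then ownership, then rate) is the point.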

If the Endpoint for the API was an EC2 instance (or better yet, an Auto Scaling Group of them), then it could itself be running in the context of an IAM Role, with permission to create these Presigned URLs. The same applies to an API Gateway invoking a Lambda function that runs in a Role.
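The idea behind an expiring signed URL can be sketched with nothing but the Python standard library. To be clear, this is not how S3 signs requests (S3 Presigned URLs use AWS Signature Version 4 with the caller's credentials, and SDKs such as boto3 expose this as generate_presigned_url); the secret, host name and query parameter names below are all hypothetical, chosen only to show the expiry-plus-signature pattern.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlsplit

SECRET = b"demo-signing-key"                  # hypothetical; never hard-code a real key
BASE_URL = "https://recordings.example.com"   # hypothetical host


def _signature(path, expires):
    """HMAC over the path and expiry, so neither can be altered by the client."""
    message = f"{path}\n{expires}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def presign(path, ttl_seconds=300, now=None):
    """Return a URL for `path` that stops verifying after `ttl_seconds`."""
    now = time.time() if now is None else now
    expires = int(now) + ttl_seconds
    query = urlencode({"expires": expires, "sig": _signature(path, expires)})
    return f"{BASE_URL}{path}?{query}"


def verify(url, now=None):
    """Accept the URL only if it is unexpired and the signature matches."""
    now = time.time() if now is None else now
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    try:
        expires = int(params["expires"][0])
        supplied = params["sig"][0]
    except (KeyError, ValueError):
        return False
    if now > expires:
        return False
    expected = _signature(parts.path, expires)
    return hmac.compare_digest(supplied.encode(), expected.encode())
```

Because the expiry is part of the signed message, a client cannot extend the lifetime of a URL or point it at another device's recording without invalidating the signature.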

Indeed, that Endpoint would have been the only thing talking to MongoDB (privately), removing the need for a publicly facing database.

I’ve often quoted Voltaire (or Uncle Ben from Spider-Man, take your pick): “with great power comes great responsibility”. There’s no excuse for the series of failures that occurred here; the team apparently didn’t understand security in their architecture.

Yet security is in all the publicly facing AWS customer documents (the shared responsibility model). It’s impossible to miss this. AWS even offers a free security fundamentals course, which I recommend as a precursor to my own teachings.

Worse is the response and lack of action from the company when they were alerted last year.

PII and PHI are stored in the cloud: information that the economy, indeed modern civilisation, depends upon. The techniques used to secure workloads are not overly costly; they mostly require knowledge and implementation.

You don’t need to be using Hardware Security Modules (HSMs) to have a good security architecture, but you do need current protocols, ciphers, authentication and authorisation. The protocols and ciphers will change over time, so IoT devices like this also need to update over time to support protocols and ciphers that may not exist today. This constant stepping-stone approach, continually moving to the next implementation of transport and at-rest ciphers, is becoming a pattern.

Security architecture is not an after-thought that can be left on the shelf of unfulfilled requirements, but a core enabler of business models.

Looking back at 2016, and forward to the future

It’s going to be interesting to see how the Gartner Magic Quadrant for Infrastructure as a Service looks when it comes out later this year (assuming around August time again): the gap between the players, and the names that disappear.

2016 saw 5 competitors drop out compared to Gartner’s 2015 edition, and now more recently Cisco’s $1B investment in Intercloud seems to have ended; however they’ve now purchased AppDynamics, who have been pushing very heavily into the cloud, especially around the microservices world. It’s interesting to see the players shuffle around:

Year  Count  Differences to previous year
2013  15
2014  15     Merged IBM + Softlayer, -Tier3, -Savvis, +VMware, +Google, +CenturyLink
2015  15     -GoGrid, -HP, +NTT, +Interoute
2016  10     -Joyent, -DimensionData, -Verizon, -CSC, -Interoute

Meanwhile at AWS, services continued to innovate, reliably and without any major interruptions. May 2015 saw VPC S3 Endpoints launched, permitting private interconnect between VPCs and the S3 service, and there have been promises of more of this to follow. Re:Invent 2016 saw enhanced distributed account controls with AWS Organizations being announced (only in preview, so subject to change), enhancing the corporate controls in a multi-AWS-account set-up.

AWS did open up four additional Regions in 2016 as promised — Ohio, Canada, London, and India. The footprint of its Edge Locations also expanded — although some of these were additional Edges in the same cities (at different interconnect/peering providers). That’s OK; as the Edges can be turned on and off transparently around maintenance windows, so having multiple Edges in a location may indicate how important this location is.

I’ve found it particularly interesting to see CloudFront move from a flat network of Points of Presence (POPs), to a two-tier caching model with “Regional Edges” servicing requests from “Global Edges”. As CloudFront has spread wider into more locations, there’s an increase in the number of origin requests (misses) made to your origin service, which even with modest TTLs on objects can still be an overwhelming volume of traffic.
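A rough back-of-the-envelope sketch of why a second caching tier helps. The figures are hypothetical (70 edge locations and 10 regional caches are illustrative numbers, not the real CloudFront footprint), and the model assumes the worst case where a popular object is re-fetched by every origin-facing cache once per TTL.

```python
SECONDS_PER_DAY = 86400


def origin_fetches_per_day(caches_facing_origin, ttl_seconds):
    """Worst-case daily origin fetches for one popular object.

    Assumes every cache that talks to the origin sees the object requested
    in every TTL window, so each one re-fetches it once per window.
    """
    refreshes_per_day = SECONDS_PER_DAY // ttl_seconds
    return caches_facing_origin * refreshes_per_day


# Hypothetical figures, for illustration only.
flat = origin_fetches_per_day(caches_facing_origin=70, ttl_seconds=300)
tiered = origin_fetches_per_day(caches_facing_origin=10, ttl_seconds=300)
print(f"flat network: {flat} fetches/day, two-tier: {tiered} fetches/day")
```

In the flat model the origin load grows with every new edge location; in the two-tier model it grows only with the number of regional caches, however many edges sit in front of them.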

From a networking perspective, the availability of IPv6 on Service Endpoints, and now within the VPC is also a sign of evolution. These EC2 evolutions have happened in the past — perhaps not so noticeable:

  • from 32 bit to 64 bit VMs
  • from Para-Virtualisation (PVM) to Hardware-assisted Virtualisation (HVM) for EC2
  • to newer generations of Instance types (helped by an improved pricing point)

And now we see the start of the move from IPv4 to IPv6. It will take a few years, but we’re standing at the edge of massive change. Yet another migration. Only yesterday we saw the launch of IPv6 for ELB within VPC – something that used to exist for ELB in what is now called “Classic” (all-customer shared-networking EC2) – and today IPv6 within the VPC in all existing Regions (up from just US-East-2 at launch; it was interesting in itself to see Ohio used as the canary for a new feature deployment instead of the traditional US-East-1).

For the Debian EC2 images that I help maintain, we started to support the Elastic Network Adapter (ENA) at the end of 2016 after I attended the first Debian Cloud Sprint in Seattle – with thanks to Marcin Kulisz for his assistance. For those not familiar, Debian is a 23 year old non-profit, open-source operating system, which underlies much of the modern Linux ecosystem. I’ve been participating since the late 1990s, and have been a member of the project since 2000 (18 years now). Today I help maintain the Debian AMIs on EC2 for (at least) tens of thousands of AWS customers (it may be much higher).

Debian has been selected as one of the operating system options in AWS’s new Lightsail product: point-and-click VPS that neatly wraps up the details of VPC, Security Groups and storage into a simple model. This brings the beauty of Debian to even more people, taking away the long-held myth that Linux is hard.

What’s in store for 2017

For Debian: In 2017 we’ll move to make the images even more transparent to consumers than they are now, with the help of the very talented maintainer of FAI for the last 20 years or so, Mr Thomas Lange (whom I have had the pleasure of knowing for many of those years since we met at DebConf 1). Marcin Kulisz, Anders Ingemann and others have played a major part in this, as have the other 800+ Debian Developers world-wide and the contributors who report bugs, review code and help ensure that Debian remains as transparent as possible and true to its goals.

For the AWS platform, storage pricing continues to drop; and while it took a while to get to cents-per-GB-per-month, I’m sure we’ll see cents-per-TB-per-month not too long from now. Others say Cloud storage will be “free” (little “f”), but I just think the order of magnitude for charging will change. Compute edges down in price too; new instance types will come, and those who architect (and automate) their deployments well (CloudFormation, Auto-scale and Launch Configurations) can and will easily adopt them.

Status Quo: All Change

What’s become clear is that for any cloud deployment, there is constant change and maintenance in order to be able to take advantage of improvements to the platform over time, be that re-deploying your app servers with new operating system patches, modifying VPC architectures (Endpoints, NAT GW, IPv6), etc. I guess the main thing these days is to be pretty comfortable with a quote from Heraclitus (535-475 BC): “Change is the only constant in life”.

Meanwhile, there’s another whole story around my work that’s been very satisfying and exciting, but that’s a story for another day…


If you’re interested in AWS and Security, then please check out my training at https://nephology.net.au/, where in a 2 day in-person class we cover above and beyond the AWS courses to ensure you have the knowledge and are prepared for the agile world of running and securing environments in the AWS Cloud.

AWS Zero to Hero in a few hours: Environment creation and Deployment at speed

I had a colleague approach me asking to create a new environment for him. Previously I had helped create a CloudFormation template defining an enterprise VPC, and he had previously created CloudFormation templates for his workloads in his development environment.

I wasn’t intending this to be a race, but it happened pretty quickly as we’d prepared our templates for other environments previously. So starting around 10:30am, here’s what we did:

  1. Create a new email Distribution List for the master (root) account on the corporate mail system.
  2. Sign up a new account, and lock it down. Hardware MFA for the root account, IAM Group (Admins) for local IAM users, and an initial IAM user (me) with MFA turned on for the user. Customer Managed Policy – Administrator access, but with an IP Address Condition – assigned to the IAM Group. Password policy enabled, and STS disabled in all except US East 1 and our commonly used (and closest) Region. Set Challenge Response questions for support, adjust comms preferences (to none).
  3. Configure the SAML (AD FS) provider for the organisation. Create several IAM roles for Web SSO via the SAML IDP: Network team, Security team.
  4. Create an AWS cross-account IAM role for the security account, with read-only privileges. This enables the proactive DevSecOps to kick in.
  5. In our security account we adjusted the S3 Bucket policy to permit the new workload account to log to it. In my billing account, issue a Consolidated Billing invite to the new workload account.
  6. Back in my new workload account, accept the Consolidated Billing request, and configure Global CloudTrail to the security account bucket above. This then automatically feeds into my security log processing and alerting.
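The "Administrator access, but with an IP Address Condition" policy from step 2 might look something like the following sketch, built as a Python dict and printed as JSON. The function name and the 203.0.113.0/24 range (a reserved documentation block) are hypothetical stand-ins; the IpAddress operator and the aws:SourceIp condition key are standard IAM policy elements.

```python
import json


def admin_policy_from_networks(cidrs):
    """Build an IAM policy granting full access only from the given source CIDRs."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AdminFromKnownNetworksOnly",
                "Effect": "Allow",
                "Action": "*",
                "Resource": "*",
                # Requests from any other source address match no Allow
                # statement here, so they fall through to the implicit deny.
                "Condition": {"IpAddress": {"aws:SourceIp": list(cidrs)}},
            }
        ],
    }


# Hypothetical corporate egress range.
policy = admin_policy_from_networks(["203.0.113.0/24"])
print(json.dumps(policy, indent=2))
```

Generating the document from a function keeps the allowed networks in one place, so every new account gets the same condition.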

By this time it was around noon, and we were ready to create our VPC, having taken an initial IP allocation from the network topology (I’ve been using a /20 CIDR block for most VPCs, and locating either a significant workload, or multiple smaller workloads, in each VPC).

I’ve maintained my VPC Template for some time, progressively adding more “baseline” features that I love to have present, and sharing it with those around me to help accelerate (and standardise) their environments:

  • VPC designed across three AZs, but with addressing consistency and space to add a fourth AZ for each allocation, with:
    • A set of subnets for Internal Load Balancers. No route to the Internet for these.
    • A set of (small) subnets for (relational) databases. No route to the Internet here again.
    • A set of subnets for direct Internet access, such as for External Load Balancers and services that are facing the Internet directly, routing to the Internet via the IGW.
    • A NAT Gateway per AZ (housed in the Internet subnets above)
    • A set of subnets for Application servers, using the NAT GW Per AZ. Hence a routing table per AZ.
    • A set of subnets for other “backend” servers, in case there is anything else that we’ll want to segregate out from the Application servers internally.
  • VPC Flow Logging
  • CloudWatch logs group with retention period set
  • SNS Topics for app servers to send default Alerts and Escalations to.
  • VPC Endpoints for S3
  • RDS DB Subnets
  • etc, etc.

The separation of Internal Elastic Load Balancers into their own contiguous CIDR block is to make my life simpler with the traditional on-premise firewalls. Naturally I expose my ELBs to my clients, but I am required to also authorise the on-premise firewall to egress traffic into our VPC. Keeping the per-AZ blocks contiguous means they can be super-netted into a single destination rule on this legacy equipment. For example:

  • ELB in AZ A: 10.0.0.0/26
  • ELB in AZ B: 10.0.0.64/26
  • ELB in AZ C: 10.0.0.128/26
  • Reserved block if there were AZ D: 10.0.0.192/26
  • Total range for ELBs in one CIDR: 10.0.0.0/24
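The super-netting above can be checked with Python's standard ipaddress module. This is just a verification sketch: the four /26 blocks start at .0, .64, .128 and .192 (the natural /26 boundaries), and collapsing them should yield the single /24 used for the firewall rule.

```python
import ipaddress

# One /26 per AZ for the internal ELBs, plus the reserved fourth block.
elb_subnets = [
    ipaddress.ip_network("10.0.0.0/26"),    # AZ A
    ipaddress.ip_network("10.0.0.64/26"),   # AZ B
    ipaddress.ip_network("10.0.0.128/26"),  # AZ C
    ipaddress.ip_network("10.0.0.192/26"),  # reserved for a future AZ D
]

# Merge adjacent blocks into the smallest set of covering networks.
firewall_rule = list(ipaddress.collapse_addresses(elb_subnets))
print(firewall_rule)  # [IPv4Network('10.0.0.0/24')]
```

If any of the blocks were missing or misaligned, the collapse would return more than one network, which is exactly the extra firewall rule this layout is designed to avoid.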

It’s important to observe the natural block boundaries of CIDR ranges, so choose carefully and use various web tools to help you with address calculations. As noted above, there’s some left over space that I’m not currently allocating: that’s the price to pay for being prepared for the future in an IPv4 world, but it’s better that than having to re-jig subnets after the workload is live (I’ve had to do this with significant government workloads in order to switch from 2 AZs to using 3 AZs, but I’m glad I did for several reasons).
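Instead of a web calculator, the boundary check can also be done locally. A small sketch using Python's ipaddress module, where the on_boundary helper and the 10.0.0.0/20 VPC range are hypothetical examples: in strict mode the module rejects any CIDR whose address has host bits set, which is precisely what "not on a natural boundary" means.

```python
import ipaddress


def on_boundary(cidr):
    """True if the address part of `cidr` sits on its prefix's natural boundary."""
    try:
        ipaddress.ip_network(cidr, strict=True)
        return True
    except ValueError:
        # Host bits are set, e.g. a /26 that does not start on a multiple of 64.
        return False


# Hypothetical /20 VPC allocation, carved into every possible /26.
vpc = ipaddress.ip_network("10.0.0.0/20")
blocks = list(vpc.subnets(new_prefix=26))

print(on_boundary("10.0.0.192/26"))  # True
print(on_boundary("10.0.0.196/26"))  # False
print(len(blocks))                   # 64
```

Letting the library enumerate the blocks, rather than typing the addresses by hand, is an easy way to avoid an off-boundary subnet slipping into a template.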

After 10 minutes of this VPC creation, we were ready for DirectConnect sub interfaces to drop in, and initial connectivity back to the on-premise network, to be supplemented by a VPN over Internet on a lower priority, preferenced by BGP weightings.

After this came a few S3 buckets: one for holding software and associated ‘artefacts’ for the development cycle, and another for holding logging data (ELB, S3, etc). A quick switch to the Development account and a read-only policy for the Development ‘release’ bucket, and artefacts are ready to be pulled into this Test environment.
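A read-only cross-account bucket policy of the kind described could be sketched as below. The bucket name and the twelve-digit account id are hypothetical placeholders; s3:ListBucket and s3:GetObject are the standard S3 actions for listing and reading, and note that they apply to different resources (the bucket itself versus the objects in it).

```python
import json


def read_only_bucket_policy(bucket, reader_account_id):
    """Bucket policy letting another AWS account list and read, but not write."""
    principal = {"AWS": f"arn:aws:iam::{reader_account_id}:root"}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowListFromReaderAccount",
                "Effect": "Allow",
                "Principal": principal,
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",       # the bucket itself
            },
            {
                "Sid": "AllowGetFromReaderAccount",
                "Effect": "Allow",
                "Principal": principal,
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",     # the objects within it
            },
        ],
    }


# Hypothetical names: the Development 'release' bucket and the Test account id.
release_policy = read_only_bucket_policy("example-dev-release", "111122223333")
print(json.dumps(release_policy, indent=2))
```

The reading account still needs its own IAM permissions to use this access; the bucket policy only opens the door from the owning side.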

After an initial sync, the CloudFormation templates for the EC2-based workloads were ready to roll, with parameters for the new logging destinations, artefact sources from S3 buckets, and subnet options.

By 4pm, the workload was up and stable. Ready for the 6pm call to resize all Autoscale groups to zero, which would be reversed at 6am.
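The 6pm/6am resize is simple enough to express as a function. This is only a sketch of the decision logic, with a hypothetical desired_capacity helper; in practice Auto Scaling scheduled actions can apply the resulting values, and the daytime figure would come from each group's own configuration.

```python
from datetime import time


def desired_capacity(now, daytime_capacity, start=time(6, 0), stop=time(18, 0)):
    """Capacity an Auto Scaling group should have at wall-clock time `now`.

    Groups run at `daytime_capacity` from `start` (inclusive) until `stop`
    (exclusive), and are scaled to zero outside those hours.
    """
    return daytime_capacity if start <= now < stop else 0


print(desired_capacity(time(16, 0), daytime_capacity=4))  # 4
print(desired_capacity(time(18, 0), daytime_capacity=4))  # 0
```

Scaling a non-production environment to zero overnight only works when it can be rebuilt unattended in the morning, which is another argument for the templating described below.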

Now this wouldn’t be possible without the support of the networking team looking after the on-premise routers and the Direct Connect VLAN allocations, or the enterprise server team for creating the email Distribution List and the Claims on the SAML Identity Provider (IDP): it’s as a team that we manage to get such velocity at delivery.

But the real key to all of this: templating and automation. Managing changes via a template makes it repeatable; that’s what makes updates to CloudFormation just as exciting for me as updates to the services you can configure yourself via CLI, API or Web Console.
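As a small illustration of templating, here is a sketch that generates a CloudFormation template (as JSON) for a VPC with one ELB subnet per AZ. It is not the template described in this post: the vpc_template function, the resource names and the default 10.0.0.0/20 range are hypothetical, though AWS::EC2::VPC, AWS::EC2::Subnet, Fn::Select and Fn::GetAZs are standard CloudFormation constructs.

```python
import ipaddress
import json


def vpc_template(vpc_cidr="10.0.0.0/20", az_count=3, subnet_prefix=26):
    """Build a minimal CloudFormation template: a VPC plus one subnet per AZ."""
    network = ipaddress.ip_network(vpc_cidr)
    blocks = network.subnets(new_prefix=subnet_prefix)  # generator of aligned blocks
    resources = {
        "Vpc": {
            "Type": "AWS::EC2::VPC",
            "Properties": {
                "CidrBlock": str(network),
                "EnableDnsSupport": True,
                "EnableDnsHostnames": True,
            },
        }
    }
    for index, block in zip(range(az_count), blocks):
        resources[f"ElbSubnet{index}"] = {
            "Type": "AWS::EC2::Subnet",
            "Properties": {
                "VpcId": {"Ref": "Vpc"},
                "CidrBlock": str(block),
                # Pick the Nth Availability Zone of whichever Region this is deployed to.
                "AvailabilityZone": {"Fn::Select": [index, {"Fn::GetAZs": ""}]},
            },
        }
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Sketch: VPC with one internal ELB subnet per AZ",
        "Resources": resources,
        "Outputs": {"VpcId": {"Value": {"Ref": "Vpc"}}},
    }


template = vpc_template()
print(json.dumps(template, indent=2))
```

Because the addressing is computed rather than typed, the same function yields a consistent layout for every new environment, and a change to the layout is a reviewable change to one file.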


If you’re interested in AWS and Security, then please check out my training at https://nephology.net.au/, where in a 2 day in-person class we cover above and beyond the AWS courses to ensure you have the knowledge and are prepared for the agile world of running and securing environments in the AWS Cloud. Every student on our course gets a complimentary Gemalto hardware MFA for use with any AWS account.

The new AWS Security Certification

On Thursday I sat the new AWS Security Certification introduced at Re:Invent 2016, currently in “beta” and apparently over-subscribed to the point that AWS has removed content from the Certification web site and stopped more candidates from sitting this test at this time.

Being a beta, the exam is US$150 (will be US$300 post-beta), and unlike the “Generally Available” existing certifications which disclose your result immediately, we’ll be waiting until the end of March to get a final answer. So perhaps then I’ll be eating my words and keeping quiet! 🙂

Like the rest of the AWS Certifications, it’s proctored via Kryterion and their WebAssessor platform. Over 170 minutes, I was presented with just over 100 multi-choice questions, where the response was either a “choose 1” answer, “choose 2”, “choose 3”, or “choose all that apply” to the question or statement.

This did feel like a beta: one question was shown with three of the six responses to choose from listed as “[Reserved for beta]”. One question was presented to me twice. Errant spaces appeared in one example IAM Policy. There was some inconsistent capitalisation in the text of some questions.

At one stage I tried to submit comments for the above issues into the supplied input box, but somehow this triggered the “security” module for WebAssessor to immediately lock me out and require a Proctor to unlock it. This presented a new learning: the Proctor was given two options to unlock the screen: “End Exam”, and “Re-start”. Luckily “Re-start” was more “resume”, and I continued. Kryterion, perhaps some better wording here?

I was pleased with the challenges that the questions presented, and generally felt the mix was a good test of a broad range of AWS Security. It took me the majority of the time to get through and then review my responses, so I wouldn’t say it was easy. It felt more like a professional level cert than an associate level one, given the length of time. I would have liked some more crypto questions: KMS and key sharing, ELB and SSL Policies, etc.

Clearly putting together any certification on a platform like AWS, which is constantly evolving, is like trying to hit a moving target; the Certification team have done a great job bringing these additional Speciality certifications to market.

 

Footnote: I did reach out to the global head of certification at AWS to submit these comments and try and get some indication if these issues are known or being fixed, but after 2 days I haven’t had a response. Either way, I hope this feedback is helpful.

