A Two-part Blog Series and Upcoming Cloud Security Alliance Webinar
In talking with end-user organizations, we’ve seen and heard lots of misconceptions and mistakes over the years, and even espoused a few ourselves. As Head of Products for Valtix, I’ve been in a unique position to understand where enterprises are coming from and to see the lessons learned, too often the hard way. So I thought we should share them, and hopefully some of you can learn these lessons the easy way! Please note that most of the findings apply primarily to IaaS and PaaS services, i.e. EC2, VPC, RDS, DynamoDB, S3, etc.
I’m writing this two-part series to highlight what we’ve learned from customers, security experts, and the 10 years of experience of our team in using public clouds. I’m also VERY excited to be working with Roy Long, an experienced cloud architect with one of our customers, to deliver a Cloud Security Alliance webinar on the highlights of our Top 10 AWS Security Mistakes. The webinar is on October 6, 2021; register here. We will be digging deep into a few of the mistakes that are particularly interesting to Roy and me. For the blog, I’ve broken up the Top 10 into:
- Native Controls
- Visibility
- PaaS Security
- Process and Culture
We’ll cover native controls in this post, and visibility, PaaS security, and process and culture in Part 2.
Native network security controls are useful, but don’t depend on them entirely
The knobs provided by public clouds are important to use, and they reduce the attack surface, but depending solely on them for your security does not prevent attacks, stop exfiltration or avoid lateral movement of attackers.
1. Assuming Security Groups and ACLs are enough to protect against attacks and stop exfiltration
This may sound too obvious to security experts, but there are still a lot of deployments that rely on security groups and ACLs alone. Security groups are basically stateful firewalls. Use them to reduce the attack surface: only open inbound 443 for web applications, and 3389 for RDP or 22 for SSH by admins. But assuming that they will protect you from advanced attacks is a serious fallacy, and a lot of cloud apps still use security groups alone! Similarly, access control lists (ACLs) can be used to lock down source traffic to your enterprise networks or the home IPs of your admins (yeah, we’re all connecting from home). But with no advanced inspection against malware or ransomware in play, any home device or corporate machine can increase your risk.
Oh, and security groups themselves do no logging, so you don’t actually know where the bad guys came from, which Linux machine they exploited, or where their command-and-control (C2) traffic originated. And there’s no packet capture (PCAP) from when the attack happened, so incident response (IR) is flying pretty blind while waiting for the next attack.
- Deploy advanced network security that provides visibility into traffic flows and DNS queries across your instances, VPCs, and PaaS services, and that correlates the cloud asset information with threat intelligence.
- Deploy advanced network security that actually inspects the content of the traffic (WAF for web traffic; IDS/IPS and AV for all traffic), especially encrypted traffic, which hackers use aggressively to exfiltrate data or download their ransomware toolkits.
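Reducing the attack surface with security groups is still step one, so it helps to check that admin ports are not open to the world. Here is a minimal sketch in Python, assuming the security group data is a dict shaped like the output of the EC2 DescribeSecurityGroups API (`IpPermissions`, `IpRanges`, `CidrIp`, and so on). The function name, the `ADMIN_PORTS` table, and the sample group are hypothetical, and the check only looks at TCP and all-protocol rules.

```python
import ipaddress

# Hypothetical policy: admin ports that should never be open to every address.
ADMIN_PORTS = {22: "SSH", 3389: "RDP"}


def world_open_admin_rules(security_group):
    """Return (port, service, cidr) tuples for admin ports open to any address."""
    findings = []
    for perm in security_group.get("IpPermissions", []):
        proto = perm.get("IpProtocol")
        if proto == "-1":
            # "-1" means all protocols, which implies all ports.
            low, high = 0, 65535
        elif proto in ("tcp", "6"):
            low, high = perm.get("FromPort", 0), perm.get("ToPort", 65535)
        else:
            continue  # this sketch ignores UDP, ICMP, and other protocols
        cidrs = [r["CidrIp"] for r in perm.get("IpRanges", [])]
        cidrs += [r["CidrIpv6"] for r in perm.get("Ipv6Ranges", [])]
        for cidr in cidrs:
            # A prefix length of 0 (0.0.0.0/0 or ::/0) matches every address.
            if ipaddress.ip_network(cidr, strict=False).prefixlen != 0:
                continue
            for port, service in sorted(ADMIN_PORTS.items()):
                if low <= port <= high:
                    findings.append((port, service, cidr))
    return findings


# Made-up sample: HTTPS open to all (fine), SSH open to all (flagged),
# RDP restricted to one documentation-range address (fine).
example_group = {
    "GroupId": "sg-0example",
    "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
        {"IpProtocol": "tcp", "FromPort": 3389, "ToPort": 3389,
         "IpRanges": [{"CidrIp": "203.0.113.10/32"}]},
    ],
}

print(world_open_admin_rules(example_group))
```

A check like this only tells you the door is narrower. It says nothing about what is inside the traffic you do allow, which is the point of the recommendations above.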
2. Using the default outbound security group rule of 0.0.0.0/0 (allow any/all)
This default is one of the biggest risks for any cloud environment, whether it’s a sensitive/confidential/production account or a developer/QA account. Since security groups have no logging of allowed/denied traffic, you don’t know whether data is being exfiltrated or which instances are involved. And trying to correlate all the different AWS logs and GuardDuty findings won’t help if you can’t trace the entire attack kill chain in one place.
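A first step is simply finding which groups still carry that default. The sketch below assumes a list of dicts shaped like the EC2 DescribeSecurityGroups output, where an allow-all egress rule appears under `IpPermissionsEgress` with `IpProtocol` set to `"-1"`. The function names and sample group IDs are hypothetical.

```python
ANY_ADDRESS = {"0.0.0.0/0", "::/0"}


def has_allow_all_egress(security_group):
    """True if an egress rule allows every protocol to every address."""
    for perm in security_group.get("IpPermissionsEgress", []):
        if perm.get("IpProtocol") != "-1":
            continue  # rule is limited to a specific protocol
        targets = {r.get("CidrIp") for r in perm.get("IpRanges", [])}
        targets |= {r.get("CidrIpv6") for r in perm.get("Ipv6Ranges", [])}
        if targets & ANY_ADDRESS:
            return True
    return False


def groups_with_open_egress(security_groups):
    """Return the GroupId of every group that still allows all outbound traffic."""
    return [g["GroupId"] for g in security_groups if has_allow_all_egress(g)]


# Made-up sample data: one group with the default rule, one restricted group.
sample_groups = [
    {"GroupId": "sg-default-egress",
     "IpPermissionsEgress": [
         {"IpProtocol": "-1", "IpRanges": [{"CidrIp": "0.0.0.0/0"}],
          "Ipv6Ranges": []}]},
    {"GroupId": "sg-restricted",
     "IpPermissionsEgress": [
         {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
          "IpRanges": [{"CidrIp": "198.51.100.0/24"}]}]},
]

print(groups_with_open_egress(sample_groups))
```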
A big reason outbound security groups are left wide open is that it’s practically impossible to list all the different IP addresses you want to allow (good destinations) or deny (known bad guys). It gets harder still when the rules must vary by workload profile: dev wants full access to all of GitHub.com, while PCI and prod must be restricted to GitHub.com/myOrgRepo and a few domains or URLs.
- Implement GeoIP controls and FQDN/URL filtering to ensure you’re only connecting to approved FQDN/URL categories and specific approved sites, while blocking malicious sites and unnecessary countries.
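To make the per-profile idea concrete, here is a rough sketch of an FQDN/URL allowlist check using the GitHub example above. The `EGRESS_POLICY` table, the profile names, and the `is_egress_allowed` function are assumptions for illustration, not any product’s API. Also note that matching on the URL path presumes the inspection point can see the full URL, which for HTTPS means the traffic is being decrypted.

```python
from urllib.parse import urlsplit

# Hypothetical allowlists: (hostname, path prefix) pairs per workload profile.
EGRESS_POLICY = {
    "dev": [("github.com", "/")],            # all of GitHub
    "prod": [("github.com", "/myOrgRepo")],  # only the organization's repos
}


def is_egress_allowed(profile, url):
    """Return True if the URL matches an allowlist entry for the profile."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    path = parts.path or "/"
    for allowed_host, allowed_prefix in EGRESS_POLICY.get(profile, []):
        if host != allowed_host:
            continue
        prefix = allowed_prefix.rstrip("/")
        # Match the prefix itself or anything beneath it, so "/myOrgRepo/x"
        # passes but "/myOrgRepoEvil" does not.
        if prefix == "" or path == prefix or path.startswith(prefix + "/"):
            return True
    return False  # default deny, including for unknown profiles


print(is_egress_allowed("dev", "https://github.com/someone/some-tool"))
print(is_egress_allowed("prod", "https://github.com/myOrgRepo/service"))
print(is_egress_allowed("prod", "https://github.com/otherOrg/tool"))
```

The default-deny return at the end is a design choice in this sketch: anything not explicitly approved for a profile is blocked, which is the opposite of the 0.0.0.0/0 default.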
3. Leaving east-west security to chance
Gone are the days when you only had 5 VPCs maximum. AWS best practices recommend creating the smallest blast radius, i.e. each application gets its own VPC and perhaps runs in a separate account. Some organizations allow each team or each developer to have their own VPC and even AWS account. This makes sense, but you now face the challenge of connecting them and ensuring the appropriate level of security. A common design pattern in AWS is a hub-and-spoke design with AWS Transit Gateway providing inter-VPC connectivity, with shared services such as Active Directory, common file shares, and databases placed in a services VPC. This raises the question: how do you ensure that lots of low-trust VPCs (say developers, test/QA, partners) don’t have the same broad network access as critical ones (production, compliance)? The same goes for on-premises networks connecting into your cloud infrastructure.
- Get visibility: Even if VPCs and applications of the same trust level are talking to each other, a broadly open security group between VPCs gives you no information on whether all’s well or attackers are moving laterally.
- For high-trust VPCs: Implement advanced network security that inspects even encrypted traffic (transparent forward proxy) and applies DLP and URL filtering, to ensure that malware is not moving around and data is not being exfiltrated.
- For low-trust VPCs: For internal users such as developers and test/QA VPCs, implement access control based on tags, i.e. “dev” VPCs cannot connect to “prod” VPCs, and “partner” VPCs cannot connect to “pci” RDS. Use of tags is now commonplace in AWS for cost allocation. You can also easily implement FQDN filtering on egress traffic using forwarding mode inspection without decryption to ensure that “dev” VPCs are not connecting to malicious or inappropriate sites. Ideally, you also want to decrypt, inspect, and re-encrypt this low-trust traffic to reduce your attack surface. If you have an auto-scaling network security solution with low maintenance overhead, then a single solution can work across the board.
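The tag-based rules above can be expressed as a small policy table. This is only a sketch under stated assumptions: the `trust` tag key, the `DENIED_PAIRS` set, and both function names are hypothetical, and tags are assumed to arrive in the `[{"Key": ..., "Value": ...}]` list form that AWS APIs typically return. Denying untagged resources is also an assumption, chosen so that a missing tag never grants access.

```python
# Hypothetical deny rules: (source trust tag, destination trust tag).
DENIED_PAIRS = {("dev", "prod"), ("partner", "pci")}


def tags_to_dict(tag_list):
    """Convert AWS-style [{'Key': k, 'Value': v}] tags into a plain dict."""
    return {t["Key"]: t["Value"] for t in tag_list}


def is_connection_allowed(src_tags, dst_tags, tag_key="trust"):
    """Decide whether a source resource may connect to a destination resource."""
    src = tags_to_dict(src_tags).get(tag_key)
    dst = tags_to_dict(dst_tags).get(tag_key)
    if src is None or dst is None:
        return False  # assumption: untagged resources are denied by default
    return (src, dst) not in DENIED_PAIRS


# Made-up tag sets for illustration.
dev_vpc = [{"Key": "trust", "Value": "dev"}, {"Key": "team", "Value": "payments"}]
prod_vpc = [{"Key": "trust", "Value": "prod"}]
partner_vpc = [{"Key": "trust", "Value": "partner"}]
pci_rds = [{"Key": "trust", "Value": "pci"}]

print(is_connection_allowed(dev_vpc, prod_vpc))     # dev to prod
print(is_connection_allowed(partner_vpc, pci_rds))  # partner to pci
print(is_connection_allowed(dev_vpc, dev_vpc))      # dev to dev
```

Because the decision keys off tags rather than IP addresses, the policy keeps working as VPCs and accounts are added, provided tagging is enforced.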
4. Thinking you’re secure because you have CSPM
Cloud security posture management (CSPM) tools like Palo Alto Networks Prisma Cloud, Check Point Dome9, CloudCheckr, etc. are absolutely critical to ensure good security hygiene in your cloud environment. And AWS offers some great tools, like AWS Config and service control policies (SCPs), that perform similar functions. These are necessary, but not sufficient.
What’s missing here is that there is no actual protection against:
- Attacks that exploit vulnerabilities in your software, whether it’s the Linux systems or any of the applications (NGINX, WordPress, Joomla, etc.).
- Lateral movement of threats, especially from frontend to backend and other connected systems across all the VPCs, which may even be private.
- Exfiltration of data.
What CSPMs cannot do is protect against malware and ransomware, i.e. proactively stop attacks from happening. It’s a bad idea to wait for bad things to happen and then respond.
- Implement advanced inline network security that provides a defense-in-depth approach that complements the CSPM and cloud-native controls.
- Based on the identity, trust boundaries, and context of the workload, you can define the depth of inspection.
So that’s the first 4 mistakes, with some detail on how and why, and a recommendation for how to rectify each. Obviously, there’s a lot to learn about native controls: how and where to use them, and where they need augmentation. We’ll cover a bit more ground in Part 2.