4. Introduction
• A brisk introduction to security monitoring
• How do you monitor cloud services?
• What should you do with the data you collect?
• Keeping up and keeping sane
• Opportunities for security engineering
6. “Security monitoring is the process of generating security events based on data gathered from your IT environment.”
7. “Ability to detect threats in near real time”
“Ability to respond after a successful attack”
8. CSC 6
Maintenance, Monitoring, and Analysis of Audit Logs
“Collect, manage, and analyze audit logs of events that could help detect, understand, or recover from an attack.”
Security is a process: the first part is gathering the data and the second part is analysing it
Pull logs and telemetry from wherever you can to monitor your systems
A security event is loosely defined as something of interest and this will vary from environment to environment. Some generic examples:
When a user is given new permissions
When a firewall rule is changed
Authentication failures
Authorisation failures
A new service is started or an existing one is changed
To give yourself both a detection and a response capability.
Coinbase CEO: “The only thing worse than being hacked, is being hacked but not knowing how it happened.”
Compliance.
But why really?
malware dogs
Hivemind
The 400 pound hacker
NSA and GCHQ
In a traditional Enterprise you’re collecting logs from your endpoints, NAC, IDS, HIDS, web proxies, firewall logs, NetFlow
Send it all to the box (SIEM)
A non-traditional enterprise IT stack can be constructed entirely from a wide range of cloud services
Cloud native or Cloud first
Non-traditional office setups
Employees are not static on office LANs
Glued together with the apps we use
In a non-traditional, cloud-native setup what you have is an array of services sitting on the internet, holding your data and running your business.
Generic security monitoring pipeline
The logs are there; they just need to be pulled or given somewhere to push to
Need to get used to APIs, webhooks and probably JSON
These services actually offer some rich logs
(some) Cloud service providers do try and stand out based on their security practices and openness
Learn some curl foo
Postman (https://www.getpostman.com/)
Slack is where we had a gap for a period of time
They offer an API but it’s a pull with a fixed window size
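A pull API with a fixed window size only works if you poll often enough for consecutive windows to overlap, and deduplicate what you have already collected. A minimal sketch of that idea follows. It is not the Slack API: `fetch_latest` is a hypothetical stand-in for whatever call returns the most recent window of entries, and the `id` field is an assumed unique identifier on each entry.

```python
from typing import Callable, Dict, Iterable, List, Set

# Assumed record shape: each log entry is a dict carrying a unique "id".
LogEntry = Dict[str, str]


class WindowedPuller:
    """Poll a pull-only API that returns just its most recent window of entries."""

    def __init__(self, fetch_latest: Callable[[], Iterable[LogEntry]]):
        # fetch_latest is a hypothetical stand-in for the real API call.
        self._fetch_latest = fetch_latest
        # Grows without bound in this sketch; a real collector would expire old ids.
        self._seen: Set[str] = set()
        self.gap_suspected = False

    def poll(self) -> List[LogEntry]:
        """Return entries that earlier polls have not already returned."""
        window = list(self._fetch_latest())
        new_entries = [e for e in window if e["id"] not in self._seen]
        # If the whole window is new and we have polled before, the windows did
        # not overlap, so entries between the two polls may have been missed.
        self.gap_suspected = (
            bool(self._seen) and bool(window) and len(new_entries) == len(window)
        )
        self._seen.update(e["id"] for e in new_entries)
        return new_entries
```

When `gap_suspected` is set after a poll, the polling interval is probably too long for the window size, which is worth an alert of its own.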
Self host (ELK stack) or use a cloud service (Sumo Logic)
Aggregation of service logs in Sumo Logic
Used for search and to create security events; alerting goes to Slack
For this data to be useful to your security team you need to apply some logic (or intelligence) to create a security event
The most important part of the security event is the associated action
There isn’t always a one-to-one mapping of event to alert to response action.
Some events clearly require an immediate alert and a quick response.
Some may require a number of occurrences before they become significant and some may need to be correlated with other events before action can be taken.
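The "number of occurrences" case can be sketched as a sliding-window threshold rule: raw log entries only become a security event once the same key (say, a username with authentication failures) has been seen enough times within a time window. The class name, parameters, and the choice to reset after firing are all assumptions made for illustration, not a description of any particular product.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class ThresholdRule:
    """Raise a security event after `threshold` occurrences within a window."""

    def __init__(self, threshold: int, window_seconds: float):
        self.threshold = threshold
        self.window_seconds = window_seconds
        # Recent occurrence timestamps, kept separately for each key.
        self._times: Dict[str, Deque[float]] = defaultdict(deque)

    def observe(self, key: str, timestamp: float) -> bool:
        """Record one occurrence; return True if it should become an event."""
        times = self._times[key]
        times.append(timestamp)
        # Drop occurrences that have fallen out of the window.
        while times and timestamp - times[0] > self.window_seconds:
            times.popleft()
        if len(times) >= self.threshold:
            # Reset so that one burst produces one event, not one per entry.
            times.clear()
            return True
        return False
```

For example, `ThresholdRule(threshold=3, window_seconds=60)` stays quiet for isolated failures but fires on the third failure by the same user inside a minute.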
Cloud service logs often reflect the specific service calls happening beneath the hood.
They are often actually directly the API calls being made to fulfil that user action.
These APIs are a good jumping off point to help identify the distinct actions that you are interested in.
This has been vastly overcomplicated by threat intel vendors
To start with, you’re looking for bad changes and misconfigurations
Document what is wrong, write an alert for it and track its remediation
CloudTrail provides a history of AWS API calls for your account - for every type of interaction (console, CLI and SDK)
Turn on CloudTrail
Track IAM like your life depends on it
Service access logs, such as those for S3, CloudFront, and ELB/ALB, contain every call made to these services from the public
VPC Flow Logs
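As a rough illustration of tracking IAM through CloudTrail, the sketch below scans the text of one CloudTrail log file (a JSON document with a top-level `Records` array) and yields the IAM calls that look like changes. The `eventSource`, `eventName`, `eventTime`, `userIdentity`, and `sourceIPAddress` fields are standard CloudTrail record fields; the list of name prefixes used to guess that a call is a change is an assumption for this example and is not exhaustive.

```python
import json
from typing import Dict, Iterator

# Assumed heuristic: IAM event names starting with these verbs change state.
MUTATING_PREFIXES = (
    "Create", "Delete", "Attach", "Detach", "Put", "Update", "Add", "Remove",
)


def iam_changes(log_file_text: str) -> Iterator[Dict[str, str]]:
    """Yield a summary for each IAM change found in one CloudTrail log file."""
    records = json.loads(log_file_text).get("Records", [])
    for record in records:
        if record.get("eventSource") != "iam.amazonaws.com":
            continue
        name = record.get("eventName", "")
        if not name.startswith(MUTATING_PREFIXES):
            continue  # read-only calls such as ListUsers are skipped
        yield {
            "time": record.get("eventTime", ""),
            "action": name,
            "actor": record.get("userIdentity", {}).get("arn", "unknown"),
            "source_ip": record.get("sourceIPAddress", ""),
        }
```

Each summary is a candidate security event; whether it deserves an alert depends on who the actor is and what is normal in your environment.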
Set of “prescriptive guidance” for configuring security options
Within that there is a set of change monitors (using CloudWatch alarms).
https://aws.amazon.com/blogs/security/announcing-industry-best-practices-for-securing-aws-resources/
A search for root account usage
Sent to our #security-alerts channel for review by an engineer
Action is to immediately validate the login
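The same root-usage check can be approximated in code when the records are not in a search tool. This is a sketch, not the search used in the talk: it tests a CloudTrail record for root activity (mirroring the commonly published filter on `userIdentity.type`, `invokedBy`, and `eventType`) and builds a JSON body with a `text` field, which is the shape a Slack incoming webhook accepts. The function names and message wording are made up for illustration, and sending the payload is left out.

```python
import json
from typing import Any, Dict, Optional


def is_root_activity(record: Dict[str, Any]) -> bool:
    """True when a CloudTrail record shows direct use of the root account."""
    identity = record.get("userIdentity", {})
    return (
        identity.get("type") == "Root"
        and "invokedBy" not in identity  # ignore calls made by AWS services
        and record.get("eventType") != "AwsServiceEvent"
    )


def root_alert_payload(record: Dict[str, Any]) -> Optional[str]:
    """Return a JSON alert body for root activity, or None for other records."""
    if not is_root_activity(record):
        return None
    text = (
        "Root account activity: {name} from {ip} at {time}. "
        "Action: validate this login now."
    ).format(
        name=record.get("eventName", "unknown"),
        ip=record.get("sourceIPAddress", "unknown"),
        time=record.get("eventTime", "unknown"),
    )
    return json.dumps({"text": text})
```

Putting the required action in the message itself keeps the event and its response together for whoever picks it up.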
Administrator activity
Authentication failures
Credential / permission changes
Scope changes
MDM - mobile devices in use
- Example dashboard for Google logins
Admin activity
Access changes for repositories and teams
- People have been added to your organisation
Repositories being made public
Authentication logs, which can be used to track where people are logging in from and how often.
What integrations have been installed
Don’t forget about your servers.
System logs (particularly auth.log)
go-audit (auditd) https://github.com/slackhq/go-audit
osquery : https://osquery.io/
Command execution, who is running what
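For system logs, a first pass over auth.log can be as simple as counting failed SSH logins per source address. The sketch below assumes the usual OpenSSH message wording ("Failed password for ... from ... port ..."); other daemons and configurations log differently, so treat the pattern as an example to adapt rather than a complete parser.

```python
import re
from collections import Counter
from typing import Iterable

# Assumed OpenSSH wording; "invalid user" appears when the account does not exist.
FAILED_LOGIN = re.compile(
    r"sshd\[\d+\]: Failed password for (?:invalid user )?(?P<user>\S+) "
    r"from (?P<ip>\S+) port \d+"
)


def failed_logins_by_ip(lines: Iterable[str]) -> Counter:
    """Count failed SSH password attempts per source IP address."""
    counts: Counter = Counter()
    for line in lines:
        match = FAILED_LOGIN.search(line)
        if match:
            counts[match.group("ip")] += 1
    return counts
```

The counts are raw data, not yet security events; a threshold or comparison against known addresses is what turns them into something worth alerting on.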
As you increase the number of alerts your Slack channel can quickly become very noisy.
How you handle this noise is really important, both for successfully identifying issues and for keeping your team sane!
You should be working hard to prevent alarm fatigue, or you run the risk of missing something important, either because it was lost in the noise or because your engineers have become disillusioned.
Tuning has always been an important part of any alert-based security system.
To tune our own setup we implemented a #security-alerts-beta channel where we can experiment with new alerts and review their impact.
That is why we have been very protective of the alerts sent to the #security-alerts channel. A message sent here will interrupt the whole security team and should therefore require immediate attention.
After this review period an alert will either be promoted to the #security-alerts channel, or run on a timed reporting cycle for a regular review in #security-reports
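The alert lifecycle described above can be written down as a small routing table. The channel names come from the notes; the stage names and the single yes/no review question are simplifications invented for this sketch.

```python
from enum import Enum


class AlertStage(Enum):
    BETA = "beta"          # new alert, still being tuned
    PROMOTED = "promoted"  # interrupts the whole security team
    REPORT = "report"      # reviewed on a timed reporting cycle


CHANNELS = {
    AlertStage.BETA: "#security-alerts-beta",
    AlertStage.PROMOTED: "#security-alerts",
    AlertStage.REPORT: "#security-reports",
}


def channel_for(stage: AlertStage) -> str:
    """Return the Slack channel an alert in this stage is sent to."""
    return CHANNELS[stage]


def review(requires_immediate_attention: bool) -> AlertStage:
    """Decide where a beta alert goes once its review period ends."""
    if requires_immediate_attention:
        return AlertStage.PROMOTED
    return AlertStage.REPORT
```

Keeping the stage as explicit data makes it easy to see, and audit, which alerts are allowed to interrupt people.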
One of the issues is making sure you have ACK’d every event in the Slack channel
Runs in a Webtask
- Result from the slash webtask command
https://github.com/auth0/audit-droid
One of the most time-consuming aspects of security monitoring is following up with users, so we use audit-droid to get our users to acknowledge a particular security event.
- Secbot has helped us stay on top of a dynamic environment. A good example of this is how we use it to track GitHub user changes.
Monitor your monitoring
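One simple way to monitor the monitoring is a staleness check: record when each log source last delivered data and flag any source that has been silent longer than expected. The source names, the per-source limits, and the function itself are assumptions for illustration.

```python
from typing import Dict, List


def stale_sources(
    last_seen: Dict[str, float],
    max_silence: Dict[str, float],
    now: float,
) -> List[str]:
    """Return the log sources that have been silent for too long.

    last_seen maps a source name to the timestamp of its latest log entry.
    max_silence maps each expected source to its allowed silence in seconds.
    """
    stale = []
    for source, limit in max_silence.items():
        seen = last_seen.get(source)
        # A source that has never reported is treated as stale.
        if seen is None or now - seen > limit:
            stale.append(source)
    return sorted(stale)
```

Driving the check from the list of expected sources, rather than from what has arrived, means a source that was never connected still shows up.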
There is a low barrier to entry for using these tools. It’s not a large capital investment.
Closing the attacker and defender gap (perceived attacker asymmetry)
Your infrastructure isn’t a black box anymore; it’s a series of API calls
Use the APIs to get visibility into the state and behaviour of your assets
Then start thinking about how the API calls can be abused by an attacker, what path would they take and how can you disrupt it?
It is dynamic but also provides many hooks to control and monitor
This is step one; the next step is event-driven security
Engineering led rather than vendor led
- Our in house MDM monitor to prevent unsafe mobile Slack app use.
I’ve built my career using open source tools and now we get to give back
Sharing and collaboration
Look for the Slack, Netflix, Dropbox and Airbnb teams.