A look at how to design and build services, systems, networks, hosts and applications that can successfully deal with a security compromise.
The deck also touches on the topics of self-healing systems and potential applications of machine learning to the problem space.
2. Why?
"We may be at the point of diminishing returns by trying to buy down vulnerability"
"maybe it’s time to place more emphasis on coping with the consequences of a successful attack, and trying to develop networks that can ‘self-heal’ or ‘self-limit’ the damages inflicted upon them"
Gen. Michael Hayden (USAF-Ret.) ex NSA and CIA head
February, 2012
31. Self-heal – what is healthy?
• Client’s user behaviour
• Client’s software behaviour
• Client’s system behaviour
• Client behaviour
32. Self-heal – what is healthy?
• Service behaviour
• Software behaviour
• System behaviour
• Network behaviour
• Operations / staff (and their credentials)
33. Putting it into practice
two (simplistic) examples
and one point for consideration
34. Example #1 (semi-passive response)
• Client SQLi
• Database dump – sequential record read
• Response taken
• Alerts raised
• Snapshots taken
… facilitates full post-incident analysis
35. Example #2 (active response)
• Ops client side attack
• Credentials stolen
• Anomalous credential behaviour
• Alerts sent
• Credentials automatically disabled
… exposure window minutes
36. Point for consideration
• Red and Blue teams
• Red team could be a Netflix-esque simian army
• Blue team could be your self-healing systems
37. Conclusions
• Design and implement compromise readiness
• Self-learning / self-healing is the future
• Plan for worst case*
• Test scenarios continually
38. Europe
Manchester - Head Office
Cheltenham
Edinburgh
Leatherhead
London
Milton Keynes
Amsterdam
Copenhagen
Munich
Zurich
North America
Atlanta
Austin
Chicago
Mountain View
New York
San Francisco
Seattle
Australia
Sydney
Thanks! Questions?
ollie.whitehouse@nccgroup.com
Editor's notes
These aren’t the only attack paths. For example, you could attack upstream:
Third party software components source repos.
Customer threat actors could go after the service’s corporate IT etc.
Packaging, testing & deployment
Careful trust and architecture boundary considerations
Kill passwords forever (2FA/MFA)
Ability to easily monitor to varying degrees (live, log or full packet capture)
Ability to easily isolate aspects while maintaining service
Ability to easily operate while isolated from known compromised / good
Ability to roll credentials / secrets
Ability to query service properties, behaviour, performance etc.
Ability to increase protective monitoring / active response
Ability to verify integrity* (configuration, software, package, system, host, network etc..)
Ability to increase integrity verification frequency
Ability to define, model or learn healthy / normal
Ability to define and execute reactions to events / situations
if this then that
Consider (less tried and tested – or ‘it worked in a PhD project’)
Machine learning for behaviours at all layers (we’ve seen this productized in a focused manner)
Ability to rate-limit or access-limit functionality automatically and/or manually in high alert situations
Something we’ve not considered
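The "define and execute reactions to events" and "if this then that" points above can be sketched as a small rule table. This is a minimal illustration, not a description of any particular product: the `Rule` type, the event fields (`type`, `passed`, `host`, `failures`, `source`) and the threshold of 10 are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Event = Dict[str, object]


@dataclass
class Rule:
    name: str
    condition: Callable[[Event], bool]  # the "if this"
    reaction: Callable[[Event], str]    # the "then that"


def run_rules(rules: List[Rule], event: Event) -> List[str]:
    """Return the outcome of every reaction whose condition matches the event."""
    return [rule.reaction(event) for rule in rules if rule.condition(event)]


# Hypothetical rules; event fields and thresholds are illustrative only.
RULES = [
    Rule(
        name="integrity-failure",
        condition=lambda e: e.get("type") == "integrity_check" and not e.get("passed"),
        reaction=lambda e: f"isolate {e['host']}",
    ),
    Rule(
        name="login-burst",
        condition=lambda e: e.get("type") == "login" and e.get("failures", 0) > 10,
        reaction=lambda e: f"rate-limit {e['source']}",
    ),
]

print(run_rules(RULES, {"type": "integrity_check", "passed": False, "host": "web-03"}))
```

In a real deployment the reactions would call into isolation, rate-limiting or credential tooling rather than return strings; keeping the conditions and reactions as data makes them easy to review and test.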
Educate in defensive coding and functional design
Consider 3rd party component integrity verification
Ability to verify source control integrity
Ability to verify build server integrity
Ability to verify development to live assets integrity
Archive releases (artefacts, source, test output and logs)
Develop compromise unit test cases for functionality in systems and software
Test compromise scenarios in pre-production
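One way to approach the integrity points above (third-party components, build output, development-to-live assets) is to record a digest for each artefact at release time and compare later. The sketch below assumes SHA-256 digests held in an in-memory manifest; the function names and the sample artefacts are hypothetical, and a real manifest would itself need to be signed or stored out of an attacker's reach.

```python
import hashlib
import hmac
from typing import Dict, List


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_artefacts(manifest: Dict[str, str], artefacts: Dict[str, bytes]) -> List[str]:
    """Return names of artefacts that are missing or whose digest differs from the manifest."""
    failed = []
    for name, expected in manifest.items():
        data = artefacts.get(name)
        if data is None or not hmac.compare_digest(sha256_hex(data), expected):
            failed.append(name)
    return failed


# Hypothetical release: record digests at build time, then check the live copy.
release = {"app.tar": b"release-1.0", "config.yml": b"debug: false"}
manifest = {name: sha256_hex(data) for name, data in release.items()}

live = dict(release)
live["config.yml"] = b"debug: true"  # simulated tampering
print(verify_artefacts(manifest, live))
```

Running the same check more often is one simple reading of "increase integrity verification frequency" during a high alert period.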
Able to define ‘security healthy’
Plan for highest level of access compromise
Ensure configuration management
Ensure configuration integrity monitoring
Protective monitoring and anomaly detection
Have the ability to time-line across many distinct sources of data
Take inspiration* from Netflix’s Simian Army and fire drill
investigating, segregating, operating, rebuilding, repairing, rolling and reintegrating
You need to be able to define system, network, host, software and service
Integrity verification or other high confidence indicator
Ability to identify likely root cause and remediate*
Alert (operations)
Opt out of operation
Snapshot (machines / configuration / logs)
Revert (to known good)
Restart
Verify
Reintegrate
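The sequence above (alert, opt out, snapshot, revert, restart, verify, reintegrate) can be read as an ordered workflow in which a node only rejoins the service if verification passes. A minimal sketch under that assumption follows; the step names mirror the notes, while `self_heal` and the `verify` callback are hypothetical stand-ins for real orchestration calls.

```python
from typing import Callable, List

# Step order taken from the notes above.
STEPS = ["alert", "opt_out", "snapshot", "revert", "restart", "verify", "reintegrate"]


def self_heal(node: str, verify: Callable[[str], bool]) -> List[str]:
    """Walk the steps in order; stop before reintegration if verification fails."""
    log: List[str] = []
    for step in STEPS:
        if step == "verify" and not verify(node):
            log.append(f"{node}: verify failed, kept out of service")
            return log
        log.append(f"{node}: {step}")
    return log


for line in self_heal("web-03", verify=lambda node: True):
    print(line)
```

Taking the snapshot before the revert is what preserves evidence for later analysis; reverting first would destroy it.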
Client’s user behaviour – needs to be learnt
Client’s software behaviour – do we care?
Client’s system behaviour – do we care?
Client behaviour – needs to be learnt
Service behaviour – needs to be defined / modelled / learnt
Software behaviour – needs to be defined / modelled / learnt
System behaviour – needs to be defined / modelled / learnt
Network behaviour – needs to be defined / modelled / learnt
Operations / staff (and their credentials) behaviour
Client’s database queries are usually*(1) non-sequential across records and return incomplete result sets*(2)
Query observed doing select * from what is usually a source*(3) of the same base 75 queries
Result return speed is rate limited*(4) with marginal effect
Alert is raised to client security point of contact
query, source, destination (including db and table), time and date
reaction by system
Snapshot of database logs and source machine taken into security incident zone for client / your analysis
… facilitates full post incident analysis
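The detection idea in this example, that normal queries touch records non-sequentially and return partial result sets while a dump reads sequentially and completely, can be sketched as two simple measures. The function names and both thresholds (0.9 and 0.5) are assumptions for illustration; in practice the baseline would be defined, modelled or learnt per client.

```python
from typing import Sequence


def sequential_ratio(record_ids: Sequence[int]) -> float:
    """Fraction of adjacent reads where the record id increases by exactly one."""
    if len(record_ids) < 2:
        return 0.0
    steps = sum(1 for a, b in zip(record_ids, record_ids[1:]) if b - a == 1)
    return steps / (len(record_ids) - 1)


def looks_like_dump(
    record_ids: Sequence[int],
    table_size: int,
    ratio_threshold: float = 0.9,     # hypothetical threshold
    coverage_threshold: float = 0.5,  # hypothetical threshold
) -> bool:
    """Flag reads that are both mostly sequential and cover much of the table."""
    coverage = len(set(record_ids)) / table_size
    return sequential_ratio(record_ids) >= ratio_threshold and coverage >= coverage_threshold


print(looks_like_dump(list(range(1, 101)), table_size=100))  # sequential full read
print(looks_like_dump([5, 42, 17, 88, 3], table_size=100))   # scattered partial read
```

A positive result would feed the semi-passive response described above: rate limit, alert and snapshot rather than block outright.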
An operations desktop gets rolled by a client-side attack
Credentials stolen and used at a higher rate*(1) than normal during a non-incident window*(2) or against systems not part of the incident group*(3)
Credentials used from hosts other than expected*(4)
Alert sent to operations shift manager and security operations centre
sources, destination, times and dates
reaction by system
Credentials automatically disabled
… exposure window minutes
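The active response in this example can be sketched as a check of each credential use against an expected profile, disabling the credential as soon as it deviates. The `CredentialProfile` fields, the host names and the hourly limit are hypothetical; the notes only say that rate and source host are compared with what is normal.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class CredentialProfile:
    expected_hosts: Set[str]  # hosts this credential is normally used from
    max_uses_per_hour: int    # hypothetical "normal" ceiling
    disabled: bool = False


def check_use(profile: CredentialProfile, source_host: str, uses_last_hour: int) -> List[str]:
    """Return the reasons a use is anomalous; disable the credential if there are any."""
    reasons: List[str] = []
    if source_host not in profile.expected_hosts:
        reasons.append("unexpected host")
    if uses_last_hour > profile.max_uses_per_hour:
        reasons.append("usage rate above normal")
    if reasons:
        profile.disabled = True  # active response: cut off the credential automatically
    return reasons


ops = CredentialProfile(expected_hosts={"ops-desk-01"}, max_uses_per_hour=20)
print(check_use(ops, "ops-desk-01", uses_last_hour=5), ops.disabled)
print(check_use(ops, "build-07", uses_last_hour=45), ops.disabled)
```

The returned reasons are what would go into the alert to the shift manager and security operations centre alongside sources, destinations, times and dates.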
One large company has Red and Blue teams
Red always attacking the services
Blue always trying to detect and mitigate
Idea:
Your Red team could be a Netflix-esque simian army
Your Blue team could be your self-healing systems
Result = If stuff isn’t happening then it’s broken!
Services, systems and software need to be compromise ready – old school:
Secure engineering
Intrusion prevention
Principle of least privilege
Segregation
Intrusion detection
Current approaches revolve around:
Event correlation / confidence indicators
Human analysis and intervention
Machine learning
Modelling
… it’s the way of the future …