Abstract: We present a novel emulation system for creating
high-fidelity digital twins of IT infrastructures. The digital twins
replicate key functionality of the corresponding infrastructures
and make it possible to play out security scenarios in a safe environment.
We show that this capability can be used to automate the process
of finding effective security policies for a target infrastructure. In
our approach, a digital twin of the target infrastructure is used
to run security scenarios and collect data. The collected data is
then used to instantiate simulations of Markov decision processes
and to learn effective policies through reinforcement learning, whose
performance is validated in the digital twin. This closed-loop
learning process executes iteratively and provides continuously
evolving and improving security policies. We apply our approach
to an intrusion response scenario. Our results show that the
digital twin provides the necessary evaluative feedback to learn
near-optimal intrusion response policies.
1. 1/19
Digital Twins for Security Automation
IEEE/IFIP Network Operations and Management Symposium
8-12 May 2023, Miami FL USA
Kim Hammar & Rolf Stadler
2. 2/19
Use Case: Intrusion Response
- A defender owns an infrastructure
  - Consists of connected components
  - Components run network services
- The defender defends the infrastructure by monitoring and active defense
  - Has partial observability
- An attacker seeks to intrude on the infrastructure
  - Has a partial view of the infrastructure
  - Wants to compromise specific components
  - Attacks by reconnaissance, exploitation, and pivoting
[Figure: infrastructure topology. Clients and the attacker connect through a public gateway; the defender monitors IPS alerts; components are numbered 1-31.]
3. 3/19
Automated Intrusion Response: Current Landscape
Levels of security automation:
- 1980s, no automation: manual detection and prevention; no alerts, no automatic responses, lack of tools.
- 1990s, operator assistance: manual detection and prevention; audit logs and security tools.
- 2000s-now, partial automation: the system has automated functions for detection/prevention but requires manual updating and configuration; intrusion detection systems and intrusion prevention systems.
- Research, high automation: the system automatically updates itself; automated attack detection and automated attack mitigation.
4. 4/19
Can we use decision theory and learning-based methods to
automatically find effective security strategies?1
[Figure: feedback control loop. A security policy π maps feedback from the target system (security indicators) to control inputs that realize the security objective, subject to disturbances.]
1. Kim Hammar and Rolf Stadler. "Finding Effective Security Strategies through Reinforcement Learning and Self-Play". In: International Conference on Network and Service Management (CNSM 2020). Izmir, Turkey, 2020; Kim Hammar and Rolf Stadler. "Learning Intrusion Prevention Policies through Optimal Stopping". In: International Conference on Network and Service Management (CNSM 2021). http://dl.ifip.org/db/conf/cnsm/cnsm2021/1570732932.pdf. Izmir, Turkey, 2021; Kim Hammar and Rolf Stadler. "Intrusion Prevention Through Optimal Stopping". In: IEEE Transactions on Network and Service Management 19.3 (2022), pp. 2333-2348. doi: 10.1109/TNSM.2022.3176781; Kim Hammar and Rolf Stadler. Learning Near-Optimal Intrusion Responses Against Dynamic Attackers. 2023. doi: 10.48550/ARXIV.2301.06085. url: https://arxiv.org/abs/2301.06085.
5. 5/19
Our Framework for Automated Network Security
[Figure: framework overview. The target infrastructure is selectively replicated in a digital twin. Data from the digital twin drives model creation and system identification for a simulation system, where strategies π are learned through reinforcement learning and generalization. Learned strategies are mapped back to the digital twin for strategy evaluation and model estimation, and finally implemented in the target infrastructure, yielding automation and self-learning systems.]
13. 6/19
Creating a Digital Twin of the Target Infrastructure
[Figure: framework overview (same diagram as on slide 5).]
14. 6/19
Theoretical Analysis and Learning of Defender Strategies
[Figure: framework overview (same diagram as on slide 5).]
15. 7/19
Creating a Digital Twin of the Target Infrastructure
- An infrastructure is defined by its configuration.
- The set of configurations supported by our framework can be seen as a configuration space.
- The configuration space defines the class of infrastructures for which we can create digital twins.

[Figure: configuration space, where marked points (*) are configurations instantiated as digital twins.]
16. 8/19
The Target Infrastructure
- 33 components
- Topology shown to the right
- Components run network services, e.g., IDPS, SSH, Web, etc.
- A subset of the components have vulnerabilities
  - CVE-2017-7494, CVE-2015-3306, CVE-2015-5602
  - CVE-2014-6271, CVE-2016-10033, CVE-2015-1427, etc.
- Clients and the attacker access the infrastructure through the public gateway
[Figure: infrastructure topology (same diagram as on slide 2).]
17. 9/19
Emulating Physical Components
- We emulate physical components with Docker containers
  - Focus on Linux-based systems
- The containers include everything needed to emulate the host: a runtime system, code, system tools, system libraries, and configurations.
- Examples of containers: IDPS container, client container, attacker container, CVE-2015-1427 container, Open vSwitch containers, etc.

[Figure: container stack. CSLE containers run on a Docker engine on top of the operating system of a physical server.]
18. 10/19
Emulating Network Connectivity
[Figure: management nodes 1..n, each running an emulated IT infrastructure, connected over an IP network through VXLAN tunnels.]

- We emulate network connectivity on the same host using network namespaces.
- Connectivity across physical hosts is achieved using VXLAN tunnels with Docker swarm.
19. 11/19
Emulating Network Conditions
- We do traffic shaping with NetEm in the Linux kernel.
- Internal connections are emulated as full-duplex and loss-less, with bit capacities of 1000 Mbit/s.
- External connections are emulated as full-duplex, with bit capacities of 100 Mbit/s, 0.1% packet loss in normal operation, and random bursts of 1% packet loss.

[Figure: Linux networking stack. Application processes in user space send data through sockets into the kernel TCP/IP stack; the NetEm configuration (latency, jitter, etc.) applies at the queueing discipline between the stack and the device-driver FIFO queue of the NIC.]
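The traffic shaping above is configured through the `tc` command. The following Python snippet is a minimal sketch, not part of the framework: it only builds the command strings, following the documented pattern of a `tbf` rate limiter attached under a `netem` root qdisc. The interface names and `tbf` burst/latency parameters are hypothetical.

```python
def netem_cmds(dev: str, rate_mbit: int, loss_pct: float = 0.0) -> list:
    """Build tc commands that shape one link: netem for loss, tbf for rate."""
    loss = f" loss {loss_pct}%" if loss_pct else ""
    return [
        # Root netem qdisc (adds loss if requested)
        f"tc qdisc add dev {dev} root handle 1:0 netem{loss}",
        # tbf child qdisc limits the bit capacity of the link
        f"tc qdisc add dev {dev} parent 1:1 handle 10: "
        f"tbf rate {rate_mbit}mbit burst 32kbit latency 400ms",
    ]

# External connection: 100 Mbit/s with 0.1% loss; internal: loss-less 1000 Mbit/s
for cmd in netem_cmds("eth0", 100, 0.1) + netem_cmds("eth1", 1000):
    print(cmd)
```

Running the printed commands requires root privileges on the emulation host.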
20. 12/19
Emulating Actors
- We emulate client arrivals with Poisson processes.
- We emulate client interactions with load generators.
- Attackers are emulated by automated programs that select actions from a pre-defined set.
- Defender actions are emulated through a custom gRPC API.

[Figure: closed loop between the IT infrastructure, its digital twin (virtual network, virtual devices, emulated services, emulated actors), and a Markov decision process. Configuration and change events flow into the digital twin, system traces into the decision process, and optimized and verified security policies flow back.]
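Poisson client arrivals, as mentioned above, can be sampled by drawing exponential inter-arrival times. A minimal sketch (illustrative only, not the framework's actual load generator):

```python
import random

def poisson_arrivals(rate: float, horizon: float, seed: int = 1) -> list:
    """Sample client arrival times on [0, horizon) from a Poisson process.

    Inter-arrival times of a Poisson process with intensity `rate`
    (clients per time unit) are i.i.d. Exponential(rate).
    """
    rng = random.Random(seed)
    t, arrivals = 0.0, []
    while True:
        t += rng.expovariate(rate)  # exponential inter-arrival time
        if t >= horizon:
            return arrivals
        arrivals.append(t)

arrivals = poisson_arrivals(rate=2.0, horizon=100.0)
print(len(arrivals))  # on average about rate * horizon = 200 clients
```

Each arrival time would then trigger a load-generator session against the emulated services.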
21. 13/19
System Identification
[Figure: framework overview (same diagram as on slide 5).]
22. 14/19
Monitoring and Telemetry
[Figure: monitoring architecture. Devices publish events to an event bus; data pipelines feed storage systems; a security policy reads the processed data and issues control actions back to the devices.]

- Emulated devices run monitoring agents that periodically push metrics to a Kafka event bus.
- The data in the event bus is consumed by data pipelines that process the data and write to storage systems.
- The processed data is used by an automated security policy to decide on control actions to execute in the digital twin.
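The push/consume pattern above can be illustrated with a toy pipeline. In this sketch a stdlib queue stands in for the Kafka event bus and a list for the storage system; the agent name and the processing step are hypothetical, not the framework's actual telemetry code.

```python
import queue
import threading

event_bus = queue.Queue()   # stands in for the Kafka event bus
storage = []                # stands in for the storage systems

def monitoring_agent(host: str, n_events: int) -> None:
    """Push a batch of host metrics to the event bus."""
    for i in range(n_events):
        event_bus.put({"host": host, "alerts": i})

def pipeline() -> None:
    """Consume events, process them, and write to storage."""
    while True:
        event = event_bus.get()
        if event is None:  # sentinel: shut down
            return
        event["weighted_alerts"] = 2 * event["alerts"]  # toy processing step
        storage.append(event)

consumer = threading.Thread(target=pipeline)
consumer.start()
monitoring_agent("node-1", 5)
event_bus.put(None)
consumer.join()
print(len(storage))  # → 5
```

In the real system the consumer side would be a Kafka consumer group feeding the storage backends.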
23. 15/19
Estimating Metric Distributions
[Figure: estimated observation distributions f̂_O(o_t | s_t = 0) and f̂_O(o_t | s_t = 1) of the number of IPS alerts weighted by priority, o_t ∈ [0, 9000]; fitted models overlaid on the empirical distributions.]

- We use the collected data to estimate metric distributions.
- We use the estimated distributions to instantiate Markov games and Markov decision processes.
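Estimating a conditional metric distribution from collected traces amounts to computing relative frequencies of the observed metric per system state. A minimal sketch with synthetic data (the sampling ranges are made up for illustration, not measurements from the testbed):

```python
import random
from collections import Counter

def estimate_obs_dist(samples):
    """Estimate f̂_O(o | s) as normalized relative frequencies.

    `samples` is an iterable of (state, observation) pairs.
    Returns {state: {observation: probability}}.
    """
    counts = {}
    for s, o in samples:
        counts.setdefault(s, Counter())[o] += 1
    return {s: {o: c / sum(cnt.values()) for o, c in cnt.items()}
            for s, cnt in counts.items()}

rng = random.Random(0)
# Toy data: fewer alerts without intrusion (s=0), more during intrusion (s=1).
samples = [(0, rng.randint(0, 3)) for _ in range(1000)] + \
          [(1, rng.randint(2, 8)) for _ in range(1000)]
f_hat = estimate_obs_dist(samples)
print(round(sum(f_hat[0].values()), 6))  # → 1.0 (valid distribution)
```

The resulting distributions can then parameterize the observation model of the POMDP or Markov game.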
24. 16/19
Learning Security Strategies
- We model the evolution of the system with a discrete-time dynamical system.
- We assume a Markovian system with stochastic dynamics and partial observability.
- A Partially Observed Markov Decision Process (POMDP), if the attacker is static.
- A Partially Observed Stochastic Game (POSG), if the attacker is dynamic.

[Figure: a stochastic Markovian system in state s_t receives the attacker and defender actions a_t^(1) and a_t^(2); a noisy sensor emits observation o_t, from which an optimal filter computes the belief b_t used by the controller.]
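The belief b_t in the diagram above is computed with a Bayesian filter. A minimal sketch for a two-state example (healthy vs. compromised); the transition and observation matrices here are made up for illustration and are not estimates from the testbed:

```python
def belief_update(b, a, o, T, Z):
    """One step of the Bayesian filter:
    b'(s') ∝ Z[a][s'][o] * sum_s T[a][s][s'] * b(s)."""
    n = len(b)
    unnorm = [Z[a][s2][o] * sum(T[a][s][s2] * b[s] for s in range(n))
              for s2 in range(n)]
    norm = sum(unnorm)
    return [u / norm for u in unnorm]

# Toy model: states 0 = healthy, 1 = compromised; one (defender) action.
T = [[[0.8, 0.2],   # T[a][s][s']: compromise happens w.p. 0.2 ...
      [0.0, 1.0]]]  # ... and is absorbing.
Z = [[[0.7, 0.3],   # Z[a][s][o]: high alerts (o=1) are more likely ...
      [0.1, 0.9]]]  # ... when compromised.
b0 = [1.0, 0.0]                      # start certain the system is healthy
b1 = belief_update(b0, a=0, o=1, T=T, Z=Z)  # then observe high alerts
print([round(x, 3) for x in b1])     # → [0.571, 0.429]
```

In the POMDP the defender's policy is then a function of this belief rather than of the hidden state.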
25. 17/19
Learning Security Strategies
[Figure: learning curves showing reward per episode, episode length (steps), and P[intrusion stopped] for T-SPSA in simulation and in the digital twin, compared against an o_t > 0 baseline, the Snort IDPS, and an upper bound.]

- T-SPSA is our reinforcement learning algorithm.
- T-SPSA outperforms Snort and converges to near-optimal strategies.
- While the performance is slightly better in simulation than in the digital twin, it is clear that the performances in the two environments are correlated.
26. 18/19
For more details about the theory
- Finding Effective Security Strategies through Reinforcement Learning and Self-Play [2]
- Learning Intrusion Prevention Policies through Optimal Stopping [3]
- A System for Interactive Examination of Learned Security Policies [4]
- Intrusion Prevention Through Optimal Stopping [5]
- Learning Security Strategies through Game Play and Optimal Stopping [6]
- An Online Framework for Adapting Security Policies in Dynamic IT Environments [7]
- Learning Near-Optimal Intrusion Responses Against Dynamic Attackers [8]

2. Kim Hammar and Rolf Stadler. "Finding Effective Security Strategies through Reinforcement Learning and Self-Play". In: International Conference on Network and Service Management (CNSM 2020). Izmir, Turkey, 2020.
3. Kim Hammar and Rolf Stadler. "Learning Intrusion Prevention Policies through Optimal Stopping". In: International Conference on Network and Service Management (CNSM 2021). http://dl.ifip.org/db/conf/cnsm/cnsm2021/1570732932.pdf. Izmir, Turkey, 2021.
4. Kim Hammar and Rolf Stadler. "A System for Interactive Examination of Learned Security Policies". In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium. 2022, pp. 1-3. doi: 10.1109/NOMS54207.2022.9789707.
5. Kim Hammar and Rolf Stadler. "Intrusion Prevention Through Optimal Stopping". In: IEEE Transactions on Network and Service Management 19.3 (2022), pp. 2333-2348. doi: 10.1109/TNSM.2022.3176781.
6. Kim Hammar and Rolf Stadler. "Learning Security Strategies through Game Play and Optimal Stopping". In: Proceedings of the ML4Cyber workshop, ICML 2022, Baltimore, USA, July 17-23, 2022. PMLR, 2022.
7. Kim Hammar and Rolf Stadler. "An Online Framework for Adapting Security Policies in Dynamic IT Environments". In: International Conference on Network and Service Management (CNSM 2022). Thessaloniki, Greece, 2022.
8. Kim Hammar and Rolf Stadler. Learning Near-Optimal Intrusion Responses Against Dynamic Attackers. 2023. doi: 10.48550/ARXIV.2301.06085. url: https://arxiv.org/abs/2301.06085.
27. 19/19
Conclusions
- We develop a framework for automated security.
- Our framework centers around a digital twin.
- We use the digital twin to optimize security strategies through reinforcement learning, game theory, and control theory.
- Documentation of our framework: limmen.dev/csle.

[Figure: closed loop between the IT infrastructure, its digital twin, and a Markov decision process (same diagram as on slide 20).]