Laurent Bernaille
Staff Engineer, Datadog
Kubernetes DNS Horror Stories
(and how to avoid them)
@lbernail
Datadog
Over 350 integrations
Over 1,200 employees
Over 8,000 customers
Runs on millions of hosts
Trillions of data points per day
10,000s of hosts in our infra
10s of k8s clusters with 100-4000 nodes
Multi-cloud
Very fast growth
@lbernail
Challenges
● Scalability
● Containerization
● Consistency across cloud providers
● Networking
● Ecosystem youth
● Edge-cases
@lbernail
What we did not expect
Our Kubernetes DNS infrastructure currently serves ~200 000 DNS queries per second
Largest cluster is at ~60 000 DNS queries per second
DNS is one of your most critical services when running Kubernetes
@lbernail
Outline
● DNS in Kubernetes
● Challenges
● "Fun" stories
● What we do now
DNS in Kubernetes
@lbernail
How it works (by default)
kubelet
pod pod
container container /etc/resolv.conf
search <namespace>.svc.cluster.local
svc.cluster.local
cluster.local
ec2.internal
nameserver <dns service>
options ndots:5
kubelet configuration
ClusterDNS: <dns service>
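For reference, a sketch of the matching kubelet settings (field names from the KubeletConfiguration v1beta1 API; the DNS service IP is a placeholder, as on the slide):

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
clusterDomain: cluster.local
clusterDNS:
  - <dns service>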
@lbernail
Accessing DNS
kube-proxy
pod
container
proxier
<dns service>
DNS
DNS
DNS
kubedns / Coredns
apiserver
watch
services
Upstream resolver
DNS config
cluster.local:
- authoritative
- get services from apiserver
.:
- forward to upstream resolver
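A minimal Corefile sketch of this default behavior (same plugin syntax as the configs shown later in this deck; details vary by CoreDNS version):

cluster.local:53 {
    kubernetes cluster.local
}
.:53 {
    proxy . /etc/resolv.conf
    cache
}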
@lbernail
Theory: Scenario 1
Pod in namespace "metrics", requesting service "points" in namespace "metrics"
getent ahosts points
1. points has fewer than 5 dots
2. With first search domain: points.metrics.svc.cluster.local
3. Answer
@lbernail
Theory: Scenario 2
Pod in namespace "logs", requesting service "points" in namespace "metrics"
getent ahosts points.metrics
1. points.metrics has fewer than 5 dots
2. With first search domain: points.metrics.logs.svc.cluster.local
3. Answer: NXDOMAIN
4. With second domain points.metrics.svc.cluster.local
5. Answer
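To make the expansion mechanics concrete, here is a rough Go sketch (a hedged approximation of the glibc stub-resolver logic, not actual resolver code) of how ndots and the search list produce the sequence of queried names:

package main

import (
	"fmt"
	"strings"
)

// candidates approximates the order in which a glibc-style stub resolver
// tries names, given the search list and ndots from /etc/resolv.conf.
func candidates(name string, search []string, ndots int) []string {
	if strings.HasSuffix(name, ".") {
		return []string{name} // fully qualified: no search-list expansion
	}
	var out []string
	if strings.Count(name, ".") >= ndots {
		out = append(out, name+".") // enough dots: try the name as-is first
	}
	for _, domain := range search {
		out = append(out, name+"."+domain+".")
	}
	if strings.Count(name, ".") < ndots {
		out = append(out, name+".") // too few dots: try the name as-is last
	}
	return out
}

func main() {
	search := []string{"logs.svc.cluster.local", "svc.cluster.local", "cluster.local", "ec2.internal"}
	// "points.metrics" has 1 dot (< ndots:5), so the search list is tried first:
	// points.metrics.logs.svc.cluster.local. (NXDOMAIN), then
	// points.metrics.svc.cluster.local. (answer)
	for _, q := range candidates("points.metrics", search, 5) {
		fmt.Println(q)
	}
}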
Challenges
@lbernail
In practice
Pod in namespace default, requesting www.google.com (GKE)
getent ahosts www.google.com
10.220.1.4.58137 > 10.128.32.10.53: A? www.google.com.default.svc.cluster.local. (58)
10.220.1.4.58137 > 10.128.32.10.53: AAAA? www.google.com.default.svc.cluster.local. (58)
10.128.32.10.53 > 10.220.1.4.58137: NXDomain 0/1/0 (151)
10.128.32.10.53 > 10.220.1.4.58137: NXDomain 0/1/0 (151)
10.220.1.4.54960 > 10.128.32.10.53: A? www.google.com.svc.cluster.local. (50)
10.220.1.4.54960 > 10.128.32.10.53: AAAA? www.google.com.svc.cluster.local. (50)
10.128.32.10.53 > 10.220.1.4.54960: NXDomain 0/1/0 (143)
10.128.32.10.53 > 10.220.1.4.54960: NXDomain 0/1/0 (143)
10.220.1.4.51754 > 10.128.32.10.53: A? www.google.com.cluster.local. (46)
10.220.1.4.51754 > 10.128.32.10.53: AAAA? www.google.com.cluster.local. (46)
10.128.32.10.53 > 10.220.1.4.51754: NXDomain 0/1/0 (139)
10.128.32.10.53 > 10.220.1.4.51754: NXDomain 0/1/0 (139)
10.220.1.4.42457 > 10.128.32.10.53: A? www.google.com.c.sandbox.internal. (59)
10.220.1.4.42457 > 10.128.32.10.53: AAAA? www.google.com.c.sandbox.internal. (59)
10.128.32.10.53 > 10.220.1.4.42457: NXDomain 0/1/0 (148)
10.128.32.10.53 > 10.220.1.4.42457: NXDomain 0/1/0 (148)
10.220.1.4.45475 > 10.128.32.10.53: A? www.google.com.google.internal. (48)
10.220.1.4.45475 > 10.128.32.10.53: AAAA? www.google.com.google.internal. (48)
10.128.32.10.53 > 10.220.1.4.45475: NXDomain 0/1/0 (137)
10.128.32.10.53 > 10.220.1.4.45475: NXDomain 0/1/0 (137)
10.220.1.4.40634 > 10.128.32.10.53: A? www.google.com. (32)
10.220.1.4.40634 > 10.128.32.10.53: AAAA? www.google.com. (32)
10.128.32.10.53 > 10.220.1.4.40634: 3/0/0 AAAA 2a00:1450:400c:c0b::67
10.128.32.10.53 > 10.220.1.4.40634: 6/0/0 A 173.194.76.103
12 queries!
Reasons:
- 5 search domains
- IPv6
Problems:
- latency
- packet loss => 5s delay
- load on DNS infra
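Note that a trailing dot makes the name fully qualified, which bypasses the search list entirely. When you control the client, this alone cuts the example above from 12 queries to 2:

getent ahosts www.google.com.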
Challenges: Resolver behaviors
@lbernail
IPv6?
getaddrinfo will do IPv4 and IPv6 queries by "default"
The Good: We're ready for IPv6!
The Bad: Not great, because it means twice the amount of traffic
The Ugly: IPv6 resolution triggers packet losses in the Kubernetes context
- Accessing the DNS service requires NAT
- Race condition in netfilter when 2 packets are sent within microseconds
- Patched in the kernel (4.19+)
- Detailed issue: https://github.com/kubernetes/kubernetes/issues/56903
If this happens for any of the 10+ queries, resolution takes 5s (at least)
Impact is far lower with IPVS (no DNAT)
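A glibc-side mitigation worth knowing (an assumption here, not something the deck uses): serializing the A/AAAA pair in /etc/resolv.conf avoids sending the two UDP packets within microseconds of each other, at the cost of slower resolution:

options single-request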
@lbernail
Let's disable IPv6!
GRUB_CMDLINE_LINUX_DEFAULT="$GRUB_CMDLINE_LINUX_DEFAULT ipv6.disable=1"
getent ahosts www.google.com
IP 10.140.216.13.53705 > 10.129.192.2.53: A? www.google.com.datadog.svc.cluster.local. (63)
IP 10.129.192.2.53 > 10.140.216.13.53705: NXDomain*- 0/1/0 (171)
IP 10.140.216.13.34772 > 10.129.192.2.53: A? www.google.com.svc.cluster.local. (55)
IP 10.129.192.2.53 > 10.140.216.13.34772: NXDomain*- 0/1/0 (163)
IP 10.140.216.13.54617 > 10.129.192.2.53: A? www.google.com.cluster.local. (51)
IP 10.129.192.2.53 > 10.140.216.13.54617: NXDomain*- 0/1/0 (159)
IP 10.140.216.13.55732 > 10.129.192.2.53: A? www.google.com.ec2.internal. (45)
IP 10.129.192.2.53 > 10.140.216.13.55732: NXDomain 0/0/0 (45)
IP 10.140.216.13.36991 > 10.129.192.2.53: A? www.google.com. (32)
IP 10.129.192.2.53 > 10.140.216.13.36991: 1/0/0 A 172.217.2.100 (62)
Much better!
No IPv6: no risk of drops, 50% less traffic
@lbernail
We wanted to celebrate, but...
(chart: A vs AAAA query rates)
Still a lot of AAAA queries
Where are they coming from?
@lbernail
What triggers IPv6?
According to POSIX, getaddrinfo should do IPv4 and IPv6 by default
BUT glibc includes hint AI_ADDRCONFIG by default
According to POSIX.1, specifying hints as NULL should cause ai_flags
to be assumed as 0. The GNU C library instead assumes a value of
(AI_V4MAPPED | AI_ADDRCONFIG) for this case, since this value is
considered an improvement on the specification.
AND AI_ADDRCONFIG only makes IPv6 queries if it finds an IPv6 address
man getaddrinfo, glibc
If hints.ai_flags includes the AI_ADDRCONFIG flag, then IPv4
addresses are returned in the list pointed to by res only if the
local system has at least one IPv4 address configured, and IPv6
addresses are returned only if the local system has at least one
IPv6 address configured
So wait, disabling IPv6 should just work, right?
@lbernail
Alpine and Musl
Turns out Musl implements the POSIX spec, and sure enough:
getaddrinfo( "echo" , NULL , NULL , &servinfo)o)
Ubuntu base image
10.140.216.13.52563 > 10.129.192.2.53: A? echo.datadog.svc.fury.cluster.local. (53)
10.129.192.2.53 > 10.140.216.13.52563: 1/0/0 A 10.129.204.147 (104)
Alpine base image
10.141.90.160.46748 > 10.129.192.2.53: A? echo.datadog.svc.cluster.local. (53)
10.141.90.160.46748 > 10.129.192.2.53: AAAA? echo.datadog.svc.cluster.local. (53)
10.129.192.2.53 > 10.141.90.160.46748: 0/1/0 (161)
10.129.192.2.53 > 10.141.90.160.46748: 1/0/0 A 10.129.204.147 (104)
Service in the same namespace
No hints (use defaults)
But, we don't use alpine that much. So what is happening?
@lbernail
We use Go a lot
10.140.216.13.55929 > 10.129.192.2.53: AAAA? www.google.com.datadog.svc.cluster.local. (63)
10.140.216.13.46751 > 10.129.192.2.53: A? www.google.com.datadog.svc.cluster.local. (63)
10.129.192.2.53 > 10.140.216.13.46751: NXDomain*- 0/1/0 (171)
10.129.192.2.53 > 10.140.216.13.55929: NXDomain*- 0/1/0 (171)
10.140.216.13.57414 > 10.129.192.2.53: AAAA? www.google.com.svc.cluster.local. (55)
10.140.216.13.54192 > 10.129.192.2.53: A? www.google.com.svc.cluster.local. (55)
10.129.192.2.53 > 10.140.216.13.57414: NXDomain*- 0/1/0 (163)
10.129.192.2.53 > 10.140.216.13.54192: NXDomain*- 0/1/0 (163)
net.ResolveTCPAddr("tcp", "www.google.com:80"), on Ubuntu
IPv6 is back...
@lbernail
What about CGO?
10.140.216.13.55929 > 10.129.192.2.53: AAAA? www.google.com.datadog.svc.cluster.local. (63)
10.140.216.13.46751 > 10.129.192.2.53: A? www.google.com.datadog.svc.cluster.local. (63)
10.129.192.2.53 > 10.140.216.13.46751: NXDomain*- 0/1/0 (171)
10.129.192.2.53 > 10.140.216.13.55929: NXDomain*- 0/1/0 (171)
...
Native Go
CGO: export GODEBUG=netdns=cgo
10.140.216.13.49382 > 10.129.192.2.53: A? www.google.com.datadog.svc.cluster.local. (63)
10.140.216.13.49382 > 10.129.192.2.53: AAAA? www.google.com.datadog.svc.cluster.local. (63)
10.129.192.2.53 > 10.140.216.13.49382: NXDomain*- 0/1/0 (171)
10.129.192.2.53 > 10.140.216.13.49382: NXDomain*- 0/1/0 (171)
...
Was GODEBUG ignored?
@lbernail
Subtle difference
10.140.216.13.55929 > 10.129.192.2.53: AAAA? www.google.com.datadog.svc.cluster.local. (63)
10.140.216.13.46751 > 10.129.192.2.53: A? www.google.com.datadog.svc.cluster.local. (63)
Native Go
CGO: export GODEBUG=netdns=cgo
10.140.216.13.49382 > 10.129.192.2.53: A? www.google.com.datadog.svc.cluster.local. (63)
10.140.216.13.49382 > 10.129.192.2.53: AAAA? www.google.com.datadog.svc.cluster.local. (63)
Native Go uses a different source port for A and AAAA
@lbernail
CGO implementation
https://github.com/golang/go/blob/master/src/net/cgo_linux.go
No AI_ADDRCONFIG (the cgo resolver does not set it)
So we can't really avoid IPv6 queries in Go unless we change the app:
net.ResolveTCPAddr("tcp4", "www.google.com:80")
But this only works with CGO...
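For completeness, a hedged Go sketch of the app-side change: LookupIP with an "ip4" network requests IPv4 results only. With the cgo resolver this maps to an AF_INET hint, so no AAAA query is sent; with the pure Go resolver the on-the-wire behavior depends on the Go version.

package main

import (
	"context"
	"fmt"
	"net"
)

func main() {
	// Ask explicitly for IPv4 results only.
	ips, err := net.DefaultResolver.LookupIP(context.Background(), "ip4", "www.google.com")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	for _, ip := range ips {
		fmt.Println(ip)
	}
}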
Challenges: Query volume reduction
@lbernail
Coredns Autopath
With this option, coredns knows the search domains and finds the right record
10.140.216.13.37164 > 10.129.192.23.53: A? google.com.datadog.svc.cluster.local. (58)
10.140.216.13.37164 > 10.129.192.23.53: AAAA? google.com.datadog.svc.cluster.local. (58)
10.129.192.23.53 > 10.140.216.13.37164: 2/0/0 CNAME google.com., A 216.58.218.238 (98)
10.129.192.23.53 > 10.140.216.13.37164: 2/0/0 CNAME google.com., AAAA 2607:f8b0:4004:808::200e (110)
Much better: only 2 queries instead of 10+
But
Requires running the Coredns Kubernetes plugin with "pods verified"
- To infer the full search domain (pod namespace)
- Memory requirements become proportional to the number of pods
- Several OOM-killer incidents for us
@lbernail
Node-local-dns
kube-proxy
daemonset
pod in hostnet
local bind
node-local
dns
proxier
tcp:<dns service>
tcp:<cloud resolver>
DNS
DNS
DNS
kubedns / Coredns
apiserver
watch
services
Upstream resolver
pod
container
node-local-dns
Daemonset
Binds a non-routable link-local IP (169.254.x.y)
Cache
Proxies queries to the DNS service over TCP
Proxies non-kubernetes queries directly
Wins
Reduces load
- cache
- non-Kubernetes queries bypass the service
Mitigates the netfilter race condition
Challenges: Rolling updates
@lbernail
Initial state
kube-proxy
IPVS
DNS A
DNS B
DNS C
watch
services
pod 1
container
apiserver
IP1:P1 => VIP:53 ⇔ IP1:P1 => IPA:53 300s
VIP:53
IPA:53
@lbernail
Pod A deleted
kube-proxy
IPVS
DNS A
DNS B
DNS C
watch
services
pod 1
container
apiserver
VIP:53
IP1:P1=> VIP:53 ⇔ IP1:P1 => IPA:53 200s
IP1:P2=> VIP:53 ⇔ IP1:P2 => IPB:53 300s
@lbernail
Source port reuse
kube-proxy
IPVS
DNS A
DNS B
DNS C
watch
services
pod 1
container
apiserver
VIP:53
IP1:P1=> VIP:53 ⇔ IP1:P1 => IPA:53 100s
IP1:P2=> VIP:53 ⇔ IP1:P2 => IPB:53 200s
What if a new query reuses P1 while the entry is still in the IPVS conntrack?
Packet is silently dropped
Situation can be frequent for applications sending a lot of DNS queries
@lbernail
Mitigation #1
kube-proxy
IPVS
DNS A
DNS B
DNS C
watch
services
pod 1
container
apiserver
VIP:53
IP1:P1=> VIP:53 ⇔ IP1:P1 => IPA:53 100s
IP1:P2=> VIP:53 ⇔ IP1:P2 => IPB:53 200s
expire_nodest_conn=1
New packet from IP1:P1?
- Delete entry from the IPVS conntrack
- Send ICMP Port Unreachable
Better, but under load, this can still trigger many errors
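For reference, a sketch of enabling this on a node (the sysctl lives under net.ipv4.vs when IPVS is loaded):

sysctl -w net.ipv4.vs.expire_nodest_conn=1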
@lbernail
Mitigation #2
kube-proxy
IPVS
DNS A
DNS B
DNS C
watch
services
pod 1
container
apiserver
VIP:53
IP1:P2=> VIP:53 ⇔ IP1:P2 => IPB:53 29s
expire_nodest_conn=1, low UDP timeout (30s) [kube-proxy 1.18+]
Likelihood of port collision much lower
Still some errors
Kernel patch to expire entries on backend deletion (5.9+) by @andrewsykim
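A sketch of the matching kube-proxy configuration (flag name per kube-proxy 1.18+):

kube-proxy --proxy-mode=ipvs --ipvs-udp-timeout=30s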
"Fun" stories
Sometimes your DNS is unstable
@lbernail
Coredns getting OOM-killed
Some coredns pods getting OOM-killed
Not great for apps...
Coredns memory usage / limit
@lbernail
Coredns getting OOM-killed
Coredns memory usage / limit
Apiserver restarted
Coredns pod reconnected
Too much memory => OOM
Startup requires more memory
Autopath "Pods:verified" makes sizing hardapiserver restart
Sometimes Autoscaling works too well
@lbernail
Proportional autoscaler
Number of coredns replicas
Proportional-autoscaler for coredns:
Fewer nodes => fewer coredns pods
@lbernail
Proportional autoscaler
Exceptions due to DNS failure
Number of coredns replicas
Triggered port reuse issue
Some applications don't like this
Sometimes it's not your fault
@lbernail
Staging fright on AWS
.:53 {
kubernetes cluster.local {
pods verified
}
autopath @kubernetes
proxy . /etc/resolv.conf
}
cluster.local:53 {
kubernetes cluster.local
}
.:53 {
proxy . /etc/resolv.conf
cache
}
Enable autopath
Simple change, right?
Can you spot what broke the staging cluster?
@lbernail
Staging fright on AWS
.:53 {
kubernetes cluster.local {
pods verified
}
autopath @kubernetes
proxy . /etc/resolv.conf
}
cluster.local:53 {
kubernetes cluster.local
}
.:53 {
proxy . /etc/resolv.conf
cache
}
With this change we disabled caching for proxied queries
AWS resolver has a strict limit: 1024 packets/second to resolver per ENI
A large proportion of forwarded queries got dropped
@lbernail
Staging fright on AWS #2
cluster.local:53 {
kubernetes cluster.local
}
.:53 {
proxy . /etc/resolv.conf
cache
force_tcp
}
cluster.local:53 {
kubernetes cluster.local
}
.:53 {
proxy . /etc/resolv.conf
cache
}
Let's avoid UDP for upstream queries: avoid truncation, fewer errors
Sounds like a good idea?
@lbernail
Staging fright on AWS #2
cluster.local:53 {
kubernetes cluster.local
}
.:53 {
proxy . /etc/resolv.conf
cache
force_tcp
}
cluster.local:53 {
kubernetes cluster.local
}
.:53 {
proxy . /etc/resolv.conf
cache
}
AWS resolver has a strict limit: 1024 packets/second to resolver per ENI
DNS queries over TCP use at least 5x more packets
Don't query the AWS resolvers using TCP
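The arithmetic behind the 5x: a UDP resolution is 2 packets (query and response), while a TCP one adds the three-way handshake and connection teardown, so roughly 10 packets for the same answer.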
Sometimes it's really not your fault
@lbernail
Upstream DNS issue
Application resolution errors
@lbernail
Upstream DNS issue
Application resolution errors
DNS queries by zone
Sharp drop in number of queries for zone "."
@lbernail
Application resolution errors
Upstream resolver issue in a single AZ
Provider DNS
Zone "A"
DNS queries by DNS zone
Forward Health-check failures by Upstream/AZ
Forward query Latency by Upstream/AZ
Upstream DNS issue
Sometimes you have to remember
pods run on nodes
@lbernail
Node issue
Familiar symptom: some applications have issues due to DNS errors
DNS errors for app
@lbernail
Familiar symptom: some applications have issues due to DNS errors
DNS errors for app
Conntrack entries (hosts running coredns pods) ~130k entries
Node issue
@lbernail
Familiar symptom: some applications have issues due to DNS errors
DNS errors for app
Conntrack entries (hosts running coredns pods) ~130k entries
--conntrack-min => Default: 131072
Node issue
Increase number of pods and nodes
@lbernail
Something weird
Group of nodes with ~130k entries
Group of nodes with ~60k entries
@lbernail
Something weird
Kernel 4.15
Kernel 5.0
Kernel patches in 5.0+ to improve conntrack behavior with UDP
=> conntrack entries for DNS get 30s TTL instead of 180s
Details
● netfilter: conntrack: udp: only extend timeout to stream mode after 2s
● netfilter: conntrack: udp: set stream timeout to 2 minutes
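To see which behavior a node exhibits, the UDP conntrack timeouts can be inspected directly (standard netfilter sysctls):

sysctl net.netfilter.nf_conntrack_udp_timeout
sysctl net.netfilter.nf_conntrack_udp_timeout_stream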
Sometimes it's just weird
@lbernail
DNS is broken for a single app
Symptom
pgbouncer can't connect to postgres after coredns update
But, everything else works completely fine
@lbernail
DNS is broken for a single app
10.143.217.162 41005 1 0 BAcKEnd.DAtAdOG.serViCe.CONsul.DATAdog.svC.clusTEr.LOCaL
10.143.217.162 41005 28 0 BaCKEND.dATaDOg.sErvicE.cONSuL.dataDog.svc.ClUsTER.loCaL
10.129.224.2 53 1 3 0 BAcKEnd.DAtAdOG.serViCe.CONsul.DATAdog.svC.clusTEr.LOCaL
10.129.224.2 53 28 3 0 BaCKEND.dATaDOg.sErvicE.cONSuL.dataDog.svc.ClUsTER.loCaL
[...]
10.143.217.162 41005 1 0 bACKEND.dataDog.SeRvICe.CONsUl
10.143.217.162 41005 28 0 BaCkenD.DatADOg.ServiCE.COnsUL
10.129.224.2 53 1 0 1 bACKEND.dataDog.SeRvICe.CONsUl
Let's capture and analyze DNS queries
Source IP/Port
Same source port across all queries??
1: A
28: AAAA
3: NXDOMAIN
0: NOERROR
Queried domain. Random case??
IETF Draft to increase DNS Security
"Use of Bit 0x20 in DNS Labels to Improve Transaction Identity"
This DNS client is clearly not one we know about
@lbernail
DNS is broken for a single app
10.143.217.162 41005 1 0 BAcKEnd.DAtAdOG.serViCe.CONsul.DATAdog.svC.clusTEr.LOCaL
10.143.217.162 41005 28 0 BaCKEND.dATaDOg.sErvicE.cONSuL.dataDog.svc.ClUsTER.loCaL
10.129.224.2 53 1 3 0 BAcKEnd.DAtAdOG.serViCe.CONsul.DATAdog.svC.clusTEr.LOCaL
10.129.224.2 53 28 3 0 BaCKEND.dATaDOg.sErvicE.cONSuL.dataDog.svc.ClUsTER.loCaL
[...]
10.143.217.162 41005 1 0 bACKEND.dataDog.SeRvICe.CONsUl
10.143.217.162 41005 28 0 BaCkenD.DatADOg.ServiCE.COnsUL
10.129.224.2 53 1 0 1 bACKEND.dataDog.SeRvICe.CONsUl
Let's capture and analyze DNS queries
Truncation bit (TC)
pgbouncer compiled with evdns, which doesn't support TCP upgrade (and just ignores the answer if TC=1)
The previous coredns version was not setting the TC bit when the upstream TCP answer was too large (the bug has since been fixed)
Recompiling pgbouncer with c-ares fixed the problem
Sometimes it's not DNS
Rarely
@lbernail
Logs are full of DNS errors
% of Errors in app
Sometimes it's not DNS
@lbernail
Logs are full of DNS errors
Sharp drop in traffic received by nodes
??
Average #Packets received by nodegroups
% of Errors in app
Sometimes it's not DNS
@lbernail
Logs are full of DNS errors
Sharp drop in traffic received by nodes
??
Average #Packets received by nodegroups
TCP Retransmits by AZ-pairs
High proportion of drops for any traffic involving zone "b"
Confirmed transient issue from provider
Not really DNS that time, but this was the first impact
% of Errors in app
Sometimes it's not DNS
What we run now
@lbernail
Our DNS setup
kube-proxy
daemonset
pod binding
169.254.20.10
node-local
dns
IPVS
cluster.local
forward <dns svc>
force_tcp
cache
.
forward <cloud>
prefer_udp
cache
DNS
DNS
DNS
Coredns
Cloud resolver
pod
container
cluster.local
kubernetes
pods disabled
cache
.
forward <cloud>
prefer_udp
cache
UDP
TCP
UDP
UDP
@lbernail
Our DNS setup
kube-proxy
daemonset
pod binding
169.254.20.10
node-local
dns
IPVS
DNS
DNS
DNS
Cloud resolver
pod
container
Container /etc/resolv.conf
search <namespace>.svc.cluster.local
svc.cluster.local
cluster.local
ec2.internal
nameserver 169.254.20.10
<dns svc>
options ndots:5 timeout:1
Coredns
@lbernail
Our DNS setup
kube-proxy
daemonset
pod binding
169.254.20.10
node-local
dns
IPVS
DNS
DNS
DNS
Cloud resolver
pod
container
Container /etc/resolv.conf
search <namespace>.svc.cluster.local
svc.cluster.local
cluster.local
ec2.internal
nameserver 169.254.20.10
<dns svc>
options ndots:5 timeout:1
Alternate /etc/resolv.conf
Opt-in using annotations (mutating webhook)
search svc.cluster.local
nameserver 169.254.20.10
<dns svc>
options ndots:2 timeout:1
Coredns
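A sketch of what such a webhook could inject, using the standard pod-level dnsConfig API (values mirror the slide; the actual webhook implementation is not shown here):

dnsPolicy: "None"
dnsConfig:
  nameservers:
    - 169.254.20.10
  searches:
    - svc.cluster.local
  options:
    - name: ndots
      value: "2"
    - name: timeout
      value: "1"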
Conclusion
@lbernail
Conclusion
● Running Kubernetes means running DNS
● DNS is hard, especially at scale
● Recommendations
○ Use node-local-dns and cache everywhere you can
○ Test your DNS infrastructure (load-tests, rolling updates)
○ Understand the upstream DNS services you depend on
● For your applications
○ Try to standardize on a few resolvers
○ Use async resolution/long-lived connections whenever possible
○ Include DNS failures in your (Chaos) tests
Thank you
We’re hiring!
https://www.datadoghq.com/careers/
laurent@datadoghq.com
@lbernail
Kubernetes DNS Horror Stories

Contenu connexe

Tendances

Writing the Container Network Interface(CNI) plugin in golang
Writing the Container Network Interface(CNI) plugin in golangWriting the Container Network Interface(CNI) plugin in golang
Writing the Container Network Interface(CNI) plugin in golangHungWei Chiu
 
Introduction to CNI (Container Network Interface)
Introduction to CNI (Container Network Interface)Introduction to CNI (Container Network Interface)
Introduction to CNI (Container Network Interface)HungWei Chiu
 
[MeetUp][1st] 오리뎅이의_쿠버네티스_네트워킹
[MeetUp][1st] 오리뎅이의_쿠버네티스_네트워킹[MeetUp][1st] 오리뎅이의_쿠버네티스_네트워킹
[MeetUp][1st] 오리뎅이의_쿠버네티스_네트워킹InfraEngineer
 
Docker networking Tutorial 101
Docker networking Tutorial 101Docker networking Tutorial 101
Docker networking Tutorial 101LorisPack Project
 
도커 무작정 따라하기: 도커가 처음인 사람도 60분이면 웹 서버를 올릴 수 있습니다!
도커 무작정 따라하기: 도커가 처음인 사람도 60분이면 웹 서버를 올릴 수 있습니다!도커 무작정 따라하기: 도커가 처음인 사람도 60분이면 웹 서버를 올릴 수 있습니다!
도커 무작정 따라하기: 도커가 처음인 사람도 60분이면 웹 서버를 올릴 수 있습니다!pyrasis
 
BPF - in-kernel virtual machine
BPF - in-kernel virtual machineBPF - in-kernel virtual machine
BPF - in-kernel virtual machineAlexei Starovoitov
 
Kubernetes Architecture
 Kubernetes Architecture Kubernetes Architecture
Kubernetes ArchitectureKnoldus Inc.
 
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 ViennaAutovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 ViennaPostgreSQL-Consulting
 
왜 쿠버네티스는 systemd로 cgroup을 관리하려고 할까요
왜 쿠버네티스는 systemd로 cgroup을 관리하려고 할까요왜 쿠버네티스는 systemd로 cgroup을 관리하려고 할까요
왜 쿠버네티스는 systemd로 cgroup을 관리하려고 할까요Jo Hoon
 
BPF Internals (eBPF)
BPF Internals (eBPF)BPF Internals (eBPF)
BPF Internals (eBPF)Brendan Gregg
 
OVN operationalization at scale at eBay
OVN operationalization at scale at eBayOVN operationalization at scale at eBay
OVN operationalization at scale at eBayAliasgar Ginwala
 
Open vSwitch 패킷 처리 구조
Open vSwitch 패킷 처리 구조Open vSwitch 패킷 처리 구조
Open vSwitch 패킷 처리 구조Seung-Hoon Baek
 
[KubeCon EU 2022] Running containerd and k3s on macOS
[KubeCon EU 2022] Running containerd and k3s on macOS[KubeCon EU 2022] Running containerd and k3s on macOS
[KubeCon EU 2022] Running containerd and k3s on macOSAkihiro Suda
 
Ninja Build: Simple Guide for Beginners
Ninja Build: Simple Guide for BeginnersNinja Build: Simple Guide for Beginners
Ninja Build: Simple Guide for BeginnersChang W. Doh
 
Wait! What’s going on inside my database?
Wait! What’s going on inside my database?Wait! What’s going on inside my database?
Wait! What’s going on inside my database?Jeremy Schneider
 
Rootless Containers
Rootless ContainersRootless Containers
Rootless ContainersAkihiro Suda
 
Admission controllers - PSP, OPA, Kyverno and more!
Admission controllers - PSP, OPA, Kyverno and more!Admission controllers - PSP, OPA, Kyverno and more!
Admission controllers - PSP, OPA, Kyverno and more!SebastienSEYMARC
 

Tendances (20)

Writing the Container Network Interface(CNI) plugin in golang
Writing the Container Network Interface(CNI) plugin in golangWriting the Container Network Interface(CNI) plugin in golang
Writing the Container Network Interface(CNI) plugin in golang
 
Introduction to CNI (Container Network Interface)
Introduction to CNI (Container Network Interface)Introduction to CNI (Container Network Interface)
Introduction to CNI (Container Network Interface)
 
[MeetUp][1st] 오리뎅이의_쿠버네티스_네트워킹
[MeetUp][1st] 오리뎅이의_쿠버네티스_네트워킹[MeetUp][1st] 오리뎅이의_쿠버네티스_네트워킹
[MeetUp][1st] 오리뎅이의_쿠버네티스_네트워킹
 
Docker networking Tutorial 101
Docker networking Tutorial 101Docker networking Tutorial 101
Docker networking Tutorial 101
 
도커 무작정 따라하기: 도커가 처음인 사람도 60분이면 웹 서버를 올릴 수 있습니다!
도커 무작정 따라하기: 도커가 처음인 사람도 60분이면 웹 서버를 올릴 수 있습니다!도커 무작정 따라하기: 도커가 처음인 사람도 60분이면 웹 서버를 올릴 수 있습니다!
도커 무작정 따라하기: 도커가 처음인 사람도 60분이면 웹 서버를 올릴 수 있습니다!
 
BPF - in-kernel virtual machine
BPF - in-kernel virtual machineBPF - in-kernel virtual machine
BPF - in-kernel virtual machine
 
Kubernetes Architecture
 Kubernetes Architecture Kubernetes Architecture
Kubernetes Architecture
 
Distributed fun with etcd
Distributed fun with etcdDistributed fun with etcd
Distributed fun with etcd
 
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 ViennaAutovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna
Autovacuum, explained for engineers, new improved version PGConf.eu 2015 Vienna
 
왜 쿠버네티스는 systemd로 cgroup을 관리하려고 할까요
왜 쿠버네티스는 systemd로 cgroup을 관리하려고 할까요왜 쿠버네티스는 systemd로 cgroup을 관리하려고 할까요
왜 쿠버네티스는 systemd로 cgroup을 관리하려고 할까요
 
BPF Internals (eBPF)
BPF Internals (eBPF)BPF Internals (eBPF)
BPF Internals (eBPF)
 
OVN operationalization at scale at eBay
OVN operationalization at scale at eBayOVN operationalization at scale at eBay
OVN operationalization at scale at eBay
 
Open vSwitch 패킷 처리 구조
Open vSwitch 패킷 처리 구조Open vSwitch 패킷 처리 구조
Open vSwitch 패킷 처리 구조
 
[KubeCon EU 2022] Running containerd and k3s on macOS
[KubeCon EU 2022] Running containerd and k3s on macOS[KubeCon EU 2022] Running containerd and k3s on macOS
[KubeCon EU 2022] Running containerd and k3s on macOS
 
Ninja Build: Simple Guide for Beginners
Ninja Build: Simple Guide for BeginnersNinja Build: Simple Guide for Beginners
Ninja Build: Simple Guide for Beginners
 
Wait! What’s going on inside my database?
Wait! What’s going on inside my database?Wait! What’s going on inside my database?
Wait! What’s going on inside my database?
 
Rootless Containers
Rootless ContainersRootless Containers
Rootless Containers
 
Admission controllers - PSP, OPA, Kyverno and more!
Admission controllers - PSP, OPA, Kyverno and more!Admission controllers - PSP, OPA, Kyverno and more!
Admission controllers - PSP, OPA, Kyverno and more!
 
Deploying IPv6 on OpenStack
Deploying IPv6 on OpenStackDeploying IPv6 on OpenStack
Deploying IPv6 on OpenStack
 
Observability with HAProxy
Observability with HAProxyObservability with HAProxy
Observability with HAProxy
 

Similaire à Kubernetes DNS Horror Stories

Getting started with IPv6
Getting started with IPv6Getting started with IPv6
Getting started with IPv6Private
 
Networking in Kubernetes
Networking in KubernetesNetworking in Kubernetes
Networking in KubernetesMinhan Xia
 
Finding target for hacking on internet is now easier
Finding target for hacking on internet is now easierFinding target for hacking on internet is now easier
Finding target for hacking on internet is now easierDavid Thomas
 
Handy Networking Tools and How to Use Them
Handy Networking Tools and How to Use ThemHandy Networking Tools and How to Use Them
Handy Networking Tools and How to Use ThemSneha Inguva
 
OakTable World Sep14 clonedb
OakTable World Sep14 clonedb OakTable World Sep14 clonedb
OakTable World Sep14 clonedb Connor McDonald
 
介绍 Percona 服务器 XtraDB 和 Xtrabackup
介绍 Percona 服务器 XtraDB 和 Xtrabackup介绍 Percona 服务器 XtraDB 和 Xtrabackup
介绍 Percona 服务器 XtraDB 和 XtrabackupYUCHENG HU
 
Multicloud connectivity using OpenNHRP
Multicloud connectivity using OpenNHRPMulticloud connectivity using OpenNHRP
Multicloud connectivity using OpenNHRPBob Melander
 
Debugging linux issues with eBPF
Debugging linux issues with eBPFDebugging linux issues with eBPF
Debugging linux issues with eBPFIvan Babrou
 
Kubernetes at Datadog Scale
Kubernetes at Datadog ScaleKubernetes at Datadog Scale
Kubernetes at Datadog ScaleDocker, Inc.
 
Oracle cluster installation with grid and iscsi
Oracle cluster  installation with grid and iscsiOracle cluster  installation with grid and iscsi
Oracle cluster installation with grid and iscsiChanaka Lasantha
 
[OpenStack 하반기 스터디] HA using DVR
[OpenStack 하반기 스터디] HA using DVR[OpenStack 하반기 스터디] HA using DVR
[OpenStack 하반기 스터디] HA using DVROpenStack Korea Community
 
Ignacy Kowalczyk
Ignacy KowalczykIgnacy Kowalczyk
Ignacy KowalczykCodeFest
 
USENIX ATC 2017 Performance Superpowers with Enhanced BPF
USENIX ATC 2017 Performance Superpowers with Enhanced BPFUSENIX ATC 2017 Performance Superpowers with Enhanced BPF
USENIX ATC 2017 Performance Superpowers with Enhanced BPFBrendan Gregg
 
Enhancing Network and Runtime Security with Cilium and Tetragon by Raymond De...
Enhancing Network and Runtime Security with Cilium and Tetragon by Raymond De...Enhancing Network and Runtime Security with Cilium and Tetragon by Raymond De...
Enhancing Network and Runtime Security with Cilium and Tetragon by Raymond De...ContainerDay Security 2023
 
The Ring programming language version 1.9 book - Part 9 of 210
The Ring programming language version 1.9 book - Part 9 of 210The Ring programming language version 1.9 book - Part 9 of 210
The Ring programming language version 1.9 book - Part 9 of 210Mahmoud Samir Fayed
 
Tensorflow in Docker
Tensorflow in DockerTensorflow in Docker
Tensorflow in DockerEric Ahn
 
PythonBrasil[8] - CPython for dummies
PythonBrasil[8] - CPython for dummiesPythonBrasil[8] - CPython for dummies
PythonBrasil[8] - CPython for dummiesTatiana Al-Chueyr
 

Similaire à Kubernetes DNS Horror Stories (20)

Getting started with IPv6
Getting started with IPv6Getting started with IPv6
Getting started with IPv6
 
Networking in Kubernetes
Networking in KubernetesNetworking in Kubernetes
Networking in Kubernetes
 
Finding target for hacking on internet is now easier
Finding target for hacking on internet is now easierFinding target for hacking on internet is now easier
Finding target for hacking on internet is now easier
 
Handy Networking Tools and How to Use Them
Handy Networking Tools and How to Use ThemHandy Networking Tools and How to Use Them
Handy Networking Tools and How to Use Them
 
derby onboarding (1)
derby onboarding (1)derby onboarding (1)
derby onboarding (1)
 
OakTable World Sep14 clonedb
OakTable World Sep14 clonedb OakTable World Sep14 clonedb
OakTable World Sep14 clonedb
 
介绍 Percona 服务器 XtraDB 和 Xtrabackup
介绍 Percona 服务器 XtraDB 和 Xtrabackup介绍 Percona 服务器 XtraDB 和 Xtrabackup
介绍 Percona 服务器 XtraDB 和 Xtrabackup
 
Multicloud connectivity using OpenNHRP
Multicloud connectivity using OpenNHRPMulticloud connectivity using OpenNHRP
Multicloud connectivity using OpenNHRP
 
Debugging linux issues with eBPF
Debugging linux issues with eBPFDebugging linux issues with eBPF
Debugging linux issues with eBPF
 
Kubernetes at Datadog Scale
Kubernetes at Datadog ScaleKubernetes at Datadog Scale
Kubernetes at Datadog Scale
 
Oracle cluster installation with grid and iscsi
Oracle cluster  installation with grid and iscsiOracle cluster  installation with grid and iscsi
Oracle cluster installation with grid and iscsi
 
[OpenStack 하반기 스터디] HA using DVR
[OpenStack 하반기 스터디] HA using DVR[OpenStack 하반기 스터디] HA using DVR
[OpenStack 하반기 스터디] HA using DVR
 
Ignacy Kowalczyk
Ignacy KowalczykIgnacy Kowalczyk
Ignacy Kowalczyk
 
tp smarts_onboarding
 tp smarts_onboarding tp smarts_onboarding
tp smarts_onboarding
 
USENIX ATC 2017 Performance Superpowers with Enhanced BPF
USENIX ATC 2017 Performance Superpowers with Enhanced BPFUSENIX ATC 2017 Performance Superpowers with Enhanced BPF
USENIX ATC 2017 Performance Superpowers with Enhanced BPF
 
Enhancing Network and Runtime Security with Cilium and Tetragon by Raymond De...
Enhancing Network and Runtime Security with Cilium and Tetragon by Raymond De...Enhancing Network and Runtime Security with Cilium and Tetragon by Raymond De...
Enhancing Network and Runtime Security with Cilium and Tetragon by Raymond De...
 
Haproxy - zastosowania
Haproxy - zastosowaniaHaproxy - zastosowania
Haproxy - zastosowania
 
The Ring programming language version 1.9 book - Part 9 of 210
The Ring programming language version 1.9 book - Part 9 of 210The Ring programming language version 1.9 book - Part 9 of 210
The Ring programming language version 1.9 book - Part 9 of 210
 
Tensorflow in Docker
Tensorflow in DockerTensorflow in Docker
Tensorflow in Docker
 
PythonBrasil[8] - CPython for dummies
PythonBrasil[8] - CPython for dummiesPythonBrasil[8] - CPython for dummies
PythonBrasil[8] - CPython for dummies
 

Plus de Laurent Bernaille

How the OOM Killer Deleted My Namespace
How the OOM Killer Deleted My NamespaceHow the OOM Killer Deleted My Namespace
How the OOM Killer Deleted My NamespaceLaurent Bernaille
 
Evolution of kube-proxy (Brussels, Fosdem 2020)
Evolution of kube-proxy (Brussels, Fosdem 2020)Evolution of kube-proxy (Brussels, Fosdem 2020)
Evolution of kube-proxy (Brussels, Fosdem 2020)Laurent Bernaille
 
Making the most out of kubernetes audit logs
Making the most out of kubernetes audit logsMaking the most out of kubernetes audit logs
Making the most out of kubernetes audit logsLaurent Bernaille
 
Kubernetes the Very Hard Way. Velocity Berlin 2019
Kubernetes the Very Hard Way. Velocity Berlin 2019Kubernetes the Very Hard Way. Velocity Berlin 2019
Kubernetes the Very Hard Way. Velocity Berlin 2019Laurent Bernaille
 
Kubernetes the Very Hard Way. Lisa Portland 2019
Kubernetes the Very Hard Way. Lisa Portland 2019Kubernetes the Very Hard Way. Lisa Portland 2019
Kubernetes the Very Hard Way. Lisa Portland 2019Laurent Bernaille
 
10 ways to shoot yourself in the foot with kubernetes, #9 will surprise you! ...
10 ways to shoot yourself in the foot with kubernetes, #9 will surprise you! ...10 ways to shoot yourself in the foot with kubernetes, #9 will surprise you! ...
10 ways to shoot yourself in the foot with kubernetes, #9 will surprise you! ...Laurent Bernaille
 
10 ways to shoot yourself in the foot with kubernetes, #9 will surprise you!
10 ways to shoot yourself in the foot with kubernetes, #9 will surprise you!10 ways to shoot yourself in the foot with kubernetes, #9 will surprise you!
10 ways to shoot yourself in the foot with kubernetes, #9 will surprise you!Laurent Bernaille
 
Optimizing kubernetes networking
Optimizing kubernetes networkingOptimizing kubernetes networking
Optimizing kubernetes networkingLaurent Bernaille
 
Kubernetes at Datadog the very hard way
Kubernetes at Datadog the very hard wayKubernetes at Datadog the very hard way
Kubernetes at Datadog the very hard wayLaurent Bernaille
 
Deep Dive in Docker Overlay Networks
Deep Dive in Docker Overlay NetworksDeep Dive in Docker Overlay Networks
Deep Dive in Docker Overlay NetworksLaurent Bernaille
 
Deeper dive in Docker Overlay Networks
Deeper dive in Docker Overlay NetworksDeeper dive in Docker Overlay Networks
Deeper dive in Docker Overlay NetworksLaurent Bernaille
 
Operational challenges behind Serverless architectures
Operational challenges behind Serverless architecturesOperational challenges behind Serverless architectures
Operational challenges behind Serverless architecturesLaurent Bernaille
 
Deep dive in Docker Overlay Networks
Deep dive in Docker Overlay NetworksDeep dive in Docker Overlay Networks
Deep dive in Docker Overlay NetworksLaurent Bernaille
 
Feedback on AWS re:invent 2016
Feedback on AWS re:invent 2016Feedback on AWS re:invent 2016
Feedback on AWS re:invent 2016Laurent Bernaille
 
Early recognition of encryted applications
Early recognition of encryted applicationsEarly recognition of encryted applications
Early recognition of encryted applicationsLaurent Bernaille
 
Early application identification. CONEXT 2006
Early application identification. CONEXT 2006Early application identification. CONEXT 2006
Early application identification. CONEXT 2006Laurent Bernaille
 

Plus de Laurent Bernaille (17)

How the OOM Killer Deleted My Namespace
How the OOM Killer Deleted My NamespaceHow the OOM Killer Deleted My Namespace
How the OOM Killer Deleted My Namespace
 
Evolution of kube-proxy (Brussels, Fosdem 2020)
Evolution of kube-proxy (Brussels, Fosdem 2020)Evolution of kube-proxy (Brussels, Fosdem 2020)
Evolution of kube-proxy (Brussels, Fosdem 2020)
 
Making the most out of kubernetes audit logs
Making the most out of kubernetes audit logsMaking the most out of kubernetes audit logs
Making the most out of kubernetes audit logs
 
Kubernetes the Very Hard Way. Velocity Berlin 2019
Kubernetes the Very Hard Way. Velocity Berlin 2019Kubernetes the Very Hard Way. Velocity Berlin 2019
Kubernetes the Very Hard Way. Velocity Berlin 2019
 
Kubernetes the Very Hard Way. Lisa Portland 2019
Kubernetes the Very Hard Way. Lisa Portland 2019Kubernetes the Very Hard Way. Lisa Portland 2019
Kubernetes the Very Hard Way. Lisa Portland 2019
 
10 ways to shoot yourself in the foot with kubernetes, #9 will surprise you! ...
10 ways to shoot yourself in the foot with kubernetes, #9 will surprise you! ...10 ways to shoot yourself in the foot with kubernetes, #9 will surprise you! ...
10 ways to shoot yourself in the foot with kubernetes, #9 will surprise you! ...
 
10 ways to shoot yourself in the foot with kubernetes, #9 will surprise you!
10 ways to shoot yourself in the foot with kubernetes, #9 will surprise you!10 ways to shoot yourself in the foot with kubernetes, #9 will surprise you!
10 ways to shoot yourself in the foot with kubernetes, #9 will surprise you!
 
Optimizing kubernetes networking
Optimizing kubernetes networkingOptimizing kubernetes networking
Optimizing kubernetes networking
 
Kubernetes at Datadog the very hard way
Kubernetes at Datadog the very hard wayKubernetes at Datadog the very hard way
Kubernetes at Datadog the very hard way
 
Deep Dive in Docker Overlay Networks
Deep Dive in Docker Overlay NetworksDeep Dive in Docker Overlay Networks
Deep Dive in Docker Overlay Networks
 
Deeper dive in Docker Overlay Networks
Deeper dive in Docker Overlay NetworksDeeper dive in Docker Overlay Networks
Deeper dive in Docker Overlay Networks
 
Discovering OpenBSD on AWS
Discovering OpenBSD on AWSDiscovering OpenBSD on AWS
Discovering OpenBSD on AWS
 
Operational challenges behind Serverless architectures
Operational challenges behind Serverless architecturesOperational challenges behind Serverless architectures
Operational challenges behind Serverless architectures
 
Deep dive in Docker Overlay Networks
Deep dive in Docker Overlay NetworksDeep dive in Docker Overlay Networks
Deep dive in Docker Overlay Networks
 
Feedback on AWS re:invent 2016
Feedback on AWS re:invent 2016Feedback on AWS re:invent 2016
Feedback on AWS re:invent 2016
 
Early recognition of encryted applications
Early recognition of encryted applicationsEarly recognition of encryted applications
Early recognition of encryted applications
 
Early application identification. CONEXT 2006
Early application identification. CONEXT 2006Early application identification. CONEXT 2006
Early application identification. CONEXT 2006
 

Dernier

🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 

Dernier (20)

🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 

Kubernetes DNS Horror Stories

  • 1. Laurent Bernaille Staff Engineer, Datadog Kubernetes DNS Horror Stories (and how to avoid them)
  • 2. @lbernail Datadog Over 350 integrations Over 1,200 employees Over 8,000 customers Runs on millions of hosts Trillions of data points per day 10000s hosts in our infra 10s of k8s clusters with 100-4000 nodes Multi-cloud Very fast growth
  • 3. @lbernail Challenges ● Scalability ● Containerization ● Consistency across cloud providers ● Networking ● Ecosystem youth ● Edge-cases
  • 4. @lbernail What we did not expect Our Kubernetes DNS infrastructure currently serves ~200 000 DNS queries per second Largest cluster is at ~60 000 DNS queries per second DNS is one of your more critical services when running Kubernetes
  • 5. @lbernail Outline ● DNS in Kubernetes ● Challenges ● "Fun" stories ● What we do now
  • 7. @lbernail How it works (by default) kubelet pod pod container container /etc/resolv.conf search <namespace>.svc.cluster.local svc.cluster.local cluster.local ec2.internal nameserver <dns service> options ndots:5 kubelet configuration ClusterDNS: <dns service>
  • 8. @lbernail Accessing DNS kube-proxy pod container proxier <dns service> DNS DNS DNS kubedns / Coredns apiserver watch services Upstream resolver DNS config cluster.local: - authoritative - get services from apiserver .: - forward to upstream resolver
  • 9. @lbernail Theory: Scenario 1 Pod in namespace "metrics", requesting service "points" in namespace "metrics" getent ahosts points 1. points has less than 5 dots 2. With first search domain: points.metrics.svc.cluster.local 3. Answer
  • 10. @lbernail Theory: Scenario 2 Pod in namespace "logs", requesting service "points" in namespace "metrics" getent ahosts points.metrics 1. points.metrics has less than 5 dots 2. With first search domain: points.metrics.logs.svc.cluster.local 3. Answer: NXDOMAIN 4. With second domain points.metrics.svc.cluster.local 5. Answer
  • 12. @lbernail In practice Pod in namespace default, requesting www.google.com (GKE) getent ahosts www.google.com 10.220.1.4.58137 > 10.128.32.10.53: A? www.google.com.default.svc.cluster.local. (58) 10.220.1.4.58137 > 10.128.32.10.53: AAAA? www.google.com.default.svc.cluster.local. (58) 10.128.32.10.53 > 10.220.1.4.58137: NXDomain 0/1/0 (151) 10.128.32.10.53 > 10.220.1.4.58137: NXDomain 0/1/0 (151) 10.220.1.4.54960 > 10.128.32.10.53: A? www.google.com.svc.cluster.local. (50) 10.220.1.4.54960 > 10.128.32.10.53: AAAA? www.google.com.svc.cluster.local. (50) 10.128.32.10.53 > 10.220.1.4.54960: NXDomain 0/1/0 (143) 10.128.32.10.53 > 10.220.1.4.54960: NXDomain 0/1/0 (143) 10.220.1.4.51754 > 10.128.32.10.53: A? www.google.com.cluster.local. (46) 10.220.1.4.51754 > 10.128.32.10.53: AAAA? www.google.com.cluster.local. (46) 10.128.32.10.53 > 10.220.1.4.51754: NXDomain 0/1/0 (139) 10.128.32.10.53 > 10.220.1.4.51754: NXDomain 0/1/0 (139) 10.220.1.4.42457 > 10.128.32.10.53: A? www.google.com.c.sandbox.internal. (59) 10.220.1.4.42457 > 10.128.32.10.53: AAAA? www.google.com.c.sandbox.internal. (59) 10.128.32.10.53 > 10.220.1.4.42457: NXDomain 0/1/0 (148) 10.128.32.10.53 > 10.220.1.4.42457: NXDomain 0/1/0 (148) 10.220.1.4.45475 > 10.128.32.10.53: A? www.google.com.google.internal. (48) 10.220.1.4.45475 > 10.128.32.10.53: AAAA? www.google.com.google.internal. (48) 10.128.32.10.53 > 10.220.1.4.45475: NXDomain 0/1/0 (137) 10.128.32.10.53 > 10.220.1.4.45475: NXDomain 0/1/0 (137) 10.220.1.4.40634 > 10.128.32.10.53: A? www.google.com. (32) 10.220.1.4.40634 > 10.128.32.10.53: AAAA? www.google.com. (32) 10.128.32.10.53 > 10.220.1.4.40634: 3/0/0 AAAA 2a00:1450:400c:c0b::67 10.128.32.10.53 > 10.220.1.4.40634: 6/0/0 A 173.194.76.103 12 queries! Reasons: - 5 search domains - IPv6 Problems: - latency - packet loss => 5s delay - load on DNS infra
  • 14. @lbernail IPv6? getaddrinfo will do IPv4 and IPv6 queries by "default" The Good: We're ready for IPv6! The Bad: Not great, because it means twice the amount of traffic The Ugly: IPv6 resolution triggers packet losses in the Kubernetes context - Accessing the DNS service requires NAT - Race condition in netfilter when 2 packets are sent within microseconds - Patched in the kernel (4.19+) - Detailed issue: https://github.com/kubernetes/kubernetes/issues/56903 If this happens for any of the 10+ queries, resolution takes 5s (at least) Impact is far lower with IPVS (no DNAT)
  • 15. @lbernail Let's disable IPv6! GRUB_CMDLINE_LINUX_DEFAULT="$GRUB_CMDLINE_LINUX_DEFAULT ipv6.disable=1" getent ahosts www.google.com IP 10.140.216.13.53705 > 10.129.192.2.53: A? www.google.com.datadog.svc.cluster.local. (63) IP 10.129.192.2.53 > 10.140.216.13.53705: NXDomain*- 0/1/0 (171) IP 10.140.216.13.34772 > 10.129.192.2.53: A? www.google.com.svc.cluster.local. (55) IP 10.129.192.2.53 > 10.140.216.13.34772: NXDomain*- 0/1/0 (163) IP 10.140.216.13.54617 > 10.129.192.2.53: A? www.google.com.cluster.local. (51) IP 10.129.192.2.53 > 10.140.216.13.54617: NXDomain*- 0/1/0 (159) IP 10.140.216.13.55732 > 10.129.192.2.53: A? www.google.com.ec2.internal. (45) IP 10.129.192.2.53 > 10.140.216.13.55732: NXDomain 0/0/0 (45) IP 10.140.216.13.36991 > 10.129.192.2.53: A? www.google.com. (32) IP 10.129.192.2.53 > 10.140.216.13.36991: 1/0/0 A 172.217.2.100 (62) Much better ! No IPv6: no risk of drops, 50% less traffic
  • 16. @lbernail We wanted to celebrate, but... A AAAA Still a lot of AAAA queries Where are they coming from?
  • 17. @lbernail What triggers IPv6? According to POSIX, getaddrinfo should do IPv4 and IPv6 by default BUT glibc includes hint AI_ADDRCONFIG by default According to POSIX.1, specifying hints as NULL should cause ai_flags to be assumed as 0. The GNU C library instead assumes a value of (AI_V4MAPPED | AI_ADDRCONFIG) for this case, since this value is considered an improvement on the specification. AND AI_ADDRCONFIG only makes IPv6 queries if it finds an IPv6 address man getaddrinfo, glibc If hints.ai_flags includes the AI_ADDRCONFIG flag, then IPv4 addresses are returned in the list pointed to by res only if the local system has at least one IPv4 address configured So wait, disabling IPv6 should just work, right?
  • 18. @lbernail Alpine and Musl Turns out Musl implements the POSIX spec, and sure enough: getaddrinfo( "echo" , NULL , NULL , &servinfo)o) Ubuntu base image 10.140.216.13.52563 > 10.129.192.2.53: A? echo.datadog.svc.fury.cluster.local. (53) 10.129.192.2.53 > 10.140.216.13.52563: 1/0/0 A 10.129.204.147 (104) Alpine base image 10.141.90.160.46748 > 10.129.192.2.53: A? echo.datadog.svc.cluster.local. (53) 10.141.90.160.46748 > 10.129.192.2.53: AAAA? echo.datadog.svc.cluster.local. (53) 10.129.192.2.53 > 10.141.90.160.46748: 0/1/0 (161) 10.129.192.2.53 > 10.141.90.160.46748: 1/0/0 A 10.129.204.147 (104) Service in the same namespace No hints (use defaults) But, we don't use alpine that much. So what is happening?
  • 19. @lbernail We use Go a lot 10.140.216.13.55929 > 10.129.192.2.53: AAAA? www.google.com.datadog.svc.cluster.local. (63) 10.140.216.13.46751 > 10.129.192.2.53: A? www.google.com.datadog.svc.cluster.local. (63) 10.129.192.2.53 > 10.140.216.13.46751: NXDomain*- 0/1/0 (171) 10.129.192.2.53 > 10.140.216.13.55929: XDomain*- 0/1/0 (171) 10.140.216.13.57414 > 10.129.192.2.53: AAAA? www.google.com.svc.cluster.local. (55) 10.140.216.13.54192 > 10.129.192.2.53: A? www.google.com.svc.cluster.local. (55) 10.129.192.2.53 > 10.140.216.13.57414: NXDomain*- 0/1/0 (163) 10.129.192.2.53 > 10.140.216.13.54192: NXDomain*- 0/1/0 (163) net.ResolveTCPAddr("tcp", "www.google.com:80"), on Ubuntu IPv6 is back...
  • 20. @lbernail What about CGO? 10.140.216.13.55929 > 10.129.192.2.53: AAAA? www.google.com.datadog.svc.cluster.local. (63) 10.140.216.13.46751 > 10.129.192.2.53: A? www.google.com.datadog.svc.cluster.local. (63) 10.129.192.2.53 > 10.140.216.13.46751: NXDomain*- 0/1/0 (171) 10.129.192.2.53 > 10.140.216.13.55929: XDomain*- 0/1/0 (171) ... Native Go CGO: export GODEBUG=netdns=cgo 10.140.216.13.49382 > 10.129.192.2.53: A? www.google.com.datadog.svc.cluster.local. (63) 10.140.216.13.49382 > 10.129.192.2.53: AAAA? www.google.com.datadog.svc.cluster.local. (63) 10.129.192.2.53 > 10.140.216.13.49382: NXDomain*- 0/1/0 (171) 10.129.192.2.53 > 10.140.216.13.49382: NXDomain*- 0/1/0 (171) ... Was GODEBUG ignored?
  • 21. @lbernail Subtle difference 10.140.216.13.55929 > 10.129.192.2.53: AAAA? www.google.com.datadog.svc.cluster.local. (63) 10.140.216.13.46751 > 10.129.192.2.53: A? www.google.com.datadog.svc.cluster.local. (63) Native Go CGO: export GODEBUG=netdns=cgo 10.140.216.13.49382 > 10.129.192.2.53: A? www.google.com.datadog.svc.cluster.local. (63) 10.140.216.13.49382 > 10.129.192.2.53: AAAA? www.google.com.datadog.svc.cluster.local. (63) Native Go uses a different source port for A and AAAA
  • 22. @lbernail CGO implementation https://github.com/golang/go/blob/master/src/net/cgo_linux.go So we can't really avoid IPv6 queries in Go unless we change the app net.ResolveTCPAddr("tcp4", "www.google.com") But only with CGO... No AI_ADDRCONFIG
  • 24. @lbernail Coredns Autopath With this option, coredns knows the search domains and finds the right record 10.140.216.13.37164 > 10.129.192.23.53: A? google.com.datadog.svc.cluster.local. (58) 10.140.216.13.37164 > 10.129.192.23.53: AAAA? google.com.datadog.svc.cluster.local. (58) 10.129.192.23.53 > 10.140.216.13.37164: 2/0/0 CNAME google.com., A 216.58.218.238 (98) 10.129.192.23.53 > 10.140.216.13.37164: 2/0/0 CNAME google.com., AAAA 2607:f8b0:4004:808::200e (110) Much better: only 2 queries instead of 10+ But Requires to run the Coredns Kubernetes plugin with "pods: verified" - To infer the full search domain (pod namespace) - Memory requirements becomes proportional to number of pods - Several OOM-killer incidents for us
• 25. @lbernail Node-local-dns
[diagram: pod → node-local dns (daemonset pod in hostnet, local bind) → tcp:<dns service> (kubedns/coredns behind kube-proxy, apiserver watch services) and tcp:<cloud resolver> upstream]
node-local-dns daemonset:
- Binds a non-routable IP (169.x.y.z)
- Cache
- Proxies Kubernetes queries to the DNS service over TCP
- Proxies non-Kubernetes queries directly upstream
Wins:
- Reduces load: cache, and non-Kubernetes queries bypass the service
- Mitigates the netfilter race condition
• 27. @lbernail Initial state
[diagram: pod 1 → IPVS VIP:53 → DNS A / DNS B / DNS C; kube-proxy watches services via the apiserver]
IPVS conntrack: IP1:P1 => VIP:53 ⇔ IP1:P1 => IPA:53 (expires in 300s)
• 28. @lbernail Pod A deleted
[diagram: same setup, DNS A removed from the VIP:53 backends]
IPVS conntrack:
IP1:P1 => VIP:53 ⇔ IP1:P1 => IPA:53 (200s)
IP1:P2 => VIP:53 ⇔ IP1:P2 => IPB:53 (300s)
• 29. @lbernail Source port reuse
[diagram: same setup]
IPVS conntrack:
IP1:P1 => VIP:53 ⇔ IP1:P1 => IPA:53 (100s)
IP1:P2 => VIP:53 ⇔ IP1:P2 => IPB:53 (200s)
What if a new query reuses P1 while the entry is still in the IPVS conntrack? The packet is silently dropped.
This situation can be frequent for applications sending a lot of DNS queries.
• 30. @lbernail Mitigation #1
[diagram: same setup]
IPVS conntrack:
IP1:P1 => VIP:53 ⇔ IP1:P1 => IPA:53 (100s)
IP1:P2 => VIP:53 ⇔ IP1:P2 => IPB:53 (200s)
expire_nodest_conn=1
New packet from IP1:P1?
- Delete the entry from the IPVS conntrack
- Send ICMP Port Unreachable
Better, but under load this can still trigger many errors.
• 31. @lbernail Mitigation #2
[diagram: same setup]
IPVS conntrack: IP1:P2 => VIP:53 ⇔ IP1:P2 => IPB:53 (29s)
expire_nodest_conn=1, low UDP timeout (30s) [kube-proxy 1.18+]
Likelihood of port collision much lower, but still some errors.
Kernel patch to expire entries on backend deletion (5.9+) by @andrewsykim
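For reference, a sketch of the knobs these mitigations rely on, assuming IPVS-mode kube-proxy:

    # Mitigation #1: expire IPVS conntrack entries whose backend was
    # deleted, instead of silently dropping packets destined to it.
    sysctl -w net.ipv4.vs.expire_nodest_conn=1

    # Mitigation #2: kube-proxy 1.18+ exposes the IPVS UDP timeout.
    kube-proxy --proxy-mode=ipvs --ipvs-udp-timeout=30s ...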
  • 33. Sometimes your DNS is unstable
• 34. @lbernail Coredns getting OOM-killed
Some coredns pods are getting OOM-killed. Not great for apps...
[graph: coredns memory usage / limit]
• 35. @lbernail Coredns getting OOM-killed
[graph: coredns memory usage / limit, spiking after an apiserver restart]
Apiserver restarted => coredns pods reconnected => startup requires more memory => too much memory => OOM
Autopath with "pods verified" makes sizing hard.
• 37. @lbernail Proportional autoscaler
[graph: number of coredns replicas]
Proportional-autoscaler for coredns: fewer nodes => fewer coredns pods
• 38. @lbernail Proportional autoscaler
[graphs: exceptions due to DNS failures; number of coredns replicas]
Scaling down triggered the port-reuse issue; some applications don't like this.
  • 39. Sometimes it's not your fault
• 40. @lbernail Staging fright on AWS
Before:
cluster.local:53 {
    kubernetes cluster.local
}
.:53 {
    proxy . /etc/resolv.conf
    cache
}
After (enable autopath):
.:53 {
    kubernetes cluster.local {
        pods verified
    }
    autopath @kubernetes
    proxy . /etc/resolv.conf
}
A simple change, right? Can you spot what broke the staging cluster?
• 41. @lbernail Staging fright on AWS
Before:
cluster.local:53 {
    kubernetes cluster.local
}
.:53 {
    proxy . /etc/resolv.conf
    cache
}
After (enable autopath):
.:53 {
    kubernetes cluster.local {
        pods verified
    }
    autopath @kubernetes
    proxy . /etc/resolv.conf
}
With this change we disabled caching for proxied queries.
The AWS resolver has a strict limit: 1024 packets/second to the resolver per ENI.
A large proportion of forwarded queries got dropped.
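The straightforward fix, as a sketch: keep autopath but put the cache back in the same server block:

    .:53 {
        kubernetes cluster.local {
            pods verified
        }
        autopath @kubernetes
        proxy . /etc/resolv.conf
        cache
    }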
• 42. @lbernail Staging fright on AWS #2
Before:
cluster.local:53 {
    kubernetes cluster.local
}
.:53 {
    proxy . /etc/resolv.conf
    cache
}
After:
cluster.local:53 {
    kubernetes cluster.local
}
.:53 {
    proxy . /etc/resolv.conf
    cache
    force_tcp
}
Let's avoid UDP for upstream queries: no truncation, fewer errors. Sounds like a good idea?
• 43. @lbernail Staging fright on AWS #2
Before:
cluster.local:53 {
    kubernetes cluster.local
}
.:53 {
    proxy . /etc/resolv.conf
    cache
}
After:
cluster.local:53 {
    kubernetes cluster.local
}
.:53 {
    proxy . /etc/resolv.conf
    cache
    force_tcp
}
The AWS resolver has a strict limit: 1024 packets/second to the resolver per ENI.
DNS queries over TCP use at least 5x more packets.
Don't query the AWS resolvers using TCP.
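The setup shown at the end of this deck does the opposite for upstream traffic; as a sketch, using the forward plugin's prefer_udp option:

    .:53 {
        forward . /etc/resolv.conf {
            prefer_udp
        }
        cache
    }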
  • 44. Sometimes it's really not your fault
• 46. @lbernail Upstream DNS issue
[graphs: application resolution errors; DNS queries by zone]
Sharp drop in the number of queries for zone "."
• 47. @lbernail Upstream DNS issue
[graphs: application resolution errors; DNS queries by DNS zone; forward health-check failures by upstream/AZ; forward query latency by upstream/AZ]
Upstream resolver issue in a single AZ (provider DNS, zone "A")
  • 48. Sometimes you have to remember pods run on nodes
• 49. @lbernail Node issue
Familiar symptom: some applications have issues due to DNS errors.
[graph: DNS errors for app]
• 50. @lbernail Node issue
Familiar symptom: some applications have issues due to DNS errors.
[graphs: DNS errors for app; conntrack entries on hosts running coredns pods, ~130k entries]
• 51. @lbernail Node issue
Familiar symptom: some applications have issues due to DNS errors.
[graphs: DNS errors for app; conntrack entries on hosts running coredns pods, ~130k entries]
~130k entries is suspiciously close to the default limit: --conntrack-min defaults to 131072.
Mitigation: increase the number of pods and nodes.
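A sketch of where this limit lives (default values as cited above):

    # Effective conntrack table size on a node:
    sysctl net.netfilter.nf_conntrack_max
    # kube-proxy derives it from these flags (defaults shown):
    kube-proxy --conntrack-min=131072 --conntrack-max-per-core=32768 ...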
• 52. @lbernail Something weird
[graph: one group of nodes with ~130k conntrack entries, another group with ~60k entries]
• 53. @lbernail Something weird
Kernel 4.15 vs kernel 5.0
Kernel patches in 5.0+ improve conntrack behavior with UDP => conntrack entries for DNS get a 30s TTL instead of 180s
Details:
● netfilter: conntrack: udp: only extend timeout to stream mode after 2s
● netfilter: conntrack: udp: set stream timeout to 2 minutes
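To check the relevant timeouts on a node, a sketch (these are the standard netfilter sysctls; the values in the comments follow the patches above):

    # One-shot UDP flows (the DNS case once the 5.0 patches are in):
    sysctl net.netfilter.nf_conntrack_udp_timeout          # 30s
    # "Stream" flows: in 5.0+ a flow must stay active for 2s before it
    # is promoted to stream mode, and the stream timeout is 2 minutes.
    sysctl net.netfilter.nf_conntrack_udp_timeout_stream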
• 55. @lbernail DNS is broken for a single app
Symptom: pgbouncer can't connect to postgres after a coredns update.
But everything else works completely fine.
• 56. @lbernail DNS is broken for a single app
Let's capture and analyze DNS queries:
10.143.217.162 41005 1 0 BAcKEnd.DAtAdOG.serViCe.CONsul.DATAdog.svC.clusTEr.LOCaL
10.143.217.162 41005 28 0 BaCKEND.dATaDOg.sErvicE.cONSuL.dataDog.svc.ClUsTER.loCaL
10.129.224.2 53 1 3 0 BAcKEnd.DAtAdOG.serViCe.CONsul.DATAdog.svC.clusTEr.LOCaL
10.129.224.2 53 28 3 0 BaCKEND.dATaDOg.sErvicE.cONSuL.dataDog.svc.ClUsTER.loCaL
[...]
10.143.217.162 41005 1 0 bACKEND.dataDog.SeRvICe.CONsUl
10.143.217.162 41005 28 0 BaCkenD.DatADOg.ServiCE.COnsUL
10.129.224.2 53 1 0 1 bACKEND.dataDog.SeRvICe.CONsUl
Columns: source IP and port, query type (1: A, 28: AAAA), rcode on answers (3: NXDOMAIN, 0: NOERROR), TC flag, queried domain.
Same source port across all queries?? Random case in the queried domain??
The random case comes from an IETF draft to improve DNS security: "Use of Bit 0x20 in DNS Labels to Improve Transaction Identity".
This DNS client is clearly not one we know about.
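To make the captures above recognizable, a sketch of a 0x20-style query using the third-party github.com/miekg/dns library (our choice for illustration; this is not the client pgbouncer embedded):

    package main

    import (
        "fmt"
        "math/rand"
        "strings"

        "github.com/miekg/dns"
    )

    // randomize0x20 flips the case of each letter at random, as in the
    // "Use of Bit 0x20 in DNS Labels" draft: the client checks that the
    // answer echoes the exact casing, making off-path spoofing harder.
    func randomize0x20(name string) string {
        var b strings.Builder
        for _, c := range name {
            if rand.Intn(2) == 0 {
                b.WriteString(strings.ToUpper(string(c)))
            } else {
                b.WriteString(strings.ToLower(string(c)))
            }
        }
        return b.String()
    }

    func main() {
        m := new(dns.Msg)
        m.SetQuestion(dns.Fqdn(randomize0x20("backend.datadog.service.consul")), dns.TypeA)
        // 10.129.224.2:53 is the resolver seen in the capture above.
        r, err := dns.Exchange(m, "10.129.224.2:53")
        if err != nil {
            panic(err)
        }
        fmt.Println(r)
    }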
• 57. @lbernail DNS is broken for a single app
10.143.217.162 41005 1 0 BAcKEnd.DAtAdOG.serViCe.CONsul.DATAdog.svC.clusTEr.LOCaL
10.143.217.162 41005 28 0 BaCKEND.dATaDOg.sErvicE.cONSuL.dataDog.svc.ClUsTER.loCaL
10.129.224.2 53 1 3 0 BAcKEnd.DAtAdOG.serViCe.CONsul.DATAdog.svC.clusTEr.LOCaL
10.129.224.2 53 28 3 0 BaCKEND.dATaDOg.sErvicE.cONSuL.dataDog.svc.ClUsTER.loCaL
[...]
10.143.217.162 41005 1 0 bACKEND.dataDog.SeRvICe.CONsUl
10.143.217.162 41005 28 0 BaCkenD.DatADOg.ServiCE.COnsUL
10.129.224.2 53 1 0 1 bACKEND.dataDog.SeRvICe.CONsUl
The last answer has the Truncate bit (TC) set.
pgbouncer was compiled with evdns, which doesn't support TCP upgrade (and just ignores the answer if TC=1).
The previous coredns version was not setting the TC bit when the upstream TCP answer was too large (that bug has since been fixed).
Recompiling pgbouncer with c-ares fixed the problem.
• 59. Sometimes it's not DNS (rarely)
• 60. @lbernail Sometimes it's not DNS
Logs are full of DNS errors.
[graph: % of errors in app]
• 61. @lbernail Sometimes it's not DNS
Logs are full of DNS errors, but also a sharp drop in traffic received by nodes??
[graphs: % of errors in app; average #packets received by nodegroups]
• 62. @lbernail Sometimes it's not DNS
Logs full of DNS errors, a sharp drop in traffic received by nodes...
[graphs: % of errors in app; average #packets received by nodegroups; TCP retransmits by AZ-pairs]
High proportion of drops for any traffic involving zone "b".
Confirmed transient issue from the provider.
Not really DNS that time, but DNS was the first visible impact.
  • 63. What we run now
• 64. @lbernail Our DNS setup
[diagram: pod → node-local dns (daemonset pod binding 169.254.20.10) → TCP to the coredns service (via IPVS/kube-proxy) for cluster.local, UDP to the cloud resolver for everything else]
node-local-dns:
- cluster.local: forward to <dns svc>, force_tcp, cache
- .: forward to <cloud resolver>, prefer_udp, cache
coredns:
- cluster.local: kubernetes plugin (pods disabled), cache
- .: forward to <cloud resolver>, prefer_udp, cache
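Expressed as Corefiles, roughly — a sketch assembled from the diagram above, with <dns svc> and <cloud resolver> as placeholders:

    # node-local-dns
    cluster.local:53 {
        forward . <dns svc> {
            force_tcp
        }
        cache
    }
    .:53 {
        forward . <cloud resolver> {
            prefer_udp
        }
        cache
    }

    # coredns
    cluster.local:53 {
        kubernetes cluster.local {
            pods disabled
        }
        cache
    }
    .:53 {
        forward . <cloud resolver> {
            prefer_udp
        }
        cache
    }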
• 65. @lbernail Our DNS setup
Container /etc/resolv.conf:
search <namespace>.svc.cluster.local svc.cluster.local cluster.local ec2.internal
nameserver 169.254.20.10
nameserver <dns svc>
options ndots:5 timeout:1
• 66. @lbernail Our DNS setup
Default container /etc/resolv.conf: as above, with ndots:5.
Alternate /etc/resolv.conf, opt-in using annotations (mutating webhook):
search svc.cluster.local
nameserver 169.254.20.10
nameserver <dns svc>
options ndots:2 timeout:1
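Without a custom webhook, stock Kubernetes can produce a similar per-pod resolv.conf through the pod's dnsConfig; a sketch (pod name and image are placeholders):

    apiVersion: v1
    kind: Pod
    metadata:
      name: low-ndots-example        # hypothetical name
    spec:
      dnsPolicy: "None"              # ignore the kubelet-generated resolv.conf
      dnsConfig:
        nameservers:
          - 169.254.20.10            # node-local-dns bind address
        searches:
          - svc.cluster.local
        options:
          - name: ndots
            value: "2"
          - name: timeout
            value: "1"
      containers:
        - name: app
          image: ubuntu              # placeholder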
• 68. @lbernail Conclusion
● Running Kubernetes means running DNS
● DNS is hard, especially at scale
● Recommendations
○ Use node-local-dns and cache everywhere you can
○ Test your DNS infrastructure (load tests, rolling updates)
○ Understand the upstream DNS services you depend on
● For your applications
○ Try to standardize on a few resolvers
○ Use async resolution/long-lived connections whenever possible
○ Include DNS failures in your (chaos) tests