This document provides an introduction to debugging IPv4 networks. It outlines the key concepts and tools someone should understand, including IP addressing, routing, and common network protocols. The document then describes a process for troubleshooting network issues, starting with gathering information, checking basic connectivity using tools like ping and traceroute, and examining settings at each network layer. It recommends tools for more advanced testing of bandwidth, latency, DNS resolution, NTP synchronization and reviewing logs. The goal is to methodically work through each network layer and protocol to pinpoint where issues may be occurring.
3. What you should know
• Basic IP connectivity concepts:
▫ Know what an IP address and a netmask are (IPv4)
4. What you will learn
• Common layer-2 network protocols and their daily use
• Basic IPv4 routing and problem identification, meaning…
▫ You don’t have to solve the problems you encounter, just be able to pinpoint them.
• Common IP services, like name resolution and time synchronisation
• Open-source network tools and some basic Unix hacking skills
5. Contents
• Introduction
• What you should know
• What you will learn
• When connectivity fails…
• Before you begin
• The tools
• Network plan/map
• The tests
• The results
• Network measurements
• Post processing
• Questions
(Slide annotations grouping the items above: soft skills, “direct” tests, difficult cases; arrow label: process in time)
7. When connectivity fails…
• Oh really?
• Listen…
• …Listen carefully…
• …LISTEN…
• ((Gracefully take any insults, it’s-just-work-you-know))
• …repeat
8. Before you begin
• Can I test/simulate this SOMEWHERE ELSE?
• Baseline performance figures! (normal behavior)
• Zero-load performance figures! (single user performance)
• Peak hours? Spikes? Notorious: Batch processing/backups
• Who is involved in this?
▫ Users?
▫ Managers?
▫ 3rd parties?
• What do “they” expect from me?
▫ Follow procedures? (impact=?)
▫ Document(s)?
• Begin with the end in mind
▫ Set up a test tree (“if-this-works” then “test-that”)
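The “if-this-works then test-that” idea can be sketched as a small tree walk. This is only an illustration: the `Step` class, the step names and the hard-coded `lambda` results are hypothetical placeholders, to be replaced by real checks.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    """One node in the test tree: run `check`, descend only if it passes."""
    name: str
    check: Callable[[], bool]
    then: List["Step"] = field(default_factory=list)


def run_tree(step: Step, depth: int = 0) -> List[str]:
    """Walk the tree depth-first; return one report line per executed step."""
    ok = step.check()
    report = [f"{'  ' * depth}{step.name}: {'ok' if ok else 'FAILED'}"]
    if ok:  # follow-up tests only make sense when this one works
        for child in step.then:
            report.extend(run_tree(child, depth + 1))
    return report


# Hypothetical tree; the lambdas stand in for real probes (ping, DNS, ...).
tree = Step("link up", lambda: True, [
    Step("ping default gateway", lambda: True, [
        Step("ping remote host", lambda: False, [
            Step("resolve hostname", lambda: True),
        ]),
    ]),
])

report = run_tree(tree)
print("\n".join(report))
```

The walk stops below the first failing step, which is the point where the problem can be pinned down.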
9. The tools
• KNOW THY TOOLS… THOROUGHLY
• Learn tools in test environment.
• Do that again…
• …and again…
• …and again…
• Repeat
11. About the tests
• We’ll be following the OSI layers:
1. Physical (bits)
2. Data link (frames)
3. Network (packets)
4. Transport
5. Session
6. Presentation
7. Application
• Mnemonics, layer 1 up to layer 7:
▫ People Do Need To See Pamela Anderson
▫ Princess Diana Never Tried Shagging Prince Andrew
• Mnemonics, layer 7 down to layer 1:
▫ All People Standing Together Now Drinking Port
▫ All People Seem To Need Data Processing
12. …In theory – as short as possible
• Hardware, NIC, MAC *
• VLAN, ports & tags *
• Spanning tree *
• TCP/IP, IPv4 address space, netmask calculations
• ARP, ICMP
• UDP, TCP three-way handshake
• TLS/SSL, PKI
• NTP, DNS
* https://en.wikipedia.org/wiki/User:Jaccovanbuuren/Books/Layer_2
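For the netmask calculations, a minimal sketch with Python's standard `ipaddress` module may help. The address 192.168.223.17 with netmask 255.255.255.0 is an example value chosen for illustration, not taken from a real network plan.

```python
import ipaddress

# Example values only; 192.168.223.0/24 is a private (RFC 1918) range.
iface = ipaddress.ip_interface("192.168.223.17/255.255.255.0")
net = iface.network

print(net)                    # network in CIDR notation
print(net.netmask)            # dotted-quad netmask
print(net.broadcast_address)  # last address in the network
print(net.num_addresses - 2)  # usable host addresses (minus network and broadcast)

# Two hosts can reach each other without a router only if they share a network.
same_net = ipaddress.ip_address("192.168.223.1") in net
other_net = ipaddress.ip_address("192.168.224.1") in net
print(same_net, other_net)
```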
15. The tests – But first…
1. Identify “problem-chain”
(if more than one, pick any, all if possible)
Documentation…?... Or…
…Document-It-Yourself (DIY)
BUILD A MAP!
16. Network host discovery
• Going boldly where no packet has gone before…
▫ (ze)nmap!
▫ Zmap?
▫ Masscan??
▫ Milder: Zabbix host discovery
• … but rarely done as part of troubleshooting
Just because you can doesn’t mean that you should
17. The tests
2. Check settings at both ends, and – if possible
3. EVERYWHERE IN BETWEEN
(( Check interfaces autosense/autonegotiate, line speed and duplex settings ))
((( Layer 2 intermezzo: MAC, CDP/LLDP, STP! )))
18. The tests
4. Check the ARP cache
root@io:~# arp -an
? (192.168.223.1) at 08:00:27:60:05:2a [ether] on eth1
? (10.0.2.2) at 52:54:00:12:35:02 [ether] on eth0
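When the ARP cache gets long, a small parser helps to spot oddities such as one MAC address answering for several IPs. The sketch below assumes the Linux `arp -an` line format shown above; the helper names `parse_arp` and `duplicate_macs` are made up for this example.

```python
import re
from collections import defaultdict

# Sample text copied from the `arp -an` output above.
ARP_OUTPUT = """\
? (192.168.223.1) at 08:00:27:60:05:2a [ether] on eth1
? (10.0.2.2) at 52:54:00:12:35:02 [ether] on eth0
"""

ENTRY = re.compile(
    r"\((?P<ip>[\d.]+)\) at (?P<mac>[0-9a-f:]{17}|<incomplete>)"
    r".* on (?P<iface>\S+)",
    re.IGNORECASE,
)


def parse_arp(text):
    """Return a list of (ip, mac, interface) tuples, one per recognised line."""
    return [(m["ip"], m["mac"].lower(), m["iface"])
            for m in map(ENTRY.search, text.splitlines()) if m]


def duplicate_macs(entries):
    """Map each MAC seen for more than one IP to the list of those IPs."""
    by_mac = defaultdict(list)
    for ip, mac, _ in entries:
        if mac != "<incomplete>":  # unresolved entries carry no MAC
            by_mac[mac].append(ip)
    return {mac: ips for mac, ips in by_mac.items() if len(ips) > 1}


entries = parse_arp(ARP_OUTPUT)
print(entries)
print(duplicate_macs(entries))
```

A MAC shared by several IPs is not always wrong (proxy ARP, routers), but it is worth a second look.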
19. The tests
5. Check ICMP Echo/Echo reply a.k.a. “PING”
- Local interface
- Local network
- Ping broadcast address
- Default gateway
- Host on other network
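The ping order above can be derived from the interface settings. The sketch below only builds the list of targets and does not send any packets; `ping_plan` is a hypothetical helper, and all addresses are example values.

```python
import ipaddress


def ping_plan(interface_cidr, gateway, neighbour, remote_host):
    """Return (label, address) pairs in the order of the list above."""
    iface = ipaddress.ip_interface(interface_cidr)
    net = iface.network
    gw = ipaddress.ip_address(gateway)
    if gw not in net:
        # A gateway outside the local network is itself a finding.
        raise ValueError(f"gateway {gw} is not on local network {net}")
    return [
        ("local interface", str(iface.ip)),
        ("local network", neighbour),
        ("broadcast address", str(net.broadcast_address)),
        ("default gateway", str(gw)),
        ("host on other network", remote_host),
    ]


# Example values only (hypothetical addressing plan).
plan = ping_plan("192.168.223.17/24", "192.168.223.1",
                 "192.168.223.20", "10.0.2.2")
for label, address in plan:
    print(f"ping {address:<16} # {label}")
```

Working from the inside out means the first failing target marks roughly where the problem starts.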
20. The tests
6. Check “distances” with variable Time-To-Live (TTL) packets
(ping)
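Why a variable TTL reveals distance can be shown with a small simulation: every router decrements the TTL, and the router where it reaches zero answers with an ICMP “time exceeded”. Nothing is sent on the wire here; `probe`, `ttl_sweep` and the path (documentation-only addresses) are assumptions made for this sketch.

```python
def probe(path, ttl):
    """Simulate one packet along `path` (routers first, destination last)."""
    for router in path[:-1]:
        ttl -= 1  # every router decrements the TTL by one
        if ttl == 0:
            return ("time-exceeded", router)  # dropped here, router reports back
    return ("echo-reply", path[-1])


def ttl_sweep(path, max_ttl=30):
    """Raise the TTL step by step until the destination itself answers."""
    seen = []
    for ttl in range(1, max_ttl + 1):
        kind, source = probe(path, ttl)
        seen.append((ttl, kind, source))
        if kind == "echo-reply":
            break
    return seen


# Hypothetical path: two routers, then the destination host.
path = ["192.0.2.1", "198.51.100.1", "203.0.113.10"]
trace = ttl_sweep(path)
for ttl, kind, source in trace:
    print(ttl, kind, source)
```

The lowest TTL that yields an echo reply is the “distance” to the host; the option for setting the TTL differs between ping implementations.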
21. The tests
7. Check fragments with variable MTU sizes at distant networks.
- Set “Don’t fragment” bit…
(ping)
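The arithmetic behind this test: the ping payload that exactly fills a given MTU is the MTU minus the IPv4 header (20 bytes without options) and the ICMP echo header (8 bytes). The sketch below adds a binary search for the largest size that still passes. The `fits` callback stands in for one real ping with “Don’t fragment” set, and the 1400-byte bottleneck and the 576-byte lower bound are assumptions for illustration.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo request header


def icmp_payload_for_mtu(mtu):
    """Payload size that makes an echo request exactly `mtu` bytes long."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def find_path_mtu(fits, low=576, high=1500):
    """Binary-search the largest MTU in [low, high] for which fits(mtu) is True."""
    if not fits(low):
        raise ValueError(f"even {low} bytes does not pass")
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always shrinks
        if fits(mid):
            low = mid
        else:
            high = mid - 1
    return low


# Simulated path with a hypothetical 1400-byte bottleneck (e.g. a tunnel).
path_mtu = find_path_mtu(lambda mtu: mtu <= 1400)
print(path_mtu, icmp_payload_for_mtu(path_mtu))
print(icmp_payload_for_mtu(1500))
```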
22. The tests
8. Check “port/host unreachable” with UDP ports at distant network.
((h)ping)
23. The tests
• Check name resolution for relevant hostnames
C:\WINDOWS\system32>nslookup.exe
Default Server: nlhag999a21ads.ww002.siemens.net
Address: 139.10.220.20
> set type=txt
> set class=chaos
> version.bind
Server: nlhag999a21ads.ww002.siemens.net
Address: 139.10.220.20
version.bind.ww002.siemens.net text =
"Microsoft DNS 6.1.7601 (1DB1557D)"
> exit
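A list of relevant hostnames can also be checked in one go from a script. Note the difference with nslookup: the sketch below asks the system resolver (hosts file first, then DNS, depending on configuration), not one specific DNS server. The `resolve` helper and the names in the loop are only examples.

```python
import socket


def resolve(hostname):
    """Return the sorted unique addresses for `hostname`, or [] if it fails."""
    try:
        infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []  # name could not be resolved
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


# Example names; replace with the hostnames relevant to the problem chain.
for name in ("localhost", "127.0.0.1"):
    addresses = resolve(name)
    print(f"{name}: {', '.join(addresses) if addresses else 'NOT RESOLVED'}")
```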
27. The tests
• Check available bandwidth and latency
▫ Check on high QoS ports (SIP: 5060/5061 tcp)
• iperf
• ftp(!)
28. The results
Latency:
• Localhost
▫ <1 ms latency
• Localnet
▫ <10 ms latency
• Distant net
▫ …yeah… fuzzy…
• Bandwidth should be within 10%
(MoSCoW: Must, Should, Could, Would)
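The rules of thumb above can be written down as simple checks. Two assumptions in this sketch: “within 10%” is read as relative to the baseline figure collected beforehand, and distant networks get no fixed latency limit, so the result there is `None`.

```python
# Latency limits in milliseconds, taken from the slide.
LATENCY_LIMITS_MS = {"localhost": 1.0, "localnet": 10.0}


def latency_ok(scope, measured_ms):
    """True/False against the slide's limit; None where no fixed limit exists."""
    limit = LATENCY_LIMITS_MS.get(scope)
    return None if limit is None else measured_ms < limit


def bandwidth_ok(measured, baseline, tolerance=0.10):
    """True if `measured` deviates at most `tolerance` (10%) from `baseline`."""
    return abs(measured - baseline) <= tolerance * baseline


print(latency_ok("localhost", 0.2))
print(latency_ok("localnet", 12.0))
print(latency_ok("distant net", 80.0))
print(bandwidth_ok(measured=93.0, baseline=100.0))
print(bandwidth_ok(measured=85.0, baseline=100.0))
```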
29. …And now what?
• …Move along now, nothing to see here(?)
…or is it…?
30. Network measurements
• Port monitor at network edge (tcpdump)
• Port monitor at server farm (tcpdump)
• Routers & switches: SNMP graphs!
• Server farm: show me your SYSLOG
31. Post processing
WARNING! CODING SKILLS REQUIRED!
PROCEED WITH CAUTION!
• What am I looking for?
▫ Spikes = High Bandwidth usage
▫ Peak hours = Concurrent usage
▫ Hiccups = Recurring events
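A minimal post-processing sketch: flag spikes in a series of bandwidth samples and test whether they recur at a fixed interval. The sample values, the “3 x median” threshold and the helper names are assumptions for illustration; real data would come from the SNMP graphs or captures.

```python
from statistics import median


def find_spikes(samples, factor=3.0):
    """Return the indexes of samples larger than `factor` times the median."""
    threshold = factor * median(samples)
    return [i for i, value in enumerate(samples) if value > threshold]


def recurring_interval(indexes):
    """Return the gap between spikes if it is constant, else None."""
    gaps = {b - a for a, b in zip(indexes, indexes[1:])}
    return gaps.pop() if len(gaps) == 1 else None


# Hypothetical Mbit/s readings, one per measurement interval.
samples = [10, 12, 11, 95, 10, 13, 11, 90, 12, 10, 11, 97]

spikes = find_spikes(samples)
print("spikes at samples:", spikes)
print("recurs every", recurring_interval(spikes), "samples")
```

A constant gap between spikes points towards a scheduled job such as batch processing or backups.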