
Routed Fabrics For Ceph


Chris Ellis: Routed Fabrics For Ceph


  1. Routed Fabrics For Ceph: Fast & Effective Networking For Ceph
     Chris Ellis - @intrbiz (http://intrbiz.com, chris@intrbiz.com)
     Ceph Day London 2019
  2. Hello!
     ● I’m Chris
       ○ IT jack of all trades
     ● Mostly a PostgreSQL Consultant
       ○ Full stack:
         ■ from electronic design to web dev
     ● Very much into Open Source
       ○ Started a monitoring system project a few years ago
       ○ Big openSUSE and PostgreSQL fan
     ● Been using and playing with Ceph for a couple of years
       ○ Built a small VM farm with Ceph for shared storage
  3. Routed Fabrics, Huh?
  4. Routed Fabrics, Huh?
     ● Essentially we make servers participate in routing
       ○ Every network link the server has is utilised active/active
       ○ Every server takes part in the routing protocol
       ○ The routing protocol deals with device and link failures
         ■ Data just takes another path in the event of a fault
     ● Equal Cost Multi Path (ECMP) is used to move traffic efficiently
       ○ IP packets are routed over all available links
       ○ TCP streams don’t get split across more than one path
         ■ A single stream is still limited to the bandwidth of one link
       ○ E.g. with 4x 10GbE NICs we can push 40Gb/s of traffic in aggregate
         ■ An individual TCP stream maxes out at 10Gb/s
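
To illustrate why one TCP stream stays on one link while many streams spread over all of them, here is a minimal Python sketch of per-flow ECMP selection. It is an illustration only: the `pick_link` helper, the SHA-256 hash and the example port numbers are assumptions made for this sketch, not what a switch or the Linux kernel actually uses.

```python
import hashlib
from collections import Counter

# Four equal-cost links; the names mirror the server NICs shown later in the deck.
LINKS = ["eth4", "eth5", "eth6", "eth7"]

def pick_link(src_ip, dst_ip, src_port, dst_port, proto="tcp", links=LINKS):
    """Pick one link for a flow by hashing its 5-tuple (illustrative hash only)."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = hashlib.sha256(key).digest()
    return links[int.from_bytes(digest[:4], "big") % len(links)]

# Every packet of one stream hashes to the same link,
# so that stream is capped at a single link's bandwidth.
one_stream = {pick_link("172.28.0.1", "172.28.0.2", 50000, 6800) for _ in range(1000)}

# Many streams (here: different source ports) land on different links,
# so the aggregate can use all of them.
spread = Counter(
    pick_link("172.28.0.1", "172.28.0.2", port, 6800) for port in range(40000, 44000)
)
```

Because the link is chosen per flow rather than per packet, packets within a stream are not reordered across paths, which is the trade-off behind the 10Gb/s per-stream limit.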
  5. The Build
     ● My setup is about as small as you can go
     ● It's my R&D setup
     ● It's only two switches
     ● But it's about showing that these approaches work even at small scale
       ○ All traffic is still routed
       ○ We still get all the benefits of a Routed Fabric
       ○ We can use cheap commodity switching
       ○ You don't need super high end kit to get efficiency and speed
     ● Yes, it's not a real Clos topology; you need a bigger problem domain for that
     ● This is about thinking about different ways of doing things
  6. What You’ll Need
  7. Connecting Things
  8. Connecting Things
  9. A Cunning Plan - Network Assignments
     ● Switch 1: 172.31.1.0/24
       ○ Port 1: 172.31.1.0/30
       ○ Port 2: 172.31.1.4/30
       ○ …
       ○ Port 24: 172.31.1.92/30
     ● Inter-switch: 172.31.3.0/24
       ○ Link 1: 172.31.3.0/30
       ○ Link 2: 172.31.3.4/30
       ○ …
       ○ Link 8: 172.31.3.28/30
     ● Switch 2: 172.31.2.0/24
       ○ Port 1: 172.31.2.0/30
       ○ Port 2: 172.31.2.4/30
       ○ …
       ○ Port 24: 172.31.2.92/30
     ● Ceph: 172.28.0.0/24
       ○ Node 1: 172.28.0.1/32
       ○ Node 2: 172.28.0.2/32
       ○ …
       ○ Node 12: 172.28.0.12/32
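
The addressing plan above is regular enough to compute. The following Python sketch derives the /30 for any port using the standard `ipaddress` module. The helper names `port_link` and `node_loopback` are hypothetical, and the convention that the switch takes the first host address and the server the second is inferred from the interface slides, so treat it as an assumption.

```python
import ipaddress

def port_link(switch_block: str, port: int):
    """Return (subnet, switch_ip, server_ip) for a 1-based switch port.

    Assumes consecutive /30s carved from the switch's /24, with the switch
    on the first host address and the server on the second.
    """
    subnets = list(ipaddress.ip_network(switch_block).subnets(new_prefix=30))
    subnet = subnets[port - 1]
    switch_ip = subnet.network_address + 1
    server_ip = subnet.network_address + 2
    return subnet, switch_ip, server_ip

def node_loopback(node: int, ceph_block: str = "172.28.0.0/24"):
    """Return the /32 address assigned to a 1-based Ceph node number."""
    base = ipaddress.ip_network(ceph_block).network_address
    return ipaddress.ip_interface(f"{base + node}/32")

subnet, switch_ip, server_ip = port_link("172.31.1.0/24", 24)
print(subnet, switch_ip, server_ip)
```

A /24 holds 64 such /30 subnets, so each block has room for more point-to-point links than the 24 ports used here.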
  10. Configuring Your Switches - Turn On Routing
          ip routing
          router ospf
            router-id 172.26.1.210
            network 172.31.1.0 255.255.255.0 area 0.0.0.0
            network 172.31.3.0 255.255.255.0 area 0.0.0.0
            redistribute connected
            redistribute static
          exit
  11. Configuring Your Switches - Server Interface
          interface 0/1
            mtu 9018
            routing
            ip address 172.31.1.1 255.255.255.252
            ip ospf area 0.0.0.0
          exit
  12. Configuring Your Switches - Server Interface
          interface 0/2
            mtu 9018
            routing
            ip address 172.31.1.5 255.255.255.252
            ip ospf area 0.0.0.0
          exit
  13. Configuring Your Switches - Server Interface
          interface 0/24
            mtu 9018
            routing
            ip address 172.31.1.93 255.255.255.252
            ip ospf area 0.0.0.0
          exit
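
Since the per-port stanzas differ only in port number and address, they can be generated rather than typed 24 times. This is a hedged sketch: the CLI lines are copied from the slides as-is, while the `server_port_stanza` helper and its templating are assumptions added for illustration, and the output should be checked against your own switch's syntax before use.

```python
import ipaddress

# Stanza text taken from the server-interface slides; only the placeholders are new.
STANZA = """interface 0/{port}
mtu 9018
routing
ip address {ip} 255.255.255.252
ip ospf area 0.0.0.0
exit"""

def server_port_stanza(switch_block: str, port: int) -> str:
    """Render the interface stanza for a 1-based server-facing port."""
    subnets = list(ipaddress.ip_network(switch_block).subnets(new_prefix=30))
    switch_ip = subnets[port - 1].network_address + 1  # switch takes the first host address
    return STANZA.format(port=port, ip=switch_ip)

if __name__ == "__main__":
    # Print the stanzas for all 24 ports of switch 1.
    for port in range(1, 25):
        print(server_port_stanza("172.31.1.0/24", port))
```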
  14. Configuring Your Switches - Inter Switch Interface
          interface 0/28
            mtu 9018
            routing
            ip address 172.31.3.1 255.255.255.252
            ip ospf area 0.0.0.0
          exit
  15. Configuring Your Ceph Server - Interfaces
          $> cat ifcfg-eth4
          BOOTPROTO='static'
          IPADDR='172.31.1.2/30'
          MTU='9000'

          $> cat ifcfg-eth6
          BOOTPROTO='static'
          IPADDR='172.31.1.6/30'
          MTU='9000'

          $> cat ifcfg-eth5
          BOOTPROTO='static'
          IPADDR='172.31.2.2/30'
          MTU='9000'

          $> cat ifcfg-eth7
          BOOTPROTO='static'
          IPADDR='172.31.2.6/30'
          MTU='9000'
  16. Configuring Your Ceph Server - Dummy Interface
          $> cat ifcfg-dummy0
          BOOTPROTO='static'
          IPADDR='172.28.0.1/32'
          MTU='9000'
  17. Configuring Your Ceph Server - Quagga & OSPFd
          $> zypper in quagga ospfd
  18. Configuring Your Ceph Server - Quagga
          $> cat zebra.conf
          hostname ceph1
          !
          interface eth4
           ip address 172.31.1.2/30
          interface eth5
           ip address 172.31.2.2/30
          interface eth6
           ip address 172.31.1.6/30
          interface eth7
           ip address 172.31.2.6/30
          !
  19. Configuring Your Ceph Server - OSPFd
          $> cat ospfd.conf
          hostname ceph1
          !
          interface eth4
          interface eth5
          interface eth6
          interface eth7
          router ospf
           ospf router-id 172.26.1.1
           network 172.28.0.1/32 area 0
           network 172.31.1.2/30 area 0
           network 172.31.1.6/30 area 0
           network 172.31.2.2/30 area 0
           network 172.31.2.6/30 area 0
          !
  20. Configuring Your Ceph Server - Kernel
          $> cat sysctl.conf
          # Enable IP routing
          net.ipv4.ip_forward = 1
          # Tweak ECMP policy
          net.ipv4.fib_multipath_hash_policy = 1
          net.ipv4.fib_multipath_use_neigh = 1
          # Disable reverse path filtering
          net.ipv4.conf.all.rp_filter = 0
          # Enable reverse path filtering on normal NICs
          net.ipv4.conf.bond1.rp_filter = 1
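
These kernel settings are easy to lose in a larger sysctl.conf, so a small check can help. The Python sketch below parses sysctl-style text and reports which of the routing-related settings from the slide are missing or different. The `parse_sysctl` and `missing_settings` helpers are hypothetical names introduced here, and the interface-specific `bond1` line is left out of the expected set because it depends on the host. For context, in the Linux kernel `fib_multipath_hash_policy = 1` selects layer 4 (address and port) hashing for multipath routes.

```python
# Settings taken from the slide; the bond1 line is host-specific and omitted.
EXPECTED = {
    "net.ipv4.ip_forward": "1",
    "net.ipv4.fib_multipath_hash_policy": "1",
    "net.ipv4.fib_multipath_use_neigh": "1",
    "net.ipv4.conf.all.rp_filter": "0",
}

def parse_sysctl(text: str) -> dict:
    """Parse 'key = value' lines, skipping blank lines and # or ; comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings

def missing_settings(text: str, expected: dict = EXPECTED) -> dict:
    """Return the expected settings that are absent or have a different value."""
    found = parse_sysctl(text)
    return {key: value for key, value in expected.items() if found.get(key) != value}

SAMPLE = """
# Enable IP routing
net.ipv4.ip_forward = 1
# Disable reverse path filtering
net.ipv4.conf.all.rp_filter = 0
"""

# Lists the two ECMP settings that SAMPLE does not set.
print(missing_settings(SAMPLE))
```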
  21. Configuring Your Ceph Server - Ceph
          $> cat ceph.conf
          [global]
          public_network = 172.28.0.0/24
  22. Et Voilà
          $> ip route
          172.26.28.2 proto zebra metric 20
              nexthop via 172.31.1.10 dev eth7 weight 1
              nexthop via 172.31.1.14 dev eth6 weight 1
              nexthop via 172.31.2.10 dev eth4 weight 1
              nexthop via 172.31.2.14 dev eth5 weight 1
          172.26.28.3 proto zebra metric 20
              nexthop via 172.31.1.18 dev eth7 weight 1
              nexthop via 172.31.1.22 dev eth6 weight 1
              nexthop via 172.31.2.18 dev eth4 weight 1
              nexthop via 172.31.2.22 dev eth5 weight 1
          ...
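
A quick way to confirm that every destination really has all four paths is to count the next hops per route. The Python sketch below parses `ip route` text of the shape shown on the slide. The `nexthops_by_route` helper is a hypothetical name, the sample is the first route from the slide, and the parser is only assumed to handle this multipath layout, not every form of `ip route` output.

```python
from collections import defaultdict

def nexthops_by_route(ip_route_output: str) -> dict:
    """Map each route prefix to its list of (gateway, device) next hops."""
    routes = defaultdict(list)
    current = None
    for raw in ip_route_output.splitlines():
        tokens = raw.split()
        if not tokens:
            continue
        if tokens[0] == "nexthop":
            # e.g. "nexthop via 172.31.1.10 dev eth7 weight 1"
            if current is not None and "via" in tokens and "dev" in tokens:
                via = tokens[tokens.index("via") + 1]
                dev = tokens[tokens.index("dev") + 1]
                routes[current].append((via, dev))
        else:
            # A new route line; register it even if no nexthop lines follow.
            current = tokens[0]
            routes[current]
    return dict(routes)

SAMPLE = """172.26.28.2 proto zebra metric 20
    nexthop via 172.31.1.10 dev eth7 weight 1
    nexthop via 172.31.1.14 dev eth6 weight 1
    nexthop via 172.31.2.10 dev eth4 weight 1
    nexthop via 172.31.2.14 dev eth5 weight 1
"""

for prefix, hops in nexthops_by_route(SAMPLE).items():
    print(prefix, len(hops), sorted(dev for _, dev in hops))
```

A route with fewer next hops than the server has fabric links would point at a down link or a missing OSPF adjacency.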
  23. Caveats
      ● Make sure that MTUs are configured correctly and match
        ○ OSPF runs directly over IP as its own protocol (number 89); if MTUs are mismatched, adjacencies can fail to form and packets get dropped
      ● Label your cables
        ○ Swapping cables around will break things
      ● Quagga will only set a default route if no default route is already defined
        ○ OSPFd needs: `default-information originate metric-type 1`
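
On the MTU caveat: the slides set `mtu 9018` on the switch ports but `MTU='9000'` on the server NICs. One plausible reading, and it is an assumption rather than something the slides state, is that the switch value is a maximum Ethernet frame size while the Linux value counts only the IP payload. The worked arithmetic below shows that the two numbers agree under that assumption; check your switch's documentation for what its `mtu` setting actually measures.

```python
ETH_HEADER = 14  # destination MAC (6) + source MAC (6) + EtherType (2)
ETH_FCS = 4      # frame check sequence appended to each frame
VLAN_TAG = 4     # extra bytes if frames carry an 802.1Q tag

def switch_frame_size(ip_mtu: int, vlan_tagged: bool = False) -> int:
    """Frame size a switch must accept to carry packets of the given IP MTU.

    Assumes the switch setting counts the full Ethernet frame including the FCS.
    """
    return ip_mtu + ETH_HEADER + ETH_FCS + (VLAN_TAG if vlan_tagged else 0)

# The server-side MTU from the slides, converted to an untagged frame size.
print(switch_frame_size(9000))
```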
  24. Further Reading
      ● Intro to Clos networks
        ○ https://en.wikipedia.org/wiki/Clos_network
      ● Google white paper on their Clos topologies
        ○ https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43837.pdf
      ● Cumulus on Clos and ECMP
        ○ https://cumulusnetworks.com/blog/celebrating-ecmp-part-one/
      ● Benefits of ditching layer 2
        ○ https://thenewstack.io/ditch-pitfalls-layer-2-networks-modern-data-center-design/
