How We Defined Our Own Cloud.pdf

  1. How We Defined Our Own Cloud. May 19th, 2022. Andrew Hajinikitas, Cloud Platform Department, Rakuten Group, Inc.
  2. About Me: Andrew Hajinikitas. Has worked in the industry for 20+ years, seen a lot of change, and made a lot of change! Currently working in Rakuten's Cloud Platform Department, Core Infrastructure Section, as Vice Senior Manager. Started at Rakuten in 2018 as an Architect in the Cloud Platform Department, Core Infrastructure, Network Group. Specializing in pipelining automation, infrastructure, networking, security, and much, much more!
  3. CONTENTS: (1) Some of Rakuten's key infrastructure; (2) About the current cloud platform; (3) Potential future efforts
  4. Metal Instance: A machine is one of the fundamental bricks of our Cloud, alongside the network. We're here, providing building blocks. Labels on the slide: Metal; Containers / Hypervisors; Internet projects; Customer projects; One Cloud Services.
  5. Microservers: Quanta X11C-8N • Intel Xeon E-2100 system • Four DDR4 DIMM slots • For storage, each sled has a 2.5” SFF SATA drive bay • Each sled can use either two M.2 drives or two NF1 drives • Networking is provided by either dual 10GbE or a single 25GbE uplink. Reference Page: server/Microserver/QuantaMicro-X11C-8N
  6. Internals to a Sled. Reference Page:
  7. GPU Servers: Quanta SYS-420GP-TNAR • 2x 32-core Intel Xeon 8358 @ 2.6 GHz • 2 TB of RAM • 8x NVIDIA A100 80 GB GPUs • 2x 960 GB (OS) • 4x 3.84 TB (data) • 8x HDR (200 Gb/s, GPU) • 2x HDR (200 Gb/s, CPU) • 2x 100 Gb/s • 2x 200 Gb/s • 0.16 PFlops (FP32) • 1.25 PFlops (FP32+TC). Reference Page:
  8. GPU Servers
  9. Software Tech Stack
  10. Current Status: 9 data centers in operation in Japan and overseas (regions: JP, US, EU), providing around 30,000 servers. New Managed Services: • Expansion of new managed services • Expanding the implementation / use of study sessions for in-house users. Services: LBaaS, DBaaS, Storage, CaaS, Monitoring.
  11. What’s Coming Up? • Extension of network self-service • Making GPU resources a platform service • Improving efficiency, including DC resources • Resource-efficient physical equipment selection according to the usage of each upper service layer • HW / NW resource management and stable supply (responding to the silicon shortage) • Network CI / CD (automation of upgrades and verifications)
  12. What’s Coming Up? • We have built a full-fledged integrated private cloud to support the Rakuten ecosystem and have begun using it in production environments. • Steering towards managed services and simplifying core infrastructure. • Providing high-performance physical server / network resources at low cost. • Multi-tenant support / clarifying the scope of responsibility between each platform and user. • We are promoting the use of Rakuten services not only in Japan but also overseas, and have already expanded to 9 DCs in Japan and overseas.
  13. Goal: “Manage the Instance Life Cycle from One Dashboard”
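
The aggregate throughput figures quoted on the GPU Servers slide (0.16 PFlops FP32 and 1.25 PFlops FP32+TC for a server with 8x A100) can be sanity-checked with simple arithmetic. The sketch below is not from the talk. It assumes the per-GPU peak figures from NVIDIA's published A100 specifications (19.5 TFLOPS FP32 and 156 TFLOPS dense TF32 Tensor Core), and it assumes that "FP32+TC" on the slide refers to the TF32 Tensor Core number. The function name `server_pflops` is hypothetical.

```python
# Per-GPU peak throughput in TFLOPS. These two values are assumptions taken
# from NVIDIA's published A100 specifications, not from the slides.
A100_FP32_TFLOPS = 19.5
A100_TF32_TENSOR_CORE_TFLOPS = 156.0  # dense, without structured sparsity

GPUS_PER_SERVER = 8  # from the slide: 8x A100 per server


def server_pflops(per_gpu_tflops, gpus=GPUS_PER_SERVER):
    """Aggregate peak throughput of one server, converted to PFLOPS."""
    return per_gpu_tflops * gpus / 1000.0


fp32 = server_pflops(A100_FP32_TFLOPS)
fp32_tc = server_pflops(A100_TF32_TENSOR_CORE_TFLOPS)

print(f"FP32:    {fp32:.2f} PFlops")     # FP32:    0.16 PFlops
print(f"FP32+TC: {fp32_tc:.2f} PFlops")  # FP32+TC: 1.25 PFlops
```

Under these assumptions the rounded results match the slide, which suggests the quoted numbers are theoretical peaks summed across the eight GPUs rather than measured performance.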