This document discusses file systems and distributed file systems. It describes how file systems work, covering hardware, partitions, logical volume management (LVM), and basic and distributed file systems. It focuses on two distributed file systems, GlusterFS and NFS. GlusterFS supports several volume types, including distributed, replicated, distributed-replicated, and striped. NFS provides network access but no redundancy. The document also discusses storage solutions for AI training workloads, including Pure Storage FlashBlade and AIRI systems, which are optimized for the high-performance needs of AI.
4. Hardware
HDD/SSD
We won’t use the whole disk
You need to plan how to use the disk
○ How many partitions
○ Size of each partition
/dev/sda (100G)
○ /dev/sda1 (50G)
○ /dev/sda2 (25G)
○ /dev/sda3 (25G)
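A partition plan like the one above can be sanity-checked with a few lines of Python. This is only a minimal sketch: the `plan` dictionary and the `check_plan` helper are illustrative names, not part of any partitioning tool.

```python
# Hypothetical partition plan for the 100 GB disk above, sizes in GB.
DISK_SIZE_GB = 100
plan = {"/dev/sda1": 50, "/dev/sda2": 25, "/dev/sda3": 25}


def check_plan(disk_size_gb, partitions):
    """Return the unallocated space in GB; raise if the plan does not fit."""
    used = sum(partitions.values())
    if used > disk_size_gb:
        raise ValueError(f"plan needs {used} GB but the disk has {disk_size_gb} GB")
    return disk_size_gb - used


free_gb = check_plan(DISK_SIZE_GB, plan)
print(f"unallocated: {free_gb} GB")  # prints "unallocated: 0 GB"
```

The plan uses the disk completely, so any later change to one partition forces a change to the others.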
8. LVM
It’s impossible to predict the usage of each partition.
Sometimes you need to re-partition the whole disk to fit your users’ needs.
We can use LVM instead
○ Logical Volume Management
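The idea behind LVM can be sketched as a toy model: partitions are pooled into one volume group, and logical volumes are carved out of that pool and grown later without re-partitioning. The `VolumeGroup` class and the volume names below are assumptions made for illustration; real LVM allocates in extents and is driven by commands such as `pvcreate`, `vgcreate`, `lvcreate`, and `lvextend`.

```python
class VolumeGroup:
    """Toy model of an LVM volume group: a pool of space shared by logical volumes."""

    def __init__(self, physical_volumes_gb):
        # Pool the capacity of every physical volume (a partition or a whole disk).
        self.capacity_gb = sum(physical_volumes_gb)
        self.logical_volumes = {}  # name -> size in GB

    def free_gb(self):
        return self.capacity_gb - sum(self.logical_volumes.values())

    def create(self, name, size_gb):
        if size_gb > self.free_gb():
            raise ValueError("not enough free space in the volume group")
        self.logical_volumes[name] = size_gb

    def extend(self, name, extra_gb):
        # Growing a volume only needs free space in the pool, not a new partition table.
        if extra_gb > self.free_gb():
            raise ValueError("not enough free space in the volume group")
        self.logical_volumes[name] += extra_gb


vg = VolumeGroup([50, 25, 25])  # e.g. /dev/sda1, /dev/sda2, /dev/sda3
vg.create("home", 30)
vg.create("data", 40)
vg.extend("home", 20)           # "home" grows from 30 GB to 50 GB
print(vg.logical_volumes, vg.free_gb())
```

Space that was not assigned up front stays in the pool, so the decision about which volume needs it can be made later.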
16. Distributed file system
Allows multiple hosts to access shared files over a computer network.
Lets multiple users on multiple machines share files and storage resources.
Components
○ Server
○ Client
○ Metadata Servers
Different features
○ Security/Redundancy
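How the three components interact can be sketched with a toy in-memory model. Everything here is hypothetical (class names, the round-robin placement rule, the file path); it only illustrates a client asking a metadata server where a file lives and then talking to that storage server, not how any particular distributed file system is implemented.

```python
class StorageServer:
    """Holds file contents."""

    def __init__(self, name):
        self.name = name
        self.files = {}  # path -> bytes


class MetadataServer:
    """Remembers which storage server holds each path."""

    def __init__(self, servers):
        self.servers = servers
        self.locations = {}  # path -> StorageServer

    def place(self, path):
        # Assumed placement rule: simple round-robin over the servers.
        if path not in self.locations:
            index = len(self.locations) % len(self.servers)
            self.locations[path] = self.servers[index]
        return self.locations[path]

    def lookup(self, path):
        return self.locations[path]


class Client:
    """Runs on any host; all clients see the same namespace."""

    def __init__(self, metadata):
        self.metadata = metadata

    def write(self, path, data):
        self.metadata.place(path).files[path] = data

    def read(self, path):
        return self.metadata.lookup(path).files[path]


servers = [StorageServer("s1"), StorageServer("s2")]
metadata = MetadataServer(servers)
host_a, host_b = Client(metadata), Client(metadata)

host_a.write("/shared/report.txt", b"hello")
print(host_b.read("/shared/report.txt"))  # a different host reads the same file
```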
26. Erasure Coding
Erasure Coding
○ N = k + m (data + redundancy)
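○ Take 6 = 4 + 2 as an example
10 MB file, split into 4 data fragments plus 2 redundancy fragments of 2.5 MB each
○ Server 1: 2.5 MB (data)
○ Server 2: 2.5 MB (data)
○ Server 3: 2.5 MB (data)
○ Server 4: 2.5 MB (data)
○ Server 5: 2.5 MB (redundancy)
○ Server 6: 2.5 MB (redundancy)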
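The arithmetic behind the 4 + 2 example can be worked through in a short sketch. The `erasure_layout` function is an illustrative helper, not a real library API, and the claim that any m lost fragments can be tolerated assumes an optimal (MDS) code such as Reed-Solomon.

```python
def erasure_layout(file_mb, k, m):
    """Compute fragment size and storage cost for an N = k + m erasure code."""
    n = k + m                    # total fragments, one per server
    fragment_mb = file_mb / k    # the file is split across the k data fragments
    stored_mb = fragment_mb * n  # redundancy fragments are the same size
    return {
        "n": n,
        "fragment_mb": fragment_mb,
        "stored_mb": stored_mb,
        "overhead": stored_mb / file_mb,
        "tolerated_failures": m,  # assumption: an MDS code such as Reed-Solomon
    }


layout = erasure_layout(10, k=4, m=2)
print(layout)
# {'n': 6, 'fragment_mb': 2.5, 'stored_mb': 15.0, 'overhead': 1.5, 'tolerated_failures': 2}

# For comparison: surviving 2 server failures with plain replication needs 3 copies.
replicated_mb = 10 * 3
print(replicated_mb)  # 30
```

Under these assumptions the 10 MB file costs 15 MB of raw storage, half of what three full replicas would need for the same failure tolerance.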
28. Local Mount
Read/Write from/to local disk
○ Memory
○ Disk
Can’t share data across nodes
Can share data within the same node
○ Access control
○ Read-Write many
○ Read-Write once
○ Read-Only
32. NFS
Read/Write from/to NFS
○ Memory
○ Network Access
Share data across nodes (same name)
○ Access control: none, since NFS doesn’t support those features
Ask for a data size
○ NFS doesn’t support this feature either
35. Storage systems available today are optimized for a design point that’s different from what AI truly requires
They are optimized for structured workloads (predictable, sequential access), not random patterns.
36. Pure Storage (FlashBlade)
A single storage server for NVIDIA DGX-1
Based on NFS
Storage
○ Flash-Array
○ PB
○ 17 GB/s bandwidth
○ 1.5M IOPS
Network
○ 8 * 40Gb/s
Price <$1 per GB
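The network and storage figures above use different units (Gb/s versus GB/s), so a quick conversion helps when comparing them. This is back-of-the-envelope arithmetic only: it uses the numbers from the slide, assumes 8 bits per byte, and ignores protocol overhead.

```python
# Figures taken from the slide above.
links = 8
link_gbit_per_s = 40         # each link: 40 Gb/s
storage_gbyte_per_s = 17     # stated storage bandwidth: 17 GB/s

# Convert the aggregate network capacity from gigabits to gigabytes per second.
network_gbit_per_s = links * link_gbit_per_s     # 320 Gb/s
network_gbyte_per_s = network_gbit_per_s / 8     # 40.0 GB/s

print(f"network: {network_gbyte_per_s} GB/s, storage: {storage_gbyte_per_s} GB/s")
print(network_gbyte_per_s >= storage_gbyte_per_s)  # True: the links are not the limit
```

On these raw numbers, the aggregate link capacity is more than twice the stated storage bandwidth.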