IBM Research




9P Trace and Code Walkthrough



         Eric Van Hensbergen
         IBM Austin Research Lab
         (bergevan@us.ibm.com)




                                   © 2010 IBM Corporation

Agenda
• 9P Trace analysis for common operations
    • mount
    • open + write + close
    • open + read + close
    • chmod
    • ls -l
• High level code organization
• Client and Transport Interfaces
• Important data structures and their accounting
• Code Review
    • VFS Code Review
    • Network Code Review

9P Trace: Mount 9P /mnt (Plan 9)
mount 9p /mnt
                                        Tversion(NOFID, 8216, 9P2000, “”)



                                        Tattach(1, 70, 4294967295, “ericvh”, “”)





9P Trace: Mount 9P /mnt (Linux)
mount 9p /mnt
                                        Tversion(NOFID, 8216, 9P2000, “”)



                                        Tattach(1, 70, -1, “ericvh”, “”)
                                        Twalk(1, 70, 102, array[] of {})
                                        Tstat(1, 102)
                                        Tclunk(1, 102)
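
The first two messages of the mount trace can be sketched at the byte level. The block below is a minimal user-space sketch, not the kernel's marshalling code: it assumes the 9P2000 wire layout from the Plan 9 manual pages (little-endian integers, size[4] type[1] tag[2] header, strings as a 2-byte length plus bytes), and all function names are hypothetical. The trace prints NOFID in the tag position of Tversion; the protocol constant for that slot is NOTAG, which is what the sketch uses. Callers are assumed to pass a buffer large enough for the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { TVERSION = 100, TATTACH = 104 };  /* 9P2000 message type codes */
#define NOTAG 0xFFFFu       /* tag reserved for Tversion */
#define NOFID 0xFFFFFFFFu   /* "no fid"; printed as -1 or 4294967295 in the traces */

/* Little-endian writers; each returns the position just past the field. */
static uint8_t *put_u16(uint8_t *p, uint16_t v) {
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
    return p + 2;
}
static uint8_t *put_u32(uint8_t *p, uint32_t v) {
    for (int i = 0; i < 4; i++) p[i] = (uint8_t)(v >> (8 * i));
    return p + 4;
}
/* 9P strings: 2-byte length followed by the bytes, no terminating NUL. */
static uint8_t *put_str(uint8_t *p, const char *s) {
    size_t n = strlen(s);
    assert(n <= 0xFFFF);
    p = put_u16(p, (uint16_t)n);
    memcpy(p, s, n);
    return p + n;
}

/* size[4] Tversion tag[2] msize[4] version[s]; returns the message size. */
static size_t encode_tversion(uint8_t *buf, uint32_t msize, const char *version) {
    uint8_t *p = buf + 4;                 /* leave room for size[4] */
    *p++ = TVERSION;
    p = put_u16(p, NOTAG);
    p = put_u32(p, msize);
    p = put_str(p, version);
    put_u32(buf, (uint32_t)(p - buf));    /* size[4] counts itself */
    return (size_t)(p - buf);
}

/* size[4] Tattach tag[2] fid[4] afid[4] uname[s] aname[s] */
static size_t encode_tattach(uint8_t *buf, uint16_t tag, uint32_t fid,
                             uint32_t afid, const char *uname, const char *aname) {
    uint8_t *p = buf + 4;
    *p++ = TATTACH;
    p = put_u16(p, tag);
    p = put_u32(p, fid);
    p = put_u32(p, afid);
    p = put_str(p, uname);
    p = put_str(p, aname);
    put_u32(buf, (uint32_t)(p - buf));
    return (size_t)(p - buf);
}
```

With the values from the trace, encode_tversion(buf, 8216, "9P2000") produces a 19-byte message and encode_tattach(buf, 1, 70, NOFID, "ericvh", "") a 25-byte one.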





9P Trace: Write a File (Plan 9)
echo hello > /mnt/tmp/hello.txt
fd = create(“/mnt/tmp/hello.txt”);
                          Twalk(1, 70, 59, array[] of {“tmp”})
                          Twalk(1, 59, 86, array[] of {“hello.txt”})
                          Rerror(1, “file does not exist”)
                          Twalk(1, 59, 86, nil)
                          Tcreate(1, 86, “hello.txt”, 8r666, 1)
                          Tclunk(1, 59);
pwrite(fd, “hello\n”, 6, 0);
                                          Twrite(1, 86, 0, array[6] of {“hello”})
close(fd);
                                          Tclunk(1, 86)

9P Trace: Write a File (Linux)
echo hello > /mnt/tmp/hello.txt
fd = create(“/mnt/tmp/hello.txt”);
                                            Twalk(1, 70, 103, array[] of {“tmp”})
                                            Tstat(1, 103)
                                            Twalk(1, 103, 109, nil)
                                            Twalk(1, 109, 75, nil)
                                            Twalk(1, 75, 97, array[] of {“hello.txt”})
                                            Rerror(1, “file does not exist”)
                                            Tclunk(1, 75);
                                            Tclunk(1, 109);
                                            Twalk(1, 103, 109, nil)
                                            Twalk(1, 109, 75, nil)
                                            Tcreate(1, 75, “hello.txt”, 8r666, 1)
                                            Twalk(1, 109, 99, nil)
                                            Twalk(1, 99, 107, nil)
                                            Twalk(1, 107, 110, array[] of {“hello.txt”})
                                            Tclunk(1, 107)
                                            Tclunk(1, 99)
                                            Tclunk(1, 109)
                                            Tstat(1, 110)
pwrite(fd, “hello\n”, 6, 0);                Twrite(1, 75, 0, array[6] of {“hello”})
                                            Tclunk(1, 75)
close(fd);                                  Tclunk(1, 110)
                                            Tclunk(1, 103)
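
One way to read the Linux write trace is as a fid accounting exercise: every fid bound by a successful Twalk should eventually be clunked. The sketch below is a hypothetical user-space checker (the types, names, and the MAX_FID bound are all assumptions) that replays only the fid-affecting messages of the trace above. As a simplification it treats a walk onto an already-live fid as an error, although the protocol permits newfid to equal fid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_FID 256   /* hypothetical bound, large enough for this trace */

typedef enum { OP_WALK, OP_WALK_FAIL, OP_CLUNK } op_kind;
typedef struct { op_kind kind; unsigned fid; } trace_op;  /* fid = newfid for walks */

/* Replays ops against the live[] table. Returns the number of fids still
 * live afterwards, or -1 on a double bind or a clunk of a dead fid. */
static int replay(bool live[MAX_FID], const trace_op *ops, size_t n) {
    for (size_t i = 0; i < n; i++) {
        unsigned f = ops[i].fid;
        if (f >= MAX_FID) return -1;
        switch (ops[i].kind) {
        case OP_WALK:
            if (live[f]) return -1;
            live[f] = true;
            break;
        case OP_WALK_FAIL:      /* Rerror: newfid is never bound */
            break;
        case OP_CLUNK:
            if (!live[f]) return -1;
            live[f] = false;
            break;
        }
    }
    int count = 0;
    for (unsigned f = 0; f < MAX_FID; f++) count += live[f] ? 1 : 0;
    return count;
}

/* Walks and clunks from the Linux write trace, in order.
 * Tstat, Tcreate and Twrite do not bind or release fids and are omitted. */
static const trace_op linux_write_trace[] = {
    {OP_WALK, 103}, {OP_WALK, 109}, {OP_WALK, 75}, {OP_WALK_FAIL, 97},
    {OP_CLUNK, 75}, {OP_CLUNK, 109},
    {OP_WALK, 109}, {OP_WALK, 75},
    {OP_WALK, 99}, {OP_WALK, 107}, {OP_WALK, 110},
    {OP_CLUNK, 107}, {OP_CLUNK, 99}, {OP_CLUNK, 109},
    {OP_CLUNK, 75}, {OP_CLUNK, 110}, {OP_CLUNK, 103},
};
```

Starting with only the root fid 70 live, the replay ends with exactly one live fid, so the trace leaks nothing.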



9P Trace: Read a File (Plan 9)
% cat /mnt/tmp/hello.txt
fd = open(“/mnt/tmp/hello.txt”);
                                           Twalk(1, 70, 85, array[] of {“tmp”, “hello.txt”})
                                           Topen(1, 85, 0)
n = 0;
do {
  result = pread(fd, buf+n, 255-n, n);
  n += result;
} while (result > 0);
                                           Tread(1, 85, 0, 255)
                                           Rread(1, array[6] of “hello”)
                                           Tread(1, 85, 6, 249)
                                           Rread(1, array[0] of “”)
close(fd);
                                           Tclunk(1, 85)
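
The read loop on this slide can be written as a small runnable function. This is a sketch using POSIX open/pread on any path, not 9P-specific code; the name read_all and the extra buffer-full check are assumptions added for safety. On a 6-byte file it issues two reads, one returning 6 bytes and one returning 0, which matches the pair of Tread messages in the trace.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Reads from offset 0 until pread() returns 0 (end of file) or the buffer
 * is full. Returns the number of bytes read, or -1 on error. */
static ssize_t read_all(const char *path, char *buf, size_t cap) {
    assert(buf != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    size_t n = 0;
    ssize_t result;
    do {
        /* Ask for the remaining space, starting at the current offset. */
        result = pread(fd, buf + n, cap - n, (off_t)n);
        if (result < 0) { close(fd); return -1; }
        n += (size_t)result;
    } while (result > 0 && n < cap);

    close(fd);
    return (ssize_t)n;
}
```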

9P Trace: Read a File (Linux)
% cat /mnt/tmp/hello.txt                  Twalk(1, 70, 106, array[] of {“tmp”})
                                          Tstat(1, 106)
fd = open(“/mnt/tmp/hello.txt”);          Twalk(1, 106, 104, nil)
                                          Twalk(1, 104, 75, nil)
                                          Twalk(1, 75, 100, array[] of {“hello.txt”})
n = 0;                                    Tclunk(1, 75)
                                          Tclunk(1, 104)
do {                                      Tstat(1, 100)
                                          Twalk(1, 100, 104, nil)
  result = pread(fd, buf+n, 255-n, n);    Topen(1, 104, 0)
  n += result;                            Tstat(1, 100)
                                          Tread(1, 104, 0, 255)
} while (result > 0);                     Rread(1, array[6] of “hello”)
                                          Tread(1, 104, 6, 249)
                                          Rread(1, array[0] of “”)
                                          Tclunk(1, 104)
                                          Tclunk(1, 100)
close(fd);                                Tclunk(1, 106)


9P Trace: Get/Set Attributes (Linux)
                                         Twalk(1, 70, 114, array[] of {“tmp”})
% chmod ugo+rwx /mnt/tmp/hello.txt       Tstat(1, 114)
                                         Twalk(1, 114, 113, nil)
s = stat(“/mnt/tmp/hello.txt”);          Twalk(1, 113, 104, nil)
                                         Twalk(1, 104, 75, array[] of {“hello.txt”})
                                         Tclunk(1, 104)
                                         Tclunk(1, 113)
                                         Tstat(1, 75)
                                         Tclunk(1, 75)
                                         Tclunk(1, 114)
                                         Twalk(1, 70, 102, array[] of {“tmp”})
chmod(“/mnt/tmp/hello.txt”, 0777);       Tstat(1, 102)
                                         Twalk(1, 102, 112, nil)
                                         Twalk(1, 112, 113, nil)
                                         Twalk(1, 113, 104, array[] of {“hello.txt”})
                                         Tclunk(1, 113)
                                         Tclunk(1, 112)
                                         Tstat(1, 104)
                                         Twstat(1, 104, Dir(..., “”, 8r777, -1, -1, ...))
                                         Tclunk(1, 104)
                                         Tclunk(1, 102)
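
The Twstat in the chmod trace carries an empty string and -1 values next to the new mode. In 9P2000, a wstat request marks fields it does not want changed with "don't touch" values: zero-length strings for text fields and the all-ones value for integers. The sketch below illustrates that convention with a hypothetical struct covering only a subset of the stat fields; reading the two -1 values in the trace as atime and mtime is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of the stat fields a client may ask to change. */
struct wstat_req {
    uint32_t mode;
    uint32_t atime;
    uint32_t mtime;
    uint64_t length;
    const char *name;   /* "" means leave unchanged */
    const char *gid;    /* "" means leave unchanged */
};

/* Every field starts at its "don't touch" value. */
static struct wstat_req wstat_blank(void) {
    struct wstat_req w = {
        .mode = UINT32_MAX, .atime = UINT32_MAX, .mtime = UINT32_MAX,
        .length = UINT64_MAX, .name = "", .gid = "",
    };
    return w;
}

/* chmod: only the mode differs from the blank request. */
static struct wstat_req wstat_chmod(uint32_t perm) {
    struct wstat_req w = wstat_blank();
    w.mode = perm;
    return w;
}
```

A server receiving wstat_chmod(0777) would update the permission bits and leave times, length, name, and group alone.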



9P Trace: Read Directory (Plan 9)
% ls -l /mnt/tmp
                                          Twalk(1, 70, 85, array[] of {“tmp”})
                                          Tstat(1, 85)
                                          Tclunk(1, 85)

                                          Twalk(1, 70, 85, array[] of {“tmp”})
                                          Topen(1, 85, 0)
                                          Tread(1, 85, 0, 2048)
                                          Rread(1, array[69] of Dir(...))
                                          Tread(1, 85, 69, 2048)
                                          Rread(1, array[0] of “”)
                                          Tclunk(1, 85)




9P Trace: Read Directory (Linux)
% ls -l /mnt/tmp
                                          Walk(2,70,103,array[] of {"tmp"})
                                          Stat(2,103)
                                          Stat(2,103)
                                          Clunk(2,103)
                                          Walk(2,70,114,array[] of {"tmp"})
                                          Stat(2,114)
                                          Clunk(2,114)
                                          Walk(2,70,102,array[] of {"tmp"})
                                          Stat(2,102)
                                          Clunk(2,102)
                                          Walk(2,70,103,array[] of {"tmp"})
                                          Stat(2,103)
                                          Walk(2,103,109,nil)
                                          Open(2,109,0)
                                          Read(2,109,0,8168)
                                          Read(2,109,69,8168)
                                          Walk(2,103,112,nil)
                                          Walk(2,112,100,nil)
                                          Walk(2,100,97,array[] of {"hello.txt"})
                                          Clunk(2,100)
                                          Clunk(2,112)
                                          Stat(2,97)
                                          Stat(2,97)
                                          Clunk(2,97)
                                          Walk(2,103,97,nil)
                                          Walk(2,97,112,nil)
                                          Walk(2,112,100,array[] of {"hello.txt"})
                                          Clunk(2,112)
                                          Clunk(2,97)
                                          Stat(2,100)
                                          Clunk(2,100)
                                          Walk(2,103,100,nil)
                                          Walk(2,100,97,nil)
                                          Walk(2,97,112,array[] of {"hello.txt"})
                                          Clunk(2,97)
                                          Clunk(2,100)
                                          Stat(2,112)
                                          Clunk(2,112)
                                          Read(2,109,69,8168)
                                          Clunk(2,109)
                                          Clunk(2,103)
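
A directory read in 9P2000 returns a whole number of stat entries, each starting with a 2-byte size field that counts the bytes following it. The second Read in the trace starts at offset 69, so the first one returned 69 bytes, which is consistent with a single entry for hello.txt. The sketch below is a hypothetical helper that walks such a payload by hopping from one size field to the next.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Counts the stat entries in a directory-read payload.
 * Returns -1 if the last entry is truncated. */
static int count_dir_entries(const uint8_t *data, size_t len) {
    assert(data != NULL || len == 0);
    size_t off = 0;
    int entries = 0;
    while (off < len) {
        if (len - off < 2) return -1;             /* no room for size[2] */
        size_t size = (size_t)data[off] | ((size_t)data[off + 1] << 8);
        if (size > len - off - 2) return -1;      /* entry runs past the end */
        off += 2 + size;                          /* size[2] excludes itself */
        entries++;
    }
    return entries;
}
```

A real client would decode each entry's fields (qid, mode, times, length, names) instead of only counting, but the framing step is the same.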



High Level Code Organization

                                                         Core Protocol

            fs/9p                             net/9p


       VFS Hooks
                                       fd     rdma          virtio


                                            Transports





Core Network Interfaces (client.h)
• p9_client_create(dev_name, options)
    • create a new client instance (mount)
• p9_client_destroy(client)
    • called by VFS interface to destroy a client (after umount)
• p9_client_disconnect(client)
    • called by transport if the client session is interrupted or hits a
        fatal error
• p9_client_<operation>: execute a 9P operation
    • (version, attach, open, read, write, etc.)
    • almost all take a p9_fid structure as an argument
• p9_client_cb(client, request)
    • called when a response is received to wake up client thread

Transport Interface (in transport.h)
• create(client_struct, device name, options)
    • create a new connection for client on the transport
    • return value indicates success/failure
• close(client_struct)
    • release a connection for client on the transport
    • no return
• request(client_struct, p9_req_t)
    • issue a request on the transport
    • return value indicates success/failure
• cancel(client_struct, p9_req_t)
    • cancel a request (if it hasn’t been sent)
    • return value indicates success, or failure if the request was already sent
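
The four operations above form a table of function pointers that the core code calls without knowing which transport sits behind it. The block below is a user-space sketch of that shape only: struct transport_ops, the stand-in client and request types, and the "null" transport are all hypothetical and do not reproduce the kernel's definitions in transport.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct client;                                   /* stand-in for p9_client */
struct request { int sent; int cancelled; };     /* stand-in for p9_req_t */

/* Hypothetical mirror of the four operations listed above. */
struct transport_ops {
    const char *name;
    int  (*create)(struct client *c, const char *dev_name, const char *options);
    void (*close)(struct client *c);
    int  (*request)(struct client *c, struct request *req);
    int  (*cancel)(struct client *c, struct request *req);
};

struct client { const struct transport_ops *trans; int connected; };

/* A "null" transport that only records state, for illustration. */
static int null_create(struct client *c, const char *dev, const char *opts) {
    (void)dev; (void)opts;
    c->connected = 1;
    return 0;
}
static void null_close(struct client *c) { c->connected = 0; }
static int null_request(struct client *c, struct request *req) {
    if (!c->connected) return -1;
    req->sent = 1;
    return 0;
}
/* Returns 0 if the request was cancelled, 1 if it had already been sent. */
static int null_cancel(struct client *c, struct request *req) {
    (void)c;
    if (req->sent) return 1;
    req->cancelled = 1;
    return 0;
}

static const struct transport_ops null_transport = {
    "null", null_create, null_close, null_request, null_cancel,
};

/* Selects a transport by name, roughly as a mount option might. */
static const struct transport_ops *find_transport(const char *name) {
    static const struct transport_ops *const table[] = { &null_transport };
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strcmp(table[i]->name, name) == 0) return table[i];
    return NULL;
}
```

Adding a transport in this sketch means adding one more ops table to the lookup array; the callers stay unchanged.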

Data Structure Overview
           v9fs_session

                                                p9_client    transport private
                        file
      dentry
                                                            fid pool
                                    fid list
                                                            tag pool
           fid
                                                               request array

           request fcall
                                                  request

          response fcall



Client Accounting (p9_client) in client.h
• Client and session information accounting
    • lock: protect client structure
    • dotu: whether or not extensions are active
    • trans_mod: transport for this session
    • status: current status (connected, error, etc.)
    • trans: transport private information
    • conn: trans_fd specific tracking structure
    • fidpool: per session fid accounting
    • fidlist: list of active fid handles (for cleanup)
    • tagpool: per session tag tracking
    • reqs[]: two-level array of requests for quick lookup by tag
    • max_tag: maximum number of outstanding requests so far
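
A two-level request array lets the client find a request from its tag with two index operations while allocating rows only as tags come into use. The sketch below shows one way this could look in user space. The row and table sizes, the struct layouts, and the function names are assumptions, and max_tag here simply records the highest tag handed out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define ROW_SIZE 255   /* hypothetical number of slots per row */
#define MAX_ROWS 256   /* hypothetical number of rows */

struct req { uint16_t tag; int in_use; };

struct req_table {
    struct req *rows[MAX_ROWS];   /* each row is allocated on first use */
    int max_tag;                  /* highest tag handed out so far, -1 if none */
};

/* Returns the slot for tag, allocating its row if needed; NULL on failure. */
static struct req *req_slot(struct req_table *t, uint16_t tag) {
    unsigned row = tag / ROW_SIZE, col = tag % ROW_SIZE;
    if (row >= MAX_ROWS) return NULL;
    if (!t->rows[row]) {
        t->rows[row] = calloc(ROW_SIZE, sizeof(struct req));
        if (!t->rows[row]) return NULL;
    }
    struct req *r = &t->rows[row][col];
    r->tag = tag;
    r->in_use = 1;
    if ((int)tag > t->max_tag) t->max_tag = tag;
    return r;
}

/* Lookup never allocates: missing rows and idle slots yield NULL. */
static struct req *req_lookup(const struct req_table *t, uint16_t tag) {
    unsigned row = tag / ROW_SIZE, col = tag % ROW_SIZE;
    if (row >= MAX_ROWS || !t->rows[row]) return NULL;
    struct req *r = &t->rows[row][col];
    return r->in_use ? r : NULL;
}

/* Releases every row and resets the table. */
static void req_table_free(struct req_table *t) {
    for (int i = 0; i < MAX_ROWS; i++) { free(t->rows[i]); t->rows[i] = NULL; }
    t->max_tag = -1;
}
```

With these sizes, tag 300 lands in row 1, column 45, and row 0 is never allocated unless a tag below 255 is used.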


Request (p9_req_t) Structure (in client.h)
• Passed between core network and transport to track ops
   • status: status of this request slot
   • t_err: transport error reporting
   • wq: wait queue (client thread blocks while waiting for response)
   • tc: request fcall
   • rc: response fcall
   • aux: transport specific data
   • req_list: link for higher level objects to chain requests

• Allocated and released by core network code

Fcall (p9_fcall) Structure (in 9p.h)
• Encapsulates a protocol message (either request or response)
   • size: size of entire message
   • id: protocol operation
   • tag: multiplexer identifier
   • offset: used by marshalling to track current buffer pos
   • capacity: used by marshalling to track total buffer capacity
   • sdata: actual protocol buffer

• Allocated and paired with buffer for tracking purposes
• Usually grouped inside a request structure
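
The offset and capacity fields exist so that marshalling code can move a cursor through the buffer without running past its end. The sketch below shows that idea for decoding the common 9P2000 header (size[4] id[1] tag[2], little-endian). The struct is a hypothetical user-space analogue of the fields listed above, not the definition from 9p.h, and the helper names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical analogue of the fcall fields listed above. */
struct fcall {
    uint32_t size;       /* size of the entire message, from the wire */
    uint8_t  id;         /* protocol operation */
    uint16_t tag;        /* multiplexer identifier */
    size_t   offset;     /* current read position in sdata */
    size_t   capacity;   /* bytes available in sdata */
    const uint8_t *sdata;
};

/* Reads n little-endian bytes at the cursor and advances it.
 * Returns -1 instead of reading past capacity. */
static int get_le(struct fcall *fc, unsigned n, uint32_t *out) {
    assert(n >= 1 && n <= 4);
    if (fc->capacity - fc->offset < n) return -1;
    uint32_t v = 0;
    for (unsigned i = 0; i < n; i++)
        v |= (uint32_t)fc->sdata[fc->offset + i] << (8 * i);
    fc->offset += n;
    *out = v;
    return 0;
}

/* Parses size[4] id[1] tag[2]; on success offset points at the message body. */
static int parse_header(struct fcall *fc, const uint8_t *buf, size_t len) {
    uint32_t size, id, tag;
    fc->sdata = buf;
    fc->capacity = len;
    fc->offset = 0;
    if (get_le(fc, 4, &size) || get_le(fc, 1, &id) || get_le(fc, 2, &tag))
        return -1;
    if (size < 7 || size > len) return -1;   /* the header alone is 7 bytes */
    fc->size = size;
    fc->id = (uint8_t)id;
    fc->tag = (uint16_t)tag;
    return 0;
}
```

For example, the seven bytes 07 00 00 00 79 01 00 decode as an Rclunk (type 121) with tag 1 and an empty body.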



FID (p9_fid) Structure (in client.h)
• Encapsulates file handle and user credentials
   • client: client back-pointer
   • fid: numeric identifier
   • mode: open mode, if the fid is open
   • qid: current qid for fid
   • iounit: max data per packet on this fid
   • uid: user associated with this fid
   • rdir: accounting structure for dirread
   • flist: per-client-instance fid tracking
   • dlist: per-dentry fid tracking

• FIDs are associated with dentries on the client for tracking purposes
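
Because iounit caps the data carried by one message, a large read or write has to be split into several requests. The sketch below shows the arithmetic with a hypothetical helper; the names are assumptions. The 8192 used in the example is also an assumption: it is the msize of 8216 seen in the mount trace minus a 24-byte allowance for the message header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One Twrite-sized (or Tread-sized) piece of a larger transfer. */
struct chunk { uint64_t offset; uint32_t count; };

/* Splits [offset, offset + len) into pieces of at most iounit bytes.
 * Fills up to max entries of out and returns the number of pieces needed,
 * which may exceed max. */
static size_t split_by_iounit(uint64_t offset, size_t len, uint32_t iounit,
                              struct chunk *out, size_t max) {
    assert(iounit > 0);
    size_t n = 0;
    while (len > 0) {
        uint32_t count = len < iounit ? (uint32_t)len : iounit;
        if (n < max) {
            out[n].offset = offset;
            out[n].count = count;
        }
        offset += count;
        len -= count;
        n++;
    }
    return n;
}
```

A 20000-byte write with an iounit of 8192 becomes three requests of 8192, 8192, and 3616 bytes at offsets 0, 8192, and 16384.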

Support Interfaces
• p9_errstr2errno: used to map Plan 9 error strings to errno
• fid accounting
     • p9_fid_create - allocate numeric fid and initialize fid struct
     • p9_fid_destroy - release numeric fid & its resources
• request/tag accounting
     • p9_tag_allocate - allocate a request
     • p9_tag_lookup - lookup a request by tag
     • p9_free_req - release a tag and clean up memory
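
Plain 9P2000 reports errors as strings, such as the “file does not exist” Rerror in the write traces, so a Linux client has to translate them into errno values. The sketch below shows the idea with a small linear table. It is not the kernel's p9_errstr2errno: the function name and fallback parameter are hypothetical, and the entries other than “file does not exist” are assumed examples.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Illustrative subset of an error-string table. */
static const struct { const char *str; int err; } errmap[] = {
    { "file does not exist", ENOENT },
    { "permission denied",   EACCES },
    { "file already exists", EEXIST },
};

/* Returns the errno for a 9P error string, or fallback when the string
 * is not in the table. */
static int errstr_to_errno(const char *errstr, int fallback) {
    assert(errstr != NULL);
    for (size_t i = 0; i < sizeof errmap / sizeof errmap[0]; i++)
        if (strcmp(errmap[i].str, errstr) == 0)
            return errmap[i].err;
    return fallback;
}
```

A production table is much larger, so a hashed lookup is a more natural fit there than the linear scan used here.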





Code Review
•   http://lxr.linux.no/linux/include/net/9p/
•   http://lxr.linux.no/linux/fs/9p/
       • fid.c/fid.h - fid management
       • v9fs.c/v9fs.h - session management
       • vfs_super.c - superblock ops (mount, unmount)
       • vfs_inode.c - inode operations (lookup, stat, wstat, create..)
       • vfs_file.c - file operations (open, read, write, close)
       • vfs_dir.c - dirread
       • vfs_addr.c - address space operations (mmap, etc.)
       • vfs_dentry.c - dentry operations (mostly fid releasing)
       • cache.c - fscache code
•   http://lxr.linux.no/linux/net/9p/
       • client.c - core client code
       • protocol.c - marshaling functions
       • trans_[fd,rdma,virtio].c - transport implementation
       • util.c - misc utility functions (pool accounting)
       • mod.c - module accounting and dynamic transport registration
       • error.c - error mapping

9P Code Walkthrough

  • 1. IBM Research 9P Trace and Code Walkthrough Eric Van Hensbergen IBM Austin Research Lab (bergevan@us.ibm.com) © 2010 IBM Corporation
  • 2. IBM Research Agenda • 9P Trace analysis for common operations • mount • open + write + close • open + read + close • chmod • ls -l • High level code organization • Client and Transport Interfaces • Important data structures and their accounting • Code Review • VFS Code Review • Network Code Review 2 9P Trace and Code Walkthroughs © 2010 IBM Corporation
  • 3. IBM Research 9P Trace: Mount 9P /mnt (Plan 9) mount 9p /mnt Tversion(NOFID, 8216, 9P2000, “”) Tattach(1, 70, 4294967295, ericvh,””) 3 9P Trace and Code Walkthroughs © 2010 IBM Corporation
  • 4. IBM Research 9P Trace: Mount 9P /mnt (Linux) mount 9p /mnt Tversion(NOFID, 8216, 9P2000, “”) Tattach(1, 70, -1, ericvh,””) Twalk(1, 70, 102, array[] of {}) Tstat(1, 102) Tclunk(1, 102) 4 9P Trace and Code Walkthroughs © 2010 IBM Corporation
  • 5. IBM Research 9P Trace: Write a File (Plan 9) echo hello > /mnt/tmp/hello.txt fd = create(“/mnt/tmp/hello.txt”); Twalk(1, 70, 59, array[] of {“tmp”}) Twalk(1, 59, 86, array[] of {“hello.txt”}) Rerror(1, “file does not exist”) Twalk(1, 59, 86, nil) Tcreate(1, 86, “hello.txt”, 8r666, 1) Tclunk(1, 59); pwrite(fd,”hello”, 5, 0); Twrite(1, 86, 0, array[6] of {“hello”}) close(fd); Tclunk(1, 86) 5 9P Trace and Code Walkthroughs © 2010 IBM Corporation
  • 6. IBM Research 9P Trace: Write a File (Linux) echo hello > /mnt/tmp/hello.txt fd = create(“/mnt/tmp/hello.txt”); Twalk(1, 70, 103, array[] of {“tmp”}) Tstat(1, 103) Twalk(1, 103, 109, nil) Twalk(1, 109, 75, nil) Twalk(1, 75, 97, array[] of {“hello.txt”}) Rerror(1, “file does not exist”) Tclunk(1, 75); Tclunk(1, 109); Twalk(1, 103, 109, nil) Twalk(1, 109, 75, nil) Tcreate(1, 75, “hello.txt”, 8r666, 1) Twalk(1, 109, 99, nil) Twalk(1, 99, 107, nil) Twalk(1, 107, 110, array[] of {“hello.txt}) Tclunk(1, 107) Tclunk(1, 99) Tclunk(1, 109) Tstat(1, 110) pwrite(fd,”hello”, 5, 0); Twrite(1, 75, 0, array[6] of {“hello”}) Tclunk(1, 75) close(fd); Tclunk(1, 110) Tclunk(1, 103) 6 9P Trace and Code Walkthroughs © 2010 IBM Corporation
  • 7. 9P Trace: Read a File (Plan 9)
    % cat /mnt/tmp/hello.txt
    Client code:
        fd = open("/mnt/tmp/hello.txt");
        n = 0;
        do {
            result = pread(fd, buf+n, 255-n, n);
            n += result;
        } while (result > 0);
        close(fd);
    9P messages, in trace order:
        Twalk(1, 70, 85, array[] of {"tmp", "hello.txt"})
        Topen(1, 85, 0)
        Tread(1, 85, 0, 255)
        Rread(1, array[6] of "hello")
        Tread(1, 85, 6, 249)
        Rread(1, array[0] of "")
        Tclunk(1, 85)
  • 8. 9P Trace: Read a File (Linux)
    % cat /mnt/tmp/hello.txt
    Client code:
        fd = open("/mnt/tmp/hello.txt");
        n = 0;
        do {
            result = pread(fd, buf+n, 255-n, n);
            n += result;
        } while (result > 0);
        close(fd);
    9P messages, in trace order:
        Twalk(1, 70, 106, array[] of {"tmp"})
        Tstat(1, 106)
        Twalk(1, 106, 104, nil)
        Twalk(1, 104, 75, nil)
        Twalk(1, 75, 100, array[] of {"hello.txt"})
        Tclunk(1, 75)
        Tclunk(1, 104)
        Tstat(1, 100)
        Twalk(1, 100, 104, nil)
        Topen(1, 104, 0)
        Tstat(1, 100)
        Tread(1, 104, 0, 255)
        Rread(1, array[6] of "hello")
        Tread(1, 104, 6, 249)
        Rread(1, array[0] of "")
        Tclunk(1, 104)
        Tclunk(1, 100)
        Tclunk(1, 106)
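The read loop on the two slides above explains why each trace contains exactly two Tread requests with offsets 0 and 6. The following is a minimal simulation of that loop, assuming the file holds the 6 bytes "hello\n" (the traces show a 6-byte Rread, which is consistent with echo appending a newline). `fake_pread` is a hypothetical stand-in for the pread call, which a 9P client would turn into Tread(offset, count).

```python
FILE_CONTENTS = b"hello\n"   # assumed contents: 6 bytes, as in the Rread reply


def fake_pread(offset, count):
    """Stand-in for pread(): return up to count bytes starting at offset."""
    return FILE_CONTENTS[offset:offset + count]


def read_all(pread, bufsize=255):
    """Mirror the slide's do/while loop and record each (offset, count)."""
    data = b""
    requests = []
    while True:
        offset, count = len(data), bufsize - len(data)
        requests.append((offset, count))
        chunk = pread(offset, count)
        data += chunk
        if not chunk:          # a zero-length reply signals end of file
            break
    return data, requests


data, requests = read_all(fake_pread)
print(requests)
```

The recorded pairs line up with Tread(…, 0, 255) and Tread(…, 6, 249) in both traces; the second request exists only to observe the empty reply.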
  • 9. 9P Trace: Get/Set Attributes (Linux)
    % chmod ugo+rwx /mnt/tmp/hello.txt
    Client calls:
        s = stat("/mnt/tmp/hello.txt");
        chmod("/mnt/tmp/hello.txt", 0777);
    9P messages, in trace order:
        Twalk(1, 70, 114, array[] of {"tmp"})
        Tstat(1, 114)
        Twalk(1, 114, 113, nil)
        Twalk(1, 113, 104, nil)
        Twalk(1, 104, 75, array[] of {"hello.txt"})
        Tclunk(1, 104)
        Tclunk(1, 113)
        Tstat(1, 75)
        Tclunk(1, 75)
        Tclunk(1, 114)
        Twalk(1, 70, 102, array[] of {"tmp"})
        Tstat(1, 102)
        Twalk(1, 102, 112, nil)
        Twalk(1, 112, 113, nil)
        Twalk(1, 113, 104, array[] of {"hello.txt"})
        Tclunk(1, 113)
        Tclunk(1, 112)
        Tstat(1, 104)
        Twstat(1, 104, Dir(..."",8r777,-1,-1,...))
        Tclunk(1, 104)
        Tclunk(1, 102)
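The Twstat in this trace carries a Dir whose name is empty and whose times are -1. In 9P2000 these are "don't touch" values: empty strings and all-ones integers tell the server to leave a field unchanged, so only the mode is modified. The following is a sketch of how such a stat record could be built, assuming the plain 9P2000 stat layout (no 9P2000.u extension fields). The helper names are hypothetical.

```python
import struct

TWSTAT = 126   # 9P2000 message type code


def wstat_mode_only(mode):
    """Build a stat record that changes only the mode (hypothetical helper).

    Every other integer field is all-ones and every string is empty,
    which 9P2000 defines as "leave this field alone" for wstat.
    """
    fixed = struct.pack(
        "<HIBIQIIIQ",
        0xFFFF,                 # type
        0xFFFFFFFF,             # dev
        0xFF,                   # qid.type
        0xFFFFFFFF,             # qid.vers
        0xFFFFFFFFFFFFFFFF,     # qid.path
        mode,                   # mode: the one field being changed
        0xFFFFFFFF,             # atime (-1 in the trace)
        0xFFFFFFFF,             # mtime (-1 in the trace)
        0xFFFFFFFFFFFFFFFF,     # length
    )
    empty_strings = struct.pack("<H", 0) * 4   # name, uid, gid, muid
    body = fixed + empty_strings
    return struct.pack("<H", len(body)) + body  # leading size[2]


def twstat(tag, fid, stat):
    """Wrap a stat record in a Twstat message: fid[4] then stat[n]."""
    payload = struct.pack("<IH", fid, len(stat)) + stat
    return struct.pack("<IBH", 7 + len(payload), TWSTAT, tag) + payload


message = twstat(1, 104, wstat_mode_only(0o777))
print(len(message))
```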
  • 10. 9P Trace: Read Directory (Plan 9)
    % ls -l /mnt/tmp
    9P messages:
        Twalk(1, 70, 85, array[] of {"tmp"})
        Tstat(1, 85)
        Tclunk(1, 85)
        Twalk(1, 70, 85, array[] of {"tmp"})
        Topen(1, 85, 0)
        Tread(1, 85, 2048)
        Rread(1, array[69] of Dir(...))
        Tread(1, 85, 2048)
        Rread(1, array[0] of "")
        Tclunk(1, 85)
  • 11. 9P Trace: Read Directory (Linux)
    % ls -l /mnt/tmp
    9P messages, in trace order:
        Walk(2,70,103,array[] of {"tmp"})
        Stat(2,103)
        Stat(2,103)
        Clunk(2,103)
        Walk(2,70,114,array[] of {"tmp"})
        Stat(2,114)
        Clunk(2,114)
        Walk(2,70,102,array[] of {"tmp"})
        Stat(2,102)
        Clunk(2,102)
        Walk(2,70,103,array[] of {"tmp"})
        Stat(2,103)
        Walk(2,103,109,nil)
        Open(2,109,0)
        Read(2,109,0,8168)
        Read(2,109,69,8168)
        Walk(2,103,112,nil)
        Walk(2,112,100,nil)
        Walk(2,100,97,array[] of {"hello.txt"})
        Clunk(2,100)
        Clunk(2,112)
        Stat(2,97)
        Stat(2,97)
        Clunk(2,97)
        Walk(2,103,97,nil)
        Walk(2,97,112,nil)
        Walk(2,112,100,array[] of {"hello.txt"})
        Clunk(2,112)
        Clunk(2,97)
        Stat(2,100)
        Clunk(2,100)
        Walk(2,103,100,nil)
        Walk(2,100,97,nil)
        Walk(2,97,112,array[] of {"hello.txt"})
        Clunk(2,97)
        Clunk(2,100)
        Stat(2,112)
        Clunk(2,112)
        Read(2,109,69,8168)
        Clunk(2,109)
        Clunk(2,103)
  • 12. High Level Code Organization
      • VFS Hooks: fs/9p
      • Core Protocol: net/9p
      • Transports: fd, rdma, virtio
  • 13. Core Network Interfaces (client.h)
      • p9_client_create(dev_name, options)
          • creates a new client instance (mount)
      • p9_client_destroy(client)
          • called by the VFS interface to destroy a client (after umount)
      • p9_client_disconnect(client)
          • called by the transport if the client session is interrupted or has a fatal error
      • p9_client_<operation>: execute a 9P operation
          • (version, attach, open, read, write, etc.)
          • almost all are called with a p9_fid structure as an argument
      • p9_client_cb(client, request)
          • called when a response is received, to wake up the client thread
  • 14. Transport Interface (in transport.h)
      • create(client_struct, device name, options)
          • creates a new connection for the client on the transport
          • return value indicates success/failure
      • close(client_struct)
          • releases the client's connection on the transport
          • no return value
      • request(client_struct, p9_req_t)
          • issues a request on the transport
          • return value indicates success
      • cancel(client_struct, p9_req_t)
          • cancels a request (if it hasn't been sent)
          • return value indicates success, or failure if the request was already sent
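The four transport operations listed above can be pictured as an abstract interface that each transport (fd, rdma, virtio) implements. The following is a Python sketch of that shape with a toy in-memory transport; the class names, the dict used as a client, and the `send_pending` method are all assumptions for illustration and do not mirror the kernel's C definitions. It shows the cancel rule from the slide: cancelling succeeds only while the request is still queued.

```python
from abc import ABC, abstractmethod


class Transport(ABC):
    """The four operations from the slide, as an abstract interface."""

    @abstractmethod
    def create(self, client, dev_name, options): ...

    @abstractmethod
    def close(self, client): ...

    @abstractmethod
    def request(self, client, req): ...

    @abstractmethod
    def cancel(self, client, req): ...


class QueueTransport(Transport):
    """Toy transport: requests wait in a list until send_pending() runs."""

    def __init__(self):
        self.pending = []
        self.sent = []

    def create(self, client, dev_name, options):
        client["trans"] = self          # transport private data
        return 0                        # 0 means success

    def close(self, client):
        client["trans"] = None          # no return value

    def request(self, client, req):
        self.pending.append(req)
        return 0

    def cancel(self, client, req):
        if req in self.pending:         # not sent yet: cancel succeeds
            self.pending.remove(req)
            return 0
        return 1                        # already sent: cancel fails

    def send_pending(self):
        """Hypothetical helper: pretend to put queued requests on the wire."""
        self.sent.extend(self.pending)
        self.pending.clear()


client = {}
trans = QueueTransport()
trans.create(client, "example-device", "")
trans.request(client, "Tread#1")
print(trans.cancel(client, "Tread#1"))   # still queued, so cancel succeeds
```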
  • 15. Data Structure Overview
    Diagram boxes: v9fs_session, p9_client, transport private, file, dentry, fid pool, fid list, tag pool, fid, request array, request, request fcall, response fcall
  • 16. Client Accounting (p9_client) in client.h
      • Client and session information accounting
          • lock: protects the client structure
          • dotu: whether or not extensions are active
          • trans_mod: transport for this session
          • status: current status (connected, error, etc.)
          • trans: transport private information
          • conn: trans_fd specific tracking structure
          • fidpool: per-session fid accounting
          • fidlist: list of active fid handles (for cleanup)
          • tagpool: per-session tag tracking
          • reqs[]: two-dimensional array of requests for quick lookup
          • max_tag: maximum number of outstanding requests so far
  • 17. Data Structure Overview (same diagram as slide 15)
  • 18. Request (p9_req_t) Structure (in client.h)
      • Passed between core network and transport to track ops
          • status: status of this request slot
          • t_err: transport error reporting
          • wq: wait queue (client thread blocks while waiting for response)
          • tc: request fcall
          • rc: response fcall
          • aux: transport specific data
          • req_list: link for higher level objects to chain requests
      • Allocated and released by core network code
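The request structure above pairs a transmitted fcall with its response and a wait queue on which the client thread blocks until the callback fires. The following is a rough Python analogue, using `threading.Event` in place of the kernel wait queue. The status strings, the `rpc` helper, and `fake_send` are assumptions made for this sketch, not kernel names.

```python
import threading


class Request:
    """Python stand-in for the request slot described on the slide."""

    def __init__(self, tag, tc):
        self.tag = tag
        self.status = "allocated"
        self.t_err = 0                 # transport error, 0 if none
        self.wq = threading.Event()    # plays the role of the wait queue
        self.tc = tc                   # request fcall
        self.rc = None                 # response fcall, filled in later


def client_cb(req, rc):
    """Called when a response arrives: store it and wake the waiter."""
    req.rc = rc
    req.status = "received"
    req.wq.set()


def rpc(req, send, timeout=5.0):
    """Send a request and block until client_cb wakes this thread."""
    req.status = "sent"
    send(req)
    req.wq.wait(timeout)
    return req.rc


def fake_send(req):
    """Hypothetical transport: answer from another thread."""
    threading.Thread(target=client_cb, args=(req, b"Rclunk")).start()


req = Request(tag=1, tc=b"Tclunk")
print(rpc(req, fake_send), req.status)
```

Because the reply is delivered from a separate thread, the sketch exercises the same split as the slides: the transport side calls the callback, and the caller only ever sleeps on the request.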
  • 19. Fcall (p9_fcall) Structure (in 9p.h)
      • Encapsulates a protocol message (either request or response)
          • size: size of entire message
          • id: protocol operation
          • tag: multiplexer identifier
          • offset: used by marshalling to track the current buffer position
          • capacity: used by marshalling to track total buffer capacity
          • sdata: actual protocol buffer
      • Allocated and paired with buffer for tracking purposes
      • Usually grouped inside a request structure
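The offset and capacity fields above describe a cursor into a fixed-size buffer. The following is a sketch of how marshalling against such a buffer could work, assuming little-endian 9P2000 encoding. The `put` and `finalize` methods are hypothetical names, not the kernel's marshalling functions. The example builds the Tclunk(1, 86) message from the Plan 9 write trace.

```python
import struct

TCLUNK = 120   # 9P2000 message type code


class Fcall:
    """Fixed-capacity message buffer with a write cursor."""

    def __init__(self, capacity):
        self.size = 0
        self.id = 0
        self.tag = 0
        self.offset = 0                      # current write position
        self.capacity = capacity             # total room in sdata
        self.sdata = bytearray(capacity)     # the protocol buffer itself

    def put(self, fmt, *values):
        """Append little-endian fields, refusing to overrun the buffer."""
        chunk = struct.pack("<" + fmt, *values)
        if self.offset + len(chunk) > self.capacity:
            raise OverflowError("fcall buffer too small")
        self.sdata[self.offset:self.offset + len(chunk)] = chunk
        self.offset += len(chunk)

    def finalize(self):
        """Patch the size[4] field once the whole message is written."""
        self.size = self.offset
        struct.pack_into("<I", self.sdata, 0, self.size)
        return bytes(self.sdata[:self.size])


fc = Fcall(capacity=64)
fc.id, fc.tag = TCLUNK, 1
fc.put("IBH", 0, fc.id, fc.tag)   # size placeholder, type, tag
fc.put("I", 86)                   # fid being clunked
print(fc.finalize().hex())
```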
  • 20. Data Structure Overview (same diagram as slide 15)
  • 21. FID (p9_fid) Structure (in client.h)
      • Encapsulates file handle and user credentials
          • client: client back-pointer
          • fid: numeric identifier
          • mode: if open, the mode it was opened with
          • qid: current qid for fid
          • iounit: max data per packet on this fid
          • uid: user associated with this fid
          • rdir: accounting structure for dirread
          • flist: per-client-instance fid tracking
          • dlist: per-dentry fid tracking
      • FIDs are associated with dentries on the client for tracking purposes
  • 22. Support Interfaces
      • p9_errstr2errno: used to map Plan 9 error strings to errno
      • fid accounting
          • p9_fid_create: allocate numeric fid and initialize fid struct
          • p9_fid_destroy: release numeric fid and its resources
      • request/tag accounting
          • p9_tag_allocate: allocate a request
          • p9_tag_lookup: look up a request by tag
          • p9_free_req: release a tag and clean up memory
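Two of the support interfaces above are easy to picture in miniature: mapping a Plan 9 error string to an errno value, and handing out numeric identifiers from a pool. The following is a sketch of both. The only table entry is the error string that appears in the traces; the EIO fallback, the `IdPool` class, and its lowest-free-number policy are assumptions for illustration rather than the kernel's behaviour.

```python
import errno

# Only the string seen in the traces is listed; a real table is much larger.
ERROR_TABLE = {
    "file does not exist": errno.ENOENT,
}


def errstr2errno(errstr, default=errno.EIO):
    """Map a Plan 9 error string to an errno (fallback is an assumption)."""
    return ERROR_TABLE.get(errstr, default)


class IdPool:
    """Hands out the lowest free integer; usable for fids or tags."""

    def __init__(self):
        self.used = set()

    def get(self):
        number = 0
        while number in self.used:
            number += 1
        self.used.add(number)
        return number

    def put(self, number):
        self.used.discard(number)


fids = IdPool()
first, second, third = fids.get(), fids.get(), fids.get()
fids.put(second)                 # clunked: the number becomes reusable
print(errstr2errno("file does not exist") == errno.ENOENT, fids.get())
```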
  • 23. Code Review
      • http://lxr.linux.no/linux/include/net/9p/
      • http://lxr.linux.no/linux/fs/9p/
          • fid.c/fid.h: fid management
          • v9fs.c/v9fs.h: session management
          • vfs_super.c: superblock ops (mount, unmount)
          • vfs_inode.c: inode operations (lookup, stat, wstat, create, ...)
          • vfs_file.c: file operations (open, read, write, close)
          • vfs_dir.c: dirread
          • vfs_addr.c: address space operations (mmap, etc.)
          • vfs_dentry.c: dentry operations (mostly fid releasing)
          • cache.c: fscache code
      • http://lxr.linux.no/linux/net/9p/
          • client.c: core client code
          • protocol.c: marshalling functions
          • trans_[fd,rdma,virtio].c: transport implementations
          • util.c: misc utility functions (pool accounting)
          • mod.c: module accounting and dynamic transport registration
          • error.c: error mapping