2. FILE I/O
• This section covers unbuffered I/O, the
lowest-level file interface in UNIX
• Knowledge of fopen(), getc(), putc(),
fread() and fwrite() is assumed
• We will use file descriptors instead of
FILE* objects
• This gives lower-level access to, and direct
manipulation of, files, symbolic links and
directories
3. FILE DESCRIPTORS
• In the kernel, every open file is referred to by a
non-negative integer (a file descriptor) that is
less than OPEN_MAX (defined in <limits.h>)
• Three descriptors are open by convention when a
program starts:
• STDIN_FILENO: 0
• STDOUT_FILENO: 1
• STDERR_FILENO: 2
• All primitive file access goes through these file
descriptors
4. FILE DESCRIPTORS
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>   /* close() */
int open(const char *pathname, int oflag);
int open(const char *pathname,
         int oflag, mode_t mode);   /* mode is used with O_CREAT */
int close(int filedes);
5. OPENING A FILE
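O_RDONLY    open for reading only
O_WRONLY    open for writing only
O_RDWR      open for reading and writing
(the three access modes above are mutually exclusive)
O_APPEND    append on each write
O_CREAT     create the file if it does not exist
O_EXCL      error if O_CREAT on existing file
O_TRUNC     set length to 0
O_SYNC      each write waits for physical I/O to complete
open oflag constants (OR'd together)
open returns the lowest-numbered unused
descriptor, or -1 on error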
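As a minimal sketch of how the oflag constants combine, the helper below creates a file only if it does not already exist. The function name create_exclusive and the 0644 mode are illustrative assumptions, not part of the slides.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>

/* Hypothetical helper: create `path` only if it does not already exist.
 * Returns 0 on success, 1 if the file already existed, -1 on other errors. */
int create_exclusive(const char *path)
{
    /* O_CREAT needs the three-argument form; 0644 is filtered by the umask. */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd == -1)
        return (errno == EEXIST) ? 1 : -1;
    return close(fd) == -1 ? -1 : 0;
}
```

Because O_CREAT and O_EXCL are given together, the existence check and the creation happen as one step inside the kernel.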
6. OTHER FILE SYSTEM CALLS
#include <sys/types.h>
#include <unistd.h>
off_t lseek(int fd, off_t offset, int whence);
Returns the new file offset, -1 on error
ssize_t read(int fd, void *buff, size_t nbytes);
ssize_t write(int fd, const void *buff, size_t nbytes);
Return the number of bytes read or written, -1 on error
(read returns 0 at end of file)
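The three calls can be combined as in the sketch below, which writes a buffer, rewinds with lseek and reads the data back. The name write_then_read and the 0600 mode are assumptions made for illustration.

```c
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: write `len` bytes to a new file, seek back to the
 * start, and read them into `out`. Returns bytes read back, or -1 on error. */
ssize_t write_then_read(const char *path, const char *data, size_t len, char *out)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return -1;

    ssize_t n = -1;
    if (write(fd, data, len) == (ssize_t)len &&  /* offset is now len */
        lseek(fd, 0, SEEK_SET) == 0)             /* rewind to offset 0 */
        n = read(fd, out, len);

    close(fd);
    return n;
}
```

Without the lseek call the read would start at the offset left by write and return 0 (end of file).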
7. PROCESS → FILE
• Each process contains a process descriptor entry
• This process descriptor contains a list of open
files and for each a pointer to the file table
entry
• The file table entry contains the current file
offset, file status and a pointer to the v-node
table entry
• The v-node structure contains the i-node together
with a pointer to all the functions that operate on
this file
8. V-NODE
• Each open file has a v-node structure that contains
information about the type of file and pointers to
functions that operate on the file
• For most files, the v-node also contains the i-node for
the file
• This information is read from disk when the file is
opened
• i-node contains the owner of the file, file size, pointers
to where the actual data blocks are on disk …
• v-nodes provide support for multiple file system types
on a single computer
9. PROCESS → FILE
• This layered structure allows more than one
process to have the same file open (sharing the
v-node entry)
• In this case, each process has its own file table
entry and therefore its own file offset
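The same effect can be observed inside a single process, since every open() call gets its own file table entry. The sketch below is an assumption-laden illustration (offset_of_second_open is a hypothetical name): it opens one file twice, reads through the first descriptor, and reports the offset of the second.

```c
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open `path` twice, read one byte through the first
 * descriptor, and return the current offset of the second (-1 on error). */
off_t offset_of_second_open(const char *path)
{
    int a = open(path, O_RDONLY);
    int b = open(path, O_RDONLY);
    off_t pos = -1;
    char c;

    if (a != -1 && b != -1 && read(a, &c, 1) == 1)
        pos = lseek(b, 0, SEEK_CUR);   /* query b's offset without moving it */

    if (a != -1) close(a);
    if (b != -1) close(b);
    return pos;
}
```

For a non-empty file the result should be 0: reading through one descriptor does not move the offset held in the other file table entry.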
11. PROCESS → FILE
• When a file is open for writing and a write
moves the offset past the i-node file size, the
i-node size is updated accordingly
• Periodically and when closing the file, the
i-node is rewritten to the file system
• There is exactly one v-node table entry per
open file, and it is stored in the kernel
• After fork, a child process shares its parent's
file table entries (and therefore the file offsets)
12. DUPLICATING FILES
• An existing file descriptor is duplicated by
either of the following functions
• The duplicate shares a file table entry with the
original, so writing through one descriptor changes
the offset seen through the other, and vice versa
#include <unistd.h>
int dup(int filedes);
int dup2(int filedes, int filedes2);
14. SYNC, FSYNC, FDATASYNC
• UNIX systems have a buffer cache or page cache in
the kernel through which most disk I/O passes
• When data is written to a file, it is normally
copied by the kernel into one of its buffers and
queued for writing to disk (delayed write)
• The kernel eventually writes all the delayed-write
blocks to disk, normally when it needs to reuse the
buffer
• To ensure consistency of the file system on disk with
the contents of the buffer cache, the sync, fsync
and fdatasync functions are provided
15. SYNC, FSYNC, FDATASYNC
• The sync function simply queues all the modified
block buffers for writing and returns
• fsync refers only to a single file and waits for
the disk writes to complete
• fdatasync is similar to fsync but affects only
the data portions of the file
#include <unistd.h>
void sync(void);
int fsync(int filedes);
int fdatasync(int filedes);
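A minimal sketch of using fsync after a write, assuming the caller wants the data on disk before continuing. The name write_durably and the 0600 mode are illustrative choices.

```c
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: write `len` bytes to `path` and ask the kernel to
 * flush them to disk before returning. Returns 0 on success, -1 on error. */
int write_durably(const char *path, const void *data, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return -1;

    int ok = write(fd, data, len) == (ssize_t)len  /* lands in the kernel cache */
          && fsync(fd) == 0;                        /* waits for the disk write */

    if (close(fd) == -1)
        ok = 0;
    return ok ? 0 : -1;
}
```

Replacing fsync with fdatasync would skip flushing file attributes that are not needed to read the data back.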
16. FCNTL AND IOCTL
• fcntl is used to query or change properties of
open files
• ioctl is a catch-all function for device-specific
operations that the other I/O calls do not cover
#include <sys/types.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/ioctl.h>
int fcntl(int filedes, int cmd, ...);
int ioctl(int filedes, int request, ...);
17. FCNTL AND IOCTL
• fcntl is used for five different purposes:
• Duplicate an existing file descriptor (F_DUPFD)
• Get/set file descriptor flags
(F_GETFD/F_SETFD)
• Get/set file status flags
(F_GETFL/F_SETFL)
• Get/set asynchronous I/O ownership
(F_GETOWN/F_SETOWN)
• Get/set record locks
(F_GETLK/F_SETLK/F_SETLKW)
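As one example of the get/set pattern, the sketch below turns on file status flags for an open descriptor. The name add_status_flags is an assumption; the read-modify-write sequence keeps flags that are already set.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: turn on the status flags in `flags` (for example
 * O_APPEND or O_NONBLOCK) while leaving the other flags unchanged.
 * Returns 0 on success, -1 on error. */
int add_status_flags(int fd, int flags)
{
    int current = fcntl(fd, F_GETFL);        /* fetch the existing flags */
    if (current == -1)
        return -1;
    return fcntl(fd, F_SETFL, current | flags) == -1 ? -1 : 0;
}
```

Calling F_SETFL with only the new flag would clear any other changeable status flags, which is why the current value is fetched first.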
18. STAT, FSTAT AND LSTAT
#include <sys/stat.h>
int stat(const char* restrict pathname,
struct stat *restrict buf);
int fstat(int filedes,
struct stat *restrict buf);
int lstat(const char* restrict pathname,
struct stat *restrict buf);
All return 0 if OK, -1 on error
19. STAT, FSTAT AND LSTAT
• Given a pathname (or file descriptor for an
open file), the stat (fstat) function returns
a structure of information about the
named/open file
• The lstat function is like stat, but when the name
refers to a symbolic link it returns information
about the link itself, not the file it points to
• The biggest user of the stat functions is the ls
-l command, which displays all the information
about a file
20. FILE TYPES
• Most files on a UNIX system are either regular
files or directories. There are additional ones:
• Regular file: contains data of some form,
interpretation of which is left to the
application processing the file
• Directory file: a file that contains the names
of other files and pointers to information on
these files.
• Block special file: provides buffered I/O
access in fixed-sized units to devices
21. FILE TYPES
• Character special file: provides unbuffered I/O
access in variable sized units to devices
NOTE: all devices on a system are either block special
files or character special files
• FIFO: used for communication between
processes (sometimes called named PIPE)
• Socket: used for network communication
• Symbolic link: points to another file
• File type is encoded in st_mode in stat
structure
22. USER AND GROUP IDS
• Every process has six or more IDs associated
with it

IDs                                   Purpose
Real user ID, real group ID           Who we really are
Effective user ID, effective          Used for file access
group ID, supplementary group IDs     permission checks
Saved set-user-ID,                    Saved by exec functions
saved set-group-ID
23. PROCESS IDS
• Real IDs identify who we really are and are taken
from our entry in the password file when we log in
• Effective IDs determine our file access permissions
(see later)
• The saved set-user-ID and saved set-group-ID contain
copies of the effective IDs when a program is executed
• Normally, effective and real IDs are the same
• Every file has an owner and a group owner
• A process's effective ID can be changed to that of
the program file's owner (set-user-ID, for example
to run su)
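A short sketch of querying the real and effective IDs of the current process. The helper names describe_ids and ids_differ are assumptions for illustration.

```c
#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>

/* Hypothetical helper: format the real and effective IDs into `buf`.
 * Returns the value of snprintf (the length of the formatted text). */
int describe_ids(char *buf, size_t size)
{
    return snprintf(buf, size,
                    "real uid=%ld gid=%ld, effective uid=%ld gid=%ld",
                    (long)getuid(), (long)getgid(),
                    (long)geteuid(), (long)getegid());
}

/* Returns 1 when the effective IDs differ from the real IDs, as happens
 * in a set-user-ID or set-group-ID program, and 0 otherwise. */
int ids_differ(void)
{
    return getuid() != geteuid() || getgid() != getegid();
}
```

In an ordinary program started from a login shell, ids_differ would normally return 0.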
24. FILE ACCESS PERMISSIONS
• Nine permission bits for each file
• Whenever we want to open a file, we must have
execute permission in each directory mentioned in
the name
• We cannot create a new file in a directory unless
we have write and execute permission in that directory
• To delete a file, we need write and execute
permission in the directory
• To run a program we must have execute
permission for the file
25. FILE PERMISSION FUNCTIONS
• access: test file accessibility based on real IDs
• umask: set file mode creation mask for the
process
• chmod and fchmod: change file access
permissions
• chown, fchown and lchown: change user and
group ID of a file
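The interaction between umask and the mode passed to open can be sketched as follows. The helper mode_after_umask, the mask 022 and the requested mode 0666 are illustrative assumptions.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: create `path` requesting mode 0666 under a umask of
 * 022 and return the permission bits the new file actually received.
 * Returns -1 on error (including when the file already exists). */
int mode_after_umask(const char *path)
{
    mode_t old = umask(022);            /* mask out group/other write */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0666);
    umask(old);                         /* restore the caller's mask */
    if (fd == -1)
        return -1;

    struct stat sb;
    int rc = fstat(fd, &sb);
    close(fd);
    return rc == -1 ? -1 : (int)(sb.st_mode & 0777);
}
```

With these values the result should be 0644; chmod(path, 0400) could then narrow the permissions further, and access(path, R_OK) would test readability using the real IDs.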
28. FILE SYSTEMS
• i-node contains all information about the file (access
permission, size, pointers to data blocks …)
• Directory entry stores filename and i-node number
(and other items such as the filename length)
• Since the i-node number in the directory entry points
to an i-node in the same file system, we cannot have a
directory entry point to an i-node in a different file
system (ln cannot create hard links across file systems)
• When renaming a file within a file system, the contents
of the file do not need to be moved; a new directory
entry that points to the existing i-node is added and
the old entry is unlinked
29. FILE LINKING FUNCTIONS
• link: creates a new directory entry that
references an existing file. The operation is atomic
• unlink: removes an existing directory entry and
decrements the link count of the file
• remove: for a file, identical to unlink; for a
directory it is identical to rmdir
• rename: rename a file or directory
30. MORE CALLS
• symlink: create a symbolic link
• readlink: read the contents of the link itself
(the name it points to) rather than following it
• utime: set file access and modification times
• mkdir: create a new, empty directory
• rmdir: delete an empty directory (must contain
only . and .. entries)
• Directory management calls: opendir, readdir,
rewinddir, closedir, telldir, seekdir,
chdir, fchdir, getcwd
32. PASSWORD FILE
• Every user is assigned a unique username which is
associated with a user ID and group ID
• Every user also has a password, which is stored using
a one-way algorithm that generates 13 printable
characters from a 64-character set (newer versions
may differ)
• All this information is stored in /etc/passwd, with
the password optionally kept in /etc/shadow for security
• Password file format:
username:password:UID:GID:comment field:initial directory:initial shell
• Every process is also assigned the UID and GID of the
process owner
33. PASSWORD FILE
• There is usually an entry with the user name root (user ID 0)
• Some fields of the password file entry can be empty
• Shell field contains the name of the executable
program to be used as the login shell
• The nobody username can be used to allow people
to log in to a system without any privileges
• The finger command supports additional information
in the comment field
• The vipw command allows an administrator to edit
the password file
35. SHADOW FILE
• Systems now store the encrypted password in a
different file, the shadow password file
• This makes brute-force password guessing harder
• The file contains the user name and encrypted password,
together with other fields such as password change
fields
• It is not readable by the world; only a few
programs (login, passwd) can read it, and these are
often set-user-ID root
• A separate set of functions is available to access the
shadow password file