2. Fundamentals
Low-level I/O (read/write using system calls)
Opening/Creating files
Reading & Writing files
Moving around in files
file descriptors
Program Creation and Execution
exec family of system calls
using fork() to create processes
exiting processes
process termination
3. Low-level I/O
Unix System Calls can be used to:
Open/Create Files
Read & Write files
Move around in files
System calls work with file descriptors
A file descriptor is an int that represents a file/device
File descriptors are allocated by the system
Limited number of file descriptors available
0 is stdin, 1 is stdout, 2 is stderr; these are automatically allocated
and opened when a program begins
4. Opening/Creating Files
open(<file/device name>,<how to open>[,<mode>])
<file/device name> is a C-string, exactly as in the C++ fstream member
function open()
<how to open> is an int, usually specified with a constant defined in
fcntl.h. The values may be combined using a logical OR ( | ), allowing
every possible way of opening a file (examples on next slide)
Detail on this appears on the next slide
<mode> is an optional third argument that is specified only if the file is
created. It is a three digit octal number representing the permissions to be
set on the file.
a file descriptor (of type int) is returned on success; otherwise –1 is
returned.
5. Specifying How a File is to be Opened
File Access
You must specify how the file is to be accessed. Exactly one of these
values must appear in the 2nd argument of open():
O_RDONLY
The file is opened for reading only. If the file doesn’t exist, and the create
option (next slide) isn’t specified, open() will fail (note that it is not
ordinary to open a file to be read with the create option specified)
O_WRONLY
The file is to be opened for writing only. If the file doesn’t exist, and the
create option is not specified, or if the file does exist, and the create option
is specified along with the fail if file exists option, open() will fail.
O_RDWR
The file is opened for full access. Errors specified above can occur,
however.
If none of these are specified, open() can succeed, but nothing can be
done to the file. (See open2.c in the demo directory)
6. Specifying How a File is to be Opened
File Opening Options
The following are used to further specify how a file is opened. They are
combined with the access and possibly each other:
O_APPEND
all access is at end of file
O_CREAT
create file if it doesn’t exist (does not cause failure if file exists).
Use a third argument to specify mode.
O_TRUNC
Destroy the existing contents of the file, if it exists.
O_EXCL
When combined with O_CREAT, causes open() to fail if the file
already exists.
7. Specifying How a File is to be Opened
Exercises/Examples
Open a file named abc.txt for writing, only if it exists. If it does, the
present contents should be preserved, and writing should occur after the
present contents.
Give an open() command equivalent to the C++ command
Dest.open(“abc.txt”), where Dest is an ofstream
Give an open command that opens a file abc.txt for writing only if it
doesn’t already exist.
Give command(s), to open a file abc.txt for writing. If abc.txt exists, the
user should be prompted, and then the file opened for writing only if the
user answers affirmatively to a prompt to overwrite the file. Otherwise,
the file is created.
Repeat the previous exercise, using C++
8. Low-level Reading and Writing
Using low-level (system) calls, I/O can only be accomplished via a buffer,
whose size is determined by the user.
small buffer requires more device accesses, slowing program; some
systems override with their own buffer
Commands:
read(<file descriptor>,<address of buffer>,<# bytes to read>)
write(<file descriptor>,<address of buffer>,<# bytes to read>)
value returned is # bytes read; 0 => eof (read only), -1 => error
9. Moving Around in A File
Low-level access via File Descriptor
lseek(<file descriptor>,<amount>,<relativity>)
where
<file descriptor> is a file descriptor of an open file
<amount> is the number of bytes to move the file descriptor...
<relativity> is the point from which the operation takes place
SEEK_SET (0) is start of file
SEEK_CUR (1) is current position
SEEK_END (2) is end of file
if <amount> is positive with this option, there will be a hole in the file.
the return value is the new offset in the file
10. Exercises
Give an lseek() command to find the present offset in a file
How is the present offset in a file found when the file is
accessed via a file pointer?
Give a loop to write the values 1-100 in the same spot of a
text file.
Why is the seek command basically worthless in a text file,
except to go to the beginning or end of the file?
11. Program Creation and Execution
exec family of system calls
using fork() to create processes
process termination
waiting for processes
12. exec Family of System Calls
Execute a program using one of the exec calls
Six exec functions
Can provide a list and have function build argv[] vector or provide the
vector itself
Can specify path to program or use path variable to find it
Can use present environment, or specify new environment
environment defined by setenv items. Type set at Unix prompt to see
environment
called program occupies environment of caller (takes its pid, as new
process isn’t created). Only replaces text, data, heap, and stack segments
of caller of exec
execs are the only function calls that don’t return
13. execl & execv
execl
give pathname to file (incl. name) as 1st argument
Follow with list of command line arguments; executable 1st
this list is used to build the argv vector passed to the called program
executable name is in argv[0]
last argument must be null char *
execv
give pathname to file (incl. name) as 1st argument
Follow with null-terminated argv vector
14. execlp & execvp
execlp
give filename as 1st argument; path environment variable is
used to find executable
Follow with list of command line arguments; executable 1st
this list is used to build the argv vector passed to the called program
last argument must be null char *
execvp
give filename as 1st argument; path used
Follow with null-terminated argv vector
15. Exercise
Suppose a program X is executed whose
purpose is to call another program Y whose
name and command-line arguments are
passed in to X as command-line arguments.
1. Write an execlp and an execvp command,
as they would appear in X, to execute Y.
2. What problem could arise if we choose to
use execl or execv?
16. Process Creation
fork & vfork
Processes can only be created using a fork
command
fork is called once, but returns twice
parent calls fork, value is returned to each of parent and
child
created process is called the child
existing process is the parent
Process identifier (pid) is used to distinguish
between parent and child
Every process has a unique pid in the system
17. fork
Called once, returns twice; return type is pid_t, unsigned int type for
pids
Parent receives child’s pid as return value of fork
child receives 0 as return value from fork
Order of execution depends on scheduling algorithm of system.
Order can be controlled; important if there is dependence between parent and
child
Child and parent execute in their own environments
Child gets copy of parent’s environment (including data space) as it is at
time of fork
If parent has file open, child gets copy of file pointers or descriptors
If parent’s stdout (cout) is redirected before fork, child output to stdout is also
redirected
18. Child Process’ Inheritance
user id, group id
controlling terminal
current working directory
root directory
file mode creation mask
signal mask (later) and dispositions
close-on-exec flag for open file descriptors
environment
attached shared memory segments
resource limits
19. Items That Distinguish
Parent From Child
Return value from fork
unique pid
different parent pid (ppid)
child’s ppid is the parent’s pid
parent’s pid stays same
certain time values reset in child
parent’s file locks aren’t inherited
pending alarms are not inherited
pending signals in parent aren’t inherited
20. vfork
Maintained for compatability, vfork is not often used
Some systems implement it as a fork
Child executes first. Parent doesn’t resume until child either
exits (via call of exit() or _exit()) or uses exec to start
another program.
The child is a separate process, but it shares the parent’s
data space. vforkdemo.c shows that this is the case
21. vforkdemo.c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
•Parent output is same as child’s
int glob ;
int •If separate data spaces, parent wouldn’t experience ++
main(void){
operations
int var;
pid_t pid;
glob = 10; var = 20;
printf("before vforkn");
if (vfork()== 0 ) {
sleep(1);
glob++;
var++;
printf("child glob= %d var= %dn", glob, var);
_exit(0); // what if this is just exit()?
}
else {
printf("parent glob= %d var= %dn", glob, var);
}
}
22. When a process terminates...
Its parent is informed via a SIGCHLD signal
If its parent doesn’t “collect” it, it becomes a zombie
process (next slide)
If its parent has already terminated, it is inherited by the init
process (pid 1), which “collects” it using a wait operation (2
slides hence).
All processes must have an identifiable (by pid) parent
It is able to provide minimal information to its parent when
it comes to “collect” it
23. Zombie Process
A zombie is a process that has terminated, but hasn’t
been “collected” by its parent
The system frees most of the resources associated
with this terminated process, e.g. closing open files
and freeing memory,and minimally maintains:
process id
CPU time used
termination status
24. wait operations
A terminated process’ information is “collected” via a
call of a wait operation by its parent
Operations:
wait
blocks the calling process until a child process terminates,
returning the child’s pid.
If caller has no children, wait immediately returns –1 (error)
the termination status (return value) of the child may be obtained
via the argument
waitpid
permits a caller to wait for a particular child, identified by its pid
can be configured to not block if the child is active
25. Process Termination
Normal:
1. main contains return statement
2. call exit
3. call _exit
Abnormal:
1. Process aborts itself
2. Process receives a signal (e.g. division by 0)
26. exit() vs._exit()
exit() causes a process to terminate.
stdio funtion fclose is applied to all open streams
the argument is the exit status
exit status becomes termination status when exit() calls _exit()
_exit is a Unix specific command
the process terminates immediately without any cleanup
see vfork examples
If a process doesn’t explicitly specify exit status via
a call to exit() or _exit(), the process’ termination
status will be undefined
27. Exercises
1. What can happen if a process calls exit
while writing to a buffered output stream?
2. Write code guaranteed to create a zombie
process.
3. Write code that will cause a process to be
inherited by the init process.