System Programming
Credits to: CS341 System Programming at UIUC
- Mini course: http://cs-education.github.io/sys/#lessons
- Lecture Handouts: https://github.com/angrave/CS341-Lectures-SP24
- Coursebook: https://github.com/illinois-cs241/coursebook
To pull starter code for assignments:
git pull release main --allow-unrelated-histories --no-rebase
May have to power on the server at https://vc.cs.illinois.edu/.
1. C Language
1.1. Function Parameter List
(void)
: Called with zero argument only.
()
: Called with any number of arguments.
main()
: A special function. Standards:
int main(void)
int main()
int main(int argc, char * argv [])
1.2. Preprocessor
Text substitution the compiler performs before actually compiling the program.
#define A B
macro: Substitute A with B. Can substitute function together with arguments.
1.3. switch
and break
1.4. TODO extern
1.5. TODO File Descriptor-Level (System Call) Input/Output
Type man
in the shell to access the man page. They are organized in sections. Section 2 are system calls, section 3 are C libraries, and section 7 are longer articles.
write()
unistd.h
3 arguments:
- file descriptor (
int
):- standard output: 1 or
STDOUT_FILENO
. If redirecting the output to some file (e.g.,./a.out > output.txt
), then the standard output will be in that file. - standard error: 2 or
STDERR_FILENO
. Will be on screen but not the file. - See the return value of
open()
- standard output: 1 or
- pointer to a character (
void *
, without specific type)- e.g.,
"Hello\n"
- e.g.,
- the number of characters to write (
int
)- e.g., 6 for entire
"Hello\n"
- e.g., 6 for entire
- file descriptor (
int open(const char *pathname, int flags, mode_t mode);
<fcntl.h>
pathname
: pointer of path name.flags
: bitwise OR of one or more flagsO_CREAT
: create if the file does not existO_TRUNC
: truncate to length 0 (overwrite). Requires that the access mode allows writing.O_RDWR
: request opening the file with read/write access.
mode
(essentially an integer,<sys/types.h>
and<sys/stat.h>
): bitwise OR of one or more modesS_IRUSR
: user has read permission.S_IWUSR
: user has write permission.
Return value: a file descriptor (
int
). The lowest unused file descriptor.-1
if an error occurred.
close()
1 argument:
- file descriptor (
int
).
E.g.,
#include <unistd.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> int main() { mode_t mode = S_IRUSR | S_IWUSR; int fd = open("output.txt", O_CREAT | O_TRUNC | O_RDWR, mode); write(fd, "Hello\n", 6); close(fd); return 0; }
E.g., if
close(1)
at the very beginning:#include <unistd.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <stdio.h> int main() { close(STDOUT_FILENO); mode_t mode = S_IRUSR | S_IWUSR; int fd = open("hello_world.txt", O_CREAT | O_TRUNC | O_RDWR, mode); printf("Hello\n"); close(fd); return 0; }
then
open()
will return 1 as the file descriptor. In this case, those outputs used to go to the standard output (e.g.,printf
) and those delibrately written to the file will all go to the file.- file descriptor (
read()
ssize_t read(int fildes, void *buf, size_t nbyte);
:<unistd.h>
- read
nbytes
bytes from file descriptorfildes
to buffer pointed to bybuf
. - return value: s(igned) size type.
- if successful: the numbe rof byptes actually read
- end-of-file: 0
- error: -1
P.S.: May want to manually put a
\0
at the end. E.g.,char buffer[4096]; ssize_t len = read(0, buffer, sizeof(buffer) - 1); buffer[len] = '\0';
- read
1.6. C Library’s Input / Output
1.6.1. File stream, FILE
A file stream (or file descriptor in system calls) is the base interface to everything external to RAM, including:
- files
- pipes
- sockets
- standard streams (C library):
stdin
,stdout
,stderr
. They are ofFILE*
type, and a newFILE*
could be created usingfopen()
.
See ferror()
, feof()
, fgetc()
, etc.
1.6.2. printf()
family
<stdio.h>
Even if put before system-call IO codes, its output may be still after their outputs
C library has a buffer (for performance), printf()
is only going to call write()
when
- that buffer is full
- finish a line
- close a file stream
It writes to the standard output. To write to some specific file descriptor, one can use the trick in the second exampe in close()
.
fprintf()
: To print to other file stream
stderr
is not buffered
asprintf(char **ret, const char *format, ...)
:
Takes in an address of a char pointer, allocate enough memory, and set the pointer (
*ret)
to be a pointer to that memory.char * ptr = NULL; asprintf(&ptr, "%d", 123); printf("%s\n", ptr); free(ptr);
Free the pointer later on.
1.6.3. puts()
puts()
<stdio.>
Taskes a char pointer and print it.
1.6.4. TODO scanf()
family
int scanf(const char * format, ...);
: <stdio.h>
format:
d i u o x f c s p int hex … float char string (one word) pointer - return value: the number of input items assigned.
sscanf()
:
2+ arguments:
- input string
- format string
- pointers to memory where the extracted output will be stored. E.g., to store in integer variable
x
,& x
should be put there.
Return value: number of arguments it correctly parsed.
fscanf()
: The same except that the first argument is a file descriptor. File descriptor stdin
represents the standard input.
scanf()
: Read from standard input. No file descriptor as its input.
1.6.5. gets()
gets()
: Takes a char pointer and read the content into it. When the read content is longer than the char array, it will overflow to other variables.
E.g.,
char buffer[12]; gets(buffer);
1.6.6. getline()
ssize_t getline(char **lineptr, size_t *n, FILE * stream);
:
stream
:stdin
or fromfopen()
.- Prepare a C-string (as a char pointer) and a capacity variable. Pass these two by reference (as their addresses) in order to let
getline()
modify them. getline
- reads from the file stream
stream
(newline at the end included), - allocate memory (in heap) for the content,
modify the C-string pointed to by
lineptr
to the content,lineptr
must be initialized toNULL
before the first call.and update the capacity pointed to by
n
to capacity of the allocated memory.n
must be initialized to 0 before the first call and retain the correct value (modified bygetline()
) in order to letgetline
properly allocate memory.
- reads from the file stream
Free the buffer to avoid memory leak.
P.S., when calling
getline()
to read longer text, it may allocate a new starting address while freeing the old address. Only need to handle freeing the last address returned.
Return value: number of characters read, excluding \0
, in type ssize_t
(signed size). -1 if it fails to read a line or encounters end-of-file.
E.g.,
#include <stdio.h> #include <stdlib.h> int main() { char * buffer = NULL; size_t capacity = 0; ssize_t num_char = getline(& buffer, & capacity, stdin); printf("%d \"%s\" %d\n", num_char, buffer, capacity); }
abcd abcd abcd 15 "abcd abcd abcd " 16
abcd abcd abcd abcd 20 "abcd abcd abcd abcd " 32
To strip off the newline at the end (if any)
if (num_char > 0 && buffer[num_char - 1] == '\n') buffer[num_char - 1] = '\0';
1.7. Detecting error
assert()
: <assert.h>
. Assert an expression that is supposed to be true. If the expression is actually false, the process will be aborted.
1.8. Sizes, Pointers
1.8.1. Integer
INT_MIN
and INT_MAX
: <limits.h>
, the range of int
on a particular machine.
The size of int
is dependent on architecture and compiler. However, C promises that an int
will be at least 16 bits.
1.8.2. Character
A char
is always one byte. However, for C, a byte does not mean 8 bits, but at least 8 bits. (On modern machines, 1 byte typically = 8 bits)
CHAR_BITS
: <limits.h>
, the number of bits for a char
on a particular machine.
1.8.3. sizeof()
1 argument: a type / variable
Return value: the size of the type/variable in bytes.
Evaluated at compile-time (as opposed to link time and run time).
- Not link time: E.g.,
sizeof()
astruct
must be after the full declaration of thestruct
. Claiming thestruct
, followed bysizeof()
and then its full declaration will not compile. E.g.,
int a = 0; size_t a_size = sizeof(a ++); printf("a: %d\n", a);
1.8.4. NULL
NULL
pointer has a address of 0.
Dereferencing a NULL
pointer will cause segmentation fault. To avoid that, check if the pointer is NULL
.
1.8.5. int
Arrays/Pointers
When declaring an array, the variable is a pointer to the address of the first element of that array.
To declare a pointer to the whole array, do int (*ptr)[<same_length>]
.
E.g.,
#include <stdio.h> int main() { int a[8]; printf("a is at %p, a+1 is at %p, a+8 is at %p\n", a, a+1, a+8); int *p1 = a; printf("p1 is at %p, p1+1 is at %p, p1+8 is at %p\n", p1, p1+1, p1+8); int (*p2)[8] = &a; printf("p2 is at %p, p2+1 is at %p\n", p2, p2+1); return 0; }
a is at 0x16f5d3308, a+1 is at 0x16f5d330c, a+8 is at 0x16f5d3328 p1 is at 0x16f5d3308, p1+1 is at 0x16f5d330c, p1+8 is at 0x16f5d3328 p2 is at 0x16f5d3308, p2+1 is at 0x16f5d3328
To read/write the address of a pointer, add an asterisk (*(a)
). For int
array, we can alternatively use square bracket (a[0]
). Additionally, we can do 0[a]
.
E.g.,
*(a) = 100;
*(a+1) = 101;
printf("a[0] is %d, a[1] is %d\n", a[0], a[1]);
a[0] is 100, a[1] is 101
1.8.6. char
Arrays/Pointers
C string has a special character \0
at the end.
sizeof()
a string returns the length of the string +1, because each character is one byte. It also counts \0
inside the string.
E.g.,
printf("sizeof(\"Hello\\0World\") = %d\n", sizeof("Hello\0World"));
sizeof("Hello\0World") = 12
A character pointer char *ptr = <string>
is a character array, and ptr
will point to the first character. If deferencing the last character (\0
), we will get a 0. sizeof(ptr)
is the size of the pointer itself.
E.g.,
char *ptr = "hello"; printf("ptr[4] = '%c', ptr[5] = '%c' or %d\n", ptr[4], ptr[5], ptr[5]);
ptr[4] = 'o', ptr[5] = '' or 0
Alternatively, we can declare it in the syntax for array, char arr[] = <string>
. In this case, sizeof(arr)
is the size of the whole array (including the \0
at the end).
E.g.,
char arr[] = "hello"; printf("%d\n", sizeof(arr));
6
strlen()
: <string.h>
. Pass a character array to get its length (until but does not include the first \0
).
E.g.,
char *ptr = "Hello\0World"; printf("strlen(ptr) = %d, strlen(\"Hello\\0World\") = %d\n", strlen(ptr), strlen("Hello\0World"));
strlen(ptr) = 5, strlen("Hello\0World") = 5
If a character pointer (char * ptr
) is initialized with a constant string, then modifying individual characters of it will cause segmentation fault. This is because the memory will be read-only.
E.g.,
char *ptr = "hello"; ptr[0] = 'H';
will give an error.
However, if a character array (char arr []
) is initialized with a string, then the memory is in the stack and we can modify the array.
E.g.,
char arr[] = "hello"; *arr = 'H'; printf("%s\n", arr);
Hello
strcpy()
<string.h>
Two arguments:
- destination: The character pointer that points to the memory to which the string will be copied.
- string.
E.g.,
char ptr[] = "hello"; printf("%s\n", ptr); strcpy(ptr, "world"); printf("%s\n", ptr);
hello world
strdup()
: Duplicate the string pointed to by the input pointer and return a pointer to the duplicate. Free the pointer to avoid memory leak.
Pointer to pointer to char
/ 2D char
array: Use char **ptr
or char * ptr []
.
E.g.,
char * p1 [] = {"ab", "xyz"}; char ** p2 = p1; printf("p1[0]=%s, p1[1]=%s\n", p1[0], p1[1]); printf("p2[0]=%s, p2[1]=%s\n", p2[0], p2[1]);
p1[0]=ab, p1[1]=xyz p2[0]=ab, p2[1]=xyz
1.9. Function Pointer
A pointer that stores the address of a function’s code.
With this, we can pass a function as an argument
This allows us to reuse code
- Function pointer
- Pointers to functions. Allow you to pass functions as argumetns to other functions.
Declaration:
returnType (*pointerName) (paramTypes, ...);
- Initilization:
returnType (*pointerName) (paramTypes, ...) = &funcName
or= funcname
. - Invocation: dereference the pointer
(*pointerName)(arguments)
or simplypointerName(arguments)
Passing it as arguments:
anotherType anotherFunc(returnType (*pointerName)(paramTypes, ...));
Typedef:
typedef returnType (*Type)(paramTypes, ...); Type pointerName;
E.g.,
#include <stdio.h> typedef int (*funcType)(int, int); int add(int a, int b) { return a + b; } int subtract(int a, int b) { return a - b; } int main() { funcType operation1 = add; funcType operation2 = subtract; printf("%d\n", operation1(1, 2)); printf("%d\n", operation2(1, 2)); return 0; }
1.10. Structs
1.10.1. TODO typedef
typedef
: Make an alias of a type.
E.g.,
typedef long long ll; ll n;
Function type: todo
1.10.2. struct
struct
: struct <name>
to define a structure called <name>
. struct <name> var~ to declare a variable of that structure. <var>.<attr>
to access an attribute. If ptr
is a pointer to the variable, then we can use ptr-><attr>
to access an attribute.
1.11. time
, ctime
time()
<time.h>
- pointer to where the returned time will be stored, or a null pointer.
Return value: number of seconds since 1970, in type
time_t
. Use&
to get its address.ctime()
Converts the given time value in
time_t
to a human readable format. Note that it stores the result in a static storage and will overwrite the storage if called again later on.- pointer to the
time_t
time.
Return value: A character pointer to the output string.
- pointer to the
1.12. TODO Memory Use
char *t1 = "abcd";
: The “abcd” is in the text segment, which is read-only, and t1
points to the address of it.
char t2[] = "abcd";
: The “abcd” is first stored in the text segment. Then, C creates enough space for the array on the stack and copies the “abcd” into the array.
*t2 = 'A';
: modify an element of the array t2
on the stack.
Stack value is unclean.
Global memory: defiend globally
1.13. main()
, arguments, return value
main()
:
int argc
: the number of argumentschat * argv[]
: an array of character pointers. The first ([0]
) character pointer points to the name of the command (e.g.,./a.out
). Words are separated by whitespaces. To join several words, use double quotes to wrap them.
E.g.,
#include <stdio.h> int main(int argc, char* argv[]) { printf("argc is %d\n", argc); for (int i = 0; i < argc; ++ i) { printf("argv[%d]=%s\n", i, argv[i]); } return 0; }
% ./a.out abc "xyz 123" argc is 3 argv[0]=./a.out argv[1]=abc argv[2]=xyz 123
Alternatively, we can iterate through argv
and count until a null pointer is encountered (there is a nulll pointer at the end).
Return value of main()
:
- Standard: 0
echo $?
to print the execute value of the last command (e.g.,./a.out
)
atoi()
: <stdlib.h>
. Converts a string to a integer. If the string does not represent an integer or the integer is out of the range of int
, it will return 0.
1.14. Environment Variables, environ
, getenv()
extern char **environ;
(global) To print all the environment variables, iterate through
environ
(stop criterion is doable because the last element is null)extern char ** environ; int main() { char ** ptr = environ; while (*ptr) { printf("%s\n", *ptr); ptr ++; } return 0; }
E.g., HOME
and USER
can be found in the output.
- In shell
- Can add a new variable by
export <NAME>=<value>
. - Print a variable:
echo $<NAME>
orecho ${<NAME>}
.
- Can add a new variable by
getenv()
<stdlib.h>
- Pass the variable name to get the value. If the variable does not exist, will return a null pointer.
- also see
setenv()
andputenv()
.
E.g.,
char * home = getenv("HOME");
1.15. Valgrind
To detect memory leak:
valgrind --leak-check=full --show-leak-kinds=all myprogram arg1 arg2
2. Processes
A process is the base computation container on Linux.
2.1. Forking
pid_t fork(void)
<unistd.h>
.- Clone the current process to create a child process (Very expensive).
- Return value: type
pid_t
. returns a value of 0 to the child process and returns the process ID of the child process to the parent process.
Not shared data (Duplicated, independent):
- Heap memory: Duplicated for each process.
- Non-read-only data segment: duplicated for each process.
- File descriptor tables (mapping from numbers to file): Duplicated for each process, meaning that closing a file descriptor does not affect the file descriptor referring to the same object in the other process.
- Signal handlers: Signal handlers installed by the parent process are inherited by the child process.
Shared data between parent and child process:
- Text segment: Executable Code.
- Read-only data segment: Read-only data, such as constants and static variables, is shared between the parent and child processes
- Open file descriptors: Meaning that these descriptors reference the same underlying objects. (Otherwise how can they be not shared? XD)
- Output buffer:
- Issues: If not flushed manually, these problems may arise: (1) The output from the parent process is mixed with that of the child process. (2) If the child process
exec
another program and then exit, the buffer will be deallocated, causing the output from the parent to be lost. Resolution:
fflush(some_file); // e.g., stdout pid_t pid = fork();
- Issues: If not flushed manually, these problems may arise: (1) The output from the parent process is mixed with that of the child process. (2) If the child process
getpid()
<unistd.h>
the process ID of the calling process.getppid()
- the parent’s pid.
What happends to child processes if their parents die first: They will be re-parented to pid 1.
2.1.1. ulimit
ulimit
- To prevent fork bomb, we can set a limit for how many times you can fork via
ulimit
.It is terminal session specific. To configure the setting to be applied everytime we start a Bash shell session, add
ulimit -n 400
into
./bashrc
.- The recommended threshold is 400.
- To apply the changes to your current shell session, you can either start a new shell session or run
source ~/.bashrc
. ulimit -a
and check “open files” for the setting.
2.2. Waiting
pid_t waitpid(pid_t pid, int *status, int options);
<sys/wait.h>
Wait for processpid
with option (of action)options
and store the status instatus
.pid
: Which process the caller wants to wait for. Ifpid
is -1, waitpid waits for any child process (seewait()
).status
: A pointer to an integer where the exit status or termination information of the child process will be stored.options
: Controlls the behavior of waitpid.- 0 (default), the caller will wait (block) until the child (1) terminates or (2) is stopped by a signal (3) is resumed by a signal.
WNOHANG
, the caller will not block.- If
pid
has not been stopped/exited chlildren (still running), return 0. - If
pid
has changed state (terminated, stopped, etc.), waitpid will return its process ID immediately.
- If
- The exit status is stored in
*status
.
- Return value (in general):
- If a child process is stopped/terminated, returns its pid
- If on error, return -1.
- Exit status:
WEXITSTATUS(status)
to get the exit status
pid_t wait(int *wstatus);
Wait one of the children. Equivalent to
waitpid(-1, &wstatus, 0);
.E.g., to wait for all children:
while ((wpid = wait(&wstatus)) > 0);
- Zombie Process
- A zombie process is a process that has terminated but still has an entry in the process table because its parent process has not yet collected its termination status via a
wait
system call. Zombie processes consume system resources (such as a process table slot and some memory for the process control block), but they do not consume CPU time. - Orphan Process
- An orphan process is a process whose parent process has terminated or ended before it has finished. The orphaned child process will become a child of the init process, which has process ID 1.
2.3. exec(
)
exec
series- Replace any lines of code after
exec()
with the program executed byexec()
. The exec process will replace the existing place.execl(const char *path, const char *arg0, ..., /*, (char *)0, */))
- “l” stands for “list”.
arg0
,arg1
, …: The zeroth argument points to the process name, which is by convention the file name, but it can be actually an arbitrary name.- This is a list of points to arguments. After that, we must pass a NULL pointer
(char *)0
.
execlp(const char *file, const char *arg0, ..., /*, (char *)0, */);
- “p” stands for “PATH”. Use the “PATH” environment variable as the search path for
file
. execv(const char *path, char *const argv[]);
- “v” stands for “vector” (array). Arguments are passed as an array of pointers.
- Return value: It returns only if an error has occurred and the return value is -1.
2.4. fork-exec-wait
int main() { pid_t pid = fork(); if (pid < 0) { // fork failure perror("fork"); exit(1); } else if (pid > 0) { // in parent int status; int waitResult = waitpid(pid, &status, 0); // If on error waiting if (waitResult == -1) { perror("waitpid"); exit(1); } // if existed but the exist status is non-zero if (WIFEXITED(status) && WEXITSTATUS(status) != 0) { exit(1); } } else { // in child /* obtain args */ if (execvp(args[0], args) == -1) { perror("execvp"); exit(1); } } }
3. Signals
POSIX signals are software interrupts.
Two sources of signals:
- Kernel
- User processes
Signal names: of type int
SIGINT
: Ctrl C. Interrupt program.SIGKILL
: kill program.SIGSEGV
: segmentation violation.SIGALRM
: terminate program. commonly used in scenarios where you need to enforce timeouts. Seealarm()
below.SIGSTOP
andSIGCONT
: suspend the execution of a process without terminating it. Once a process is stopped, it retains its state and resource allocation. The process remains stopped until it receives aSIGCONT
signal.
unsigned alarm(unsigned seconds);
<unistd.h>
. Afterseconds
seconds, send aSIGALRM
signal to the current process.
3.1. Terminology
- Generated
- Signals are generated by the operating system or by specific actions within a program. For example, when a user presses Ctrl+C to interrupt a program, the operating system generates a SIGINT signal and sends it to the process running in the terminal.
- Pending
- If a signal is generated for a process but that process is currently handling another signal, the new signal is said to be pending until the process finishes handling the current signal.
- Blocked
- Processes can block signals, meaning they won’t be delivered or handled until the process unblocks them. This is useful for critical sections of code where you don’t want signals to interrupt execution.
- Delivered
- When a signal is sent to a process and it’s not blocked, the signal is delivered to the process and can be handled by a signal handler.
- Caught
- When a signal is delivered to a process and the process has registered a signal handler for that signal, the signal is considered caught. The signal handler is a function that gets executed in response to the signal.
- Disposition
- The disposition of a signal refers to what action the operating system takes when a particular signal is caught by a process. The disposition can be set to default (the default action for the signal), ignore (the signal is ignored), or a specific signal handler function.
Signals in child processes/threads:
- Signals can be inherited by child processes in Unix-like systems. When a parent process creates a child process, the child inherits the signal disposition of the parent.
- Threads within a process share the same signal handlers, meaning if one thread catches a signal, it affects the entire process.
3.2. Sending Signals:
Signals can be generated by users, system events, or other processes, enabling the initiation of actions or responses. Examples include user-triggered signals (e.g., pressing CTRL-C) and system-generated signals in response to events like memory access violations. Preferably, signals should not be relied upon as the primary driver of program logic due to inherent drawbacks and limitations.
kill
(man -S1 kill
)- Send signal in shell.
kill(pid_t pid, int sig)
(man -S2 kill
)<signal.h>
Send the signalsig
topid
. The signal could be one of the signal names above. Thekill()
function is not only capable of sendingSIGKILL
.raise(int sig);
- Sending
sig
to the process itself. Same as callingkill()
withgetpid()
.
3.3. Pending Signals
int sigpending(sigset_t *set);
- The sigpending function returns a mask of the signals pending for delivery to the calling process in the location indicated by
set
.
3.4. Handling Signals:
Signal handlers execute in response to signals, but there are strict limitations on the executable code within them. Most library and system calls are considered async-signal-unsafe, necessitating caution and adherence to reentrant safety principles. Ensuring signal handler safety involves meticulous consideration of shared resources, multithreading, and synchronization.
void (* signal(int sig, void (*func)(int)))(int);
- Pass the funtion name or the address of the function.
sig
: Specifies the signal for which the handler is to be set.func
: Pointer to the function that will handle the signal. It must have the following signaturevoid func(int sig)
.- It calls the function
func
when the specified signalsig
is received.
int sigaction(int sig, const struct sigaction *restrict act, struct sigaction *restrict oact);
allows programs to change the action taken by a process upon receipt of a specific signal. It provides a more flexible and reliable way to handle signals compared to the older
signal
function.act
: A pointer to astruct sigaction
object that specifies the new action to be taken for the signal.oldact
: (Optional) A pointer to a struct sigaction object where the previous action for the signal will be stored.void (*sa_handler)(int)
: A pointer to the signal handler function. Alternatively, sa_sigaction can be used for more complex handlers.sigset_t sa_mask
: A set of signals to be blocked (masked) while the signal handler is executing.int sa_flags
: Flags that modify the behavior of signal handling, such asSA_RESTART
, which specifies whether interrupted system calls should be automatically restarted.
3.5. Blocking Signals:
Sigprocmask enables blocking, unblocking, or setting signal masks, affecting the delivery of signals to a process or thread. Proper initialization and handling of signal sets are crucial to prevent errors and ensure desired signal blocking behavior.
- Signal mask
- a set of signals that are currently blocked for the process.
Signal set
- Create a signal set by
sigset_t set
.int sigemptysetsigaddset(sigset_t *set, int signo);
: initializes a signal set to be empty.int sigaddset(sigset_t *set, int signo);
: Adds the specified signal signo to the signal set.int sigfillset(sigset_t *set);
: initializes a signal set to contain all signals.
int sigprocmask(int how, const sigset_t *restrict set, sigset_t *restrict oset);
allows a process to manipulate its signal mask.
how
:SIG_BLOCK
: The new mask is the union of the current mask and the specifiedset
.SIG_UNBLOCK
: The new mask is the intersection of the current mask and the complement of the specified set.SIG_SETMASK
: The current mask is replaced by the specifiedset
.
oset
: Ifoset
is not null, it is set to the previous value of the signal mask. Otherwise, do nothing.
Usage:
sigset_t set; sigfillset(&set); sigprocmask(SIG_SETMASK, &set, NULL); // Block all the signals which can be blocked
3.6. Signals in Child Processes and Threads:
Child processes inherit signal dispositions and masks from their parent processes, affecting their handling of signals. Threads within a process have their own signal masks, but the process itself is treated as a collection of threads by the kernel. Blocking signals in multi-threaded programs requires setting signal masks at the thread level to control signal delivery.
3.7. Topics:
- Signal Handler Safety
- Signal Disposition
- Signal States
- Pending Signals when Forking/Exec
- Signal Disposition when Forking/Exec
- Raising Signals in C
- Raising Signals in a multithreaded program
4. Memory Allocators
4.1. Memory Layout
Text segment:
- Readable, executable. Not writable.
- Bottom. Next to data.
Data segment:
- Next to text/heap.
Heap:
- Readable, writable. Not executable.
- Next to data.
Stack:
- Readable, writable. Not executable.
- Goes down
Kernel:
- Top
static
: Add static
before variable declaration to make the lifetime of the variable the length of the process. static
variables are stored in data segment (also called static memory). Whether putting it globally or inside a function does not affect its lifetime, but only affects the scope of the variable name.
4.2. Memory Allocation API
Heap memory can vary in size.
void * sbrk(int n)
Increases the process’s data segment by n bytes. Returns a pointer to the base of the new storage (top of the increased heap) if successful; otherwise -1.
Use
sbrk(0)
to get the top of the heap.The internal implementation of
malloc()
etc. may callsbrk()
when additional heap memory is required.malloc()
<stdlib.h>
. Allocate memory of the given size (in bytes) in the heap without initializing it. Return the pointer to the first byte. If it fails to allocate the memory, it will return a null pointer.free()
<stdlib.h>
. Free the allocated memory pointed to by the given pointer. Set the freed pointer toNULL
to avoid dangling pointer a (pointer to invalid memory).free(NULL)
is a null operation.E.g., to return a pointer pointing to some memory allocated within a function
char * func() { char * ptr = malloc(128); if (!ptr) return ptr; strcpy(ptr, "hello"); return ptr; } int main() { char * ptr = func(); printf("%s\n", ptr); free(ptr); return 0; }
hello
Double free: Once the pointer is freed, the memory it points to may be allocated for other use (e.g., used by the heap itself). Thus, double free will cause unexpected behaviour.
void * memset(void *b, int c, size_t len);
- Write
len
bytes of valuec
(converted into a char) to the stringb
. void * realloc(void *ptr, size_t size);
- Tries to change the size of the allocation pointed to by
ptr
tosize
, and returns the pointer to the memory (NULL if there is an error).- If there is enough space starting from
ptr
, simply expand the allocation and do not copy old data. - If there is not, find a new location, copy the old data to there, and free the old allocation.
- If there is enough space starting from
void * calloc(size_t count, size_t size);
- Allocates memory for
count
elements that aresize
bytes each, initializes the memory to zero, and returns a pointer to that memory.
4.3. Placement Strategies
Elements in a struct is stored continuously (increasing for both stack and heap).
- Placement Strategies
- Best fit: Smallest location that is sufficienty large to hold the memory.
- Drawbacks: (1) Need to look through the entire memory space. (2) When realloc later, less likely to be able to expand it, has to realloc elsewhere and copy the data.
- Creates a lot of small fragments.
- Worst fit: The largest location.
- Drawbacks: Need to check the entire heap.
- Creates a lot of medium fragments.
- First fit: The first location that is big enough.
- Drawback: Fragamentation.
- Best fit: Smallest location that is sufficienty large to hold the memory.
- False Fragmentation
free memory blocks are available for allocation but are not contiguous, leading to wasted memory space that is not effectively utilized.
There are two types of false fragmentation. Be careful to differentiate available memory, allocated memory, and required memory:
- Internal fragmentation occurs when allocated memory is larger than required, resulting in wasted space within allocated memory blocks. This wasted space is internal to the allocated block.
- External fragmentation occurs when there is enough total memory space to satisfy a memory request, but the available space is fragmented into small, non-contiguous blocks, making it impossible to allocate a contiguous block of memory to satisfy the request.
5. Threads
- Thread
A thread (short for a thread of execution) is the smallest unit of execution within a process. It represents an independent sequence of instructions that can be scheduled to run by the operating system. Threads share the same memory space within a process and can communicate and synchronize with each other more efficiently than separate processes.
Memory sharing of threads within the same process:
- Independent (Duplicated): registers, stack pointer, and stack segment.
- Shared: heap segment, data segment, address space, file descriptors. This means they can access the same variables and data structures of the process.
A thread is much lighter to create compared to a process.
Re-entrant code: If
5.1. Pthreads
pthread is short for POSIX-thread.
<pthreads.h>
.
int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine)(void *), void *arg);
- Usually don’t need to worry about
attr
.- Will write the thread id to
thread
The thread executes
start_routine
with argumentarg
. Functione pointerstart_routine
takes in avoid *
and return avoid *
.Return value: exit value.
- Will write the thread id to
int pthread_join(pthread_t thread, void **retval);
suspends execution of the calling thread until
thread
terminates.- Store the return value (
void *
) to*retval
.
If the thread is already terminated, the return value will be 0.
#include <pthread.h> void* do_massive_work(void* payload){ /* Doing massive work */ return NULL; } int main(){ pthread_t threads[10]; for(int i = 0; i < 10; ++i){ pthread_create(threads+i, NULL, do_massive_work, NULL); } for(int i = 0; i < 10; ++i){ pthread_join(threads[i], NULL); } return 0; }
- Store the return value (
pthread_exit(...)
- same as
return ...
in astart_routine
function. pthread_cancel()
Tell the thread to stop, but there is no guarantee that how fast it will stop.
Better practice: Use a global flag variable, e.g.,
pls_stop
, to set whether to stop or not. Then, occasionally check in the thread the value of this variable.pthread_self()
- Return the pthread id of the caller.
5.2. Exit from Threads
exit()
/return ...
frommain()
: terminates the process with all threads.pthread_exit()
/return ...
from the starting function: terminates the thread. If that thread is the last thread, the process will finish.A lasy way to wait for all thread to finish: E.g.,
int main() { pthread_t tid1, tid2; pthread_create(&tid1, ...); pthread_create(&tid2, ...); pthread_exit(NULL); // exit this thread return 0; // never happen, the whole process wait until the last pthread ends }
5.3. Race Condition
Race conditions are whenever the outcome of a program is determined by its sequence of events determined by the processor. This means that the execution of the code is non-deterministic.
5.4. Reduce
In functional programming, there is an operation called reduce. reduce takes three parameters: an input list, a function to apply to the elements of that list, and an initial value (or base case). The function takes two inputs and returns a value. It is first applied to the base case and the first element, and from then on, the function is repeatedly applied to the cumulative result and the next element. reduce then returns the “reduced” value of the entire list.
int reduce(int *list, size_t length, reducer reduce_func, int base_case) { int result = base_case; for (size_t i = 0; i < length; ++i) { result = reduce_func(result, list[i]); } return result; }
Given that the reduce function and the reduce operation are commutative and associative, we can accelerate this procedure using parallelism
6. Thread-Based Parallelism
Recall the meomory sharing policies of threads and processes: Threads global variables (part of data segment) and heap segment, while processes do not. Therefore, we have to carefully use synchronization mechanisms to manage threads within the context of shared memory. On the other hand, making good use of threads can accelerate computation.
- Critical section
Part of code that should be executed by one thread at a time. This usually involves updating shared data
E.g.,
count ++
. This code actually first reads thecount
value, and then adds it, and then writes it back. When two threads run this code at the same time, the actual execution order might be that both of them read the value before the other write it back, so it may ends up addingcount
only once.- Atomic Operation
- An atomic operation is an operation that is performed as a single, indivisible unit of execution, without interruption from other threads or processes.
- Thread safe
- A piece of code is thread safe if it can be used with multi-thread.
6.1. Mutex (Mutual Exclusion)
#include <pthread.h>
pthread_mutex_init(&lock, NULL);
where
lock
is apthread_mutex_t
.Alternatively,
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
(global)pthread_mutex_lock(&lock)
,pthread_mutex_unlock(&lock)
- In between, do critical section stuff. Only one (but arbitrary) thread will do the critical section and others will be blocked.
pthread_mutex_destroy()
- destroy.
How to make it thread safe:
- Identify critical sections.
- Wrap them pthread mutex lock/unlock.
#include <pthread.h> pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER; int x; int main() { pthread_mutex_lock(&m); x = x + 1; pthread_mutex_unlock(&m); pthread_mutex_destroy(&m); return 0; }
6.2. Lock Contention
Lock Contention : One thread is forced to wait for another
6.3. Dead lock
#include <pthread.h> pthread_t tid1, tid2; pthread_mutex_t m = PTHREAD_MUTEX_INITiALIZER; void * func(void * param) { for (int i = 0; i < 100; ++ i) { pthread_mutex_lock(&m); // lock itself and the other thread } return NULL; } int main() { pthread_create(&tid1, 0, func, NULL); pthread_create(&tid2, 0, func, NULL); pthread_join(tid1, NULL); pthread_join(tid2, NULL); return 0; }
6.4. Semaphore
A semaphore is a non-negative integer variable. It maintains a count that represents the number of available resources or permits.
<semaphore.h>
int sem_init(sem_t *sem, int pshared, unsigned int value);
- Initialize
sem_wait(sem_t *sem)
- Decrements the value of the semaphore by 1. If the value becomes negative, the calling thread blocks until the semaphore value becomes positive.
sem_post(sem_t *sem)
- Increments the value of the semaphore by 1. If there are threads blocked on the semaphore due to a wait operation, one of them is woken up.
sem_destroy(sem_t *sem)
- destroy
#include <semaphore.h> sem_t s; int main() { sem_init(&s, 0, 10); sem_wait(&s); // Could do this 10 times without blocking sem_post(&s); // Announce that we've finished (and one more resource item is available; increment count) sem_destroy(&s); // release resources of the semaphore }
6.5. Condition Variables
Condition variables allow threads to block until a specific condition is met, at which point they can be woken up by other threads.
pthread_cond_t cv = PTHREAD_COND_INITIALIZER
- to initialize a cv.
int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex);
Unlock the mutex, then block the current thread until another thread signals the condition variable
cond
usingpthread_cond_signal
orpthread_cond_broadcast
. After the thread is woken up, lock the mutex.cond
is accosicated withmutex
and the corresponding shared data.mutex
protects the shared data. Before callingpthread_cond_wait
, the calling thread must acquiremutex
(pthread_mutex_lock
).Atomicity: It release the mutex lock while waiting on condition variable allowing other threads to access the shared data concurrently.
Spurious Wakeups: Threads waiting on a condition variable can sometimes wake up without the condition being signaled. This is known as a spurious wakeup and can occur due to system interrupts or other reasons (not because of the code).
pthread_cond_signal(&cv)
- Signals the condition to wake up a (at least one) thread. If multiple threads are waiting, which thread gets woken up is not specified.
pthread_cond_broadast(&cv)
Wake up all threads that are waiting on the condition variable.
pthread_cond_signal
andpthread_cond_broadcast
can be called by a thread wether or not it owns the mutex associated with the condition variable.- Usage pattern
Wrap
pthread_cond_wai
by while loop with a manually defined condition. Thepthread_cond_signal
is called somewhere this condition is satisfied.pthread_mutex_lock(&mutex); // Lock the mutex while (!condition) { pthread_cond_wait(&cv, &mutex); // Wait on the condition variable } // Do something after the condition is signaled pthread_mutex_unlock(&mutex); // Unlock the mutex
// immediately after the condition is met pthread_cond_signal(&cv);
There are two sets of control:
- Manually maintaing a condition: For example, maintain a global variable that monitors the condition.
while
makes sure that if a thread spuriously wakes up, it re-enter wait if this manual condition check fails. - Wait & Signal: Pthread’s utilities. They maintain the pthread condition variable based on their positions in the code.
- Manually maintaing a condition: For example, maintain a global variable that monitors the condition.
6.6. Barrier
- Barrier
A barrier ensures that all participating threads reach a certain point (i.e., the barrier) in their execution before any of them can proceed further.
This can be achieved by:
- Waiting: When a thread reaches a barrier, it will wait at the barrier until all the threads reach the barrier.
- Release: Once all threads have arrived at the barrier, the barrier is released, allowing all threads to proceed with their execution.
pthread_barrier_t
pthread_barrier_init()
pthread_barrier_wait()
pthread_barrier_init()
pthread_barrier_destroy()
6.7. Thread Pool
Problem description: A sequence of a same type of tasks, meaning that each task can be solved by a call to the same function. How to accelerate solving all the tasks?
- Thread pool
- Create a pool of fixed number of threads at the very beginning. Tasks are solved using these threads instead of new threads.
There are different ways to use the thread pool:
Parallelize the task sequence: Each task is solved by one thread and multiple tasks can be solved simultaneously.
We can use a thread-safe queue (whether or not it has finite capacity) to maintain the reading and solving of tasks: Everytime a task is read,
- Push it to the queue (if there is a capacity, we have to implement the waiting scheme).
“Assign” the task to a thread: This is actually done by each thread actively keeping pulling a task from the queue while the queue is non-empty and the reading has not finished.
In practice, this can be implemented by a condition variable. The conditions are just the two above. Inside the thread, it waits until the condition is met:
while (1) { pthread_mutex_lock(&m1); while (tasksCompleted == 0 && queueCount == 0) { pthread_cond_wait(&cv, &m1); // Wait on the condition variable } // printf("[%d] After cond var: tasksCompleted: %d, queueCount: %ld\n", tid, tasksCompleted, queueCount); if (tasksCompleted == 1 && queueCount == 0) { pthread_mutex_unlock(&m1); return NULL; } task_t *task = (task_t *)queue_pull(tasks); queueCount -= 1; pthread_mutex_unlock(&m1); solve(task); }
Outside the thread, send the release signal when one of the condition changes:
// Condition 1 while (read task != EOF) { pthread_mutex_lock(&m1); queue_push(tasks, task); queueCount += 1; pthread_cond_broadcast(&cv); pthread_mutex_unlock(&m1); } // Condition 2 pthread_mutex_lock(&m1); tasksCompleted = 1; pthread_cond_broadcast(&cv); pthread_mutex_unlock(&m1);
Parallelize each task: Tasks are solved one-by-one, each divided into subtasks that are solved using all the threads simultaneously.
We can use a barrier to block all the threads until the all the subtasks of the current task are solved. Everytime a task is read,
- Reset all relevant variables.
- Divide the task into subtasks.
“Assign” the subtasks to threads:
- Similarly, each thread actively tries to grab a subtask while there is a subtask remaining and the reading has not finished.
- After the thread solved a subtask, set a barrier.
- If this is the last subtask, signal the main thread that this task is completed (See below).
This can be implemented using a condition variable:
while (1) { pthread_mutex_lock(&m1); while (tasksCompleted == 0 && remainingSubtasks != glo_thread_count) { pthread_cond_wait(&cv1, &m1); } if (tasksCompleted == 1 && remainingSubtasks == 0) { pthread_mutex_unlock(&m1); return NULL; } pthread_mutex_unlock(&m1); solve the task; pthread_barrier_wait(&b); pthread_mutex_lock(&m1); remainingSubtasks -= 1; if (remainingSubtasks == 0) pthread_cond_signal(&cv2); pthread_mutex_unlock(&m1); }
In the main thread, after the subtasks are sent, set a barrier, wait until all subtasks are solved, and then collect the solutions.
while (read task != EOF) { // Reset relevant variables pthread_mutex_lock(&m1); remainingSubtasks = thread_count; pthread_cond_broadcast(&cv1); pthread_mutex_unlock(&m1); pthread_barrier_wait(&b); pthread_mutex_lock(&m1); while (remainingSubtasks != 0) { pthread_cond_wait(&cv2, &m1); } pthread_mutex_unlock(&m1); collect the solutions } pthread_mutex_lock(&m1); tasksCompleted = 1; pthread_cond_broadcast(&cv1); pthread_mutex_unlock(&m1);
7. TODO Deadlock
7.1. Resource Allocation Graph
- Deadlock
- defined as when a system (a set of rules by which a set of processes can move move from one state to another) cannot make any forward progress (at least one progress can change state).
- Resource Allocation Graph
- Bipartite graph. Processes and resources are nodes. If a process is using (holding) a resource, an arrow is drawn from the resource node to the process node. If a process is requesting (waiting) a resource, an arrow is drawn from the process node to the resource node.
- Coffman conditions
Including:
- Mutual exclusion: A resource can only be obtained by one process at a time.
- Hold and wait: Once a resource is obtained by a process, the process keeps the resource locked, until the process itself unlocks it.
- No pre-emption: Nothing can force the process to give up a resource, unless the process itself unlocks it.
- Circular wait: There exists a cycle in the Resource Allocation Graph, or there exists a set of processes {P1, P2,…} such that P1 is waiting for resources held by P2, which is waiting for P3,…, which is waiting for P1.
Theorem: A system is deadlocked if and only if all the Coffman conditions holds.
Remark: The first 3 conditions are more like the (natural) rules of the system, while the 4th is the triggering condition for deadlock.
8. Virtual Memory
- Virtual memory
- Virtual memory provides a layer of abstraction over physical memory, allowing programs to operate as if they have access to a large, contiguous memory space.
Protection: Virtual memory systems provide memory protection mechanisms that prevent one program from accessing the memory of another program. This enhances system stability and security by isolating processes from each other.
Each process is isolated and has its own virtual memory space, in which addresses look continuous to the process. However, different processes may share physical memory (e.g., parent and children processes).
- Efficient Use of Memory: Virtual memory allows for efficient use of physical memory by swapping less frequently used pages to secondary storage. This enables systems to run larger programs than would be possible with only physical memory.
- Ease of Programming: Programmers can develop applications without needing to worry about the physical memory limitations of the underlying hardware. They can write code assuming a large and contiguous memory space, which simplifies application development.
8.1. Translating Address
- Memory Management Unit (MMU)
A hardware unit that is part of the CPU. It converts a virtual memory address into a physical address.
A \(X\)-bit machine, meaning pointers are \(X\)-bit, can address \(2 ^{X}\) different locations where one location is one byte.
Naive lookup scheme: A \(2 ^{X}\)-entry (# of different physical addresses) table, where each entry records a \(X\)-bit physical address (which is \(X /8\) bytes). This consumes \(2 ^{X} \cdot X /8\) bytes.
The above lookup scheme is to inefficient. Instead, we can all addresses into pages and build a look up table for each page.
- Page
- A block of (continuous) virtual memory. A typical block size (# of addresses in a page)on Linux is \(2 ^{12}\) addresses. For a \(X\)-bit machine, assuming a block size of \(2 ^{P}\), there are \(2 ^{X - P}\) pages.
- Frame
- A block of (continous) physical memory. A frame has the same number of addresses as a virtual page.
- Page Table
The highest \((X-P)\) bits are the page number and the lowest \(P\) bits are the offset within the frame.
A page table is a map from a page number (physical) to a frame number (virtual). The table is in the order of page number.
In practice, the table only contains the second column. The page table itself consumes \(2 ^{X - P} \cdot (X - P) / 8\) bytes (usually rounded up \((X - P) / 8\) to some power of 2).
Memory lookup: First lookup the correspdoning frame number to the page number in the page table. And then append the offset to the frame number to get the physical address.
- Multi-level page tales
The lowest \(P\) bits are the offset. Suppose there are \(L\) levels. The higher bits \((X-P)\) are evenly divided into \(L\) segments, in which the \(i\)-th segment is the index used to lookup in the \(i\)-th subtable.
Memory lookup: The \(i\)-th table is a map from a level-\(i\) index to a \((i-1)\)-th subtable number. The lowest table is a map from a level-1 index to the offset in physical frame. Sometimes the higher-level (parent) subtables are called directories, while the lower-level subtables are called table. E.g., an entry in the directory maps to a table.
Table size: There is one top-level table and \(2 ^{(X-P) /L}\) \(i\)-th level subtables for \(i = L-1, \dots , 1\). Each table has \(2 ^{(X-P) / L}\) entries, each consuming \((X - P) / L / 8\) byptes (usually rounded up to power of 2). Thus, in total it consumes \([1 + (L-1) \cdot 2 ^{(X-P)/L}] \cdot 2 ^{(X-P)/L} \cdot (X-P) /L /8\) bytes.
8.2. Translation Lookaside Buffer
The more levels of tables, the slower the memory lookup.
- Translation Looksaide Buffer (TLB)
- A cache of recently-used virtual-page-to-frame lookups. Every time a virtual address needs to be translated into a physical address, TLB is queried in parallel to the page table.
Address per page.
Number of page: / address per page
MMU:
pages / frams does not corres only to ram
8.3. Frames and Page Protections
Page in: Read disk memory into physical frames.
Page out: Copy a physical frame into disk.
Frames and Sharing:
- Physical memory frames can be shared between multiple processes (no delay in modification on the memory).
- The page table entries, in addition to storing the frame number, can also store access permissions (read, write, execute) for each mapped page.
- Read-only frames can be safely shared across processes, allowing for efficient sharing of code segments (e.g., C library code).
- If a process attempts to write to a read-only page, it will trigger a segmentation fault (SIGSEGV) because the hardware enforces the read-only protection.
Page Bits and Protections: A bit in the page table entry (Not in the virtual address) that record the following information:
- Read-only Bit
- Marks a page as read-only, causing write attempts to generate a page fault. Used for sharing code segments and implementing copy-on-write.
- Execution Bit
- Indicates whether the bytes in a page can be executed as CPU instructions. Prevents code injection attacks by marking data pages as non-executable.
- Dirty Bit
- Tracks whether a page has been modified since it was loaded from disk (set the dirty bit if the page has been writtern to). This requires ??? that the backing store retain a copy of the page after it is paged into memory. If the page is clean, then it is uncessary to write it back to the backing store, vice versa.
- Other Bits
- Different architectures may have additional page-level bits for various purposes, such as tracking accessed pages, enabling memory protection keys, or implementing hardware-assisted virtualization.
8.4. Page Faults
- Minor
This occurs when a process tries to access a valid virtual memory address that is not currently mapped to any page in physical memory.
It typically happens when a process requests more memory (e.g., via
sbrk(2)
) but has not written to that memory yet.The operating system can handle this by allocating a new page frame, loading it into memory, and mapping it to the virtual address requested by the process.
- Major
This occurs when a process tries to access a virtual memory address that is mapped to a page that is not currently in physical memory (i.e., the page is swapped out to disk).
The operating system must read the page from disk into a free page frame in physical memory. This is a more expensive operation than a minor page fault, as it involves disk I/O.
- Invalid
This occurs when a process tries to access a virtual memory address that is not valid or not accessible to the process (e.g., trying to write to a read-only memory area or accessing memory outside the process’s address space).
The operating system typically responds to an invalid page fault by terminating the process and generating a segmentation fault signal (
SIGSEGV
).
9. Pipes and Files
9.1. TODO Pipes
Pipes were invented to take the output of one program and feed it to the input of another program.
Capacity of pipes depends on the system.
#include <unistd.h>
- Pipes in terminal
cmd1 | cmd2 | ... | cmdn
. From left to right, the output of the previous command is fed into the next command.int pipe(int fildes[2]);
<unist.h>
. Creates a pipe and allocates a pair of file descriptors.fildes[0]
connects to the read end of the pipe;fildes[1]
connects to the write end. Data written tofildes[1]
can be read fromfildes[0]
.If all file descriptors referring to the read end are closed, then the writing process to receive a
SIGPIPE
signal. The writer may catch this signal.Closing all the file descriptors referring to the write end is the only way to deliver end-of-file to a reader.
Usage:
int fildes[2]; pipe(fildes); pit_t p = fork(); if (p > 0) { /* Parent */ ... // E.g., read from fildes[0] } else { /* Children */ ... // E.g., write to fildes[1] }
Note that
fork()
duplicates file descriptors referring to the same objects for the children process, so we will have two open fd’s referring to the read end and the write end, respectively. It is common practice to close the unnecessary ends in the child and parent process to meet the criterion above (all fds are closed).pipe2()
- TODO
- Named pipes,
mkfifo
int mkfifo(const char *path, mode_t mode);
creates a new fifo file with namepath
. The actual contents of the pipe aren’t printed to the file and read from the file. The OS simply create an unnamed pipe that refers to the named pipe.In terminal,
mkfifo path
. Then,echo Hello > fifo
will hang untilcat fifo
is run on another terminal to read the content.int dup2(int oldfd, int newfd);
Duplicates an existing object descriptor
oldfd
using the file descriptor number speicified bynewfd
. Ifnewfd
is already in use,close()
is called first to deallocated it.file1 file2 A A | \ | | \ X | \ | oldfd newfd
9.2. File
long fseek(FIle *stream, long offset, int whence)
Seek to a new position in a stream. The new position, measured in bytes, is obtained by adding offset bytes to the position specified by
whence
. Ifwhence
is set toSEEK_SET
,SEEK_CUR
, orSEEK_END
, the offset is relative to the start of the file, the current position indicator, or end-of-file, respectively.May have to
fflush()
to update the position on time.
long ftell(FILE *stream);
- Tell current position in a stream;
void rewind(FILE *stream)
- sets the file position indicator to the beginning of the file.
ssize_t pwrite(int fildes, const void *buf, size_t nbyte, off_t offset);
- Write to the specified position in the file without modifying the file pointer.
How the C library warp a file descriptor (simplified):
- buffer: to reduce the times of read/write.
- capacity: maximal capacity of the buffer
- size: the current usage of the buffer
- buffering method: No buffering, line buffering, full buffering (only flush when the buffer is completely full)
Reading and Writing binary data:
size_t fread(void *restrict ptr, size_t size, size_t nitems, FILE *restrict stream);
- Reads
nitems
items, eachsize
bytes long a certain, fromstream
toptr
.
10. Network
- Octet
- A group of 8 bits.
- Port
- A 16-bit number.
- 0 to 1024: privileged ports. E.g., 80 for HTTP
Only processes with super-user access can listen to ports < 1024.
User Datagram Protocol (UDP) : Connectionless. Low latency is more important than reliability (receiving all of the data).
- data packets may be dropped and not recieved by the listener
- data packets may be duplicated
- data packets may arrive out of order
- do not build pipe
- Use case: game, streaming video.
Transmission Control Protocol (TCP) : Connection-based, requiring a handshake at the beginning. Reliability is more important than low latenct. Flow control and retransmission. Stream-based. (Simple) Error checking.
HTTP:
UDP over TCP:
bind associates an abstract network socket created with socket to an actual networking interface/address.
10.1. TCP Client
Client: (1) Writes user input to the server and (2) writes bytes from server to user.
int getaddrinfo(const char *hostname, const char *servname, const struct addrinfo *hints, struct addrinfo **res);
- Perform address resolution for a given host and service. Creates a linked-list of
addrinfo
structs and sets the given pointerres
to point to the first one.hostname
: Either a host name or a numeric host address string.NULL
for server bind (bind to any available network interface on the host).servname
: Either a decimal port name as a string or a service name.hints
: Astruct addrinfo
that indicate the type of socket to use.struct addrinfo { int ai_flags; /* input flags */ int ai_family; /* protocol family for socket */ int ai_socktype; /* socket type */ int ai_protocol; /* protocol for socket */ socklen_t ai_addrlen; /* length of socket-address */ struct sockaddr *ai_addr; /* socket-address for socket */ char *ai_canonname; /* canonical name for service location */ struct addrinfo *ai_next; /* pointer to next in list */ };
ai_family
: The protocol family that should be used.AF_INET
for IPv4,AF_INET6
for IPv6.ai_socktype
: Denotes the type of socket that is wanted.SOCK_STREAM
for TCP,SOCK_DGRAM
for UDP. When ai_socktype is zero the caller will accept any socket type.
- Return value: 0 if successfull and non-zero as error code. Call
gai_strerror()
on the error code to get the error information.
int socket(int domain, int type, int protocol);
#include <sys/socket.h>
. Create a network socket and returns a descriptor that can be used withread
andwrite
.int connect(int socket, const struct sockaddr *address, socklen_t address_len);
- Attempts the connection to the remote machine.
void freeaddrinfo(struct addrinfo *res);
- Frees the memory that was allocated for
res
.
Setting up a client:
#include <sys/types.h> #include <sys/socket.h> #include <netdb.h> #include <string.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> int main() { struct addrinfo hints, * result; memset(&hints, 0, sizeof(hints)); hints.ai_family = AF_INET; // for IPv4, and AF_INET6 fr IPv6 hints.ai_socktype = SOCK_STREAM; // TCP int s = getaddrinfo("illinois.edu", "80", &hints, &result); if (s != 0) { fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(s)); exit(1); } int sock_fd = socket(result->ai_family, result->ai_socktype, result->ai_protocol); if (sock_fd == -1) { perror("socket failed"); exit(1); } int connect_result = connect(sock_fd, result->ai_addr, result->ai_addrlen); if (connect_result == -1) { perror("connect failed"); exit(1); } printf("Connected\n"); // do something freeaddrinfo(result); return 0; }
- Reading
It’s possible that the data arrives in smaller chunks than the total amount requested. And the
read
function may be interrupted by signals or other events, resulting in a partial read. Use a while loop to keep reading.ssize_t read_all(int sockfd, void *buf, size_t count) { size_t total_read = 0; char *ptr = (char *)buf; while (total_read < count) { ssize_t read_bytes = read(sockfd, ptr + total_read, count - total_read); if (read_bytes == -1) { if (errno == EINTR) { continue; } else { return -1; } } else if (read_bytes == 0) { break; } total_read += read_bytes; } return total_read; }
- Writing
Symmetric. Here is a simple version.
ssize_t write_all(int sockfd, const void *buf, size_t count) { size_t total_written = 0; const char *ptr = (const char *)buf; while (total_written < count) { ssize_t written = write(sockfd, ptr + total_written, count - total_written); if (written == -1) { if (errno == EINTR) { continue; } else { return -1; } } total_written += written; } return total_written; }
10.2. TCP Server
int bind(int socket, const struct sockaddr *address, socklen_t address_len);
Associates (binds) an abstract socket with an actual network interface and port.
By default, when a server is closed, the port enters a “timed-wait” state and is released after some time. To be able to immediately reuse a port, specify
SO_REUSEPORT
beforebind
:int optval = 1; setsockopt(socket, SOL_SOCKET, SO_REUSEPORT, &optval, sizeof(optval)); bind(socket, ...);
int listen(int socket, int backlog);
- Start to receive data. Set the maximum number (
backlog
) of pending connections that can be queued up onsocket
. int accept(int socket, struct sockaddr *restrict address, socklen_t *restrict address_len);
- Accept a pending connection. Block until there is a customer for it to serve.
int shutdown(int socket, int how);
- Shutdown the connection associated with
socket
according tohow
.how
:SHUT_RD
for reading/receiving.SHUT_WR
for writing/sending.SHUT_RDWR
for reading and writing.
uint16_t ntohs(uint16_t netshort);
#include <arpa/inet.h>
Network to host short (= 16 bits). Convert representations from network byte order and host byte order.
- Setting up a server
#include <sys/types.h> #include <sys/socket.h> #include <netdb.h> #include <string.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> int main() { struct addrinfo hints, * result; memset(&hints, 0, sizeof(hints)); hints.ai_family = AF_INET; hints.ai_socktype = SOCK_STREAM; hints.ai_flags = AI_PASSIVE; int s = getaddrinfo(NULL, "1234", &hints, &result); if (s != 0) { fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(s)); exit(1); } int sock_fd = socket(AF_INET, SOCK_STREAM, 0); if (bind(sock_fd, result->ai_addr, result->ai_addrlen) != 0) { perror("bind()"); exit(1); } if (listen(sock_fd, 10) != 0) { perror("listen()"); exit(1); } struct sockaddr_in * result_addr = (struct sockaddr_in *) result->ai_addr; printf("Listening on file descriptor %d, port %d\n", sock_fd, noths(result_addr->sin_port)); while(1) { int client_fd = accept(sock_fd, NULL, NULL); if (client_fd < 0) { perror(NULL); exit(1); } char buffer[1024]; ssize_t len = read(client_fd, buffer, 999); if (len > 0) { buffer[len] = '\0'; } write(client_fd, "Hello\n", 6); shutdown(client_fd, SHUT_RDWR); close(client_fd); // that number can be reused } }
10.3. Non-Blocking / Asynchrous IO
The default IO is blocking, meaning that it will wait until the data is ready before the function returns.
In non-blocking mode:
read()
will read whatever bytes are available (despite the specified number of bytes to read) and return immediately. If you tried to read any number of bytes but there are no bytes ready, it would return-1
and seterrno
to eitherEAGAIN
orEWOULDBLOCK
.write()
will only send bytes it was able to send immediately and return the number of the bytes. Similarly, if you tried to write something but there are no bytes ready, the return value anderrno
would be the same asread()
.
Set a non-blocking flag on a file descriptor:
For a file:
int fd = open("file.txt", O_NONBLOCK);
For a socket:
int fd = socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK, 0);
Via
fcntl
(for those sockets that are not created by the above methods):int flags = fcntl(fd, F_GETFL, 0); fcntl(fd, F_SETFL, flags | O_NONBLOCK);
Ways to check that the IO has arrived:
10.3.1. Select
int select(int nfds, fd_set *restrict readfds, fd_set *restrict writefds, fd_set *restrict errorfds, struct timeval *restrict timeout);
Wait for any of the three sets of file descriptors to become ready. Only return the total number of ready file descriptors. Then, the caller need to iterate throught the file descriptors to see which ones are ready.
Advantages: cross-platform.
10.3.2. Epoll
#include <sys/epoll.h>
Tell exactly which descriptors are ready.
int epoll_create(int size);
- Create a new epoll instance (a file descriptor). The
size
argument is ignored but required to be a positive integer. Callclose()
on it the close it at the end. struct epoll_event
Then set a
epoll_event
with the file descriptor of interest:// fd is the file descriptor of interest struct epoll_event ev = {.events = EPOLLIN, .data.fd = fd};
events
member:EPOLLIN
for read,EPOLLOUT
for write.
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
- Control interface.
op
:EPOLL_CTL_ADD
: Add an entry (fd
andevent
) to the interest list ofepfd
. This operation can be called any times.EPOLL_CTL_DEL
: Remove an entry from the interest list.
int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);
Waits for up to
maxevents
events on (interest list of) the epoll instance referred to byepfd
. Thetimeout
argument specifies the number of milliseconds thatepoll_wait()
will block.Return the number of file descriptors ready for the request IO. The ready events will be store as an array starting from
events
.events[i]
points to the i-th event.Return
-1
on error.
- Level-triggered mode
epoll_wait()
returns whenever the underlying file descriptor has data available for reading or writing.If you do not read all available data from the file descriptor, the event will continue to trigger subsequent calls to
epoll_wait
.- Edge-triggered mode
epoll_wait
returns only when there is a change in the status of the file descriptor. This means that it returns when new data arrives (no data -> new data) or when space becomes available (unavailable -> available) for writing, but it does not return if there is still data available to read or space to write.If you do not read all available data or write all available data space, subsequent calls to epoll_wait will not return until there is a change in the status of the file descriptor.
To use the edge-triggered mode, add the flag
EPOLLET
to theevents
member using bitwise OR.
Usage: The following example is from man 7 epoll
. Assume we have a prepared TCP server socket int listen_sock
.
Create an epoll file descriptor
epollfd = epoll_create1(0); if (epollfd == -1) { perror("epoll_create1"); exit(EXIT_FAILURE); }
Create an event that monitors the server socket
listen_sock
in level-triggered mode.struct epoll_event ev; // This file object will be `read` from (connect is technically a read operation) ev.events = EPOLLIN; ev.data.fd = listen_sock; // Add the socket in with all the other fds. Everything is a file descriptor if (epoll_ctl(epollfd, EPOLL_CTL_ADD, listen_sock, &ev) == -1) { perror("epoll_ctl: listen_sock"); exit(EXIT_FAILURE); }
- In a loop:
Wait for any events specified in
epollfd
struct epoll_event events[MAX_EVENTS]; n = epoll_wait(epollfd, events, MAX_EVENTS, -1); if (n == -1) { perror("epoll_wait"); exit(EXIT_FAILURE); }
Iterate the first
nfds
elements ofevents
.If we get an event on the server socket (
events[i].data.fd == listen_sock
), that means that there is a new client, so we add an event for the client in edge-triggered mode toepollfd
.Otherwise, that means that the event is on some client socket, which has data ready to be read. In this case, we read from that socket.
for (int i = 0; i < n; i++) { if (events[i].data.fd == listen_sockfd) { // New incoming connection struct sockaddr_in addr; socklen_t addrlen = sizeof(addr); int fd = accept(listen_sockfd, (struct sockaddr *)&addr, &addrlen); if (fd == -1) { perror("accept"); exit(EXIT_FAILURE); } // Non-blocking mode int flags = fcntl(fd, F_GETFL, 0); fcntl(fd, F_SETFL, flags | O_NONBLOCK); // create some variable for this client fd (stores fd, etc.) connection_state_t *state = create_connection_state(fd); struct epoll_event ev; ev.events = EPOLLIN | EPOLLET | EPOLLONESHOT; ev.data.ptr = state; // pointer to the variable if (epoll_ctl(epollfd, EPOLL_CTL_ADD, fd, &ev) == -1) { perror("epoll_ctl: add"); close(fd); free(state); exit(EXIT_FAILURE); } } else { // Get back the variable connection_state_t *state = (connection_state_t *)events[i].data.ptr; // process state } }
11. File System
11.1. Pathing
char * realpath(const char *restrict file_name, char *restrict resolved_name);
#include <stdlib.h>
. Converts the inputfile_name
to the corresponding absolute canonical path if that path exists.If
resolved_name
isNULL
, memory will be dynamically allocated and the address will be returned; otherwise, the path will be stored atresolved_name
.
11.2. TODO File Contents
- Disk block
- A portion of the disk that is reserved for storing the contents a file or a directory. Every disk block has the same size, say \(2 ^{B}\) bytes.
- inode
- Represents a file or directory, containing the metadata about the file and pointers to disk block that hold the file contents.
- Superblock
- Contains metadata about the inodes and disk blocks.
11.3. Inode
An inode has
- Direct block
- A disk block that directly store the file content. A direct block can store \(2 ^{B}\) bytes of the file content.
- Indirect block
- A disk block that holds pointers to direct disk blocks rather the file content. Suppose a pointer consumes \(P\) byptes (\(4P\) bits), then an indirect block holds \(\frac{2 ^{B}}{P}\) pointers and indirectly points to \(\frac{2 ^{B}}{P} \times 2 ^{B}\) bytes of the file content.
- Double/triple/k-th indirect block
- A \(k\)-th indirect block is a disk block that holds pointers to \((k-1)\)-th indirect disk blocks. It indirectly points to \[ \underbrace{\frac{2 ^{B}}{P}}_{\text{number of pointers}} \times \underbrace{\left( \frac{2 ^{B}}{P} \right) ^{k-1} \times 2 ^{B}}_{\text{bytes a (k-1)-th indirect block points to}} \] byptes of file content.
11.4. Directory
- Directory (folder)
- A special file whose contents are inode numbers and names of its entries. Some special bits are set in its inode to identity that this is a directory.
- Directory entry (dirent)
- A entry under a directory. It is also a file, other than
ls -i
- List the inode numbers and the entries in the current directory.
opendir()
- Open a directory.
readdir()
d_name
:d_ino
:
- (no term)
closedir()
11.5. Meta File Information
stat
st_mode
: Read/Write/Execute. File/directory type.
fstat
lstat
11.6. TODO Linking
- Hard link,
ln source_file target_file
- A hard link is simply an entry in a directory that points to an inode that represents an existing file.
- Changes made to the any hard link are reflected in all hard links, as they all point to the same data blocks.
- Removing the original file does not affect hard links, as they maintain a reference to the underlying data until all links are removed.
- To check the number of hard links to a file, you can also use ls -l and observe the second field. The number of hard links is displayed as the second field of the output.
- Symbolic link,
ln -s source_file target_file
- A symbolic link, also called a soft link, is a file with a special bit set and stores a path to another file.
- Can read the content of the source file by reading the symbolic link.
- Read the path the symbolic link contains, use
readlink
. - Deleting or moving the original file/directory does not affect symlinks, but accessing a broken symlink (pointing to a non-existent target) results in an error.
- To check if a file is a symbolic link, you can use
ls -l
. Symbolic links will be identified by an l at the beginning of the permissions field.
12. Scheduling
- CPU Scheduling
- CPU Scheduling involves efficiently selecting processes to run on CPU cores. In busy systems, there are more ready-to-run processes than CPU cores, requiring the system kernel to evaluate which processes to schedule for execution and which to defer.
- Pre-emption
the act of temporarily halting the execution of a currently running process to allow the CPU to immediately start executing another process with higher priority.
Preemption ensures that processes with higher urgency or shorter execution times can be promptly attended to, preventing situations like starvation and improving system responsiveness.
12.1. Measures of Efficiency
- Arrival time
- is the time the process enters the scheduler (CPU may start working on it)
- Start time
- is the wall-clock start time of the process (CPU starts working on it)
- End time
- is the end wall-clock of the process (CPU finishes the process)
- Run time
- is the total amount of CPU time required
- Response Time
- start time - arrival time. Latency from arrival to CPU start.
- Turnaround Time
- end time - arrival time. Total time from process arrival to completion.
- Wait Time
- Turnaround time - run time. Total time a process spends on the ready queue.
- Convoy effect
- when a CPU-intensive task monopolizes the CPU, delaying I/O-bound processes and degrading overall system performance.
12.2. Scheduling Algorithms
- Shortest Job First (SJF)
- Selects the shortest total CPU time process first, minimizing wait and response times. Requires knowledge of process runtimes.
- Pre-emptive SJF (PSJF)
- Similar to SJF but allows pre-emption. Ensures shorter jobs are prioritized but requires frequent context switching.
- First Come First Served (FCFS)
- Processes executed in arrival order. Simple but susceptible to the Convoy Effect.
- Round Robin (RR)
Processes executed in order with a fixed time quantum, ensuring fairness but potentially leading to frequent context switching.
- Priority Scheduling
- Processes executed based on priority value, allowing for the prioritization of critical tasks.
13. Error
errno
#include <errno.h>
. A global variable that is set by system calls and library functions in the event of an error. It holds an error code (integer) that provides information about the type of error.perror()
- Prints a descriptive error message corresponding to the current value of
errno
. It takes a string argument (custom message) and prints it followed by the (built-in) error message. stderror()
- Provides a string representation of an error code. It takes an integer argument representing an error code and returns a pointer to a string describing the error.
14. Makefile
- Compile:
make
. - Clean up the assignment directory:
make clean
- Compile a debugable version that can use gdb on (e.g., show source code)
- Compile a release version
15. Regrade
malloc_pt1: 66.67 -> 100.00
shell_pt1: 81.48 -> 92.59
- Quizzes: 15 / 15
- MP: 43.40 / 45
- Lab: 16.60 / 17
- Final: ? / 21
- 65% -> 90.45
- 75% -> 92.6
- 85% -> 94.7