% lec 4: data structures # administrivia - hw1 out - start early - can use up to 2 late days on it - you really don't want to - today: the last lecture on C & CSE 351 review - next lecture: system calls - unix paper questions: try your best - upload your answers early - [hands-on](http://web.mit.edu/6.033/www/assignments/handson-unix.html) # today's plan - vectors - singly linked lists # why data structures in C control: performance, resource low-level & error-prone (cache, concurrency) # vector v0 ```c typedef int T; struct vector { size_t size; T *data; }; struct vector *vector_alloc(size_t n, T initial_val); void vector_free(struct vector *s); ``` example use: ```c int main(void) { size_t i, n = 10; struct vector *s = vector_alloc(n, 42); /* ... */ vector_free(s); } ``` # vector v0: problems? ```c struct vector { size_t size; T *data; }; ``` ```c struct vector *vector_alloc(size_t n, T initial_val) { struct vector *s = malloc(sizeof(struct vector)); s->size = n; s->data = malloc(n * sizeof(T)); for (size_t i = 0; i < n; ++i) s->data[i] = initial_val; return s; } ``` ```c void vector_free(struct vector *s) { free(s); free(s->data); } ``` # problems - double free (wrong order) - `malloc` may fail - allocation size may overflow see [vector_0.c](l04/vector_0.c) # performance? ``` {.ditaa #l04/vec0} +----------+ +-----+ --->| size: 10 | +-->| 42 | +----------+ | +-----+ | data |--+ | ... | +----------+ +-----+ | 42 | +-----+ ``` . . . ``` {.ditaa #l04/vec1} +-------------+ --->| size: 10 | +-------------+ | data[0]: 42 | +-------------+ | ... | +-------------+ | data[9]: 42 | +-------------+ ``` # vector v1: flexible array member ```c struct vector { size_t size; T data[]; }; ``` . . . ```c struct vector *vector_alloc(size_t n, T initial_val) { struct vector *s = malloc(sizeof(struct vector) + n * sizeof(T)); s->size = n; for (size_t i = 0; i < n; ++i) s->data[i] = initial_val; return s; } ``` ```c void vector_free(struct vector *s) { free(s); } ``` # vector v1: one malloc - out of memory - malloc size overflow see [vector_1.c](l04/vector_1.c) # flexible array member ```c struct vector_0 { size_t size; T *data; }; ``` ```c struct vector_1 { size_t size; T data[]; }; ``` `sizeof(struct vector_0)` vs `sizeof(struct vector_1)`? try `printf` the results # sum up - be careful with allocation - out of memory - size overflow - flexible array member - useful for reducing mallocs/footprint/dereferences - old fashion: zero- or one-length array # bug example ```c struct snap_context { ... size_t num_snaps; uint64_t snaps[]; }; struct snap_context *ctx; size_t num = ...; if (num > SIZE_MAX / sizeof(uint64_t) - sizeof(*ctx)) goto fail; ctx = malloc(sizeof(*ctx) + num * sizeof(uint64_t)); ``` see [bug & fix](https://github.com/torvalds/linux/commit/80834312a4da1405a9bc788313c67643de6fcb4c) # singly linked list ```c typedef int T; struct slist { T elem; struct slist *next; }; struct slist *slist_push(struct slist *lst, T e); ``` ``` {.ditaa #l04/slist} +----------+ +----------+ +----------+ ---> | elem: 12 | +-->| elem: 34 | +-->| elem: 56 | +----------+ | +----------+ | +----------+ | next |--+ | next |--+ | next: -- | +----------+ +----------+ +----------+ ``` # slist_push ```c struct slist *slist_push(struct slist *lst, T e) { struct slist *n = malloc(sizeof(struct slist)); /* TODO: check if n is NULL */ n->elem = e; n->next = lst; return n; } ``` ```c int main(void) { struct slist *lst = NULL; lst = slist_push(lst, 12); lst = slist_push(lst, 34); lst = slist_push(lst, 56); } ``` # more data structures Example: Linux kernel - doubly linked list: [list.h](https://github.com/torvalds/linux/blob/master/include/linux/list.h) - red-black tree: [rbtree.h](https://github.com/torvalds/linux/blob/master/include/linux/rbtree.h) - radix tree: [radix-tree.h](https://github.com/torvalds/linux/blob/master/include/linux/radix-tree.h) - hash table: [hashtable.h](https://github.com/torvalds/linux/blob/master/include/linux/hashtable.h) - lock-free slist: [llist.h](https://github.com/torvalds/linux/blob/master/include/linux/llist.h) - ... # example: linux kernel's list ```c struct person { char name[32]; long ident; struct list_head list; }; ``` ```c int main(void) { struct person alice = {"alice", 42}, bob = {"bob", 24}, *p; LIST_HEAD(lst); list_add(&alice.list, &lst); list_add(&bob.list, &lst); list_for_each_entry(p, &lst, list) { printf("%s:\t%ld\n", p->name, p->ident); } } ``` see [list.h](l04/list.h) and [test_list.c](l04/test_list.c) # C final details - macros - `.h` files: headers, to be `#include`'d - declarations, inline functions - avoid double inclusion: [include guard](http://en.wikipedia.org/wiki/Include_guard) - `.c` files: definitions - build process: preprocessing - AST - IR - `.o` files - executable review CSE 351 / tomorrow's sections # self-exercise 1 complete the singly linked list: - add a function that returns the number of elements in a list - implement a program that builds a list of lists - i.e., it builds a linked list - but each element in the list is a (different) linked list - design and implement `slist_pop` - removes an element from the head of the list - make sure your linked list code, and customers' code that uses it, contains no memory leaks [solution](//courses.cs.washington.edu/courses/cse333/14wi/lectures/lec05_exercises/exercise1/) # self-exercise 2 implement and test a [binary search tree](http://en.wikipedia.org/wiki/Binary_search_tree) - don't worry about making it balanced - implement key `insert()` and `lookup()` functions - implement a key `delete()` function - implement it as `bst.c` and `bst.h` - implement `test_bst.c`: `main()` to test out your BST [solution](//courses.cs.washington.edu/courses/cse333/14wi/lectures/lec05_exercises/exercise2/)