Scheduling
Overview
- problem: how to share the CPU?
- more generally: multiplex tasks onto limited resources
- cloud service, limited machines, lots of requests
- supermarket, limited cashiers, lots of customers
- terms/metrics
- task
- user requests, e.g., mouse click, web request, shell command
- a thread can perform multiple tasks (read data, encrypt it, write to file)
- latency/response time
- user-perceived time to finish a task (including time waiting)
- throughput
- the rate of task completion (# of tasks done per period of time)
- scheduling overhead
- time it takes to perform scheduling (run scheduling policy + context switch)
- fairness
- are tasks given similar amounts of CPU time, and do they wait for similar amounts of time?
- predictability
- how consistent is the latency
- starvation
- lack of progress for one task because higher-priority tasks keep running ahead of it
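To make the latency and throughput definitions concrete, here is a minimal sketch. The function names and the (arrival, completion) pair format are assumptions made for illustration; they are not part of any particular scheduler.

```python
def average_latency(tasks):
    """tasks: list of (arrival_time, completion_time) pairs.
    Latency includes time spent waiting, so it is completion minus arrival."""
    return sum(done - arrived for arrived, done in tasks) / len(tasks)


def throughput(tasks):
    """Tasks completed per unit of time, measured from the first arrival
    to the last completion."""
    start = min(arrived for arrived, _ in tasks)
    end = max(done for _, done in tasks)
    return len(tasks) / (end - start)


# Hypothetical workload: three tasks on one CPU.
tasks = [(0, 4), (1, 6), (2, 10)]
print(average_latency(tasks))  # (4 + 5 + 8) / 3
print(throughput(tasks))       # 3 tasks over 10 time units
```

Two schedules can have the same throughput and very different average latency, which is why the notes track both.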
Scheduling Policies
- first in first out (FIFO)
- run each task to completion, in the order they come in
- pros/cons?
- average latency: best case? worst case?
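One way to explore the best-case/worst-case question for FIFO is to run the same tasks in two different arrival orders. This is a sketch under simplifying assumptions: all tasks arrive at time 0, there is one CPU, and context switches are free. The function name is made up for this example.

```python
def fifo_average_latency(lengths):
    """Run tasks to completion in the given order; all arrive at time 0."""
    clock = 0
    total_latency = 0
    for length in lengths:
        clock += length          # the task finishes after everything ahead of it
        total_latency += clock   # latency = completion time, since arrival is 0
    return total_latency / len(lengths)


short_first = fifo_average_latency([1, 2, 10])  # completions at 1, 3, 13
long_first = fifo_average_latency([10, 2, 1])   # completions at 10, 12, 13
print(short_first, long_first)
```

The total work is identical in both runs, but when the long task happens to arrive first, every short task waits behind it and the average latency roughly doubles in this example.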
- shortest job first (SJF)
- schedule the shortest job first
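- if a new task arrives that is shorter than the current task's remaining time, preempt the current task and switch to the new one (preemptive SJF, also called shortest remaining time first)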
- pros/cons?
- average latency?
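The preemptive SJF rule above can be sketched as a small simulation. It assumes task lengths are known in advance (a real scheduler usually has to estimate them), one CPU, and free context switches. The function name and the (arrival, length) tuple format are illustrative choices.

```python
import heapq


def sjf_preemptive(tasks):
    """tasks: list of (arrival, length). Returns {task_index: completion_time}."""
    order = sorted(range(len(tasks)), key=lambda i: tasks[i][0])
    ready = []            # min-heap of (remaining_time, task_index)
    completion = {}
    clock = 0
    next_idx = 0
    while next_idx < len(order) or ready:
        if not ready:
            # CPU is idle: jump ahead to the next arrival.
            clock = max(clock, tasks[order[next_idx]][0])
        # Admit every task that has arrived by now.
        while next_idx < len(order) and tasks[order[next_idx]][0] <= clock:
            i = order[next_idx]
            heapq.heappush(ready, (tasks[i][1], i))
            next_idx += 1
        remaining, i = heapq.heappop(ready)   # shortest remaining time first
        # Run until the task finishes or the next arrival forces a re-check.
        next_arrival = tasks[order[next_idx]][0] if next_idx < len(order) else float("inf")
        run = min(remaining, next_arrival - clock)
        clock += run
        if run == remaining:
            completion[i] = clock
        else:
            heapq.heappush(ready, (remaining - run, i))
    return completion


# A long task arrives first, then two short ones.
print(sjf_preemptive([(0, 8), (1, 2), (2, 1)]))
```

In this workload the two short tasks finish at times 3 and 4 and the long task finishes at 11, for an average latency of 5. Under FIFO the same tasks would finish at 8, 10, and 11, for an average latency of about 8.67. The cost is that the long task is delayed whenever shorter work keeps arriving, which is where starvation comes from.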
- round robin (RR)
- FIFO, but each task runs for at most a fixed amount of time (time slice/quantum) before going to the back of the queue
- how to pick the quantum?
- pros/cons?
- average latency?
- seems fair but not all tasks need CPU equally
- some tasks use little CPU (< quantum) and then block waiting for events (I/O bound)
- other tasks use more CPU and take up the full quantum (CPU bound)
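A minimal round-robin sketch helps with the quantum and average-latency questions. It assumes CPU-only tasks that all arrive at time 0 and does not model context-switch cost, which matters in practice because a smaller quantum means more switches. The function name is an assumption for this example.

```python
from collections import deque


def round_robin(lengths, quantum):
    """Return the completion time of each task, indexed like `lengths`."""
    queue = deque(enumerate(lengths))   # entries are (task_index, remaining_time)
    completion = [0] * len(lengths)
    clock = 0
    while queue:
        i, remaining = queue.popleft()
        run = min(quantum, remaining)
        clock += run
        if remaining > run:
            queue.append((i, remaining - run))  # not done: back of the queue
        else:
            completion[i] = clock
    return completion


print(round_robin([3, 5, 2], quantum=2))  # mixed lengths
print(round_robin([5, 5, 5], quantum=1))  # equal lengths, small quantum
print(round_robin([5, 5, 5], quantum=5))  # equal lengths, quantum >= task length
```

With equal-length tasks and a small quantum, every task finishes near the very end (times 13, 14, 15 here), whereas a quantum at least as long as the tasks degenerates to FIFO (times 5, 10, 15). So RR can hurt average latency for similar-sized tasks even though it shares the CPU evenly.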
- multilevel feedback queue (MLFQ)
- multiple levels of RR queues, with the quantum increasing at each lower level
- queues with a shorter quantum have higher scheduling priority
- scheduler runs tasks within a queue in RR fashion starting from the top queue
- when the current queue is empty, scheduler moves down to the next queue
- when a higher-priority queue is populated with new tasks, preempt the current task and schedule tasks from the higher-priority queue
- tasks from lower queues are periodically moved up to the top priority queue to avoid starvation
- improves latency for I/O-bound tasks and is fair to tasks that use less CPU
- a task starts at the top queue (shortest quantum, highest scheduling priority)
- if a task uses up the quantum, it moves down a queue (longer quantum), as it needs more CPU
- otherwise it stays in the same queue or moves up a queue
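The demotion rule can be sketched with a deliberately simplified simulation. It only models CPU-only tasks that all arrive at time 0, so it leaves out three parts of the policy described above: preemption when a new task arrives, the periodic boost back to the top queue, and tasks that block for I/O before their quantum expires. The function name and the default quanta (1, 2, 4) are assumptions picked for illustration.

```python
from collections import deque


def mlfq(lengths, quanta=(1, 2, 4)):
    """Return (completion_times, trace); trace entries are (task, level, run_time)."""
    queues = [deque() for _ in quanta]
    for i, length in enumerate(lengths):
        queues[0].append((i, length))   # every task starts in the top queue
    completion = [0] * len(lengths)
    trace = []
    clock = 0
    while any(queues):
        # Pick the highest-priority (lowest-numbered) non-empty queue.
        level = next(l for l, q in enumerate(queues) if q)
        i, remaining = queues[level].popleft()
        run = min(quanta[level], remaining)
        clock += run
        trace.append((i, level, run))
        if remaining == run:
            completion[i] = clock
        else:
            # Used the whole quantum without finishing: move down one level
            # (or stay in the bottom queue if already there).
            lower = min(level + 1, len(quanta) - 1)
            queues[lower].append((i, remaining - run))
    return completion, trace


completion, trace = mlfq([1, 7])
print(completion)
print(trace)
```

In this run the short task finishes at time 1 without leaving the top queue, while the long task is demoted after each full quantum and runs with quanta of 1, 2, and then 4 before finishing at time 8. A task that blocked for I/O before its quantum expired would instead be re-queued at the same level or a higher one, which is how I/O-bound tasks keep their low latency.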