Synopsis¶
In this assignment, you will implement a concurrent job scheduler in the Go programming language. This extra credit assignment requires a substantial amount of independent learning, and is largely meant to teach you how to learn new programming skills on your own.
In the class thus far, we’ve already learned three separate languages (bash, C, and C++). By completing this assignment, you’ll realize how far you’ve come in your ability to pick up new programming skills on the fly and apply them in a foreign code base.
Go¶
Go is a modern programming language originally created by Google. Its inventors sought to simplify a lot of the complexity commonly found in other programming languages (i.e. less is more).
Go is a fantastic language, and it is built with concurrency in mind - it even reserved the go keyword to create goroutines (or lightweight threads).
Prerequisites¶
Before starting this assignment, you should explore the Tour of Go, which highlights a variety of features in the language. It only takes ~30 minutes to complete. The go command is already installed on calgary, so you don’t need to worry about any of the installation instructions.
We will be using Go 1.20, which is one of the more recent releases.
Functionality¶
In this assignment, you will build a concurrent job scheduler. The job scheduler will read a sequence of “jobs” and execute each job concurrently as long as all of the other jobs it depends on have concluded. Executing a job simply requires writing its name to stdout, where each job’s name is separated by a newline (\n). This is best described with a few examples (all of which are also included in the project’s README.md).
1. Job Scheduling (3 points)¶
The following configuration file (written in YAML) defines a single job with its name set to echo.
jobs:
- echo
If the job scheduler is run with this configuration, it should write (with a \n at the end):
echo
A configuration that defines multiple jobs is written across multiple lines like so:
jobs:
- ls
- cd
- echo
With this configuration, one valid output is:
ls
cd
echo
Given that the job scheduler runs jobs concurrently, it is equally valid to see these jobs written in a different order, such as:
cd
ls
echo
Your task is to start implementing the scheduler.Run function to handle this case. All configuration parsing is already handled for you, so you can simply act upon the configured jobs found in the Scheduler. You can experiment with this command by running the run sub-command like so:
./scheduler run ./path/to/your/schedule.yaml
You’re expected to create your own job configuration files, but a variety of examples are provided in the ./internal/testdata directory.
Your solution must run each job concurrently. Purely sequential solutions will not receive credit.
Note
We are not asking you to run these jobs as shell commands. It is equally valid to configure arbitrary names like one, two, and three.
2. Handling Dependencies (10 points)¶
As mentioned above, jobs may define dependencies on one another. Any given job MUST NOT be executed until all of its dependencies have been executed. For example, consider the following:
jobs:
- ls
- cd
- name: echo
  dependsOn: ls
In this case, echo should never be executed before ls has finished. In other words, ls should always be printed above echo. It should be impossible to produce the following output:
cd
echo
ls
Handling this case is the primary goal of the assignment. Several hints are labeled in the starter code to help get you started. Note that you will be graded on how well-structured and computationally efficient your solution is. You will not receive full credit if you unnecessarily loop over the job list more times than needed.
Note
If any job requires dependencies, you will need to specify the name key, too. Jobs that don’t have dependencies can omit the name key entirely (e.g. ls and cd in the example above).
3. Generating a DOT graph (5 points)¶
The job scheduler configuration can be represented as a directed acyclic graph (DAG). Jobs without any dependencies are the leaf nodes (i.e. they have no outgoing edges), and jobs with dependencies have outgoing edges that point to their dependencies’ nodes.
Your task is to implement the graph sub-command and write out a DOT graph that can be visualized.
Your program should generate the graph with the graph sub-command like so:
./scheduler graph ./path/to/your/schedule.yaml
An example is outlined in the starter code’s README.md, as well as marked with TODOs in the code. If you implement this feature, you will also need to submit a dependencies.png file that encodes the graph represented by internal/testdata/config/dependencies.yaml.
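To make the node and edge description above concrete, here is a minimal sketch that renders a job list as DOT text, with one node per job and one edge from each job to each of its dependencies. The job struct, the dotGraph function, and the graph name "jobs" are assumptions for illustration; the exact output expected by the assignment is the one shown in the starter code's README.md. The sketch also assumes job names are simple strings, since Go's %q quoting only matches DOT's quoting rules for ordinary names.

```go
package main

import (
	"fmt"
	"strings"
)

// job is a hypothetical stand-in for the starter code's job type.
type job struct {
	name      string
	dependsOn []string
}

// dotGraph returns a DOT digraph in which every job is a node and every
// dependency is an edge pointing from the job to the job it depends on.
func dotGraph(jobs []job) string {
	var b strings.Builder
	b.WriteString("digraph jobs {\n")
	for _, j := range jobs {
		// Declare the node explicitly so jobs without edges still appear.
		fmt.Fprintf(&b, "  %q;\n", j.name)
		for _, dep := range j.dependsOn {
			fmt.Fprintf(&b, "  %q -> %q;\n", j.name, dep)
		}
	}
	b.WriteString("}\n")
	return b.String()
}

func main() {
	fmt.Print(dotGraph([]job{
		{name: "ls"},
		{name: "cd"},
		{name: "echo", dependsOn: []string{"ls"}},
	}))
}
```

If Graphviz is available, a DOT file saved as graph.dot can be rendered with a command such as dot -Tpng graph.dot -o dependencies.png.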
4. Testing (3 points)¶
A variety of test files are provided in the ./internal/testdata directory, most of which are tested in internal/config/config_test.go.
You are expected to add more tests in both internal/config/config_test.go and internal/scheduler/scheduler_test.go to test that your program works as expected. This task is intentionally open-ended; it’s up to you how you test your code, but you will be evaluated on the efficacy of your tests.
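Because concurrent output can legitimately appear in several orders, comparing against a single expected string tends to produce flaky tests. One option, sketched here, is a helper that checks only the property that matters: every job appears after all of its dependencies. The checkOrder function and the map-based representation of dependencies are hypothetical and would need adapting to the starter code's types before use in a _test.go file.

```go
package main

import "fmt"

// checkOrder is a hypothetical test helper. It returns an error if a job
// with dependencies is missing from output, or if it appears before any of
// its dependencies. deps maps a job name to the names it depends on.
func checkOrder(output []string, deps map[string][]string) error {
	// Record the position at which each job name was written.
	pos := make(map[string]int, len(output))
	for i, name := range output {
		pos[name] = i
	}
	for name, ds := range deps {
		i, ok := pos[name]
		if !ok {
			return fmt.Errorf("job %q was never written", name)
		}
		for _, d := range ds {
			k, ok := pos[d]
			if !ok {
				return fmt.Errorf("dependency %q of %q was never written", d, name)
			}
			if k > i {
				return fmt.Errorf("job %q was written before its dependency %q", name, d)
			}
		}
	}
	return nil
}

func main() {
	deps := map[string][]string{"echo": {"ls"}}
	fmt.Println(checkOrder([]string{"cd", "ls", "echo"}, deps)) // a valid order
	fmt.Println(checkOrder([]string{"cd", "echo", "ls"}, deps)) // an invalid order
}
```

A test could run the scheduler many times against the same configuration and apply such a check to each run, which makes an ordering bug more likely to surface.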
5. Add the delay configuration (2 points)¶
Now that you have things working and can visualize your results, we will see what it takes to extend the definition of a job. Your task is to add a configurable delay, which is expressed in seconds. For example, consider the following job configuration:
jobs:
- ls
- name: cd
  delay: 2
- name: echo
  dependsOn: ls
  delay: 3
With this, both ls and cd can be scheduled concurrently, but the cd job should wait for 2 seconds before it writes its name. Similarly, the echo job should wait for an additional 3 seconds after its dependency, the ls job, concludes. This is to emulate a more realistic use case, where each job does not finish right away.
A TODO is left in the code to help get you started.
6. Add the --debug flag (2 points)¶
With the configurable delay, there’s a potentially significant amount of time between a job being initially scheduled and a job completing. It would be helpful to write a debug message to stderr as soon as a job is scheduled (e.g. ‘Scheduling job “ls”’).
Add support for a --debug flag that can be used to enable these debug messages. These messages should not be written by default (i.e. when --debug is not specified). When enabled, the messages should always be written to stderr.
Specifying the --debug flag should look like the following:
./scheduler run ./schedule.yaml --debug
Again, a TODO is included to help get you started.
Implementation Hints¶
Take your time and familiarize yourself with the entire codebase before you write any code. It’s a lot easier to debug issues when you have a general idea of how each component fits together.
Play around with the scheduler command and see how it behaves in its current state. Not everything will work right away, but it will successfully execute and report errors, where applicable.
You’re encouraged to use external resources to help learn Go. If you end up using or referencing any code snippets, please cite them in your implementation with a comment.
The implementation includes a variety of hints marked as TODO comments, such as the following:
// Hint: You will need to use the 'go' keyword, which runs the
// job in a goroutine (which is a lightweight thread).
//
// For details, see https://go.dev/tour/concurrency
These TODOs will shed light on how to implement the behavior defined above.