CSE 163, Winter 2021: Final Project: Part 2

Overview

Due Tuesday, March 16 at 21:00 (PDT) on Gradescope.

Implement your analysis, process your data, and interpret the results. Then, complete your report to include the results and conclusions of your analysis. Plots and other visual representations of data are very useful in conveying your conclusions.

Report

Submit your report in PDF format. Your report will probably be about 4-6 pages of text, but there are no fixed upper or lower bounds on its size. You should write at an appropriate length: neither so briefly that you omit information, nor so verbosely that you pad your report or bury the important information under irrelevant details. Visualizations might make your report longer, which is completely fine!

At this time you may also go back and improve any of the previous sections you have written.

In your report, please annotate any visualizations with the method used to produce them. Visualizations should add to your report's narrative and should be explained in your analysis. Plots should be included in your report, but you should also submit the plot images produced by your code.

Your report should contain at least the following parts. You should label your sections. You are definitely permitted to write additional sections as well.

  1. Title and author(s).
  2. Summary of research questions AND RESULTS. Repeat your research questions in a numbered list.

    After each research question, clearly state the answer you determined. Don't give details or justifications yet — just a brief summary of the answer.

  3. Motivation and background.

    Same information as Part 1 unless otherwise indicated by feedback

  4. Dataset.

    Same information as Part 1 unless otherwise indicated by feedback

  5. Methodology (algorithm or analysis). It is likely that you will come back and refine this section after implementing your project.
  6. Results. Present and discuss your research results. Treat each of your research questions separately. Focus in particular on the results that are most interesting, surprising, or important. Discuss the consequences or implications.

    Interpret the results: if the answers are unexpected, then see whether you can find an explanation for them, such as an external factor that your analysis did not account for. A good report not only presents the results, but gives an interpretation of them to the reader.

    Include some visualization of your results (a graph, plot, bar chart, etc.). These plots should be created programmatically in the code you submit. If you have to create plots by hand using a program like Excel, you must provide a good reason why it was not possible to create the plot you wanted using Python.

  7. Challenge Goals. In this section, you should outline which of the challenge goals from Part 0 you think your project completes and why you think so. Be specific in stating which challenge goals your project meets explicitly.

    It is acceptable for you to scale back, or to expand, the scope of your project if necessary. It's better to do a great job on a subset of your original proposal than to do a bad job on a larger project. If you have to scale back, then explain why the task was more difficult than you estimated when you wrote your proposal. This will help you to make a better estimate for your next project. It will also convince the course staff that you have done an acceptable amount of work for CSE 163. If changes to your project caused you to meet different challenge goals than you originally proposed, that is also okay. However, keep in mind that your mentor gave you feedback on your project in the context of your original goals, so you should double-check that your changed project actually meets your new goals.

  8. Work Plan Evaluation. Include your work plan from Part 1 (all parts) and evaluate it. Specifically, how accurate were your work plan estimates from Part 1? Why were your estimates good or bad?
  9. Testing. You should make some attempt at testing your code to increase your certainty that your analysis is correct. In your report, describe how you tested your code. Did you use asserts? Smaller data files? Be sure to submit your tests and any testing files when you submit your code. You must also have artifacts or evidence from your testing; you cannot just say you used print statements and then remove them all from your code before you turn it in. Make sure you tell us why we should trust your results!
  10. Collaboration. State which students or other people (besides the course staff and your group mates, if any) helped you with the assignment, or that no one did. Did you use online resources to help you? If so, what were they?
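
As a rough illustration of the Testing item above, here is a minimal sketch of an assert-based test that runs an analysis function on a small data file whose answer can be worked out by hand. The function count_by_category, the column names, and the file contents are all hypothetical and only stand in for your own code and data. The sketch writes its small file to a temporary directory so that it is self-contained; in your project you would more likely keep the small file next to your tests and submit it with them.

```python
"""
A hypothetical test module. Every name and data value here is an
illustrative assumption, not part of the assignment.
"""
import csv
import os
import tempfile


def count_by_category(path):
    """
    Hypothetical analysis function under test. Returns a dict mapping
    each value in the 'category' column of the CSV file at path to the
    number of rows that have that value.
    """
    counts = {}
    with open(path, newline='') as f:
        for row in csv.DictReader(f):
            counts[row['category']] = counts.get(row['category'], 0) + 1
    return counts


def test_count_by_category():
    """
    Runs count_by_category on a three-row file whose expected result
    was worked out by hand.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'small.csv')
        with open(path, 'w', newline='') as f:
            f.write('category,value\nx,1\ny,2\nx,3\n')
        assert count_by_category(path) == {'x': 2, 'y': 1}


def main():
    test_count_by_category()
    print('All tests passed')


if __name__ == '__main__':
    main()
```

Because the test stays in a file of its own, it remains as evidence of your testing when you turn in your code, unlike print statements that you delete before submitting.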

Code

Your code should follow the following requirements.

  • Your project must be a Python script (.py files). You are more than welcome to experiment and/or develop in a Jupyter Notebook, but your end result must be a runnable Python script that outputs all your results. Your project should use the main-method pattern for modules that can be run. If you rely on a library that requires a Jupyter Notebook, you may use a notebook if you 1) check in with the staff and 2) turn in two .py files along with your notebook that contain your code (you still need to give explicit instructions in your README.md file on exactly how to reproduce your results).
  • Your code must pass flake8.
  • Your source code should be well-written and well-commented. Your code should use good naming conventions, be broken up into meaningful functions, and avoid unnecessary computations. Each module, function, and class you write should have a docstring describing it. Make sure that your code follows the guidelines from the Code Quality Guide. Your source code documentation should assume that the programmer has already read your report, so you do not need to repeat any of those details.
  • Your code will need to be broken up into at least two Python modules (files). You can decide how to break up your code, but it helps to separate them by the part of the project they concern. A very common split is to have one module that does all the data processing (loading in data, cleaning it, etc.) while the other module is the actual runnable program that does the analysis. You can break up your code however you see fit, but you must have at least two Python modules that you submit. Your testing programs (which you should write) do not count as one of the two required modules.
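
To make the main-method pattern and the docstring requirement above concrete, here is a minimal sketch of what a runnable analysis module might look like. Everything in it is an assumption made for illustration: the function names, the 'score' column, and the inline sample data are hypothetical. To keep the sketch self-contained, load_rows is defined in the same file; in a real submission it would live in your separate data processing module and be imported here.

```python
"""
A hypothetical runnable analysis module. It prints the mean of a
'score' column. All names and data are illustrative assumptions.
"""
import csv
import io


def load_rows(source):
    """
    Returns the rows of the CSV file object source as a list of dicts.
    In a real project this function would belong in the separate data
    processing module.
    """
    return list(csv.DictReader(source))


def mean_score(rows):
    """
    Returns the mean of the 'score' column of rows as a float, or None
    if rows is empty.
    """
    if not rows:
        return None
    return sum(float(row['score']) for row in rows) / len(rows)


def main():
    # Inline sample data stands in for opening a real data file.
    sample = io.StringIO('name,score\na,1.0\nb,3.0\n')
    rows = load_rows(sample)
    print('Mean score:', mean_score(rows))


if __name__ == '__main__':
    main()
```

The check at the bottom means main only runs when the file is executed directly, so a testing program can import mean_score without triggering the whole analysis.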

Just for reference, most projects that adequately meet two challenge goals will be at least 120 lines of Python code long. This is not a hard requirement, and we will not count lines, but this is a very good heuristic to tell if your project has enough depth.

Instructions

Along with your code, you should submit a file named README.md that contains instructions on how to run your project. The .md file type is a Markdown file which is like a plain .txt but allows some special formatting that many websites render into a nice display. You can see what an example README.md looks like here. You can view the file that creates this page here.

You should turn in a Markdown file named README.md that looks like the file in the second link of the last paragraph. You do not need to use any special Markdown formatting in your document if you don't want to. However, this is a very common file format so we encourage you to try out those features! There are lots of online editors for Markdown to let you preview your Markdown document (e.g., StackEdit).

Your README.md should include:

  • Instructions for us to run your project to reproduce your results. Tell us which Python modules to run to get your results and anything else we need to do to run them.
  • Instructions for anything we need to do to set up your project, like installing libraries or downloading your data (if you did not submit it).
  • Anything else we need to know about running your project!

Your instructions should be detailed enough that your mentor can run your code to reproduce any of the results in your report. You can assume the reader of your instructions is familiar with programming environments in Python and has read your report. You should not assume your mentor will spend time "figuring out" how to run your project beyond what your instructions say, so make sure your instructions are clear and unambiguous.

If you are looking at the past project gallery, the information here corresponds to the "Reproducing the Results" sections of those reports. This requirement to put your instructions in a separate file is new.

Submission

For this part of the project, you will submit:

  • Your report as a PDF. Do not turn in a Word document or plain text.
  • Your README.md explaining how to set up your project and how to run your code to reproduce your results.
  • Your Python files and any other supporting files you need. This should include any tests you write. You do not need to submit your dataset for this part, but there should be clear instructions in your README.md for the course staff to download the data so they can run your program.

One group member should submit your report on Gradescope and should use the Group Members functionality to add the appropriate partner(s) if you have them. If you want to learn about how to add Group Members on Gradescope, please see instructions here. Group members who are not listed in Gradescope by the due date will be marked as not submitted.

Recall, you are only able to make resubmissions on take-home assessments so the deadline