CSE 163, Winter 2020: Final Project: Part 1

Overview

Now that your Part 0 project proposal has been approved, it is time to get started! You need to provide more background and decide on the details of how you will answer your research questions. For Part 1, you will submit a revised version of your Part 0 project proposal. Incorporate any feedback you received from course staff and any further refinements you have made. In addition, you must include three new sections: one on your motivation, one on your methodology, and one on your work plan.

Report

Your Part 1 project proposal should include the following sections:

  1. Title and author(s).
  2. Summary of research questions. This should include your research questions from Part 0 (or modified versions if you have changed them since Part 0). You don't need to include quite as much detail here, since the motivation and background section and the methodology section below cover it. Your research questions should still be clear in this section.
  3. Motivation and background. Explain the context and why the problem matters. This expands on the research questions that you already stated. Why are they worth computing? What difference would knowing the answers make? We require a problem with some kind of real-world motivation.
  4. Dataset. Same as before.
  5. Methodology (algorithm or analysis). Write a complete, clear, unambiguous English description of the analysis you will perform. This should be sufficient for someone else to write a Python program (or perform manual computations) that reproduces your results, without access to your source code, and without having to guess or make significant design choices. This description is also likely to be helpful to people who read your code later.
    This section explains how your analysis works on an abstract level, focusing on the problem domain and your algorithm. It should be written to be read by a scientist or engineer, not by a programmer. It should say nothing about specific implementation choices, such as how your code is organized or implemented (such details belong in code comments), what data structures are iterated over, and the like.
    This section should tie your computations to your research questions, indicating exactly what results would lead you to what conclusions.

    Perhaps of Interest: Correlation

    You have learned some elementary statistics in CSE 163. One other concept you might find useful is correlation, a measure of whether two variables' values are related, such as when one variable depends on another. Correlation is a quantitative measure of how good a "best fit line" you can draw on a scatterplot. A concrete metric for correlation is the Pearson coefficient r. It is better to import and use SciPy's pearsonr function rather than implementing it yourself. After you have computed r, you will need to interpret it. You may find this handout useful.

  6. Work Plan. Include a breakdown of the remainder of your project (that is, the work you will do for Part 2) into at least 3 and no more than 6 parts, and an estimate of the time you will spend on each part. This does not need to be down to the level of the functions you will implement, but it is fine if you include that level of detail (more detail helps you). If you are working with a group, you must also describe how you will develop and test your code and coordinate other aspects of working together as a team (e.g. sharing access to source code, dividing responsibilities, or working together). All group members must contribute equally to BOTH the code and the report writing.

    Perhaps of Interest: Pair Programming

    If you are working with a group, you may be interested in trying pair programming. Pair programming is a technique that is part of the Extreme Programming and Agile software development approaches used in some software companies. Here are a few references on pair programming:

    Note: You are not required to work in pairs; working individually is fine.

    Perhaps of Interest: Version Control System

    When working with multiple people on a code project, it is very helpful to have some system in place to manage changes to code. There is a whole industry of software built to help developers collaborate on code, and by far the most popular tool is called Git. There are many, many tutorials online for getting started with Git, so if this is something you want to use, I recommend searching the internet for help. One of the guides I find most accessible can be found here.

    Using Git is one component, but another is actually storing your code in a place that is accessible to you and your group. Most people use a website called GitHub to host their repositories so that many people can access their code (surprisingly, they have also written a Guide to Git).

    There is no requirement that you use a version control system like this, but it might be something you find helpful!

  7. Questions. If you have any specific questions for us, feel free to add them here. (Not required.)
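
For the correlation note under Methodology, here is a minimal sketch of how the Pearson coefficient r might be computed with SciPy's pearsonr function. The two lists and their names are hypothetical data invented for illustration; they do not come from any course dataset, and your own analysis would pass in columns from your chosen dataset instead.

```python
from scipy.stats import pearsonr

# Hypothetical data, invented purely for illustration.
hours_studied = [1, 2, 3, 4, 5, 6]
exam_score = [52, 55, 61, 70, 72, 80]

# pearsonr returns the correlation coefficient r and a p-value.
r, p_value = pearsonr(hours_studied, exam_score)

print(f"r = {r:.3f}, p-value = {p_value:.4f}")
```

As a rough guide to interpretation, r always lies between -1 and 1. Values near 1 or -1 indicate a strong positive or negative linear relationship, and values near 0 indicate little linear relationship. A high r by itself does not show that one variable causes the other.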

Submission

Submit your Part 1 as a PDF file. Do not turn in a Word document or plain text. One group member should submit your report on Gradescope and use the Group Members functionality to add the other group members, if you have any. If you want to learn how to add Group Members on Gradescope, please see the instructions here. Group members who are not listed in Gradescope by the late cutoff will be marked as not submitted.

Reminder: You can submit Part 1 at most one day late, regardless of the number of late days you may have remaining.