CSE 163, Spring 2019: Homework 2: Processing CSV Data

Submission

This assignment and its reflection are due by Thursday, April 18 at 11:59 pm.

You should submit your finished hw2_manual.py, hw2_pandas.py, and hw2_test.py on Gradescope and the reflection on Google Forms.

Overview

In this assignment, you will apply the foundational Python skills you've been developing to analyze a small dataset. Many of the datasets you’ll be working with are structured as CSV files or other tabular representations. This assignment is an introduction to reading, processing, and grouping rows and columns to calculate some interesting statistics. A strong foundation in these skills will be very useful when we work with much larger (and less complete) real-world datasets!

This assignment is broken into two main parts, where each part does mostly the same computations in a different way. This gives you the opportunity to compare and contrast different approaches to solving problems.

Learning Objectives

After this homework, students will be able to:

  • Follow a Python development workflow for this course, including:
    • Writing a Python script from scratch and turning in the assignment.
    • Using the course infrastructure (flake8, test suites, course resources).
  • Use Python to review CS1 programming concepts and implement programs that follow a specification, including:
    • Using and manipulating various data types, including numbers and strings.
    • Control structures (conditional statements, loops, parameters, returns) to implement basic functions provided by a specification.
    • Basic text file processing.
    • Documenting code.
  • Write unit tests for each function written, including its edge cases.
  • Work with data structures (lists, sets, dictionaries) in Python
  • Process structured data in Python with CSV files as input with and without a library (Pandas)
    • Handle edge cases appropriately, including addressing missing values/data
    • Practice user-friendly error-handling
  • Apply programming to identify and investigate a question on a dataset using basic statistical concepts (e.g. mean, max)
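
As a rough illustration of the unit-testing objective above, here is a minimal sketch that tests one small function on a typical input and on a few edge cases. The function largest and the test test_largest are hypothetical names made up for this example, and the sketch uses plain assert statements rather than any course-provided helper.

```python
def largest(numbers):
    """Returns the largest value in numbers, or None if numbers is empty."""
    if not numbers:
        return None
    result = numbers[0]
    for n in numbers[1:]:
        if n > result:
            result = n
    return result


def test_largest():
    """Tests largest on a typical list and on several edge cases."""
    assert largest([3, 7, 2]) == 7  # typical case
    assert largest([5]) == 5  # single element
    assert largest([-4, -1]) == -1  # all values negative
    assert largest([]) is None  # empty input


test_largest()
print("All tests passed")
```

Each assert covers a different kind of input, so a failure points directly at the case that broke.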

Expectations

Here are some baseline expectations you should meet:

Files

You should download the starter code hw2.zip and open it as the project in Visual Studio Code. The files included are:

  • hw2_manual.py: The file for you to put solutions to Part 0. This contains some starter code.
  • hw2_pandas.py: The file for you to put solutions to Part 1.
  • hw2_test.py: The file for you to put your tests for Part 0 and Part 1.
  • cse163_utils.py: A file where we will store utility functions for helping you write tests.
  • hw2_main.py: A client program provided to call your functions.
  • pokemon_box.csv: A CSV file that stores information about Pokemon. The columns of this file are explained below.
  • pokemon_test.csv: A very small CSV file that stores information about Pokemon. The columns of this file are explained below.

Data

For this assignment, you will be working with a dataset of Pokemon that you have caught on your Pokemon journey so far. The file pokemon_box.csv stores all the data about the captured Pokemon and has a format that looks like:

id   name       level  personality  type   weakness  atk  def  hp   stage
1    Bulbasaur  12     Jolly        Grass  Fire      45   50   112  1
...  ...        ...    ...          ...    ...       ...  ...  ...  ...

Note that because this is a CSV file, the cells in the file itself are separated by commas rather than laid out as a table.
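
To make the comma-separated layout concrete, here is a minimal sketch that parses the header and the Bulbasaur row shown above with Python's built-in csv module. The text is held in a string instead of being read from pokemon_box.csv so the example is self-contained. This is only an illustration of the format, not necessarily how the starter code reads the file.

```python
import csv
import io

# The header and the Bulbasaur row from the table above, written the way
# they would appear inside a CSV file (cells separated by commas).
raw_text = (
    "id,name,level,personality,type,weakness,atk,def,hp,stage\n"
    "1,Bulbasaur,12,Jolly,Grass,Fire,45,50,112,1\n"
)

# csv.DictReader treats the first line as the column names and yields
# one dictionary per remaining line.
rows = list(csv.DictReader(io.StringIO(raw_text)))

first = rows[0]
print(first["name"])  # Bulbasaur
print(repr(first["level"]))  # '12' (still a string at this point)
```

Note that every value comes back as a string, so numeric columns such as level need to be converted before doing arithmetic on them.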

Column Descriptions

  • id: Unique identification number corresponding to the species of a Pokemon. Note that if there are multiple Pokemon of the same species in the dataset, they all share the same id.
  • name: Name of the species of Pokemon. For example, Pikachu.
  • level: The level of this Pokemon (an integer)
  • personality: A one-word string describing the personality of this Pokemon
  • type: A one-word string describing the type of the Pokemon (e.g. "Grass" for Bulbasaur)
  • weakness: What type this Pokemon is weak to. For example, Bulbasaur is considered weak to the fire type.
  • atk, def, hp: Pokemon stats that indicate how strong its attacks are (atk), how much hits affect it (def), and how many hits it can take (hp)
  • stage: Indicates if this Pokemon has evolved into a new species. For example, Charmander (stage 1) evolves into Charmeleon (stage 2), which evolves into Charizard (stage 3).
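
The column descriptions above can be sketched in code as follows. This is a hedged example, not part of the assignment solution: the helper convert_row is a hypothetical name, treating id, atk, def, hp, and stage as integers is an assumption based on the example row (only level is explicitly described as an integer), and the second and third sample rows are invented for illustration.

```python
HEADER = "id,name,level,personality,type,weakness,atk,def,hp,stage".split(",")

# Assumption: these columns hold whole numbers, as in the example row.
INT_COLUMNS = ["id", "level", "atk", "def", "hp", "stage"]


def convert_row(raw_row):
    """Returns a copy of raw_row with the numeric columns converted to int."""
    row = dict(raw_row)
    for column in INT_COLUMNS:
        row[column] = int(row[column])
    return row


# The first line is the row shown in the table above. The other two are
# made up for this sketch and are not taken from pokemon_box.csv.
lines = [
    "1,Bulbasaur,12,Jolly,Grass,Fire,45,50,112,1",
    "1,Bulbasaur,30,Calm,Grass,Fire,60,62,140,1",
    "4,Charmander,9,Brave,Fire,Water,52,43,39,1",
]
raw_rows = [dict(zip(HEADER, line.split(","))) for line in lines]
rows = [convert_row(raw_row) for raw_row in raw_rows]

# Pokemon of the same species share an id, so an id can appear in
# several rows. Count how many rows there are for each id.
species_counts = {}
for row in rows:
    species_counts[row["id"]] = species_counts.get(row["id"], 0) + 1

print(species_counts)  # {1: 2, 4: 1}
```

Because id identifies a species and not an individual Pokemon, it cannot be used as a unique key for rows.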

Evaluation

Your submission will be evaluated on the following dimensions:

  • Your solution correctly implements the described behaviors. You will have access to some tests when you turn in your assignment, but we will withhold other tests to test your solution when grading. All behavior we test is completely described by the problem specification or shown in an example.
  • Your code meets our style requirements:
    • All files submitted pass flake8
    • Every function written is commented using a docstring that describes its behavior, parameters, and returns, and highlights any special cases.
    • There is a comment at the top of each file you write with your name, section, and a brief description of what that program does.
    • All expectations on this page and on the sub-pages for the assignment are met, as are all requirements for each of the problems.
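
As one possible reading of the commenting requirements above, here is a minimal sketch of a file with a header comment and a documented function. The function mean_of is a hypothetical example, the name and section in the header are placeholders, and the exact wording your graders expect may differ.

```python
# Name: Your Name Here
# Section: XX
# A short sketch that computes the mean of a list of numbers.


def mean_of(values):
    """
    Takes a list of numbers (values) and returns their arithmetic mean
    as a float.

    Special case: returns None if values is empty.
    """
    if len(values) == 0:
        return None
    return sum(values) / len(values)


print(mean_of([1, 2, 3]))  # 2.0
print(mean_of([]))  # None
```

The docstring states the behavior, the parameter, the return value, and the special case, which are the four things the style requirement asks for.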