CSE 163, Spring 2019: Homework 2: Part 0

Overview

In this part of the homework, you will write code to perform various analytical operations on data parsed from a file.

Expectations

  • All functions for this part of the assignment should be written in hw2_manual.py
  • For this part of the assignment, you may import the math module, but you may not use any other imports to solve these problems.

Step 0: Parsing (We do this step for you)

The first step is to process the CSV into a usable data structure. Like in class, we will use the parse function we wrote to take a CSV and turn it into a list of dictionaries. For example, if we had the file test.csv with the contents:

a,b,c
Foo,2,3
4,5,6

To parse the file and convert the 'b' and 'c' columns to integers, we call parse('test.csv', ['b', 'c']), which would return:

[{'a': 'Foo', 'b': 2, 'c': 3}, {'a': '4', 'b': 5, 'c': 6}]

Note that all the values associated with the 'b' and 'c' columns are integers, while the values in the 'a' column are still strings (including '4').
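To make the behavior above concrete, here is a minimal sketch of a function that produces the same structure. It is not the course's parse function: the name parse_sketch is hypothetical, and it takes CSV text instead of a file name so that it runs on its own. It imports csv and io, so code like this does not belong in hw2_manual.py.

```python
import csv
import io


def parse_sketch(csv_text, int_cols):
    """Hypothetical stand-in for parse: CSV text -> list of dictionaries."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row = dict(row)  # one plain dictionary per CSV row
        for col in int_cols:
            row[col] = int(row[col])  # only the listed columns become ints
        rows.append(row)
    return rows


test_csv = "a,b,c\nFoo,2,3\n4,5,6\n"
print(parse_sketch(test_csv, ['b', 'c']))
# [{'a': 'Foo', 'b': 2, 'c': 3}, {'a': '4', 'b': 5, 'c': 6}]
```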

Step 1: Analysis

For this step of the assignment, you will be implementing various functions to answer questions about the dataset.

Each function should take the list returned by the parse function as the first argument, along with any other arguments specified in each problem. For example, for the third function, we would call filter_range(data, 1, 10) where data is the list returned by parse.

  • This data structure should not be modified by any function you write.
  • Every problem that deals with strings should be case-sensitive (this means "chArIzard" is a different species than "Charizard").
  • You may assume the list is non-empty for all functions you implement.
  • For each problem, you may assume we pass parameters of the expected types described for that problem and that those parameters are not None.
  • You should make no other assumptions about the parameters or the data.

For each of the problems, we will use the file pokemon_test.csv to show what should be returned.

id,name,level,personality,type,weakness,atk,def,hp,stage
59,Arcanine,35,impish,fire,water,50,55,90,2
59,Arcanine,35,gentle,fire,water,45,60,80,2
121,Starmie,67,sassy,water,electric,174,56,113,2
131,Lapras,72,lax,water,electric,107,113,29,1

Problem 1: species_count

Write a function species_count that returns the number of unique Pokemon species (determined by the name attribute) found in the dataset. You may assume that the data is well formatted in the sense that you don't have to transform any values in the name column.

For example, assuming we have parsed pokemon_test.csv and stored it in a variable called data:

species_count(data)  # 3
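One possible approach, shown only as a sketch, is to collect the names in a set, which keeps each distinct (case-sensitive) name once. The data list below is a hand-written stand-in that keeps only the name column.

```python
def species_count(data):
    """Count the distinct values stored under the 'name' key."""
    names = set()
    for pokemon in data:
        names.add(pokemon['name'])  # duplicates are ignored by the set
    return len(names)


# Stand-in for the parsed file, reduced to the column this function reads.
data = [
    {'name': 'Arcanine'},
    {'name': 'Arcanine'},
    {'name': 'Starmie'},
    {'name': 'Lapras'},
]
print(species_count(data))  # 3
```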

Problem 2: max_level

Write a function max_level that finds the Pokemon with the max level and returns a tuple of length 2, where the first element is the name of the Pokemon and the second is its level. If there is a tie, the Pokemon that appears earlier in the file should be returned.

For example, assuming we have parsed pokemon_test.csv and stored it in a variable called data:

max_level(data)  # ('Lapras', 72)
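The tie-breaking rule can be handled with a strict comparison, as in the sketch below. This is one possible approach, not a required implementation. The data list is made up for illustration (it is not pokemon_test.csv) so that two Pokemon share the highest level.

```python
def max_level(data):
    """Return (name, level) of the first Pokemon with the highest level."""
    best = data[0]  # the list is assumed to be non-empty
    for pokemon in data[1:]:
        # Strict > keeps the earlier Pokemon when levels tie.
        if pokemon['level'] > best['level']:
            best = pokemon
    return (best['name'], best['level'])


# Made-up rows with a tie at level 72.
data = [
    {'name': 'Arcanine', 'level': 35},
    {'name': 'Starmie', 'level': 72},
    {'name': 'Lapras', 'level': 72},
]
print(max_level(data))  # ('Starmie', 72)
```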

Problem 3: filter_range

Write a function called filter_range that takes as arguments a smallest (inclusive) and largest (exclusive) level value and returns a list of the names of the Pokemon whose level is within that range. The list should contain the names in the same order that they appear in the provided list of dictionaries, and a name should appear once for each matching Pokemon.

For example, assuming we have parsed pokemon_test.csv and stored it in a variable called data:

filter_range(data, 30, 70)  # ['Arcanine', 'Arcanine', 'Starmie']
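The inclusive lower bound and exclusive upper bound map directly onto a chained comparison. The following is a sketch of one possible approach; the parameter names low and high are illustrative, and the data list is a stand-in that keeps only the name and level columns.

```python
def filter_range(data, low, high):
    """Return names of Pokemon with low <= level < high, in input order."""
    names = []
    for pokemon in data:
        if low <= pokemon['level'] < high:
            names.append(pokemon['name'])
    return names


# Stand-in for the parsed file, reduced to the columns this function reads.
data = [
    {'name': 'Arcanine', 'level': 35},
    {'name': 'Arcanine', 'level': 35},
    {'name': 'Starmie', 'level': 67},
    {'name': 'Lapras', 'level': 72},
]
# Level 35 is included (inclusive bound); level 67 is not (exclusive bound).
print(filter_range(data, 35, 67))  # ['Arcanine', 'Arcanine']
```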

Problem 4: mean_attack_for_type

Write a function called mean_attack_for_type that takes a Pokemon type (string) as an argument and returns the average attack stat (the atk column) for all the Pokemon in the dataset with that type.

If there are no Pokemon of the given type, this function should return None.

For example, assuming we have parsed pokemon_test.csv and stored it in a variable called data:

mean_attack_for_type(data, 'fire')  # 47.5
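A running total and count make the None case easy to detect. This is a sketch of one possible approach; the parameter name pokemon_type is illustrative, and the data list is a stand-in that keeps only the type and atk columns.

```python
def mean_attack_for_type(data, pokemon_type):
    """Return the mean 'atk' of Pokemon with the given type, or None."""
    total = 0
    count = 0
    for pokemon in data:
        if pokemon['type'] == pokemon_type:  # case-sensitive match
            total += pokemon['atk']
            count += 1
    if count == 0:
        return None  # no Pokemon of this type, so there is no mean
    return total / count


# Stand-in for the parsed file, reduced to the columns this function reads.
data = [
    {'type': 'fire', 'atk': 50},
    {'type': 'fire', 'atk': 45},
    {'type': 'water', 'atk': 174},
    {'type': 'water', 'atk': 107},
]
print(mean_attack_for_type(data, 'fire'))   # 47.5
print(mean_attack_for_type(data, 'grass'))  # None
```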

Problem 5: count_types

Write a function called count_types that returns a dictionary with keys that are Pokemon types and values that are the number of times that type appears in the dataset.

The order of the keys in the returned dictionary does not matter. For efficiency, your solution should NOT iterate over the whole dataset once for each type of Pokemon.

For example, assuming we have parsed pokemon_test.csv and stored it in a variable called data:

count_types(data)  # {'water': 2, 'fire': 2}
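A single loop that updates a dictionary as it goes avoids rescanning the dataset for each type. The sketch below shows one possible way to do this; the data list is a stand-in that keeps only the type column.

```python
def count_types(data):
    """Count how many Pokemon have each type, in one pass over data."""
    counts = {}
    for pokemon in data:
        pokemon_type = pokemon['type']
        if pokemon_type not in counts:
            counts[pokemon_type] = 0  # first time this type is seen
        counts[pokemon_type] += 1
    return counts


# Stand-in for the parsed file, reduced to the column this function reads.
data = [
    {'type': 'fire'},
    {'type': 'fire'},
    {'type': 'water'},
    {'type': 'water'},
]
print(count_types(data))  # {'fire': 2, 'water': 2}
```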

Problem 6: highest_stage_per_type

Write a function called highest_stage_per_type that calculates the largest stage reached for each type of Pokemon in the dataset. This function should return a dictionary whose keys are the Pokemon types and whose values are the highest value of the stage column for that type of Pokemon.

The order of the keys in the returned dictionary does not matter. For efficiency, your solution should NOT iterate over the whole dataset once for each type of Pokemon.

For example, assuming we have parsed pokemon_test.csv and stored it in a variable called data:

highest_stage_per_type(data)  # {'water': 2, 'fire': 2}
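The same one-pass idea works here by keeping the largest stage seen so far for each type. This is a sketch of one possible approach; the data list is a stand-in that keeps only the type and stage columns.

```python
def highest_stage_per_type(data):
    """Map each type to the largest 'stage' seen for it, in one pass."""
    highest = {}
    for pokemon in data:
        pokemon_type = pokemon['type']
        stage = pokemon['stage']
        # Record the stage if the type is new or this stage is larger.
        if pokemon_type not in highest or stage > highest[pokemon_type]:
            highest[pokemon_type] = stage
    return highest


# Stand-in for the parsed file, reduced to the columns this function reads.
data = [
    {'type': 'fire', 'stage': 2},
    {'type': 'fire', 'stage': 2},
    {'type': 'water', 'stage': 2},
    {'type': 'water', 'stage': 1},
]
print(highest_stage_per_type(data))  # {'fire': 2, 'water': 2}
```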

Problem 7: mean_attack_per_type

Write a function called mean_attack_per_type that calculates the average attack for every type of Pokemon in the dataset. This function should return a dictionary that has keys that are the Pokemon types and values that are the average attack for that Pokemon type.

The order of the keys in the returned dictionary does not matter. For efficiency, your solution should NOT iterate over the whole dataset once for each type of Pokemon.

For example, assuming we have parsed pokemon_test.csv and stored it in a variable called data:

mean_attack_per_type(data)  # {'water': 140.5, 'fire': 47.5}
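One way to stay within a single pass over the dataset is to accumulate a total and a count per type, then divide at the end. The following is a sketch under that assumption, not a required implementation; the data list is a stand-in that keeps only the type and atk columns.

```python
def mean_attack_per_type(data):
    """Map each type to the mean 'atk' of its Pokemon."""
    totals = {}
    counts = {}
    for pokemon in data:  # one pass over the dataset
        pokemon_type = pokemon['type']
        totals[pokemon_type] = totals.get(pokemon_type, 0) + pokemon['atk']
        counts[pokemon_type] = counts.get(pokemon_type, 0) + 1
    means = {}
    for pokemon_type in totals:  # one pass over the types, not the data
        means[pokemon_type] = totals[pokemon_type] / counts[pokemon_type]
    return means


# Stand-in for the parsed file, reduced to the columns this function reads.
data = [
    {'type': 'fire', 'atk': 50},
    {'type': 'fire', 'atk': 45},
    {'type': 'water', 'atk': 174},
    {'type': 'water', 'atk': 107},
]
print(mean_attack_per_type(data))  # {'fire': 47.5, 'water': 140.5}
```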