flake8
Some requirements are checked by flake8, while others require you to check them manually. All the code files you submit are expected to pass flake8 and follow the code quality guidelines outlined below.
Note
Other programming languages (e.g. Java) have different conventions for naming. Since we are using Python in this class, you are expected to use the Python conventions.
Variable Names¶
Your variable names should be descriptive, concise, and lowercase with words separated by underscores (snake_case). Try not to use single-letter names for variables, because a single letter usually doesn’t describe the value a variable holds very well; single letters are fine for loop variables. Python keywords, such as class, import, or return, cannot be used as variable names at all. You should also avoid names of built-in functions or types, such as print or len, because reusing them hides the built-in (keywords and built-ins will be highlighted in the code editor).
A list of keywords can be found by typing the following into the Python interpreter:
import keyword
keyword.kwlist
# Good variable names
factor
total_weight
data1 # Okay to have a number right after a word
# Bad variable names
x # Most likely not descriptive enough,
# except for something like an x, y coordinate
totalWeight # Not using snake_case
NewImage # Not using snake_case
newresult # Not separating words with underscores
return # Python keyword
len # Python built-in function
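As a rough illustration of the two categories above, the sketch below classifies a candidate name using the standard keyword and builtins modules. The helper name check_name is hypothetical and not part of any course material.

```python
import builtins
import keyword


def check_name(name):
    """
    Returns 'keyword' if name is a Python keyword, 'built-in' if it
    matches a built-in function or type, and 'ok' otherwise.
    """
    if keyword.iskeyword(name):
        return 'keyword'
    if hasattr(builtins, name):
        return 'built-in'
    return 'ok'


print(check_name('return'))        # keyword
print(check_name('len'))           # built-in
print(check_name('total_weight'))  # ok
```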
Function Names¶
Your function names should be concise, descriptive, and lowercase with words separated by underscores (snake_case). Just like variable names, you should also try to avoid using names of built-in Python functions. We usually specify the function names in the spec and you should make sure that you name your functions accordingly.
In some of the assignments, we ask you to write test functions to test your code. The name of your test function should clearly indicate which function it is testing.
Examples:
# Good function names
get_height(person)
plot_population(df)
# Bad function names
setScore(record) # not using snake_case
max() # will overwrite the built-in max function in Python
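To see why reusing a built-in name causes trouble, consider this minimal sketch (the function name demo_shadowing is made up for illustration). Once the name max refers to a number, calling it no longer works.

```python
def demo_shadowing(values):
    """
    Rebinds the name max to a number and then tries to call it.
    Returns the resulting error message.
    """
    max = 0  # hides the built-in max inside this function
    try:
        return max(values)
    except TypeError as error:
        return str(error)


print(demo_shadowing([3, 1, 2]))
```

The call fails with a TypeError because max now refers to an int rather than to the built-in function.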
Note
Most of the whitespace requirements below should be handled by flake8; you will receive a warning if you are not meeting some of the requirements.
Indentation¶
Unlike other languages that use explicit delimiters (like {} in Java) to determine what goes inside a loop or function, Python only uses indentation. It is extremely important to indent your code properly, because incorrect indentation can lead to unexpected bugs in your program. If you see an error saying IndentationError: unexpected indent, it means there is an error in your indentation.
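Indentation mistakes do not always raise an error; sometimes they silently change what the code does. In this sketch (both function names are made up for illustration), the only difference is how far one return statement is indented.

```python
def sum_all(values):
    """
    Returns the sum of all numbers in the list values.
    """
    total = 0
    for value in values:
        total += value
    return total


def sum_first_only(values):
    """
    Intended to do the same as sum_all, but the first return is
    indented one level too far, so the loop exits on its first pass.
    """
    total = 0
    for value in values:
        total += value
        return total  # runs inside the loop because of its indentation
    return total


print(sum_all([1, 2, 3]))         # 6
print(sum_first_only([1, 2, 3]))  # 1
```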
Blank Lines¶
In general, there should be two blank lines separating code for different functions and function definitions from other code. Try to minimize blank lines within function definitions, except when separating complex chunks of code/logic to provide readability.
Good Example 1 (two blank lines separating functions):
# Other code


def the_first_function():
    # Implementation for the first function


def the_second_function():
    # Implementation for the second function


# Other code
Good Example 2 (one blank line separating “complex” chunks of code):
def compute_avg(x, y, z):
    """
    Sums the given three numbers x, y and z
    and returns the average.
    """
    sum_val = x + y + z

    # compute and return the average value
    result = sum_val / 3.0
    return result


# other code
Operators¶
Include spaces between mathematical and comparison operators and the other elements in an expression. Mathematical operators include +, -, *, /, and **. Comparison operators include ==, <=, >=, <, and >. Use exactly one space on each side of an operator to avoid unnecessary whitespace. The exception is parentheses, which should be directly adjacent to whatever they enclose.
Examples:
# Good
x + y
(total ** 2) + 4 * val - 1
x * (4 + 6)
b + math.sqrt(4 * max_val)
# Bad
x+y # not enough spaces
(total**2)+4*val-1 # not enough spaces
x * ( 4 + 6 ) # unnecessary spaces around parentheses
b + math.sqrt( 4 * max_val) # inconsistent spacing
# Exception: no spaces around = for default values and keyword arguments
def add_three_nums(a, b, c=10):
    return a + b + c


add_three_nums(b=20, a=20, c=30)
Function Calls¶
Avoid adding extra space(s) between a function name and its associated parameter list. Using the space suggests (incorrectly) that the parentheses are for grouping an expression, when in fact they are for calling the function.
You should, however, include spaces between individual parameters in the parameter list. This makes your function definitions and calls more readable.
Examples:
# Good
x = math.sqrt(n)
range_vals = range(n, 4)
# Bad
x = math.sqrt (n) # too much space before the parenthesis
range_vals = range(n,4) # no spaces between parameters
Line Length¶
According to flake8, a line should be at most 79 characters long. You should try to avoid writing code with long lines, but here are some common ways to break a long line:
# Calling a function with many arguments
some_function(first_arg="This is a function",
              second_arg="With many arguments",
              third_arg="indent until everything lines up")

# Breaking a long expression
# (use \ after an operator and indent once on the second line)
total = first_num + second_num + third_num + \
    fourth_num

# Breaking a long string
print("When you have a very very very long string, "
      "this is how you could break it properly")
Main Method Pattern¶
For most of the assignments, when we say that certain files should use the main method pattern, it means those files should follow the structure below:
# Implementation for functions specified in the HW spec


def main():
    # Code calling all functions that you implemented


if __name__ == '__main__':
    main()
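A minimal concrete instance of this pattern might look like the following sketch. The function add_two_numbers stands in for whatever the spec asks you to implement. Because of the __name__ check, main runs when the file is run directly but not when the file is imported by another file, such as a test program.

```python
def add_two_numbers(a, b):
    """
    Returns the sum of the two given numbers a and b.
    """
    return a + b


def main():
    print(add_two_numbers(1, 2))


if __name__ == '__main__':
    main()
```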
Documentation¶
For every Python file you write, you should include a header comment at the top of the file with your name, section, and a brief description of what the program in the file does. The header comment should be in doc-string (""" or ''') format.
Each function should contain a doc-string, right below the function definition, describing what the function does. It should describe any parameters the function takes and any values it returns. If the spec requires you to handle special cases (like case sensitivity, or special return values for a certain input), you should mention those in the doc-string as well. It is okay to not include a comment for your main method.
Also, long blocks of code with a particular purpose, or small bits of particularly complex code, can be labeled or explained by a comment (using #) on the preceding line(s) to help with readability.
Example:
"""
Jane Doe
CSE 163 AX
This program implements a function that adds two numbers
"""
# add any import below the file header comment


def add_two_numbers(a, b, return_zero=False):
    """
    This function returns the sum of the two given numbers a, b
    if return_zero is False. Otherwise returns 0.
    return_zero has False as its default value.
    """
    if return_zero:
        return 0
    else:
        return a + b
Try to avoid comments that state unnecessary information or contain too much implementation detail (e.g. “initialize variable”, “nested for loop”, or “increment count”). Your function comments should describe what the function does (its behavior), not how it does it.
Example:
# Bad
# Note that the usage of a set and a loop are implementation details,
# as in they tell you HOW the method works.
# Returning the length of the set doesn't mean much to the clients.
def count_unique_characters(file_name):
    """
    Opens the given file, loops through each character and adds them
    to a set. Returns the length of the set.
    """
    result = set()
    with open(file_name) as f:
        content = f.read()
        for c in content:
            result.add(c)
    return len(result)


# Good
def count_unique_characters(file_name):
    """
    Takes in a file name and returns the number of unique characters within
    the given file.
    """
    result = set()
    with open(file_name) as f:
        content = f.read()
        for c in content:
            result.add(c)
    return len(result)
One way to think about it is if you are the client of the code and you can only see the function declaration and the comment, the comment should contain all the necessary information that you need in order to properly use the function. You probably won’t care whether the function is using a for loop, while loop, or just some if-else statements, but you will need to know the input parameters and the expected behavior of the function, especially under special cases.
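One way to check a doc-string from the client's point of view is to read it without looking at the function body. The sketch below does this with inspect.getdoc from the standard library; the function count_vowels is a made-up example, not something from a course spec.

```python
import inspect


def count_vowels(text):
    """
    Takes in a string text and returns the number of vowels
    (a, e, i, o, u) in it, ignoring case. Returns 0 for an empty string.
    """
    count = 0
    for char in text.lower():
        if char in 'aeiou':
            count += 1
    return count


# Prints only what a client would see: the doc-string, not the code
print(inspect.getdoc(count_vowels))
```

If the printed text alone tells you what to pass in and what comes back, including the empty-string case, the comment is doing its job.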
Testing¶
In some of the assignments, we ask you to write a test program to test your implementation. Your test files should meet the same style requirements as specified above, including:
- When we run your code, it should produce no errors or warnings. The point of writing a test program is that you can use it to verify the correctness of your code, so you should make sure to run it before turning in your assignments.
- There is a file header comment at the top of the file with your name, section, and a brief description of what the program does.
- Use the main method pattern.
- Use good naming conventions for variables and functions (snake_case).
- Each of the test functions should have a descriptive name that indicates which function is being tested. For example, if it is testing the function add_two_numbers, the test function should be named test_add_two_numbers.
- Each of the test functions should be commented in doc-string format. Unlike comments for the main solution of your take-home assessment, the comments for test functions can be relatively simple. For example, if the function is testing add_two_numbers, the comment can simply be “Tests the function add_two_numbers”.
- Each test function should contain the required number of tests from the spec and additional tests that you come up with. A single test is considered a call on assert_equals. You are highly encouraged to add more test cases to test your functions more comprehensively.
- All test functions should be called from main.
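Put together, a test file that follows the list above might look roughly like this sketch. The name, section, and function under test are placeholders. The assert_equals defined here is only a hypothetical stand-in so that the example runs on its own; in an assignment you would use the helper provided with the course materials, whose argument order may differ from the (expected, received) order assumed here.

```python
"""
Jane Doe
CSE 160 AX
Tests the function add_two_numbers.
"""


def assert_equals(expected, received):
    """
    Hypothetical stand-in for the course's assert_equals helper.
    Raises an AssertionError if the two values differ.
    """
    assert expected == received, \
        f'Expected {expected}, but received {received}'


def add_two_numbers(a, b):
    """
    Returns the sum of the two given numbers a and b.
    """
    return a + b


def test_add_two_numbers():
    """
    Tests the function add_two_numbers.
    """
    assert_equals(3, add_two_numbers(1, 2))
    assert_equals(0, add_two_numbers(-1, 1))


def main():
    test_add_two_numbers()


if __name__ == '__main__':
    main()
```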
Imports¶
All packages that you need for completing the homework assignments will be stated in the spec. You should avoid importing extra packages, since they might include advanced material that makes the problem trivial to solve (see Advanced Material below).
All import statements should be located at the top of each file, below your file header comment. You should also remove unused imports as warned by flake8.
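As a small sketch of this layout, the file below places its only import directly under the header comment and actually uses it. The name, section, and function are placeholders chosen for illustration.

```python
"""
Jane Doe
CSE 160 AX
This program computes the area of a circle.
"""
import math


def circle_area(radius):
    """
    Returns the area of a circle with the given radius.
    """
    return math.pi * radius ** 2


print(circle_area(1))
```

If the math import were left in after the last use of math was deleted, flake8 would report it as imported but unused.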
Efficiency and Redundancy¶
In general, your code should avoid redundancy and unnecessary computation.
Factoring¶
If you have lines of code that appear in multiple places in your program, you should consider trying to cut down on redundancy with some kind of factoring. Don’t write the same code again if you already have a function that performs that action. Just call the function.
The following example shows how to factor if/else statements:
# Bad
# Note that there are repeated lines of logic that actually always happen,
# instead of conditionally like how our structure is set up. We can factor
# these out to simplify and clean our code.
if x % 2 == 0:
    print("Hello!")
    print("I love even numbers too.")
    print("See you later!")
else:
    print("Hello!")
    print("I don't like even numbers either.")
    print("See you later!")

# Good
print("Hello!")
if x % 2 == 0:
    print("I love even numbers too.")
else:
    print("I don't like even numbers either.")
print("See you later!")
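Factoring also applies to calling a function you already have instead of repeating its body. In this sketch (both function names are made up for illustration), the second function reuses the first rather than copying the conversion formula.

```python
def fahrenheit_to_celsius(temp_f):
    """
    Returns the given Fahrenheit temperature converted to Celsius.
    """
    return (temp_f - 32) * 5 / 9


def average_celsius(temps_f):
    """
    Takes in a non-empty list of Fahrenheit temperatures and returns
    their average in Celsius.
    """
    total = 0
    for temp_f in temps_f:
        # Reuse the existing function instead of repeating the formula
        total += fahrenheit_to_celsius(temp_f)
    return total / len(temps_f)


print(average_celsius([32, 212]))  # 50.0
```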
Boolean Zen¶
When working with bool values, you should treat them like the True and False that they are instead of comparing them with == and !=. Remember that you can use not to negate a boolean value.
# Bad Example 1
if is_sunny:
    return True
else:
    return False

# Good Example 1
return is_sunny

# Bad Example 2
if is_sunny == True:
    go_hiking()

# Good Example 2
if is_sunny:
    go_hiking()

# Bad Example 3
return is_sunny == False

# Good Example 3
return not is_sunny
Loop Zen¶
Loop Bounds¶
When writing loops, choose loop bounds or loop conditions that generalize the code best. For example, in the bad example below, the statement before the for loop is unnecessary because the loop itself can handle the first element.
# Bad
nums = [1, 2, 3]
total = 0
total += nums[0]
for i in range(1, len(nums)):
    total += nums[i]

# Good (should just loop over the whole list instead)
nums = [1, 2, 3]
total = 0
for i in range(len(nums)):
    total += nums[i]
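If the index itself is not needed, another common way to cover the whole list is to loop over the values directly. This is a sketch of an alternative, not a requirement of this guide, and the variable names are illustrative.

```python
nums = [1, 2, 3]
total = 0
for num in nums:
    total += num

print(total)  # 6
```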
Only Repeated Tasks¶
If you have something that only happens once, then don’t put the code for it inside of your loop.
# Bad
# Note that the mean of the whole list remains unchanged for each loop
# iteration, so it should be outside the loop to avoid unnecessary computation
def demean(nums):
    """
    Takes in a list of numbers nums and returns a new list with the mean
    value of the original list subtracted from each corresponding value
    """
    result = []
    for i in range(len(nums)):
        mean = sum(nums) / len(nums)
        result.append(nums[i] - mean)
    return result


# Good
def demean(nums):
    result = []
    mean = sum(nums) / len(nums)
    for i in range(len(nums)):
        result.append(nums[i] - mean)
    return result
Unnecessary Cases¶
Try to avoid making something a special case if unnecessary. Before you make something a special case, think about whether a more general operation in other cases could yield the same result. If so, you should combine those cases.
# Bad
# Note the first case is unnecessary since if a == 0 and b != 0, a / b will
# still return 0.
def divide(a, b):
    """
    Takes in two numbers a, b and returns the result of a / b.
    Returns 0 if b == 0.
    """
    if a == 0:
        return 0
    elif b == 0:
        return 0
    else:
        return a / b


# Good
def divide(a, b):
    if b == 0:
        return 0
    else:
        return a / b
Unnecessary Looping¶
If there are values that you could compute within the same loop iteration, you should compute them together instead of writing an extra loop for each one.
# Bad
# Note that the total score for each sex could be computed at the same time
def get_total_for_each_sex(data):
    """
    Takes in data containing scores of students as a list of dictionaries.
    Returns a tuple containing the total score for each sex in the format
    of (male total, female total).
    """
    male_total = 0
    female_total = 0
    for row in data:
        if row['sex'] == 'M':
            male_total += row['score']
    for row in data:
        if row['sex'] == 'F':
            female_total += row['score']
    return male_total, female_total


# Good
def get_total_for_each_sex(data):
    male_total = 0
    female_total = 0
    for row in data:
        if row['sex'] == 'M':
            male_total += row['score']
        else:
            female_total += row['score']
    return male_total, female_total
Unnecessary Precomputation¶
Try to avoid precomputing values unless necessary, especially for functions taking in parameters; you should make use of the given parameter to compute the desired value.
# Bad
def get_average_score_for_sex(data, sex):
    """
    Takes in a DataFrame data containing scores for students and a sex and
    returns the average score for the given sex as a Series. Assume sex only
    takes the value 'M' (male) and 'F' (female).
    """
    male_avg = data[data['sex'] == 'M'].mean(numeric_only=True)
    female_avg = data[data['sex'] == 'F'].mean(numeric_only=True)
    if sex == 'M':
        return male_avg
    else:
        return female_avg


# Good
def get_average_score_for_sex(data, sex):
    return data[data['sex'] == sex].mean(numeric_only=True)
Miscellaneous¶
The following section contains some other miscellaneous requirements.
Global Variable Usage¶
No global variables are allowed in CSE 160. If you intend to use constants, you should name them in all uppercase letters with words separated by underscores, as in the example below.
global_variable = "cat"  # BAD!
GLOBAL_CONSTANT = "dog"  # OK sometimes (see above)


def main():
    print(global_variable)  # this is forbidden
    print(GLOBAL_CONSTANT)  # this is ok if the variable is light-weight


if __name__ == '__main__':
    main()
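A common way to avoid a global variable is to create the value inside main and pass it to the functions that need it. The sketch below shows one way to do that; the names describe_pet and pet are hypothetical.

```python
def describe_pet(pet):
    """
    Returns a sentence describing the given pet.
    """
    return "My pet is a " + pet


def main():
    pet = "cat"  # local to main, passed along as an argument
    print(describe_pet(pet))


if __name__ == '__main__':
    main()
```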
Remove Debugging Print Statements¶
You should remove all debugging print statements, rather than commenting them out, in your final take-home assessment submission.
Fix All Warnings and Errors in Code¶
Your code should generate no warnings or errors when run. You should ask for help during office hours if you encounter any error or warning that you don’t know how to resolve.
Advanced Material¶
For homework assignments in CSE 160, avoid using programming concepts not yet presented in lecture.
Acknowledgements¶
This Code Quality Guide is built upon the CSE 163 Python Style Guide and adapted for use in CSE 160 by Kenny Le.
Additional thanks to the proofreaders who helped find typos and suggest rewordings. Finally, thanks to the students reading this guide; after reading it, you’ll surely add to the stylish code that exists in the world.
If you find any typos or otherwise have suggestions or clarifications, feel free to make a post on Ed to suggest a fix (and optionally have your name listed in the acknowledgements).