Data Structures and Files¶
In this lesson, we'll practice our Python programming skills: loops, strings, lists, and dictionaries.
import doctest
Group Activity: Count divisible digits¶
This function, count_divisible_digits, takes two integers n and m (either may be negative) and returns the number of digits in n that are divisible by m. For this problem, any digit in n that is 0 is divisible by any number. If m is 0, return 0.
There are 2 bugs in the given starter code. Identify and resolve them.
def count_divisible_digits(n, m):
    """
    Returns the number of digits in n that are divisible by m. If m is 0, then return 0. Any digit
    in n that is 0 counts as divisible by all numbers.
    >>> count_divisible_digits(650899, 3)
    4
    >>> count_divisible_digits(-204, 5)
    1
    >>> count_divisible_digits(10, 0)
    0
    """
    # Guard against dividing by zero
    if m == 0:
        return 0
    elif n == 0:
        # The single digit 0 counts as divisible
        return 1
    else:
        # Drop the sign so the loop below sees the digits of a negative n
        n = abs(n)
        count = 0
        while n > 0:
            digit = n % 10
            if digit % m == 0:
                count += 1
            n //= 10
        return count
doctest.run_docstring_examples(count_divisible_digits, globals())
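As a cross-check on the arithmetic approach, the same count can be sketched by walking the characters of the number's string form. The name count_divisible_digits_str is hypothetical and not part of the lesson. This is one possible variant under the rules stated above, not the intended solution.

```python
def count_divisible_digits_str(n, m):
    """Hypothetical string-based variant of count_divisible_digits."""
    # By the problem statement, m == 0 always yields 0.
    if m == 0:
        return 0
    count = 0
    # str(abs(n)) drops any minus sign, leaving only digit characters.
    for ch in str(abs(n)):
        if int(ch) % m == 0:
            count += 1
    return count


print(count_divisible_digits_str(650899, 3))  # 4
print(count_divisible_digits_str(-204, 5))    # 1
print(count_divisible_digits_str(0, 7))       # 1
```

Because str(0) is "0", this variant needs no special case for n equal to 0.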
Group Activity: Words by letter¶
Write a function words_by_letter that takes a string path and returns a dictionary associating each letter with the number of words that begin with that letter. Normalize the first letter of each word to lowercase. If the file is empty, return an empty dictionary.
def words_by_letter(path):
    """
    Returns a dictionary containing letter-count pairs, where each count represents the number
    of words starting with a given letter in the specified file.
    >>> words_by_letter("simple.txt")
    {'t': 3, 's': 2, 'i': 1}
    >>> words_by_letter("twister.txt")
    {'p': 24, 'a': 3, 'o': 4, 'i': 1, 'w': 1, 't': 1}
    >>> words_by_letter("empty.txt")
    {}
    """
    ...
doctest.run_docstring_examples(words_by_letter, globals())
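The counting pattern this activity asks for can be sketched on a throwaway file, since the lesson's simple.txt and twister.txt are not reproduced here. The function name count_first_letters and the demo text are assumptions made for illustration. Treat this as one possible approach rather than the reference solution.

```python
import os
import tempfile


def count_first_letters(path):
    """Hypothetical helper: count words by lowercase first letter."""
    counts = {}
    with open(path) as f:
        for line in f:
            # split() with no arguments skips blank lines and repeated spaces.
            for word in line.split():
                letter = word[0].lower()
                # get() supplies 0 the first time a letter is seen.
                counts[letter] = counts.get(letter, 0) + 1
    return counts


# Demo on a temporary file with made-up contents.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.txt")
    with open(demo_path, "w") as f:
        f.write("The quick brown fox\njumps over the lazy dog\n")
    demo_counts = count_first_letters(demo_path)

print(demo_counts)
# {'t': 2, 'q': 1, 'b': 1, 'f': 1, 'j': 1, 'o': 1, 'l': 1, 'd': 1}
```

An empty file yields no lines, so the loop body never runs and the function returns an empty dictionary.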
Practice: DNA match score¶
Write a function dna_match_score that takes two strings of the same length that represent DNA sequences and returns their alignment score. DNA sequences are strings with only the characters "A", "C", "G", "T", or "-" (to represent a gap). When aligning the two DNA sequences, there will never be a gap in both strings at the same index.
To compute the alignment score, compare the characters that appear at the same index in both strings:
- If both characters match and are one of "A", "C", "G", "T", the score is +2.
- If both characters are one of "A", "C", "G", "T" but they don't match, the score is -1.
- If one character is one of "A", "C", "G", "T" and the other is a gap "-", the score is -2.
For example, dna_match_score("-ATGC", "CATGT") returns 3, following the process in the table below for each index 0 through 4 in the DNA sequences.
i | seq1 | seq2 | score |
---|---|---|---|
0 | - | C | -2 |
1 | A | A | +2 |
2 | T | T | +2 |
3 | G | G | +2 |
4 | C | T | -1 |
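The per-index scores in the table can be reproduced with a small helper that scores one pair of characters. The name score_pair is hypothetical. The sketch assumes both inputs follow the rules above: equal length, and never a gap in both strings at the same index.

```python
def score_pair(a, b):
    """Hypothetical helper: score the characters at one index."""
    # A gap on either side costs 2 points.
    if a == "-" or b == "-":
        return -2
    # Two identical bases earn 2 points.
    if a == b:
        return 2
    # Two different bases cost 1 point.
    return -1


seq1, seq2 = "-ATGC", "CATGT"
# zip pairs up the characters that share an index.
scores = [score_pair(a, b) for a, b in zip(seq1, seq2)]
print(scores)       # [-2, 2, 2, 2, -1]
print(sum(scores))  # 3
```

Checking for a gap first keeps the equality test limited to the case where both characters are bases.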
def dna_match_score(seq1, seq2):
    """
    Returns the alignment score of two DNA sequences of equal length: the sum over all indices of
    +2 for matching characters, -1 for non-matching characters, and -2 for a gap.
    >>> dna_match_score("-ATGC", "CATGT")
    3
    >>> dna_match_score("ATGC", "ATGC")
    8
    >>> dna_match_score("-AT", "C-T")
    -2
    """
    ...
doctest.run_docstring_examples(dna_match_score, globals())
Testing¶
Run all the tests and ensure your code is working by running the following code block.
doctest.testmod()