Strings and Lists¶
In this lesson, we'll introduce strings and lists in Python. We'll also learn the principles of documenting code. By the end of this lesson, students will be able to:
- Evaluate expressions involving strings, string slicing, and lists.
- Apply str operations and slicing to compute a new string representing the desired text.
- Apply list operations to store, retrieve, and modify values in a list.
We'll be writing doctests to verify that our programs work.
import doctest
String indexing¶
Strings are commonly used to represent text. In Python, str (pronounced "stir") represents a string. We'll refer to string and str interchangeably since we're focusing on Python programming.
In Python, str values are defined by surrounding text in matching quotes: either a ' or a ". The characters in a string are accessible by index, starting from 0 and incrementing up to one less than the length of the string.
h | e | l | l | o |   | w | o | r | l | d |
---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
To access the character at a specific index i in a string s, use the s[i] notation.
s = "hello world"
s[0]
'h'
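The built-in len function returns the length of an object such as a string. It helps compute the index of a character counted from the end of the string. Negative indices are a shorthand for the same computation: s[-2] is the same character as s[len(s) - 2].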
len(s)
11
s = "hello world"
s[len(s) - 2]
'l'
s[-2]
'l'
l = len(s)
i = l - 2
s[i]
'l'
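As a quick check of the relationship between negative indices and len, the sketch below compares the two for every offset from the end of the string. The variable names (k, last, second_to_last) are illustrative and not part of the lesson's code.

```python
s = "hello world"

# For every offset k from the end, s[-k] is the same character as s[len(s) - k].
for k in range(1, len(s) + 1):
    assert s[-k] == s[len(s) - k]

last = s[-1]            # same as s[len(s) - 1]
second_to_last = s[-2]  # same as s[len(s) - 2]
print(last, second_to_last)
```

Note that s[len(s)] is not a valid index: the largest valid index is len(s) - 1.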
Practice: Pairs swapped¶
Write a function pairs_swapped that takes a string and returns all the characters in the given string except with each pair of characters swapped. For example, calling the function on the string "hello there" should produce the result "ehll ohtree".
- Start by writing the function definition.
- Add a brief docstring that explains the behavior.
- Add at least two doctests: one using the example above, and another that you came up with on your own.
- Write the function using a for loop, building up the result by adding each character one at a time.
def pairs_swapped(s):
    # How effectively does the code communicate external behaviors to end-users?
    # This includes writing docstrings in your own words with only the necessary
    # information that clients need to run a function.
    """
    Returns a new string with each pair of characters from the given string swapped.

    >>> pairs_swapped("poop")
    'oppo'

    If the given string contains an odd number of characters, the very last character
    is included in the result.

    >>> pairs_swapped("hello there")
    'ehll ohtree'
    """
    result = ""
    # "poop" i = 0, 2, but not 4
    # "hello there" i = 0, 2, 4, 6, 8, but not 10
    for i in range(0, len(s) - 1, 2):
        # Get the two characters and "swap" them by adding them in the reverse order
        first = s[i]
        # Stopping the range at len(s) - 1 keeps i + 1 a valid index; without it,
        # "hello there" would try to read s[11], which is out of range
        second = s[i + 1]
        result = result + second + first
    if len(s) % 2 == 1:  # If it's an odd length...
        result += s[-1]
    return result
doctest.run_docstring_examples(pairs_swapped, globals())
pairs_swapped("hello 'there")
"ehll ot'eher"
String slicing¶
String indexing gets a single character from a string. How do we get multiple characters from a string? Python has a special syntax called slicing that enables patterned access to substrings: s[start:end]. The slice begins at index start and stops just before index end.
h | e | l | l | o |   | w | o | r | l | d |
---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
s = "hello world"
s[2:7] # Slicing doesn't include end point
'llo w'
s[2:len(s)]
'llo world'
To slice all the way to the end of a string, simply don't specify an end position.
s = "hello world"
s[2:]
'llo world'
Slices also allow a third parameter, step size, that works just like in range.
s = "hello world"
s[2:8:2]
'low'
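To make the comparison with range concrete, here is a small sketch that rebuilds the stepped slice one character at a time. It also shows that the start position can be omitted in the same way as the end position. The variable names are illustrative only.

```python
s = "hello world"

first_word = s[:5]    # an omitted start defaults to the beginning of the string
every_other = s[::2]  # omitting both start and end covers the whole string
stepped = s[2:8:2]    # indices 2, 4, 6

# Rebuild the stepped slice with range to show that the two agree.
rebuilt = ""
for i in range(2, 8, 2):
    rebuilt += s[i]

print(first_word, every_other, stepped, rebuilt)
```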
Looping over strings¶
There are two ways to loop over a string. One way is to loop over all the indices of a string with the help of the range function.
s = "hello world"
for i in range(len(s)):
print(i, s[i])
0 h
1 e
2 l
3 l
4 o
5  
6 w
7 o
8 r
9 l
10 d
Another way is to loop over the characters in a string directly. It turns out that the for loop in Python iterates over sequences. A range produces a sequence of integers. A str is also a sequence composed of the characters within the string.
s = "hello world"
for c in s:
print(c)
h
e
l
l
o
 
w
o
r
l
d
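The two looping styles are interchangeable whenever only the characters matter. As a rough illustration, the sketch below counts how many times a character appears using each style. The function names count_by_index and count_directly are hypothetical and exist only for this example.

```python
def count_by_index(s, target):
    """Counts occurrences of target in s by looping over indices."""
    count = 0
    for i in range(len(s)):
        if s[i] == target:
            count += 1
    return count


def count_directly(s, target):
    """Counts occurrences of target in s by looping over characters."""
    count = 0
    for c in s:
        if c == target:
            count += 1
    return count


print(count_by_index("hello world", "l"), count_directly("hello world", "l"))
```

Looping over indices is the better fit when the position is needed, as in pairs_swapped, while looping over characters is shorter when it is not.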
String functions¶
Earlier, we saw how the Python doctest module has a testmod() function that we could call to run all the doctests that we wrote for our current module. Likewise, strings also have functions that you can call to answer questions about strings.
For example, every string has a find function that you can call on a string s1 that returns the index of the first occurrence of a given string s2 inside s1.
"I really like dogs".find("ll")
5
If the string s2 is not found in s1, the function returns -1.
"".find("I really like dogs")
-1
That said, if you only need to check whether s2 is in s1, Python has a special in operator for answering this question.
"ll" in "I really like dogs"
True
For future reference, here are some commonly used string functions. This list is useful to memorize because these functions are used very frequently, but you'll probably learn them over time just by seeing them in other people's code.
- s.lower() returns a new string that is the lowercase version of s.
- s.upper() returns a new string that is the uppercase version of s.
- s.find(t) returns the index of the first occurrence of t in s. If not found, returns -1.
- s.strip() returns a new string that has all the leading and trailing whitespace removed. (lstrip() and rstrip() remove only left whitespace or right whitespace respectively.)
- s.split(delim) returns a list consisting of the parts of s split up according to the delim (defaults to whitespace).
- s.join(strings) returns a single string consisting of the given strings with the string s inserted between each string.
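The following sketch exercises each of the string functions listed above on one example value. The sample text and variable names are chosen for illustration only.

```python
s = "  I really like DOGS  "

trimmed = s.strip()             # 'I really like DOGS'
lowered = trimmed.lower()       # 'i really like dogs'
shouted = trimmed.upper()       # 'I REALLY LIKE DOGS'
index = lowered.find("like")    # 9
missing = lowered.find("cats")  # -1 because "cats" does not appear
words = lowered.split()         # ['i', 'really', 'like', 'dogs']
joined = "-".join(words)        # 'i-really-like-dogs'

# None of these calls changed the original string.
print(repr(s))
```

Note the direction of join: it is called on the separator string, and the list of strings to combine is passed as the argument.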
Lists¶
The s.split(delim) function described in the list above introduced another data type called a list. Whereas a string is an indexed sequence of characters, a list is an indexed sequence that can store values of any type.
l = "I really like dogs".split()
l
['I', 'really', 'like', 'dogs']
len(l)
4
The great thing about lists in Python is that they share a lot of the same syntax for operations as strings. Concatenation, indexing, slicing, the len function, and for looping over a list all work exactly like you learned for strings.
But, there is one major difference between lists and strings.
- Lists are mutable: they allow reassignment of individual values within the list.
- Strings are immutable: the characters within a string can never change. String functions like s.lower() return new strings as a result.
words = "I really like dogs".split()
words[2] = "love"
words
['I', 'really', 'love', 'dogs']
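To see the other half of the contrast, the sketch below attempts the same kind of reassignment on a string. Python raises a TypeError, so the usual workaround is to build a new string from slices. The variable names are illustrative only.

```python
s = "hello world"

raised = False
try:
    s[0] = "J"  # strings do not support item assignment
except TypeError:
    raised = True

# Build a new string instead of modifying the existing one.
t = "J" + s[1:]

print(raised, s, t)
```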
Practice: Count votes¶
Write a function count_votes that takes a list of numbers indicating votes for candidates 0, 1, or 2 and returns a new list of length 3 showing how many votes each candidate got. See the doctest below for one example.
def count_votes(votes):
    """
    Returns a 3-element list counting the number of votes for each candidate in the given list.

    >>> count_votes([1, 0, 1, 1, 2, 0])
    [2, 3, 1]
    """
    counts = [0] * 3
    for vote in votes:
        counts[vote] += 1
    return counts
doctest.run_docstring_examples(count_votes, globals())
List functions¶
Earlier, we learned about string functions. There are also many list functions. Lists are mutable, so all these operations modify the given list in place rather than returning a new list.
- l.append(x) adds x to the end of l.
- l.extend(xs) adds all elements in xs to the end of l.
- l.insert(i, x) inserts x at index i in l.
- l.remove(x) removes the first x found in l.
- l.pop(i) removes the element at index i in l.
- l.clear() removes all values from l.
- l.reverse() reverses the order of all elements in l.
- l.sort() rearranges all elements of l into sorted order.
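The list functions above can be traced on a small example. In this sketch, the comment after each call shows the state of the list once that call finishes. The list contents and variable names are illustrative only.

```python
nums = [3, 1, 2]

nums.append(4)         # [3, 1, 2, 4]
nums.extend([1, 5])    # [3, 1, 2, 4, 1, 5]
nums.insert(0, 9)      # [9, 3, 1, 2, 4, 1, 5]
nums.remove(1)         # [9, 3, 2, 4, 1, 5] (only the first 1 is removed)
popped = nums.pop(0)   # popped is 9, nums is [3, 2, 4, 1, 5]
nums.reverse()         # [5, 1, 4, 2, 3]
result = nums.sort()   # nums is [1, 2, 3, 4, 5], result is None

sorted_copy = list(nums)  # keep a copy before emptying the list
nums.clear()              # []

print(popped, result, sorted_copy, nums)
```

Because these functions change the list in place, most of them return None. Writing nums = nums.sort() is a common mistake that replaces the list with None. The pop function is an exception: it returns the element that it removed.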
Just as strings support the in operator, lists support the in operator too.
words = "I really like dogs".split()
"dogs" in words
True