Strings and Lists¶
In this lesson, we'll introduce strings and lists in Python. We'll also learn the principles of documenting code. By the end of this lesson, students will be able to:
- Evaluate expressions involving strings, string slicing, and lists.
- Apply `str` operations and slicing to compute a new string representing the desired text.
- Apply `list` operations to store, retrieve, and modify values in a list.
We'll be writing doctests to verify that our programs work.
import doctest
String indexing¶
Strings are commonly used to represent text. In Python, `str` (pronounced "stir") represents a string. We'll refer to string and `str` interchangeably since we're focusing on Python programming.
In Python, `str` values are defined by surrounding text in matching quotes: either a `'` or a `"`. The characters in a string are accessible by index, starting from 0 and incrementing up to one less than the length of the string.
h | e | l | l | o |   | w | o | r | l | d |
---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
To access the character at a specific index `i` in a string `s`, use the `s[i]` notation.
"hello world"
'hello world'
s = "hello world"
s[0]
'h'
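The built-in `len` function returns the length of an object such as a string. It helps compute indices counted from the end of the string: `s[len(s) - 2]` is the second-to-last character. A negative index is shorthand for the same computation, so `s[-2]` refers to the same character.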
s = "hello world"
s[len(s) - 2]
'l'
len(s) - 2
9
s[-2]
'l'
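To make the relationship between `len` and negative indices concrete, here is a minimal sketch. The variable names `last`, `same_last`, and `second_to_last` are illustrative and not part of the lesson's code.

```python
s = "hello world"

# Valid indices run from 0 to len(s) - 1.
last = s[len(s) - 1]

# A negative index -k refers to the same character as len(s) - k.
same_last = s[-1]
second_to_last = s[-2]

print(last, same_last, second_to_last)  # d d l

# Indexing at len(s) itself is out of range and raises an IndexError.
try:
    s[len(s)]
except IndexError as error:
    print("IndexError:", error)
```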
Practice: Pairs swapped¶
Write a function `pairs_swapped` that takes a string and returns all the characters in the given string except with each pair of characters swapped. For example, calling the function on the string `"hello there"` should produce the result `"ehll ohtree"`.
- Start by writing the function definition.
- Add a brief docstring that explains the behavior.
- Add at least two doctests: one using the example above, and another that you come up with on your own.
- Write the function using a `for` loop, building up the result by adding each character one at a time.
print(s)
# range(stop)
# range(start, stop)
# range(start, stop, step)
for i in range(0, len(s), 2):
    print(i)
hello world
0
2
4
6
8
10
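The three forms of `range` noted in the comments above can be compared side by side. This is a small sketch that wraps each `range` in `list(...)` only so that its values can be printed; the name `pair_starts` is illustrative.

```python
# range(stop): counts from 0 up to, but not including, stop.
print(list(range(5)))         # [0, 1, 2, 3, 4]

# range(start, stop): counts from start up to, but not including, stop.
print(list(range(2, 5)))      # [2, 3, 4]

# range(start, stop, step): counts from start in increments of step.
print(list(range(0, 11, 2)))  # [0, 2, 4, 6, 8, 10]

# Stopping at len(s) - 1 leaves out a trailing index that has no partner,
# which matters when processing characters in pairs.
s = "hello world"
pair_starts = list(range(0, len(s) - 1, 2))
print(pair_starts)            # [0, 2, 4, 6, 8]
```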
def pairs_swapped(s):
    """
    Given a string, returns the same string's characters except
    with each adjacent pair of characters swapped.

    >>> pairs_swapped("cse163")
    'sc1e36'

    For strings with an odd number of characters, the last
    character is included at the end of the string.

    >>> pairs_swapped("hello there")
    'ehll ohtree'
    >>> pairs_swapped("i like cse163")
    ' iilekc es613'
    """
    result = ""
    # Loop over the index of the first character in each pair, so the
    # loop runs about half as many times as there are characters.
    # Stopping at len(s) - 1 keeps s[i + 1] in range on the last loop.
    for i in range(0, len(s) - 1, 2):
        # Same as: result = result + s[i + 1] + s[i]
        result += s[i + 1] + s[i]
    # Only want to add the last character if the string is odd length
    if len(s) % 2 == 1:
        result += s[-1]
    return result

doctest.run_docstring_examples(pairs_swapped, globals())
# The doctests pass since there are no messages below
globals()["s"]
# This can be used to access the variables defined outside of the function
'hello world'
def test_function():
    return s
test_function()
'hello world'
s
'hello world'
String slicing¶
String indexing gets a single character from a string. How do we get multiple characters from a string? Python has a special syntax called slicing that enables patterned access to substrings: `s[start:end]`. The slice includes the character at `start` and stops just before the character at `end`.
h | e | l | l | o |   | w | o | r | l | d |
---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
s[2]
'l'
s = "hello world"
s[2:7] # Excludes the end point
'llo w'
To slice all the way to the end of a string, simply don't specify an `end` position.
s = "hello world"
s[2:len(s)]
'llo world'
s[2:]
'llo world'
s[-5:]
'world'
Slices also allow a third parameter, step size, that works just like in `range`.
s = "hello world"
s[2:8:2]
'low'
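The slice forms covered so far can be collected in one place. The sketch below also omits the start position, which is not shown above but works the same way as omitting the end; all variable names are illustrative.

```python
s = "hello world"

first_word = s[:5]     # omitted start defaults to the beginning
second_word = s[6:]    # omitted end defaults to the end of the string
every_other = s[::2]   # whole string, taking every second character

print(first_word)      # hello
print(second_word)     # world
print(every_other)     # hlowrd

# Unlike indexing, a slice whose end goes past the string does not raise
# an error; it just stops at the end of the string.
print(s[2:100])        # llo world
```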
Looping over strings¶
There are two ways to loop over a string. One way is to loop over all the indices of a string with the help of the `range` function.
s = "hello world"
for i in range(len(s)):
    print(i, s[i])
0 h
1 e
2 l
3 l
4 o
5
6 w
7 o
8 r
9 l
10 d
Another way is to loop over the characters in a string directly. It turns out that the `for` loop in Python iterates over sequences. A `range` produces a sequence of integers. A `str` is also a sequence composed of the characters within the string.
s = "hello world"
for c in s:
    print(c)
h
e
l
l
o

w
o
r
l
d
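Both looping styles can solve the same problem. As a sketch, the two hypothetical helper functions below (not part of the lesson) count how many times a character appears in a string, once by index and once by looping over the characters directly.

```python
def count_char_by_index(s, target):
    """Counts occurrences of target in s by looping over indices."""
    count = 0
    for i in range(len(s)):
        if s[i] == target:
            count += 1
    return count


def count_char_directly(s, target):
    """Counts occurrences of target in s by looping over characters."""
    count = 0
    for c in s:
        if c == target:
            count += 1
    return count


print(count_char_by_index("hello world", "l"))  # 3
print(count_char_directly("hello world", "l"))  # 3
```

Looping over indices is the better fit when the position matters, as in `pairs_swapped`, where each step needs both `s[i]` and `s[i + 1]`. When only the characters matter, looping directly is shorter.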
String functions¶
Strings have convenient utility functions that you can call to answer questions about strings.
For example, every string has a `find` function that you can call on a string `s1` that returns the index where a given string `s2` first occurs inside `s1`.
"I really like dogs".find("ll")
5
If the string `s2` is not found in `s1`, the function returns -1.
"ll".find("I really like dogs")
-1
That said, if you only need to check whether `s2` is in `s1`, Python has a special `in` operator for answering this question.
"ll" in "I really like dogs"
True
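To compare the two approaches, the following sketch runs `find` and `in` on the same text. The variable names are illustrative.

```python
text = "I really like dogs"

# find returns the index of the first occurrence, or -1 if absent.
like_index = text.find("like")
cats_index = text.find("cats")
print(like_index)  # 9
print(cats_index)  # -1

# in answers only the yes/no question.
has_like = "like" in text
has_cats = "cats" in text
print(has_like)    # True
print(has_cats)    # False

# Comparing the find result against -1 gives the same answer as in.
print(has_like == (like_index != -1))  # True
print(has_cats == (cats_index != -1))  # True
```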
For future reference, here are some commonly used string functions. This list is useful to memorize because these functions are used very frequently, but you'll probably learn them over time just by seeing them in other people's code.
- `s.lower()` returns a new string that is the lowercase version of `s`.
- `s.upper()` returns a new string that is the uppercase version of `s`.
- `s.find(t)` returns the index of the first occurrence of `t` in `s`. If not found, returns -1.
- `s.strip()` returns a new string that has all the leading and trailing whitespace removed. (`lstrip()` and `rstrip()` remove only left whitespace or right whitespace respectively.)
- `s.split(delim)` returns a list consisting of the parts of `s` split up according to the `delim` (defaults to whitespace).
- `s.join(strings)` returns a single string consisting of the given `strings` with the string `s` inserted between each string.
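Here is a short sketch that exercises the functions listed above on one made-up input. The names `raw`, `cleaned`, and `parts` are illustrative.

```python
raw = "  I Really Like Dogs \n"

# strip removes whitespace from both ends; lstrip and rstrip from one end.
cleaned = raw.strip()
print(cleaned)            # I Really Like Dogs
print(raw.lstrip() == "I Really Like Dogs \n")  # True
print(raw.rstrip() == "  I Really Like Dogs")   # True

print(cleaned.lower())    # i really like dogs
print(cleaned.upper())    # I REALLY LIKE DOGS

# split with no argument splits on whitespace.
parts = cleaned.split()
print(parts)              # ['I', 'Really', 'Like', 'Dogs']

# join is called on the separator, not on the list.
print("-".join(parts))    # I-Really-Like-Dogs

# split with an explicit delimiter keeps empty pieces.
print("a,b,,c".split(","))  # ['a', 'b', '', 'c']
```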
Lists¶
The `s.split(delim)` function described in the list above introduced another data type called a list. Whereas a string is an indexed sequence of characters, a list is an indexed sequence that can store values of any type.
"I really like dogs".split()
['I', 'really', 'like', 'dogs']
words = ['I', 'really', 'like', 'dogs']
words[1:]
['really', 'like', 'dogs']
" ".join(words[1:])
'really like dogs'
The great thing about lists in Python is that they share a lot of the same syntax for operations as strings. Concatenation, indexing, slicing, the `len` function, and `for` looping over a list all work exactly like you learned for strings.
But there is one major difference between lists and strings.
- Lists are mutable: they allow reassignment of individual values within the list.
- Strings are immutable: the characters within a string can never change. String functions like `s.lower()` return new strings as a result.
s.upper()
'HELLO WORLD'
s
'hello world'
s = s.upper()
s
'HELLO WORLD'
words = "I really like dogs".split()
words[2] = "love"
words # Changes the values of the list even without reassigning words = ...
['I', 'really', 'love', 'dogs']
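The difference shows up most clearly when assigning to an index. A minimal sketch, with illustrative variable names:

```python
s = "hello world"

# Strings are immutable, so assigning to an index raises a TypeError.
try:
    s[0] = "H"
except TypeError as error:
    print("TypeError:", error)

# To "change" a string, build a new one from pieces of the old one.
capitalized = "H" + s[1:]
print(capitalized)  # Hello world
print(s)            # hello world

# Lists are mutable, so the same kind of assignment works.
words = ["I", "really", "like", "dogs"]
words[2] = "love"
print(words)        # ['I', 'really', 'love', 'dogs']
```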
Practice: Count votes¶
Write a function `count_votes` that takes a list of numbers indicating votes for candidates 0, 1, or 2 and returns a new list of length 3 showing how many votes each candidate got. See the doctest below for one example.
def count_votes(votes):
    """
    TODO: Your docstring here

    >>> count_votes([1, 0, 1, 1, 2, 0])
    [2, 3, 1]
    """
    ...

doctest.run_docstring_examples(count_votes, globals())
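Try the exercise on your own first. For comparison, here is one possible approach, offered as a sketch rather than the official solution. It assumes every vote is 0, 1, or 2, and the second doctest (the empty list) is an added example, not one from the lesson.

```python
import doctest


def count_votes(votes):
    """
    Given a list of votes for candidates 0, 1, or 2, returns a new list
    of length 3 where index i holds the number of votes for candidate i.

    >>> count_votes([1, 0, 1, 1, 2, 0])
    [2, 3, 1]
    >>> count_votes([])
    [0, 0, 0]
    """
    # One counter per candidate; the candidate number doubles as the index.
    counts = [0, 0, 0]
    for vote in votes:
        counts[vote] += 1
    return counts


doctest.run_docstring_examples(count_votes, globals())
```

This relies on lists being mutable: each vote updates one entry of `counts` in place.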
List functions¶
There are also many `list` functions. Lists are mutable, so all these operations modify the given list in place instead of returning a new list.
- `l.append(x)` adds `x` to the end of `l`.
- `l.extend(xs)` adds all elements in `xs` to the end of `l`.
- `l.insert(i, x)` inserts `x` at index `i` in `l`.
- `l.remove(x)` removes the first `x` found in `l`.
- `l.pop(i)` removes the element at index `i` in `l`.
- `l.clear()` removes all values from `l`.
- `l.reverse()` reverses the order of all elements in `l`.
- `l.sort()` rearranges all elements of `l` into sorted order.
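The sketch below applies each function in turn to a made-up list, with the contents after each step shown in a comment. The names `numbers`, `popped`, `result`, and `scratch` are illustrative.

```python
numbers = [3, 1, 2]

numbers.append(4)         # [3, 1, 2, 4]
numbers.extend([1, 5])    # [3, 1, 2, 4, 1, 5]
numbers.insert(0, 9)      # [9, 3, 1, 2, 4, 1, 5]
numbers.remove(1)         # [9, 3, 2, 4, 1, 5] (only the first 1 is removed)
popped = numbers.pop(0)   # [3, 2, 4, 1, 5]; pop also returns the removed 9
numbers.reverse()         # [5, 1, 4, 2, 3]
numbers.sort()            # [1, 2, 3, 4, 5]
print(popped)             # 9
print(numbers)            # [1, 2, 3, 4, 5]

# Because sort changes the list in place, it returns None.
# Writing numbers = numbers.sort() would lose the list.
result = numbers.sort()
print(result)             # None

scratch = [1, 2]
scratch.clear()
print(scratch)            # []
```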
words = "I really like dogs".split()
"dogs" in words