Strings and Lists¶

In this lesson, we'll introduce strings and lists in Python. We'll also learn the principles of documenting code. By the end of this lesson, students will be able to:

  • Evaluate expressions involving strings, string slicing, and lists.
  • Apply str operations and slicing to compute a new string representing the desired text.
  • Apply list operations to store, retrieve, and modify values in a list.

We'll be writing doctests to verify that our programs work.

In [1]:
import doctest
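
As a minimal sketch of what a doctest looks like, the hypothetical function add_one below (it is not one of this lesson's exercises) embeds an example call and its expected result inside the docstring. doctest.run_docstring_examples reruns each example and prints a report only when an actual result differs from the expected one, so no output means the examples passed.

```python
import doctest


def add_one(n):
    """
    Returns n plus one. (Hypothetical helper used only to illustrate doctests.)

    >>> add_one(2)
    3
    """
    return n + 1


# Prints nothing when every example in the docstring passes.
doctest.run_docstring_examples(add_one, globals())
```

Lines starting with >>> are the calls to run, and the line right after each call is the expected result exactly as Python would display it.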

String indexing¶

Strings are commonly used to represent text. In Python, str (pronounced "stir") represents a string. We'll refer to string and str interchangeably since we're focusing on Python programming.

In Python, str values are defined by surrounding text in matching quotes: either a ' or a ". The characters in a string are accessible by index, starting from 0 and ending at one less than the length of the string.

h   e   l   l   o   ' ' w   o   r   l   d
0   1   2   3   4   5   6   7   8   9   10

To get the character at a specific index i of a string s, use the s[i] notation.

In [2]:
s = "hello world"
s[0]
Out[2]:
'h'

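The built-in len function returns the length of an object such as a string. It helps compute the index of a character counted from the end of the string. Python also accepts negative indices as a shorthand: s[-2] refers to the same character as s[len(s) - 2].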

In [3]:
len(s)
Out[3]:
11
In [4]:
s[len(s) - 2]
Out[4]:
'l'
In [5]:
s[-2]
Out[5]:
'l'
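
The relationship between len and negative indices can be sketched as follows. This is an illustrative example reusing the string from above; the variable names last and same_last are made up for the demonstration.

```python
s = "hello world"

# A negative index i refers to the same character as index len(s) + i.
last = s[-1]
same_last = s[len(s) - 1]
print(last, same_last)  # d d

# Valid indices run from -len(s) through len(s) - 1.
# Anything outside that range raises an IndexError.
try:
    s[len(s)]
except IndexError as error:
    print("IndexError:", error)
```

Because indexing starts at 0, s[len(s)] is always one position past the last character.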

Practice: Pairs swapped¶

Write a function pairs_swapped that takes a string and returns all the characters in the given string except with each pair of characters swapped. For example, calling the function on a string "hello there" should produce the result "ehll ohtree".

  1. Start by writing the function definition.
  2. Add a brief docstring that explains the behavior.
  3. Add at least two doctests: one using the example above, and another that you came up with on your own.
  4. Write the function body using a for loop, building up the result by adding each character one at a time.
In [8]:
def pairs_swapped(s):
    """
    Takes in a string s and returns a new string with each pair of characters in s swapped.

    >>> pairs_swapped("cse163")
    'sc1e36'

    If the input string is of odd length, the last character of the input string will be kept
    as is in the returned string.

    >>> pairs_swapped("hello there")
    'ehll ohtree'
    """
    result_s = ""
    for i in range(0, len(s) - 1, 2):
        result_s += s[i+1]
        result_s += s[i]
    if len(s) % 2 == 1: # this string is odd length
        result_s += s[-1]
    return result_s

# doctest.testmod()
doctest.run_docstring_examples(pairs_swapped, globals())

String slicing¶

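String indexing gets a single character from a string. How do we get multiple characters from a string? Python has a special syntax called slicing that enables patterned access to substrings: s[start:end]. The slice begins at index start and stops just before index end, so the character at index end is not included.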

h   e   l   l   o   ' ' w   o   r   l   d
0   1   2   3   4   5   6   7   8   9   10
In [9]:
s = "hello world"
s[2:7]
Out[9]:
'llo w'

To slice all the way to the end of a string, simply don't specify an end position.

In [12]:
s = "hello world"
s[2:]
Out[12]:
'llo world'

Slices also allow a third parameter, step size, that works just like in range.

In [13]:
s = "hello world"
s[2:8:2]
Out[13]:
'low'
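
A few more slice forms are sketched below as a hedged illustration. Beyond omitting the end position, Python also lets you omit the start position and use negative positions or a negative step. The variable names are chosen only for this example.

```python
s = "hello world"

first_word = s[:5]     # an omitted start defaults to the beginning of the string
last_word = s[-5:]     # negative positions count from the end of the string
every_other = s[::2]   # omitting both start and end covers the whole string
backwards = s[::-1]    # a negative step walks the string in reverse

print(first_word)   # hello
print(last_word)    # world
print(every_other)  # hlowrd
print(backwards)    # dlrow olleh
```

In every case the original string s is left untouched; each slice produces a new string.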

Looping over strings¶

There are two ways to loop over a string. One way is to loop over all the indices of a string with the help of the range function.

In [14]:
s = "hello world"
for i in range(len(s)):
    print(s[i])
h
e
l
l
o
 
w
o
r
l
d

Another way is to loop over the characters in a string directly. It turns out that the for loop in Python iterates over sequences. A range produces a sequence of integers. A str is also a sequence composed of the characters within the string.

In [15]:
s = "hello world"
for c in s:
    print(c)
h
e
l
l
o
 
w
o
r
l
d
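
To compare the two looping styles side by side, here is a small sketch that counts how often a letter appears in a string. Both function names are hypothetical and exist only for this comparison.

```python
def count_letter_by_index(s, letter):
    """Counts occurrences of letter in s by looping over indices."""
    count = 0
    for i in range(len(s)):
        if s[i] == letter:
            count += 1
    return count


def count_letter_directly(s, letter):
    """Counts occurrences of letter in s by looping over characters."""
    count = 0
    for c in s:
        if c == letter:
            count += 1
    return count


print(count_letter_by_index("hello world", "l"))  # 3
print(count_letter_directly("hello world", "l"))  # 3
```

Looping over indices is the better fit when the position matters, as in pairs_swapped, which needs both s[i] and s[i+1]. Looping over characters directly is shorter when only the characters themselves are needed.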

String functions¶

Strings have convenient utility functions that you can call to answer questions about strings.

For example, every string has a find function: calling s1.find(s2) returns the index at which a given string s2 first appears inside s1.

In [ ]:
s1.find(s2)
In [16]:
"I really like dogs".find("ll")
Out[16]:
5

If the string s2 is not found in s1, the function returns -1.

In [17]:
"ll".find("I really like dogs")
Out[17]:
-1

That said, if you only need to check whether s2 is in s1, Python has a special in operator for answering this question.

In [18]:
"ll" in "I really like dogs"
Out[18]:
True
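
The sketch below illustrates when each tool fits: find when the position is needed, and in when only a yes-or-no answer is needed. The helper describe is hypothetical and written only for this example.

```python
sentence = "I really like dogs"


def describe(text, target):
    """Hypothetical helper: reports where target first appears in text."""
    index = text.find(target)
    if index == -1:  # find signals "not found" with -1
        return target + " not found"
    return target + " found at index " + str(index)


print(describe(sentence, "like"))  # like found at index 9
print(describe(sentence, "cats"))  # cats not found

# Pitfall: a match at the very start has index 0, which counts as false in an
# if statement, while the "not found" value -1 counts as true.
print(sentence.find("I"))  # 0
print("I" in sentence)     # True
```

For this reason, compare the result of find against -1 explicitly, or use in when the index itself is not needed.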

For future reference, here are some commonly-used string functions. This list is useful to memorize because these functions are used very frequently, but you'll probably learn them over time just by seeing them in other people's code.

  • s.lower() returns a new string that is the lowercase version of s
  • s.upper() returns a new string that is the uppercase version of s
  • s.find(t) returns the index of the first occurrence of t in s. If not found, returns -1.
  • s.strip() returns a new string that has all the leading and trailing whitespace removed.
    • s.lstrip() and s.rstrip() remove only the leading or only the trailing whitespace, respectively.
  • s.split(delim) returns a list consisting of the parts of s split up according to the delim (defaults to whitespace).
  • s.join(strings) returns a single string consisting of the given strings with the string s inserted between each adjacent pair.
In [19]:
sentence = "I really like dogs"
In [20]:
sentence.lower()
Out[20]:
'i really like dogs'
In [21]:
sentence.upper()
Out[21]:
'I REALLY LIKE DOGS'
In [22]:
sentence.strip()
Out[22]:
'I really like dogs'
In [24]:
"    print(c)\n\t".strip()
Out[24]:
'print(c)'
In [26]:
words = sentence.split()
In [29]:
" ".join(words)
Out[29]:
'I really like dogs'
In [34]:
sentence
Out[34]:
'I really like dogs'
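
These functions are often chained together. The following sketch cleans up a comma-separated line of text; the sample string raw and the variable names are made up for illustration.

```python
raw = "  Apples, Bananas ,cherries  "

# Split on commas, then tidy each piece individually.
parts = raw.split(",")
cleaned = []
for part in parts:
    cleaned.append(part.strip().lower())
print(cleaned)  # ['apples', 'bananas', 'cherries']

# Join the tidy pieces back together with a different separator.
result = "; ".join(cleaned)
print(result)  # apples; bananas; cherries

# None of these calls changed the original string.
print(raw == "  Apples, Bananas ,cherries  ")  # True
```

Note that part.strip().lower() works because strip returns a new string, and lower is then called on that new string.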

Lists¶

The s.split(delim) function described in the list above introduced another data type called a list. Whereas a string is an indexed sequence of characters, a list is an indexed sequence that can store values of any type.

In [33]:
"I really like dogs".split()
Out[33]:
['I', 'really', 'like', 'dogs']

The great thing about lists in Python is that they share a lot of the same syntax for operations as strings. Concatenation, indexing, slicing, the len function, and for looping over a list all work exactly like you learned for strings.

But there is one major difference between lists and strings.

  • Lists are mutable: they allow reassignment of individual values within the list.
  • Strings are immutable: the characters within a string can never change. String functions like s.lower() return new strings as a result.
In [35]:
sentence = "I really like dogs"
words = sentence.split()
words[2] = "love"
words
Out[35]:
['I', 'really', 'love', 'dogs']
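
By contrast, attempting the same kind of reassignment on a string fails. The sketch below shows the error and two common workarounds, assuming a sample string word chosen for illustration.

```python
word = "dogs"

# Strings do not allow reassignment of individual characters.
try:
    word[0] = "h"
except TypeError as error:
    print("TypeError:", error)

# Workaround 1: build a new string from the pieces you want.
new_word = "h" + word[1:]
print(new_word)  # hogs
print(word)      # dogs

# Workaround 2: convert to a list of characters, modify it, and join it back.
letters = list(word)
letters[0] = "h"
print(letters)           # ['h', 'o', 'g', 's']
print("".join(letters))  # hogs
```

Either way, the original string stays the same and a new string is produced.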

Practice: Count votes¶

Write a function count_votes that takes a list of numbers indicating votes for candidates 0, 1, or 2 and returns a new list of length 3 showing how many votes each candidate got. See the doctests below for examples.

In [53]:
def count_votes(votes):
    """
    Takes in a list of votes for candidates 0, 1, or 2. Returns a new list of
    length 3 that shows how many votes each candidate got.

    >>> count_votes([1, 0, 1, 1, 2, 0])
    [2, 3, 1]

    >>> count_votes([])
    [0, 0, 0]

    >>> count_votes([1, 1, 2, 2])
    [0, 2, 2]

    We do not handle invalid votes that are not 0, 1, or 2.
    """
    counts = [0, 0, 0]
    for cand in votes:
        counts[cand] += 1
    return counts

doctest.run_docstring_examples(count_votes, globals())

List functions¶

There are also many list functions. Lists are mutable, so all these operations modify the given list in place rather than returning a new list.

  • l.append(x) adds x to the end of l.
  • l.extend(xs) adds all elements in xs to the end of l.
  • l.insert(i, x) inserts x at index i in l.
  • l.remove(x) removes the first x found in l.
  • l.pop(i) removes and returns the element at index i in l. If i is omitted, it removes and returns the last element.
  • l.clear() removes all values from l.
  • l.reverse() reverses the order of all elements in l.
  • l.sort() rearranges all elements of l into sorted order.
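
One consequence of modifying the list in place is that most of these functions return None, so assigning their result to a variable is a common mistake. The sketch below illustrates this with made-up lists; it also uses the built-in sorted function, which is not covered above but returns a new sorted list and leaves the original alone.

```python
numbers = [3, 1, 2]

# sort rearranges numbers itself and returns None.
result = numbers.sort()
print(result)   # None
print(numbers)  # [1, 2, 3]

# The built-in sorted function returns a new list instead.
original = [3, 1, 2]
ordered = sorted(original)
print(ordered)   # [1, 2, 3]
print(original)  # [3, 1, 2]

# pop is the exception: it modifies the list and also returns the removed element.
last = ordered.pop()
print(last)     # 3
print(ordered)  # [1, 2]
```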
In [71]:
words = "I really like dogs".split()
catwords = "meow meow meow".split()
In [72]:
catwords
Out[72]:
['meow', 'meow', 'meow']
In [73]:
words
Out[73]:
['I', 'really', 'like', 'dogs']
In [74]:
words.append(catwords)
In [75]:
words
Out[75]:
['I', 'really', 'like', 'dogs', ['meow', 'meow', 'meow']]
In [76]:
catwords.insert(1, "purr")
In [77]:
catwords
Out[77]:
['meow', 'purr', 'meow', 'meow']
In [78]:
words
Out[78]:
['I', 'really', 'like', 'dogs', ['meow', 'purr', 'meow', 'meow']]
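
The cells above show two behaviors worth spelling out: append adds its argument as a single element, even when that argument is itself a list, and the nested element is the very same list object as catwords, so later changes to catwords are visible through it. The sketch below contrasts this with extend, using shortened sample lists and the hypothetical names nested and flat.

```python
words = ["I", "like", "dogs"]
catwords = ["meow", "meow"]

nested = list(words)       # copy of words, so words itself stays unchanged
nested.append(catwords)    # adds the whole catwords list as ONE element
print(len(nested))         # 4

flat = list(words)
flat.extend(catwords)      # adds each element of catwords separately
print(len(flat))           # 5

# nested holds a reference to catwords, so this change shows up inside nested.
# flat received the individual strings, so it is unaffected.
catwords.insert(1, "purr")
print(nested[3])  # ['meow', 'purr', 'meow']
print(flat)       # ['I', 'like', 'dogs', 'meow', 'meow']
```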

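Just like strings, lists also support the in operator. The examples below start again from fresh words and catwords lists.

In [57]:
words = "I really like dogs".split()
catwords = "meow meow meow".split()
print("dogs" in words)
print("dogs" in catwords)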
True
False
In [58]:
words.append("!")
words
Out[58]:
['I', 'really', 'like', 'dogs', '!']
In [59]:
words.extend(catwords)
words
Out[59]:
['I', 'really', 'like', 'dogs', '!', 'meow', 'meow', 'meow']
In [60]:
catwords
Out[60]:
['meow', 'meow', 'meow']
In [61]:
catwords.insert(1, "purr")
catwords
Out[61]:
['meow', 'purr', 'meow', 'meow']
In [63]:
words
Out[63]:
['I', 'really', 'like', 'dogs', '!', 'meow', 'meow', 'meow']
In [64]:
catwords.remove("purr")
catwords
Out[64]:
['meow', 'meow', 'meow']
In [65]:
catwords.pop()
Out[65]:
'meow'
In [66]:
catwords
Out[66]:
['meow', 'meow']
In [67]:
words.reverse()
words
Out[67]:
['meow', 'meow', 'meow', '!', 'dogs', 'like', 'really', 'I']
In [69]:
# words.sort() # by default, sorts in ascending order
words.sort(reverse=True)
words
Out[69]:
['really', 'meow', 'meow', 'meow', 'like', 'dogs', 'I', '!']
In [70]:
words.clear()
words
Out[70]:
[]