So far, we have read in standard CSV (Comma Separated Values) files where every line contains identically formatted data. However, many real-world datasets include a header row: the first line of the file contains column names rather than actual data, and the remaining lines contain the data. For example, in games.csv, we had to skip the header row to read the actual NFL game data!

schedule_season,schedule_date,team_home,team_away,score_home,score_away
2018,2019-01-06,Chicago Bears,Philadelphia Eagles,15,16
2022,2023-01-08,Atlanta Falcons,Tampa Bay Buccaneers,30,17
1981,1981-10-11,San Francisco 49ers,Dallas Cowboys,45,14
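
As a reminder of what skipping the header by hand involves, here is a minimal sketch. It assumes the sample rows above are held in an in-memory string (the names SAMPLE, header, and games are illustrative, not part of the original code) so that it runs without the actual games.csv file:

```python
import io

# Inline copy of the sample rows above, standing in for games.csv (an assumption
# made so the example is self-contained).
SAMPLE = """schedule_season,schedule_date,team_home,team_away,score_home,score_away
2018,2019-01-06,Chicago Bears,Philadelphia Eagles,15,16
2022,2023-01-08,Atlanta Falcons,Tampa Bay Buccaneers,30,17
1981,1981-10-11,San Francisco 49ers,Dallas Cowboys,45,14
"""

f = io.StringIO(SAMPLE)

# Read the first line separately so it is not treated as game data.
header = f.readline().strip().split(",")

games = []
for line in f:
    # Each remaining line becomes a plain list of strings.
    games.append(line.strip().split(","))

print(header[2])    # team_home
print(games[0][2])  # Chicago Bears
```

With this approach, every value has to be looked up by position, so we need to remember that index 2 means team_home.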

csv.DictReader

To handle CSV files with headers efficiently, we can use the DictReader class from Python’s built-in csv module. csv.DictReader(f) takes a file handle and creates an object that reads the first line as the column names, then yields each remaining line as a dictionary mapping those column names to that line's values (as strings).

To use it, we must first import the csv module:

import csv

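result = []
# newline="" lets the csv module handle line endings itself
with open("data/games.csv", newline="") as f:
    reader = csv.DictReader(f)  # the first line supplies the column names
    for row in reader:
        result.append(row)  # each row is a dictionary keyed by column name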
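
To see what this loop produces, here is a small sketch that runs the same pattern on an in-memory copy of the sample rows. The SAMPLE string and the home_total variable are assumptions added for illustration; the real code reads data/games.csv instead:

```python
import csv
import io

# Inline copy of the sample rows, standing in for data/games.csv (an assumption
# made so the example is self-contained).
SAMPLE = """schedule_season,schedule_date,team_home,team_away,score_home,score_away
2018,2019-01-06,Chicago Bears,Philadelphia Eagles,15,16
2022,2023-01-08,Atlanta Falcons,Tampa Bay Buccaneers,30,17
1981,1981-10-11,San Francisco 49ers,Dallas Cowboys,45,14
"""

result = []
with io.StringIO(SAMPLE) as f:
    reader = csv.DictReader(f)
    for row in reader:
        result.append(row)

# result is a list; each element is a dictionary keyed by column name.
first = result[0]
print(first["team_home"])   # Chicago Bears
print(first["score_home"])  # 15 (still a string, not an int)

# Values are strings, so convert them before doing arithmetic.
home_total = sum(int(row["score_home"]) for row in result)
print(home_total)  # 90
```

Note that we index the list by position first and only then look up a column name in the resulting dictionary.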

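The DictReader object has a few useful attributes that you can access while processing your file, such as fieldnames (the list of column names taken from the header row) and line_num (the number of lines read from the file so far).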
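
As a minimal sketch of two such attributes, the example below inspects fieldnames and line_num on a reader built from an in-memory copy of the sample rows. The SAMPLE string and the columns and line_numbers variables are assumptions added for illustration:

```python
import csv
import io

# Inline copy of the sample rows, standing in for data/games.csv (an assumption
# made so the example is self-contained).
SAMPLE = """schedule_season,schedule_date,team_home,team_away,score_home,score_away
2018,2019-01-06,Chicago Bears,Philadelphia Eagles,15,16
2022,2023-01-08,Atlanta Falcons,Tampa Bay Buccaneers,30,17
1981,1981-10-11,San Francisco 49ers,Dallas Cowboys,45,14
"""

reader = csv.DictReader(io.StringIO(SAMPLE))

# fieldnames holds the column names; accessing it reads the header row.
columns = reader.fieldnames
print(columns[:2])  # ['schedule_season', 'schedule_date']

# line_num counts the lines read so far, so the header is line 1.
line_numbers = []
for row in reader:
    line_numbers.append(reader.line_num)
print(line_numbers)  # [2, 3, 4]
```

Since line_num includes the header line, it can be handy for reporting which line of the file a problematic row came from.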

Practice: NFL DictReader

After running the code above, which appends each row to result, which of the following lines will run without error?

result["schedule_date"]
result["team_home"][0]
result["2023-01-08"]
result[0]["2019-01-06"]
result[1]["score_home"] = 24