Task: Find 2 datasets that you would be interested in exploring, and submit a brief writeup about each.

Overview

Let’s start thinking about the final project! As part of your preparation, we would like you to start searching for data and thinking about the kinds of research questions you’d like to answer.

Sources of Data

One good approach is to start with a problem that interests you, and then look for data. Google will be your friend for finding a dataset. Alternatively, it is equally valid to start with a dataset that looks interesting and design questions around that data. Here are some possible data sources, but many more exist:

Requirements

For each dataset that you find, you will submit a short answer (at least 3-4 sentences) to each of the following prompts:

  1. Briefly describe your dataset. What is it about? Who collected it? Where did you find it? Provide a link or attachment.

  2. What makes this dataset interesting to you?

  3. Write and describe at least two potential research questions you might want to explore using this dataset. It’s OK if you don’t know how to answer these questions yet!

  4. Describe at least two potential limitations or biases in this dataset.

Make sure to answer all four questions for each dataset that you find!

Submissions and Grading

You will be graded on the quality and thoughtfulness of your responses, so make sure you are giving adequate time to each question.

No resubmissions or late work will be accepted, since this assignment is a project component. Make sure that you are managing your time wisely!

Submit your work on Gradescope by 7 July 2025, 11:59pm PST. Make sure that you submit a PDF that contains your answers to all four questions for each dataset. We will not be able to grade files that are not PDFs.