{ "metadata": { "kernelspec": { "name": "python3", "display_name": "Python 3 (ipykernel)", "language": "python" } }, "nbformat": 4, "nbformat_minor": 2, "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "For this problem, we will be using a new dataset about UFOs." ] }, { "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "\n", "df = pd.read_csv('/home/ufos.csv')\n", "df.head() # Method to only display the first few rows" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Problem 0\n", "Compute the average duration (`'duration (seconds)'`) for each UFO shape (`'shape'`).\n", "\n", "**For testing purposes, store the result in a variable called `ans0`.**" ] }, { "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "### edTest(test_problem_0) ###\n", "" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Problem 1\n", "For this problem, we will provide you with some buggy starter code. Your job is to fix it so it meets the specification below! \n", "\n", "Write code to compute the longest duration UFO sighting (`'duration (seconds)'`) for each city (`'city'`) that is in a \"positive location\". A \"positive location\" is one where at least one of its latitude (`'latitude'`) and longitude (`'longitude'`) is greater than 0. \n", "\n", "**For testing purposes, store the result in a variable called `ans1`.**" ] }, { "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "### edTest(test_problem_1) ###\n", "ans1 = df.groupby('city')['duration (seconds)'].max()[df['latitude'] \u003e 0 | df['longitude'] \u003e 0]\n", "ans1" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Problem 2\n", "Find the name of the city (`'city'`) that has the longest total duration of UFO sightings. 
Use the column `'duration (seconds)'` to compute the total duration of all UFO sightings in each city. \n", "\n", "**For testing purposes, store the result in a variable called `ans2`.**" ] }, { "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "### edTest(test_problem_2) ###\n", "" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Problem 3\n", "Compute how many words are in each comment (`'comments'`). Your result should be a `Series` of the same length as `df` whose values are the number of words in each comment. As in the previous word-counting problems, a word is any sequence of characters separated by whitespace.\n", "\n", "**For testing purposes, store the result in a variable called `ans3`.**\n", "\n", "*Hint: How would you do this for a single string `'I love dogs!'`?*" ] }, { "cell_type": "code", "execution_count": 0, "metadata": {}, "outputs": [], "source": [ "### edTest(test_problem_3) ###\n", "" ] } ] }