Pandas Practice¶
In this lesson, we'll practice working with DataFrames.
import pandas as pd
Our Dataset!¶
Run the cell below to see the DataFrame you'll be working with today.
gradebook = pd.DataFrame({
    "StudentID": [101, 102, 103, 104, 105, 106, 107, 108, 109],
    "Name": ["Arona", "Hannah", "Renusree", "Arpan", "Mia", "Asmi", "Alyssa", "Vani", "Vatsal"],
    "Grade": [60, 92, 78, 90, 88, 73, 50, 99, 75],
    "Attendance": [95, 80, 88, 92, 98, 100, 98, 77, 95],
    "Major": ["CS", "ECE", "CS", "ECE", "CS", "Informatics", "ECE", "Informatics", "Informatics"],
    "GradYear": [2024, 2024, 2026, 2026, 2026, 2024, 2027, 2025, 2025]
})
gradebook
|   | StudentID | Name | Grade | Attendance | Major | GradYear |
|---|---|---|---|---|---|---|
| 0 | 101 | Arona | 60 | 95 | CS | 2024 |
| 1 | 102 | Hannah | 92 | 80 | ECE | 2024 |
| 2 | 103 | Renusree | 78 | 88 | CS | 2026 |
| 3 | 104 | Arpan | 90 | 92 | ECE | 2026 |
| 4 | 105 | Mia | 88 | 98 | CS | 2026 |
| 5 | 106 | Asmi | 73 | 100 | Informatics | 2024 |
| 6 | 107 | Alyssa | 50 | 98 | ECE | 2027 |
| 7 | 108 | Vani | 99 | 77 | Informatics | 2025 |
| 8 | 109 | Vatsal | 75 | 95 | Informatics | 2025 |
Individual Activity¶
Display all the column names in the cell below.
# TODO: Display column names
# Hint: .columns returns the column labels of the DataFrame
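If the hint feels abstract, here is a minimal sketch of reading column labels. The pets table and its column names are hypothetical stand-ins, not part of the lesson's data.

```python
import pandas as pd

# Hypothetical toy data, used only to illustrate the technique
pets = pd.DataFrame({
    "Name": ["Rex", "Milo", "Luna"],
    "Age": [3, 5, 2],
})

# .columns is an attribute (no parentheses); it holds an Index of labels
print(pets.columns)

# Converting to a plain list can be easier to read
column_names = list(pets.columns)
print(column_names)
```

Note that there are no parentheses after .columns, because it is an attribute rather than a method.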
Then, select the rows where Grade is greater than 85 in the cell below.
# TODO: Rows where Grade > 85
# Hint: filter the rows with a boolean condition on the Grade column
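Filtering usually happens in two steps: build a True/False value for every row, then use it to pick rows. A small sketch on a hypothetical pets table (the names here are made up for illustration):

```python
import pandas as pd

# Hypothetical toy data, used only to illustrate the technique
pets = pd.DataFrame({
    "Name": ["Rex", "Milo", "Luna"],
    "Age": [3, 5, 2],
})

# The comparison produces a boolean Series, one True/False per row
is_older = pets["Age"] > 2

# Indexing with that Series keeps only the rows marked True
older_pets = pets[is_older]
print(older_pets)
```

The two steps are often written on one line, as in pets[pets["Age"] > 2].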
Group Activity¶
Add a new column called Participation with the values [100, 80, 70, 90, 100, 20, 80, 90, 100].
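Assigning a list to a column label that does not exist yet creates that column. A minimal sketch, assuming a hypothetical pets table; the list needs one value per row:

```python
import pandas as pd

# Hypothetical toy data, used only to illustrate the technique
pets = pd.DataFrame({
    "Name": ["Rex", "Milo", "Luna"],
    "Age": [3, 5, 2],
})

# Assigning to a new label adds a column; the list has 3 values for 3 rows
pets["Weight"] = [30, 12, 8]
print(pets)
```

If the list is longer or shorter than the number of rows, pandas raises a ValueError instead of guessing.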
Now, calculate the mean of the Grade, Attendance, and Participation columns for each student, and store that value in a new column, FinalScore.
# TODO: Calculate mean and create FinalScore
# remember that you need to store the value under a new column, not a new variable name
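One way to average several columns row by row is DataFrame.mean with axis=1. The sketch below uses a hypothetical scores table (Quiz, Lab, Essay, and Average are made-up names), so treat it as an illustration of the idea rather than the expected answer.

```python
import pandas as pd

# Hypothetical toy data, used only to illustrate the technique
scores = pd.DataFrame({
    "Quiz": [80, 90],
    "Lab": [70, 100],
    "Essay": [90, 80],
})

# Select the columns to average, then take the mean across each row (axis=1)
scores["Average"] = scores[["Quiz", "Lab", "Essay"]].mean(axis=1)
print(scores)
```

Without axis=1, mean() works down each column and returns one value per column rather than one per row.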
Next, change the index of the DataFrame to StudentID and find Student 105's Major. What about their GradYear?
# TODO: Who is Student 105?
# Hints: to change the index of the DataFrame, use the .set_index() method
# You can then look up a row by its index label with .loc[] (square brackets, not parentheses)
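A short sketch of the two hints working together, on a hypothetical pets table with a made-up PetID column:

```python
import pandas as pd

# Hypothetical toy data, used only to illustrate the technique
pets = pd.DataFrame({
    "PetID": [11, 12, 13],
    "Name": ["Rex", "Milo", "Luna"],
    "Age": [3, 5, 2],
})

# set_index returns a new DataFrame by default, so keep the result
pets_by_id = pets.set_index("PetID")

# .loc[row_label, column_label] looks up a single value by label
name_of_12 = pets_by_id.loc[12, "Name"]
print(name_of_12)
```

Because set_index does not modify the original by default, forgetting to store the result is a common source of KeyError on the next line.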
Whole Class Activity¶
Write a function categorize_student that categorizes students as "High Performer" if their FinalScore is 85 or above, and "Needs Improvement" otherwise. Apply this function to create a new column, PerformanceCategory.
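def categorize_student(data):
    """
    Categorize students as "High Performer" (FinalScore of 85 or above)
    or "Needs Improvement" (otherwise).
    """
    ...

categorize_student(gradebook)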
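One common pattern for building a column from a function is Series.apply, which calls the function once per value. The sketch below uses a hypothetical pets table; label_age and AgeGroup are made-up names, and the stub above may instead be meant to receive the whole DataFrame, so this is only one possible approach.

```python
import pandas as pd

# Hypothetical toy data, used only to illustrate the technique
pets = pd.DataFrame({
    "Name": ["Rex", "Milo", "Luna"],
    "Age": [3, 5, 2],
})

def label_age(age):
    """Return "Adult" for ages 3 and up, "Young" otherwise (hypothetical rule)."""
    if age >= 3:
        return "Adult"
    return "Young"

# apply calls label_age on each value in the Age column
pets["AgeGroup"] = pets["Age"].apply(label_age)
print(pets)
```

Pass the function itself (label_age), not the result of calling it (label_age()).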