Next, you will write functions to generate data visualizations using the Seaborn library. For each function, save the generated graph with the specified name. Each function should take only the pandas DataFrame as a parameter. For each problem, drop any rows that have missing data in the columns needed for plotting that problem (do not drop any additional rows).
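As a rough illustration of the "drop only what this plot needs" rule, the sketch below uses the subset argument of DataFrame.dropna on a toy DataFrame. The column names and all values here are made up for the example and are not taken from the assignment's dataset.

```python
import pandas as pd

# Toy data with made-up values; the column names are assumptions.
df = pd.DataFrame({
    "Year": [2005, 2006, 2007, 2008],
    "Sex": ["A", "A", "A", "A"],
    "Total": [28.8, None, 29.6, 30.8],
    "Hispanic": [None, 9.5, 11.6, None],
})

# Suppose this plot only needs Year and Total.
needed = ["Year", "Total"]

# Rows are dropped only if one of the needed columns is missing.
# Rows missing "Hispanic" are kept, because this plot does not use it.
cleaned = df.dropna(subset=needed)

print(len(df), len(cleaned))
```

Calling dropna() with no arguments would also discard the rows that are only missing "Hispanic", which is exactly the "additional rows" the instructions say to keep.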
Write your functions in hw3.py. Each function should produce an image with the exact name specified.
Save each image using a relative path, not an absolute one (e.g. /home/FILE_NAME). If you specify absolute paths for this assignment, your code will not pass the tests!
Pass bbox_inches='tight' to the call to savefig to make sure the edges of the image look correct!
You may use the math, pandas, seaborn, and matplotlib modules, but you may not use any other imports to solve these problems.
Do not use any seaborn plotting functions for this assignment besides the ones we showed in the reference box below. For example, even though the documentation for relplot links to another method called scatterplot, you should not call scatterplot. Instead, use relplot(..., kind='scatter') like we showed in class. This is not a matter of stylistic preference: these functions behave slightly differently, and if you use the other functions, your output might look different from the expected picture. You don't yet have the tools necessary to use scatterplot correctly! We will see these extra tools later in the quarter.
As stated in the Overview, it is difficult to write tests for functions that create graphs. Instead, you can check the graphs manually. Some ways to gain confidence in your generated graph:
Call the describe() method on the data you've selected to see some statistical information about it. This can sometimes help you determine what to expect in your generated graph.
Check individual points against the dataset. For example, the line chart in the first problem should pass through (2005, 28.8) because of this row in the dataset: 2005,A,bachelor's,28.8,34.5,17.6,11.2,62.1,17.0,16.4,28.0
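The checks and saving rules above can be sketched on a tiny example. This is only an illustration under assumptions: the column names "Year" and "Total", the file name example_plot.png, and every value except the 2005 one from the row quoted above are made up, and the Agg backend line is there only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen so no display is needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Toy data; only the 2005 value comes from the quoted dataset row.
toy = pd.DataFrame({
    "Year": [2004, 2005, 2006],
    "Total": [28.0, 28.8, 29.0],
})

# describe() shows count, mean, min, max, etc., which hints at the
# range of values the y-axis of the graph should cover.
print(toy["Total"].describe())

# Spot-check a known point before trusting the picture.
assert (toy.loc[toy["Year"] == 2005, "Total"] == 28.8).all()

# Use relplot(..., kind='scatter') rather than calling scatterplot.
sns.relplot(data=toy, x="Year", y="Total", kind="scatter")

# Relative file name (no leading directory) plus bbox_inches='tight'.
plt.savefig("example_plot.png", bbox_inches="tight")
plt.close("all")
```

Because the file name has no directory in front of it, the image is written to whatever directory the script is run from.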
Plot the total percentage of all people whose minimum completed degree is a bachelor's, as a line chart over the years. To select all people, filter to rows where sex is A. Label the x-axis "Year", the y-axis "Percentage", and title the plot "Percentage Earning Bachelor's over Time". Name your method line_plot_bachelors and save your generated graph as line_plot_bachelors.png.
Your result should look like the following:
Plot the total percentages of women, men, and all people with a minimum education of a high school degree in the year 2009. Label the x-axis "Sex", the y-axis "Percentage", and title the plot "Percentage Completed High School by Sex". Name your method bar_chart_high_school and save your generated graph as bar_chart_high_school.png.
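A minimal sketch of this function follows, again under assumptions rather than as a reference solution: the column names, the "high school" label in the degree column, and the sex codes "F" and "M" are guesses about the dataset, and catplot(..., kind='bar') is assumed to be one of the plotting functions from the reference box. The toy values are made up.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen so no display is needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def bar_chart_high_school(data):
    # Assumed column names and the assumed 'high school' degree label.
    cols = ["Year", "Sex", "Min degree", "Total"]
    subset = data.dropna(subset=cols)
    in_2009 = subset["Year"] == 2009
    is_high_school = subset["Min degree"] == "high school"
    subset = subset[in_2009 & is_high_school]

    sns.catplot(data=subset, x="Sex", y="Total", kind="bar")
    plt.xlabel("Sex")
    plt.ylabel("Percentage")
    plt.title("Percentage Completed High School by Sex")
    plt.savefig("bar_chart_high_school.png", bbox_inches="tight")
    plt.close("all")


# Toy usage with made-up values; 'F' and 'M' codes are assumptions.
toy = pd.DataFrame({
    "Year": [2009, 2009, 2009, 2008],
    "Sex": ["A", "F", "M", "A"],
    "Min degree": ["high school"] * 4,
    "Total": [88.0, 89.0, 87.0, 86.0],
})
bar_chart_high_school(toy)
```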
Do you think this bar chart is an effective data visualization? Include your reasoning in hw3-written.txt as described in Part 3.
Your result should look like the following:
Plot how the percentage of Hispanic individuals with degrees has changed between 1990 and 2010 (inclusive) for high school and bachelor's degrees, using a chart of your choice. Make sure you label your axes with descriptive names and give the graph a title. Name your method plot_hispanic_min_degree and save your visualization as plot_hispanic_min_degree.png.
Include a justification of your choice of data visualization in hw3-written.txt, as described in Part 3.