Exercise: Forces & Matrices

In this exercise, you’ll first get more hands-on experience using force-directed layout for networks as well as other chart types. Then you’ll practice making sense of network data using the alternative representation of an adjacency matrix.

Task 1: Configuring Force-Directed Layout

Getting an appropriate force-directed layout requires balancing the strengths of the various forces at play. In the example below, the forces are all wrong! Use the controls to adjust the settings until you find a layout that you believe best suits the data. Note your values and update the code for the controls so that each uses your preferred setting as the initial value.
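To build intuition for what “balancing” means, here is a toy model of two linked nodes on a line. It is a sketch under simplifying assumptions, not the d3-force implementation: the function settle and its parameter names are hypothetical, the link is modeled as a simple spring, and the charge as an inverse-square repulsion.

```javascript
// Toy 1D model (NOT d3-force itself): two linked nodes on a line.
// The link behaves like a spring pulling toward `linkDistance`;
// the charge pushes the two nodes apart.
function settle({ linkDistance, linkStrength, chargeStrength, steps = 500 }) {
  let x1 = 0;
  let x2 = 1; // start the nodes close together
  for (let i = 0; i < steps; i++) {
    const d = x2 - x1;
    // Spring term: positive when the nodes are farther apart than the target.
    const spring = linkStrength * (d - linkDistance);
    // Repulsion term: assumed to fall off with the squared distance.
    const repulse = chargeStrength / (d * d);
    // Net outward force, applied as a small step split between both nodes.
    const f = (repulse - spring) * 0.1;
    x1 -= f / 2;
    x2 += f / 2;
  }
  return x2 - x1; // final separation
}

// Compare the resting separation under different force settings.
console.log(settle({ linkDistance: 30, linkStrength: 1, chargeStrength: 0 }));
console.log(settle({ linkDistance: 30, linkStrength: 1, chargeStrength: 1000 }));
console.log(settle({ linkDistance: 30, linkStrength: 0.1, chargeStrength: 1000 }));
```

With no repulsion the nodes settle at the link distance; adding repulsion, or weakening the link, leaves them farther apart. The controls in this task trade off the same kinds of forces across the whole network.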

The network data depicts character co-occurrences in Victor Hugo’s novel Les Misérables. Two characters are linked if they appear in the same chapter.

Task 2: Bubble Chart Layout

Force-directed layout can be used for more than graphs! In this task, you will create two charts that apply a force simulation to produce a visualization layout.

The states dataset contains the 2023 tax revenues for each U.S. state. The columns are abbr (two-letter state abbreviation), state (the full state name), and revenue (tax revenue in billions of dollars).

const states = FileAttachment('../data/state-tax-revenue-2023.csv')
  .csv({ typed: true });

Your first goal is to create a bubble chart, in which data points are represented by circles with areas proportional to revenue. The circles should be “packed” into an overall circular configuration without any overlap.
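Because a circle’s area grows with the square of its radius, making area proportional to revenue means the radius should scale with the square root of revenue. The helper below is a minimal sketch of that relationship; the name radiusFor and the sample values are hypothetical and are not taken from the dataset or the scaffolding code.

```javascript
// Hypothetical helper: map a value to a radius so that the circle's AREA
// (pi * r^2) is proportional to the value.
function radiusFor(value, maxValue, maxRadius) {
  return maxRadius * Math.sqrt(value / maxValue);
}

// Illustrative values only (not real revenues).
const big = radiusFor(100, 100, 40);
const quarter = radiusFor(25, 100, 40);

// A quarter of the value gives half the radius, hence a quarter of the area.
console.log(big, quarter, (quarter * quarter) / (big * big));
```

In D3, a square-root scale such as d3.scaleSqrt() with a domain starting at zero expresses the same mapping.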

The provided D3 scaffolding code sets up the main visualization elements. Your task is to fill in the force simulation definitions needed to create a legible bubble chart. You will very likely need to consult the d3-force documentation! (Hint: in addition to centering, you may find that your design benefits from explicit positional forces (forceX, forceY).)

The plot above shows the total absolute revenues per state. While interesting, that number is of course driven by both tax rates and the number of people paying taxes. What data transformations might also be valuable to apply here?

Provide your answer here.

While perhaps visually engaging, we could critique the bubble plot on perceptual grounds: area (size) is not the most effective channel for quantitative comparison, and the random positions of the state bubbles may hamper scanning and search. Can you suggest alternative encoding channels to consider?

Provide your answer here.

One design idea is to create a Dorling cartogram by placing the circles in a configuration that more closely matches their geographic layout. If we also had projected cartographic coordinates (centroids) for each state, how might you then use a force-directed layout to produce the final layout? What set of forces might you apply?

Provide your answer here.

While you’ve focused on a bubble chart with a single cluster of circles, your work could be easily adapted to create a Beeswarm Plot, in which an ordinal axis determines the “attraction” points for a discrete set of categories, dividing the data into multiple clusters.
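The idea of per-category “attraction” points can be sketched with a toy positional force. This is an illustration, not d3-force itself: the category names, target positions, and the tick function are all hypothetical, and collision handling is deliberately left out.

```javascript
// Toy positional force (a sketch, NOT d3-force): on each tick, every point
// moves a fraction of the way toward the x-position of its category.
const targets = { small: 100, medium: 300, large: 500 }; // hypothetical axis positions
const points = [
  { category: "small", x: 250 },
  { category: "medium", x: 250 },
  { category: "large", x: 250 },
];

function tick(points, strength) {
  for (const p of points) {
    p.x += (targets[p.category] - p.x) * strength;
  }
}

// Run enough ticks for each point to settle near its category's position.
for (let i = 0; i < 100; i++) tick(points, 0.1);
console.log(points.map((p) => p.x));
```

In an actual beeswarm plot, a collision force would also be needed so that circles sharing a target spread out instead of stacking on top of each other.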

Task 3: Reading Matrices

One of the readings for this week is a study of graph readability using both node-link and matrix visualizations. While less “intuitive”, adjacency matrices were found to support a range of graph reading tasks more effectively than node-link diagrams. The one notable exception was path following from node to node.

By taking an edge-centric representation, adjacency matrices side-step occlusion (such as edge crossings) and reduce clutter. By using appropriate sortings (or permutations) of the rows and columns of the matrix, different patterns of interest can be discovered. However, these advantages come at a cost: one must spend time learning to “read” the diagrams and more readily translate visual patterns to an understanding of graph structure.
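To make the idea of permuting rows and columns concrete, here is a minimal sketch that builds an adjacency matrix from a link list and reorders it by node degree. The four-node graph and the helper names (adjacencyMatrix, permute) are made up for illustration; this is not the Les Misérables data or the matrix implementation provided below.

```javascript
// Illustrative toy graph (NOT the Les Misérables network).
const nodes = ["A", "B", "C", "D"];
const links = [["A", "B"], ["A", "C"], ["A", "D"], ["C", "D"]];

// Build a symmetric 0/1 adjacency matrix for an undirected graph.
function adjacencyMatrix(nodes, links) {
  const index = new Map(nodes.map((id, i) => [id, i]));
  const m = nodes.map(() => nodes.map(() => 0));
  for (const [s, t] of links) {
    const i = index.get(s);
    const j = index.get(t);
    m[i][j] = 1;
    m[j][i] = 1;
  }
  return m;
}

// Apply the same ordering to both rows and columns.
function permute(matrix, order) {
  return order.map((i) => order.map((j) => matrix[i][j]));
}

const matrix = adjacencyMatrix(nodes, links);

// Degree of each node = number of filled cells in its row.
const degree = matrix.map((row) => row.reduce((a, b) => a + b, 0));

// Order node indices by descending degree, breaking ties by original position.
const byDegree = nodes.map((_, i) => i).sort((a, b) => degree[b] - degree[a] || a - b);
const sorted = permute(matrix, byDegree);

console.log(byDegree.map((i) => nodes[i]));
console.log(sorted);
```

Sorting by degree moves the most connected nodes to the top-left corner; other orderings, such as grouping by cluster, bring out different block patterns.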

Your task is to analyze the structure of the Les Misérables network using an adjacency matrix. We’ve implemented a sortable matrix for you below. In addition, each character has an associated cluster group, which we’ve used to color the edges among characters within the same group. Use the matrix diagram to answer the analysis questions below, adjusting the sort order as you see fit.

Questions

Which character(s) co-occur the most with others? Your answer here.
How many characters only co-occur with another once? Your answer here.
What is the size of the largest group? Your answer here.
Which two groups have the most connections between them? Your answer here.
Which group is the densest, i.e., has the highest in-group connectivity? Your answer here.
Which orderings did you find the most helpful and why? Your answer here.

Don’t forget to add, commit, and push your exercises to your GitLab repo!