# Graph Data Type

## Graph data type

All of the data structures we’ve seen so far structure data according to the properties of data. Search trees use the relative ordering of elements to store them in a tree. Hash tables use values to determine the bucket index. In both cases, the properties of data enable efficient insertion, search, and retrieval.

Graphs are a different kind of abstract data type. Rather than focus on insertion, search, and retrieval (as in sets, maps, and priority queues), graphs focus on explicitly modeling the relationships between data. Graphs afford access not only to elements, but also to the relationships between elements.

Undirected graph
The elements of a graph are represented with nodes (aka vertices).
The relationships between elements are represented with edges. Optionally, edges can have weights associated with them. Given a vertex, its immediate edges lead to neighboring (aka adjacent) vertices. The degree of a vertex refers to the number of neighbors.
A path is a sequence of vertices connected by edges. A cycle is a path whose first and last vertices are the same. Two vertices are connected if there is a path between them.

Unlike search trees, the interface for a graph explicitly specifies the nodes and edges!

``````public interface Graph<V> {

// Add an edge between vertices (v, w).
public void addEdge(V v, V w);

// Returns all of the vertices adjacent to v.
public List<V> neighbors(V v);
}
``````

This undirected `Graph` interface has vertices of some type `V` and unweighted edges. In a social network, a `Graph<Person>` might store in each vertex a `Person`, represent edges as friendships, and offer specialized methods for returning whether two people are friends or friends-of-friends.

In a directed graphs, a given vertex can have different outgoing and incoming edges.

Directed graph
A vertex has out-neighbors and in-neighbors with corresponding out-degree and in-degree.

### Question 1

Which of these applications is best modeled as a directed graph?

• Social network: vertex = person; edge = friendship.
• The web: vertex = webpage; edge = URL link.
• Campus map: vertex = building; edge = footpath.
• Course prerequisites: vertex = course; edge = prerequisite.
Explanation

In a course prerequisite schedule, some courses need to be taken before other courses, but that relationship is typically one-directional as well. For example, CSE 143 is a prerequisite for CSE 373, so we wouldn’t expect CSE 373 to be a prerequisite for CSE 143.

That said, it is possible to represent all of these graph applications with directed graphs too. As we’ll see later, undirected graphs are very much like directed graphs except that they ensure edges are added/removed/traversed in both directions.

## Graph features

### Question 1

Which of the following features apply to this graph?

An undirected graph is connected if every pair of vertices is connected.

A directed graph is connected (aka weakly connected) if:

1. We first replace every directed edge with an undirected edge.
2. The resulting undirected graph is connected.

A directed graph is strongly connected if a directed path exists between every pair of vertices.

• Undirected
• Directed
• Acyclic
• Cyclic
• Connected
• Strongly connected

## Graph properties

In this problem, we will try to apply our knowledge of graph properties to determine what is possible in a simple graph. All graphs in this course are simple graphs that have no self-loops and no parallel edges.

1. No self-loops. A node v cannot have an edge (v, v).
2. No parallel (duplicate) edges. There can only exist one edge (v, w).

### Question 1

Suppose we have a simple undirected graph with |V| vertices. What is the minimum number of edges in this graph?

• 0
• 1
• |V| − 1
• |V|
Explanation

The graph could be completely disconnected with no edges at all.

### Question 2

Suppose we have a simple, connected and undirected graph with |V| vertices. What is the minimum number of edges in this graph?

• 0
• 1
• |V| − 1
• |V|

### Question 3

Suppose we have a simple directed graph with |V| vertices. What is the maximum number of edges in this graph?

• |V| − 1
• |V|
• |V|2 − |V|
• |V|2 − 1
• |V|2
• 2|V| − |V|
• 2|V| − 1
• 2|V|
Explanation

Consider the adjacency matrix representation: it is a |V| × |V| matrix except we know that the |V| vertices on the diagonal cannot be edges since self-loops are not allowed.

## Partisanship

How might we use graphs to understand partisanship in the US Congress? Partisanship refers to always voting with your political party and never voting on legislation sponsored by another party, such as a Democrat never voting on a bill with a Republican or vice versa. Andris et al. (2015) answered this by modeling US Congress sessions between 1949 and 2012.

Each member of the U.S. House of Representatives from 1949 - 2012 is drawn as a single node. Republican (R) representatives are in red and Democrat (D) representatives are in blue, party affiliation changes are not reflected. Edges between nodes are drawn if each member agrees with another member more often than the “threshold value” of votes specific to that particular Congress. The threshold value is the number of agreements where any pair exhibiting this number of agreements is equally likely to comprised of two members of the same party (e.g. D-D or R-R), or a cross-party pair (e.g. D-R).

### Question 1

Give an affordance analysis for connectedness as a measure of US Congress partisanship. An undirected graph is connected if every pair of vertices is connected.

Explanation

We can also ask similar questions about the impact of social media on political polarization.

Filter bubble (aka social media bubble) refers to the intellectual isolation that can be caused by personalization algorithms that determine the particular content to show to a user. Brady et al. (2017) investigated how “Emotion shapes the diffusion of moralized content in social networks.”

Network graph of moral contagion shaded by political ideology. The graph represents a depiction of messages containing moral and emotional language, and their retweet activity, across all political topics (gun control, same-sex marriage, climate change). Nodes represent a user who sent a message, and edges (lines) represent a user retweeting another user. The two large communities were shaded based on the mean ideology of each respective community (blue represents a liberal mean, red represents a conservative mean).