# Graph Data Type

## Table of contents

## Graphs

## Terminology

## Graph data type

All of the data structures we’ve seen so far structure data according to the properties of data. Search trees use the relative ordering of elements to store them in a tree. Hash tables use values to determine the bucket index. In both cases, the properties of data enable efficient insertion, search, and retrieval.

Graphs are a different kind of abstract data type. Rather than focus on insertion, search, and retrieval (as in sets, maps, and priority queues), graphs focus on explicitly modeling the relationships between data. Graphs afford access not only to elements, but also to the relationships between elements.

- Undirected graph
- The elements of a graph are represented with
**nodes**(aka**vertices**). - The relationships between elements are represented with
**edges**. Optionally, edges can have**weights**associated with them. Given a vertex, its immediate edges lead to**neighboring**(aka**adjacent**) vertices. The**degree**of a vertex refers to the number of neighbors. - A
**path**is a sequence of vertices connected by edges. A**cycle**is a path whose first and last vertices are the same. Two vertices are**connected**if there is a path between them.

Unlike search trees, the interface for a graph explicitly specifies the nodes and edges!

```
public interface Graph<V> {
// Add an edge between vertices (v, w).
public void addEdge(V v, V w);
// Returns all of the vertices adjacent to v.
public List<V> neighbors(V v);
}
```

This undirected `Graph`

interface has vertices of some type `V`

and unweighted edges. In a social network, a `Graph<Person>`

might store in each vertex a `Person`

, represent edges as friendships, and offer specialized methods for returning whether two people are friends or friends-of-friends.

In a directed graphs, a given vertex can have different outgoing and incoming edges.

- Directed graph
- A vertex has
**out-neighbors**and**in-neighbors**with corresponding**out-degree**and**in-degree**.

### Question 1

Which of these applications is best modeled as a directed graph?

**Social network**: vertex = person; edge = friendship.**The web**: vertex = webpage; edge = URL link.**Campus map**: vertex = building; edge = footpath.**Course prerequisites**: vertex = course; edge = prerequisite.

## Explanation

On the web, URL links are one-directional! The affordances of the web mean that your web browser needs to maintain a back button and browser history.

In a course prerequisite schedule, some courses need to be taken before other courses, but that relationship is typically one-directional as well. For example, CSE 143 is a prerequisite for CSE 373, so we wouldn’t expect CSE 373 to be a prerequisite for CSE 143.

That said, it is possible to represent all of these graph applications with directed graphs too. As we’ll see later, undirected graphs are very much like directed graphs except that they ensure edges are added/removed/traversed in both directions.

## Implementations

## Graph features

### Question 1

Which of the following features apply to this graph?

An undirected graph is **connected** if every pair of vertices is connected.

A directed graph is **connected** (aka **weakly connected**) if:

- We first replace every directed edge with an undirected edge.
- The resulting undirected graph is connected.

A directed graph is **strongly connected** if a directed path exists between every pair of vertices.

- Undirected
- Directed
- Acyclic
- Cyclic
- Connected
- Strongly connected

## Graph properties

In this problem, we will try to apply our knowledge of graph properties to determine what is possible in a **simple graph**. All graphs in this course are simple graphs that have no self-loops and no parallel edges.

**No self-loops**. A node*v*cannot have an edge (*v*,*v*).**No parallel (duplicate) edges**. There can only exist one edge (*v*,*w*).

### Question 1

Suppose we have a simple **undirected** graph with |*V*| vertices. What is the **minimum** number of edges in this graph?

- 0
- 1
- |
*V*| − 1 - |
*V*|

## Explanation

The graph could be completely disconnected with no edges at all.

### Question 2

Suppose we have a simple, **connected** and **undirected** graph with |*V*| vertices. What is the **minimum** number of edges in this graph?

- 0
- 1
- |
*V*| − 1 - |
*V*|

### Question 3

Suppose we have a simple **directed** graph with |*V*| vertices. What is the **maximum** number of edges in this graph?

- |
*V*| − 1 - |
*V*| - |
*V*|^{2}− |*V*| - |
*V*|^{2}− 1 - |
*V*|^{2} - 2
^{|V|}− |*V*| - 2
^{|V|}− 1 - 2
^{|V|}

## Explanation

Consider the adjacency matrix representation: it is a |*V*| × |*V*| matrix except we know that the |*V*| vertices on the diagonal cannot be edges since self-loops are not allowed.

## Partisanship

How might we use graphs to understand partisanship in the US Congress? **Partisanship** refers to always voting with your political party and never voting on legislation sponsored by another party, such as a Democrat never voting on a bill with a Republican or vice versa. Andris et al. (2015) answered this by modeling US Congress sessions between 1949 and 2012.

Each member of the U.S. House of Representatives from 1949 - 2012 is drawn as a single node. Republican (R) representatives are in red and Democrat (D) representatives are in blue, party affiliation changes are not reflected. Edges between nodes are drawn if each member agrees with another member more often than the “threshold value” of votes specific to that particular Congress. The threshold value is the number of agreements where any pair exhibiting this number of agreements is equally likely to comprised of two members of the same party (e.g. D-D or R-R), or a cross-party pair (e.g. D-R).

### Question 1

Give an affordance analysis for **connectedness** as a measure of US Congress partisanship. An undirected graph is connected if every pair of vertices is connected.

## Explanation

We can also ask similar questions about the impact of social media on political polarization.

**Filter bubble** (aka social media bubble) refers to the intellectual isolation that can be caused by personalization algorithms that determine the particular content to show to a user. Brady et al. (2017) investigated how “Emotion shapes the diffusion of moralized content in social networks.”

Network graph of moral contagion shaded by political ideology. The graph represents a depiction of messages containing moral and emotional language, and their retweet activity, across all political topics (gun control, same-sex marriage, climate change). Nodes represent a user who sent a message, and edges (lines) represent a user retweeting another user. The two large communities were shaded based on the mean ideology of each respective community (blue represents a liberal mean, red represents a conservative mean).