CSEP-561 Spring'22

Project 2: Link and Network with SDN

Turnin: Online via Gradescope
Teams: Individual or Teams of 2 (register teams on Gradescope)
Due: May 25, 2022 @ 11:59PM PST.

Project Overview

In this project you will continue learning about Software Defined Networking (SDN), building on your experience from the previous assignment. Using VirtualBox, Mininet, and Pox as the implementers of the OpenFlow protocol, you will build some simple networks using SDN primitives.

  • For the first portion you will be building an entire network, with multiple switches capable of handling traffic in a static topology.

  • In the second part, you will be modifying your part 1 solution to implement a layer 3 learning IP router that handles ARP and routes traffic dynamically between subnets.

Setup:

This project re-uses your existing mininet environment from project 1. You will need to make the following modifications to the environment:

  1. As the vagrant user in the vagrant home directory of the virtual machine, clone the new starter code repository:
cd ~/
git clone https://gitlab.cs.washington.edu/561p-course-staff/project-2-starter project-2
  2. Run the project-2-starter bootstrap-p2.sh script. This script will create links for project 2 similar to the ones created in project 1.
cd /home/vagrant/project-2
./bootstrap-p2.sh

Part 1: A more complex network

Learning Objectives (After this section, students will be able to…):

  • Create match –> action rules to forward traffic efficiently out specific ports without flooding according to static knowledge of addresses and topology in the network.
  • Create match –> action rules that correctly forward L3 traffic across multiple distinct L2 networks according to a static addressing plan.
  • Create an SDN controller program capable of configuring multiple switches with different behavior specified for different types of switches.

In project 1 part 2 you implemented a simple firewall that allowed ICMP packets, but blocked all other packets. For this project, you will be expanding on this to implement L3 routing between subnets, and implementing firewalls for certain subnets. The idea is to simulate a (slightly : ) ) more realistic network.

We will be simulating a network for a small company. The company has a 3 floor building, with each floor having its own switch and subnet. Additionally, we have a switch and subnet for all the servers in the data center in the basement, and a core switch connecting everything together. Note that the names and IPs are not to be changed. As with the prior assignment, we have provided the topology (project-2-starter/topos/part1.py) as well as a skeleton controller (project-2-starter/topos/a2part1controller.py). As with part 2 of the previous assignment, you need only modify the controller.

A network diagram showing s1 connected directly to h10, s2 connected to h20, and s3 connected to h30. s1, s2, and s3 are all connected to cores21. cores21 is connected to hnotrust1 and dcs31. dcs31 is connected to serv1.

Your goal will be to allow traffic to be transmitted between all the authorized hosts in the company. In this assignment, you will be allowed to flood traffic on the secondary routers (s1, s2, s3, dcs31) in the same way that you did in a1 part2 (using a destination port of of.OFPP_FLOOD). However, in the core router (cores21) you will need to specify specific ports for all IP traffic. You may do this in any way you choose; however, you may find it easiest to determine the correct destination port and drop/allow action by using the destination IP address and source IP address, as well as the source port on the switch that the packet originated from.

While all authorized hosts should be able to communicate between themselves and the datacenter server, we need to protect the network from the untrusted Internet. In this scenario we represent the external hosts by hnotrust1, and have the following requirements:

  1. We need to block all IP traffic from hnotrust1 to serv1, while still allowing the regular hosts (h10, h20, etc.) to communicate with hnotrust1.
  2. To block the Internet from discovering our internal IP addresses, we want to block all ICMP traffic from hnotrust1 to the regular hosts (h10, h20, etc.) and serv1.
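
The forwarding and firewall requirements above can be sketched as one pure decision function, independent of any Pox API. This is only an illustration: the subnets, the port numbers, and the name core_decision are assumptions made up for the example, so read the real addresses and port assignments from part1.py before translating the idea into flow rules.

```python
import ipaddress

# Hypothetical addressing plan and core-switch port numbering.
# These values are assumptions for illustration; check part1.py.
SUBNET_TO_PORT = {
    ipaddress.ip_network("10.0.1.0/24"): 1,     # assumed: floor 1 (h10)
    ipaddress.ip_network("10.0.2.0/24"): 2,     # assumed: floor 2 (h20)
    ipaddress.ip_network("10.0.3.0/24"): 3,     # assumed: floor 3 (h30)
    ipaddress.ip_network("10.0.4.0/24"): 4,     # assumed: data center (serv1)
    ipaddress.ip_network("172.16.10.0/24"): 5,  # assumed: untrusted (hnotrust1)
}
UNTRUSTED = ipaddress.ip_network("172.16.10.0/24")  # assumed
SERVER = ipaddress.ip_network("10.0.4.0/24")        # assumed
ICMP = 1  # IPv4 protocol number for ICMP


def core_decision(src_ip, dst_ip, ip_proto):
    """Return the output port on the core switch, or None to drop."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    if src in UNTRUSTED:
        if dst in SERVER:
            return None  # requirement 1: no IP traffic from untrusted to serv1
        if ip_proto == ICMP:
            return None  # requirement 2: no ICMP from untrusted to internal hosts
    for net, port in SUBNET_TO_PORT.items():
        if dst in net:
            return port
    return None  # unknown destination: drop
```

Each (source subnet, destination subnet, protocol) combination that this function distinguishes would correspond to one match –> action rule on cores21, with the drop cases given higher priority than the plain forwarding cases.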

In summary, your goals are:

  • Create a Pox controller (as per a1 part 2) with the following features: All nodes able to communicate EXCEPT
    • hnotrust1 cannot send ICMP traffic to h10, h20, h30, or serv1 (but should be able to send IP traffic to hosts).
    • hnotrust1 cannot send any IP traffic to serv1.
Deliverables:
  1. A screenshot of the pingall command. All nodes but hnotrust1 should be able to send and respond to pings.
  2. A screenshot of the iperf hnotrust1 h10 and iperf h10 serv1 commands. Though not shown in these commands, hnotrust1 should not be able to transfer to serv1. It should be able to transfer to other hosts.
  3. A screenshot of the output of the dpctl dump-flows command. This should contain all of the rules you’ve inserted into your switches.
  4. Your a2part1controller.py file.

Part 2: A learning router

Learning Objectives (After this section, students will be able to…):

  • Write a program to handle ARP requests and generate valid ARP replies to establish connectivity without static address mappings.
  • Dynamically create match –> action rules that correctly forward L3 traffic across multiple distinct L2 networks with dynamic L2 addresses.
  • Create an SDN controller program capable of holding dynamic state to configure multiple switches in response to traffic observed in the network.
  • Reason about the relationship between L2 addresses within a particular subnet and the L3 addresses used for internetwork routing.

For part 2, we’ll extend your part 1 code to implement a somewhat more realistic layer 3 router out of the cores21 switch. The a2part2controller.py skeleton is very similar to part1, and you may want to begin by copying forward some of your functionality from part1. For the topology, we again provide a file (part2.py). The difference between the part1.py and part2.py topologies is that there is no longer a static L3<–>L2 address mapping loaded into each host, and the default route ‘h{N}0-eth0’ was changed to ‘via 10.0.{N}.1’ where ‘10.0.{N}.1’ is the IP address of the gateway (i.e. router) for that particular subnet. This effectively changes the network from a switched network (with hosts sending to a MAC address) into a routed network (hosts sending to an IP address which may require a gateway out of the L2 network). A minimal a2part1controller will not work on this new topology!

To handle this L2<–>L3 mapping cores21 will need to:

  • Handle ARP traffic in multiple subnets (without forwarding);
  • Generate valid ARP replies when needed; and
  • Forward IP traffic across link domains (which will require updating the L2 header);
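
To make "a valid ARP reply" concrete, here is a minimal sketch that assembles the raw Ethernet and ARP bytes with only the standard library. It does not use Pox's packet classes; the function name build_arp_reply, the dict used as a stand-in for a parsed request, and the router MAC are all assumptions for illustration. In your controller you would most likely build the same fields with the packet library that Pox provides.

```python
import socket
import struct

ETH_TYPE_ARP = 0x0806
ARP_REPLY = 2


def mac_to_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_arp_reply(router_mac, request):
    """Build an Ethernet+ARP reply answering `request`.

    `request` is a hypothetical dict with keys sender_mac, sender_ip,
    and target_ip, standing in for a parsed ARP request.
    """
    requester_mac = mac_to_bytes(request["sender_mac"])
    our_mac = mac_to_bytes(router_mac)
    # Ethernet header: destination is the requester, source is the router.
    eth = requester_mac + our_mac + struct.pack("!H", ETH_TYPE_ARP)
    # ARP header: Ethernet (1), IPv4 (0x0800), 6-byte MAC, 4-byte IP, reply.
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, ARP_REPLY)
    # Sender fields: the router claims the IP that was asked about.
    arp += our_mac + socket.inet_aton(request["target_ip"])
    # Target fields: the original requester.
    arp += requester_mac + socket.inet_aton(request["sender_ip"])
    return eth + arp
```

The key point is the swap: the IP that was the target of the request becomes the sender IP of the reply, paired with the MAC address the router wants hosts to use for their gateway.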

Additionally, this assignment requires that your implementation work dynamically. You may not install static routes on cores21 at startup. Instead, your router must learn which ports and which L2 addresses correspond to particular L3 addresses, and install the appropriate rules into the cores21 switch dynamically. This information can be inferred by processing the content of received ARP messages, or possibly other traffic on the network (although processing ARP is sufficient). You may handle each of the individual ARP packets in the controller (i.e., not with flow rules) for part 2, but most IP traffic should be handled with flow rules for efficiency. The other switches (e.g., s1) do not need to be modified and can continue to flood traffic with static rules.

For this part of the project, you should drop packets where your router has not learned the appropriate destination. This will mean some packets will have to be dropped that could be routed if your router had more information. See extension 1 to address this shortcoming.
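
One possible shape for the learned state is a table keyed by IP address, filled in from ARP messages and consulted for every IP packet. The sketch below is plain Python with no Pox dependency; the class name L3Table, the helper forward, and the dictionary it returns are hypothetical and only describe what a flow rule would need to do (rewrite the L2 header, then output on a port).

```python
class L3Table:
    """Maps a learned IP address to the (MAC, switch port) it was seen on."""

    def __init__(self):
        self._hosts = {}

    def learn_from_arp(self, sender_ip, sender_mac, in_port):
        """Record where a host lives, based on the sender fields of an ARP."""
        self._hosts[sender_ip] = (sender_mac, in_port)

    def next_hop(self, dst_ip):
        """Return (dst_mac, out_port), or None if the host is not yet learned."""
        return self._hosts.get(dst_ip)


def forward(table, router_mac, dst_ip):
    """Describe the L2 rewrite for an IP packet, or return None to drop it."""
    hop = table.next_hop(dst_ip)
    if hop is None:
        return None  # destination not learned yet: drop, as required here
    dst_mac, out_port = hop
    return {
        "set_src_mac": router_mac,  # the router becomes the L2 source
        "set_dst_mac": dst_mac,     # the learned host becomes the L2 destination
        "out_port": out_port,
    }
```

For example, after `table.learn_from_arp("10.0.1.10", "00:00:00:00:00:0a", 1)`, a packet for 10.0.1.10 yields a rewrite and port 1, while a packet for an address that has never sent ARP yields None and is dropped.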

Your implementation should still apply the same L3 policy rules from part 1, particularly that:

  1. We need to block all IP traffic from the hnotrust1 subnet to serv1, while still allowing the regular hosts (h10, h20, etc.) to communicate with hnotrust1.
  2. To block the Internet from discovering our internal IP addresses, we want to block all ICMP traffic from the hnotrust1 subnet to both the regular hosts (h10, h20, etc.) and serv1.

Note: Combining learning with L3 policies like this is not very secure in practice, since in the real world there is no way to validate the L3 source address with plain IP. Do not naively take the approach used in this assignment and apply it in a production network.

In summary, your complete implementation should:

  • Handle ARP traffic in multiple subnets (without forwarding)
  • Generate valid ARP replies when needed
  • Forward IP traffic across L2 networks as well as within them (which may require updating the L2 header)
  • Dynamically learn the L3 topology of the network (which subnets are accessible on which ports) via snooping ARP traffic (or another method)
  • Allow all hosts to bidirectionally communicate with the server and with each other after the destination addresses have been learned
  • Prevent hnotrust1 from sending IP traffic to serv1
  • Prevent hnotrust1 from sending ICMP to any of the regular hosts (h10, h20, etc.) or serv1
  • Allow bidirectional IP communication between regular hosts (h10, h20, etc.) and hnotrust1
Deliverables:
  1. A screenshot of the pingall command immediately after the network has been started. All nodes but hnotrust1 should be able to send and respond to pings. Note that some pings will fail as the router learns of routes (why would that be the case?).
  2. A screenshot of a second pingall command following the first. All traffic should be delivered except to/from the blocked hnotrust1.
  3. A screenshot of the iperf hnotrust1 h10 and iperf h10 serv1 commands. hnotrust1 should not be able to transfer to serv1, but should be able to transfer to other hosts.
  4. A screenshot of the output of the dpctl dump-flows command. This should contain all of the rules you’ve inserted into your switches.
  5. Your a2part2controller.py file.

Turn-in

When you’re ready to turn in your assignment, do the following:

  1. Modify the README.md file to include the names and UW netids of the member(s) of your team.
  2. Archive all the materials (your project-2-starter directory and everything in it) in a single .zip file named partner1netid_partner2netid.zip.
  3. (Optional) If you’re submitting any extensions, make sure you submit them in addition to your regular submission materials as additional files in your archive, not overwriting the regular parts of the assignment!
  4. Submit the partner1netid_partner2netid.zip file to Gradescope.

Extension 1: ARP Generation

Reminder: This part of the assignment is for intellectual curiosity only, and is neither part of the grade nor extra credit. As an extension it’s less fully documented than the main assignment, so please ask questions as needed!

It’s unfortunate that the implementation for part2 leads to traffic being dropped even when the destination host is online and available. One way to resolve this is to have your router implementation generate ARP requests for hosts on the relevant subnet, and wait for the ARP reply before forwarding the queued packet.

For this first extension, implement dynamic ARP requests from the cores21 router, allowing all traffic to be delivered successfully during the first round of pingall (except from the blocked hnotrust1). You should use the same topo file from part2, and your implementation should have the following properties:

  • ARP should only be generated on the relevant subnet, based on the IP destination address.
  • Your router should not generate gratuitous ARP; it should send ARP requests only when necessary.
  • Your implementation should be robust to the case that no ARP reply is ever received, and have some way to limit the amount of memory taken by queued packets. Consider how you might estimate an appropriate timeout for a given request.
  • Your implementation should ideally not prevent the controller from handling other packets while waiting for an ARP reply (although a blocking implementation is a fine place to start!). Consider how you could design your controller to handle concurrent requests while still tracking the state of each queued packet awaiting a destination address.
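
As a starting point for the queueing properties above, the following sketch keeps a bounded queue of packets per unresolved destination and expires requests that never receive a reply. It is one possible design, not the expected one: the class name PendingArpQueue, the default limits, and the injectable clock are assumptions chosen so the logic can be exercised without a running controller.

```python
import time
from collections import deque


class PendingArpQueue:
    """Bounded per-destination queues for packets awaiting an ARP reply."""

    def __init__(self, max_per_ip=8, timeout_s=2.0, clock=time.monotonic):
        self._max = max_per_ip
        self._timeout = timeout_s
        self._clock = clock
        self._pending = {}  # dst_ip -> (time of first request, deque of packets)

    def enqueue(self, dst_ip, packet):
        """Queue a packet; return True if a new ARP request should be sent."""
        entry = self._pending.get(dst_ip)
        need_request = entry is None
        if entry is None:
            entry = (self._clock(), deque(maxlen=self._max))
            self._pending[dst_ip] = entry
        # A deque with maxlen silently discards the oldest packet when full.
        entry[1].append(packet)
        return need_request

    def resolve(self, dst_ip):
        """An ARP reply arrived: return the queued packets for forwarding."""
        entry = self._pending.pop(dst_ip, None)
        return list(entry[1]) if entry else []

    def expire(self):
        """Discard queues whose request timed out; return the expired IPs."""
        now = self._clock()
        expired = [ip for ip, (started, _) in self._pending.items()
                   if now - started >= self._timeout]
        for ip in expired:
            del self._pending[ip]
        return expired
```

Because nothing here blocks, the controller can keep handling other packets and call expire periodically (for instance from a timer). Note that this sketch bounds memory per destination but not the number of destinations, which is worth keeping in mind for the questions that follow.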

After exploring the problem, consider the following questions:

  1. What is the high-level design of your queueing strategy, and will your queues overflow if you receive a high-bandwidth stream of traffic to a new host address? Will this queue overflow be isolated to this one flow or impact other flows through your router?
  2. Remember that the controller is usually much more throughput limited than the actual dataplane switches. With this in mind, is your implementation robust to a Denial of Service attack (an attempt to overwhelm with malicious traffic) against the controller? How might you detect and/or mitigate an attack against your controller?
Deliverables:
  1. Create a new controller file for this part, called a2ext1controller.py in the pox directory implementing ARP request generation.
  2. Add the answers to the posed questions as a new section in your README file.

Extension 2: A Stateful Firewall

This assignment was motivated by an office context, with a set of workstation hosts and a more protected internal server. The assignment spec allows IP traffic between the workstations and an untrusted host, but in practice this traffic would also be best avoided in many situations! One way to do this is via a network component called a stateful firewall. A stateful firewall gets its name from tracking the individual connection state for each flow (or host), and blocking or allowing traffic based on the connection state.

Consider how your part2 controller might be extended to implement a stateful firewall (using the same part2 topo) with the following properties:

  • Hosts within the corporate network can ping hosts and receive ICMP echo replies from the public Internet (hnotrust1), but hnotrust1 cannot send ICMP echo requests to hosts within the network.
  • Hosts within the network can initiate TCP and UDP connections to hnotrust1, and exchange bidirectional traffic, but hnotrust1 cannot initiate a connection into the network on its own after a timeout period of no traffic exchange.
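
The connection-tracking idea behind these properties can be sketched as a table of flows opened from the inside, each with an idle timer. This is a simplified illustration under stated assumptions: the class name ConnTracker, the 30-second default, and the five-field flow key are all made up for the example, and a real controller would also have to install and remove the matching flow rules.

```python
import time


class ConnTracker:
    """Allows inbound traffic only for flows recently opened from inside."""

    def __init__(self, idle_timeout_s=30.0, clock=time.monotonic):
        self._timeout = idle_timeout_s
        self._clock = clock
        self._flows = {}  # (proto, in_ip, in_port, out_ip, out_port) -> last seen

    def outbound(self, proto, in_ip, in_port, out_ip, out_port):
        """Record traffic sent by an inside host; always allowed."""
        self._flows[(proto, in_ip, in_port, out_ip, out_port)] = self._clock()
        return True

    def inbound(self, proto, out_ip, out_port, in_ip, in_port):
        """Allow traffic from outside only if it matches a live flow."""
        key = (proto, in_ip, in_port, out_ip, out_port)
        last_seen = self._flows.get(key)
        now = self._clock()
        if last_seen is None or now - last_seen > self._timeout:
            self._flows.pop(key, None)  # forget expired state
            return False
        self._flows[key] = now  # refresh the idle timer
        return True
```

For ICMP echo, which has no ports, one option is to store the echo identifier in the port fields so that a reply is matched to the request that triggered it.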

Nice-to-haves:

  • Your implementation could detect when a TCP connection is closed with FIN or RST, and remove its state immediately without waiting for a timeout. (Can you do this with UDP?)

After exploring the problem, consider the following questions:

  1. Imagine an office worker behind a stateful firewall trying to initiate a Zoom call to another worker at another office behind a different firewall. Is this possible with your implementation? Would it be possible with the help of a public “rendezvous” server in the public Internet?
  2. With this “peer to peer” use case in mind, what are the tradeoffs between basing a stateful firewall on L3 state only, L4 state only, or a combination of L3 and L4 state?

If this is interesting to you, definitely check out the work done at the IETF around Interactive Connectivity Establishment, RFC 8445, and the ICE working group.

Deliverables:
  1. Create a new controller file for this part, called a2ext2controller.py in the pox directory implementing your stateful firewall logic.
  2. Add the answers to the posed questions as a new section in your README file.