CSEP524: Parallel Computation
Basics
- Course Staff (<course-id>-staff@cs...)
-
Instructor: Brad Chamberlain (bradc)
TA: Brandon Myers (bdmyers)
- Meeting Times
-
Lectures: Tuesdays 6:30-9:20pm (Mary Gates Hall 231)
Office Happy Hour (Brad and Brandon): Tuesdays following
lecture at a nearby pub (or other venue that you suggest)
Office hours by appointment (Brandon)
- Mailing List
-
Alias: csep524a_wi13@u...
Archives: Click here
Information on subscribing (registrants only): See the Class Discussion Board
- Textbook
-
Lin & Snyder, Principles of Parallel Programming
2nd printing: preferable due to bug fixes
1st printing: Errata available here
Course Description
Software
- UW CSE Fedora 17 Home Virtual Machine image:
- This is the officially supported operating system for
assignments in this course. You're welcome to work in your
native operating system, but the burden of installing software,
making sure it works correctly, and ensuring that your
solutions work on the VM will be on you.
- Pthreads: (included with the above)
- OpenMP: (included with the above)
- Chapel: chapel-fedora17-1.6.1.1.tar.gz
- (This is a pre-release of the Chapel 1.6.1 sources,
pre-compiled for the Fedora 17 VM; unpack and follow the
directions in README, mentally substituting "1.6.1.1" for "1.6.1"
where necessary and skipping step 2 if you're using the VM)
- MPI: Setup MPI
This describes how to install MPI and how to run it locally and on our course VM cluster.
Lectures
- Lecture #1 (Motivation,
Definitions, Course Overview, Metrics, Embarrassing Parallelism,
starting/stopping Pthreads)
- Lecture #2 (Course
Details, Introduction to Chapel)
- Lecture #3 (Assignment
#1 debrief, False Sharing, Computational Intensity, Load Balance,
Block and Cyclic evaluation and alternatives, Multidimensional
Distributions, Measuring Load Imbalance, RRWW bugs, Pthreads and
Chapel Synchronization Primitives)
- Lecture #4 (Pthreads
vs. Chapel comparison; Unbalanced Block Distributions; Deadlock,
Livelock, and dealing with them; Memory Consistency Models; Data
Races; Fences; Reductions) [audio (CSE NetID required)]
- Lecture #5 (Search and
Eurekas; Task vs. Data Parallelism; Mapping Tasks to Threads;
Reductions on Arrays; Scans; Barriers; Data Parallelism in Chapel;
OpenMP; Atomic/Lock-Free Programming Concepts) [audio (CSE NetID required)]
- Lecture #6 (Shared
Memory wrap-up; ccNUMA architectures; distributed memory; networks;
Flynn's taxonomy; SPMD; MPI) [audio (CSE NetID required)]
- Lecture #7 (MPI wrap-up,
Multigrid Method, Abstract Machine Models, Intro to PGAS, Chapel
multi-locale computations and distributions, Smith-Waterman)
- Lecture #8 (PGAS: UPC,
CAF, Chapel; single-sided communication and active messages;
Smith-Waterman algorithm)
- Lecture #9 (Amdahl's
Law; Processor Technology Trends; Hierarchical Locales; Parallel
Programming Model Evaluations; All-to-all communications; approaches
to FFTs; STM; Wrap-up; Bonus slides: Finite Element Method;
Partitioning Irregular workloads/Graph Partitioning; HPF, ZPL, and
their impact on Chapel's design; User-definable Chapel features)
Assignments
- Assignment #1 (due
before class 1/15/13)
- Note: As announced in class, you don't need to do
the Chapel portion of question 5 this week since we didn't
get to it in lecture. We will likely cover it next week;
you're welcome to work ahead independently in the meantime.
- Assignment #2 (due
before class 1/22/13)
-
treeSearch.chpl
(starting point for question 2)
Notes/clarifications for
Assignment #2 (edited 1/21, 8:45am -- search on 'added')
- Assignment #3 (due
before class 1/29/13)
-
Boehm: Threads Cannot be Implemented as a Library (requires CSE NetID)
Bounded buffer code (tar.gz) or (zip)
Notes/clarifications for
Assignment #3 (edited 1/24, 1:16pm)
- Assignment #4 (due before class 2/5/13)
-
Update: Problem 5, part d ("use a d-ary tree") is now completely optional and will not be graded
Notes/clarifications (edited 2/2) for
Assignment #4
MapReduce: Simplified Data Processing on Large Clusters
userDefReduce.chpl (starting point for problem 4) Updated 1/30 to fix signatures of combine() methods
reduction.chpl (starting point for problem 5)
Makefile Required to compile code for problems 4 and 5
- Assignment #5 (due before class 2/12/13) Q2 updated/clarified, Feb 7th
-
Notes/clarifications
for Assignment #5 Updated Feb 9th
LogP: Towards a Realistic Model of Parallel Computation
Code Starting Points (tar.gz, zip)
- Assignment #6 (due before class 2/26/13) Reading assignment due date corrected Feb 13
-
Notes/clarifications (edited 2/16) for
Assignment #6
Technology-Driven, Highly-Scalable Dragonfly Topology; Kim, Dally, Scott, Abts; ISCA'08
Setup MPI
manual-mpi-reduce.c (starting point for Q2)
stencil9-mpi.c (starting point for Q3)
Makefile (at a minimum, your solution code must build with this)
README Instructions on building and running
- Assignment #7 (due
before class 3/5)
-
Notes on Assignment #7: See sticky conversations on message board (they made more sense in context)
Chapel multilocale setup (tar, zip) (includes README.csep524.multiloc for setting up Chapel to use multiple Locales)
hw7 code (tar, zip) (includes README)
- Final Project
(next step: fill out survey below to propose topic/date by Feb
23rd)
-
Survey
for proposing topic and presentation date
List of potential
study topics for consideration (feel free to propose your
own topic, or to propose topics that should be on this list
for others to consider)
Draft Schedule of
Dates/Topics
March 12th Presentation Order
March 19th Presentation Order
Final Reports and Presentations:
- GPU Programming Models:
-
- Parallelism in Mainstream Languages:
-
- Parallelism/Concurrency in Javascript
-
- Multithreaded Programming Models
-
- Parallelism at Microsoft
-
- Recent Concurrent Languages
-
- Go (Jordan) [slides | report]
- Rust (Krishnamurthy) [slides | report]
- (see also "Task Parallelism in X10, Rust, Haskell" below)
- Parallelism in Functional Languages:
-
- Graph Computing
-
- HPC Languages
-
- Implementation Techniques
-
- Dynamic Load Balancing via Work Stealing (Tamulonis) [slides | report]
- Lightweight Threading (Weiner) [slides | report]
- Static Verification of Deadlock Freedom (Spishak) [slides | report]
- Transactional Memory
-
- Hardware
-
- Miscellaneous
-
- Programming Projects
-