CS 377P: Programming for Performance

Assignment 3: Operator formulation of algorithms

Due date: March 7th, 2017

Late submission policy: Submissions can be at most 2 days late. There will be a 10% penalty for each day after the due date (cumulative).

Clarifications to the assignment are posted at the bottom of the page.

Description

This assignment introduces you to the operator formulation of algorithms. The motto introduced in class is Algorithm = Operator + Schedule, and in this assignment, you will implement sequential algorithms for the single-source shortest-path (sssp) problem to understand this motto. Read the entire assignment before you start coding, and get started early: this assignment requires more programming than previous assignments.

Key concepts

Recall that we classify algorithms into topology-driven and data-driven algorithms.

Topology-driven algorithms make a number of sweeps over the graph. At the start of the algorithm, node labels are initialized as needed by the algorithm (for example, for sssp, the label of the source node is initialized to zero and the labels of all other nodes are initialized to infinity). In each sweep, the operator is applied to all nodes. The algorithm terminates when a sweep does not modify the label of any node. In some problems, particularly those in which labels are floating-point numbers, we may never get to exact convergence, so we terminate the algorithm when node updates are below some threshold or when some upper bound on the number of iterations is reached.
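A topology-driven sweep for sssp can be sketched as follows. This is only an illustration, not a reference solution: the function name `topology_driven_sssp`, the edge-list input, and the choice of a push-style relaxation of outgoing edges as the operator are all assumptions, and the loop is only guaranteed to terminate if the graph has no negative-weight cycles.

```python
import math

def topology_driven_sssp(num_nodes, edges, source):
    """Sweep over all nodes until no label changes.

    Assumptions (not from the assignment text): nodes are numbered
    1..num_nodes, edges is a list of (u, v, w) directed edges, and the
    operator relaxes the outgoing edges of a node.
    """
    # Adjacency list: out_edges[u] holds (v, w) pairs.
    out_edges = {u: [] for u in range(1, num_nodes + 1)}
    for u, v, w in edges:
        out_edges[u].append((v, w))

    # Source label is zero; every other label starts at infinity.
    dist = {u: math.inf for u in range(1, num_nodes + 1)}
    dist[source] = 0

    sweeps = 0
    changed = True
    while changed:
        changed = False
        sweeps += 1
        for u in range(1, num_nodes + 1):
            if dist[u] == math.inf:
                continue  # nothing useful to push from an unreached node
            for v, w in out_edges[u]:
                if dist[u] + w < dist[v]:
                    dist[v] = dist[u] + w
                    changed = True
    return dist, sweeps

# Small example graph with one unreachable node (node 5).
example_edges = [(1, 2, 4), (1, 3, 1), (3, 2, 2), (2, 4, 5)]
labels, num_sweeps = topology_driven_sssp(5, example_edges, source=1)
```

Note that the final sweep does no useful work: it only confirms that no label changed, which is the termination condition described above.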

Data-driven algorithms maintain a work-list of active nodes. The work-list can be considered to be an abstract data type (class) that supports two methods: put and get. Active nodes are added to the work-list by invoking the put method with the set of active nodes. The work-list can be maintained either as a set (so no duplicates are allowed) or as a multi-set (duplicates are allowed). In this assignment, work-lists can be implemented as multi-sets so you do not need to check for duplicates. The get method returns an active node from the work-list if it is not empty, and removes it from the work-list. If there are multiple active nodes in the work-list, the schedule determines which one is returned. Applying the operator to an active node may change the labels of other nodes in the graph; if so, these nodes become active and are added to the work-list. For problems in which labels are floating-point numbers, we may choose not to activate a node if the change to its label is below some threshold. Data-driven algorithms terminate when the work-list is empty and all active nodes have been processed.
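To make the put/get abstraction concrete, here is a minimal sketch of three multi-set work-lists that differ only in their schedule. The class names, the convention that get returns `None` when the work-list is empty, and the use of comparable items for the priority variant are assumptions made for illustration.

```python
import heapq
from collections import deque

class FifoWorklist:
    """Returns items in the order they were added (queue schedule)."""
    def __init__(self):
        self._items = deque()

    def put(self, nodes):
        self._items.extend(nodes)  # duplicates are allowed (multi-set)

    def get(self):
        return self._items.popleft() if self._items else None

class LifoWorklist:
    """Returns the most recently added item first (stack schedule)."""
    def __init__(self):
        self._items = []

    def put(self, nodes):
        self._items.extend(nodes)

    def get(self):
        return self._items.pop() if self._items else None

class PriorityWorklist:
    """Returns the smallest item first; items must be comparable."""
    def __init__(self):
        self._heap = []

    def put(self, nodes):
        for node in nodes:
            heapq.heappush(self._heap, node)

    def get(self):
        return heapq.heappop(self._heap) if self._heap else None

# The same put call yields a different first item under each schedule.
first_items = []
for worklist in (FifoWorklist(), LifoWorklist(), PriorityWorklist()):
    worklist.put([3, 1, 2])
    first_items.append(worklist.get())
```

After this runs, `first_items` is `[3, 2, 1]`: the three work-lists hold the same items, and only the schedule decides which one get returns.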

Graph formats

Input graphs will be given to you in DIMACS format, which is described at the end of this assignment. The output for each algorithm should be produced as a text file containing one line for each node, specifying the number of the node and the label of that node.

Coding

  1. I/O routines for graphs: These routines will be important for debugging your programs so make sure they are working before starting the rest of the assignment.
  2. Data-driven algorithms: Implement a routine that takes a graph G and a work-list w of active nodes as input, and performs a data-driven sssp computation on graph G.  By passing different work-lists to this routine as described below, you can implement different data-driven algorithms for sssp without changing the code in your routine. Instrument your code to count the number of node and edge relaxations.
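One possible shape for such a routine is sketched below. It is not the required interface: the name `data_driven_sssp`, the adjacency-dictionary graph, the stand-in `FifoWorklist`, and the counting convention (one node relaxation per operator application, one edge relaxation per outgoing edge examined) are assumptions you should replace with the definitions used in class.

```python
import math
from collections import deque

class FifoWorklist:
    """Stand-in work-list; any object with the same put/get methods works."""
    def __init__(self):
        self._items = deque()

    def put(self, nodes):
        self._items.extend(nodes)

    def get(self):
        return self._items.popleft() if self._items else None

def data_driven_sssp(graph, worklist, dist):
    """Process active nodes until the work-list is empty.

    graph: dict mapping node u to a list of (v, w) outgoing edges (assumed).
    worklist: already holds the initially active nodes.
    dist: dict of node labels, updated in place.
    Returns (node_relaxations, edge_relaxations) under the assumed
    counting convention described above.
    """
    node_relaxations = 0
    edge_relaxations = 0
    u = worklist.get()
    while u is not None:
        node_relaxations += 1
        for v, w in graph[u]:
            edge_relaxations += 1
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                worklist.put([v])  # v's label changed, so v becomes active
        u = worklist.get()
    return node_relaxations, edge_relaxations

# Example: source node 1 is the only initially active node.
graph = {1: [(2, 4), (3, 1)], 2: [(4, 5)], 3: [(2, 2)], 4: []}
dist = {u: math.inf for u in graph}
dist[1] = 0
worklist = FifoWorklist()
worklist.put([1])
node_count, edge_count = data_driven_sssp(graph, worklist, dist)
```

Because the routine only calls put and get, swapping in a work-list with a different schedule changes the algorithm (and typically the relaxation counts) without touching the routine itself.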

Experiments

Data-driven sssp algorithms

Submission

Submit (in Canvas) your code and all the items listed in the experiments above.

Grading

  • Code: 50 points
  • Experiments: 50 points

DIMACS format for graphs

One popular format for representing directed graphs as text files is the DIMACS format (an undirected graph is represented as a directed graph by replacing each undirected edge with two directed edges). Files are assumed to be well-formed and internally consistent so it is not necessary to do any error checking. A line in a file must be one of the following.
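A minimal reader sketch follows. It assumes the line types commonly used in DIMACS shortest-path files: comment lines starting with `c`, a single problem line of the form `p sp n m` (n nodes, m edges), and edge lines of the form `a u v w` (a directed edge from u to v with integer weight w). These conventions and the name `parse_dimacs` are assumptions rather than part of this assignment's specification, so check them against the input files you are given.

```python
def parse_dimacs(lines):
    """Parse DIMACS-style lines into (num_nodes, edges).

    Assumed line types (verify against your input files):
      c <text>      comment, ignored
      p sp <n> <m>  problem line: n nodes, m edges
      a <u> <v> <w> directed edge u -> v with integer weight w
    No error checking is done, since files are assumed well-formed.
    """
    num_nodes = 0
    edges = []
    for line in lines:
        fields = line.split()
        if not fields or fields[0] == "c":
            continue  # skip blank lines and comments
        if fields[0] == "p":
            num_nodes = int(fields[2])
        elif fields[0] == "a":
            edges.append((int(fields[1]), int(fields[2]), int(fields[3])))
    return num_nodes, edges

sample = """c tiny example graph
p sp 3 2
a 1 2 7
a 2 3 1
"""
n, edge_list = parse_dimacs(sample.splitlines())
```

To read from a file instead, pass the open file object, since iterating over it yields lines in the same way.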

Notes added after assignment was posted: