CS 378: Programming for Performance

Assignment 6: Bellman-Ford on GPU

Due date: May 8

Late submission policy: Submissions are accepted up to 2 days late. Each day after the due date incurs a 10% penalty (cumulative).


Modify the supplied implementation of SSSP (single-source shortest paths, using Bellman-Ford) by exploring various performance enhancements.
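For orientation, the computation that the GPU kernel parallelizes can be sketched sequentially on the host. This is a minimal C++ sketch, not the supplied implementation. It assumes a CSR layout using the array names that appear in the tasks (row_start, edge_dst, edge_data, node_data); the struct CsrGraph, the function bellman_ford_reference, and the INF sentinel are hypothetical names chosen for illustration. It also assumes edge weights are small enough that distance sums do not overflow an int. The returned round count is only one plausible way to count iterations; the supplied code may count differently.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical host-side CSR graph; field names mirror the arrays named in the tasks.
struct CsrGraph {
    std::vector<int> row_start;  // size nnodes + 1; edges of node u are [row_start[u], row_start[u + 1])
    std::vector<int> edge_dst;   // destination node of each edge
    std::vector<int> edge_data;  // weight of each edge
};

constexpr int INF = std::numeric_limits<int>::max();  // assumed "unreached" sentinel

// Sequential Bellman-Ford: relax every edge until a full round changes nothing.
// Fills node_data with distances from src and returns the number of rounds run,
// including the final round that detects no change. Rounds are capped at nnodes
// so the loop terminates even if the graph contains a negative cycle.
int bellman_ford_reference(const CsrGraph& g, int src, std::vector<int>& node_data) {
    const int nnodes = static_cast<int>(g.row_start.size()) - 1;
    assert(src >= 0 && src < nnodes);
    node_data.assign(static_cast<std::size_t>(nnodes), INF);
    node_data[src] = 0;

    int rounds = 0;
    bool changed = true;
    while (changed && rounds < nnodes) {
        changed = false;
        ++rounds;
        for (int u = 0; u < nnodes; ++u) {
            if (node_data[u] == INF) continue;  // unreached: nothing to propagate yet
            for (int e = g.row_start[u]; e < g.row_start[u + 1]; ++e) {
                const int v = g.edge_dst[e];
                const int cand = node_data[u] + g.edge_data[e];
                if (cand < node_data[v]) {  // found a shorter path to v
                    node_data[v] = cand;
                    changed = true;
                }
            }
        }
    }
    return rounds;
}
```

A host-side reference like this can serve as a quick sanity check on small hand-built graphs before comparing against the supplied reference outputs.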

For this assignment, you will need an NVIDIA GPU that supports CUDA texture objects.

For each of the tasks below, report the performance improvement (as time in ns and speedup over baseline) you observe for the supplied inputs:

  1. Modify the CSRGraph class to use CUDA texture objects for the read-only row_start, edge_dst, and edge_data arrays.
  2. Explore the use of CUDA texture objects for the read-write node_data array. Describe how you would use this in the SSSP kernel.
    Do you see any performance benefit?
  3. Modify sssp_kernel to use dynamic scheduling to process large-degree nodes in parallel.
    Report the number of iterations that sssp runs for (the value of i).
    Also explain your results, focusing on the operations performed after dynamic scheduling compared with the simpler all-edges-per-thread version originally supplied.
    Tip: This new input might help clarify the performance.
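When reasoning about task 3, it can help to estimate how uneven the per-thread work is before and after large-degree nodes are shared among threads. The following C++ sketch is a rough host-side cost model, not a kernel and not the scheme the assignment prescribes. The functions steps_for_node and max_serial_steps, the threshold parameter, and the group_size parameter (for example, a warp of 32 threads) are all assumptions introduced for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Edge-relaxation steps a single thread performs for a node of the given degree.
// Hypothetical model: a node with degree >= threshold is shared by group_size
// threads, each taking every group_size-th edge; smaller nodes stay on one thread.
int steps_for_node(int degree, int threshold, int group_size) {
    assert(degree >= 0 && group_size > 0);
    if (degree < threshold) return degree;          // one thread walks all edges
    return (degree + group_size - 1) / group_size;  // ceil(degree / group_size)
}

// Longest chain of serial steps spent on any one node, given CSR offsets.
int max_serial_steps(const std::vector<int>& row_start, int threshold, int group_size) {
    int worst = 0;
    for (std::size_t u = 0; u + 1 < row_start.size(); ++u) {
        const int degree = row_start[u + 1] - row_start[u];
        worst = std::max(worst, steps_for_node(degree, threshold, group_size));
    }
    return worst;
}
```

Setting threshold above the maximum degree reproduces the all-edges-per-thread behaviour in this model. The model deliberately ignores the extra operations needed to detect large nodes and hand them to a group of threads, so it gives only an upper bound on the benefit.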
Report the runtime for each input in tables (instead of plots).
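The reported numbers can be produced with a few small host-side helpers. This is a minimal C++ sketch under assumptions: speedup is taken as baseline time divided by optimized time, and the helper names speedup, time_ns, and table_row are hypothetical. The supplied code may already provide its own timer, in which case only the speedup arithmetic applies. Because CUDA kernel launches are asynchronous, a host timer only measures kernel time if the timed work waits for the device (for example, with cudaDeviceSynchronize) before it returns.

```cpp
#include <cassert>
#include <chrono>
#include <cstdio>
#include <string>

// Speedup of an optimized run over the baseline; values above 1.0 mean faster.
double speedup(long long baseline_ns, long long optimized_ns) {
    assert(baseline_ns > 0 && optimized_ns > 0);
    return static_cast<double>(baseline_ns) / static_cast<double>(optimized_ns);
}

// Wall-clock time of a callable in nanoseconds, measured with a host-side timer.
template <typename F>
long long time_ns(F&& work) {
    const auto t0 = std::chrono::steady_clock::now();
    work();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
}

// Format one plain-text table row: input name, baseline ns, optimized ns, speedup.
std::string table_row(const std::string& input, long long baseline_ns, long long optimized_ns) {
    char buf[160];
    std::snprintf(buf, sizeof buf, "%-12s %14lld %14lld %8.2fx",
                  input.c_str(), baseline_ns, optimized_ns,
                  speedup(baseline_ns, optimized_ns));
    return std::string(buf);
}
```

Calling table_row once per input and per task yields rows that can be pasted directly into the tables in the PDF.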


Submit (in Canvas) your code and a PDF containing the tables and explanations. You will receive no points if we cannot run your code on the supplied inputs and verify its correctness.

Validating your program:

The reference outputs are supplied along with the inputs. Run 'diff -q' on the output file generated by your program (./sssp NY.gr -o output.txt) and the corresponding reference output. The command reports whether the two files differ, and any reported difference indicates an error.

Helpful Reference Material:

CUDA texture objects: http://devblogs.nvidia.com/parallelforall/cuda-pro-tip-kepler-texture-objects-improve-performance-and-flexibility/

System with NVIDIA GPU:

Read https://portal.tacc.utexas.edu/user-guides/stampede#gpu to use the NVIDIA GPUs on Stampede.