Assignment 1 - Pthreads Warmup

Using Longhorn

A sample job script looks like this:

#!/bin/bash
#$ -V
#$ -cwd
#$ -N jobname #Replace with your job name
#$ -j y
#$ -o $JOB_NAME.o$JOB_ID
#$ -pe 2way 16 #read explanation below
#$ -q development #queue name - do not change this
#$ -l h_rt=00:03:00 #specifies resource limits - in this case, the maximum amount of time for which your job can run
#$ -M emailid #Replace with your email id
#$ -A A-cs41 #Project Name - Do not change this
./a.out <args> #Replace <args> with the arguments to your program

The -pe option specifies the number of cores on which your job runs and how those cores are distributed across nodes. On Longhorn, where each node has 8 cores, -pe should be used as follows:
-pe <TpN>way <NoN x 8>

where <TpN> is the number of cores to use per node
and <NoN x 8> is the total number of cores requested, which is 8 x the number of nodes (NoN). For example, "-pe 2way 16" in the sample script requests 16 cores, i.e. 2 nodes, and uses 2 cores on each node.

More details about submitting jobs on Longhorn can be found at the following link.
Longhorn

Pthreads Resources

The tutorials here and here should get you started with Pthreads programming.

Compiling your pthreads program

gcc -o exec_name program_name.c -lpthread

Debugging and Profiling

Debugging and optimizing parallel programs is much harder than debugging and optimizing sequential programs, as you will realize fairly soon into assignment 1. This tutorial should help you get started.

Performance

While performance is not a concern for this assignment, it'll be a crucial component from assignment 3 onwards. Therefore, it is probably a good idea to play around with performance measurements while doing this warm-up assignment. You should use the optimizations discussed in class to improve the performance of your parallel program.

We will be using PAPI to measure the performance of all parallel programs. PAPI is a performance measurement tool that uses hardware counters to keep track of various performance-related parameters. Brief instructions on how to compile and link PAPI with your code are as follows:

module load papi

The PAPI module file defines the following environment variables: TACC_PAPI_DIR, TACC_PAPI_LIB, and TACC_PAPI_INC for the location of the papi distribution, libraries, and include files, respectively. To use the PAPI library, compile the source code with the option:

-I$TACC_PAPI_INC

and add the following options to the link step:

-Wl,-rpath,$TACC_PAPI_LIB -L$TACC_PAPI_LIB -lpapi

An example of how to use PAPI on TACC clusters is here and a detailed description of how to use PAPI counters is available on the PAPI home page.

Submission Instructions

The assignment is due on Feb 1 by 11:59pm.

Prepare a tar file for your submission. That tar file should have your report, your source code and a readme file. The readme file should have your names, instructions for compiling your code and the number of slip hours used. The total number of slip hours should be mentioned as "slip_hours_used:< number >" in the readme file. (I will use a script to read this, so ensure you stick to this format.)

I will run the parallel prefix computation as: ./exec num_threads input_file
The input file will contain the size of the array followed by a space-separated string of numbers. A sample input file is here.
For the problems that you implement using the parallel prefix framework, you can submit input files and write down the instructions in the readme file.

To submit your assignment, use the following command:
turnin --submit akanksha hw1 < your_tar >