- Log in to the SP machine
Workshops differ in how this is done. The instructor will go over this
beforehand.
- Verify that guidec and guidef77 are in your path
These executables are required to compile the OpenMP exercise codes. Issue
the following commands. If they do not resolve to the location of the
executables, see the instructor before going any further.
which guidec
which guidef77
- Copy the example files
- In your home directory, create a subdirectory for the example codes
and then cd to it.
mkdir ~/openMP
cd ~/openMP
- Then, copy either the Fortran or the C version of the parallel OpenMP
exercise files to your openMP subdirectory:
C:
| cp /usr/local/spclass/blaise/openMP/samples/C/* ~/openMP |
Fortran:
| cp /usr/local/spclass/blaise/openMP/samples/Fortran/* ~/openMP |
- List the contents of your openMP subdirectory
You should notice the following files: omp_hello, omp_workshare1,
omp_workshare2, omp_orphan, omp_mm and the omp_bug series (with .c or .f
suffixes, depending on which version you copied).
Note: Most of these are simple example files. Their primary purpose is to
demonstrate the basics of how to parallelize a code with OpenMP. Most
execute in less than a second.
- Review the Hello World example code
Take a moment to examine the source code and note how OpenMP directives and
library routines are being used.
- Compile the Hello World example code
Depending upon your language preference, use one of the following commands
to compile the code with KAI's Guide:
C:
| guidec omp_hello.c -o hello |
Fortran:
| guidef77 omp_hello.f -o hello |
- Run the Hello World executable
Simply type the command hello and the program should run.
Your output should look similar to what is shown below. The actual order
of the output strings may vary.
| Hello World from thread = 0
Number of threads = 4
Hello World from thread = 3
Hello World from thread = 1
Hello World from thread = 2 |
- Vary the number of threads and re-run Hello World
Set the number of threads to use by means of the OMP_NUM_THREADS environment
variable. Then re-run Hello World and notice the output.
setenv OMP_NUM_THREADS 8
hello
- Review / Compile / Run the workshare1 example code
This example demonstrates use of the OpenMP loop work-sharing construct.
Notice that it specifies dynamic scheduling of the loop iterations and
assigns a specific number of iterations (a chunk) to each thread.
- After reviewing the source code, use the following commands to compile
and run the executable, being sure to set the OMP_NUM_THREADS variable
to 4 before running the executable.
C:
| guidec omp_workshare1.c -o workshare1
setenv OMP_NUM_THREADS 4
workshare1 | sort |
Fortran:
| guidef77 omp_workshare1.f -o workshare1
setenv OMP_NUM_THREADS 4
workshare1 | sort |
- Review the output. Note that it is piped through the sort utility.
This will make it easier to view how loop iterations were actually
scheduled across the team of threads.
- Run the program again and review the output. Does it differ?
- Edit the workshare1 source file and change the dynamic scheduling to
static scheduling.
- Recompile and run the modified program. Notice the difference in
output compared to dynamic scheduling.
- Review / Compile / Run the workshare2 example code
This example demonstrates use of the OpenMP SECTIONS work-sharing construct.
Note how the PARALLEL region is divided into separate sections, each of
which will be executed by one thread.
As before, compile and execute the program after reviewing it:
C:
| guidec omp_workshare2.c -o workshare2
workshare2 |
Fortran:
| guidef77 omp_workshare2.f -o workshare2
workshare2 |
Run the program 5 or 6 times and observe the differences in output. Because
there are only two sections, you should notice that only two threads do work
even though there are more than two threads in the team. You may or may not
notice that the two threads doing the work vary from run to run. For example,
threads 0 and 1 may do the work on one run, and threads 0 and 3 on the next.
- Review / Compile / Run the orphan example code
This example computes a dot product in parallel. It differs from the
previous examples because the parallel loop construct is orphaned: it is
contained in a subroutine outside the lexical extent of the main program's
parallel region.
After reviewing the source code, compile and run the program:
C:
| guidec omp_orphan.c -o orphan
orphan | sort |
Fortran:
| guidef77 omp_orphan.f -o orphan
orphan | sort |
- Review / Compile / Run the matrix multiply example code
This example performs a matrix multiply by distributing the iterations
of the operation among the available threads.
- After reviewing the source code, compile and run the program:
C:
| guidec omp_mm.c -o matmult
matmult |
Fortran:
| guidef77 omp_mm.f -o matmult
matmult |
The output shows which thread does each iteration and the final result
matrix.
- Run the program again, but this time sort the output to see clearly
which threads execute which iterations:
matmult | sort | grep Thread
Do the loop iterations match the SCHEDULE(STATIC,CHUNK) clause for
the matrix multiply loop in the code?
- Change this SCHEDULE clause to SCHEDULE(DYNAMIC). Then recompile and
run the example again (using sort | grep Thread as before).
How are the loop iterations distributed now?
- When things go wrong...
There are many things that can go wrong when developing OpenMP programs. The
omp_bugX.X series of programs demonstrates just a few of them.
See if you can figure out what the problem is with each case and then fix it.
Use guidec or guidef77 to compile each code as
appropriate.
The buggy behavior will differ for each example. Some hints are provided
below.