- Log in to the SP machine
Workshops differ in how this is done. The instructor will go over this
beforehand.
- Verify that guidec and guidef77 are in your path
These executables are required to compile the OpenMP exercise codes. Issue
the following commands. If they do not resolve to the location of the
executables, see the instructor before going any further.
  which guidec
  which guidef77
- Copy the example files
- In your home directory, create a subdirectory for the example codes
and then cd to it.
  mkdir ~/openMP
  cd ~/openMP
- Then, copy either the Fortran or the C version of the parallel OpenMP
exercise files to your openMP subdirectory:
C:
  cp /usr/local/spclass/blaise/openMP/samples/C/* ~/openMP
Fortran:
  cp /usr/local/spclass/blaise/openMP/samples/Fortran/* ~/openMP
- List the contents of your openMP subdirectory
You should see the example source files. Note: Most of these are simple
example codes. Their primary purpose is to demonstrate the basics of how
to parallelize a code with OpenMP. Most execute in less than a second.
- Review the Hello World example code
Take a moment to examine the source code and note how OpenMP directives and
library routines are being used.
- Compile the Hello World example code
Depending upon your language preference, use one of the following commands
to compile the code with KAI's Guide:
C:
  guidec omp_hello.c -o hello
Fortran:
  guidef77 omp_hello.f -o hello
- Run the Hello World executable
Simply type the command hello and the program should run.
Your output should look similar to the following. The actual order of the
output strings may vary.
  Hello World from thread = 0
  Number of threads = 4
  Hello World from thread = 3
  Hello World from thread = 1
  Hello World from thread = 2
- Vary the number of threads and re-run Hello World
Depending upon your shell, set the number of threads to use by means of the
OMP_NUM_THREADS environment variable. Then re-run Hello World and notice
the output.
csh/tcsh:
  setenv OMP_NUM_THREADS 8
  hello
sh/ksh:
  export OMP_NUM_THREADS=8
  hello
- Review / Compile / Run the workshare1 example code
This example demonstrates use of the OpenMP loop work-sharing construct.
Notice that it specifies dynamic scheduling of threads and assigns a
specific number of iterations to be done by each thread.
After reviewing the source code, use the following commands to compile
and run the executable, being sure to set the OMP_NUM_THREADS variable
to 4 before running the executable.
C:
  guidec omp_workshare1.c -o workshare1
  setenv OMP_NUM_THREADS 4   (or: export OMP_NUM_THREADS=4)
  workshare1 | sort
Fortran:
  guidef77 omp_workshare1.f -o workshare1
  setenv OMP_NUM_THREADS 4   (or: export OMP_NUM_THREADS=4)
  workshare1 | sort
Note that the output is piped through the sort utility. This will make it
easier to view how loop iterations were actually scheduled across the team
of threads.
- Review / Compile / Run the workshare2 example code
This example demonstrates use of the OpenMP SECTIONS work-sharing construct.
Note how the PARALLEL region is divided into separate sections, each of
which will be executed by one thread.
As before, compile and execute the program after reviewing it:
C:
  guidec omp_workshare2.c -o workshare2
  workshare2
Fortran:
  guidef77 omp_workshare2.f -o workshare2
  workshare2
Run the program 5 or 6 times and observe the differences in output. Because
there are only two sections, you should notice that only two threads do work
even though there are more than two in the team. You should also notice that
the two threads doing work can vary. For example, the first time thread 0 and
thread 1 may do the work, and the next time it may be thread 0 and thread 3.
- Review / Compile the workshare3 example code
This example demonstrates use of the combined parallel loop construct and
static scheduling. However, it also demonstrates an error in the use of the
parallel loop construct.
Try compiling the code:
C:
  guidec omp_workshare3.c -o workshare3
Fortran:
  guidef77 omp_workshare3.f -o workshare3
The compilation will fail. You should see error messages similar to below.
See if you can identify the source of the error.
Fortran:
Guide 3.6 k310744 19990126 23-May-1999 15:13:11
### !$OMP PARALLEL DO SHARED(A,B,C,N)
### in line 25 procedure WORKSHARE3 of file omp_workshare3.f ###
### warning: This directive is not adjacent to the loop it affects. ###
### TID = OMP_GET_THREAD_NUM()
### in line 29 procedure WORKSHARE3 of file omp_workshare3.f ###
### error: Statement cannot follow compiler directive. ###
Guide 3.6 k310744 19990126 : 1 error in file omp_workshare3.f
Guide -- Syntax Errors Detected
ERROR: omp_workshare3.f failed in the Guide step
C / C++:
"omp_workshare3.c", line 27: error: a for statement must follow an OpenMP for
pragma
#pragma omp parallel for \
^
1 error detected in the compilation of "omp_workshare3.c".
KCC: Compilation failed.
- Review / Compile / Run the workshare4 example code
Now review a corrected version of the previous code. The correction removes
all statements between the parallel loop construct and the actual loop;
OpenMP requires that the loop immediately follow the construct.
The corrected version also adds logic that preserves each thread's ability
to query its id and print it from inside the DO loop. Notice the use of
the FIRSTPRIVATE clause to initialize the flag.
Compile and run the program:
C:
  guidec omp_workshare4.c -o workshare4
  workshare4 | sort
Fortran:
  guidef77 omp_workshare4.f -o workshare4
  workshare4 | sort
- Review / Compile / Run the orphan example code
This example computes a dot product in parallel. It differs from the
previous examples in that the parallel loop construct is orphaned: it is
contained in a subroutine outside the lexical extent of the main program's
parallel region.
After reviewing the source code, compile and run the program:
C:
  guidec omp_orphan.c -o orphan
  orphan | sort
Fortran:
  guidef77 omp_orphan.f -o orphan
  orphan | sort
- Review / Compile / Run the matrix multiply example code
This example performs a matrix multiply by distributing the iterations of
the operation among the available threads.
After reviewing the source code, compile and run the program:
C:
  guidec omp_mm.c -o matmult
  matmult
Fortran:
  guidef77 omp_mm.f -o matmult
  matmult
The output shows which thread does each iteration and the final result
matrix.
Do the loop iterations match the !$OMP DO SCHEDULE(STATIC,CHUNK)
directive for the matrix multiply loop in the code?
Remove the SCHEDULE(STATIC,CHUNK) clause from this directive, recompile
and then run the example again. How are the loop iterations distributed now?