High performance computing systems
Lab 1
Dept. of Computer Architecture
Faculty of ETI
Gdansk University of Technology
Paweł Czarnul
For this exercise, study basic MPI functions such as:
1. for MPI management: MPI_Init(...), MPI_Finalize(), MPI_Comm_size(...), MPI_Comm_rank(...),
Each MPI program should start with MPI_Init(...) and finish with MPI_Finalize().
Each process can fetch the number of processes in the default communicator
MPI_COMM_WORLD (i.e. the whole application) by calling MPI_Comm_size() (see the example
below) and its own rank by calling MPI_Comm_rank().
Processes in an MPI application are identified by so-called ranks, ranging from 0 to n-1,
where n is the number of processes returned by MPI_Comm_size().
Based on its rank, each process can perform a part of the required computations so that all
processes contribute to the final goal and together process all the required data.
2. for point-to-point communication: MPI_Send(...), MPI_Recv(...),
int MPI_Send(void *buf, int count, MPI_Datatype dtype, int dest,
             int tag, MPI_Comm comm)
MPI_Send sends the data pointed to by buf to the process with rank dest. There should be count
elements of data type dtype. For instance, when sending 5 doubles, count should be 5 and
dtype should be MPI_DOUBLE. tag can be any number which additionally describes the
message, and comm can be MPI_COMM_WORLD for the default communicator.
int MPI_Recv(void *buf, int count, MPI_Datatype dtype,
             int src, int tag, MPI_Comm comm, MPI_Status *stat)
MPI_Recv is a blocking receive which waits for a message with tag tag from the process with
rank src in communicator comm. dtype and count denote the type and the number of
elements which are to be received and stored in buf. stat holds information about the
received message. A usage sketch combining these calls with MPI_Gather is given after item 3.
3. for collective communication: MPI_Barrier(...), MPI_Gather(...), MPI_Scatter(...),
MPI_Allgather(...).
As an example,
int MPI_Reduce(void *sbuf, void *rbuf, int count,
               MPI_Datatype dtype, MPI_Op op, int root,
               MPI_Comm comm)
reduces the values provided by the processes in communicator comm to a single value in the process
with rank root, combining them with operation op (e.g. MPI_SUM). See the pi code below, which adds
partial sums computed by all processes into a single value in process 0.
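The following sketch (not part of the original handout; the message tag 100 and the printouts are arbitrary
choices) shows one way to combine the calls above: every process with a non-zero rank sends its rank to
process 0 with MPI_Send, process 0 receives the values with MPI_Recv, and MPI_Gather then collects the
same values into an array at rank 0 in a single call. It can be compiled and run in the same way as the pi
program below.

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int myrank, proccount, i, value;
    int *ranks = NULL;
    MPI_Status stat;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    MPI_Comm_size(MPI_COMM_WORLD, &proccount);

    // point-to-point: every process with a non-zero rank sends its rank to process 0
    if (myrank) {
        MPI_Send(&myrank, 1, MPI_INT, 0, 100, MPI_COMM_WORLD);
    } else {
        for (i = 1; i < proccount; i++) {
            // accept the messages in whatever order they arrive
            MPI_Recv(&value, 1, MPI_INT, MPI_ANY_SOURCE, 100, MPI_COMM_WORLD, &stat);
            printf("received %d from process %d\n", value, stat.MPI_SOURCE);
        }
    }

    // collective: gather the ranks of all processes into an array at rank 0 with one call
    if (!myrank)
        ranks = (int *)malloc(proccount * sizeof(int));
    MPI_Gather(&myrank, 1, MPI_INT, ranks, 1, MPI_INT, 0, MPI_COMM_WORLD);
    if (!myrank) {
        for (i = 0; i < proccount; i++)
            printf("ranks[%d]=%d\n", i, ranks[i]);
        free(ranks);
    }

    MPI_Finalize();
    return 0;
}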
Study the following tutorial on MPI:
The following example computes pi in parallel using an old method from the 17th century:
Pi/4 = 1/1 – 1/3 + 1/5 – 1/7 + 1/9 – … (1)
Note that the program works for any number of processes requested. Successive elements of (1) are
assigned to successive processes with ranks from 0 to (proccount-1).
For 2 processes:
Pi/4    = 1/1 – 1/3 + 1/5 – 1/7 + 1/9 …
process    0     1     0     1     0  …
For 3 processes:
Pi/4    = 1/1 – 1/3 + 1/5 – 1/7 + 1/9 – 1/11 …
process    0     1     2     0     1      2   …
etc.
This is a simple load balancing technique. For example, checking whether successive numbers are prime
takes more time for larger numbers, so interleaving successive elements among processes balances the
execution time quite well.
Note that in practice we only consider a predefined number of elements of (1). In general, we should
make sure that the data types used for adding the numbers can store the resulting partial sums.
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    double precision=1000000000;
    int myrank,proccount;
    double pi,pi_final;
    int mine,sign;
    int i;

    // Initialize MPI
    MPI_Init(&argc, &argv);
    // find out my rank
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    // find out the number of processes in MPI_COMM_WORLD
    MPI_Comm_size(MPI_COMM_WORLD, &proccount);

    // now distribute the required precision
    if (precision<proccount) {
        printf("Precision smaller than the number of processes - try again.");
        MPI_Finalize();
        return -1;
    }

    // each process performs computations on its part
    pi=0;
    mine=myrank*2+1;
    sign=(((mine-1)/2)%2)?-1:1;
    for (;mine<precision;) {
        // printf("\nProcess %d %d %d", myrank,sign,mine);
        // fflush(stdout);
        pi+=sign/(double)mine;
        mine+=2*proccount;
        sign=(((mine-1)/2)%2)?-1:1;
    }

    // now merge the numbers to rank 0
    MPI_Reduce(&pi,&pi_final,1,
               MPI_DOUBLE,MPI_SUM,0,
               MPI_COMM_WORLD);

    if (!myrank) {
        pi_final*=4;
        printf("pi=%f",pi_final);
    }

    // Shut down MPI
    MPI_Finalize();
    return 0;
}
Assuming the code was saved in a file called program.c, we have to:
1. compile the code:
mpicc program.c
2. run it:
1 process:
[klaster@n01 1]$ time mpirun -np 1 ./a.out
pi=3.141593
real 0m9.286s
user 0m9.244s
sys 0m0.037s
2 processes:
[klaster@n01 1]$ time mpirun -np 2 ./a.out
pi=3.141593
real 0m4.706s
user 0m9.286s
sys 0m0.063s
4 processes:
[klaster@n01 1]$ time mpirun -np 4 ./a.out
pi=3.141593
real 0m2.420s
user 0m9.380s
sys 0m0.118s
Note the smaller execution times for larger numbers of processes used for the computations.
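From the wall-clock (real) times above, the speedup relative to the 1-process run is roughly
9.286/4.706 ≈ 1.97 for 2 processes and 9.286/2.420 ≈ 3.84 for 4 processes, i.e. close to linear for this
computation, in which the only communication is the final MPI_Reduce.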
Lab 527:
For this lab, you can use the default MPI implementation on the desXX computers in the lab (XX ranges
from 01 to 18) – Open MPI.
Compile the code:
student@des01:~> mpicc program.c
create a configuration for the virtual machine – in this case just 2 nodes (des01 and des02); finish the
cat input with Ctrl-D:
student@des01:~> cat > machinefile
des01
des02
then invoke the application for 1 process (running on des01):
student@des01:~> mpirun -machinefile ./machinefile -np 1 time ./a.out
pi=3.141593 9.25user 0.01system 0:09.27elapsed 99%CPU (0avgtext+0avgdata
13008maxresident)k
0inputs+0outputs (0major+1009minor)pagefaults 0swaps
and 2 processes (running on des01 and des02):
student@des01:~> mpirun -machinefile ./machinefile -np 2 time ./a.out
Password:
pi=3.141593 4.63user 0.01system 0:04.65elapsed 99%CPU (0avgtext+0avgdata
13072maxresident)k
0inputs+0outputs (0major+1013minor)pagefaults 0swaps
4.63user 0.01system 0:04.67elapsed 99%CPU (0avgtext+0avgdata 13312maxresident)k
0inputs+0outputs (0major+1023minor)pagefaults 0swaps
You can create a larger virtual machine and test the scalability of the application.
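For example (assuming further nodes such as des03 and des04, used later in this document, are reachable),
the machinefile could simply list more nodes and the application could be started with a correspondingly
larger number of processes:
student@des01:~> cat > machinefile
des01
des02
des03
des04
student@des01:~> mpirun -machinefile ./machinefile -np 4 time ./a.out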
Lab 527:
You can also use mpich on desXX:
student@des01:~> /opt/mpich/ch-p4/bin/mpicc program.c
program.c: In function ‘main’:
program.c:12:7: warning: unused variable ‘i’
student@des01:~> scp a.out des02:~
Password:
a.out
100% 1427KB 1.4MB/s 00:00
student@des01:~> scp a.out des03:~
Password:
a.out
100% 1427KB 1.4MB/s 00:00
student@des01:~> scp a.out des04:~
Password:
a.out
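With a.out copied to the other nodes, the application could then be started with mpich's own mpirun.
A possible invocation (a sketch, assuming this MPICH installation accepts the same -np and -machinefile
options as used above):
student@des01:~> /opt/mpich/ch-p4/bin/mpirun -machinefile ./machinefile -np 4 ./a.out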